Cost-efficient Active Illumination Camera For Hyper-spectral Reconstruction

Yuxuan Zhang (1), T.M. Sazzad, Yangyang Song, Spencer J. Chang, Ritesh Chowdhry, Tomas Mejia, Anna Hampton, Shelby Kucharski, Stefan Gerber, Barry Tillman, Marcio F. R. Resende, William M. Hammond, Chris H. Wilson, Alina Zare, and Sanjeev J. Koppal (2)

(1, 2) Yuxuan Zhang and Sanjeev J. Koppal are with the Department of Electrical and Computer Engineering, University of Florida. Sanjeev J. Koppal holds concurrent appointments as an Associate Professor of ECE at the University of Florida and as an Amazon Scholar. This paper describes work performed at the University of Florida and is not associated with Amazon.
journal: opticajournal; articletype: Research Article

Abstract

Hyper-spectral imaging has recently gained increasing attention for use in different applications, including agricultural investigation, ground tracking, remote sensing and many others. However, the high cost, large physical size and complicated operation process prevent hyperspectral cameras from being employed in many applications and research fields. In this paper, we introduce a cost-efficient, compact and easy-to-use active illumination camera that may benefit many applications. We developed a fully functional prototype of such a camera. With the hope of helping with agricultural research, we tested our camera for plant root imaging. In addition, a U-Net model for spectral reconstruction was trained using a reference hyperspectral camera’s data as ground truth and our camera’s data as input. We demonstrated our camera’s ability to obtain additional information over a typical RGB camera. In addition, the ability to reconstruct hyperspectral data from multi-spectral input makes our device compatible with models and algorithms developed for hyperspectral applications, with no modifications required.

1 Introduction

Hyper-spectral imaging has gained increasing interest in recent years. The ability to capture rich spectral information beyond traditional RGB channels has been shown to reveal more information about the subject of interest, including in agricultural and plant research [1, 2, 3, 4, 5]. Compared to RGB cameras, hyper-spectral imaging provides both richer information inside the visible band (380 - 700 nm) and additional information outside of the visible band which would be unavailable otherwise. However, hyper-spectral cameras also tend to cost significantly more than traditional cameras and to have a large physical size. These shortcomings limit the chance of integrating hyper-spectral imaging into a larger variety of industrial applications or research projects.

As an example, the minirhizotron is a commonly used tool for plant root research [6, 7, 8, 9]. It is a tube-shaped inspection tool made of transparent materials. Its inner diameter varies from as small as 5 cm up to 20 - 30 cm. It reaches deep into the soil near the plant to be inspected, enabling researchers to visually inspect plant roots in a nondestructive way. Many studies have been carried out with the help of minirhizotrons and compact monochrome or RGB cameras [10, 11, 12]. Although these studies could potentially benefit significantly from hyper-spectral imaging, to the best of our knowledge, there is no feasible way to provide hyperspectral imaging inside such a narrow space.

Given the potential benefits of enabling hyper-spectral imaging inside narrow spaces such as a minirhizotron, we investigated a new approach to obtaining spectral information. Instead of performing a full-spectral scan, our approach focuses on a few discrete bands that were selected to carry the most potential information. Our device utilizes readily available parts from the consumer market, and our design features an ultra-compact size so it can fit into applications with very narrow operating space for imaging. In addition to its compact size, our device also has the advantages of low cost and ease of use. Such advantages make it possible to deploy the device at scale and operate it autonomously.

2 Related Work

Hyper-spectral imaging (HSI) has had a significant impact in agriculture and remote sensing. For example, classification allows for separating diseased from healthy plants using additional color band information [13, 14, 15, 16, 17]. In addition, detecting the maturity of crops and finding stressed crops are important HSI tasks [18, 19, 20, 21, 22, 23].

The ever-increasing accessibility of HSI has fueled widespread impact [24, 25, 26, 27, 28, 29], with studies on hundreds of spectral bands [30, 31, 32, 33]. These bands provide rich information that can be analyzed and identified to differentiate objects in the scene [34, 35]. HSI has had an impact on fields as diverse as agriculture (plant disease detection and classification, etc.), tissue analysis in medical science, forest land detection, mining and mineral study, vegetation estimation, protection of the environment, and biological analysis [36, 37, 38, 39, 40, 41].

However, the large data footprint of the spectral bands has disadvantages, such as high computational time complexity and high costs for transmission, storage and analysis [32, 42, 43, 44, 45]. Therefore, minimizing redundant information and time complexity is important for reconstructing hyper-spectral data efficiently [43, 46, 47, 48]. Techniques for compressing the measurements include clustering [49, 50], ranking approaches [51, 52, 53, 54], greedy approaches [55, 56] and evolutionary approaches [57, 58].

Dimensionality reduction of HS samples: A range of band-selection approaches have been proposed in the existing literature. Typically, approaches for decreasing the dimensionality of a hyper-spectral image can be categorised into two groups: feature extraction [59, 60] and feature selection (band selection) [61, 62, 63, 64, 46, 65]. Both groups of methods extract or select data from all HSI bands to correspond to the whole spectral cube, and the outcomes are roughly equivalent to the full HSI bands. Currently, traditional feature selection approaches are considered to be the most commonly applied techniques, which include Principal Component Analysis (PCA) [53], Maximum Noise Fraction (MNF) [66], Genetic Algorithm, and FICA (Fast Independent Component Analysis) [59]. For feature extraction approaches, high-dimensional spatial data are mapped into a low-dimensional space according to specific criteria, thus extracting a completely new sub-set of features which represents the original HS data. Unfortunately, the physical meaning of the original HS data is not preserved by this transformation, and some of the key information may be lost permanently. On the other hand, band selection approaches select a distinct and representative sub-set from the original hyper-spectral data, which preserves the physical meaning and information without loss. Additionally, band selection preserves the intrinsic characteristics of the HS data. In this article, we have focused on feature (band) selection rather than feature (band) extraction.

Supervised and semi-supervised labelling: Supervised approaches require labelled samples to select the most favourable bands during training and learning [67], where a similarity metric is applied among the class labels. These techniques rely on a number of assessment criteria, such as information divergence [68], maximum ellipsoid volume [69], and Euclidean distance [70]. Semi-supervised approaches use graph-based models on labelled and unlabelled data samples to select appropriate spectral bands [71, 72], but they struggle to provide contextual information.

There have been attempts to integrate a multispectral camera into a minirhizotron [73]. That work implements an automatically operated multi-spectral camera that can actuate by itself inside a minirhizotron. The camera was also equipped with multi-bandwidth light sources. However, instead of exploring the information contained in the data acquired from such a device, that work primarily focused on autonomous operation and remote deployment.

3 Hardware Design

We propose a type of camera that uses LEDs of different bands instead of color filters to capture spectral information. In our setup, the camera sensor does not need to be equipped with any type of optical filter, not even the Bayer filter of an RGB sensor or the IR filter commonly seen on the back of lenses. This setup works best in scenarios where the on-camera LEDs are the only source of light. Moreover, it can also deal with static scenes with a moderate amount of ambient light by treating the ambient light as "dark field" and only tracking the incremental light upon firing each LED.
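The dark-field idea can be illustrated with a minimal sketch, assuming one extra exposure is taken with all LEDs off and that the scene is static between exposures. The function name `subtract_ambient` and the array layout are illustrative assumptions, not part of our firmware.

```python
import numpy as np

def subtract_ambient(led_frames, dark_frame):
    """Remove the ambient ("dark field") contribution from each LED exposure.

    led_frames: array of shape (bands, H, W), one exposure per LED color.
    dark_frame: array of shape (H, W), captured with all LEDs off (assumed).
    """
    led = np.asarray(led_frames, dtype=np.float64)
    dark = np.asarray(dark_frame, dtype=np.float64)
    # Sensor noise can make the difference slightly negative, so clip at zero.
    return np.clip(led - dark[None, :, :], 0.0, None)

# Toy example: a constant ambient level plus a different LED contribution per band.
ambient = np.full((4, 4), 10.0)
frames = np.stack([ambient + k for k in (5.0, 20.0, 0.0)])
clean = subtract_ambient(frames, ambient)
```

In this toy example each band of `clean` holds only the light added by its LED, which is the quantity the rest of the pipeline would consume.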

Figure 1: (a) Photo of the Version-0 camera designed for data collection; (b) Layout of the LEDs that minimizes the difference in intensity distribution across colors. Left: the LED module; Center: all LEDs turned on, excluding UV; Right: a single color lit.

Our prototype was designed to house 8 different types of LEDs, each having a unique bandwidth of our choice. The optical characteristics of each LED model are listed in Table 1; the selection of bandwidths is discussed further in the next section. We mount 4 LEDs per color and, as shown in Fig.1(b), the LEDs of the same color are mounted in a rectangular pattern in order to illuminate the subject as evenly as possible. Such an arrangement minimizes the difference in light distribution across colors.

Since all parts of our camera are standard market-ready parts, we can achieve an extremely low unit cost compared to hyper-spectral cameras currently on the market. In addition, the constraints of the environment in which the camera will be operated are another crucial aspect to consider. In our case, we are targeting the minirhizotrons broadly used to study plants. A minirhizotron is a specialized tool used in biology and agriculture research to study roots in their natural environment. It is a transparent tube inserted into the soil, making it possible to observe roots as they grow over time.

The challenge lies in the geometry of the minirhizotron. The diameter of a typical minirhizotron is between 5 and 10 centimeters, and the one used by our collaborating research group is only 5 cm in diameter. Even for our camera, fitting into such a narrow tube is a significant engineering challenge. Most of the existing hyper-spectral cameras on the market are much larger, making them impossible to deploy in such environments.

Figure 2: The same components can fit into a minirhizotron with a dedicated case design.

Our design addresses these challenges by utilizing both small monochrome sensor modules and a custom-designed printed circuit board (PCB) that integrates the LEDs. The camera module and the LED module are stacked upon each other with only 5 mm of distance between them. The LED module has an opening in its center to let through a wide-angle lens that sits on the camera module’s PCB. Each module has an upstream USB port to connect to the host computer (a Raspberry Pi), and the USB port handles both power and communication. The first camera was designed to work with the flat surface of a rhizobox, but the same core components can be easily adapted to a 5 cm diameter minirhizotron with a different case (Fig.2).

4 Selection of Bandwidths

Table 1: Characteristics of Selected LEDs
Type            $\lambda_{\text{peak}}$    $\Delta\lambda$    $V_{\text{F}}$    $I_{\text{max}}$
Ultra Violet    395 nm    10 nm    3.3 V    60 mA
Blue            466 nm    15 nm    2.9 V    30 mA
Green           520 nm    15 nm    2.9 V    30 mA
Yellow-Green    573 nm    20 nm    2.4 V    25 mA
Yellow          585 nm    20 nm    2.4 V    25 mA
Orange          600 nm    20 nm    2.4 V    25 mA
Red             660 nm    17 nm    2.1 V    100 mA
Infrared        940 nm    40 nm    1.3 V    200 mA
$\lambda_{\text{peak}}$ is the peak wavelength of the spectrum;
$\Delta\lambda$ is the half width of the spectrum.

We analyzed data gathered by the reference HSI camera, using the same rhizobox samples that were imaged by our camera (analysis was performed by T. M. Sazzad). Of our 8 bandwidths, three (blue, green and red) matched the optimal bands exactly. The other bandwidths were limited by stock availability when producing the prototype, and thus did not match the calculated optimal values exactly.

5 Data Collection and Processing

We collected a dataset to train the U-Net based model for spectrum reconstruction. Since ground truth was required to train the model, and our reference data needed to be captured by a reference HSI device which does not fit into a minirhizotron, the initial version of the camera we developed (shown in Fig.1(a)) was specifically designed to match the results from the reference camera. Both cameras took images of plant roots in rhizoboxes. The rhizobox is a container that works similarly to a minirhizotron; the difference is that the rhizobox features a flat transparent surface, so it is easier to take photos with all types of cameras. In addition, the compact size of a rhizobox makes it easy to move around.

Figure 3: Calibration pipeline for our custom camera. (a) Original; (b) Intensity normalized; (c) Undistorted.
Figure 4: (a) Raw image captured by our camera, showing 8 different bands of the same region, each rendered with a corresponding pseudo color; (b) Results of the post-processing pipeline, with all calibrations applied. The distortions are corrected, the camera frame is clipped out of view, and the dark corners are adjusted so that the brightness is uniform across the entire image.

Due to the distortion and uneven intensity distribution introduced by the hardware, the raw data (Fig.4(a)) captured by our camera has to be pre-processed before proceeding. In our calibration pipeline, the raw image is first normalized by a "reference white" calibration image, eliminating the uneven intensity distribution introduced by the light sources and lens projection. The reference white was captured and applied separately for each color. Geometric correction was then applied based on a chessboard image captured by the same camera. The brightness distribution calibrated in the previous step is retained during this process (calibration pipeline shown in Fig.3, sample calibrated image shown in Fig.4(b)).
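The intensity normalization step can be sketched as a per-band division by the reference white image. This is a minimal illustration under assumed conventions (arrays of shape bands × height × width, a small `eps` guard for dark corners); the function name `flat_field_correct` is hypothetical and the geometric correction step is not shown.

```python
import numpy as np

def flat_field_correct(raw, reference_white, eps=1e-6):
    """Divide every band by its own reference-white image.

    raw, reference_white: arrays of shape (bands, H, W).
    eps: assumed lower bound that avoids division by zero in dark corners.
    """
    raw = np.asarray(raw, dtype=np.float64)
    white = np.asarray(reference_white, dtype=np.float64)
    return raw / np.maximum(white, eps)

# Toy example: a vignetted illumination profile with a different gain per band.
yy, xx = np.mgrid[0:8, 0:8]
vignette = 1.0 / (1.0 + 0.05 * ((yy - 3.5) ** 2 + (xx - 3.5) ** 2))
white = np.stack([vignette * gain for gain in (200.0, 120.0)])
raw = 0.5 * white  # a surface reflecting half of the light everywhere
corrected = flat_field_correct(raw, white)
```

After the division, the toy surface appears uniformly at 0.5 in both bands, regardless of the vignette and the per-color LED gain.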

Figure 5: Comparison of the sizes of the rhizobox, the reference camera’s imaging area, and our camera’s imaging area. Dimensions shown in the figure are conceptual, and are not strictly proportional to their actual sizes.
Figure 6: Sample of a successful match between the reference camera and our camera. (a) Raw HSI image (ground truth); (b) Matched reference image; (c) Our camera’s result; (d) Alignment mismatch.

The final step of data processing is matching our camera’s result with the reference camera’s result. The sizes of (1) the rhizobox, (2) the reference camera’s imaging area, and (3) our camera’s imaging area are shown in Fig.5. We first manually downscale our camera’s output ($1024 \times 1024$ after calibration) to match the pixel density of the reference camera ($286 \times 286$). We then run template matching on the reference image, using the down-scaled image as the template. Fig.6 shows a sample of a successful match.
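Template matching itself can be sketched as an exhaustive search for the offset with the smallest sum of squared differences (SSD). This is only an illustration on single-channel arrays: the matching score, the function name `match_template_ssd` and the toy data are assumptions, and a practical pipeline would likely rely on an optimized library routine.

```python
import numpy as np

def match_template_ssd(image, template):
    """Return (row, col) of the top-left corner where `template` best fits `image`."""
    ih, iw = image.shape
    th, tw = template.shape
    best_score, best_pos = np.inf, (0, 0)
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            window = image[r:r + th, c:c + tw]
            score = np.sum((window - template) ** 2)  # SSD matching score
            if score < best_score:
                best_score, best_pos = score, (r, c)
    return best_pos

# Toy example: cut a patch out of a random "reference" image and find it again.
rng = np.random.default_rng(0)
reference = rng.random((20, 24))
template = reference[5:13, 9:17].copy()
position = match_template_ssd(reference, template)
```

The search recovers the offset from which the patch was cut, which in our setting corresponds to locating our camera’s field of view inside the larger reference image.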

6 Reconstruction Model

A model based on U-Net was implemented and trained using the data we collected. The model was intended for both removing the bright LED light spots introduced by our custom light module and expanding the number of bands from 8 to 299.

6.1 Model Architecture

U-Net is a convolutional neural network architecture that is commonly used for segmentation tasks in computer vision [74]. However, instead of performing typical segmentation tasks, we explored the feasibility of using such architecture to perform spectral reconstruction tasks. This is based on the assumption that the U-Net could learn the inherent spectrum structure of different soil and root components based on the training set, and then transfer its prior knowledge to new data while maintaining contextual awareness.

Figure 7: Asymmetric network structure designed for spectral reconstruction.

In order to fully leverage the resolution of our camera sensor, we modified the model into an asymmetric structure. The left side of the network (encoder) descends faster on spatial resolution, while the right side (decoder) descends slower on spectral dimension. In this setup, the dimensions of the corresponding layers on each side no longer match, so a resizer had to be implemented within each feed-forward shortcut connection.

6.2 Data Augmentation

Figure 8: Samples of random affine augmentation. Data points are shown in pseudo RGB.
Table 2: Ranges of Parameters for Random Affine
Type           Lower Limit    Upper Limit    Distribution
Translate X    -20%           20%            NORMAL
Translate Y    -20%           20%            NORMAL
Scale          80%            160%           NORMAL
Rotate         -30 deg        30 deg         NORMAL
Shear          -5 deg         5 deg          NORMAL

Random affine augmentation was employed in the training process. The affine transformation is a combination of five different transformations, each controlled by a separate parameter. The applicable transformations and their probability distributions are listed in Table 2; all parameters follow a normal distribution within the given range. This data augmentation technique is used to compensate for the limited number of samples we were able to collect. It also helps to prevent the model from overfitting to scale and/or distortion. If the transformed image cannot fill the entire canvas, there are two ways to fill the remaining area. The simpler approach is to fill it with a uniform average color; the other is to mirror the original image so the content looks consistent and meaningful. In practice, the mirroring option would consume too many resources (8 times more memory) and would consequently limit the batch size to 1. Therefore, we adopted the uniform solid color fill throughout all stages of training. In addition, the mask of valid pixels was preserved for each affine transform and re-applied in the loss function.
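A minimal sketch of how such parameters could be drawn and combined is given below. The ranges come from Table 2; everything else is an assumption made for illustration: each normal distribution is centered on the middle of its range with the range covering about three standard deviations on each side, samples are clipped to the range, and the transforms are composed about the image origin in the order scale, shear, rotate, translate.

```python
import numpy as np

# Ranges from Table 2 (translation as a fraction of image size, angles in degrees).
RANGES = {
    "translate_x": (-0.20, 0.20),
    "translate_y": (-0.20, 0.20),
    "scale": (0.80, 1.60),
    "rotate_deg": (-30.0, 30.0),
    "shear_deg": (-5.0, 5.0),
}

def sample_params(rng):
    """Draw one parameter set; mean and sigma choices are assumptions."""
    params = {}
    for name, (lo, hi) in RANGES.items():
        mean = 0.5 * (lo + hi)
        sigma = (hi - lo) / 6.0  # assumed: range spans roughly +/- 3 sigma
        params[name] = float(np.clip(rng.normal(mean, sigma), lo, hi))
    return params

def affine_matrix(p, width, height):
    """Compose the five transforms into one 3x3 homogeneous matrix."""
    theta = np.deg2rad(p["rotate_deg"])
    shear = np.tan(np.deg2rad(p["shear_deg"]))
    s = p["scale"]
    scale_m = np.array([[s, 0.0, 0.0], [0.0, s, 0.0], [0.0, 0.0, 1.0]])
    shear_m = np.array([[1.0, shear, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
    rot_m = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                      [np.sin(theta), np.cos(theta), 0.0],
                      [0.0, 0.0, 1.0]])
    trans_m = np.array([[1.0, 0.0, p["translate_x"] * width],
                        [0.0, 1.0, p["translate_y"] * height],
                        [0.0, 0.0, 1.0]])
    return trans_m @ rot_m @ shear_m @ scale_m

rng = np.random.default_rng(0)
params = sample_params(rng)
matrix = affine_matrix(params, width=1024, height=1024)
```

The same matrix would be applied to the input image, the ground truth and the valid-pixel mask so that all three stay aligned.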

Figure 9: Sample of Otsu method preprocessing. Left: original image with LED light spots. Middle: binary masks computed with the Otsu method. Right: enhanced image. All images have 8 channels, all displayed in pseudo RGB.

Another challenge is the bright LED spots found in our camera’s results. These light spots (reflections of the LED lights) are supposed to be ignored and filled in by the model. In order to help our U-Net model learn to do this, we implemented Otsu’s thresholding as an optional add-on to the data augmentation pipeline. The Otsu method searches for the threshold that minimizes the intra-class variance on both sides of the threshold. This method suits our case well, since our images have a cluster of very bright pixels (the light spots) and another cluster of dim pixels (the soil and root pixels). As shown in Fig.9, this method creates an accurate mask separating light spot pixels from useful data. Based on the mask, our framework then interpolates from the other bands of the same pixel that are not affected by the light spot and uses the interpolated value in place of the over-exposed value.
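The thresholding step can be sketched with a histogram-based implementation of Otsu’s method. The sketch maximizes the between-class variance, which is equivalent to minimizing the intra-class variance; the bin count, the function name `otsu_threshold` and the synthetic pixel values are assumptions, and the per-pixel interpolation across bands is not shown.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Return the threshold that minimizes intra-class variance (Otsu's method)."""
    hist, edges = np.histogram(np.ravel(values), bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    weights = hist / hist.sum()
    w0 = np.cumsum(weights)            # probability of the dim class
    w1 = 1.0 - w0                      # probability of the bright class
    cum_mean = np.cumsum(weights * centers)
    total_mean = cum_mean[-1]
    # Maximizing between-class variance is equivalent to minimizing
    # the intra-class variance.
    with np.errstate(divide="ignore", invalid="ignore"):
        between = (total_mean * w0 - cum_mean) ** 2 / (w0 * w1)
    between = np.nan_to_num(between, nan=0.0, posinf=0.0, neginf=0.0)
    # The threshold is the right edge of the last bin assigned to the dim class.
    return edges[np.argmax(between) + 1]

# Toy example: many dim soil/root pixels and a few saturated light-spot pixels.
rng = np.random.default_rng(0)
dim = rng.normal(0.2, 0.03, size=900)
bright = rng.normal(0.9, 0.03, size=100)
pixels = np.concatenate([dim, bright])
threshold = otsu_threshold(pixels)
mask = pixels > threshold  # True marks light-spot pixels
```

With two well-separated clusters as in this toy example, the threshold falls in the gap between them and the mask isolates the bright pixels.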

6.3 Training Process

Our processed dataset has 95 pairs of working samples, each consisting of a reference matrix of shape $286 \times 286 \times 299$ (the ground truth) and an input matrix of shape $1024 \times 1024 \times 8$. These 95 samples were divided into a training set of 85 samples and a test set of 10 samples. The list of samples used as the test set was randomly generated upon initialization of a training run. No manual selection was involved in splitting the training set and the test set. However, it is worth noting that there might be a minor overlap between nearby samples. The overlap is generally less than 10%, and our splitting algorithm does not account for it.

We used a combination of several different loss functions to train the model. The effective loss function is the weighted sum of the following functions:

  • Mean absolute error (MAE), also known as the $L_1$ distance.

    \[ L_1\left[\vec{x}, \vec{y}\right] = \sum_{n=1}^{N} \left|\vec{x}_n - \vec{y}_n\right| \]
  • Mean square error (MSE), also known as the $L_2$ distance.

    \[ L_2\left[\vec{x}, \vec{y}\right] = \sum_{n=1}^{N} \left(\vec{x}_n - \vec{y}_n\right)^2 \]
  • Delta-pixel error

    This is a custom loss function that measures how different a pixel is against its neighboring pixels (in both width and height direction). The total error is the mean of all pixels’ max absolute difference against their neighboring pixels. This loss function smooths out the noise introduced by the model.

  • Delta-bands error

    This is a custom loss function that measures how different a pixel of a band is against its neighboring bands. The total error is the mean of all bands’ max absolute difference against their neighboring bands. This loss function rewards smoothness of the band plot for each pixel.

It is worth mentioning that the weights for each component above were manually configured in each stage of training. In the pre-train phase, all of the above loss functions are used to derive the total loss: the delta-pixel and delta-bands errors each have a weight of 4, and the MSE and the MAE each have a weight of 1. In the main training phase, we only use the smooth L1 loss.
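The weighted pre-train loss can be sketched as follows, assuming a prediction cube of shape (bands, height, width). The neighborhood used for the delta-pixel error (the four directly adjacent pixels), the use of means rather than un-normalized sums for the MAE and MSE terms, and all function names are assumptions made for illustration; the actual training code operates on framework tensors rather than NumPy arrays.

```python
import numpy as np

def max_neighbor_diff(x, axes):
    """For every element, the largest absolute difference to an adjacent
    element along any of the given axes."""
    out = np.zeros_like(x, dtype=np.float64)
    for axis in axes:
        d = np.abs(np.diff(x, axis=axis))
        lo = [slice(None)] * x.ndim
        hi = [slice(None)] * x.ndim
        lo[axis] = slice(None, -1)
        hi[axis] = slice(1, None)
        out[tuple(lo)] = np.maximum(out[tuple(lo)], d)  # difference to the next element
        out[tuple(hi)] = np.maximum(out[tuple(hi)], d)  # difference to the previous element
    return out

def pretrain_loss(pred, target):
    """Weighted sum with the pre-train weights: 1 (MAE), 1 (MSE), 4, 4."""
    mae = np.mean(np.abs(pred - target))
    mse = np.mean((pred - target) ** 2)
    delta_pixel = max_neighbor_diff(pred, axes=(1, 2)).mean()  # spatial smoothness
    delta_bands = max_neighbor_diff(pred, axes=(0,)).mean()    # spectral smoothness
    return 1.0 * mae + 1.0 * mse + 4.0 * delta_pixel + 4.0 * delta_bands

# Toy example: 2 bands, 1 x 2 pixels, compared against an all-zero target.
pred = np.array([[[0.0, 1.0]], [[0.0, 3.0]]])
target = np.zeros_like(pred)
loss = pretrain_loss(pred, target)
```

For the toy input the four terms evaluate to 1.0, 2.5, 2.0 and 1.0, so the weighted total is 15.5; the two delta terms dominate, which reflects their larger weights.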

Table 3: Test Set Prediction Performance (Statistical)
Sampling Date    Box ID    Region    MAE Loss    MSE Loss
03/23/2023       77        F1        0.036009    0.002377
03/23/2023       77        C5        0.029735    0.001592
03/23/2023       69        F5        0.032207    0.001837
03/23/2023       77        F5        0.036013    0.002481
03/23/2023       77        F7        0.032229    0.001981
03/23/2023       69        C1        0.029626    0.001686
Figure 10: Samples of model prediction results, each with 4 manually selected sample points (A, B on plant root pixels; C, D on soil pixels). Samples are selected with minimum bias. The definition of "Prediction Error" is given in Sec.6.4.

The model was trained with a batch size of 5 for more than 1000 epochs in each stage.

In the first stage, the model was fed with ground truth as input (pre-train). The ground truth (299 bands) data-points were first projected into the 8 bands corresponding to our camera’s results, and then reshaped to mimic the dimensions of actual input data. This stage is designed to teach the network to learn to extend low dimension data into high dimensions without any interference. Random affine augmentation was enabled in this stage.

The model parameters from the first stage of training were then loaded back for the second (main) stage of training. In this stage, 3 out of the 5 samples in each batch were projected from the ground truth, as in the first stage, and the remaining 2 samples were loaded from raw samples captured by our camera. Affine augmentation was also enabled in this stage. This stage was added in order to make the network aware of LED light spots. With the help of the affine transformation, the spots can appear in any shape, size and form; they can also be completely absent, since 60% of the inputs used in this stage were converted directly from the ground truth. This stage helps the network gradually learn to deal with LED spots in our input dataset.

The mixture of ground truth inputs and raw inputs also ensured that the model "respects" the details of its input. In our previous trials, the model over-processed the input and removed the details entirely due to pixel shifts caused by alignment errors. This small misalignment confused the network and penalized the model whenever it generated sharp and detailed predictions. The misalignment can be seen at point B in sample (3) of Fig.10. In the ground truth, point B sits on the left side of the root, while in the raw input, point B sits on the right side. The mismatch can also be spotted in the prediction error, where the blue curve and the red curve concentrate on different sides of the sample point.

6.4 Training Result and Model Performance

Due to the high volume of data in both the spatial and spectral dimensions, it is not straightforward to demonstrate the overall performance with generalized metrics. Instead, we randomly picked a set of test samples and the spectra of selected pixels in these samples to demonstrate the training result (shown in Fig.10).

In the following analysis and the "Prediction Error" graphs shown in Fig.10, we used the spectral angular difference to quantify the prediction error. It is defined as:

\[ \text{Let}~~\vec{p}_1 = \text{IMG}_{\text{ground truth}}(x, y), \quad \text{and}~~\vec{p}_2 = \text{IMG}_{\text{prediction}}(x, y) \]
\[ E(x, y) = \arccos\left( \dfrac{\vec{p}_1 \cdot \vec{p}_2}{\sqrt{|\vec{p}_1|^2 \cdot |\vec{p}_2|^2}} \right) \]

This per-pixel error measure treats pixels as high-dimensional vectors (299 dimensions in our case), and calculates the angular difference at each spatial location $(x, y)$ between the ground truth $\vec{p}_1$ and the model prediction $\vec{p}_2$. In this way, the error compares the shape of the spectrum and ignores differences in pixel intensity.
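The error defined above can be sketched in a few lines, assuming image cubes laid out as (height, width, bands). The `eps` guard against zero-length spectra, the clipping of the cosine before `arccos`, and the function name `spectral_angle` are additions made for numerical safety and illustration.

```python
import numpy as np

def spectral_angle(ground_truth, prediction, eps=1e-12):
    """Per-pixel angle in radians between two cubes of shape (H, W, bands)."""
    dot = np.sum(ground_truth * prediction, axis=-1)
    norms = (np.linalg.norm(ground_truth, axis=-1)
             * np.linalg.norm(prediction, axis=-1))
    # Clip to the valid arccos domain to absorb floating point round-off.
    cosine = np.clip(dot / np.maximum(norms, eps), -1.0, 1.0)
    return np.arccos(cosine)

# Toy example with 3 bands: the first pixel differs only in intensity,
# the second pixel has a completely different spectral shape.
gt = np.array([[[1.0, 2.0, 3.0], [1.0, 0.0, 0.0]]])
pred = np.array([[[2.0, 4.0, 6.0], [0.0, 1.0, 0.0]]])
angles = spectral_angle(gt, pred)
normalized = angles / (np.pi / 2)  # scales the largest angle between non-negative spectra to 1
```

The first pixel yields an angle of (numerically) zero even though its intensity doubled, while the second pixel, whose spectra are orthogonal, yields the maximum normalized error of 1.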

As shown in the figure, the reconstruction model was able to bring out the details of the spectrum of a root pixel, especially the curved region near 700 nm. The model also did well in reconstructing average soil pixels, whose spectra are expected to resemble a Gaussian distribution because soil is composed of a mixture of many different ingredients.

With the help of the data processing and training techniques described in Sec.6.3, our model performed well in handling light spots. Although the output is relatively blurry in the spatial dimensions, a significant amount of plant root features were preserved in the spectral dimension and showed high relevance to the ground truth.

Figure 11: Statistical model prediction error. Error bars for root and soil pixels are shown separately for each individual sample. Angular error was divided by a factor of $\frac{\pi}{2}$, normalizing the max possible error to 1.0.

In addition to the above qualitative analysis, thanks to the segmentation model [5] provided by Chang et al., the prediction error was categorized into plant root pixels and soil pixels. As shown in Fig.11, the prediction errors derived from 4 prediction images are each divided into root pixels and soil pixels. The figure indicates that our model performed slightly better when handling plant root pixels.

6.4.1 Limitations

As shown in Sec.6.4, our model did not perform as well when handling soil pixels. This is partially caused by the difficulty of accurately matching the high spatial frequency details of fine-grained soil pixels (see Fig.6(d)). In addition to the error introduced by template matching algorithms, we also noticed that some soil grains did not stay in the same location before and after we retrieved the rhizobox from the cabinet for the reference HSI camera. Although done carefully, the transportation inevitably caused some soil grains to move around, causing a mismatch between the ground truth and our camera’s data. This might also be a significant contributing factor to the blurry output of our model.

Although the model showed some capability to provide extended spectral information based on its 8-band input, it is worth mentioning that we do not expect the same model to work as well for unknown plants or unknown variants. The reason we are able to reconstruct the entire spectrum from only 8 bands is that the network learns the types and states of a pixel in 8 different bands and internally maps them back to a known state of a high-resolution spectral distribution. This process is purely based on the previous "knowledge" stored inside the model. We do not expect it to transfer to unknown plant species or different soil types.

7 Conclusion

In this work, we demonstrated the feasibility of a cost-efficient approach to building a compact hyper-spectral camera. We performed experiments on spectrum reconstruction from a reduced spectral image back into a full-sized hyper-spectral image. The reconstruction showed higher relevance for plant root pixels, but did not perform as well on soil pixels. Such a result indicates that additional, useful data can indeed be obtained by our active illumination camera setup.

As a next step, we will capture data from real minirhizotrons and validate our model’s ability to transfer its knowledge to those data. This is expected to be more challenging because the plants might be of a different species, and the soil might be drastically different from that in the rhizoboxes from which we obtained the training data.

Funding. Placeholder: will be replaced in submission build.

Disclosures. The authors declare no conflicts of interest.

Data availability. Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

  • [1] J. M. Amigo, H. Babamoradi, and S. Elcoroaristizabal, “Hyperspectral image analysis. A tutorial,” Analytica Chimica Acta 896, 34–51 (2015).
  • [2] P. Mishra, M. S. M. Asaari, A. Herrero-Langreo, et al., “Close range hyperspectral imaging of plants: A review,” Biosystems Engineering 164, 49–67 (2017).
  • [3] J. Behmann, J. Steinrücken, and L. Plümer, “Detection of early plant stress responses in hyperspectral images,” ISPRS Journal of Photogrammetry and Remote Sensing 93, 98–111 (2014).
  • [4] J. M. Amigo, I. Martí, and A. Gowen, “Hyperspectral imaging and chemometrics: a perfect combination for the analysis of food structure, composition and quality,” in Data handling in science and technology, vol. 28 (Elsevier, 2013), pp. 343–370.
  • [5] S. J. Chang, R. Chowdhry, Y. Song, et al., “HyperPRI: A Dataset of Hyperspectral Images for Underground Plant Root Study,” bioRxiv (2023).
  • [6] B. Rewald and J. Ephrath, Minirhizotron techniques (CRC Press, 2013), pp. 735–750.
  • [7] A. Sharma, P. Saini, P. Saini, et al., “Root system architecture in cereals: exploring different perspectives of the hidden half,” Rev. Bras. Bot. (2024).
  • [8] F. Postic, K. Beauchêne, D. Gouache, and C. Doussan, “Scanner-based minirhizotrons help to highlight relations between deep roots and yield in various wheat cultivars under combined water and nitrogen deficit conditions,” Agronomy (Basel) 9, 297 (2019).
  • [9] M. G. Johnson, D. T. Tingey, D. L. Phillips, and M. J. Storm, “Advancing fine root research with minirhizotrons,” Environ. Exp. Bot. 45, 263–289 (2001).
  • [10] A. B. Rajurkar, S. M. McCoy, J. Ruhter, et al., “Installation and imaging of thousands of minirhizotrons to phenotype root systems of field-grown plants,” Plant Methods 18, 39 (2022).
  • [11] M. Liedgens and W. Richner, “Minirhizotron observations of the spatial distribution of the maize root system,” Agron. J. 93, 1097–1104 (2001).
  • [12] J. Chen, L. Liu, Z. Wang, et al., “Determining the effects of nitrogen rate on cotton root growth and distribution with soil cores and minirhizotrons,” PLoS One 13, e0197284 (2018).
  • [13] M. Ahmad, “A Fast 3D CNN for Hyperspectral Image Classification,” IEEE Geoscience and Remote Sensing Letters 19, 1–5 (2022).
  • [14] X. Hu, W. Yang, H. Wen, et al., “A Lightweight 1-D Convolution Augmented Transformer with Metric Learning for Hyperspectral Image Classification,” Sensors (Basel, Switzerland) 21, 1751 (2021).
  • [15] Y. Chen, Z. Lin, X. Zhao, et al., “Deep Learning-Based Classification of Hyperspectral Data,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 7, 2094–2107 (2014).
  • [16] Y. Chen, N. M. Nasrabadi, and T. D. Tran, “Hyperspectral Image Classification Using Dictionary-Based Sparse Representation,” IEEE Transactions on Geoscience and Remote Sensing 49, 3973–3985 (2011).
  • [17] S. A. El_Rahman, “Hyperspectral Image Classification Using Unsupervised Algorithms,” International Journal of Advanced Computer Science and Applications (IJACSA) 7(4) (2016).
  • [18] Z. Gao, Y. Shao, G. Xuan, et al., “Real-time hyperspectral imaging for the in-field estimation of strawberry ripeness with deep learning,” Artificial Intelligence in Agriculture 4, 31–38 (2020).
  • [19] L. A. Varga, J. Makowski, and A. Zell, “Measuring the Ripeness of Fruit with Hyperspectral Imaging and Deep Learning,” arXiv:2104.09808 [cs] (2021).
  • [20] S. Zou, Y.-C. Tseng, A. Zare, et al., “Peanut maturity classification using hyperspectral imagery,” Biosystems Engineering 188, 165–177 (2019).
  • [21] C. Nguyen, V. Sagan, M. Maimaitiyiming, et al., “Early Detection of Plant Viral Disease Using Hyperspectral Imaging and Deep Learning,” Sensors (Basel, Switzerland) 21, 742 (2021).
  • [22] V. Aredo, L. Velásquez, J. Carranza-Cabrera, and R. Siche, “Predicting of the Quality Attributes of Orange Fruit Using Hyperspectral Images,” Journal of Food Quality and Hazards Control (2019).
  • [23] J. Behmann, J. Steinrücken, and L. Plümer, “Detection of early plant stress responses in hyperspectral images,” ISPRS Journal of Photogrammetry and Remote Sensing 93, 98–111 (2014).
  • [24] Z. Khan, F. Shafait, and A. Mian, “Joint Group Sparse PCA for Compressed Hyperspectral Imaging,” IEEE Transactions on Image Processing 24, 4934–4942 (2015).
  • [25] W. Li, C. Chen, H. Su, and Q. Du, “Local Binary Patterns and Extreme Learning Machine for Hyperspectral Imagery Classification,” IEEE Transactions on Geoscience and Remote Sensing 53, 3681–3693 (2015).
  • [26] Y. Yuan, X. Zheng, and X. Lu, “Spectral–Spatial Kernel Regularized for Hyperspectral Image Denoising,” IEEE Transactions on Geoscience and Remote Sensing 53, 3815–3832 (2015).
  • [27] M. Uzair, A. Mahmood, and A. Mian, “Hyperspectral Face Recognition With Spatiospectral Information Fusion and PLS Regression,” IEEE Transactions on Image Processing 24, 1127–1137 (2015).
  • [28] W. Tang, Z. Shi, Y. Wu, and C. Zhang, “Sparse Unmixing of Hyperspectral Data Using Spectral A Priori Information,” IEEE Transactions on Geoscience and Remote Sensing 53, 770–783 (2015).
  • [29] X. Lu, Y. Yuan, and X. Zheng, “Joint Dictionary Learning for Multispectral Change Detection,” IEEE Transactions on Cybernetics 47, 884–897 (2017).
  • [30] A. Sellami, M. Farah, I. R. Farah, and B. Solaiman, “Hyperspectral imagery semantic interpretation based on adaptive constrained band selection and knowledge extraction techniques,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 11, 1337–1347 (2018).
  • [31] W. Sun and Q. Du, “Hyperspectral Band Selection: A Review,” IEEE Geoscience and Remote Sensing Magazine 7, 118–139 (2019).
  • [32] Z. Zheng, Y. Liu, M. He, et al., “Effective band selection of hyperspectral image by an attention mechanism-based convolutional network,” RSC Advances 12, 8750–8759 (2022).
  • [33] Y. Yuan, X. Zheng, and X. Lu, “Discovering diverse subset for unsupervised hyperspectral band selection,” IEEE Transactions on Image Processing 26, 51–64 (2016).
  • [34] Q. Wang, Z. Yuan, Q. Du, and X. Li, “GETNET: A General End-to-End 2-D CNN Framework for Hyperspectral Image Change Detection,” IEEE Transactions on Geoscience and Remote Sensing 57, 3–13 (2019).
  • [35] Y. Wei, X. Zhu, C. Li, et al., “Applications of hyperspectral remote sensing in ground object identification and classification,” Advances in Remote Sensing 6, 201–211 (2017).
  • [36] K. S. He, D. Rocchini, M. Neteler, and H. Nagendra, “Benefits of hyperspectral remote sensing for tracking plant invasions,” Diversity and Distributions 17, 381–392 (2011).
  • [37] F. D. Van der Meer, H. M. Van der Werff, F. J. Van Ruitenbeek, et al., “Multi-and hyperspectral geologic remote sensing: A review,” International Journal of Applied Earth Observation and Geoinformation 14, 112–128 (2012).
  • [38] B. Luo, C. Yang, J. Chanussot, and L. Zhang, “Crop yield estimation based on unsupervised linear unmixing of multidate hyperspectral imagery,” IEEE Transactions on Geoscience and Remote Sensing 51, 162–173 (2012).
  • [39] H. Akbari, Y. Kosugi, K. Kojima, and N. Tanaka, “Detection and analysis of the intestinal ischemia using visible and invisible hyperspectral imaging,” IEEE Transactions on Biomedical Engineering 57, 2011–2017 (2010).
  • [40] S. Liu, D. Marinelli, L. Bruzzone, and F. Bovolo, “A review of change detection in multitemporal hyperspectral images: Current techniques, applications, and challenges,” IEEE Geoscience and Remote Sensing Magazine 7, 140–158 (2019).
  • [41] X. Zheng, T. Gong, X. Li, and X. Lu, “Generalized scene classification from small-scale datasets with multitask learning,” IEEE Transactions on Geoscience and Remote Sensing 60, 1–11 (2021).
  • [42] F. Xie, F. Li, C. Lei, et al., “Unsupervised band selection based on artificial bee colony algorithm for hyperspectral image classification,” Applied Soft Computing 75, 428–440 (2019).
  • [43] Q. Zhang, Q. Yuan, J. Li, et al., “Deep spatio-spectral Bayesian posterior for hyperspectral image non-iid noise removal,” ISPRS Journal of Photogrammetry and Remote Sensing 164, 125–137 (2020).
  • [44] H. Yang, Q. Du, and G. Chen, “Unsupervised hyperspectral band selection using graphics processing units,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 4, 660–668 (2011).
  • [45] X. Zheng, H. Sun, X. Lu, and W. Xie, “Rotation-invariant attention network for hyperspectral image classification,” IEEE Transactions on Image Processing 31, 4251–4265 (2022).
  • [46] Q. Wang, Q. Li, and X. Li, “Hyperspectral band selection via adaptive subspace partition strategy,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 12, 4940–4950 (2019).
  • [47] S. Sawant and P. Manoharan, “Hyperspectral band selection based on metaheuristic optimization approach,” Infrared Physics & Technology 107, 103295 (2020).
  • [48] H. Yang, Q. Du, H. Su, and Y. Sheng, “An efficient method for supervised hyperspectral band selection,” IEEE Geoscience and Remote Sensing Letters 8, 138–142 (2010).
  • [49] S. S. Sawant and P. Manoharan, “Unsupervised band selection based on weighted information entropy and 3D discrete cosine transform for hyperspectral image classification,” International Journal of Remote Sensing 41, 3948–3969 (2020).
  • [50] S. Jia, Z. Ji, Y. Qian, and L. Shen, “Unsupervised band selection for hyperspectral imagery classification without manual band removal,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 5, 531–543 (2012).
  • [51] A. Ghorbanian, Y. Maghsoudi, and A. Mohammadzadeh, “Clustering-Based Band Selection Using Structural Similarity Index and Entropy for Hyperspectral Image Classification,” Traitement du Signal 37, 785–791 (2020).
  • [52] A. Martínez-Usó, F. Pla, J. M. Sotoca, and P. García-Sevilla, “Clustering-based hyperspectral band selection using information measures,” IEEE Transactions on Geoscience and Remote Sensing 45, 4158–4171 (2007).
  • [53] M. Song, X. Shang, Y. Wang, et al., “Class information-based band selection for hyperspectral image classification,” IEEE Transactions on Geoscience and Remote Sensing 57, 8394–8416 (2019).
  • [54] C.-I. Chang and S. Wang, “Constrained band selection for hyperspectral imagery,” IEEE Transactions on Geoscience and Remote Sensing 44, 1575–1585 (2006).
  • [55] J. M. Haut, M. E. Paoletti, J. Plaza, et al., “Visual attention-driven hyperspectral image classification,” IEEE Transactions on Geoscience and Remote Sensing 57, 8065–8080 (2019).
  • [56] K. Z. Mao, “Orthogonal forward selection and backward elimination algorithms for feature subset selection,” IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) 34, 629–634 (2004).
  • [57] Y. Yuan, G. Zhu, and Q. Wang, “Hyperspectral band selection by multitask sparsity pursuit,” IEEE Transactions on Geoscience and Remote Sensing 53, 631–644 (2014).
  • [58] B. Tu, X. Liao, C. Zhou, et al., “Feature extraction using multitask superpixel auxiliary learning for hyperspectral classification,” IEEE Transactions on Instrumentation and Measurement 70, 1–16 (2021).
  • [59] D. Lupu, I. Necoara, J. L. Garrett, and T. A. Johansen, “Stochastic Higher-Order Independent Component Analysis for Hyperspectral Dimensionality Reduction,” IEEE Transactions on Computational Imaging 8, 1184–1194 (2022).
  • [60] Z. Wang, S. Liang, L. Xu, et al., “Dimensionality reduction method for hyperspectral image analysis based on rough set theory,” European Journal of Remote Sensing 53, 192–200 (2020).
  • [61] H. Zhai, H. Zhang, L. Zhang, and P. Li, “Laplacian-regularized low-rank subspace clustering for hyperspectral image band selection,” IEEE Transactions on Geoscience and Remote Sensing 57, 1723–1740 (2018).
  • [62] S. Jia, G. Tang, J. Zhu, and Q. Li, “A novel ranking-based clustering approach for hyperspectral band selection,” IEEE Transactions on Geoscience and Remote Sensing 54, 88–102 (2015).
  • [63] J. Feng, L. Jiao, T. Sun, et al., “Multiple kernel learning based on discriminative kernel clustering for hyperspectral band selection,” IEEE Transactions on Geoscience and Remote Sensing 54, 6516–6530 (2016).
  • [64] Y. Yuan, J. Lin, and Q. Wang, “Dual-clustering-based hyperspectral band selection by contextual analysis,” IEEE Transactions on Geoscience and Remote Sensing 54, 1431–1445 (2015).
  • [65] W. Zhang, X. Li, Y. Dou, and L. Zhao, “A geometry-based band selection approach for hyperspectral image analysis,” IEEE Transactions on Geoscience and Remote Sensing 56, 4318–4333 (2018).
  • [66] H. Sun, K. Zheng, M. Liu, et al., “Hyperspectral Image Mixed Noise Removal Using a Subspace Projection Attention and Residual Channel Attention Network,” Remote Sensing 14, 2071 (2022).
  • [67] Q. Wang, X. He, and X. Li, “Locality and structure regularized low rank representation for hyperspectral image classification,” IEEE Transactions on Geoscience and Remote Sensing 57, 911–923 (2018).
  • [68] H. Sun, L. Zhang, J. Ren, and H. Huang, “Novel hyperbolic clustering-based band hierarchy (HCBH) for effective unsupervised band selection of hyperspectral images,” Pattern Recognition 130, 108788 (2022).
  • [69] X. Geng, K. Sun, L. Ji, and Y. Zhao, “A fast volume-gradient-based band selection method for hyperspectral image,” IEEE Transactions on Geoscience and Remote Sensing 52, 7111–7119 (2014).
  • [70] Q. Wang, J. Lin, and Y. Yuan, “Salient band selection for hyperspectral image classification via manifold ranking,” IEEE Transactions on Neural Networks and Learning Systems 27, 1279–1289 (2016).
  • [71] B. Gao, A. Lu, Y. Pan, et al., “Additional sampling layout optimization method for environmental quality grade classifications of farmland soil,” IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 10, 5350–5358 (2017).
  • [72] K. Sun, X. Geng, and L. Ji, “Exemplar component analysis: A fast band selection method for hyperspectral imagery,” IEEE Geoscience and Remote Sensing Letters 12, 998–1002 (2014).
  • [73] G. Rahman, H. Sohag, R. Chowdhury, et al., “SoilCam: A fully automated minirhizotron using multispectral imaging for root activity monitoring,” Sensors (Basel) 20, 787 (2020).
  • [74] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional Networks for Biomedical Image Segmentation,” in Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015, N. Navab, J. Hornegger, W. M. Wells, and A. F. Frangi, eds. (Springer International Publishing, Cham, 2015), pp. 234–241.