1. Introduction
The monitoring of plants during the growing season is the basis of precision agriculture. With the support of quantity and quality information on plants (i.e., crop parameters), farmers can plan the crop management and input use (for example, nutrient application and crop protection) in a controlled way. Biomass is the most common crop parameter indicating the amount of the yield [
1]; and together with nitrogen content information, it can be used to determine the need for additional nitrogen fertilization. When farm inputs are correctly aligned, both the environment and the farmer benefit by following the principle of sustainable intensification [
2].
Remote sensing has provided tools for precision agriculture since the 1980s [
3]. More recently, drones (also known as Unmanned Aerial Vehicles (UAV) or Remotely Piloted Aircraft Systems (RPAS)) have developed rapidly, offering new alternatives to traditional remote sensing technologies [
1,
4]. Remote sensing instruments that collect spectral reflectance measurements have typically been operated from satellites and aircraft to estimate crop parameters. Due to technological innovations, lightweight multi- and hyper-spectral sensors have become available in recent years. These sensors can be carried by small UAVs that offer novel remote sensing tools for precision agriculture. One type of lightweight hyperspectral sensor is based on the Fabry-Pérot interferometer (FPI) technique [
5,
6,
7,
8], and this was used in this study. This technology provides spectral data cubes with a frame format. The FPI sensor has already shown potential in various environmental mapping applications [
7,
9,
10,
11,
12,
13,
14,
15,
16,
17,
18,
19,
20]. In addition to spectra, data about the 3D structure of plants can be collected at the same time because the frame-based sensors and modern photogrammetry enable the generation of spectral Digital Surface Models (DSM) [
21,
22]. The use of drone-based photogrammetric 3D data has already provided promising results in biomass estimation, but combining the 3D and spectral reflectance data has further improved the estimation results [
23,
24,
25].
A large number of studies regarding crop parameter estimation using remote sensing technologies have been published during the last decades. The vast majority of them have been conducted using spectral information captured from satellite or manned aircraft platforms. Since laser scanning became widespread, 3D information on plant height and structure became available for crop parameter estimation. Terrestrial approaches have mostly been used thus far [
26,
27,
28] due to the requirements of high spatial resolution and the relatively large weight of high-performance systems. The fast development of drone technology and photogrammetry, especially structure from motion (SFM) technologies, has made 3D data collection more efficient, flexible and low in cost. Not surprisingly, photogrammetric 3D data from drones have come under scrutiny for precision agriculture applications [
16,
25,
29,
30,
31,
32]. Instead of 3D data, various studies have exploited Vegetation Indices (VI) derived from multispectral [
33,
34,
35,
36,
37] or hyperspectral data [
21,
38,
39]. However, only a few studies have integrated UAV-based spectral and 3D information for crop parameter estimation. Yue et al. [
24] combined spectral and crop height information from a Cubert UHD 180 hyperspectral sensor (Cubert GmbH, Ulm, Germany) to estimate the biomass of winter wheat. They concluded that combining the crop height information with two-band VIs improved the estimation results. However, they suggested that the accuracy of their estimations could be improved by utilizing full spectra, more advanced estimation methods, and ground control points (in the georeferencing process to improve geometric accuracy). In the study by Bendig et al. [
23], photogrammetric 3D data were combined with spectrometer measurements from the ground. Ground-based approaches combining spectral and 3D data have also been reported [
28,
40,
41]. Completely drone-based approaches were investigated by Geipel et al. [
37], Schirrmann et al. [
42] and Li et al. [
32] for crop parameter estimation based on RGB point clouds with uncalibrated spectral data. The study by Li et al. [
32] showed that point cloud metrics other than the mean height of the crop are also relevant information for biomass modelling.
In the vast majority of biomass estimation studies, estimators such as linear models and nearest neighbour approaches have been applied [
43]. In particular, drone-based crop parameter estimation studies have been performed mostly by regression techniques using a few features and linear models [
4,
21,
23,
28,
37] or using the nearest neighbour technique [
7,
14]. Thus, the use of estimators that are able to exploit the full spectra, such as the Random Forest (RF), has been suggested in UAV-based crop parameter estimation [
21,
25]. Since the publication of the RF technique [
44], it has received increasing attention in remote sensing applications [
45]. The main advantages of the RF over many other methods include high prediction accuracy, the possibility to integrate various features in the estimation process, no need for feature selection (because calculations include measures of feature importance order), and lower sensitivity to overfitting and to parameter selection [
45,
46,
47]. In biomass estimation, RF has shown competitive accuracy among other estimation methods applied in forestry [
43,
48] and in agricultural [
32,
49,
50,
51] applications. Only some studies have used RF in crop parameter estimations. Liu et al. [
50] used RF to estimate the nitrogen level of wheat using multispectral data. Li et al. [
32] and Yue et al. [
51] successfully used RF for estimating the biomass of maize and winter wheat, respectively. Previously, Viljanen et al. [
5] used RF for the fresh and dry matter biomass estimation of grass silage, using 3D and multispectral features. Existing studies have focused more on biomass estimation than on nitrogen content estimation. In particular, studies on the use of hyperspectral data in nitrogen estimation have commonly used terrestrial approaches (e.g., [
52,
53,
54]).
The objective of this investigation was to develop and assess a novel optimized workflow based on the RF algorithm for estimating crop parameters employing both spectral and 3D features. Hyperspectral and photogrammetric imagery was collected using the FPI camera and a regular consumer RGB camera. This study employed the full hyperspectral and structural information for the biomass and nitrogen content estimation of malt barley and grass silage utilizing datasets captured using a drone and aircraft. We also evaluated the impact of the radiometric processing level on the results. This paper extends our previous work [
55], which performed a preliminary study with the barley data using linear regression techniques. The major contributions of this study were the development and assessment of the integrated use of spectral and 3D features in the crop parameter estimation in different conditions, the comparison of RGB and hyperspectral imaging based remote sensing techniques and the consideration of impacts of various parameters, especially the flying height and the radiometric processing level on the results.
4. Discussion
We developed and assessed a machine learning technique integrating 3D and spectral features for the estimation of fresh and dry matter yield (FY, DMY), nitrogen amount and nitrogen percentage (N%) of malt barley crop and grass silage fields. Our approach was to extract a variety of remote sensing features from the datasets that were collected using RGB and imaging hyperspectral cameras. The features included 3D features from the canopy height model (CHM) and spectral features, namely spectra and various vegetation indices (VI), from the orthomosaics. Furthermore, we investigated the impact of the radiometric correction and flying height on the estimation results. Our main estimator was the Random Forest (RF), but the results of the Simple Linear Regression (SLR) estimator were also calculated to validate the performance of the RF.
The best estimation results for the barley biomass and nitrogen content estimations were obtained by combining features from the FPI and RGB cameras. In most cases, the spectral features from the FPI camera provided the most or nearly the most accurate results. Adding the FPI camera 3D features did not improve the results, which was an expected result since FPI based CHM did not have high quality (
Table 5,
Figure 2d) due to relatively large GSD of 0.14 m. The data from the RGB camera provided good estimation results—typically almost as good as the FPI camera and in some cases the best results. We could also observe that the combination of RGB spectral and 3D features improved the estimation accuracy, especially in the case of biomass estimation. The RF performed well with various features and combinations and provided in most cases better results than the SLR, but some exceptions also appeared (
Appendix A;
Table A1,
Table A2 and
Table A3). In particular, when only a limited number of features from one sensor (‘RGB 3D’ and ‘RGB spe’) was used, the SLR yielded competitive or even better estimation results than the RF, but when the number and variety of features was high, the RF regularly provided better estimation results than the SLR. This behaviour is logical: with a small number of features there is no great difference between the SLR and RF models, but with a large number of features the SLR still uses only a single feature in the estimation, whereas the RF can take advantage of various features during model building. A similar observation was also made by Li et al. [
32], where the dry biomass of maize was estimated; they obtained an R² of 0.52 and an RMSE% of 18.8% with SLR and an R² of 0.78 and an RMSE% of 16.7% with the RF. The RF thus provided more accurate estimation results. They also concluded that photogrammetric 3D features strongly contributed to the estimation models, in addition to the spectral features from the RGB camera. They suggested that hyperspectral data could improve the estimation results, and our study showed that this was a valid assumption in many situations. Yue et al. [
51] compared eight different regression techniques for the winter wheat biomass estimation, using near-surface spectroscopy and achieved R² values of 0.79–0.89. They concluded that machine learning techniques such as RF were less sensitive to noise than conventional regression techniques.
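The single-feature limitation of the SLR discussed above can be illustrated with a minimal sketch. It assumes that the SLR predictor is chosen as the single feature with the lowest training RMSE, which the text does not specify; the feature names and sample values are hypothetical and are not taken from the study data.

```python
from statistics import mean


def fit_slr(x, y):
    """Ordinary least squares fit y = a + b * x for one feature."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def rmse(y_true, y_pred):
    """Root mean square error between two equally long sequences."""
    n = len(y_true)
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n) ** 0.5


def best_single_feature(features, y):
    """Return (name, (a, b), training RMSE) of the best one-feature model.

    Assumption: the feature is selected by lowest training RMSE.
    """
    best = None
    for name, x in features.items():
        a, b = fit_slr(x, y)
        err = rmse(y, [a + b * xi for xi in x])
        if best is None or err < best[2]:
            best = (name, (a, b), err)
    return best


# Hypothetical plot-level features and reference biomass values.
features = {
    "chm_mean": [0.2, 0.4, 0.6, 0.8, 1.0],
    "ndvi": [0.30, 0.55, 0.50, 0.70, 0.65],
}
biomass = [1.0, 2.0, 3.0, 4.0, 5.0]

name, (a, b), err = best_single_feature(features, biomass)
print(name, round(a, 3), round(b, 3), round(err, 3))
```

However many candidate features are supplied, the resulting model still depends on one of them only, whereas an ensemble method such as the RF can draw on several features at once.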
In the biomass estimations of barley, the PCC and RMSE% were at best 0.95 and 33.2%, respectively, for the DMY, and 0.97 and 31.0%, respectively, for the FY. The corresponding statistics for the grass dataset with the 140 m flying height were 0.79 and 1.9% for the DMY, and 0.64 and 4.3% for the FY, and for the dataset with the flying height of 50 m, the results were on the same level. Concerning the impacts of different features used in the estimations of barley DMY, the inclusion of the 3D features from the RGB camera in addition to the spectral features from the FPI camera improved the RMSEs by 14.7% for uncalibrated FPI, and 7.95% for calibrated FPI. The results were similar for the barley FY. A possible explanation is that the estimation accuracy was already close to the best achievable with the calibrated spectral features, so the 3D features could provide little further improvement, whereas with the uncalibrated spectral features they still improved the accuracy significantly. Inclusion of the 3D features based on the FPI camera did not improve the accuracy with either uncalibrated or calibrated data. The reason for this was the insufficient quality of the height data from the FPI camera, which therefore could not provide quantitative information on the differences between samples to the estimation process. Considering the RGB sensor, the 3D features improved the RMSE% in the DMY and FY estimation by 12.18% and 8.6%, respectively, for barley. The corresponding improvements were 24.2% for the grass DMY and 8.1% for the FY. In the study by Bendig et al. [
23], adding the height features with the spectral indices either did not improve or only slightly improved the estimation accuracy of barley biomass when using multilinear regression models. In the study by Yue et al. [
24], the correlation between the winter wheat dry biomass and the partial least squares regression (PLS) model based on spectral features was improved from 0.53 to 0.74 and the RMSE from 1.69 to 1.20 t/ha when 3D features were included. These results are comparable to our results for the barley DMY. In studies with spectrally uncalibrated RGB values and 3D features, R² values of 0.74 have been reported for the corn grain yield estimation [
37] and 0.88 for the maize biomass estimation [
32], which indicates lower correlations than our results using the RGB data for the barley FY estimation (PCC = 0.95, RMSE% = 34.74%).
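For reference, the two accuracy statistics quoted above can be computed as in the following sketch. It assumes that the RMSE% is the RMSE normalized by the mean of the reference values, a common convention that is not defined in this section; the function names and sample values are hypothetical.

```python
from math import sqrt
from statistics import mean


def pcc(obs, est):
    """Pearson correlation coefficient (unitless, between -1 and 1)."""
    mo, me = mean(obs), mean(est)
    cov = sum((o - mo) * (e - me) for o, e in zip(obs, est))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_e = sum((e - me) ** 2 for e in est)
    return cov / sqrt(var_o * var_e)


def rmse_percent(obs, est):
    """RMSE as a percentage of the mean reference value.

    Assumption: normalization by the mean of the observations.
    """
    rmse = sqrt(sum((o - e) ** 2 for o, e in zip(obs, est)) / len(obs))
    return 100.0 * rmse / mean(obs)


# Hypothetical reference and estimated yields (e.g., t/ha).
reference = [2.0, 4.0, 6.0, 8.0]
estimated = [3.0, 5.0, 7.0, 9.0]

print(pcc(reference, estimated))           # correlation ignores the constant bias
print(rmse_percent(reference, estimated))  # the bias shows up in the RMSE%
```

The example shows why a high PCC can coexist with a relatively large RMSE%: a systematic offset between estimates and references leaves the correlation unchanged but increases the error.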
In the nitrogen estimations for barley, the PCC and RMSE% were at best 0.966 and 21.6%, respectively, for the nitrogen amount and 0.919 and 34.4%, respectively, for the N%. Concerning the impacts of different features used in the estimations, the inclusion of the RGB camera 3D features with the spectral features of the FPI camera improved the RMSEs (0–30%), which indicated that the 3D features provided additional information to the estimation model. Combining the 3D features with the RGB spectral features also improved the estimation accuracy of the nitrogen amount and the N% by 12.5% and 23.6%, respectively, providing accuracy similar to that of the FPI-based spectral features. It is worth noting that even though the variation of N% in the sample references was not high (
Table 1), good accuracies were achieved. The variation in the nitrogen amount was mainly related to variation in the biomass amount, which explains the similar estimation accuracies of the two quantities. Liu et al. [
50] used several different algorithms to estimate the nitrogen content (N%) of winter wheat based on multispectral data and achieved the best results with an R² of 0.79 and an RMSE% of 11.56% with the RF. Geipel et al. [
37] used SLR models based on a multispectral sensor to estimate the N content and achieved accuracies with an R² of 0.58–0.89 and an RMSE% of 7.6–11.7%. Schirrmann et al. [
42] achieved at best an R² value of 0.65 between the nitrogen content and the principal components of the RGB image. Our results were on the same level as those of Liu et al. [
50], Geipel et al. [
37] and Schirrmann et al. [
42]; but with terrestrial approaches, even higher accuracies have been achieved [
52]. Furthermore, data from a tractor-mounted Yara N-sensor have shown good correlations (R² of 0.80) with N-uptake in grass sward [
54]. However, it is important to note that the estimation accuracies of different studies are not directly comparable because they are also affected by the properties of the crop sample data, such as the variation in their values.
When comparing the estimation accuracy with the spectral features only from the FPI and RGB cameras, the FPI camera provided 15.4% and 18.5% better RMSEs than the RGB camera for the barley DMY and FY, respectively, but up to 16.5% and 14.4% worse RMSEs than the RGB camera for the grass DMY and FY, respectively. Better performance of the FPI camera was expected since the hyperspectral images provide more spectral information than the RGB images. The challenges with the grass study were the small number of samples and the small variation in the biomass amount, and therefore the grass results should be considered as indicative. In the estimation of the nitrogen, the FPI camera outperformed the RGB camera by 25.0% for the barley nitrogen amount and by 21.1% for the barley N%. The nitrogen content of plants is relatively small (
Table 1), so it is expected to affect the spectra only slightly. Consequently, the FPI camera provides higher accuracy than the RGB camera, because it captures more spectral information.
In most cases, the radiometric calibration of the datasets using the radiometric block adjustment improved the estimation results. In the case of barley parameter estimations with all features, the radiometric correction improved the RMSE by 17.0% for the DMY, 20.3% for the FY and 25.0% for the nitrogen amount. In the case of the grass estimation, the impact was smaller: the correction either slightly worsened or improved the RMSE, with changes of −6.3% to 3.6% for the DMY and 0% to 2.4% for the FY. The improvement was largest in the datasets having many flight lines (
Table 2). The effect was the most noticeable in the ‘Barley UAV 140 m’ dataset, which was collected during 4.5 h, when illumination changed significantly, and in the ‘Grass UAV 50 m’ dataset, which was collected during sunny conditions at a low flying height that caused remarkable anisotropy effects (
Figure 5). Multiple studies have shown that radiometric correction using the RBA method improved the uniformity of image orthomosaics [
7,
11,
12,
63,
77]. Our results showed that the corrections also improved the accuracy of the crop parameter estimations.
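The percentage improvements reported in this discussion can be read as relative changes in the RMSE. The sketch below assumes the improvement is expressed relative to the RMSE before the change (here, before radiometric correction); this formula is an assumption, since the section does not state how the percentages were derived, and the input values are hypothetical.

```python
def relative_rmse_improvement(rmse_before, rmse_after):
    """Relative RMSE improvement in percent.

    Positive values mean the error decreased; negative values mean
    the error increased. Assumption: normalized by the RMSE before
    the change.
    """
    return 100.0 * (rmse_before - rmse_after) / rmse_before


# Hypothetical RMSE values before and after a correction step.
improved = relative_rmse_improvement(50.0, 41.5)
worsened = relative_rmse_improvement(40.0, 42.0)

print(round(improved, 1))
print(round(worsened, 1))
```

Under this convention, a negative value such as the lower bound reported for the grass DMY corresponds to a case where the correction slightly increased the error.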
The barley datasets were collected from the UAV and aircraft using various flying heights, which provided different GSDs. In the case of the RGB camera, the GSDs were 0.05 and 0.10 m, and the estimation results were similar when spectral features were applied. However, the flying height and GSD had a significant impact on the accuracy of the 3D features, which we could already deduce based on the CHM quality (
Table 5,
Figure 2 and
Figure 3). The most reliable CHM was obtained using data with the smallest GSD, ‘Barley UAV 140 m RGB’, where the correlation between in situ reference measurements and the CHM was highest, even though the CHM regularly underestimated the in situ measurements (
Figure 2a). The quality of the DSM decreased when the GSD increased, and when the GSD was too large (like in the case of ‘Barley AC 700 m FPI’ with a GSD of 0.60 m), the 3D features were useless. It is also important to notice that in all cases, the height accuracy of the blocks was good and according to expectations—on the level of 0.5–2 times the GSD. At the smallest GSDs, the UAV and aircraft provided comparable accuracies. Thus the low-cost sensors used in this study can also be operated from small aircraft. The advantage of the aircraft-based method is that larger areas can be covered more efficiently. However, in smaller areas drones are more affordable.
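The relation between the GSD and the usefulness of the 3D features can be sketched as follows. The sketch assumes that the CHM is the per-cell difference between the DSM and a terrain model, clipped at zero, and it applies the 0.5–2 times the GSD rule quoted above to the GSDs mentioned in this section. The function names and the elevation grids are hypothetical.

```python
def canopy_height_model(dsm, dtm):
    """Per-cell canopy height: surface elevation minus terrain elevation.

    Assumption: negative differences are clipped to zero.
    """
    return [
        [max(s - t, 0.0) for s, t in zip(dsm_row, dtm_row)]
        for dsm_row, dtm_row in zip(dsm, dtm)
    ]


def expected_height_accuracy(gsd_m, low=0.5, high=2.0):
    """Expected height error range in metres, as (low * GSD, high * GSD)."""
    return (low * gsd_m, high * gsd_m)


# Hypothetical 2 x 2 elevation grids in metres.
dsm = [[101.2, 101.5], [101.0, 100.9]]
dtm = [[100.7, 100.8], [100.6, 101.0]]
chm = canopy_height_model(dsm, dtm)
print(chm)

# GSDs mentioned in the text (metres).
for gsd in (0.05, 0.10, 0.14, 0.60):
    lo, hi = expected_height_accuracy(gsd)
    print(f"GSD {gsd:.2f} m -> expected height error {lo:.3f} to {hi:.3f} m")
```

With a GSD of 0.60 m the expected height error reaches several decimetres, which is consistent with the observation that the 3D features from that dataset carried no usable information.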
It is worth noting that in the barley field the growth was not ideal due to poor weather conditions at the beginning of the growing season. In the grass canopy, the number of field reference samples was relatively low (8 samples) and the variation in the biomass and nitrogen amounts was low, which generally decreases the correlation and weakens the estimation results. However, from the perspective of a practical solution for crop parameter estimation, collecting even a small number of samples is time-consuming and increases costs. The result with a few samples with a small variation was slightly better than when using the average value as the estimate; this indicated that the comprehensive machine learning method could improve the estimation accuracy over using only average values, as it revealed relatively small spatial variations. Although we obtained promising results using datasets from flying heights of 140 m or higher, the use of data from lower heights, and thus more precise CHMs, can improve the estimations, as shown in previous studies using flying heights of 50 m or less [
4,
21,
25]. We assume that the spatial and radiometric resolution of the images are the fundamental factors impacting the quality of the CHM; thus, we expect that a better-quality imaging system could alternatively provide good results from higher altitudes, which would be advantageous when mapping larger areas.
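The comparison against "using the average value as the estimate" can be made concrete with a mean baseline. The sketch below assumes a leave-one-out scheme in which each sample is predicted by the mean of the remaining samples; the validation scheme is an assumption, and the eight sample values are hypothetical rather than the study's reference data.

```python
from math import sqrt


def loo_mean_baseline_rmse(values):
    """RMSE of predicting each sample by the mean of all other samples.

    Assumption: leave-one-out evaluation of the mean estimator.
    """
    n = len(values)
    total = sum(values)
    squared_error = 0.0
    for v in values:
        prediction = (total - v) / (n - 1)
        squared_error += (v - prediction) ** 2
    return sqrt(squared_error / n)


# Hypothetical dry matter yields for eight grass samples (t/ha).
samples = [3.1, 3.3, 3.0, 3.4, 3.2, 3.5, 3.1, 3.3]
baseline = loo_mean_baseline_rmse(samples)
print(round(baseline, 3))
```

An estimator is only informative for such a low-variation dataset if its cross-validated RMSE falls below this baseline, which is the comparison the paragraph above refers to.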
To our knowledge, this study was the first one that comprehensively integrated and compared UAV-based hyperspectral, RGB and point cloud features in crop parameter estimation. We developed an approach for utilizing a combination of spectral data and 3D features in the estimation process that simultaneously and efficiently utilizes all available information. Furthermore, our results showed that the integration of spectral and 3D features improved the accuracy of the biomass estimation; but in the nitrogen estimation, the spectral features were more important. The results also indicated that the hyperspectral data provided only a slight or no improvement to the estimation accuracy of the biomass compared to the RGB data. This result thus suggests that the low-cost RGB sensors are suitable for the biomass estimation task. However, more studies are recommended to validate this result in different conditions. In the nitrogen estimation, the hyperspectral data appeared to be more advantageous than the RGB data. The aircraft-based data capture also provided results comparable to the UAV-based results.
In the future, further studies using more accurate hyperspectral sensors and higher variability test sites will be of interest. The datasets also give possibilities for new types of analysis, such as utilizing the spectral DSM more rigorously [
21,
22] and utilizing the multiview spectral datasets in the analysis [
78,
79,
80]. Our future objective is to develop generalized estimators that can be used without in situ training data, for example, by training an estimator with a dataset from one sample area and then using it in other areas. Various machine learning techniques exist that can be used in this process. Our results showed that the SLR was not ideal for this task. The RF behaved well, and further studies will be necessary to evaluate its suitability for the generalized procedures. For example, deep learning neural network estimators are an interesting alternative [
81].