Nothing Special   »   [go: up one dir, main page]

Next Article in Journal
Field-Scale Assessment of Land and Water Use Change over the California Delta Using Remote Sensing
Previous Article in Journal
Bidirectional Long Short-Term Memory Network for Vehicle Behavior Recognition
You seem to have javascript disabled. Please note that many of the page functionalities won't work as expected without javascript enabled.
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Exploring the Sensitivity of Sampling Density in Digital Mapping of Soil Organic Carbon and Its Application in Soil Sampling

1
College of Resources and Environment, Huazhong Agricultural University, Wuhan 430070, China
2
Geographical and Sustainability Sciences, The University of Iowa, Iowa City, IA 52246, USA
3
Key Laboratory for Geo-Environmental Monitoring of Coastal Zone of National Administration of Surveying, Mapping and GeoInformation & Shenzhen Key Laboratory of Spatial Smart Sensing and Services & College of Life Sciences and Oceanography, Shenzhen University, Shenzhen 518060, China
4
School of Resource and Environmental Science, Wuhan University, Wuhan 430079, China
*
Author to whom correspondence should be addressed.
Remote Sens. 2018, 10(6), 888; https://doi.org/10.3390/rs10060888
Submission received: 25 April 2018 / Revised: 2 June 2018 / Accepted: 4 June 2018 / Published: 6 June 2018
(This article belongs to the Section Remote Sensing in Agriculture and Vegetation)
Graphical abstract
">
Figure 1
<p>Geographical location of study area in Iowa and corresponding soil and land use types. The soil type map is from the Online Web Soil Survey (official USDA soil information), whereas the land use type map was interpreted using hyperspectral images. The first sampling was executed in six random places to obtain prior knowledge about soil organic carbon (SOC). Second and third sampling were executed using grid sampling at a distance of 130 m.</p> ">
Figure 2
<p>Parameters of Headwall Micro-Hyperspec airborne sensors (A- and X-Series) and corresponding remote-sensing images in study region (true color composition: bands at 660, 550 and 480 nm in RGB for A-Series; bands at 13, 54 and 26 nm in RGB for X-Series).</p> ">
Figure 3
<p>Integrated soil sampling flow. (PLSR: partial least square regression; PLSRK: partial least square regression kriging; RMSE: root mean square error; RPD: ratio of standard deviation to RMSE; DSSI: density of soil sample index; CEPA: comprehensive evaluation index of prediction accuracy; RSEP: ratio of sampling efficiency to performance).</p> ">
Figure 4
<p>Distribution characteristics of soil organic carbon (SOC) values. SD: standard deviation, CV: Coefficient of variation.</p> ">
Figure 5
<p>Soil organic carbon (SOC) maps predicted by partial least squares regression (<b>a</b>) and partial least squares regression kriging (<b>c</b>) from airborne hyperspectral remote-sensing images and their differences (<b>b</b>).</p> ">
Figure 6
<p>Evaluation indices of ordinary kriging models based on different densities of soil samples. (RANDOM: the random sampling; LHS: the Latin hypercube sampling; GRID: the grid sampling; RMSE: the root mean square error; RPD: the ratio of standard deviation to RMSE. ANOVA: Analysis of Variance).</p> ">
Figure 7
<p>Evaluation indices of DSSI, CEPA and RSEP used to evaluate the performances of different soil sampling plans with different numbers of soil samples. (DSSI: density of soil samples index; CEPA: comprehensive evaluation index of the prediction accuracy; RSEP: ratio of sampling efficiency to performance; RANDOM: the random sampling; LHS: Latin hypercube sampling; GRID: grid sampling; RMSE: root mean square error).</p> ">
Figure 8
<p>Spatial distribution of soil organic carbon (SOC) predicted by partial least squares regression kriging with aid of hyperspectral images (<b>j</b>, original) and predicted by ordinary kriging models of different sampling methods (RANDOM: random sampling; LHS: Latin hypercube sampling; GRID: grid sampling) with various sampling densities (<b>a</b>–<b>i</b>, x: 5.13, y: 0.57 and z: 0.16).</p> ">
Figure 9
<p>Differences between original reference and estimated soil organic carbon (SOC) maps by random sampling (<b>a</b>); Latin hypercube sampling (LHS) (<b>b</b>) and grid sampling (<b>c</b>) at sampling density of 0.57. H: high level (&gt;30%); M: median level (30–10%); L: low level (&lt;10%).</p> ">
Figure 10
<p>Histogram of error values between original reference and estimated SOC maps by random (RANDOM), Latin hypercube (LHS) and grid sampling (GRID) at sampling density of 0.57. H: high level (&gt;30%); M: median level (30–10%); L: low level (&lt;10%).</p> ">
Figure 11
<p>Comparison of parameters of semivariograms of Gaussian (<b>a</b>); circular (<b>b</b>); spherical (<b>c</b>) and exponential (<b>d</b>) semi-variance functions of soil organic carbon (SOC) based on field soil samples (<span class="html-italic">n</span> = 180) using ordinary kriging.</p> ">
Figure 12
<p>Original reference (<b>a</b>) and predicted SOC maps by ordinary kriging model (<b>b</b>) based on field soil samples (<span class="html-italic">n</span> = 180) and differences between them (<b>c</b>) and percentages of different SOC error levels in study area by the pie chart. H: high level (&gt;30%); M: median level (30–10%); L: low level (&lt;10%).</p> ">
Figure 13
<p>The trend lines of the quadratic polynomial function between the RSEP and the sampling density in different sampling plans.</p> ">
Versions Notes

Abstract

:
The rapid monitoring and accurate estimation of dynamic changes in soil organic carbon (SOC) can make great efforts in understanding the global carbon cycle. Traditional field survey is the main approach to obtain soil data and measure SOC content. However, the limited number of soil samples and the sampling cost hinder the quality of digital soil mapping. This research aims to explore the sensitive of sampling density in digital soil mapping, and then design a suitable soil sampling plan based on a series of sampling indices. Headwall hyperspectral images (400–1700 nm) were used to estimate the SOC map by partial least squares regression (PLSR) and PLSR kriging (PLSRK). Three traditional soil sampling methods (random, grid, and Latin hypercube sampling) with 10 classes of sampling densities (6.26, 2.79, 1.57, 1.01, 0.69, 0.53, 0.39, 0.30, 0.26, and 0.20 ha−1) were designed. The R2, root mean square error (RMSE) and ratio of standard deviation to RMSE (RPD) were used to evaluate the prediction accuracy in digital soil mapping by ordinary kriging. Three new indices, namely, the ratio of sampling efficiency to performance (RSEP), the density of soil samples index and the comprehensive evaluation index of prediction accuracy, were used to select a suitable soil sampling plan. Results showed that (1) the prediction accuracy of PLSRK (RPD = 2.00) was higher by approximately 11.73% than that of PLSR (RPD = 1.79), and the hyperspectral images provided an actual referential SOC map for the study of soil sampling; (2) the grid sampling plan performed better than the random and Latin hypercube sampling methods, and the quality of SOC map improves with the increase of the sampling density, and (3) the computer simulation and field verification indicated that RSEP is one feasible index in designing a suitable soil sampling plan.

Graphical Abstract">

Graphical Abstract

1. Introduction

The soil is an important global reservoir of soil organic carbon (SOC), which plays a key role in understanding the global carbon cycle [1]. Moreover, SOC determines soil aeration and water infiltration, and is an important source of soil nutrients [2]. Traditional soil surveys are fundamental and necessary in obtaining accurate soil information and constructing a reliable dataset [3]. The limited number of soil samples and the sampling cost hinder the quality of digital soil mapping. Therefore, a suitable soil sampling plan should be designed for high-quality and efficient digital soil map.
High-quality soil datasets enable high-precision digital mapping of SOC, which is essential in understanding the spatiotemporal variations in SOC and testing performance for different soil models [4,5]. The quality of soil dataset was influenced by the sampling density for the invisible and variable of soil properties [3]. The quality of soil dataset has positive relationship with the sampling density for more soil samples can reveal more information of the soil properties. Additionally, the dispersion degree and spatial characteristics of SOC can influence the quality of digital soil mapping [6]. The complex spatial distribution of SOC needs more soil samples to draw the trend of SOC in detail. Thus, it is a challenging work to balance the quality of the soil dataset and the number of the soil samples, and the construction of a high-quality soil dataset by a low sampling density remains challenging. A soil sampling plan determines the geographical locations of soil samples and influences the parameters of geostatistical models for digital soil mapping [7]. Thus, a suitable sampling density plays an important role in displaying the spatial distribution characteristics of soil properties.
Soil sampling mainly aims to characterize the nutrient status of a study region as accurately and inexpensively as possible [8]. Many typical soil sampling methods, such as traditional (random and grid sampling and their derivative methods) and model-based sampling strategies, such as zone, topographic/geographic unit, remote-sensing and Latin hypercube sampling (LHS), have been provided [9]. The traditional methods were designed by the spatial distribution characteristics of the soil samples in the study area based on the random or grid ways. These traditional methods can be widely applied in different fields. However, the spatial characteristics of SOC should be considered in random sampling. Similarly, it is hard to select a suitable sampling distance for grid sampling with few prior knowledge [10]. Therefore, the spatial characteristics of SOC and more valuable auxiliary information should be considered to resolve the limitations of traditional sampling methods. The model-based sampling strategies not only consider the spatial characteristics of the soil samples, but also the auxiliary variables that influence pedogenesis. With the development of remote sensing technology, soil covariate factors, including vegetation maps, geology maps and their derivatives, can be used as auxiliary variables for the digital soil mapping [11]. These datasets play important roles in influencing soil properties, and the representative variables can be chosen for model-based soil sampling to select the suitable geographical locations of soil samples [12,13]. In the research of Minasny and McBratney [14], a digital elevation model, slope, compound topographic index, and normalized difference vegetation index as the ancillary data were used to design one optimal and efficient soil sampling plan by a conditioned Latin hypercube sampling method. Lin, et al. [15] explored the representative sampling points by a cluster analysis of soil environmental co-variates and then formulated a sampling design method. These researchers introduced a feasible and reasonable sampling method by easily acquiring exhaustive secondary information. However, the relationships between soil properties and environmental factors are uncertain in large regions and inconspicuous in small or field-scale regions [16,17]. Consequently, environmental factors cannot offer accurate auxiliary information for the spatial unbiasedness of soil properties, especially in field or local regions [18].
Airborne hyperspectral imaging technology, which has attracted research attention in the study of natural and artificial materials, can respond to the spectral reflectance and absorption characteristics of study objects on land surface [19,20]. Visible and near-infrared (VNIR) spectral absorption features created by the electronic transition of soil carbon atoms, and the vibrational stretching and bending of O–H, C–H or C=O functional groups are important factors for quantifying SOC [21]. Laboratorial or field VNIR reflectance spectroscopy is a valuable technique used to predict SOC [17,22]. Hyperspectral sensors placed on fixed-wing aircraft or helicopters can perform spatially continuous mapping with high spatial resolution, and spectral reflectance can capture the valuable information of topsoil [18]. Many studies have indicated the potential of airborne or satellite hyperspectral images in the prediction of topsoil SOC or other properties [23,24,25,26]. Ciampalini, et al. [18] used the prototypal multisensor hyperspectral system produced by Galileo Avionica, which consists of two electro-optical heads in VNIR and short-wavelength infrared (SWIR) spectral range (from 400 nm to 24,500 nm, over 700 bands) to obtain the hyperspectral images. Their study showed the ability of airborne hyperspectral data from relatively inexpensive and efficient sensing methods for producing reliable and high-resolution maps of clay content. Franceschini, et al. [27] used the ProSpecTIR V-S sensor (SpecTIR LLC), which consists of two imaging sensors (AISA Eagle–Hawk; Specim), to obtain two aerial hyperspectral images. Each image contains 357 spectral bands from 400 nm to 2500 nm with a spatial resolution of 1 m and a spectral resolution of 4.6 nm in VNIR and 5.3 nm in SWIR. Their results showed that most of the soil properties (clay, sand, cation-exchange capacity, etc.) can be predicted using the airborne hyperspectral remotely sensed data with an increased accuracy. Although the soil masking by vegetation is one important issue in predicting soil properties by hyperspectral images, the residual spectra unmixing and semi-blind source separation methods were developed to separate the soil and vegetation spectra. Bartholomeus, et al. [28] put forward a residual spectral unmixing to resolve the problem when fields are partially covered with vegetation. Ouerghemmi, et al. [29,30] developed a blind source separation method to separate the soil and vegetation spectra and a double-extraction approach for clay content estimation over semi-vegetated surfaces. All of these studies showed the potential of the airborne hyperspectral images in predicting the topsoil properties. Meanwhile, the spatially continuous mapping of hyperspectral images enables continuous digital soil mapping. Thus, hyperspectral images can introduce important auxiliary information to reflect the status of SOC and formulate an optimal soil sampling approach, especially at farm or field study scale. Moreover, hyperspectral images can resolve the disadvantages of environmental factors.
This study aims to (i) explore the reliability of airborne hyperspectral images in mapping the SOC of topsoil (0–15 cm); (ii) assess the sensitivity of sampling density in the digital mapping of SOC based on three sampling methods (random, grid, and LHS); and (iii) design one suitable sampling method based on three new evaluation indices, namely, ratio of sampling efficiency to performance (RSEP), density of soil samples index (DSSI), and comprehensive evaluation index of prediction accuracy (CEPA).

2. Materials and Methods

2.1. Study Area

A farmland (approximately 385.45 ha) in southeast Iowa, United States, was selected as the study region (Figure 1). Agriculture is a major economy component of Iowa, which is often called ‘the breadbasket of USA’, with 95% of its surface area devoted to cultivation of crops [31]. This state has a humid continental climate because of its variable weather conditions. In addition, the climate in Iowa is cold and temperate with significant rainfall even in its driest month. The mean annual precipitation (including snowfall) is 903 mm, the mean snowfall is 104 mm, and the mean annual temperature is 9.8 °C which ranges from −11.6 °C in January to 18.3 °C in July. The geography of Iowa is generally not flat, and most of the state comprises rolling hills. The landform and topography is drift plain, with an elevation of 756–881 m and slope of 2°–23°. In the study area, the main land use type is farmland, and one river goes through the middle of the study area, and there are less vegetation and shrub. The main soil types in the study region are Typic Argiudolls, Cumulic Endoaquolls, Oxyaquic Argiudolls, Mollic Hapludalfs, and Aquic Hapludolls (Figure 1), according to soil taxonomy class [32]. The soils are formed by wind-blown, dominantly silt-sized particles known as loess. The topsoil and subsoil have a silty clay loam texture. The soil profile is dominated by silt-sized particles and generally less than 5% sand-sized particles. The topsoil layer has a granular structure that provides for water infiltration into the soil. The three-year crop rotations consisting of corn, soybeans, cereal rye, and red clover in the study area differ from the corn-soybean rotation commonly seen in Iowa, and the farmland is ploughed fallow by large machinery before seeding.

2.2. Soil Sample Collection and Analysis

Three soil sampling campaigns were conducted in May and October of 2015 and March of 2016. In May of 2015, 24 surface soil samples (0–20 cm) were collected in six places to obtain prior knowledge about the SOC in the study area. In October of 2015, 50 surface soil samples (0–20 cm) were collected using grid soil sampling with a sampling distance of 130 m. In March of 2016, 130 surface soil samples (0–20 cm) were collected using grid soil sampling with a sampling distance of 130 m. A total of 204 soil samples were used in this study. Soil samples were collected from farmlands that have been ploughed to avoid the influence of straw or weed. For each soil sample, five soil subsamples were collected from the four corners and the centre of a 1 m2 sample square and thoroughly mixed for a representative sample. The soil samples were first air-dried in a laboratory at 20 °C to 25 °C for 14 days. The dried soil samples were gently crushed in a porcelain mortar to break up large aggregates, and sieved using a 2 mm stainless steel sieve. The soil samples were subjected to dry combustion using a CHNS combustion gas analyzer (Vario El Cube, Elementar, Germany) to determine SOC. Prior to analysis, the sample was acidified by HCl (4 mol/L) to remove carbonates. The combustion temperature ranged from 950 °C to 1200 °C, and the combustion time was approximately 10 min [33]. The nonlinear correction curve was used to correct the analytical precision of the CHNS combustion gas analyzer, and the error was controlled to below 0.1% [34].

2.3. Hyperspectral Images and Spectral Pre-Treatment

Aerial hyperspectral images (Figure 2) were obtained on 19 November 2015 (10:00–14:00) using Headwall Micro-Hyperspec airborne sensors (SpecTIR LLC), which consisted of Micro-Hyperspec VNIR imaging spectrometers (A-Series) and Micro-Hyperspec NIR imaging spectrometers (X-Series). The aerial hyperspectral images were acquired after at least one week with no rain. The sun was shining brightly, and clouds were not apparent on the sky during image acquisition. The detail performance parameters of the Headwall Micro-Hyperspec airborne sensors were showed in Table 1. A black landscaping material (50 m × 20 m), a light brown canvas (50 m × 20 m), and a white building material commonly known as Tyvek (50 m × 20 m) were used as reference objects to adjust the influence of the atmosphere, sunlight and other external factors. Radiation correction was applied using Hyperspec® III with SpectralView™ (Copyright 2013, Headwall Photonics, Inc., Fitchburg, MA, USA). Raw data were converted into radiance using calibration files generated during radiometric and wavelength calibrations before and after sensor mobilization. Spectral reflectance was converted from spectral radiance on the basis of the empirical line correction in ENVI/IDL (Version 5.1, Esri Inc., Redlands, CA, USA). A standard three-band (red, green, and blue) orthorectified digital aerial photograph with 1 m spatial resolution scanned in 2011 was used for geometric corrections, which were applied using ENVI/IDL.
Aside from radiometric and geometric corrections, such factors as multicollinearity, random noise, and multiple scattering effects were considered. The aerosol concentration, the ozone content, and the water vapor column contents influence the spectral transmission process. These extraneous factors influence the robustness and accuracy of calibration models [35]. Thus, pre-processing was important in spectral modelling, and partial least squares regression (PLSR) was used to build a soil spectral model. The calibration dataset consisted of the measured SOC values and the corresponding hyperspectral reflectance. In this research, 204 soil samples, which were collected through three samplings, were used to fit the relationships between SOC and spectral reflectance for digital soil mapping. Leave-one-out cross-validation was used to evaluate the accuracy of PLSR and check the spectral outliers based on different pre-processing methods. The MATLAB® (R2008a, MathWorks, Inc., Natick, MA, USA) toolbox functionality was used to conduct a robust principal component analysis to detect and eliminate outliers from the original datasets. The wavelengths from 380 nm to 430 nm, and from 950 nm to 1000 nm in A-Series suffered from noise, and the range from 1350 nm to 1420 nm in X-Series were significantly influenced by water. Therefore, these wavelengths were removed in the prediction of SOC. After multiple tests, the Savitzky-Golay (SG) smoothing with a moving window of 30 nm was used to smoothen the radiance curves. Subsequently, and the log10 was used to transform the spectral radiance. Finally, standard normal variate (SNV) was used as an example of a scatter-corrective method to partially remove undesired scatter- or particle-sized information from the spectrum reflectance pattern. Thus, a combined processing method of SG, log10 and SNV was used to the spectral pre-process of soil samples. The spectral reflectance was used as explanatory variable to construct soil spectral models by PLSR after pre-processing. The PLSR and the pre-processing methods were implemented using the PLS toolbox (Version 7.9.3, Eigenvector Research, Inc., Wenatchee, WA, USA) and operated in MATLAB®.

2.4. Ordinary Kriging (OK)

Kriging is an advanced geostatistical procedure that can generate a continuous surface from sparse soil samples on the basis of their attributes [36]. In this research, OK was selected as the geostatistical model for estimating the spatial distribution of SOC on the basis of soil samples. Z(Xi) is assumed to be a regionalized variable (namely, SOC) with a variogram γ(h), which is a function describing the spatial aggregation field or stochastic process Z(u). In this research, the exponential (Formula 1), Gaussian (Formula 2), spherical (Formula 3), and circular (Formula 4) methods were used as the semi-variance model, and a suitable variogram based on the leave-one-out cross-validation results was selected.
The exponential function was defined as:
γ ( h ) = {   0 ,   h = 0 C 0 + C ( 1 e h a   ) ,     h > 0   .
The Gaussian function was defined as:
γ ( h ) = {   C 0 + C ( 1 e x p ( h 2 a 2 ) ) ,     h > 0   0 ,     h = 0   .
The spherical function was defined as:
γ ( h ) = {   C 0 + C ( 3 h 2 a 1 2 ( h a ) 3   ) ,     0 < h a   C 0 + C ,     h > a     0 ,     h = 0   .
The circular function was defined as:
γ ( h ) = {   C 0 + C ( 1 2 π cos 1 ( h a ) + 1 h 2 a 2   ) ,     0 < h a C 0 + C ,     h > a     0 ,     h = 0   .
In these equations, a, a, 3a and 3 a are the effective ranges for the spherical, circular, exponential and Gaussian functions, respectively. h is the spatial lag, C0 is the nugget and C is the partial sill. The spatial variation of the soil samples for these variograms was isotropic.
Traditional OK can provide unbiased estimates with minimum error. The function of OK was expressed as:
Z * ( x 0 ) = i = 1 n λ i ( x 0 ) Z ( x i )   ,
where   i = 1 n λ i ( x 0 ) = 1 , Z * ( x 0 ) is the predicted value of variable z at location x0; Z ( x i ) is the measured data; λ i ( x 0 ) refers to the weights associated with the measured values and n is the number of predicted values within certain neighbor soil samples. The standard search neighborhood defined by the ellipse parameters was used for OK [37]. The sector type comprised four sectors with 45° offset, and the maximum and minimum neighbors were 5 and 2, respectively. The OK was implemented using the Create Fishnet tool in ArcGIS (Version 10.4.1, Esri, Inc., Redlands, CA, USA).

2.5. PLSR and Partial Least Regression Kriging (PLRK)

PLSR, which is a well-known modelling technique often used in chemometric and quantitative spectral analyses, is based on a linear transition from a large number of original descriptors to a new variable space based on a small number of orthogonal factors [38]. A two-step descriptor selection procedure was applied to achieve the best model quality. The first step involved the dimensionality reduction of the spectral reflectance from thousands of wavelengths into a few valuable components. The pre-processing method of the spectral reflectance played an important role in this step. The second step involved selecting the suitable number of components and then fitting the relationships between the components and SOC contents. The root mean square error (RMSE) and R2 of the leave-one-out cross-validation model (RMSECV and R2CV) were important evaluation indices in this step. Additional details about the PLSR algorithm can be obtained from the study of Martens [39].
PLSRK is a combined model that considers the predicted values of SOC by PLSR and the spatial correlation of the predicted residuals by OK. Two steps can be used to describe the PLSRK workflow. Firstly, the predicted SOC contents are obtained by PLSR and the predicted residuals of the SOC are estimated by OK in different semi-variance functions. Secondly, the predicted SOC content of PLSRK is calculated as the sum of the predicted SOC of PLSR and the estimated residuals of OK at the same locations. Additional information about PLSRK can be found in Guo, et al. [17].

2.6. Sampling Methods

In this research, the grid sampling method was implemented using the Create Fishnet tool in ArcGIS. The fixed sampling distances between the neighbor soil samples were 40, 60, 80, 100, 120, 140, 160, 180, 200, and 220 m. The numbers of soil samples were 2414, 1075, 606, 388, 266, 203, 150, 117, 99 and 79, with sampling densities of 6.26, 2.79, 1.57, 1.01, 0.69, 0.53, 0.39, 0.30, 0.26, and 0.20 ha−1, respectively. Random sampling was implemented using the Create Random Points tool in ArcGIS, and the numbers of soil samples were the same with those in the grid sampling plans.
LHS is a statistical method of generating a near-random sample of parameter values from a multi-dimensional distribution [40]. In the context of statistical sampling, a square grid containing sample positions is a Latin square if and only if only one sample is present in each row and column. A Latin hypercube is the generalization of this concept to an arbitrary number of dimensions, where each sample is the only one in each axis-aligned hyperplane [14]. The LHS soil sampling plan was implemented using MATLAB on the basis of the number of grid soil samples and the geographical range of the study region. In this study, only the coordinate information was considered in designing the LHS sampling plan, and the numbers of soil samples were 2414, 1075, 606, 388, 266, 203, 150, 117, 99, and 79, which were the same with those under the two other sampling methods, thereby maintaining data consistency.

2.7. Evaluation Indices

The RMSE and R2 of the calibration (RMSEC and R2C), RMSECV, R2CV, the mean squared deviation ration (MSDR), and the ratio of standard deviation to RMSE (RPD) were used as the evaluation indices [41,42,43] as follows:
R 2 = 1 i = 1 n ( y i y ^ i ) 2 i = 1 n ( y i y ¯ ) 2 ,  
RMSE = 1 n i = 1 n ( y i y ^ i ) 2 ,  
MSDR = MSE σ K 2 ,    
RPD = S . D RMSE p ,  
where n is the sample number; y i is the measured value for the soil sample; y ^ i is the predicted value; y ¯ is the mean measured value; MSDR is the average of the ratios of the squared errors (MSE) to the kriging variances ( σ K 2 ) at the prediction points, and S.D is the standard deviation of the measured SOC from the validation dataset. In general, a model that performs well has large R2 and RPD, and a small RMSE, and the number of MSDR should be close to 1.

2.8. Sampling Indices

Two important variables influence the performance of a soil sampling method, namely, the sampling density and the quality of the sample dataset. Three new indices (DSSI, CEPA, and RSEP) were developed in this paper to quantify the performance of soil sampling method. DSSI was used to describe the sampling density. The predicted capability of OK was used to replace the quality of sample dataset. CEPA was used to check the predicted accuracy of OK. Finally, RSEP was used as a comprehensive index to consider the equilibrium points of DSSI and CEPA, and a suitable soil sampling plan based on different demands was selected based on these three indices.
DSSI was used to evaluate the sampling density in different soil sampling methods, and its function was defined as follows:
DSSI   =   f g d   = f 1 n i = 1 n f i 1 n i = 1 n ( f i g ) 2 ,  
where f is the sampling density, g   is the average sampling density, d is the standard deviation of the sampling density and n is the number of potential soil sampling methods.
CEPA was used to evaluate the quality of the soil datasets and defined as follows:
CEPA   =   w μ σ   = w 1 n i = 1 n w i 1 n i = 1 n ( w i μ ) 2 ,  
where w is the reciprocal of RPD of the prediction model, μ is the mean value of the reciprocal of RPD, σ is the standard deviation of the reciprocal of RPD and n is the number of potential soil sampling methods.
RSEP was defined as follows:
RSEP = DSSI × CEPA  
RSEP is a combination index of DSSI and CEPA and can be used to evaluate the comprehensive performance of the soil sampling plan. When RSEP is close to 0, the sampling method can balance the relationship between the sampling density and the quality of the sample dataset. Thus, the suitable sampling density is when RSEP is close to 0.

2.9. Soil Sampling Workflow

This research aimed to estimate one accurate and continuous SOC map using hyperspectral remote-sensing images through PLSR and PLSRK and explore the sensitivity of the number of soil samples in digital soil mapping using a traditional geostatistical model (OK). Furthermore, a comprehensive index (RSEP) was applied to select a suitable sampling density for digital soil mapping. This process involved four steps (Figure 3).
  • The hyperspectral images were obtained using the hyperspectral sensors on the helicopter, and the radiation correction of the images and the pre-processing of the spectral reflectance were executed using the PLS toolbox operated in MATLAB®. Additionally, 204 soil samples were collected from the study region to construct the relationships between the hyperspectral reflectance and SOC.
  • PLSR and PLSRK were used to build the SOC prediction models, and the best soil spectral model based on R2 was used to predict SOC map of bare soil. The SOC map was used as the original referential map in the succeeding research.
  • Random, LHS, and grid soil sampling were simulated with different sampling densities on the basis of the original estimated SOC map. A total of 30 soil sampling plans were simulated in this research. The evaluation dataset consisted of 200 random soil samples was extracted from the original estimated SOC map. The RMSE, R2, and RPD were used to explore the sensitivity of sampling densities and geographical locations to the accuracy of digital soil mapping. In addition, DSSI was used to evaluate the sampling density, CEPA was used to evaluate the prediction accuracy of the OK models, and RSEP was used to select a suitable soil sampling plan and an appropriate sampling density.
  • The optimal soil sampling plan was executed in the study region, and the differences between the SOC maps predicted using the hyperspectral images and OK were then compared.

3. Results

The spatial distribution of SOC was predicted by the hyperspectral images using PLSR and PLSRK models. The evaluation indices showed that PLSRK has better prediction result than PLSR. The SOC map predicted by PLSRK was used as the reference map to check the performance of different soil samples. Three traditional soil sampling methods (random, grid and Latin hypercube sampling) with 10 classes of sampling densities (6.26, 2.79, 1.57, 1.01, 0.69, 0.53, 0.39, 0.30, 0.26, and 0.20 ha−1) were designed, and the performance of these sampling methods were quantified by DSSI, CEPA and RSEP. Also, one suitable soil sampling method can be designed, and the field sampling result also can verify the feasibility of these sampling indices.

3.1. Estimation of Referential SOC Map by Hyperspectral Images

A total of 204 soil samples were analyzed using the histogram in Figure 4, and the mean value of SOC was 2.36%, with values of 1.31% to 3.14%. The S.D and coefficient of variation (CV) were used to evaluate the distributional characteristics of SOC content. The S.D and CV were 0.34% and 14.41%, respectively. The histogram also showed the normal distribution of SOC. According to the study by Wilding [44] in this region, these values indicated that SOC was a moderate variable. The main reason for this observation was that all the soil samples were collected from neighboring farmlands belonging to and managed by one family. Thus, S.D and CV were excessively small, and the SOC content showed a minor difference.
The pre-processed hyperspectral reflectance was used to construct the prediction model for SOC using the PLSR model. The number of soil samples was 204, and the sampling density was approximately 0.529 ha−1. Figure 5a presents the predicted SOC map by PLSR. In this work, only the farmland with bare soil was selected as the study object. The other land use types were removed to decrease the influence of the land surface. The number of latent variables was five. The RMSEC, R2C, and RPD values of the calibration model were 0.17%, 0.77%, and 2.00%, respectively, whereas the RMSECV, R2CV, and RPD values of the leave-one-out cross-validation model were 0.19%, 0.69, and 1.79, respectively. This result indicated the feasible prediction capability of the hyperspectral images in predicting SOC by PLSR. Thus, the soil hyperspectral model was used to estimate the spatial distribution of SOC in Figure 5a. Several studies have shown that SOC is characteristic of a strong spatial autocorrelation [45]. Furthermore, the Moran’s I value of SOC was 0.25 in the present study region. The Moran’s I was implemented in ArcGIS by the geostatistical tools. Thus, the regression–kriging model (PLSRK) was used to further improve prediction accuracy. The RMSECV, R2CV, and RPD values of PLSRK were 0.17%, 0.75%, and 2.00%, respectively, whereas prediction accuracy was improved by 11.73% relative to PLSR, as indicated by RPD. Thus, the SOC map predicted by PLSRK was used as the reference SOC map to explore the sensitivity of the sampling density to the digital soil mapping (Figure 5c). The SOC values of the farmland ranged from 0.55% to 3.91%. High and low SOC values were gathered from the northern and southern parts of the study region, respectively. The SOC residuals of PLSR ranged from −0.49% to 0.45%, according to the SOC residual map predicted by OK (Figure 5b). The exponential model was used as the semi-variance one. The nugget was 0.026, the partial sill (C + C0) was 0.045, and the major range was 1620.464 m. High positive and negative residuals were observed around and far from the river, respectively. The main reason for these values may be due to the soil moisture, which influenced the spectral reflectance of the field soil, which then affected the predicted results of the soil spectral models [46]. These SOC maps could satisfactorily show the detailed spatial distribution characteristics of SOC in different geographical locations and offer important soil information for farmland management and sampling studies.

3.2. Simulation of Soil Sampling Plans

This study simulated three soil sampling plans (random, LHS and grid soil sampling) with 10 classes of sampling densities on the basis of SOC map predicted using the hyperspectral images (Figure 5c). Table 2 shows the parameters of the semi-variance functions by different soil sampling densities. These OK models selected the exponential function as the covariance function for different sampling plans due to the following reasons. (1) The exponential function was the optimal semi-variance function in the field example verification; (2) The same semi-variance function can remove the differences from different semi-variance functions. Table 2 shows the detailed parameters of the different semi-variance functions. The sampling density was decided by the soil sample numbers and the study area. The nugget values means the un-estimated regionalized variable when less than the measure scale. The partial sill and sill showed the variation of variance which created by the spatial autocorrelation. Also, the nugget values were close to 0 in Table 2, and this means the semi-variance functions can estimate the spatial variation of SOC. The C0/(C + C0) shows the degree of spatial variation, and most of these values were smaller than 50%, which showed that the spatial variation were mainly influenced by the spatial autocorrelation.
The evaluation dataset, including 200 random soil samples extracted from the referential SOC map (Figure 5b), was used to evaluate the predicted accuracy of OK models and compare the performances of different soil sampling methods. Figure 6 shows the corresponding evaluation indices. The double-factor analysis of the analysis of variance (ANOVA) was used to check the significance of these evaluation indices. The p values of RMSE, R2, and RPD were 0.028, 0.008, and 0.042, respectively. All values were less than 0.05. Thus, the three soil samples exhibited significant differences. The RMSE values decreased with the increase of sampling density, whereas the R2 and RPD values increased. In the random, LHS and grid sampling methods, the smallest values of RMSE (0.129, 0.140, and 0.123) and the largest values of R2 (0.812, 0.790, and 0.755) and RPD (2.272, 2.172, and 1.999) were calculated at the highest sampling density (5.132 ha−1). The largest values of RMSE (0.258, 0.239, and 0.252) and the smallest values of R2 (0.256, 0.175, and 0.279) and RPD (1.113, 1.087, and 1.171) were computed at the lowest sampling density (0.163 ha−1) (Figure 6a). Therefore, the sampling density was an important factor that influenced the quality of the soil dataset. The quality of the sample dataset in these three sampling plans was favorable when the sampling density was high. Moreover, the evaluation indices showed that the grid sampling method had the lowest RMSE and largest R2 and RPD relative to other sampling methods at the same sampling density. Therefore, the prediction accuracy of the grid sampling method was superior to that of the other two methods. This difference was probably because the sampling locations of the random and LHS sampling were random and the completeness of the representative soil samples cannot be guaranteed. The sampling distance of the grid sampling was fixed, and the soil samples were evenly spread over the main study area, which aided the modelling of the semi-variance function. Thus, the grid soil sampling method was relatively more stable, and preferred than the random and LHS sampling when the sampling density was similar.
The evaluation indices DSSI, CEPA, and RSEP were calculated using the defined functions (10), (11) and (12). Figure 7 shows the varying trends of DSSI, CEPA, and RSEP at different sampling densities. DSSI increased with the sampling density (The lower the DSSI, the less number of soil samples). DSSI had the largest value (2.589) at the sampling density of 5.13 ha−1. The lowest value of DSSI was −0.634 and when the sampling density was 0.16 ha−1. The CEPA values showed a decreasing trend with the increase of the sampling density. The larger the CEPA, the lower the prediction accuracy. CEPA had the smallest values at −2.001, −2.149, and −1.894 for random, LHS and grid sampling, respectively, at the sampling density of 5.13 ha−1. Meanwhile, CEPA values were largest when the sampling density was 0.16 ha−1. Moreover, most of the CEPA values of the grid soil sampling were lower than those of random and LHS sampling methods under the same sampling density. This result further indicated that the grid soil sampling method performed better than the other two methods. RSEP was a combination index of DSSI and CEPA. For random, LHS and grid sampling methods, RSEP showed an increasing trend at sampling densities from 0.16 ha−1 (−0.810, −0.680, and −0.531, respectively) to 0.83 ha−1 (0.052, −0.034, and 0.027, respectively) and then a decreasing trend to 5.31 ha−1 (−5.181, −5.564, and −4.903, respectively). When the sampling density was higher than 0.83 ha−1, the soil sampling plan required additional labor for the collection of soil samples. When the sampling density was lower than 0.83 ha−1, the quality of the sample dataset was unsatisfactory. The performance of the soil sampling method was optimum when the RSEP value approached 0. Therefore, the suitable sampling density was from 0.43 to 1.23 ha−1, which corresponded to the grid sampling distances of 140 to 80 m.

3.3. Comparison of Drawing Quality of SOC Maps

The sampling densities of 5.13, 0.57, and 0.16 ha−1 were selected as study objects to show the differences of the estimated SOC maps in different sampling densities. Six SOC maps were predicted using the OK models based on random sampling, LHS and grid sampling with various sampling densities. The original SOC map, which was estimated by the hyperspectral images with a spatial resolution of 1 m, was used as the reference map (Figure 8). Results showed that the quality of the predicted SOC map was favorable when the sampling density was high. When the sampling density was 5.13 ha−1, the spatial distribution characteristics of the estimated SOC maps were similar to the original estimated SOC map. When the sampling density was 0.57 ha−1 which was selected as the suitable sampling plan, detailed spatial characteristics of SOC were exhibited and the spatial trend of SOC became smooth and rough. When the sampling density was 0.16 ha−1, little valuable soil information was described. Accurate soil maps could be predicted when the sampling density was high, but the field sampling work need more time and labor. However, at low sampling density, the spatial variability of SOC could not be showed in detail by a few soil samples. Meanwhile, when the sampling density was similar, differences of the soil maps using varying sampling methods could not be differentiated.
Figure 9 and Figure 10 show the predicted error maps and histogram of the error values. These data further quantified the differences between the predicted SOC maps and the original referential SOC map. The error values were calculated on the basis of these two-series SOC maps. The error levels were defined as the percentage of the error values to the original SOC values ( o r i g i n a l   v a l u e s e s t i m a t e d   v a l u e s o r i g i n a l   v a l u e s ). Six levels were divided on the basis of the percentages as follows: negative high level (−H: <−30%), negative median level (−M: −30% to −10%), negative low level (−L: −10–0%), low level (L: 0–10%), median level (M: 10–30%), and high level (H: >30%). Large predicted error values appeared on the southwestern, middle, and northeastern parts of the study region (Figure 9a). The main reason can be due to different soil types, the topographic features, and the land use types in these regions, which lead to the strong spatial heterogeneity of SOC, thus the spatial distribution characteristics of SOC cannot be accurately estimated by the geostatistical model. More −H and −M regions appeared on the southwestern parts (Figure 9a,b) than those shown in Figure 9c. Meanwhile, more H and M regions appeared on the central parts (Figure 9a,b) than those shown in Figure 9c. Results showed that random sampling and LHS could easily create large estimation errors relative to the grid sampling method. In addition, the grid sampling method could accurately show the total spatial distribution characteristics of SOC. In Figure 10, the summed areas of the −L and L levels (69.79%, 72.24%, and 73.14%) of the three soil sampling methods occupied most of the study region. This finding showed that a sampling density of 0.57 ha−1 was acceptable. In addition, the low-level area of grid sampling (73.14%) was larger than that of the random sampling (69.79%) and LHS (72.24%). The high-level area of grid sampling (3.67%) was smaller than that of the two other sampling methods (4.36% and 4.95%). These data showed that the performance of the grid sampling method was superior to that of the two other techniques. To further verify the feasibility and accuracy of the grid soil sampling plan, grid sampling method with a sampling distance of 130 m (approximately 180 soil samples with a sampling density of 0.467 ha−1) was used to collect the soil samples in this study.

3.4. Evaluation and Application

Grid soil sampling with a sampling distance of 130 m was selected as the soil sampling method. The original referential SOC map from Figure 5a was used as the referential SOC map. Meanwhile, the SOC map predicted by the OK model based on the 180 grid samples was used to evaluate the performance of the grid sampling method, and the soil sampling density of the grid sampling method was 0.467 ha−1. Various semi-variance functions were executed for the OK models. Table 3 and Figure 11 show the parameters. The RMSE and MSDR of the leave-one-out cross-validation were used to select the suitable semi-variance function, whereas the exponential model with RMSE of 0.3189% and MSDR of 0.823 was preferred. Figure 12 presents the SOC error map between the original referential SOC map (Figure 12a) and the predicted map (Figure 12b) these two SOC maps. The figure illustrates the error levels using a pie chart. The overall distribution trend of SOC was similar in Figure 12a,b, although the patches of SOC values in Figure 12b were smoother than those in Figure 12a. Areas with low (68.60%) and median (27.59%) error levels occupied more than half of the study region. High error levels were observed around the central river and regions with complicated soil types. The main reason can be due to the soil types in these regions were different. Different soil types were characterized by varying levels of soil moisture, and the soil moisture could influence the spectral reflectance of the hyperspectral images. This was why the high error levels always appeared around the middle river. In addition, the OK model could not accurately estimate the local spatial variations of SOC in the transition regions of different land use types. Overall, the 130 m grid sampling distance (with a soil sample density of 0.467 ha−1) could be used to display the total spatial distribution of SOC in the study region. RSEP was a useful index for evaluating the performances of different soil sampling methods and can be used to design a feasible and suitable soil sampling plan.

4. Discussion

4.1. Exploring the Relationships between the Sampling Densities and the Sampling Indices

Traditional soil survey is regarded as one fundamental and necessary part of soil assessment framework, which supports soil-related decisions and policy making [47]. Soil sampling plan can directly influence the quality of the soil datasets and the hypothesis testing of soil models, and plays an important role in digital soil mapping. Many soil sampling plans have been designed and introduced, and the relationship between the sampling density and the quality of the soil dataset has been explored [11,15,47]. However, the performance of the sampling plan is not totally quantified and not unified, and most of the performance cannot be compared in different study regions and sampling methods. Thus, this paper tried to introduce three new sampling indices (DSSI, CEPA, and RSEP) to quantify the performance of the sampling plan, and realize the unification and comparability of the sampling performance. This paper also showed that the three new indices can be used to choose the suitable sampling method and the sampling density, but they needed to be calculated through a series of computer simulations, and the mean and S.D values could float the ranges of these indices. To improve the practicability of these performances in the future work, the fitting research was discussed. Through prior knowledge and practical examples in this paper, the sampling density had a strong relationship with the sampling indices of DSSI, CEPA, and RSEP, and the relationships were explored by the Pearson’s correlation coefficient (r) in Table 4. The DSSI has a strong positive relationship with the sampling density for r = 1, and the CEPA and RSEP also have strong negative relationships with the sampling density in different sampling methods.
To further explore these relationships, the linear function, the logarithmic function and the quadratic polynomial function were used to quantify the relationships between these sampling indices and the sampling density. The R2 was used to evaluate the fitting degree of the parameter models in Table 4. The results showed that the relationship between the sampling density and the DSSI was linear correlation, and D S S I = 0.65 × s a m p l i n g   d e n s i t y 0.74 , thus the DSSI can be easily predicted by the sampling density in this study region. Also, the results showed that CEPA can be easily simulated by the sampling density, and the best function was the quadratic polynomial function, and the R2 values were larger than 0.93 for different sampling plans. Meanwhile, the results showed that RSEP has quadratic polynomial correlation with the sampling density, and R2 values were larger than 0.98. The parameters were showed in Table 4. The trend lines between the RSEPs and the sampling densities is drawn in Figure 13. The peak point of the trend line showed the ideal sampling density in different sampling methods, and the peak points were 1.14 ha−1, 0.98 ha−1, and 1.17 ha−1 for the random, LHS and gird sampling. In our research, the suitable sampling density was inferred from 0.43 to 1.23 ha−1, and the results were consistent with the ideal peak points. Thus, the quadratic polynomial function can be regarded as the ideal model to simulate the sampling density and RSEP. Meanwhile, just a few random sampling density (at least 3) can be used to obtain the fitting function in the study region, and then infer the ideal sampling density. This discussion can widely reduce the workload to find the fitting function, and get the suitable sampling density.

4.2. The Performance, the Potential and the Limitations of the Integrated Soil Sampling Flow

The hyperspectral images, the soil sampling and the RSEP played different roles to guarantee the operation of the soil sampling flow. This sampling plan can be used not only in large-scale regions with complex environmental factors, but also in small- or field-scale regions with unclear environmental information. The actual spatial distribution of SOC cannot be obtained by field sampling alone due to the massive fieldwork and complicated soil pre-treatment. In the traditional research, the auxiliary variables mainly were helped to estimate the soil properties or design the sampling methods, which were the terrain factors (e.g., elevation, slope, and aspect), the distance factors (e.g., distance to river, residential area, and road), land cover types and soil types, and the remote sensing vegetation indices (the normalized difference vegetation index and the normalized difference moisture index) [48,49,50]. Based on the previous studies, the prediction accuracies of SOC by these auxiliary variables were not high (Table 5), and this can largely influence the performance of the soil sampling plan. With the development of spectral technology, VNIR diffuse reflectance spectroscopy has become a practical and affordable alternative for the accurate and rapid estimation of the soil physical and chemical attributes [21,51,52]. Meanwhile, airborne hyperspectral technology emerged as a new tool of obtaining bare soil information at the field scale [53,54]. Hyperspectral images are characterized by abundant spectral information and high spatial resolution and play an important role in estimating continuous SOC maps [23,24]. Several studies have successfully used airborne hyperspectral image data for soil attribute estimation (Table 6). These studies have demonstrated the potential of airborne hyperspectral technology in predicting SOC and introduced a reliable data source for reflecting the spatial distribution characteristics of soil. The wavelength of VNIR spectral reflectance was mostly located between 380 and 2500 nm, and the prediction accuracy of SOC or SOM ranged from 0.59 to 0.96 (R2CV). The spatial resolution of these hyperspectral images ranged from 1 m to 6 m and could guarantee the continuous digital soil mapping of SOC [24,25,26,27]. In the studies of Franceschini, et al. [27] and Paz-Kagan, et al. [55], not only the SOM, but also many other soil properties, such as clay and sand, and soil bulk density, K and P, could be accurately predicted by hyperspectral images. Thus, hyperspectral images can be used as a valuable data source for digital soil mapping relative to the environmental factors. The hyperspectral images can provide a reliable SOC map which can be used to explore the sensitivity of sampling density in digital soil mapping and evaluate the performance of different soil sampling methods.
The application of hyperspectral images in digital soil mapping is limited by two factors. First, hyperspectral images can only reflect the spectral characteristics of surface soil attributes, and land cover (trees, construction and grass) can influence the continuity of the spatial distribution of soil attributes [60,61]. Second, hyperspectral sensors have certain limitations, such as atmospheric correction, complex equipment, professional technology, high cost and narrow sweep width [23]. Thus, traditional field investigation and traditional geostatistical models also play an important role in digital soil mapping. Sampling density is a dominant factor that influences the quality of the sample dataset and the prediction accuracy of SOC maps. Furthermore, the sampling method influences the geographical locations of soil samples and performance of geostatistical models. Thus, a suitable sampling density and sampling method must be selected.

5. Conclusions

This study estimated an original referential SOC map on the basis of measured SOC values from 204 field soil samples and hyperspectral images by PLSR and PLSRK. Three soil sampling methods (random, LHS and grid sampling) were computer-simulated on the basis of the referential SOC map. Each soil sampling plan included 10 classes of sampling densities (6.26, 2.79, 1.57, 1.01, 0.69, 0.53, 0.39, 0.30, 0.26, and 0.20 ha−1). The RMSE, R2 and RPD values were used to explore the sensitivity of sampling density in digital soil mapping, whereas RSEP, CEPA, and DSSI were used to select a suitable sampling density for describing the spatial distribution trend of SOC. Several interesting results were determined by synthetically comparing the performances of the three soil sampling methods with 10 classes of sampling densities.
(1) The hyperspectral images produced a relatively accurate SOC map with an R2CV of 0.69, RMSECV of 0.19%, and RPD of 1.79 by PLSR. The prediction accuracy of PLSRK was higher by approximately 11.73% than that of PLSR, which had an R2CV of 0.75, RMSECV of 0.17% and RPD of 2.00. The SOC map could display the trend of spatial distribution of SOC at a spatial resolution of 1 m. Thus, hyperspectral images could be used as a useful data source for the digital soil mapping of SOC.
(2) In the simulation of soil sampling plans, the R2, RMSE, and RPD values showed that sampling density had a positive relationship with the prediction accuracy of the estimated SOC map. The quality of soil sample dataset was favorable when the sampling density was high. Meanwhile, the performance of grid soil sampling was superior to that of random sampling and LHS at similar sampling density. Therefore, the grid soil sampling method was the first option for field sampling.
(3) Three new indices (DSSI, CEPA, and RSEP) were introduced to quantify the sampling density, prediction accuracy of digital soil mapping and comprehensive performance of the sampling plan. DSSI was negatively related to CEPA, and RSEP could balance the relationships among them. RSEP was nearly 0, which was a good balance point between DSSI and CEPA. RSEP showed that the suitable sampling density was from 0.43 ha−1 to 1.23 ha−1, which corresponded to grid sampling distances of 140 m to 80 m. In addition, the grid soil sampling plan with a sampling distance of 130 m (soil sample density was 0.467 ha−1) was executed in the study region. The prediction results verified the acceptability of this sampling plan. These data showed that RSEP was an effective index for selecting the suitable sampling density.
However, the disadvantages of the hyperspectral images, limitation of the field soil samples and other uncertainty factors generally influenced the performance of the soil sampling plan. Hence, additional studies and landscapes are necessary to evaluate the feasibility of the proposed sampling indices.

Author Contributions

L.G. conceived and designed the experiments; M.L. contributed reagents/materials/analysis tools; T.S. analyzed the data, Y.C. performed the experiments; L.D. reviewed and edited this paper, H.Z. funded the experiments.

Funding

This research was funded by the National Natural Science Foundation of China, grant number (41371227); the Fundamental Research Funds for the Central Universities, grant number (2662016QD032); the Natural Science Foundation of Hubei, grant number (2018CFB372); and the National Undergraduate Innovation and Entrepreneurship Training Program, grant number (201810504023, 201810504030).

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Pataki, D.; Alig, R.; Fung, A.; Golubiewski, N.; Kennedy, C.; McPherson, E.; Nowak, D.; Pouyat, R.; Romero Lankao, P. Urban ecosystems and the north american carbon cycle. Glob. Change Biol. 2006, 12, 2092–2102. [Google Scholar] [CrossRef]
  2. Vasques, G.; Grunwald, S.; Sickman, J. Comparison of multivariate methods for inferential modeling of soil carbon using visible/near-infrared spectra. Geoderma 2008, 146, 14–25. [Google Scholar] [CrossRef]
  3. Hartemink, A.E.; McBratney, A.; de Lourdes Mendonça-Santos, M. Digital Soil Mapping with Limited Data; Springer Science & Business Media: Berlin, Germany, 2008. [Google Scholar]
  4. Akramkhanov, A. The Spatial Distribution of Soil Ssalinity: Detection and Prediction; Cuvillier Verlag: Göttingen, Germany, 2005; p. 118. [Google Scholar]
  5. Lucà, F.; Conforti, M.; Castrignanò, A.; Matteucci, G.; Buttafuoco, G. Effect of calibration set size on prediction at local scale of soil carbon by vis-nir spectroscopy. Geoderma 2017, 288, 175–183. [Google Scholar] [CrossRef]
  6. Yu, D.-S.; Zhang, Z.-Q.; Yang, H.; Shi, X.-Z.; Tan, M.-Z.; Sun, W.-X.; Wang, H.-J. Effect of soil sampling density on detected spatial variability of soil organic carbon in a red soil region of China. Pedosphere 2011, 21, 207–213. [Google Scholar] [CrossRef]
  7. Cressie, N. Spatial prediction and ordinary kriging. Math. Geol. 1988, 20, 405–421. [Google Scholar] [CrossRef]
  8. Dinkins, C.P. Soil Sampling Strategies; Montana State University: Bozeman, MT, USA, 2008; pp. 1–4. [Google Scholar]
  9. Boettinger, J.L. Digital Soil Mapping: Bridging Research, Environmental Application, and Operation; Springer: Berlin, Germany, 2010; Volume 2. [Google Scholar]
  10. Mickelson, J.A.; Stougaard, R.N. Assessment of soil sampling methods to estimate wild oat (avena fatua) seed bank populations. Weed Sci. 2011, 51, 226–230. [Google Scholar] [CrossRef]
  11. Higo, M.; Isobe, K.; Yamaguchi, M.; Torigoe, Y. Impact of a soil sampling strategy on the spatial distribution and diversity of arbuscular mycorrhizal communities at a small scale in two winter cover crop rotational systems. Ann. Microbiol. 2015, 65, 985–993. [Google Scholar] [CrossRef]
  12. Thompson, A.N.; Shaw, J.N.; Mask, P.L.; Touchton, J.T.; Rickman, D. Soil sampling techniques for alabama, USA grain fields. Precis. Agric. 2004, 5, 345–358. [Google Scholar] [CrossRef]
  13. Viscarra Rossel, R.A.; McBratney, A.B.; Minasny, B. Proximal Soil Sensing; Springer: Berlin, Germany, 2010; Volume 1. [Google Scholar]
  14. Minasny, B.; McBratney, A.B. A conditioned latin hypercube method for sampling in the presence of ancillary information. Comput. Geosci. 2006, 32, 1378–1388. [Google Scholar] [CrossRef]
  15. Lin, Y.; Axing, Z.; Chengzhi, Q. A soil sampling method based on representativeness grade of sampling points. Acta Pedol. Sin. 2011, 48, 938–946. [Google Scholar]
  16. Wang, K.; Zhang, C.; Li, W. Predictive mapping of soil total nitrogen at a regional scale: A comparison between geographically weighted regression and cokriging. Appl. Geogr. 2013, 42, 73–85. [Google Scholar] [CrossRef]
  17. Guo, L.; Zhao, C.; Zhang, H.; Chen, Y.; Linderman, M.; Zhang, Q.; Liu, Y. Comparisons of spatial and non-spatial models for predicting soil carbon content based on visible and near-infrared spectral technology. Geoderma 2017, 285, 280–292. [Google Scholar] [CrossRef]
  18. Ciampalini, A.; André, F.; Garfagnoli, F.; Grandjean, G.; Lambot, S.; Chiarantini, L.; Moretti, S. Improved estimation of soil clay content by the fusion of remote hyperspectral and proximal geophysical sensing. J. Appl. Geophys. 2015, 116, 135–145. [Google Scholar] [CrossRef]
  19. Galvao, L.S.; Formaggio, A.R.; Couto, E.G.; Roberts, D.A. Relationships between the mineralogical and chemical composition of tropical soils and topography from hyperspectral remote sensing data. Isprs J. Photogramm. Remote Sens. 2008, 63, 259–271. [Google Scholar] [CrossRef]
  20. Stevens, A. Detection of carbon stock change in agricultural soils using spectroscopic techniques. Soil Sci. Soc. Am. J. 2006, 70, 844–850. [Google Scholar] [CrossRef]
  21. Viscarra Rossel, R.A.; Hicks, W.S. Soil organic carbon and its fractions estimated by visible-near infrared transfer functions. Eur. J. Soil Sci. 2015, 66, 438–450. [Google Scholar] [CrossRef]
  22. Viscarra Rossel, R.A.; Chen, C. Digitally mapping the information content of visible–near infrared spectra of surficial australian soils. Remote Sens. Environ. 2011, 115, 1443–1455. [Google Scholar] [CrossRef]
  23. Gomez, C.; Rossel, R.A.V.; McBratney, A.B. Soil organic carbon prediction by hyperspectral remote sensing and field vis-nir spectroscopy: An Australian case study. Geoderma 2008, 146, 403–411. [Google Scholar] [CrossRef]
  24. Stevens, A.; Miralles, I.; Wesemael, B.V. Soil organic carbon predictions by airborne imaging spectroscopy: Comparing cross-validation and validation. Soil Sci. Soc. Am. J. 2012, 76, 2174–2183. [Google Scholar] [CrossRef]
  25. Denis, A.; Stevens, A.; van Wesemael, B.; Udelhoven, T.; Tychon, B. Soil organic carbon assessment by field and airborne spectrometry in bare croplands: Accounting for soil surface roughness. Geoderma 2014, 226, 94–102. [Google Scholar] [CrossRef]
  26. Peon, J.; Fernandez, S.; Recondo, C.; Calleja, J.F. Evaluation of the spectral characteristics of five hyperspectral and multispectral sensors for soil organic carbon estimation in burned areas. Int. J. Wildland Fire 2017, 26, 230–239. [Google Scholar] [CrossRef]
  27. Franceschini, M.; Demattê, J.; da Silva Terra, F.; Vicente, L.; Bartholomeus, H.; de Souza Filho, C. Prediction of soil properties using imaging spectroscopy: Considering fractional vegetation cover to improve accuracy. Int. J. Appl. Earth Obs. Geoinf. 2015, 38, 358–370. [Google Scholar] [CrossRef]
  28. Bartholomeus, H.; Kooistra, L.; Stevens, A.; Leeuwen, M.V.; Wesemael, B.V.; Ben-Dor, E.; Tychon, B. Soil organic carbon mapping of partially vegetated agricultural fields with imaging spectroscopy. Int. J. Appl. Earth Observ. Geoinf. 2011, 13, 81–88. [Google Scholar] [CrossRef]
  29. Ouerghemmi, W.; Gomez, C.; Naceur, S.; Lagacherie, P. Semi-blind source separation for the estimation of the clay content over semi-vegetated areas using vnir/swir hyperspectral airborne data. Remote Sens. Environ. 2016, 181, 251–263. [Google Scholar] [CrossRef]
  30. Ouerghemmi, W.; Gomez, C.; Naceur, S.; Lagacherie, P. Applying blind source separation on hyperspectral data for clay content estimation over partially vegetated surfaces. Geoderma 2011, 163, 227–237. [Google Scholar] [CrossRef]
  31. Freedman, J. Iowa: Past and Present; The Rosen Publishing Group: New York, NY, USA, 2010. [Google Scholar]
  32. Burt, R.; Staff, S. Kellogg Soil Survey Laboratory Methods Manual; Natural Resources Conservation Services: Washington, DC, USA; National Soil Survey Center: Lincoln, NE, USA, 2014. [Google Scholar]
  33. ISO10694, ISO. Soil Quality—Determination of Organic and Total Carbon after Dry Combustion (Elementary Analysis); International Organization for Standardization: Geneva, Switzerland, 1995. [Google Scholar]
  34. Sokolova, O.V.; Vorozhtsov, D.L. Development of rapid method for determining the total carbon in boron carbide samples with elemental analyzer. Russ. J. Appl. Chem. 2014, 87, 1640–1643. [Google Scholar] [CrossRef]
  35. Shi, T.; Chen, Y.; Liu, Y.; Wu, G. Visible and near-infrared reflectance spectroscopy—An alternative for monitoring soil contamination by heavy metals. J. Hazard. Mater. 2014, 265, 166–176. [Google Scholar] [CrossRef] [PubMed]
  36. Goovaerts, P. Geostatistics for Natural Resource Evaluation; Oxford University Press: Oxford, UK, 1997; p. 42. [Google Scholar]
  37. Gribov, A.; Krivoruchko, K. Geostatistical mapping with continuous moving neighborhood. Math. Geol. 2004, 36, 267–281. [Google Scholar] [CrossRef]
  38. Geladi, P.; Kowalski, B.R. Partial least-squares regression: A tutorial. Anal. Chim. Acta 1986, 185, 1–17. [Google Scholar] [CrossRef]
  39. Martens, H. Multivariate Calibration; John Wiley & Sons: Hoboken, NJ, USA, 1989. [Google Scholar]
  40. McKay, M.D.; Beckman, R.J.; Conover, W.J. A comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 2000, 42, 55–61. [Google Scholar] [CrossRef]
  41. Gogé, F.; Gomez, C.; Jolivet, C.; Joffre, R. Which strategy is best to predict soil properties of a local site from a national vis–nir database? Geoderma 2014, 213, 1–9. [Google Scholar] [CrossRef]
  42. Bellon-Maurel, V.; Fernandez-Ahumada, E.; Palagos, B.; Roger, J.-M.; McBratney, A. Critical review of chemometric indicators commonly used for assessing the quality of the prediction of soil attributes by nir spectroscopy. TrAC Trends Anal. Chem. 2010, 29, 1073–1081. [Google Scholar] [CrossRef]
  43. Stafford, J.V. Precision Agriculture’07; Academic Publishers: Kolkata, India, 2007. [Google Scholar]
  44. Wilding, L. Spatial Variability: Its Documentation, Accommodation and Implication to Soil Surveys; PUDOC: Las Vegas, NV, USA, 1985; pp. 166–194. [Google Scholar]
  45. Kumar, S.; Lal, R.; Liu, D.S.; Rafiq, R. Estimating the spatial distribution of organic carbon density for the soils of Ohio, USA. J. Geogr. Sci. 2013, 23, 280–296. [Google Scholar] [CrossRef]
  46. Oltracarrió, R.; Baup, F.; Fabre, S.; Fieuzal, R.; Briottet, X. Improvement of soil moisture retrieval from hyperspectral vnir-swir data using clay content information: From laboratory to field experiments. Remote Sens. 2015, 7, 3184–3205. [Google Scholar] [CrossRef]
  47. Crepin, J.; Johnson, R.L. Soil sampling for environmental assessment. In Soil Sampling and Methods of Analysis; NIPA: Islamabad, Pakistan, 1993; pp. 5–18. [Google Scholar]
  48. Kumar, S.; Lal, R.; Liu, D. A geographically weighted regression kriging approach for mapping soil organic carbon stock. Geoderma 2012, 189, 627–634. [Google Scholar] [CrossRef]
  49. Sun, W.; Zhu, Y.Q.; Huang, S.L.; Guo, C.X. Mapping the mean annual precipitation of China using local interpolation techniques. Theor. Appl. Climatol. 2015, 119, 171–180. [Google Scholar] [CrossRef]
  50. Zhang, H.; Guo, L.; Chen, J.; Fu, P.; Gu, J.; Liao, G. Modeling of spatial distributions of farmland density and its temporal change using geographically weighted regression model. Chin. Geogr. Sci. 2013, 24, 1–14. [Google Scholar] [CrossRef]
  51. Ji, W.; Viscarra Rossel, R.A.; Shi, Z. Improved estimates of organic carbon using proximally sensed vis–nir spectra corrected by piecewise direct standardization. Eur. J. Soil Sci. 2015, 66, 670–678. [Google Scholar] [CrossRef]
  52. Conforti, M.; Matteucci, G.; Buttafuoco, G. Using laboratory vis-nir spectroscopy for monitoring some forest soil properties. J. Soils Sediments 2018, 18, 1009–1019. [Google Scholar] [CrossRef]
  53. Dutta, D.; Goodwell, A.E.; Kumar, P.; Garvey, J.E.; Darmody, R.G.; Berretta, D.P.; Greenberg, J.A. On the feasibility of characterizing soil properties from aviris data. IEEE Trans. Geosci. Remote Sens. 2015, 53, 5133–5147. [Google Scholar] [CrossRef]
  54. Zabcic, N.; Rivard, B.; Ong, C.; Mueller, A. Using airborne hyperspectral data to characterize the surface ph and mineralogy of pyrite mine tailings. Int. J. Appl. Earth Obs. Geoinf. 2014, 32, 152–162. [Google Scholar] [CrossRef]
  55. Paz-Kagan, T.; Zaady, E.; Salbach, C.; Schmidt, A.; Lausch, A.; Zacharias, S.; Notesco, G.; Ben-Dor, E.; Karnieli, A. Mapping the spectral soil quality index (ssqi) using airborne imaging spectroscopy. Remote Sens. 2015, 7, 15748–15781. [Google Scholar] [CrossRef]
  56. Conrad, O.; Bechtel, B.; Bock, M.; Dietrich, H.; Fischer, E.; Gerlitz, L.; Wehberg, J.; Wichmann, V.; Böhner, J. System for automated geoscientific analyses (saga) v. 2.1.4. Geosci. Model Dev. Discuss. 2015, 8, 2271–2312. [Google Scholar] [CrossRef]
  57. Pearse, G.D.; Morgenroth, J.; Watt, M.S.; Dash, J.P. Optimising prediction of forest leaf area index from discrete airborne lidar. Remote Sens. Environ. 2017, 200, 220–239. [Google Scholar] [CrossRef]
  58. Were, K.; Bui, D.T.; Dick, Ø.B.; Singh, B.R. A comparative assessment of support vector regression, artificial neural networks, and random forests for predicting and mapping soil organic carbon stocks across an afromontane landscape. Ecol. Indic. 2015, 52, 394–403. [Google Scholar] [CrossRef]
  59. Vaudour, E.; Gilliot, J.M.; Bel, L.; Lefevre, J.; Chehdi, K. Regional prediction of soil organic carbon content over temperate croplands using visible near-infrared airborne hyperspectral imagery and synchronous field spectra. Int. J. Appl. Earth Obs. Geoinf. 2016, 49, 24–38. [Google Scholar] [CrossRef]
  60. Capolupo, A.; Kooistra, L.; Berendonk, C.; Boccia, L.; Suomalainen, J. Estimating plant traits of grasslands from uav-acquired hyperspectral images: A comparison of statistical approaches. ISPRS Int. J. Geo-Inf. 2015, 4, 2792–2820. [Google Scholar] [CrossRef]
  61. Vaudour, E.; Noirot-Cosson, P.E.; Membrive, O. Early-season mapping of crops and cultural operations using very high spatial resolution pléiades images. Int. J. Appl. Earth Obs. Geoinf. 2015, 42, 128–141. [Google Scholar] [CrossRef]
Figure 1. Geographical location of study area in Iowa and corresponding soil and land use types. The soil type map is from the Online Web Soil Survey (official USDA soil information), whereas the land use type map was interpreted using hyperspectral images. The first sampling was executed in six random places to obtain prior knowledge about soil organic carbon (SOC). Second and third sampling were executed using grid sampling at a distance of 130 m.
Figure 1. Geographical location of study area in Iowa and corresponding soil and land use types. The soil type map is from the Online Web Soil Survey (official USDA soil information), whereas the land use type map was interpreted using hyperspectral images. The first sampling was executed in six random places to obtain prior knowledge about soil organic carbon (SOC). Second and third sampling were executed using grid sampling at a distance of 130 m.
Remotesensing 10 00888 g001
Figure 2. Parameters of Headwall Micro-Hyperspec airborne sensors (A- and X-Series) and corresponding remote-sensing images in study region (true color composition: bands at 660, 550 and 480 nm in RGB for A-Series; bands at 13, 54 and 26 nm in RGB for X-Series).
Figure 2. Parameters of Headwall Micro-Hyperspec airborne sensors (A- and X-Series) and corresponding remote-sensing images in study region (true color composition: bands at 660, 550 and 480 nm in RGB for A-Series; bands at 13, 54 and 26 nm in RGB for X-Series).
Remotesensing 10 00888 g002
Figure 3. Integrated soil sampling flow. (PLSR: partial least square regression; PLSRK: partial least square regression kriging; RMSE: root mean square error; RPD: ratio of standard deviation to RMSE; DSSI: density of soil sample index; CEPA: comprehensive evaluation index of prediction accuracy; RSEP: ratio of sampling efficiency to performance).
Figure 3. Integrated soil sampling flow. (PLSR: partial least square regression; PLSRK: partial least square regression kriging; RMSE: root mean square error; RPD: ratio of standard deviation to RMSE; DSSI: density of soil sample index; CEPA: comprehensive evaluation index of prediction accuracy; RSEP: ratio of sampling efficiency to performance).
Remotesensing 10 00888 g003
Figure 4. Distribution characteristics of soil organic carbon (SOC) values. SD: standard deviation, CV: Coefficient of variation.
Figure 4. Distribution characteristics of soil organic carbon (SOC) values. SD: standard deviation, CV: Coefficient of variation.
Remotesensing 10 00888 g004
Figure 5. Soil organic carbon (SOC) maps predicted by partial least squares regression (a) and partial least squares regression kriging (c) from airborne hyperspectral remote-sensing images and their differences (b).
Figure 5. Soil organic carbon (SOC) maps predicted by partial least squares regression (a) and partial least squares regression kriging (c) from airborne hyperspectral remote-sensing images and their differences (b).
Remotesensing 10 00888 g005
Figure 6. Evaluation indices of ordinary kriging models based on different densities of soil samples. (RANDOM: the random sampling; LHS: the Latin hypercube sampling; GRID: the grid sampling; RMSE: the root mean square error; RPD: the ratio of standard deviation to RMSE. ANOVA: Analysis of Variance).
Figure 6. Evaluation indices of ordinary kriging models based on different densities of soil samples. (RANDOM: the random sampling; LHS: the Latin hypercube sampling; GRID: the grid sampling; RMSE: the root mean square error; RPD: the ratio of standard deviation to RMSE. ANOVA: Analysis of Variance).
Remotesensing 10 00888 g006
Figure 7. Evaluation indices of DSSI, CEPA and RSEP used to evaluate the performances of different soil sampling plans with different numbers of soil samples. (DSSI: density of soil samples index; CEPA: comprehensive evaluation index of the prediction accuracy; RSEP: ratio of sampling efficiency to performance; RANDOM: the random sampling; LHS: Latin hypercube sampling; GRID: grid sampling; RMSE: root mean square error).
Figure 7. Evaluation indices of DSSI, CEPA and RSEP used to evaluate the performances of different soil sampling plans with different numbers of soil samples. (DSSI: density of soil samples index; CEPA: comprehensive evaluation index of the prediction accuracy; RSEP: ratio of sampling efficiency to performance; RANDOM: the random sampling; LHS: Latin hypercube sampling; GRID: grid sampling; RMSE: root mean square error).
Remotesensing 10 00888 g007
Figure 8. Spatial distribution of soil organic carbon (SOC) predicted by partial least squares regression kriging with aid of hyperspectral images (j, original) and predicted by ordinary kriging models of different sampling methods (RANDOM: random sampling; LHS: Latin hypercube sampling; GRID: grid sampling) with various sampling densities (ai, x: 5.13, y: 0.57 and z: 0.16).
Figure 8. Spatial distribution of soil organic carbon (SOC) predicted by partial least squares regression kriging with aid of hyperspectral images (j, original) and predicted by ordinary kriging models of different sampling methods (RANDOM: random sampling; LHS: Latin hypercube sampling; GRID: grid sampling) with various sampling densities (ai, x: 5.13, y: 0.57 and z: 0.16).
Remotesensing 10 00888 g008
Figure 9. Differences between original reference and estimated soil organic carbon (SOC) maps by random sampling (a); Latin hypercube sampling (LHS) (b) and grid sampling (c) at sampling density of 0.57. H: high level (>30%); M: median level (30–10%); L: low level (<10%).
Figure 9. Differences between original reference and estimated soil organic carbon (SOC) maps by random sampling (a); Latin hypercube sampling (LHS) (b) and grid sampling (c) at sampling density of 0.57. H: high level (>30%); M: median level (30–10%); L: low level (<10%).
Remotesensing 10 00888 g009
Figure 10. Histogram of error values between original reference and estimated SOC maps by random (RANDOM), Latin hypercube (LHS) and grid sampling (GRID) at sampling density of 0.57. H: high level (>30%); M: median level (30–10%); L: low level (<10%).
Figure 10. Histogram of error values between original reference and estimated SOC maps by random (RANDOM), Latin hypercube (LHS) and grid sampling (GRID) at sampling density of 0.57. H: high level (>30%); M: median level (30–10%); L: low level (<10%).
Remotesensing 10 00888 g010
Figure 11. Comparison of parameters of semivariograms of Gaussian (a); circular (b); spherical (c) and exponential (d) semi-variance functions of soil organic carbon (SOC) based on field soil samples (n = 180) using ordinary kriging.
Figure 11. Comparison of parameters of semivariograms of Gaussian (a); circular (b); spherical (c) and exponential (d) semi-variance functions of soil organic carbon (SOC) based on field soil samples (n = 180) using ordinary kriging.
Remotesensing 10 00888 g011
Figure 12. Original reference (a) and predicted SOC maps by ordinary kriging model (b) based on field soil samples (n = 180) and differences between them (c) and percentages of different SOC error levels in study area by the pie chart. H: high level (>30%); M: median level (30–10%); L: low level (<10%).
Figure 12. Original reference (a) and predicted SOC maps by ordinary kriging model (b) based on field soil samples (n = 180) and differences between them (c) and percentages of different SOC error levels in study area by the pie chart. H: high level (>30%); M: median level (30–10%); L: low level (<10%).
Remotesensing 10 00888 g012
Figure 13. The trend lines of the quadratic polynomial function between the RSEP and the sampling density in different sampling plans.
Figure 13. The trend lines of the quadratic polynomial function between the RSEP and the sampling density in different sampling plans.
Remotesensing 10 00888 g013
Table 1. The performance parameters of the Headwall Micro-Hyperspec airborne sensors.
Table 1. The performance parameters of the Headwall Micro-Hyperspec airborne sensors.
TypesWavelengths (nm)Spectral Resolution (nm)The Number of Spectral BandsSpatial Resolution (m)
Micro-Hyperspec VNIR imaging spectrometersA-Series380–10001.93251
Micro-Hyperspec NIR imaging spectrometersX-Series900–170012.9671
Table 2. Parameters of semi-variance functions of ordinary kriging models used to predict soil organic carbon (SOC) at different numbers of soil samples.
Table 2. Parameters of semi-variance functions of ordinary kriging models used to predict soil organic carbon (SOC) at different numbers of soil samples.
The Sampling MethodsGrid Distances (m)Soil Sample NumbersSampling Density (ha−1)Major Range (m)Nugget (C0)Partial Sill (C)Sill (C0 + C)C0/(C + C0) (%)
Random Sampling 19845.13738.460.0090.070.07911.39%
8732.26610.10.0120.0830.09512.63%
4771.23697.790.0250.0580.08330.12%
3220.83605.330.0090.0750.08410.71%
2220.57970.570.0390.0340.07353.42%
1680.43257.3400.0770.0770
1280.33884.2400.0970.0970
930.24380.2100.1050.1050
780.20743.880.0090.0410.0518.00%
630.16891.30.010.0720.08212.20%
LHS 19845.13630.70.0110.0750.08612.79%
8732.26542.870.0120.0730.08514.12%
4771.23541.20.0180.0640.08221.95%
3220.83553.430.0120.0640.07615.79%
2220.57827.410.0150.0580.07320.55%
1680.43853.270.0260.0490.07534.67%
1280.331091.870.040.0560.09641.67%
930.24738.460.0290.0570.08633.72%
780.20509.3400.140.140
630.16271.7200.0940.0940
Grid Sampling4019845.13660.210.0150.0740.08916.85%
608732.26793.380.0220.0680.0924.44%
804771.23617.590.0250.0710.09626.04%
1003220.83971.830.030.0660.09631.25%
1202220.57618.750.00980.0780.087811.16%
1401680.43756.690.0210.0640.08524.71%
1601280.33955.920.0090.0810.0910.00%
180930.24564.7700.1080.1080
200780.20851.600.1030.1030
220630.16520.7200.1230.1230
LHS: Latin hypercube sampling. The exponential function was used as the semivariance function for these ordinary kriging models.
Table 3. Comparison of parameters of different semivariogram functions of soil organic carbon (SOC) based on field soil samples (n = 180) using ordinary kriging.
Table 3. Comparison of parameters of different semivariogram functions of soil organic carbon (SOC) based on field soil samples (n = 180) using ordinary kriging.
C0CDistance Parameter (a, Unit: m)RMSE (%)MSDR
Circular0.070.06595.540.31990.804
Spherical0.070.06680.710.32030.803
Exponential0.060.08879.880.31890.823
Gaussian0.080.05631.620.32050.810
C0: the nugget value, a: effective range, C: the sill value, RMSE: the root mean square error, MSDR: the mean squared deviation ratio.
Table 4. Quantifying the relationships between the sampling density and the sampling indices.
Table 4. Quantifying the relationships between the sampling density and the sampling indices.
DSSICEPARSEP
RANDOMLHSGRIDRANDOMLHSGRID
Pearson’s r to Sampling density1.00−0.91−0.94−0.89−0.90−0.92−0.90
Linear function ( y = a x + b )
a0.65−0.59−0.61−0.57−0.92−1.00−0.86
b−0.740.670.700.650.220.290.18
R21.000.840.890.780.810.850.81
Logarithmic function ( y = a l n x + b )
a −0.89−0.84−0.87−0.87−1.00−0.83
b −0.43−0.41−0.42−1.24−1.34−1.20
R2 0.980.880.940.380.440.39
Quadratic polynomial function ( y = a x 2 + b x + c )
a 0.170.120.19−0.32−0.31−0.31
b −1.45−1.25−1.560.730.610.71
c 1.081.001.12−0.55−0.47−0.56
R2 0.940.950.930.980.990.99
Table 5. Summary of estimation of soil properties using environmental factors.
Table 5. Summary of estimation of soil properties using environmental factors.
LabelsNumber of SamplesEnvironment VariablesArea of Research AreaSoil PropertiesR2CR2VReference
1110elevation, slope, cosine value of aspect, NDVI, distances from sample points to water body, soil erosion intensity, ferrous minerals index1260 km2SOM0.3140.094[56]
2272Terrain factors (e.g., elevation, slope, and aspect); Distance factors (e.g., distance to river, residential area, and road); Land cover type, and spectral indices (NDVI and NDMI)153 km2SOCD0.220.312[17]
33485Slope, elevation, mean annual air temperature (MAAT), mean annual precipitation (MAP), land use (LULC), bedrock geology, and NDVI1.98 × 106 km2SOCNone0.58[45]
4878DEM, slope gradient, elevation, hillshade, mean annual air temperature (MAAT), mean annual precipitation (MAP), geology and land use maps, and NDVI117,599 km2SOCNone0.36[48]
5202soil taxonomic units, soil parent materials, slope classes, soil texture classes, drainage classes, thickness of humiferous horizon, gravel layer, geologic units, lithologic units; slope gradient, aspect and curvature, contributing area, wetness index, the position of grid cells; Choropleth lithologic map; NDVI15 km2SOCNone0.36[57]
6320temperature, rainfall, land cover data, elevation, slope, curvature, aspect, TWI, NDVI≈650 km2SOCNone0.64[58]
DEM: digital elevation model, NDVI: normalized difference vegetation index, TWI: topographic wetness index, NDMI: normalized difference moisture index.
Table 6. Summary of estimation of soil properties using airborne hyperspectral dataset.
Table 6. Summary of estimation of soil properties using airborne hyperspectral dataset.
LabelsSamples NumbersThe Hyperspectral SensorSoil PropertiesR2CVR2VReference
TypeWavelengthSpectral ResolutionSpatial Resolution
1238CASI-2405–950 nm6 nm6 mSOC0.850.85[20]
2325AHS-160430–2540 nm18~90 nm2.6 mSOC0.770.73[24]
3165AHS-160430–2540 nm18~90 nm2.6 mSOC0.93–0.96-[25]
489SpecTIR LLC400–2500 nm4.3 or 5.3 nm1 mSOM0.600.33[27]
575AisaFENIX380–2500 nm3.5~5.5 nm1 mSOM-0.61[55]
AisaDUAL420–2450 nm4.5~6.3 nm1 mSOM-0.95
6267AISA-Eagle400–1000 nm2.9 nm1 mSOC0.590.29[59]
789CASI407–1053 nm61.5 mSOC0.87-[26]
Hyper SIM-GA: prototypal multisensor hyperspectral system of Galileo Avionica; SpecTIR LLC: ProSpecTIR V-S sensor; AHS-160: Airborne Hyperspectral Sensor 160; AisaFENIX: SPECIM’s AISA hyperspectral airborne sensors with AisaFENIX; AisaDUAL: SPECIM’s AISA hyperspectral airborne sensors with AisaDUAL; CASI-2: Compact Airborne Spectrographic Imager; CASI: Compact Airborne Spectrographic Imager; SOM: soil organic matter; SOC: soil organic carbon; R2CV: the definition coefficient of the cross validation model; R2V: the definition coefficient of the validation model.

Share and Cite

MDPI and ACS Style

Guo, L.; Linderman, M.; Shi, T.; Chen, Y.; Duan, L.; Zhang, H. Exploring the Sensitivity of Sampling Density in Digital Mapping of Soil Organic Carbon and Its Application in Soil Sampling. Remote Sens. 2018, 10, 888. https://doi.org/10.3390/rs10060888

AMA Style

Guo L, Linderman M, Shi T, Chen Y, Duan L, Zhang H. Exploring the Sensitivity of Sampling Density in Digital Mapping of Soil Organic Carbon and Its Application in Soil Sampling. Remote Sensing. 2018; 10(6):888. https://doi.org/10.3390/rs10060888

Chicago/Turabian Style

Guo, Long, Marc Linderman, Tiezhu Shi, Yiyun Chen, Lijun Duan, and Haitao Zhang. 2018. "Exploring the Sensitivity of Sampling Density in Digital Mapping of Soil Organic Carbon and Its Application in Soil Sampling" Remote Sensing 10, no. 6: 888. https://doi.org/10.3390/rs10060888

APA Style

Guo, L., Linderman, M., Shi, T., Chen, Y., Duan, L., & Zhang, H. (2018). Exploring the Sensitivity of Sampling Density in Digital Mapping of Soil Organic Carbon and Its Application in Soil Sampling. Remote Sensing, 10(6), 888. https://doi.org/10.3390/rs10060888

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop