
Technical Note

Identifying Optimal Variables to Predict Soil Organic Carbon in Sandy, Saline, and Black Soil Regions: Remote Sensing, Terrain, or Climate Factors?

by Liping Wang 1,2, Huanjun Liu 2, Xiang Wang 3, Xiaofeng Xu 4, Liyuan He 4, Chong Luo 2, Yong Li 1,2, Xinle Zhang 5, Deqiang Zang 6, Shufeng Zheng 1,* and Xiaodan Mei 7
1 School of Hydraulic and Electric-Power, Heilongjiang University, Harbin 150080, China
2 State Key Laboratory of Black Soils Conservation and Utilization, Northeast Institute of Geography and Agroecology, Chinese Academy of Sciences, Changchun 130102, China
3 College of Geodesy and Geomatics, Shandong University of Science and Technology, Qingdao 266590, China
4 Biology Department, San Diego State University, San Diego, CA 92182, USA
5 College of Information Technology, Jilin Agricultural University, Changchun 130118, China
6 School of Public Administration and Law, Northeast Agricultural University, Harbin 150030, China
7 School of Surveying and Mapping Engineering, Heilongjiang Institute of Technology, Harbin 150050, China
* Author to whom correspondence should be addressed.
Remote Sens. 2025, 17(2), 237; https://doi.org/10.3390/rs17020237
Submission received: 27 November 2024 / Revised: 2 January 2025 / Accepted: 8 January 2025 / Published: 10 January 2025
(This article belongs to the Section Remote Sensing for Geospatial Science)
Figure 1. Locations of the study area: (a) location of the study area; (b) map of sandy region and sampling sites in AH; (c) image of sandy soil in AH; (d) map of saline soil and sampling sites in DM&LD; (e) image of soil salinization in DM&LD; (f) map of black soil region and sampling sites in HL; (g) image of black soil in HL.
Figure 2. Original reflectance (OR, at the top of each figure) and continuum removal (CR, at the bottom of each figure) with different SOC contents. (a) AH, sandy region; (b) DM&LD, saline region; (c) HL, black soil region.
Figure 3. The weighting values of bands, spectral indexes, and terrain factors ((a): AH; (b): DM&LD; (c): HL; (d): all regions). The yellow points mark the two selected variables with the highest values in each part of the study areas. Abbreviations: Bn (the nth band of Sentinel-2), Green Normalized Difference Vegetation Index (GNDVI), Enhanced Vegetation Index (EVI), Soil Adjusted Total Vegetation Index (SATVI), Transformed Vegetation Index (TVI), Ratio Vegetation Index (RVI), Green Ratio Vegetation Index (GRVI), Land Surface Water Index (LSWI), Moisture Stress Index (MSI), Soil Adjusted Vegetation Index (SAVI), Normalized Differences Vegetation Index (NDVI), slope (S), aspect (A), plan curvatures (PlC), profile curvatures (PrC), topographic wetness index (TWI), roughness (Rn), relief (RL), slope length (SL), and hillshade (HS).
Figure 4. Spatial map of precipitation and temperature in the three different regions.
Figure 5. Training results for each region and all regions based on RF model. (a) Sandy soil area in AH. (b) Saline soil area in DM&LD. (c) Black soil region in HL. (d) All regions.
Figure 6. Validation results of each region and all regions based on RF model. (a) Sandy soil area in AH. (b) Saline soil area in DM&LD. (c) Black soil region in HL. (d) All regions.
Figure 7. Training and validation results using the local regression method based on RF.
Figure 8. Spatial distribution of predicted SOC content in cultivated land in RF model ((a): SOC of AH; (b): SOC of DM&LD; (c): SOC of HL).

Abstract

Environmental variables have a substantial effect on the reliability of soil organic carbon (SOC) mapping. However, it remains challenging to identify which environmental variables are effective for cropland SOC prediction in sandy, saline, and black soil regions. To address this issue, we used principal component analysis (PCA) to select features among bands, spectral indexes, and terrain factors for each region. Based on the selected features, we used global random forest (RF) and local RF models to predict SOC in these three regions. Our results indicated that (1) climate factors, particularly mean annual precipitation and mean annual temperature, were the most effective predictors in SOC mapping across sandy, saline, and black soil regions, as indicated by their significant contribution to RF model performance (R2 > 0.63); (2) after climate factors, the Transformed Vegetation Index (TVI) was consistently identified as the most influential variable for SOC prediction among spectral indexes in all three regions; (3) a local regression method based on RF models performed better than a global model; (4) desertification and salinization were the main reasons for the spatial differences in AH and DM&LD, respectively. In the black soil region (HL), the spatial pattern of SOC followed the climate gradient associated with the difference in latitude. This study provides valuable information for constructing a more precise soil prediction strategy for cultivated land in sandy, saline, and black soil regions.

1. Introduction

Soil organic carbon (SOC) constitutes a significant fraction of the soil carbon pool [1]. The decomposition of soil organic matter releases carbon dioxide, which is a major contributor to the rising atmospheric CO2 levels [2]. Therefore, SOC mapping in croplands is necessary for comprehending its dynamics and assessing its contribution to the global carbon cycle.
In previous studies, SOC prediction has primarily relied on statistical indexes comparing predicted values with field-collected soil samples. However, the substantial costs associated with field sampling and laboratory analysis pose significant challenges to large-scale SOC monitoring [3,4]. The visible (VIS) spectral region has been employed for SOC estimation, as increasing SOC levels typically reduce reflectance in this region [5]. Meanwhile, the NIR and SWIR spectral regions have demonstrated high sensitivity to SOC prediction. Collectively, these findings underscore the potential of the VIS, NIR, and SWIR spectral regions in effectively detecting SOC [6,7].
Recent works showed that a bare soil image could be obtained from time-series mean images, effectively minimizing the interference caused by vegetation cover and soil moisture [1,8]. Furthermore, auxiliary variables such as terrain and climate factors are strongly connected to SOC. Terrain characteristics control the redistribution rate of soil on the hillside and affect the quantity and quality of SOC in the landscape because topography plays an important role in soil accumulation in the process of soil erosion and deposition [9]. Climate factors regulate the balance between carbon input from plant residues and carbon loss due to the decomposition of SOC by soil microorganisms [10]. Hence, with digital soil mapping (DSM) technology, SOC can be quantitatively predicted by establishing the relationship between field soil observations and easy-to-measure environmental data [11,12]. For instance, Chen et al. [13] examined the spatio-temporal distribution of SOM by MODIS, terrain, and climate variables in Hubei province. Zeraatpisheh et al. [1] used remote sensing and topography covariates in the Cubist, regression tree, and RF models to estimate SOC content in a semiarid region in central Iran. However, their research merely combined different variables based on experience, lacking a quantitative method for selecting variables of different types.
The accuracy of SOC prediction is notably influenced by the complexity of soil classes, as certain spectral response bands may be obscured. In China, sandy, saline, and black soil regions represent three distinct areas with unique environmental characteristics that shape their soil spectral features. For instance, sandy regions are typically associated with low SOC content [14]. Thus, the spectral reflectance of aeolian sandy soil exceeds that of loamy or clayey soil such as black soil in visible light. The spectral curve after continuum removal presents slender absorption valleys in visible light [15]. The spectral curves of different salinized soils tend to be consistent in morphology, and the reflectance increases with the wavelength [16]. Black soil is predominantly found in the northeast of China, with high content of soil nutrients and SOC. Due to its dark color and elevated organic carbon levels, black soil demonstrates low spectral reflectance in visible light, and its spectral curve after envelope removal presents deep and pronounced absorption valleys [17]. In large or distinct datasets, the relationship between soil spectral properties and SOC is highly nonlinear and spatial-dependent. Despite the observable differences in soil spectral features across sandy, saline, and black soil regions, it remains unclear which variables—remote sensing, terrain or climate factors—are most effective for SOC prediction.
The random forest (RF) model is capable of generating predictions from high-dimensional data by averaging the outcomes of multiple decision trees, each constructed with varying parameters. This ensemble approach enhances the model’s predictive accuracy and robustness. Additionally, to account for variations in soil or environmental conditions, separate prediction models can be developed using local regression techniques. This allows for more tailored and precise predictions by considering the specific characteristics of different datasets. Shi et al. [18] proposed a local regression model to predict SOC which can effectively avoid the problem with the cover-up response band. The optimal model can also be used in each class of soil to predict SOC. Ward et al. [19] used two clustering methods to build local regression models, and they found that the accuracy of the local prediction model using spectral feature parameters performed well.
This study aims to address the gap in previous research regarding the influence of environmental variables on different soil class regions by utilizing multiple remote sensing datasets and prediction methods to analyze the spatial distribution of SOC. Specifically, the objectives are threefold: (1) to compare the optimal variables for retrieving SOC in sandy, saline, and black soil regions; (2) to evaluate whether incorporating region-specific inputs can enhance prediction accuracy; and (3) to generate a spatial distribution map of SOC with a 10 m spatial resolution. By focusing on these objectives, this study seeks to provide a more nuanced understanding of how environmental covariates affect SOC distribution across varied soil class regions.

2. Materials and Methods

2.1. Study Area

The study area comprises three distinct regions: Aohan Banner (AH), located in a sandy zone; Durbert Mongolian Autonomous Region and Lindian County (DM&LD), situated in a salinization zone; and Hailun County (HL) positioned within a black soil zone (Figure 1).
AH is located at 119°32′E~120°53′E, 41°41′N~43°02′N, encompassing a land area of 8181 km2, with a quarter of it being a desertification area [20]. The predominant soil type in this region is Arenosol. The average annual temperature is 6.6 °C, with an annual rainfall of 449 mm. Elevations range from 0 to 1233 m above sea level, with the highest elevations predominantly located in the southern parts of AH.
DM&LD covers approximately 9539 km2 at 45°53′N~47°28′N and 123°45′E~125°20′E. More than 15% of this area is affected by salinization [21]. Its terrain is gentle, with low elevations ranging from 111 m to 196 m. The region experiences an average annual temperature of 5.2 °C and an annual rainfall of 533 mm.
HL is situated at 46°58′N~47°50′N, 126°13′E~127°45′E and is characterized by black (Phaeozem) soils [22]. It spans an area of 4643 km2, with elevations ranging from 142 m to 485 m. The average annual temperature is 3.5 °C, with an annual rainfall of 608 mm. Additionally, there is a “bare soil period” in the study area from the end of March to May, during which there is no snow cover and minimal crop residue in the fields.

2.2. Collection and Measurement of Soil Samples

In total, 309 topsoil (0–20 cm) samples were collected from April to May in 2014, 2016, 2018, and 2019 to ensure comprehensive coverage of the study area, with 102 samples collected in AH, 96 in DM&LD, and 117 in HL. Because cropping practices were consistent throughout the study period (2014–2019), SOC changes over this period were assumed to be minimal and were disregarded. Five subsamples, taken at the four corners and the center of a 1 m2 frame, were mixed to form one representative sample. To ensure sample integrity and prevent cross-contamination, each soil sample was put in a separate bag. Subsequently, all soil samples were air-dried and passed through a 2 mm sieve before SOC analysis. SOC was determined using the potassium dichromate volumetric method [23].
The portable spectrometer used for analyzing the spectral properties of air-dried soil samples was the ASD FieldSpec 3 (Malvern Panalytical, Malvern, UK). The soil spectral range covered wavelengths from 350 to 2500 nm. The sampling interval and spectral resolution were 1.4 nm and 3 nm, respectively, within the range of 350–1000 nm, and 2 nm and 10 nm within the range of 1000–2500 nm. Subsequently, the spectrometer resampled the data to 1 nm resolution.
During testing, each soil sample was placed in a sample dish measuring 12 cm in diameter and 1.8 cm in depth. A 50 W halogen lamp, positioned at a zenith angle of 30° and a distance of 100 cm from the sample, served as the light source. The lamp was oriented nearly parallel to the soil surface to minimize the influence of shadows resulting from soil roughness. An optical fiber probe with an 8° field of view was positioned at 15 cm from the soil surface, with the sensor probe perpendicular to the sample. To ensure accurate measurements, the spectrometer accounted for and removed the influence of dark current in radiation intensity before testing and was calibrated using a white board. Each soil sample underwent testing with 10 spectral curves, and the average value was calculated to determine the sample’s reflectance. Subsequently, spectral reflection data were subjected to a nine-point smoothing process and resampled to a 10 nm resolution.
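The smoothing and resampling step at the end of this procedure can be sketched as follows. This is a minimal illustration, not the authors' processing code: it assumes the nine-point smoothing is a simple moving average and that resampling to 10 nm is done by averaging consecutive groups of ten 1 nm samples. The function name `smooth_and_resample` and the flat synthetic spectrum are hypothetical.

```python
import numpy as np

def smooth_and_resample(wavelengths, reflectance, window=9, step=10):
    """Moving-average smoothing followed by block averaging (assumed method)."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    reflectance = np.asarray(reflectance, dtype=float)
    # Pad with edge values so the smoothed curve keeps the input length.
    half = window // 2
    padded = np.pad(reflectance, half, mode="edge")
    kernel = np.ones(window) / window
    smoothed = np.convolve(padded, kernel, mode="valid")
    # Average consecutive groups of `step` samples (1 nm spacing assumed);
    # samples left over at the end of the range are dropped.
    n_bins = len(smoothed) // step
    centers = wavelengths[: n_bins * step].reshape(n_bins, step).mean(axis=1)
    values = smoothed[: n_bins * step].reshape(n_bins, step).mean(axis=1)
    return centers, values

# Synthetic stand-in for one averaged laboratory spectrum on a 1 nm grid.
wl = np.arange(350, 2501)
spec = np.full(wl.shape, 0.25)
wl10, spec10 = smooth_and_resample(wl, spec)
```

Other smoothing filters (for example Savitzky-Golay) would fit the same description; the moving average is chosen here only because it is the simplest reading of "nine-point smoothing".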

2.3. Explanatory Covariates for SOC Prediction

SOC is influenced by environmental and ecological factors and their interactions. Based on previous studies [2,13], explanatory covariates including spectra, spectral indexes, and terrain and climate factors were selected to predict SOC content (Table 1).

2.3.1. Remote Sensing Data

Sentinel-2 Level-1C images were processed using the official atmospheric correction model, Sen2Cor [24]. These images consist of ten bands with varying spatial resolutions: bands B2 (490 nm), B3 (560 nm), B4 (665 nm), and B8 (842 nm) have a spatial resolution of 10 m, whereas bands B5 (705 nm), B6 (740 nm), B7 (783 nm), B8a (865 nm), B11 (1610 nm), and B12 (2190 nm) have a spatial resolution of 20 m. Additionally, three atmospheric correction channels, B1, B9, and B10, have a spatial resolution of 60 m.
To ensure high spatial detail and reliable spectral information, the bilinear interpolation method was employed to resample the Sentinel-2 images to a spatial resolution of 10 m [7,24,26]. Mean spectral reflectance over a time series enhances the accuracy of soil property prediction, albeit with minor influences from soil moisture, cloud cover, and green vegetation [8]. We therefore calculated the mean spectral reflectance as the final spectral reflectance used in the study area. Specifically, the mean spectral reflectance was extracted from 20 March to 20 April 2018 in AH and from 20 April to 20 May 2018 in HL and DM&LD. These intervals represent the bare soil period in these areas.
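A per-pixel temporal mean of this kind can be sketched as below. This is an illustrative sketch under stated assumptions: the scenes are already co-registered on the same 10 m grid, and cloudy or otherwise invalid pixels have been set to NaN beforehand. The function name `mean_composite` and the tiny example array are hypothetical.

```python
import numpy as np

def mean_composite(stack):
    """Per-pixel mean over the time axis (axis 0), ignoring NaN observations."""
    stack = np.asarray(stack, dtype=float)
    valid = ~np.isnan(stack)
    counts = valid.sum(axis=0)
    totals = np.where(valid, stack, 0.0).sum(axis=0)
    # Pixels with no valid observation stay NaN.
    out = np.full(counts.shape, np.nan)
    return np.divide(totals, counts, out=out, where=counts > 0)

# Three dates, one row, three pixels; NaN marks a masked observation.
scenes = np.array([
    [[0.2, np.nan, np.nan]],
    [[0.4, np.nan, np.nan]],
    [[0.6, 0.3, np.nan]],
])
composite = mean_composite(scenes)
```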
Mean spectral reflectance values acquired from Sentinel-2 (the mean value from July to August, which corresponds to the growing season) were also used to calculate remote sensing indexes. These indexes include the Green Normalized Difference Vegetation Index (GNDVI), Enhanced Vegetation Index (EVI), Soil Adjusted Total Vegetation Index (SATVI), Transformed Vegetation Index (TVI), Ratio Vegetation Index (RVI), Green Ratio Vegetation Index (GRVI), Land Surface Water Index (LSWI), Moisture Stress Index (MSI), Soil Adjusted Vegetation Index (SAVI), and the Normalized Differences Vegetation Index (NDVI).
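As an illustration, several of these indexes can be computed from band reflectance as follows. The formulas are the commonly used definitions and are an assumption here; the exact forms used in this study are those given in Table 1 and may differ. The mapping of green, red, and NIR to Sentinel-2 bands B3, B4, and B8, the soil adjustment factor of 0.5, and the function name `vegetation_indices` are likewise assumptions.

```python
import numpy as np

def vegetation_indices(green, red, nir, soil_factor=0.5):
    """Common textbook forms of five of the indexes (assumed, not from Table 1)."""
    green, red, nir = (np.asarray(b, dtype=float) for b in (green, red, nir))
    ndvi = (nir - red) / (nir + red)
    return {
        "NDVI": ndvi,
        "GNDVI": (nir - green) / (nir + green),
        "RVI": nir / red,
        "TVI": np.sqrt(ndvi + 0.5),
        "SAVI": (1 + soil_factor) * (nir - red) / (nir + red + soil_factor),
    }

# Example reflectance values for a single vegetated pixel.
idx = vegetation_indices(green=0.2, red=0.1, nir=0.5)
```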

2.3.2. Terrain Factors

To obtain the terrain attributes, a digital elevation model (DEM) was downloaded from the Shuttle Radar Topography Mission (SRTM) at a resolution of 30 m. The DEM was obtained from https://search.asf.alaska.edu/#/ (accessed on 25 March 2020). The other nine terrain factors obtained from DEM using ENVI 5.3 included slope (S), aspect (A), plan curvatures (PlCs), profile curvatures (PrCs), topographic wetness index (TWI), roughness (Rn), relief (RL), slope length (SL), and hillshade (HS).

2.3.3. Climate Factors

We retrieved meteorological station data from the China Meteorological Data Center (http://data.cma.cn/) for the year 2018. The data included mean annual precipitation (MAP) and mean annual temperature (MAT). To standardize the data and align them with the temporal and spatial resolution of Sentinel-2 imagery, we performed spline interpolation, as outlined in [13,27]. Subsequently, the meteorological data were reprojected and resampled to match the temporal and spatial resolution of the Sentinel-2 dataset.

2.4. Selection of Predictors for SOC Prediction

The principal component analysis (PCA) evaluation method was used to assess the weighting coefficient (W) of the predictors across different classes, as described by [5,28]. From the 10 bands, the spectral indexes, and the terrain factors, we selected the two variables with the highest W values in each class as the inputs of the prediction model. This approach effectively mitigates collinearity and redundancy among the predictors. The specific calculation formula for W is as follows:
$$\alpha_{ni} = \frac{\varepsilon_{ni}}{\sqrt{\sigma_{ni}}}$$
$$W_n = \frac{\alpha_{n1}\delta_{n1} + \alpha_{n2}\delta_{n2} + \cdots + \alpha_{ni}\delta_{ni}}{\delta_{n1} + \delta_{n2} + \cdots + \delta_{ni}}$$
Note: $q$ represents the Q-statistic value with a range value of [0, 1]; $L$ represents the number of strata; $N$ and $N_h$ represent the total number of field samples and the number of samples in strata, respectively; and $\sigma^2$ is the variance of the SOM in the study area. The GDM was executed with Geodetector (Version 1.5) software. $W_n$ is the weighting factor corresponding to band $n$. $\alpha_{ni}$, $\sigma_{ni}$, $\delta_{ni}$, and $\varepsilon_{ni}$ are the linear coefficient, initial eigenvalue, initial eigenvalue variance, and load number corresponding to the $i$th principal component of the $n$th band, respectively.
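One way to compute such PCA-based weights is sketched below. It is a hedged illustration rather than the authors' implementation, and it rests on several assumptions: PCA is run on the correlation matrix, the linear coefficient is the loading divided by the square root of the eigenvalue, absolute loadings are used so that signs do not cancel, the explained-variance share stands in for the "initial eigenvalue variance", and the weights are normalized to sum to one. The function name `pca_weights` and the random data are hypothetical.

```python
import numpy as np

def pca_weights(X, n_components=2):
    """Variance-weighted PCA coefficients per variable (assumed formulation)."""
    X = np.asarray(X, dtype=float)
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    # Keep the leading components, largest eigenvalue first.
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs * np.sqrt(eigvals)          # load numbers
    alpha = np.abs(loadings) / np.sqrt(eigvals)    # linear coefficients
    delta = eigvals / corr.shape[0]                # explained-variance share
    w = alpha @ delta / delta.sum()
    return w / w.sum()                             # normalize to sum to 1

# Synthetic stand-in: 50 samples of 10 candidate predictors.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 10))
weights = pca_weights(X_demo)
top_two = np.argsort(weights)[::-1][:2]  # indices of the two largest weights
```

Under these assumptions, selecting the two highest-weighted variables per covariate class amounts to calling the function once per class and taking `top_two`.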

2.5. Random Forest Model

RF is an ensemble learning technique frequently applied in regression tasks. As a supervised learning algorithm, RF constructs a collection of decision trees by resampling the training data and combining the predictions from multiple trees, collectively forming the “forest”. The RF model contains a random selection of features and evaluates variable importance based on the increase in mean error associated with each tree. RF uses Out-Of-Bag (OOB) data to generate reliable error estimates [17]. This regression approach is effective in mitigating common challenges such as overfitting, noise, and irrelevant variables [7]. The number of trees in the forest (n_estimators), the minimum number of samples required to be at a leaf node (min_samples_leaf), and the maximum depth of the tree (max_depth) were set to 500, 1, and 40, respectively [29]. In this study, we employed two types of RF models: global RF and local regression based on RF. Both models were used to predict SOC but differed in their approach to handling spatial variability across the study area. Global RF generates one model based on the entire study area, while local regression based on RF builds separate models for the three regions within the study area.
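The difference between the global and local setups can be sketched with scikit-learn, whose `RandomForestRegressor` uses the same parameter names quoted above. This is an illustrative sketch on synthetic data, not the study's code: the use of scikit-learn, the `random_state`, the stratified 2:1 split, the eight placeholder predictors, and the helper names `fit_global`, `fit_local`, and `predict_local` are all assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Hyperparameters as stated in the text; random_state is added for repeatability.
RF_PARAMS = dict(n_estimators=500, min_samples_leaf=1, max_depth=40, random_state=0)

def fit_global(X, y):
    """One model for the entire study area."""
    return RandomForestRegressor(**RF_PARAMS).fit(X, y)

def fit_local(X, y, region):
    """One model per region label."""
    return {r: RandomForestRegressor(**RF_PARAMS).fit(X[region == r], y[region == r])
            for r in np.unique(region)}

def predict_local(models, X, region):
    """Route each sample to the model of its own region."""
    pred = np.full(len(X), np.nan)
    for r, model in models.items():
        mask = region == r
        if mask.any():
            pred[mask] = model.predict(X[mask])
    return pred

# Synthetic stand-in data: 30 samples per region, 8 placeholder predictors.
rng = np.random.default_rng(0)
n = 90
region = np.repeat(np.array(["AH", "DM&LD", "HL"]), n // 3)
X = rng.normal(size=(n, 8))
y = X[:, 0] + rng.normal(scale=0.1, size=n) + np.repeat([6.0, 18.0, 30.0], n // 3)

# 2:1 calibration/validation split, stratified by region (an assumption).
X_tr, X_va, y_tr, y_va, r_tr, r_va = train_test_split(
    X, y, region, test_size=1 / 3, random_state=0, stratify=region)

global_pred = fit_global(X_tr, y_tr).predict(X_va)
local_models = fit_local(X_tr, y_tr, r_tr)
local_pred = predict_local(local_models, X_va, r_va)
```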

2.6. Model Calibration and Validation

The calibration set and validation set were randomly divided in a 2:1 ratio to model the relationships between the SOC observations and the predictors. The RF algorithms were employed to construct predictive SOC models. Then, we used two validation criteria to determine the model’s performance: R2 and RMSE. A well-performing model is always accompanied by a higher R2 and a lower RMSE.
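$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$
$$RMSE = \sqrt{\frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{n}}$$
where $n$ is the number of samples, $y_i$ is the observed SOC value of sample $i$, $\hat{y}_i$ is the predicted SOC of sample $i$, and $\bar{y}$ is the mean of the observed SOC values.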
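The two criteria translate directly into code. The following minimal sketch uses NumPy; the function names `r2` and `rmse` and the four example values are hypothetical.

```python
import numpy as np

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean square error, in the units of the observations."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 3.0, 3.5]
score_r2 = r2(observed, predicted)      # 1 - 0.5 / 5.0 = 0.9
score_rmse = rmse(observed, predicted)  # sqrt(0.5 / 4)
```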

3. Results and Discussion

3.1. Description of SOC Content

Overall, the mean (min–max) SOC contents were 5.90 (2.00–12.16) g·kg−1 in AH, 17.87 (3.72–38.89) g·kg−1 in DM&LD, and 30.32 (10.42–64.07) g·kg−1 in HL (Table 2). AH and HL had the lowest and highest variation, respectively. The coefficient of variation (CV) within the three regions ranged from 35.82 to 36.31%, indicating moderate variability in each of AH, DM&LD, and HL, whereas the CV for all regions combined indicated high variability.

3.2. Characteristics of Soil Spectra

Because multi-spectral features did not clearly separate samples with different SOC contents, we used the hyper-spectral features measured indoors to analyze how the spectra change with SOC. The spectral curves of soils from the three different classes show remarkable similarities in the overall position of absorbance features, which relate to major soil constituents. However, the different classes also exhibited various shape characteristics, especially in Vis-NIR spectral regions; the differences in reflectance were clearer in the Vis region (400–700 nm) than in NIR (Baumgardner, 1985). Soil spectral reflectance decreases as SOC content increases between 500 and 700 nm, while this is not the case for longer wavelengths. The decrease in reflectance with increasing SOC was largest and fastest in HL and slowest in AH (Figure 2). The continuum removal (CR) values strongly correlated with Vis-NIR reflectance and SOC content. Indeed, we observed an absorption valley centered at 500 nm, with two smaller ones between 600 nm and 900 nm in AH (Figure 2a). In DM&LD, there were two evident absorption valleys and one reflection peak at 400–800 nm, while after 800 nm, there was a tiny absorption valley (Figure 2b). The spectral reflectance in HL has two absorption valleys. With the increase in SOC, the second absorption valley became more pronounced, which was related to SOC content (Figure 2c).
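Continuum removal, as used for the CR curves discussed here, is commonly computed by dividing each spectrum by its upper convex hull. The sketch below follows that common definition and is an assumption about the procedure, not a description of the software used in this study. The function name `continuum_removal` and the five-point example spectrum are hypothetical, and wavelengths are assumed to be strictly increasing.

```python
import numpy as np

def continuum_removal(wavelengths, reflectance):
    """Divide a spectrum by its upper convex hull (the continuum)."""
    x = np.asarray(wavelengths, dtype=float)
    y = np.asarray(reflectance, dtype=float)
    hull = []  # indices of upper-hull vertices, left to right
    for i in range(len(x)):
        # Drop the last vertex while it lies on or below the segment
        # joining the vertex before it to the new point i.
        while len(hull) >= 2:
            a, b = hull[-2], hull[-1]
            cross = (x[b] - x[a]) * (y[i] - y[a]) - (y[b] - y[a]) * (x[i] - x[a])
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append(i)
    continuum = np.interp(x, x[hull], y[hull])
    return y / continuum

# Small synthetic spectrum with two absorption valleys.
wl = np.array([400.0, 500.0, 600.0, 700.0, 800.0])
refl = np.array([0.2, 0.1, 0.3, 0.2, 0.4])
cr = continuum_removal(wl, refl)
```

Values of 1 lie on the continuum, and values below 1 measure the relative depth of an absorption valley, which is what makes valleys comparable between bright sandy soils and dark black soils.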

3.3. Selection of Predictors for SOC Prediction

To select prominent explanatory covariates and reduce the redundancy and the complexity of the prediction model, we calculated the weighting values among the bands, spectral indexes, and terrain factors by PCA, with the two highest weighting values for each used as the inputs in every covariate class (Figure 3). In addition, we chose MAP and MAT as the climate factors in all RF models due to their excellent performance in SOC prediction. In AH, B6, B8, TVI, GNDVI, S, and A were selected as the predictors by PCA. The weighting values of each class (bands, spectral indexes, and terrain) accounted for 20.775%, 22.077%, and 33.330%, respectively (Figure 3). B4, B8, TVI, RVI, S, and PrC, as selected by PCA, were the most effective covariates in predicting SOC content in DM&LD, with the weighting values of each class (bands, spectral indexes, and terrain) accounting for 20.776%, 22.953%, and 27.199%, respectively. In HL, B2, B4, TVI, SATVI, S, and TWI were responsible for the prediction of SOC as determined by PCA. The weighting values of each covariate class (bands, spectral indexes, and terrain) accounted for 20.775%, 22.077%, and 33.330%, respectively. When the sampling points of three regions were considered comprehensively, B4, B5, TVI, RVI, slope, and TWI in pairs were selected as the band, spectral index, and terrain input variables, respectively.

3.4. RF Model Performance

Based on the validation criteria, climate factors showed the greatest contribution to the RF model in the study area. Considering all three regions, climate factors alone could explain 77% of SOC variation in the study area (Table 3). The result was consistent with Spohn et al. (2023) [30], who showed that the input of soil carbon sources depends on climate. Terrain factors showed minor impacts on SOC in the study area compared with remote sensing and climate variables. Climate factors (such as temperature and precipitation) directly regulate key processes in the carbon cycle and its spatial variation, including plant productivity, litter decomposition, and microbial activity (Figure 4). These processes are the core determinants of SOC dynamics, making climate factors more influential than terrain factors. Among the spectral indexes, TVI was a good predictor of SOC in this study (Figure 3).
Affected by sandy characteristics and climate variations, bands, spectral indexes, and climate variables together showed the highest performance in AH (training: R2 = 0.94, RMSE = 1.97 g kg−1, and validation: R2 = 0.70, RMSE = 3.54 g kg−1) (Figure 5 and Figure 6). The mean spectral reflectance from multi-date bare soil images could be used to improve the prediction accuracy by reducing the effects of water, clouds, and vegetation [8]. B6 and B8 were the optimal bands, which indicated that the high-weight bands shift toward longer wavelengths, to around 650–900 nm (Figure 2a), and the weight values of the three red edge bands were higher. B6 could represent the red edge bands due to its narrow range. The soil spectrum of AH had a slight absorption characteristic at about 900 nm, which was more evident than that of DM&LD. Therefore, B8 was also selected as a high-importance band. This shift of the high-weight bands of sandy soil in AH toward longer wavelengths was consistent with [25], who found that 600~900 nm was the optimum spectral range when using fractional differential transform spectroscopy to retrieve SOC in arid soil. GNDVI reflects the background influence of plant canopies, which is also related to vegetation coverage [11,31], and shows a high correlation value when the background soil is demarcated as dark or bright [32]. This demarcation could distinguish soils with different degrees of desertification: soil would be brighter and drier in areas with more severe desertification. Temperature and precipitation regulate the balance between carbon input from plant residues and carbon output through the decomposition of SOC by soil microorganisms [10]. Humid areas have good plant diversity and a vigorous metabolism, conducive to the reproduction of microorganisms in the soil, while desert areas are the opposite. Therefore, climate factors are an important driver of the spatial pattern of SOC in sandy regions.
In DM&LD, bands, spectral indexes, and climate variables jointly contribute to model accuracy, with an R2 of 0.60 and RMSE of 13.95 g kg−1. The optical bands B4 and B8 were selected as predictive variables. The first absorption valley (around 500 nm) had a certain absorption depth in all regions (Figure 2b), and the absorption intensity of the second absorption valley (about 650 nm) was more affected by SOC. Therefore, the position of the second absorption valley was selected and corresponded to the B4 band of Sentinel-2. The sensitive band of SOC in DM&LD moved in the long-wave direction and had a slight absorption characteristic before 900 nm, so B8 was also selected as the characteristic band of DM&LD. RVI is a sensitive indicator of vegetation biomass and can also reflect soil degradation conditions in saline regions [33]. A dry climate is the main external factor in soil salinization, and soil freezing can also aggravate the salinization process. When a specific geothermal gradient is formed, the soil moisture moves from the bottom to the freezing front, and then the salt moves upward. Thus, the increase in soil surface salinity will decrease soil compaction and fertility and lead to a corresponding decrease in SOC [34].
Due to climate variations, SOC content varies significantly in black soil areas, forming different spectral characteristics. The model using bands and spectral indexes and the model using these two classes plus terrain variables showed similar SOC prediction performance (R2 = 0.73) in HL. Still, RMSE for the model without terrain factors was lower (RMSE = 5.67 g kg−1) than the value obtained for the model with terrain factors added as predictors. This further indicates that terrain factors had little effect in HL. SATVI is calculated from the red and SWIR bands. Studies have shown that Vis-NIR spectra are sensitive to both organic and inorganic soil compositions [35,36]. In contrast, the SWIR spectral range is primarily characterized by absorption features associated with clay minerals (such as smectite and illite) and organic compounds (particularly aliphatics) [37]. These spectral features have been identified as key variables for predicting soil organic carbon (SOC) in rich, dark soils. Due to the difference in latitude, the differential distribution of temperature and precipitation is an important reason for the zonal distribution of SOC.
We constructed a local regression model combining the three regions, and this local regression model was found to be superior to all other models. There was an increase of 0.03 in R2 values for both the training and validation results. Additionally, the RMSE values increased by 0.33 and 0.19 for the training and validation results, respectively (Figure 6 and Figure 7). The local regression model was recommended in this study. But there was also some local uncertainty due to the different numerical ranges of samples in other regions. When only taking climate factors into account for AH, DM&LD, and HL, the performance was not as good as that achieved with optimally selected variables.

3.5. Spatiotemporal Changes in SOC in the Study Area

The prediction map in AH indicated that the SOC content ranged from 5 to 30 g kg−1 and increased gradually from the central part of the region to the south (Figure 8). Still, there was a slight drop at the southernmost part, caused by the expansion of urban construction land, which led to more intensive human activities. Frequent human activities put pressure on agricultural water usage and contributed to the degradation of soil aggregates [14]. A reduction in aggregate formation leads to carbon loss in the topsoil, as aggregates play a key role in carbon storage [38]. Thus, rapid urban expansion leads to soil carbon loss by transforming fertile soil into marginal soil. The lower SOC content predicted in the northern parts of AH could be attributed to drought and desertification. Despite the positive impact of sandy land consolidation measures on SOC restoration, the vulnerability of the sandy ecosystem, coupled with human activities such as poor management practices and grazing, has led to the gradual formation of fixed and semi-fixed dunes, ultimately resulting in SOC loss [1,14].
In DM&LD, the highest values of SOC were in the north and in a small area in the southwest of the study area, with the SOC content mainly ranging from 5 to 25 g kg−1. Lower SOC values were distributed in the central and south-central regions affected by salinization. High salinity can cause colloid dispersion and soil damage in two ways. First, a very low infiltration rate and permeability hinder the exchange of water and air in the soil, and the resulting lack of oxygen can limit the activities of microorganisms, indirectly affecting the nutrient absorption and supply of soil to crops [39,40]. Second, too much alkaline salt in the soil will cause a strong alkali reaction, and plant nutrient elements such as iron, calcium, and magnesium will easily form precipitates, affecting soil nutrient availability [41]. Due to the presence of paddy fields, the southwest has a high content of SOC.
The SOC content in HL was higher than that in AH and DM&LD because of the good properties and high fertility of black soil, with a large proportion of values ranging from 20 to 50 g kg−1. The content of SOC from the central to the north area was higher than that in the southwest part. Temperature is an essential factor affecting SOC content. As latitude increases, temperature decreases, which weakens microbial activity. This results in slower decomposition rates in the soil and, consequently, a higher SOC content in colder areas. Wang et al. [14] also pointed out that temperature was negatively correlated with SOC.

3.6. Limitations and Future Research

While this study provides valuable insights into the prediction of SOC in black, sandy, and salinized soils, several limitations need to be addressed in future research. First, PCA was used to select inputs in this study, but any variable selection method has certain limitations. When the PCA factor loadings are both positive and negative, the functional meaning of the composite score is unclear. More input selection methods need to be compared in future research to reduce the uncertainty of variable selection. Second, several agricultural activities, such as tillage practice, irrigation, and residue management, are essential variables affecting the spatial heterogeneity of SOC. However, owing to the spatial and data-related limitations of this study, in particular the lack of agricultural data, only environmental variables were considered. To address this limitation, future studies could incorporate more comprehensive agricultural management data, such as time-series information on agricultural activities, fertilization practices, and tillage methods, to better analyze the effects of agricultural practices on SOC. Additionally, combining remote sensing technology with climate data and adopting spatiotemporal dynamic models could help assess the long-term impacts of different agricultural management practices. This approach would enable more accurate predictions of SOC changes under various management scenarios, providing a more comprehensive and precise basis for regional SOC modeling.

4. Conclusions

In this study, we extracted mean Sentinel-2 band values and median spectral indexes to predict SOC, which eliminated the effects of water and vegetation. We applied RF models to predict SOC in sandy, saline, and black soil regions and to identify the main environmental factors (remote sensing, terrain, and climate factors). Climate factors were more critical than the other environmental covariates: they explained a large proportion of the variation in SOC in the three areas, while terrain factors had a minimal impact on the accuracy of SOC prediction. TVI was also an excellent spectral index for SOC prediction across the three regions. In addition to these common features, B6, B8, and GNDVI played an essential role in SOC prediction in AH; B4, B8, and RVI contributed greatly to explaining SOC variation in DM&LD; and SATVI contributed more in HL.

Author Contributions

Conceptualization, L.W., H.L. and S.Z.; methodology, L.W. and X.W.; software, L.W. and X.W.; validation, H.L., X.W., X.X., L.H., C.L., Y.L., X.Z., X.M. and D.Z.; formal analysis, L.W.; investigation, L.W.; resources, H.L.; data curation, L.W.; writing—original draft preparation, L.W.; writing—review and editing, L.W. and X.W.; visualization, X.W.; supervision, S.Z.; project administration, H.L.; funding acquisition, S.Z. and H.L. All authors have read and agreed to the published version of the manuscript.

Funding

The authors acknowledge the support of the project “National Key R&D Program of China” (2021YFD1500100); “Development and Reform Commission innovation capacity-building project of Jilin Province” (2021C044-10); and the Natural Science Foundation of Heilongjiang Province, China (No. LH2022C076).

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Wang, X.; Li, S.; Wang, L.; Zheng, M.; Wang, Z.; Song, K. Effects of cropland reclamation on soil organic carbon in China’s black soil region over the past 35 years. Glob. Change Biol. 2023, 29, 5460–5477.
  2. Zeraatpisheh, M.; Ayoubi, S.; Jafari, A.; Tajik, S.; Finke, P. Digital mapping of soil properties using multiple machine learning in a semi-arid region, central Iran. Geoderma 2019, 338, 445–452.
  3. Ma, X.; Ma, W.; Wang, C.; Xu, Y. Nitrogen and phosphorus supply controls stability of soil organic carbon in alpine meadow of the Qinghai-Tibetan Plateau. Agric. Ecosyst. Environ. 2025, 379, 109336.
  4. Niu, X.; Zhang, S.; Zhang, C.; Yan, P.; Wang, H.; Xu, W.; Song, M.; Aurangzeib, M. Key factors influencing the spatial distribution of soil organic carbon and its fractions in Mollisols. Catena 2024, 247, 108522.
  5. Wang, X.; Li, L.; Liu, H.; Song, K.; Wang, L.; Meng, X. Prediction of soil organic matter using VNIR spectral parameters extracted from shape characteristics. Soil Tillage Res. 2022, 216, 105241.
  6. Wang, X.; Zhang, Y.; Atkinson, P.M.; Yao, H. Predicting soil organic carbon content in Spain by combining Landsat TM and ALOS PALSAR images. Int. J. Appl. Earth Obs. Geoinf. 2020, 92, 102182.
  7. Castaldi, F.; Hueni, A.; Chabrillat, S.; Ward, K.; Buttafuoco, G.; Bomans, B.; Vreys, K.; Brell, M.; van Wesemael, B. Evaluating the capability of the Sentinel 2 data for soil organic carbon prediction in croplands. ISPRS J. Photogramm. Remote Sens. 2019, 147, 267–282.
  8. Gasmi, A.; Gomez, C.; Lagacherie, P.; Zouari, H.; Laamrani, A.; Chehbouni, A. Mean spectral reflectance from bare soil pixels along a Landsat-TM time series to increase both the prediction accuracy of soil clay content and mapping coverage. Geoderma 2021, 388, 114864.
  9. Fissore, C.; Dalzell, B.J.; Berhe, A.A.; Voegtle, M.A.; Evans, M.A.; Wu, A. Influence of topography on soil organic carbon dynamics in a Southern California grassland. Catena 2017, 149, 140–149.
  10. Post, W.M.; Emanuel, W.R.; Zinke, P.J.; Stangenberger, A.G. Soil Carbon Pools and World Life Zones. Nature 1982, 298, 156–159.
  11. Wang, C.; Gao, B.; Yang, K.; Wang, Y.; Sukhbaatar, C.; Yin, Y.; Feng, Q.; Yao, X.; Zhang, Z.; Yang, J. Inversion of soil organic carbon content based on the two-point machine learning method. Sci. Total Environ. 2024, 943, 173608.
  12. Sakhaee, A.; Scholten, T.; Taghizadeh-Mehrjardi, R.; Liess, M.; Don, A. Spatial Prediction of Organic Matter Quality in German Agricultural Topsoils. Agriculture 2024, 14, 1298.
  13. Chen, D.; Chang, N.; Xiao, J.; Zhou, Q.; Wu, W. Mapping dynamics of soil organic matter in croplands with MODIS data and machine learning algorithms. Sci. Total Environ. 2019, 669, 844–855.
  14. Wang, L.; Wang, X.; Wang, D.; Qi, B.; Zheng, S.; Liu, H.; Luo, C.; Li, H.; Meng, L.; Meng, X.; et al. Spatiotemporal Changes and Driving Factors of Cultivated Soil Organic Carbon in Northern China’s Typical Agro-Pastoral Ecotone in the Last 30 Years. Remote Sens. 2021, 13, 3607.
  15. Wang, X.; Zhang, X.; Li, H.; Zhang, X.; Liu, H.; Dou, X.; Yu, Z. The minimum level for soil allocation using topsoil reflectance spectra: Genus or species? Catena 2019, 174, 36–47.
  16. Nabiollahi, K.; Taghizadeh-Mehrjardi, R.; Shahabi, A.; Heung, B.; Amirian-Chakan, A.; Davari, M.; Scholten, T. Assessing agricultural salt-affected land using digital soil mapping and hybridized random forests. Geoderma 2021, 385, 114858.
  17. Wang, X.; Wang, L.; Li, S.; Wang, Z.; Zheng, M.; Song, K. Remote estimates of soil organic carbon using multi-temporal synthetic images and the probability hybrid model. Geoderma 2022, 425, 116066.
  18. Shi, Z.; Ji, W.; Rossel, R.A.V.; Chen, S.; Zhou, Y. Prediction of soil organic matter using a spatially constrained local partial least squares regression and the Chinese vis-NIR spectral library. Eur. J. Soil Sci. 2015, 66, 679–687.
  19. Ward, K.J.; Chabrillat, S.; Neumann, C.; Foerster, S. A remote sensing adapted approach for soil organic carbon prediction based on the spectrally clustered LUCAS soil database. Geoderma 2019, 353, 297–307.
  20. Duan, H.; Wang, T.; Xue, X.; Yan, C. Dynamic monitoring of aeolian desertification based on multiple indicators in Horqin Sandy Land, China. Sci. Total Environ. 2019, 650 Pt 2, 2374–2388.
  21. Zhao, Y.; Wang, S.; Li, Y.; Liu, J.; Zhuo, Y.; Chen, H.; Wang, J.; Xu, L.; Sun, Z. Extensive reclamation of saline-sodic soils with flue gas desulfurization gypsum on the Songnen Plain, Northeast China. Geoderma 2018, 321, 52–60.
  22. Jia-nan, W.; Hao-ming, F.; Yan-feng, J. Spatial variation characteristics and influencing factors of sediment connectivity in the black soil region of northeast China. Geoderma 2024, 446, 116895.
  23. Nelson, D.W.; Sommers, L.E. Total carbon, organic matter, and carbon. In Methods of Soil Analysis; Soil Science Society of America: Madison, WI, USA, 1982; pp. 539–577.
  24. Gholizadeh, A.; Žižala, D.; Saberioon, M.; Borůvka, L. Soil organic carbon and texture retrieving and mapping using proximal, airborne and Sentinel-2 spectral imaging. Remote Sens. Environ. 2018, 218, 89–103.
  25. Zhou, J.; Khot, L.R.; Bahlol, H.Y.; Boydston, R.; Miklas, P.N. Evaluation of ground, proximal and aerial remote sensing technologies for crop stress monitoring. IFAC PapersOnLine 2016, 49, 22–26.
  26. Liu, Q.; Gui, Z.; Xiong, S.; Zhan, M. A principal component analysis dominance mechanism based many-objective scheduling optimization. Appl. Soft Comput. 2021, 113, 107931.
  27. Accadia, C.; Mariani, S.; Casaioli, M.; Lavagnini, A.; Speranza, A. Sensitivity of precipitation forecast skill scores to bilinear interpolation and a simple nearest-neighbor average method on high-resolution verification grids. Weather Forecast. 2003, 18, 918–932.
  28. Abdelhafidi, N.; Bachari, N.E.I.; Abdelhafidi, Z. Estimation of solar radiation using stepwise multiple linear regression with principal component analysis in Algeria. Meteorol. Atmos. Phys. 2020, 133, 205–216.
  29. Spohn, M.; Bagchi, S.; Biederman, L.A.; Borer, E.T.; Bråthen, K.A.; Bugalho, M.N.; Caldeira, M.C.; Catford, J.A.; Collins, S.L.; Eisenhauer, N.; et al. The positive effect of plant diversity on soil carbon depends on climate. Nat. Commun. 2023, 14, 6624.
  30. Wang, Y.; Zhang, X.; Zhang, X.; Meng, Q.; Gao, F.; Zhang, Y. Characterization of spectral responses of dissolved organic matter (DOM) for atrazine binding during the sorption process onto black soil. Chemosphere 2017, 180, 531–539.
  31. Liu, H.Q.; Huete, A. A feedback based modification of the NDVI to minimize canopy background and atmospheric noise. IEEE Trans. Geosci. Remote Sens. 1995, 33, 457–465.
  32. Dunagan, S.C.; Gilmore, M.S.; Varekamp, J.C. Effects of mercury on visible/near-infrared reflectance spectra of mustard spinach plants (Brassica rapa P.). Environ. Pollut. 2007, 148, 301–311.
  33. Zhao, S.; Liu, J.; Banerjee, S.; Zhou, N.; Zhao, Z.; Zhang, K.; Hu, M.; Tian, C. Biogeographical distribution of bacterial communities in saline agricultural soil. Geoderma 2020, 361, 114095.
  34. Zhao, L.; Hong, H.; Liu, J.; Fang, Q.; Yao, Y.; Tan, W.; Yin, K.; Wang, C.; Chen, M.; Algeo, T.J. Assessing the utility of visible-to-shortwave infrared reflectance spectroscopy for analysis of soil weathering intensity and paleoclimate reconstruction. Palaeogeogr. Palaeoclimatol. Palaeoecol. 2018, 512, 80–94.
  35. Dong, Z.; Wang, N.; Xie, J.; Ke, X. Coupled Vis-NIR spectroscopy with chemometrics strategy for soil organic carbon prediction in the Agro-pastoral Transitional zone of northwest China. Spectrochim. Acta Part A Mol. Biomol. Spectrosc. 2024, 318, 124496.
  36. Padilha, M.C.d.C.; Vicente, L.E.; Demattê, J.A.M.; dos Santos Wendriner Loebmann, D.G.; Vicente, A.K.; Salazar, D.F.U.; Guimarães, C.C.B. Using Landsat and soil clay content to map soil organic carbon of oxisols and Ultisols near São Paulo, Brazil. Geoderma Reg. 2020, 21, e00253.
  37. Chen, Y.; Day, S.D.; Wick, A.F.; McGuire, K.J. Influence of urban land development and subsequent soil rehabilitation on soil aggregates, carbon, and hydraulic conductivity. Sci. Total Environ. 2014, 494–495, 329–336.
  38. Ge, X.Y.; Ding, J.L.; Jin, X.L.; Wang, J.Z.; Chen, X.Y.; Li, X.H.; Liu, J.; Xie, B.Q. Estimating Agricultural Soil Moisture Content through UAV-Based Hyperspectral Images in the Arid Region. Remote Sens. 2021, 13, 1562.
  39. Ge, X.Y.; Ding, J.L.; Teng, D.X.; Xie, B.Q.; Zhang, X.L.; Wang, J.J.; Han, L.J.; Bao, Q.L.; Wang, J.Z. Exploring the capability of Gaofen-5 hyperspectral data for assessing soil salinity risks. Int. J. Appl. Earth Obs. Geoinf. 2022, 112, 102969.
  40. Almeida, Â.; Calisto, V.; Esteves, V.I.; Schneider, R.J.; Soares, A.M.V.M.; Freitas, R. Salinity-dependent impacts on the effects of antiepileptic and antihistaminic drugs in Ruditapes philippinarum. Sci. Total Environ. 2022, 806, 150369.
  41. Zare, S.; Shamsi, S.R.F.; Abtahi, S.A. Weakly-coupled geo-statistical mapping of soil salinity to Stepwise Multiple Linear Regression of MODIS spectral image products. J. Afr. Earth Sci. 2019, 152, 101–114.
Figure 1. Locations of the study area: (a) location of the study area; (b) map of sandy region and sampling sites in AH; (c) image of sandy soil in AH; (d) map of saline soil and sampling sites in DM&LD; (e) image of soil salinization in DM&LD; (f) map of black soil region and sampling sites in HL; (g) image of black soil in HL.
Figure 2. Original reflectance (OR, at the top of each figure) and continuum removal (CR, at the bottom of each figure) with different SOC contents. (a) AH, sandy region; (b) DM&LD, saline region; (c) HL, black soil region.
Figure 3. The weighting values of bands, spectral indexes, and terrain factors ((a): AH; (b): DM&LD; (c): HL; (d): all regions). The yellow points mark the two selected variables with the highest values in each part of the study areas. Abbreviations: Bn, the nth band of Sentinel-2; GNDVI, Green Normalized Difference Vegetation Index; EVI, Enhanced Vegetation Index; SATVI, Soil Adjusted Total Vegetation Index; TVI, Transformed Vegetation Index; RVI, Ratio Vegetation Index; GRVI, Green Ratio Vegetation Index; LSWI, Land Surface Water Index; MSI, Moisture Stress Index; SAVI, Soil Adjusted Vegetation Index; NDVI, Normalized Difference Vegetation Index; S, slope; A, aspect; PlC, plan curvature; PrC, profile curvature; TWI, topographic wetness index; Rn, roughness; RL, relief; SL, slope length; HS, hillshade.
Figure 4. Spatial map of precipitation and temperature in the three different regions.
Figure 5. Training results for each region and all regions based on the RF model: (a) sandy soil area in AH; (b) saline soil area in DM&LD; (c) black soil region in HL; (d) all regions.
Figure 6. Validation results for each region and all regions based on the RF model: (a) sandy soil area in AH; (b) saline soil area in DM&LD; (c) black soil region in HL; (d) all regions.
Figure 7. Training and validation results using the local regression method based on RF.
Figure 8. Spatial distribution of predicted SOC content in cultivated land from the RF model: (a) SOC of AH; (b) SOC of DM&LD; (c) SOC of HL.
Table 1. Explanatory variables used for soil organic carbon prediction.

| Classes | Predictors | Origin or Formula | Resolution (m) | Reference |
|---|---|---|---|---|
| Bands_mean | B2_mean | Blue (0.490 µm) | 10 | [24] |
| | B3_mean | Green (0.560 µm) | 10 | |
| | B4_mean | Red (0.665 µm) | 10 | |
| | B5_mean | Vegetation Red Edge (0.705 µm) | 10 | |
| | B6_mean | Vegetation Red Edge (0.740 µm) | 10 | |
| | B7_mean | Vegetation Red Edge (0.783 µm) | 10 | |
| | B8_mean | NIR (0.842 µm) | 10 | |
| | B8A_mean | Narrow NIR (0.865 µm) | 10 | |
| | B11_mean | SWIR (1.610 µm) | 10 | |
| | B12_mean | SWIR (2.190 µm) | 10 | |
| Spectral indices_mean | GNDVI_mean | $\frac{\mathrm{B8\_mean} - \mathrm{B3\_mean}}{\mathrm{B8\_mean} + \mathrm{B3\_mean}}$ | 10 | [25] |
| | EVI_mean | $2.5 \times \frac{\mathrm{B8\_mean} - \mathrm{B4\_mean}}{\mathrm{B8\_mean} + 6 \times \mathrm{B4\_mean} - 7.52 \times \mathrm{B2\_mean} + 1}$ | 10 | |
| | SATVI_mean | $\frac{\mathrm{B11\_mean} - \mathrm{B4\_mean}}{\mathrm{B11\_mean} + \mathrm{B4\_mean} + 1} \times \mathrm{B12\_mean}$ | 10 | |
| | TVI_mean | $\left(\frac{\mathrm{B8\_mean} - \mathrm{B4\_mean}}{\mathrm{B8\_mean} + \mathrm{B4\_mean}} + 0.5\right)^{1/2} \times 100$ | 10 | |
| | RVI_mean | $\frac{\mathrm{B8\_mean}}{\mathrm{B4\_mean}}$ | 10 | |
| | GRVI_mean | $\frac{\mathrm{B3\_mean} - \mathrm{B4\_mean}}{\mathrm{B3\_mean} + \mathrm{B4\_mean}}$ | 10 | |
| | LSWI_mean | $\frac{\mathrm{B8\_mean} - \mathrm{B11\_mean}}{\mathrm{B8\_mean} + \mathrm{B11\_mean}}$ | 10 | |
| | MSI_mean | $\frac{\mathrm{B11\_mean}}{\mathrm{B8\_mean}}$ | 10 | |
| | SAVI_mean | $\frac{(\mathrm{B8\_mean} - \mathrm{B4\_mean}) \times 1.5}{\mathrm{B8\_mean} + \mathrm{B4\_mean} + 0.5}$ | 10 | |
| | NDVI_mean | $\frac{\mathrm{B8\_mean} - \mathrm{B4\_mean}}{\mathrm{B8\_mean} + \mathrm{B4\_mean}}$ | 10 | |
| Terrain factors | DEM | https://search.asf.alaska.edu/#/ (accessed on 25 March 2020) | 30 | [13] |
| | S | Calculated by DEM | 30 | |
| | A | Calculated by DEM | 30 | |
| | PlC | Calculated by DEM | 30 | [2] |
| | PrC | Calculated by DEM | 30 | |
| | TWI | Calculated by DEM | 30 | [13] |
| | Rn | Calculated by DEM | 30 | |
| | RL | Calculated by DEM | 30 | |
| | SL | Calculated by DEM | 30 | |
| | HS | Calculated by DEM | 30 | |
| Climate factors | MAT | Derived from http://data.cma.cn/ (accessed on 18 June 2022) | 1000 | |
| | MAP | Derived from http://data.cma.cn/ (accessed on 18 June 2022) | 1000 | |
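As a minimal sketch of how the index formulas in Table 1 can be evaluated, the Python function below computes a subset of the spectral indices from mean band reflectances. The function name, its argument names, and the sample reflectance values are hypothetical and are not taken from the study. Only indices whose definitions are unambiguous are included (NDVI, GNDVI, RVI, GRVI, LSWI, MSI, and TVI).

```python
import math


def spectral_indices(b3, b4, b8, b11):
    """Compute selected Table 1 indices from mean band reflectances.

    b3 = green, b4 = red, b8 = NIR, b11 = SWIR (Sentinel-2 band numbering).
    All inputs are assumed to be positive reflectance values.
    """
    ndvi = (b8 - b4) / (b8 + b4)
    return {
        "NDVI": ndvi,
        "GNDVI": (b8 - b3) / (b8 + b3),
        "RVI": b8 / b4,
        "GRVI": (b3 - b4) / (b3 + b4),
        "LSWI": (b8 - b11) / (b8 + b11),
        "MSI": b11 / b8,
        # TVI as written in Table 1; requires NDVI >= -0.5 for a real root.
        "TVI": math.sqrt(ndvi + 0.5) * 100,
    }


# Hypothetical reflectances, chosen only to exercise the function.
indices = spectral_indices(b3=0.10, b4=0.12, b8=0.30, b11=0.25)
for name, value in indices.items():
    print(f"{name}: {value:.4f}")
```

In practice the same expressions would be applied per pixel to the band composites rather than to single values; the scalar form is used here only to keep the sketch short.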
Table 2. Descriptive statistics of SOC in AH, DM&LD, and HL.

| Set | N | Max (g·kg−1) | Min (g·kg−1) | Mean (g·kg−1) | SD (g·kg−1) | Skewness | Kurtosis | CV (%) |
|---|---|---|---|---|---|---|---|---|
| AH | 102 | 12.16 | 2.00 | 5.90 | 2.11 | 0.33 | −0.13 | 35.82 |
| DM&LD | 96 | 38.89 | 3.72 | 17.87 | 6.43 | 0.48 | 0.33 | 35.99 |
| HL | 117 | 64.07 | 10.42 | 30.32 | 11.01 | 0.75 | 0.87 | 36.31 |
| All | 315 | 64.07 | 2.00 | 18.63 | 12.84 | 0.91 | 0.59 | 68.89 |

Note: SD is the standard deviation (the larger the value, the more dispersed the dataset); CV is the coefficient of variation.
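The coefficient of variation in Table 2 can be checked against the reported means and standard deviations. The sketch below assumes the usual definition CV = SD / mean × 100; the function and variable names are illustrative only. Because it starts from the rounded values printed in the table, small differences from the reported CVs are expected.

```python
def coefficient_of_variation(mean, sd):
    """Return the coefficient of variation in percent (SD / mean * 100)."""
    return sd / mean * 100.0


# (mean, SD, reported CV) in g/kg, g/kg, and %, copied from Table 2.
table2 = {
    "AH": (5.90, 2.11, 35.82),
    "DM&LD": (17.87, 6.43, 35.99),
    "HL": (30.32, 11.01, 36.31),
    "All": (18.63, 12.84, 68.89),
}

computed_cv = {}
for region, (mean, sd, reported) in table2.items():
    computed_cv[region] = coefficient_of_variation(mean, sd)
    print(f"{region}: computed {computed_cv[region]:.2f}%, reported {reported:.2f}%")
```

The pooled dataset has a CV roughly twice that of any single region, which reflects the large differences in mean SOC among the three soil types rather than greater variability within any one of them.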
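Table 3. Model accuracy with inputs of different variables or variable combinations.

| Variables | AH R² | AH RMSE (g/kg) | DM&LD R² | DM&LD RMSE (g/kg) | HL R² | HL RMSE (g/kg) | ALL R² | ALL RMSE (g/kg) |
|---|---|---|---|---|---|---|---|---|
| B+T+S+C | 0.62 | 3.83 | 0.57 | 4.77 | 0.69 | 6.32 | 0.76 | 5.11 |
| B | 0.15 | 5.60 | 0.35 | 5.39 | 0.21 | 10.69 | 0.42 | 8.41 |
| B+T | 0.28 | 5.13 | 0.41 | 5.21 | 0.31 | 9.56 | 0.50 | 7.43 |
| B+S | 0.34 | 5.01 | 0.40 | 5.27 | 0.22 | 10.27 | 0.48 | 7.52 |
| B+C | 0.65 | 3.82 | 0.57 | 4.77 | 0.66 | 6.43 | 0.73 | 5.32 |
| T | 0.08 | 6.03 | 0.01 | 6.73 | 0.12 | 10.67 | 0.02 | 10.70 |
| T+S | 0.27 | 5.27 | 0.01 | 6.66 | 0.14 | 10.23 | 0.16 | 9.42 |
| T+C | 0.41 | 4.64 | 0.43 | 5.02 | 0.72 | 5.90 | 0.73 | 5.33 |
| S | 0.12 | 5.69 | 0.01 | 7.16 | 0.03 | 11.39 | 0.13 | 9.77 |
| S+C | 0.60 | 4.04 | 0.42 | 5.13 | 0.73 | 5.67 | 0.76 | 5.04 |
| C | 0.63 | 3.81 | 0.43 | 5.04 | 0.71 | 5.90 | 0.77 | 4.93 |
| B+T+C | 0.57 | 4.04 | 0.57 | 4.68 | 0.70 | 6.17 | 0.76 | 5.03 |
| T+S+C | 0.56 | 4.05 | 0.43 | 5.04 | 0.73 | 5.79 | 0.75 | 5.16 |
| B+S+C | 0.70 | 3.54 | 0.64 | 4.63 | 0.68 | 6.34 | 0.76 | 5.11 |
| B+T+S | 0.41 | 4.75 | 0.45 | 5.12 | 0.29 | 9.64 | 0.50 | 7.47 |

Note: B, bands; T, terrain; S, spectral indexes; C, climate.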
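One way to read Table 3 is to compare each variable set with and without climate factors. The short sketch below, whose names are illustrative and not part of the study, takes the R² values reported in Table 3 and computes the change in R² when climate (C) is added to bands, terrain, spectral indexes, and their combination.

```python
# R2 values copied from Table 3; tuple order follows REGIONS.
REGIONS = ("AH", "DM&LD", "HL", "ALL")
R2 = {
    "B": (0.15, 0.35, 0.21, 0.42),
    "B+C": (0.65, 0.57, 0.66, 0.73),
    "T": (0.08, 0.01, 0.12, 0.02),
    "T+C": (0.41, 0.43, 0.72, 0.73),
    "S": (0.12, 0.01, 0.03, 0.13),
    "S+C": (0.60, 0.42, 0.73, 0.76),
    "B+T+S": (0.41, 0.45, 0.29, 0.50),
    "B+T+S+C": (0.62, 0.57, 0.69, 0.76),
}


def r2_gain_from_climate(base):
    """Return the per-region change in R2 when climate (C) is added to `base`."""
    with_climate = R2[base + "+C"]
    without_climate = R2[base]
    return {
        region: round(w - wo, 2)
        for region, w, wo in zip(REGIONS, with_climate, without_climate)
    }


gains = {base: r2_gain_from_climate(base) for base in ("B", "T", "S", "B+T+S")}
for base, by_region in gains.items():
    print(f"{base} -> {base}+C: {by_region}")
```

For these four comparisons, adding climate raises R² in every region, by between 0.12 (B+T+S in DM&LD) and 0.71 (T in all regions), which is consistent with the finding that climate factors contribute most to prediction accuracy.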
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Wang, L.; Liu, H.; Wang, X.; Xu, X.; He, L.; Luo, C.; Li, Y.; Zhang, X.; Zang, D.; Zheng, S.; et al. Identifying Optimal Variables to Predict Soil Organic Carbon in Sandy, Saline, and Black Soil Regions: Remote Sensing, Terrain, or Climate Factors? Remote Sens. 2025, 17, 237. https://doi.org/10.3390/rs17020237
