Open Access Article

Optimal Mapping of Soil Erodibility in a Plateau Lake Watershed: Empirical Models Empowered by Machine Learning

Jiaxue Wang ¹,², Yujiao Wei ¹,², Zheng Sun ¹,², Shixiang Gu ³, Shihan Bai ³, Jinming Chen ³, Jing Chen ³, Yongsheng Hong ⁴,⁵ and Yiyun Chen ¹,²,*

¹ School of Resource and Environmental Science, Wuhan University, Wuhan 430079, China

² Soil Survey and Monitoring Lab of the Wuhan University, Wuhan 430079, China

³ Yunnan Institute of Water and Hydropower Engineering Investigation, Design and Research, Kunming 650021, China

⁴ State Key Laboratory of Soil and Sustainable Agriculture, Institute of Soil Science, Chinese Academy of Sciences, Nanjing 210008, China

⁵ University of Chinese Academy of Sciences, Beijing 100049, China

* Author to whom correspondence should be addressed.

Remote Sens. 2024, 16(16), 3017; https://doi.org/10.3390/rs16163017

Submission received: 8 June 2024 / Revised: 6 August 2024 / Accepted: 15 August 2024 / Published: 17 August 2024

(This article belongs to the Special Issue Remote Sensing of Soil Condition Assessment and Degradation Drivers Monitoring)


Abstract

Soil erodibility (K) refers to the inherent ability of soil to withstand erosion. Accurate estimation and spatial prediction of K values are vital for assessing soil erosion and managing land resources. However, most K-value estimation models are empirical and therefore suffer from significant extrapolation uncertainty, and traditional spatial prediction studies, which focus on individual empirical K values, have neglected the differences in spatial pattern between empirical models. This work proposed a universal framework for selecting an optimal soil-erodibility map using empirical models enhanced by machine learning. Specifically, three empirical models, namely the erosion-productivity impact calculator model (K_EPIC), the Shirazi model (K_Shirazi), and the Torri model (K_Torri), were used to estimate K values. Random Forest (RF) and Gradient-Boosting Decision Tree (GBDT) algorithms were employed to develop prediction models, which led to the creation of three K-value maps. The spatial distribution of K values and the associated environmental covariates were also investigated across the empirical models. Results showed that RF achieved the highest accuracy, with the R² of K_EPIC, K_Shirazi, and K_Torri increasing by 46%, 34%, and 22%, respectively, compared to GBDT. The environmental variables that shape the spatial patterns differed among the empirical models: K_EPIC and K_Shirazi are influenced by soil porosity and soil moisture, whereas K_Torri is more sensitive to soil moisture conditions and terrain location. More importantly, our study highlighted disparities in the spatial patterns across the three K-value maps. Considering the data distribution, spatial distribution, and measured K values, the K_Torri model outperformed the others in estimating soil erodibility in the plateau lake watershed. This study proposed a framework that aimed to create optimal soil-erodibility maps and offered a scientific and accurate K-value estimation method for the assessment of soil erosion.

Keywords:

soil erodibility; environmental covariates; K models; soil erosion

1. Introduction

The soil erosion process stands as a central mechanism in the shaping of geological features, potentially causing land degradation, jeopardizing food security, and causing agricultural nonpoint source pollution [1,2,3]. Soil erodibility (K) refers to the inherent ability of soil to withstand erosion [4], and serves as a crucial quantitative parameter for assessing soil sensitivity to external erosive forces [5,6]. Accurate estimation and spatial prediction of K values are essential for identifying soil erosion patterns, assessing soil erosion severity, and formulating soil- and water-conservation plans [7,8,9].

The methods that estimate K values can be broadly classified into two categories. The first category is the experimental determination method, which calculates soil erodibility values based on the soil loss equation combined with the measured field-scale sediment and runoff values [10,11]. The method requires extensive long-term, empirically validated data, and is limited to a few sites or small areas [12]. The second category is the empirical model, including the nomogram model, the EPIC (erosion–productivity impact calculator) model [13], the Shirazi model [14], and the Torri model [15]. Ravi Raj et al. [16] estimated K values in India using the nomogram model, but this approach relies on a considerable amount of empirically derived data on infiltration coefficients, limiting its widespread applicability. The EPIC model estimates K values based on soil particle size and soil organic carbon (SOC), while the Shirazi model and the Torri model are K-value estimation models based on the geometric mean diameter of the soil particles. Borrelli et al. [17] estimated K values using the EPIC model to estimate global soil erosion, while Zhao et al. [18] applied the Shirazi and Torri models to estimate soil erodibility on the plateau. In order to identify the most suitable model for the study area, researchers have also compared the accuracy of different empirical models through measured site data [18,19]. These studies assume a fixed K value within the specific regions. However, this assumption largely disregards the actual spatial heterogeneity of soil erodibility.

The spatial pattern of the K value, a physical property of soil, is influenced by various factors such as climate, precipitation, soil properties, and human activities. Therefore, recognizing the spatial patterns of the K value can assist in identifying models that are better suited to specific regions or soil types. Machine learning (ML) techniques have emerged as a powerful tool for predicting the spatial distribution of soil properties by modeling their relationship with environmental covariates [20,21,22]. The highly sophisticated learning capabilities of ML enable it to outperform traditional spatial prediction techniques by capturing the nonlinear relationship of soil properties relative to environmental covariates [23,24]. Panos et al. [25] and Sun et al. [26] employed machine learning models to generate high-precision soil erodibility maps in Europe and China, respectively. Yu et al. [27] employed the random forest method to generate a soil erodibility-factor map in southeastern Tibet. However, these studies have focused exclusively on modeling the estimated values from a single empirical model. The differences in the spatial patterns of various K-value empirical models have not been explored in depth.

To this end, here we propose a framework for creating optimal soil-erodibility maps using empirical models enhanced by machine learning. The empirical models were used to estimate K values, and machine learning algorithms were employed to develop prediction models, which led to the creation of three K-value maps. Specifically, the study objectives include the following: (1) to compare the sample point-distribution characteristics and spatial distribution of K values estimated by different empirical models, (2) to analyze the differences in the main driving factors of K values estimated by different empirical models, and (3) to select the most suitable K-value empirical model for the plateau lake watershed. The Qilu Lake watershed, a typical plateau lake watershed facing persistent issues of soil erosion and lake pollution, was selected for this study. Lakes serve as accumulation points for soil erosion products within a watershed [28,29]. Soil erosion not only degrades the soil, but also deteriorates the quality of the lake’s water, potentially leading to lake pollution [30,31,32].

2. Materials and Methods

2.1. Study Area and Soil Samples

The Qilu Lake watershed covers a drainage area of 354 km² (24°5′–24°14′N, 102°33′–102°52′E). Qilu Lake is one of the nine plateau lakes in Yunnan with a multi-year average water production of 119 million m³. The region experiences a subtropical semi-humid plateau monsoon climate, with an average annual sunshine duration of 2274 h and a multi-year average precipitation of 899 mm, mainly concentrated from June to August. The main soil types are red soil and paddy soil, with weathered sandy shale as the primary soil-forming parent material [33]. The catchment contains a large number of highly dispersed agricultural areas. The unique geographical location and fragmented farmland have intensified the spatial variability of soil erosion and soil erodibility in the study area [34,35].

A total of 216 soil samples were collected from the 0–20 cm surface layer of cultivated (irrigated) land from 16 to 20 November 2018 (Figure 1). The cultivated land area in the watershed spans 93 km², with a total sown area of crops reaching 266 km² (including multiple cropping). The main crops grown here are cauliflower, cabbage, flue-cured tobacco, cantaloupe, fruit, and flowers [36]. Sampling points were randomly distributed, and the density of the layout was related to the distribution characteristics of the land parcels. The GPS location and surrounding information of each sampling point were recorded at the time of sampling. Before analysis, the soil samples were air-dried in a place protected from direct sunlight, then ground and sifted through a 10-mesh nylon screen (2 mm). Soil moisture content, soil porosity, bulk density, and SOC were determined by laboratory experiments. The volume percentage of each soil particle fraction was determined using a Mastersizer 3000.

2.2. Estimation of K Values

The K values were estimated using three common models, namely the erosion-productivity impact calculator model (K_EPIC) developed by Sharpley and Williams [13], the Shirazi model (K_Shirazi) developed by Shirazi and Boersma [14], and the Torri model (K_Torri) developed by Torri [15].

The formula of the EPIC model is

K_{EPIC} = 0.1317 \left\{ 0.2 + 0.3 \exp\left[ -0.0256 S_{1} \left( 1 - \frac{S_{2}}{100} \right) \right] \right\} \times \left( \frac{S_{2}}{S_{3} + S_{2}} \right)^{0.3} \times \left[ 1 - \frac{0.25 C}{C + \exp(3.72 - 2.95 C)} \right] \times \left[ 1 - \frac{0.7 n}{n + \exp(-5.51 + 22.98 S_{3})} \right]

(1)

The formula of the Shirazi model is

K_{Shirazi} = 0.1317 \times 7.594 \left\{ 0.0017 + 0.0494 \exp\left[ -\frac{1}{2} \left( \frac{\log(D_{g\_Shirazi}) + 1.675}{0.6986} \right)^{2} \right] \right\}

D_{g\_Shirazi} = \exp\left( 0.01 \sum f_{i} \ln m_{i} \right)

(2)

The formula of the Torri model is

K_{Torri} = 0.0293 \left( 0.65 - D_{g\_Torri} + 0.24 D_{g\_Torri}^{2} \right) \exp\left\{ -0.0021 \frac{C}{S_{3}} - 0.00037 \left( \frac{C}{S_{3}} \right)^{2} - 4.02 S_{3} + 1.72 S_{3}^{2} \right\}

D_{g\_Torri} = \sum f_{i} \lg \sqrt{d_{i} d_{i-1}}

(3)

where S_{1} is the sand content (%), S_{2} is the silt content (%), S_{3} is the clay content (%), C is the SOC content (%), and n = 1 − S_{1}/100; D_{g\_Shirazi} and D_{g\_Torri} are the geometric mean particle sizes; f_{i} represents the content of the i-th soil particle size class (%); m_{i} is the arithmetic mean of the i-th soil particle size class (mm); and d_{i} and d_{i-1} are the maximum and minimum values of the i-th soil particle size class in the soil mechanical composition (mm). The calculation formulae of D_{g\_Shirazi} and D_{g\_Torri} are obtained based on S_{1}, S_{2}, and S_{3}:

D_{g\_Shirazi} = \exp\left\{ 0.01 \left[ S_{1} \ln\left( \frac{2 + 0.05}{2} \right) + S_{2} \ln\left( \frac{0.05 + 0.002}{2} \right) + S_{3} \ln\left( \frac{0.002}{2} \right) \right] \right\}

(4)

D_{g\_Torri} = S_{1} \times \lg\left( \sqrt{2 \times 0.05} \right) + S_{2} \times \lg\left( \sqrt{0.05 \times 0.002} \right) + S_{3} \times \lg\left( \sqrt{0.002 \times 0.00005} \right)

(5)
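As an illustration of how Equations (2) and (4) combine, the following minimal Python sketch computes the Shirazi geometric mean diameter and the resulting K value from sand, silt, and clay percentages. The function names are hypothetical, and the sketch assumes that log in Equation (2) denotes the base-10 logarithm; it is not the authors' code.

```python
import math

# Arithmetic mean diameters (mm) of the sand, silt, and clay classes, as in Eq. (4).
M_SAND = (2 + 0.05) / 2
M_SILT = (0.05 + 0.002) / 2
M_CLAY = 0.002 / 2


def dg_shirazi(sand, silt, clay):
    """Geometric mean particle diameter (mm) from Eq. (4); inputs in percent."""
    return math.exp(0.01 * (sand * math.log(M_SAND)
                            + silt * math.log(M_SILT)
                            + clay * math.log(M_CLAY)))


def k_shirazi(sand, silt, clay):
    """K value from Eq. (2). Assumption: log is the base-10 logarithm."""
    dg = dg_shirazi(sand, silt, clay)
    z = (math.log10(dg) + 1.675) / 0.6986
    return 0.1317 * 7.594 * (0.0017 + 0.0494 * math.exp(-0.5 * z ** 2))


# Largest value Eq. (2) can return, reached when log10(Dg) = -1.675.
K_SHIRAZI_MAX = 0.1317 * 7.594 * (0.0017 + 0.0494)

if __name__ == "__main__":
    # Mean texture of the samples (sand, silt, clay in %), taken from Table 2.
    print(dg_shirazi(19.95, 59.79, 20.25))
    print(k_shirazi(19.95, 59.79, 20.25), K_SHIRAZI_MAX)
```

Because the exponential term in Equation (2) cannot exceed 1, the formula has a fixed upper bound, which the constant `K_SHIRAZI_MAX` makes explicit.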

2.3. Environmental Covariates

The K value is associated with SOC and soil texture. We selected a variety of environmental covariates related to soil-particle formation and polishing and SOC formation and accumulation [37]. These environmental covariates are described in Table 1. Topography significantly shapes the flow patterns and velocities of surface water, thereby directly influencing soil erodibility. Concurrently, soil properties determine the capacity to retain and transport water, which in turn affects soil stability and erodibility. The location of an area dictates the distribution and movement of surface water and also signifies the extent of human activities’ impact on soil erosion. Surface moisture conditions have a direct bearing on soil cohesion and erosion resistance. Moreover, landscape metrics are an essential tool for assessing the long-term effects of agricultural practices and human interventions on soil [38,39,40].

The topography data were derived from the Shuttle Radar Topography Mission (SRTM) at 30 m resolution, including elevation, relief degree of land surface (RDLS), slope, aspect, topographic wetness index (TWI), and topographic position index (TPI). Total soil porosity (SP) and soil bulk density (SBD) are based on laboratory soil-property data, and spatial interpolation was used to obtain 30 m resolution raster layers. Location conditions are characterized by the distance to major geographical features, including distance from lakes (Dis_lake), distance from rivers (Dis_river), distance from construction land (Dis_con), and distance from roads (Dis_road). In addition to soil moisture content (SMC) measured in the laboratory, vegetation conditions and surface moisture conditions include 9 remote-sensing indices calculated from the preprocessed Landsat-8 image acquired on 28 November 2018 [41,42]. The study area has a rich agricultural heritage, characterized by a distinct and deliberate mosaic of fragmented farmland [34]. In this context, we selected seven well-established landscape-pattern indices to comprehensively delineate the landscape characteristics of the study area.

2.4. Modeling and Mapping

2.4.1. Random Forests (RFs)

RF is an ensemble learning algorithm based on decision trees, which has been widely used in digital soil mapping research [43,44]. The RF models for the three K values were built with the scikit-learn library in Python 3.7. Hyperparameters were tuned by traversing candidate values, and the main parameters of the final RF models for the three K values (K_EPIC, K_Shirazi, K_Torri) were set as follows: the number of decision trees is 11, 6, and 14, respectively; the maximum depth of the decision trees (max_depth) is 10, 14, and 12, respectively; and all other parameters are left at their defaults.

2.4.2. Gradient-Boosting Decision Tree (GBDT)

GBDT effectively combines decision trees with ensemble ideas [45,46]. Each decision tree is trained on the residual of the previous trees, and by continuously optimizing the loss function, the model gradually approaches the optimal solution. Hyperparameters were tuned by traversing candidate values, and the main parameters of the final GBDT models for the three K values (K_EPIC, K_Shirazi, K_Torri) were set as follows: the number of decision trees is 11, 24, and 40, respectively; the maximum depth of the decision trees (max_depth) is 5, 8, and 4, respectively; and all other parameters are left at their defaults.
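The reported settings can be expressed as scikit-learn regressors roughly as follows. This is a sketch under stated assumptions rather than the authors' code: the training data are synthetic stand-ins, the helper name `build_models` is hypothetical, and `random_state` is added here only for reproducibility.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor

# Hyperparameters reported in Sections 2.4.1 and 2.4.2; other settings stay at defaults.
RF_PARAMS = {
    "K_EPIC": {"n_estimators": 11, "max_depth": 10},
    "K_Shirazi": {"n_estimators": 6, "max_depth": 14},
    "K_Torri": {"n_estimators": 14, "max_depth": 12},
}
GBDT_PARAMS = {
    "K_EPIC": {"n_estimators": 11, "max_depth": 5},
    "K_Shirazi": {"n_estimators": 24, "max_depth": 8},
    "K_Torri": {"n_estimators": 40, "max_depth": 4},
}


def build_models(target, random_state=0):
    """Return (RF, GBDT) regressors configured for one K target.

    random_state is an assumption added for reproducibility.
    """
    rf = RandomForestRegressor(random_state=random_state, **RF_PARAMS[target])
    gbdt = GradientBoostingRegressor(random_state=random_state, **GBDT_PARAMS[target])
    return rf, gbdt


# Synthetic stand-in data: 216 samples and 5 hypothetical covariates.
rng = np.random.default_rng(0)
X = rng.normal(size=(216, 5))
y = 0.03 + 0.005 * X[:, 0] + rng.normal(scale=0.002, size=216)

rf, gbdt = build_models("K_Torri")
rf.fit(X, y)
gbdt.fit(X, y)
pred_rf = rf.predict(X[:3])
pred_gbdt = gbdt.predict(X[:3])
```

Keeping the per-target settings in dictionaries makes it straightforward to loop over the three K values with the same training code.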

2.4.3. Accuracy Evaluation

The dataset was divided into a modeling set (80%) and an independent validation set (20%). The modeling set was used to train the models, and the validation set was used to evaluate their predictive capability. Predictive performance was measured using three metrics: the mean absolute error (MAE), the root mean squared error (RMSE), and the coefficient of determination (R²) [47].
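The split and the three metrics can be sketched in plain Python as below. The random shuffling and the seed are assumptions, since the text does not state how samples were assigned to the two sets, and the function names are illustrative.

```python
import math
import random


def split_80_20(n_samples, seed=0):
    """Shuffle sample indices and return (modeling, validation) index lists.

    Assumption: a simple random split; the seed is arbitrary.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = int(round(0.8 * n_samples))
    return indices[:cut], indices[cut:]


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

With 216 samples, this rounding rule places 173 samples in the modeling set and 43 in the validation set.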

3. Results

3.1. Descriptive Statistics

Table 2 illustrates the descriptive statistics of soil properties related to soil erodibility. The results show that the SOC content ranges from 2.8 g/kg to 115 g/kg, with an average of 34.97 g/kg and a coefficient of variation of 0.59, indicating medium variation. The sand, clay and silt contents range from 2.42% to 83.47%, 3.81% to 46.05% and 12.72% to 82.12%, respectively. The mean values are 19.95%, 20.25% and 59.79% and the coefficients of variation are 0.76, 0.39 and 0.21, indicating strong, medium, and weak variation, respectively. The relationship between soil texture classification and SOC content was plotted in the triangular coordinate diagram of the American Soil Texture Classification (Figure 2). The soil texture is generally silty, covering a wide range of types except sand and clay. The most common soil types are silt loam, loam and silty clay loam. The SOC of loam is relatively low, mostly concentrated around 30 g/kg, while that of silty clay loam is relatively high, concentrated at around 50 g/kg.

3.2. Characteristics of the Distribution of the K Value

Figure 3 shows box plots of K_EPIC, K_Shirazi and K_Torri estimated at the sample scale. K_Torri has the largest range of variation but the lowest mean value, 0.0318 t ha h ha⁻¹ MJ⁻¹ mm⁻¹. K_EPIC has the smallest range of variation, with a mean value of 0.0433 t ha h ha⁻¹ MJ⁻¹ mm⁻¹. K_Shirazi has a wider range of variation than K_EPIC and a higher mean value of 0.0465 t ha h ha⁻¹ MJ⁻¹ mm⁻¹. Because of the form of the Shirazi formula, K_Shirazi can hardly exceed 0.05, and most values are concentrated around this value. This is because the silt contents in the dataset lie predominantly in the relatively narrow range between 50% and 80%, and the formula is particularly sensitive to silt content. Figure 4 shows the correlation between the three estimated K values and SOC and soil particle size. K_EPIC is more sensitive to silt and sand, showing significant positive and negative correlations, respectively, which is consistent with K_Shirazi (Table S1). K_Torri, in contrast, responds more strongly to fluctuations in SOC. These correlations show how each equation responds to its inputs and therefore help in judging its precision and applicability to the soils of the study area.

3.3. Prediction Model Performance

Table 3 presents a comparative analysis of two prediction models, RF and GBDT, for estimating three distinct soil-erodibility K values: K_EPIC, K_Shirazi, and K_Torri. The results indicate that K_Torri exhibits the highest accuracy for both RF and GBDT. Regarding K_EPIC, the RF model achieved a higher level of accuracy, with an R² of 0.45 and RMSE and MAE of 0.0099 and 0.0079 t ha h ha⁻¹ MJ⁻¹ mm⁻¹, respectively. This indicates a better fit to the data than the GBDT model, which had an R² of 0.37 and RMSE and MAE of 0.0106 and 0.0085 t ha h ha⁻¹ MJ⁻¹ mm⁻¹, respectively. Similarly, the RF model demonstrated superior performance for K_Shirazi and K_Torri, with R² values of 0.43 and 0.38, RMSE values of 0.0050 and 0.0046 t ha h ha⁻¹ MJ⁻¹ mm⁻¹, and MAE values of 0.0038 and 0.0031 t ha h ha⁻¹ MJ⁻¹ mm⁻¹, respectively.

3.4. Spatial Distribution and Uncertainty Maps

Figure 5 presents the spatial distribution and associated uncertainty maps of K values predicted by RF for three distinct K: K_EPIC, K_Shirazi, and K_Torri. The K value is divided into six levels: low (<0.020), relatively low (0.020–0.026), medium low (0.026–0.033), medium (0.033–0.040), medium high (0.040–0.046), and high (>0.046). For K_EPIC, the medium-high erodibility area accounts for 85.9% of the total area, the medium erodibility area for 5.9%, and the high erodibility area for 8.1%, predominantly located on the north and south sides of the lake and in the central cultivated region. For K_Shirazi, the high-erodibility area accounts for 69.8% of the total area, the medium-high erodibility area for 24.4%, and the medium-erodibility area for 4.9%. For K_Torri, the distribution of K-value levels is relatively even. The medium-low erodibility area is the largest, accounting for 32% of the total region. The medium-erodibility areas account for 29.3%, and the relatively low erodibility areas for 14.6%. The medium-high erodibility area, comprising 14.6% of the total area, is mainly distributed along the eastern and western edges of the watershed. The low-erodibility area, which makes up 8% of the total area, is primarily located in the middle of the watershed.
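The six-level classification used above can be reproduced with a small lookup, sketched here in Python. The function names are hypothetical, and the handling of values that fall exactly on a class boundary is an assumption, since the text does not specify it.

```python
from bisect import bisect_right

# Class boundaries and labels from Section 3.4 (units: t ha h ha^-1 MJ^-1 mm^-1).
THRESHOLDS = [0.020, 0.026, 0.033, 0.040, 0.046]
LABELS = ["low", "relatively low", "medium low", "medium", "medium high", "high"]


def erodibility_level(k):
    """Map a K value to one of the six levels.

    Assumption: a value exactly on a boundary falls in the higher class.
    """
    return LABELS[bisect_right(THRESHOLDS, k)]


def area_shares(k_values):
    """Percentage of cells in each level for an iterable of K values."""
    counts = {label: 0 for label in LABELS}
    total = 0
    for k in k_values:
        counts[erodibility_level(k)] += 1
        total += 1
    return {label: 100.0 * n / total for label, n in counts.items()}
```

Applied to every cell of a predicted K raster with equal cell size, `area_shares` would yield area percentages of the kind reported in this section.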

The uncertainty maps (Figure 6) were derived from the standard deviation of the simulation results, which were run 100 times using the RF model. The uncertainty ranges for K_EPIC, K_Shirazi, and K_Torri are 0.0039–0.0060, 0.0051–0.0075, and 0.0012–0.0058 t ha h ha⁻¹ MJ⁻¹ mm⁻¹, respectively. The uncertainty of K_EPIC is generally low, while the spatial uncertainty of K_Shirazi varies significantly. Overall, the uncertainty distribution is higher along the central lake and lower around the lake.
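The per-cell uncertainty described above amounts to a standard deviation taken across repeated prediction runs. A minimal Python sketch follows; the function name is hypothetical, the toy values are invented for illustration, and the use of the sample standard deviation (n − 1 denominator) is an assumption because the text does not say which form was used.

```python
import statistics


def uncertainty_map(runs):
    """Per-cell standard deviation across repeated prediction runs.

    `runs` is a list of runs; each run is a list of predicted K values,
    one per grid cell, in the same cell order.
    Assumption: sample standard deviation (n - 1 denominator).
    """
    n_cells = len(runs[0])
    if any(len(run) != n_cells for run in runs):
        raise ValueError("all runs must cover the same cells")
    return [statistics.stdev(cell_values) for cell_values in zip(*runs)]


# Three toy runs over two cells (the study used 100 RF runs).
toy_runs = [[0.030, 0.040], [0.032, 0.040], [0.034, 0.040]]
toy_uncertainty = uncertainty_map(toy_runs)
```

In practice each run would be the flattened prediction raster from one RF fit, and the resulting list would be reshaped back to the raster grid.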

3.5. Importance of Environmental Covariates

Figure 7 compares the importance of environmental covariates in the three K-prediction models. For K_EPIC, soil properties are the most important environmental variables, with an average variable importance of 7.89%, followed by soil moisture content (SMC), the number of patches index (NP) representing soil landscape fragmentation, and aspect. For K_Shirazi, SMC is the most important environmental variable, with a variable importance of 20%. This is followed by soil properties, landscape shape index (LSI), and slope. The ranking of environmental variables in the K_Torri prediction model differs slightly from the previous two. SMC remains the most important variable, with a variable importance of 17.4%, followed by the distance to roads (Dis_road), the distance to residential areas (Dis_con), and the Enhanced Vegetation Index (EVI).

4. Discussion

4.1. Distinct Soil-Erodibility Map

The EPIC, Shirazi, and Torri models are the three most commonly used empirical models for estimating the K value. This paper employs the RF and GBDT algorithms to construct three prediction models for K estimation. The RF algorithm is suited to handling complex data relationships and is more effective in identifying the spatial variation of soil erodibility in the study area. The R² of K_EPIC, K_Shirazi, and K_Torri increased by 46%, 34%, and 22%, respectively, compared to GBDT.

Figure 5 illustrates the distinct soil-erodibility maps. The highest soil erodibility in K_EPIC occurs in the middle of the study area, where the terrain is lower and the SOC is higher. In contrast, the highest soil erodibility in K_Torri occurs at the periphery of the study area, where the terrain is higher and close to the forest edge. When depth and direction are consistent, the location of the tillage is the most important factor influencing the tillage erodibility [48,49,50]. Thus, the spatial distribution of soil erodibility in K_Torri exhibits a more logical and coherent pattern. Soil erodibility is high around the lake in both K_EPIC and K_Torri. Fluctuations in lake level periodically expose and submerge the lakeshore soils, which accelerates the degradation of soil structure [51,52,53]. In addition, the spatial distribution of K_Shirazi (high soil erodibility throughout the watershed) may overestimate the soil-erosion risk in the study area. Rao et al. [33] pointed out that the average area of moderate-and-above soil erosion in Yunnan province in the past 30 years accounted for 33.55% of the total land area. The Yunnan Provincial Soil and Water Conservation Bulletin recorded that the area of moderate-and-above soil erosion in the Qilu Lake Watershed accounted for 30–40% of the total land area in the watershed. Soil erodibility can be estimated using a range of empirical models. In this study, only three commonly used models were selected for comparison. Many modified models were not included in the study, such as the approximate nomograph model developed by Auerswald et al. [54] using the Central European soil data set and the corrected nomograph equation, the Shirazi formula method, and the proposed EPIC model. Future studies could consider incorporating a broader range of K-value estimation models for comparison.

4.2. Environmental Mechanisms on Soil Erodibility Models

We have compared the main environmental factors associated with the different K values. SOC content and soil particle size are the main soil properties that directly influence the K value. Under natural conditions, soil erosion is primarily caused by rainfall, runoff, and wind. Soils rich in clay have a loose structure and good permeability, reducing the risk of soil stripping by runoff [55,56]. Smaller sand particles are easily washed away by runoff or wind, due to their light weight and poor cohesion [57,58]. The presence of abundant SOC aids in the proliferation of plant root systems, thereby anchoring the soil and curbing erosion [59,60]. At the same time, organic matter contributes to soil cohesion, reducing its susceptibility to external forces [61]. In general, K is negatively correlated with SOC and sand content, and positively correlated with clay and silt content. However, the strength of these correlations varies slightly, depending on the model used to estimate K.

Environmental covariates indirectly influence the spatial pattern of K values by influencing SOC content and soil particle size. In this study, using the optimal machine learning model (RF), we ranked the importance of environmental covariates in predicting the spatial variability of different K values. Soil moisture is the crucial environmental variable for K estimates, characterizing the soil water retention. As simple parameters that describe soil particle composition, total soil porosity and soil bulk density have important effects on soil water, air, and nutrient status and on the plant growth environment. They rank high in the importance of environmental covariates in K_EPIC and K_Shirazi, but have limited effects in K_Torri, which is more sensitive to organic matter content. Topography is most important in K_EPIC, but has limited effects in K_Torri and K_Shirazi. Slope affects the cycling and distribution of nutrients in the soil, and the slope surface affects soil temperature and moisture conditions, which in turn affect soil microbial activity and the rate of decomposition of organic matter. The Qilu Lake watershed’s terrain is predominantly flat with gentle slopes and deep soils, which facilitates the accumulation of soil organic matter and mineral nutrients. For K_EPIC and K_Shirazi, distance to the lake is the most important location variable, meaning that the natural environment has a greater influence on the soil erodibility values of K_EPIC and K_Shirazi. In contrast, for K_Torri, distance to the road is the most important location variable, which means that human activities have a greater impact on the soil erodibility value of K_Torri. Vegetation is an important variable that adds organic matter to the soil, and its root system improves the structural composition of the soil [22,61]. 
The study area, a primary grain and economic-crop-production region in Yunnan Province, is subject to high-intensity farming activities, which can significantly challenge soil resilience. At present, vegetation variables are mostly obtained by remote sensing. Future work could derive relevant environmental covariates from cropping systems and land management practices. Landscape metrics rank consistently in the importance of environmental variables for all three K values, which suggests that landscape pattern is a reliable predictor of spatial variation in soil erodibility. The number of patches index (NP) indicates the degree of landscape fragmentation. In the field survey, it was observed that the local farmers, who are primarily engaged in vegetable cultivation, are seeking to diversify their crops through techniques such as intercropping and to enhance the efficiency of agricultural production via land fragmentation [34]. The practice of multi-crop vegetable planting contributes to an increased transfer of organic matter to the soil, due to a high volume of crop residues and extensive use of chemical fertilizers [47]. Consequently, in the Qilu Lake watershed, a diverse and fragmented farmland landscape pattern will reduce soil erodibility.

4.3. Optimal Soil-Erodibility Map Based on Empirical Models Empowered by Machine Learning

Our case study has demonstrated that the choice of model for estimating the K value significantly influences the spatial distribution of soil erodibility. Therefore, we used three criteria to compare the suitability of the K estimation model. The first criterion is the distribution of the estimated data. The normal curve shows that the distribution stability of the three K-value estimations is K_EPIC > K_Torri > K_Shirazi (Figure 3). The second criterion is the spatial distribution of soil erodibility. Compared to other models, the spatial distribution of K_Torri is more in line with the actual situation (see Section 4.1). It has a high soil-erodibility-area ratio and distribution characteristics that closely match statistical data and the theory of high-soil-erosion areas, such as those around lakes and cultivated land. The third criterion is the proximity to the measured value. The K value can be estimated using the USLE conversion formula through long-term field observation data on runoff and erosion. Zhang et al. [62] found that the measured value of K in areas similar to the study area (clayey, moist, iron-rich soil, and irrigated land) was 0.0346 t ha h ha⁻¹ MJ⁻¹ mm⁻¹. An examination of the differences between the empirical and projected K values reveals that the K_Torri estimation model is the most precise. Thus, the ranking of model suitability for soil erodibility estimation is K_Torri > K_EPIC > K_Shirazi. Compared to the analysis of the characteristics of the data-distribution and spatial-distribution patterns, the closeness to the measured K value is an objective quantitative method to assess the suitability of the K-value model.

Empirical models rely on statistical relationships between soil erodibility and organic matter and particle composition to estimate the K value. Machine learning enhances these models to discern diverse spatial patterns of soil erodibility from intricate datasets, which is crucial for the optimal soil-erodibility map. Nonetheless, these empirical models do not fully capture the underlying physical mechanisms of soil erosion. Future studies could develop a new framework on a physically based model empowered by machine learning to improve the parsimony, interpretability, and predictive capability of the soil erodibility mapping [63]. This would be achieved by integrating key physical soil-erosion processes, such as water-flow shear force, soil adhesion, and structural stability.

5. Conclusions

Spatial prediction models for K_EPIC, K_Shirazi, and K_Torri were constructed and optimized using machine learning algorithms. Then, we comprehensively considered the model prediction accuracy, generalization ability, and sensitivity to environmental covariates, and finally determined the optimal K-value estimation model suitable for a plateau lake watershed. The main results of the research were as follows: by comparing the prediction results of different models, we found that the K_EPIC and K_Torri models showed higher consistency and stability in the distribution characteristics and spatial distribution patterns of sample points. Environmental covariates exert an indirect influence on K values by affecting the SOC content and the composition of soil particles. The K_EPIC and K_Shirazi are influenced by two key factors: soil porosity and soil moisture. The K_Torri is more sensitive to soil moisture conditions and terrain location. Following a comprehensive review of model prediction accuracy, spatial distribution characteristics and agreement with measured values, it was concluded that the Torri model is the optimal choice for the plateau lake watershed. This study proposed a framework that aimed to create optimal soil-erodibility maps and offered a scientific and accurate K-value estimation method for the assessment of soil erosion in a plateau lake watershed.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/rs16163017/s1, Table S1: Pearson correlation between the K value and factors. These factors include the main soil properties known to directly influence the K value and the top three environmental variables ranked by their predictive importance in the model.

Author Contributions

Conceptualization, J.W. and Y.C.; methodology, Y.C. and Z.S.; software, J.W. and Y.W.; validation, J.W., Y.W. and S.G.; formal analysis, J.W. and Y.H.; investigation, J.W. and S.G.; resources, S.B. and J.C. (Jinming Chen); data curation, J.C. (Jing Chen); writing—original draft preparation, J.W.; writing—review and editing, S.G., S.B. and Y.C.; visualization, J.W. and Y.W.; supervision, S.G. and S.B.; project administration, S.G.; funding acquisition, S.G. and Y.C. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the High-Resolution Earth Observation System Major Special Government Comprehensive Governance Application and Scale Industrialization Demonstration Project (No. 89-Y50G31-9001-22/23-05), and the LIESMARS Special Research Funding.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Lal, R. Soil Erosion and the Global Carbon Budget. Environ. Int. 2003, 29, 437–450.
2. Amundson, R.; Berhe, A.A.; Hopmans, J.W.; Olson, C.; Sztein, A.E.; Sparks, D.L. Soil and Human Security in the 21st Century. Science 2015, 348, 1261071.
3. Van Oost, K.; Quine, T.A.; Govers, G.; De Gryze, S.; Six, J.; Harden, J.W.; Ritchie, J.C.; McCarty, G.W.; Heckrath, G.; Kosmas, C.; et al. The Impact of Agricultural Soil Erosion on the Global Carbon Cycle. Science 2007, 318, 626–629.
4. Qian, M.; Zhou, W.; Wang, S.; Li, Y.; Cao, Y. The Influence of Soil Erodibility and Saturated Hydraulic Conductivity on Soil Nutrients in the Pingshuo Opencast Coalmine, China. Int. J. Environ. Res. Public Health 2022, 19, 4762.
5. Ganasri, B.P.; Ramesh, H. Assessment of Soil Erosion by RUSLE Model Using Remote Sensing and GIS—A Case Study of Nethravathi Basin. Geosci. Front. 2016, 7, 953–961.
6. Pimentel, D.; Kounang, N. Ecology of Soil Erosion in Ecosystems. Ecosystems 1998, 1, 416–426.
7. Keesstra, S.D.; Bouma, J.; Wallinga, J.; Tittonell, P.; Smith, P.; Cerdà, A.; Montanarella, L.; Quinton, J.N.; Pachepsky, Y.; van der Putten, W.H.; et al. The Significance of Soils and Soil Science towards Realization of the United Nations Sustainable Development Goals. SOIL 2016, 2, 111–128.
8. Kastridis, A.; Stathis, D.; Sapountzis, M.; Theodosiou, G. Insect Outbreak and Long-Term Post-Fire Effects on Soil Erosion in Mediterranean Suburban Forest. Land 2022, 11, 911.
9. Baartman, J.E.M.; Nunes, J.P.; van Delden, H.; Vanhout, R.; Fleskens, L. The Effects of Soil Improving Cropping Systems (SICS) on Soil Erosion and Soil Organic Carbon Stocks across Europe: A Simulation Study. Land 2022, 11, 943.
10. Ferro, V. Deducing the USLE Mathematical Structure by Dimensional Analysis and Self-Similarity Theory. Biosyst. Eng. 2010, 106, 216–220.
11. Jin, F.; Yang, W.; Fu, J.; Li, Z. Effects of Vegetation and Climate on the Changes of Soil Erosion in the Loess Plateau of China. Sci. Total Environ. 2021, 773, 145514.
12. Rozos, D.; Skilodimou, H.D.; Loupasakis, C.; Bathrellos, G.D. Application of the Revised Universal Soil Loss Equation Model on Landslide Prevention. An Example from N. Euboea (Evia) Island, Greece. Environ. Earth Sci. 2013, 70, 3255–3266.
13. Williams, J.R.; Greenwood, D.J.; Nye, P.H.; Walker, A. The Erosion-Productivity Impact Calculator (EPIC) Model: A Case History. Philos. Trans. R. Soc. Lond. B. Biol. Sci. 1997, 329, 421–428.
14. Shirazi, M.A.; Boersma, L. A Unifying Quantitative Analysis of Soil Texture. Soil Sci. Soc. Am. J. 1984, 48, 142–147.
15. Torri, D.; Poesen, J.; Borselli, L. Predictability and Uncertainty of the Soil Erodibility Factor Using a Global Dataset. CATENA 1997, 31, 1–22.
16. Raj, R.; Saharia, M.; Chakma, S. Mapping Soil Erodibility over India. CATENA 2023, 230, 107271.
17. Borrelli, P.; Robinson, D.A.; Fleischer, L.R.; Lugato, E.; Ballabio, C.; Alewell, C.; Meusburger, K.; Modugno, S.; Schütt, B.; Ferro, V.; et al. An Assessment of the Global Impact of 21st Century Land Use Change on Soil Erosion. Nat. Commun. 2017, 8, 2013.
18. Zhao, W.; Wei, H.; Jia, L.; Daryanto, S.; Zhang, X.; Liu, Y. Soil Erodibility and Its Influencing Factors on the Loess Plateau of China: A Case Study in the Ansai Watershed. Solid Earth 2018, 9, 1507–1516.
19. Zhang, K.L.; Shu, A.P.; Xu, X.L.; Yang, Q.K.; Yu, B. Soil Erodibility and Its Estimation for Agricultural Soils in China. J. Arid Environ. 2008, 72, 1002–1011.
20. Wadoux, A.M.J.-C.; Minasny, B.; McBratney, A.B. Machine Learning for Digital Soil Mapping: Applications, Challenges and Suggested Solutions. Earth-Sci. Rev. 2020, 210, 103359.
21. Wadoux, A.; Samuel-Rosa, A.; Poggio, L.; Mulder, V.L. A Note on Knowledge Discovery and Machine Learning in Digital Soil Mapping. Eur. J. Soil Sci. 2019, 71, 133–136.
22. Yang, X. Deriving RUSLE Cover Factor from Time-Series Fractional Vegetation Cover for Hillslope Erosion Modelling in New South Wales. Soil Res. 2014, 52, 253–261.
23. Ding, J.; Yang, S.; Shi, Q.; Wei, Y.; Wang, F. Using Apparent Electrical Conductivity as Indicator for Investigating Potential Spatial Variation of Soil Salinity across Seven Oases along Tarim River in Southern Xinjiang, China. Remote Sens. 2020, 12, 2601.
24. Ding, J.; Yang, A.; Wang, J.; Sagan, V.; Yu, D. Machine-Learning-Based Quantitative Estimation of Soil Organic Carbon Content by VIS/NIR Spectroscopy. PeerJ 2018, 6, e5714.
25. Panagos, P.; Meusburger, K.; Ballabio, C.; Borrelli, P.; Alewell, C. Soil Erodibility in Europe: A High-Resolution Dataset Based on LUCAS. Sci. Total Environ. 2014, 479–480, 189–200.
26. Sun, L.; Liu, F.; Zhu, X.; Zhang, G. High-Resolution Digital Mapping of Soil Erodibility in China. Geoderma 2024, 444, 116853.
27. Yu, W.; Jiang, Y.; Liang, W.; Wan, D.; Liang, B.; Shi, Z. High-Resolution Mapping and Driving Factors of Soil Erodibility in Southeastern Tibet. CATENA 2023, 220, 106725.
28. Wang, S.; Nie, X.; Ran, F.; Liao, W.; Yang, C.; Xiao, T.; Liu, Y.; Liu, Y.; Li, Z. Human Activities Control the Source of Eroded Organic Carbon in Lake Sediments over the Last 100 Years: Evidence from Stable Isotope Fingerprinting. Fundam. Res. 2023, 4.
29. Doetterl, S.; Six, J.; Van Wesemael, B.; Van Oost, K. Carbon Cycling in Eroding Landscapes: Geomorphic Controls on Soil Organic C Pool Composition and C Stabilization. Glob. Change Biol. 2012, 18, 2218–2232.
30. Xiao, T.; Ran, F.; Li, Z.; Wang, S.; Nie, X.; Liu, Y.; Yang, C.; Tan, M.; Feng, S. Sediment Organic Carbon Dynamics Response to Land Use Change in Diverse Watershed Anthropogenic Activities. Environ. Int. 2023, 172, 107788.
31. Sun, Y.; Nie, X.; Li, Z.; Wang, S.; Chen, J.; Ran, F. The Applicability of Commonly-Used Tracers in Identifying Eroded Organic Matter Sources. J. Hydrol. 2021, 603, 126949.
32. Cooper, R.J.; Pedentchouk, N.; Hiscock, K.M.; Disdle, P.; Krueger, T.; Rawlins, B.G. Apportioning Sources of Organic Matter in Streambed Sediments: An Integrated Molecular and Compound-Specific Stable Isotope Approach. Sci. Total Environ. 2015, 520, 187–197.
33. Rao, W.; Shen, Z.; Duan, X. Spatiotemporal Patterns and Drivers of Soil Erosion in Yunnan, Southwest China: RULSE Assessments for Recent 30 Years and Future Predictions Based on CMIP6. CATENA 2023, 220, 106703.
34. Yu, P.; Fennell, S.; Chen, Y.; Liu, H.; Xu, L.; Pan, J.; Bai, S.; Gu, S. Positive Impacts of Farmland Fragmentation on Agricultural Production Efficiency in Qilu Lake Watershed: Implications for Appropriate Scale Management. Land Use Policy 2022, 117, 106108.
35. Chen, J.; Yang, X.; Dao, H.; Gu, H.; Chen, G.; Mao, C.; Bai, S.; Gu, S.; Zhou, Z.; Yan, Z. Analyses on Characteristics of Spatial Distribution and Matching of the Human–Land–Water–Heat System on the Yunnan Plateau. Water 2024, 16, 867.
36. Wei, Y.; Chen, Y.; Wang, J.; Wang, B.; Yu, P.; Hong, Y.; Zhu, L. Unveiling the Explanatory Power of Environmental Variables in Soil Organic Carbon Mapping: A Global-Local Analysis Framework. Geoderma 2024. accepted.
37. Sun, Z.; Liu, F.; Wang, D.; Wu, H.; Zhang, G. Improving 3D Digital Soil Mapping Based on Spatialized Lab Soil Spectral Information. Remote Sens. 2023, 15, 5228.
38. Gessler, P.E.; Chadwick, O.A.; Chamran, F.; Althouse, L.; Holmes, K. Modeling Soil–Landscape and Ecosystem Properties Using Terrain Attributes. Soil Sci. Soc. Am. J. 2000, 64, 2046–2056.
39. McGarigal, K.S.; Cushman, S.; Neel, M.; Ene, E. FRAGSTATS: Spatial Pattern Analysis Program for Categorical Maps; University of Massachusetts: Amherst, MA, USA, 2002.
40. McGarigal, K.; Tagil, S.; Cushman, S.A. Surface Metrics: An Alternative to Patch Metrics for the Quantification of Landscape Structure. Landsc. Ecol. 2009, 24, 433–450.
41. Geng, J.; Tan, Q.; Lv, J.; Fang, H. Assessing Spatial Variations in Soil Organic Carbon and C: N Ratio in Northeast China’s Black Soil Region: Insights from Landsat-9 Satellite and Crop Growth Information. Soil Tillage Res. 2024, 235, 105897.
42. Zhang, X.; Xue, J.; Chen, S.; Wang, N.; Xie, T.; Xiao, Y.; Chen, X.; Shi, Z.; Huang, Y.; Zhuo, Z. Fine Resolution Mapping of Soil Organic Carbon in Croplands with Feature Selection and Machine Learning in Northeast Plain China. Remote Sens. 2023, 15, 5033.
43. Peters, J.; Baets, B.D.; Verhoest, N.E.C.; Samson, R.; Degroeve, S.; Becker, P.D.; Huybrechts, W. Random Forests as a Tool for Ecohydrological Distribution Modelling. Ecol. Model. 2007, 207, 304–318.
44. Carlisle, D.; Falcone, J.; Wolock, D.; Meador, M.; Norris, R. Predicting the Natural Flow Regime: Models for Assessing Hydrological Alteration in Streams. River Res. Appl. 2009, 26, 118–136.
45. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.Y. LightGBM: A Highly Efficient Gradient Boosting Decision Tree. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 7–9 December 2017; Curran Associates Inc.: Red Hook, NY, USA, 2017; pp. 3149–3157.
46. Friedman, J.H. Greedy Function Approximation: A Gradient Boosting Machine. Ann. Stat. 2001, 29, 1189–1232.
47. Wu, Z.; Liu, Y.; Han, Y.; Zhou, J.; Liu, J.; Wu, J. Mapping Farmland Soil Organic Carbon Density in Plains with Combined Cropping System Extracted from NDVI Time-Series Data. Sci. Total Environ. 2021, 754, 142120.
48. Jia, L.Z.; Zhang, J.H.; Zhang, Z.H.; Wang, Y. Assessment of Gravelly Soil Redistribution Caused by a Two-Tooth Harrow in Mountainous Landscapes of the Yunnan-Guizhou Plateau, China. Soil Tillage Res. 2017, 168, 11–19.
49. Kimaro, D.N.; Deckers, J.A.; Poesen, J.; Kilasara, M.; Msanya, B.M. Short and Medium Term Assessment of Tillage Erosion in the Uluguru Mountains, Tanzania. Soil Tillage Res. 2005, 81, 97–108.
50. Ziegler, A.D.; Giambelluca, T.W.; Sutherland, R.A.; Nullet, M.A.; Vien, T.D. Soil Translocation by Weeding on Steep-Slope Swidden Fields in Northern Vietnam. Soil Tillage Res. 2007, 96, 219–233.
51. Bao, Y.; He, X.; Wen, A.; Gao, P.; Tang, Q.; Yan, D.; Long, Y. Dynamic Changes of Soil Erosion in a Typical Disturbance Zone of China’s Three Gorges Reservoir. CATENA 2018, 169, 128–139.
52. Siqueira, A.G.; Azevedo, A.A.; Dozzi, L.F.S.; Duarte, H. Monitoring Program of Reservoir Bank Erosion at Porto Primavera Dam, Parana River, SP/MS, Brazil. Engineering Geology for Society and Territory; Lollino, G., Arattano, M., Rinaldi, M., Giustolisi, O., Marechal, J.C., Grant, G.E., Eds.; Springer International Publishing: Cham, Switzerland, 2015; Volume 3, pp. 351–355.
53. Saint-Laurent, D.; Touileb, B.N.; Saucet, J.P.; Whalen, A.; Gagnon, B.; Nzakimuena, T. Effects of Simulated Water Level Management on Shore Erosion Rates. Case Study: Baskatong Reservoir, Québec, Canada. Can. J. Civ. Eng. 2001, 28, 482–495.
54. Auerswald, K.; Fiener, P.; Martin, W.; Elhaus, D. Use and Misuse of the K Factor Equation in Soil Erosion Modeling: An Alternative Equation for Determining USLE Nomograph Soil Erodibility Values. CATENA 2014, 118, 220–225.
55. Bonilla, C.A.; Johnson, O.I. Soil Erodibility Mapping and Its Correlation with Soil Properties in Central Chile. Geoderma 2012, 189–190, 116–123.
56. Mallick, J.; Al-Wadi, H.; Rahman, A.; Ahmed, M.; Abad Khan, R. Spatial Variability of Soil Erodibility and Its Correlation with Soil Properties in Semi-Arid Mountainous Watershed, Saudi Arabia. Geocarto Int. 2016, 31, 661–681.
57. Efthimiou, N. The Importance of Soil Data Availability on Erosion Modeling. CATENA 2018, 165, 551–566.
58. Geng, J.; Cheng, S.; Fang, H.; Pei, J.; Xu, M.; Lu, M.; Yang, Y.; Cao, Z. Nitrogen Fertilization Changes the Molecular Composition of Soil Organic Matter in a Subtropical Plantation Forest. Soil Sci. Soc. Am. J. 2020, 84, 68–81.
59. Ostovari, Y.; Ghorbani-Dashtaki, S.; Kumar, L.; Shabani, F. Soil Erodibility and Its Prediction in Semi-Arid Regions. Arch. Agron. Soil Sci. 2019, 65, 1688–1703.
60. Chen, S.; Arrouays, D.; Angers, D.A.; Martin, M.P.; Walter, C. Soil Carbon Stocks under Different Land Uses and the Applicability of the Soil Carbon Saturation Concept. Soil Tillage Res. 2019, 188, 53–58.
61. Saha, R.; Tomar, J.M.S.; Ghosh, P.K. Evaluation and Selection of Multipurpose Tree for Improving Soil Hydro-Physical Behaviour under Hilly Eco-System of North East India. Agrofor. Syst. 2007, 69, 239–247.
62. Zhang, W.; Yu, D.; Shi, X.; Xiangyan, Z.; Hongjie, W.; Zhu-jun, G. Uncertainty in Prediction of Soil Erodibility K-Factor in Subtropical China. Acta Pedol. Sin. 2009, 46, 185–191.
63. Zhang, Z.; Chen, Y.; Wu, K.; Hong, Y.; Shi, T.; Mouazen, A.M. On the Parsimony, Interpretability, and Predictive Capability of a Physically-Based Model in the Optical Domain for Estimating Soil Moisture Content. Geoderma 2024, 49, 116996.

Figure 1. Location of the Qilu Lake watershed and 216 soil samples.

Figure 2. Soil-texture classification triangle and SOC distribution.

Figure 3. Box plots of the distributions of K_EPIC, K_Shirazi, and K_Torri.

Figure 4. Scatter plots of the relationships between the soil-erodibility model estimates (K_EPIC, K_Shirazi, and K_Torri) and soil texture and SOC.

Figure 5. Spatial distribution maps of the estimated K values.

Figure 6. Uncertainty maps of the estimated K values.

Figure 7. Importance of the environmental covariates of K_EPIC (a), K_Shirazi (b), and K_Torri (c).

Table 1. Environmental covariates used for soil erodibility (K) mapping.

Category	Covariate	Description
Topography	DEM	Elevation above sea level (m)
	TWI	Topographic wetness index
	RDLS	Relief degree of land surface
	aspect	Aspect derived from elevation
	slope	Slope derived from elevation (%)
Soil properties	SBD	Bulk density measured in the laboratory
	SP	Soil porosity measured in the laboratory
Location	Dis_con	Distance to construction land (m)
	Dis_lake	Distance to lakes (m)
	Dis_river	Distance to rivers (m)
	Dis_road	Distance to roads (m)
Vegetation	NDVI	Normalized Difference Vegetation Index
	EVI	Enhanced Vegetation Index
	TVI	Transformed Vegetation Index
	NDSI	Normalized Difference Soil Index
Surface moisture	SMC	Soil moisture content measured in the laboratory
	NDWI	Normalized Difference Water Index
	MSI	Moisture Stress Index
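	NSDSI1	Normalized Shortwave–Infrared (SWIR) Difference Bare Soil Moisture Index 1
	NSDSI2	Normalized Shortwave–Infrared (SWIR) Difference Bare Soil Moisture Index 2
	NSDSI3	Normalized Shortwave–Infrared (SWIR) Difference Bare Soil Moisture Index 3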
Landscape	DIVISION	Landscape division index
	LPI	Largest patch index
	LSI	Landscape shape index
	IJI	Interspersion and juxtaposition index
	NP	Number of patches
	SHID	Shannon’s diversity index
	COHESION	Patch cohesion index
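
Several covariates in Table 1 are simple band ratios or terrain formulas. The sketch below shows the standard textbook forms of NDVI, MSI, and TWI. The band assignments (near-infrared, red, shortwave-infrared) and the function names are assumptions for illustration; the article's own sensor bands and processing chain are not reproduced here.

```python
import math


def normalized_difference(band_a, band_b):
    """Generic normalized difference (a - b) / (a + b) of two reflectances."""
    total = band_a + band_b
    if total == 0:
        return float("nan")  # undefined where both bands are zero
    return (band_a - band_b) / total


def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return normalized_difference(nir, red)


def msi(swir, nir):
    """Moisture Stress Index as the SWIR-to-NIR reflectance ratio."""
    return swir / nir


def twi(specific_catchment_area, slope_radians):
    """Topographic wetness index ln(a / tan(beta)).

    specific_catchment_area: upslope area per unit contour length (m).
    slope_radians: local slope angle in radians (must be > 0).
    """
    return math.log(specific_catchment_area / math.tan(slope_radians))


# Hypothetical pixel values, for illustration only.
print(round(ndvi(nir=0.5, red=0.1), 4))
print(round(msi(swir=0.2, nir=0.5), 4))
print(round(twi(specific_catchment_area=100.0, slope_radians=math.pi / 4), 4))
```

In practice these functions would be applied per pixel to raster bands, and flat cells (slope of zero) need special handling before computing TWI.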

Table 2. Descriptive statistics of SOC (g/kg), sand (%), clay (%), and silt (%).

Soil Properties	Maximum	Minimum	Mean	Standard Deviation	Coefficient of Variation	Skewness	Kurtosis
SOC	115.00	2.80	34.97	20.58	0.59	0.82	0.89
Sand	83.47	2.42	19.95	15.12	0.76	1.31	1.56
Clay	46.05	3.81	20.25	7.87	0.39	1.02	1.10
Silt	82.12	12.72	59.79	12.70	0.21	−0.83	0.75
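
The statistics in Table 2 can be reproduced from raw sample values with a few lines of code. The sketch below is a minimal version that assumes the sample standard deviation and moment-based skewness and excess kurtosis; the article does not state which estimators were used, so these choices, along with the names `describe`, `cv_matches`, and `TABLE2`, are assumptions. The second part checks that each reported coefficient of variation equals SD divided by mean, using the values copied from Table 2.

```python
import statistics


def describe(values):
    """Summary statistics in the layout of Table 2 for one soil property."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    # Central moments for skewness and excess kurtosis (assumed estimators).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "max": max(values),
        "min": min(values),
        "mean": mean,
        "sd": sd,
        "cv": sd / mean,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,
    }


# (mean, standard deviation, reported CV) copied from Table 2.
TABLE2 = {
    "SOC": (34.97, 20.58, 0.59),
    "Sand": (19.95, 15.12, 0.76),
    "Clay": (20.25, 7.87, 0.39),
    "Silt": (59.79, 12.70, 0.21),
}


def cv_matches(mean, sd, reported, tol=0.005):
    """True if SD / mean agrees with the reported CV to two decimals."""
    return abs(sd / mean - reported) <= tol + 1e-9


for name, (mean, sd, reported) in TABLE2.items():
    print(name, round(sd / mean, 3), cv_matches(mean, sd, reported))

# Hypothetical sample, only to show the output format of describe().
print(describe([1.0, 2.0, 3.0, 4.0, 5.0]))
```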

Table 3. Accuracy of K predicted with RF and GBDT.

Metric	RF (EPIC)	RF (Shirazi)	RF (Torri)	GBDT (EPIC)	GBDT (Shirazi)	GBDT (Torri)
R²	0.38	0.43	0.45	0.38	0.43	0.37
RMSE	0.0046	0.0050	0.0099	0.0046	0.0050	0.0106
MAE	0.0031	0.0038	0.0079	0.0031	0.0038	0.0085

Note: RMSE and MAE are expressed in t ha h ha⁻¹ MJ⁻¹ mm⁻¹.
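
The three accuracy metrics in Table 3 can be computed as in the sketch below. The definitions are the usual ones (R² as one minus the ratio of residual to total sum of squares); the article's accuracy-evaluation section may define them slightly differently, and the observed and predicted values here are hypothetical. The helper `relative_gain` expresses one score as a fractional improvement over another; applied to the Torri column of Table 3 (0.45 for RF versus 0.37 for GBDT) it gives roughly 22%.

```python
import math


def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error, in the units of K."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error, in the units of K."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def relative_gain(new_score, base_score):
    """Fractional improvement of new_score over base_score."""
    return (new_score - base_score) / base_score


# Hypothetical observed and predicted K values, for illustration only.
obs = [0.030, 0.035, 0.040, 0.045]
pred = [0.032, 0.034, 0.041, 0.043]
print(f"R2={r2(obs, pred):.2f} RMSE={rmse(obs, pred):.4f} MAE={mae(obs, pred):.4f}")

# R2 of K_Torri in Table 3: RF 0.45 versus GBDT 0.37.
print(f"Relative gain for K_Torri: {relative_gain(0.45, 0.37):.1%}")
```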

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
