Environmental Monitoring and Assessment | 2021

Large-scale digital mapping of topsoil total nitrogen using machine learning models and associated uncertainty map


Abstract


Understanding the spatial distribution of soil nutrients and the factors affecting their concentration and availability is crucial for soil fertility management and sustainable land utilization, yet the factors affecting soil nitrogen distribution in the Qorveh-Dehgolan plain have rarely been quantified. This study therefore aimed to digitally model and map the spatial distribution of topsoil total nitrogen (TN) in the Qorveh-Dehgolan plain, which covers an area of 150,000 ha, using random forest (RF), decision tree (DT), and Cubist (CB) algorithms. Samples were collected at 130 observation points from the topsoil (0 to 30 cm depth) based on a random sampling pattern. Soil physicochemical properties, calcium carbonate equivalent, organic carbon, and topsoil total nitrogen were then measured. A total of 51 environmental variables were used as covariates for digital mapping of topsoil total nitrogen: 31 geomorphometric attributes derived from a digital elevation model with 12.5-m spatial resolution, 13 spectral indices and reflectance from the SENTINEL-2 satellite (MSI sensor), five soil properties, and the two spatial variables of latitude and longitude. The most appropriate covariates were then selected by the Boruta algorithm in the R software environment. A standard deviation map was produced to show model uncertainty. Covariate selection identified 14 effective covariates for the spatial prediction of topsoil total nitrogen with the data mining algorithms. Validation of the RF, DT, and CB maps of topsoil total nitrogen using 20% of the data as an independent set showed root mean square errors (RMSE) of 0.032, 0.035, and 0.043%; mean absolute errors (MAE) of 0.0008, 0.001, and 0.002%; and coefficients of determination of 0.42, 0.38, and 0.35, respectively.
The relative importance (RI) of the environmental covariates, assessed with the %IncMSE index, highlighted two geomorphometric variables, midslope position and normalized height, along with the SAVI and NDVI remote sensing variables, in the spatial modeling and distribution of total nitrogen in the study area. The RF prediction and associated uncertainty maps, which show high accuracy and low standard deviation across most of the study area, revealed little overfitting or overtraining in soil-landscape modeling; this model can therefore support the development of digital maps of soil surface properties with acceptable accuracy for sustainable land utilization.
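
The validation statistics and the standard deviation map mentioned in the abstract can be illustrated with a minimal sketch. It is written in Python rather than the R environment used in the study, and it rests on assumptions not stated in the abstract: the coefficient of determination is computed as 1 - SS_res / SS_tot, and uncertainty is taken as the per-location standard deviation across the predictions of several ensemble members (for example, individual trees or repeated model runs). The function names and all numeric values are hypothetical and are not data from the study.

```python
import math
import statistics


def validation_metrics(observed, predicted):
    """Return RMSE, MAE, and R^2 for paired observed/predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and of equal length")
    errors = [p - o for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / len(errors))
    mae = sum(abs(e) for e in errors) / len(errors)
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    # Assumed definition of R^2; requires that observed values are not all identical.
    r2 = 1.0 - ss_res / ss_tot
    return {"rmse": rmse, "mae": mae, "r2": r2}


def ensemble_mean_and_sd(member_predictions):
    """Per-location mean and standard deviation across ensemble members.

    member_predictions holds one inner list per ensemble member, each with
    one prediction per map location (same order in every member).
    """
    per_location = list(zip(*member_predictions))
    means = [statistics.fmean(values) for values in per_location]
    sds = [statistics.pstdev(values) for values in per_location]
    return means, sds


# Hypothetical TN values in %, for illustration only (not data from the study).
observed = [0.10, 0.14, 0.08, 0.12]
predicted = [0.11, 0.12, 0.09, 0.12]
metrics = validation_metrics(observed, predicted)

# Three hypothetical ensemble members predicting TN at two map cells.
members = [[0.10, 0.15], [0.12, 0.15], [0.11, 0.15]]
means, sds = ensemble_mean_and_sd(members)
```

In this sketch, a cell where the ensemble members agree closely gets a standard deviation near zero, which is the kind of pattern the abstract reports as low uncertainty across most of the study area.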

Volume 193
DOI 10.1007/s10661-021-08947-w
Language English
Journal Environmental Monitoring and Assessment

Full Text