The assessment of class frequency in soil map legends is affected by uncertainty, especially at small scales, where generalization is greater. The aim of this study was to test the hypothesis that data mining techniques provide a better estimation of class frequency than traditional deterministic pedology in a national soil map. In the 1:5,000,000 map of Italian soil regions, the soil classes are the WRB reference soil groups (RSGs). Different data mining techniques, namely neural networks, random forests, boosted trees, classification and regression trees, and support vector machines (SVM), were tested; the last gave the best RSG predictions using selected auxiliary variables and 22,015 classified soil profiles. The five most frequent RSGs resulting from the two approaches were compared. The outcomes were validated with a Bayesian approach applied to a subset of 10% of geographically representative profiles, which were set aside before data processing. The validation provided the values of both positive and negative prediction abilities. The most frequent classes were equally predicted by the two methods, which differed, however, in their forecasts of the other classes. The Bayesian validation indicated that the SVM method was more reliable than the deterministic pedological approach and that both approaches were more confident in predicting the absence rather than the presence of a soil type.
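The "positive and negative prediction abilities" mentioned in the abstract can be illustrated with a minimal sketch. It assumes these correspond to the usual positive and negative predictive values computed on held-out profiles; the abstract does not describe the exact Bayesian procedure, so this is not the authors' implementation. The function name `predictive_values` and the toy labels are hypothetical and carry no data from the study.

```python
def predictive_values(observed, predicted, target):
    """Return (PPV, NPV) for one soil class on a held-out validation set.

    PPV: share of profiles predicted as `target` that really are `target`.
    NPV: share of profiles predicted as not `target` that really are not.
    """
    tp = fp = tn = fn = 0
    for obs, pred in zip(observed, predicted):
        if pred == target and obs == target:
            tp += 1  # presence predicted, presence observed
        elif pred == target:
            fp += 1  # presence predicted, absence observed
        elif obs == target:
            fn += 1  # absence predicted, presence observed
        else:
            tn += 1  # absence predicted, absence observed
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


# Toy validation labels (illustrative only, not results from the paper).
observed = ["Cambisols", "Luvisols", "Cambisols", "Regosols", "Cambisols", "Luvisols"]
predicted = ["Cambisols", "Cambisols", "Cambisols", "Regosols", "Luvisols", "Luvisols"]

ppv, npv = predictive_values(observed, predicted, "Cambisols")
print(f"Cambisols: PPV={ppv:.2f}, NPV={npv:.2f}")
```

Comparing the two values class by class is one way to see whether a method is more reliable at predicting the absence of a soil type than its presence.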
|Number of pages||9|
|Publication status||Published - 2014|
All Science Journal Classification (ASJC) codes
- Soil Science
Fantappiè, M., Barbetti, R., Costantini, E. A. C., L'Abate, G., & Lorenzetti, R. (2014). Comparing data mining and deterministic pedology to assess the frequency of WRB reference soil groups in the legend of small scale maps. Geoderma, 237-238, 237-245.