Evaluation of the Capability of Regression and Random Forest Methods to Estimate the Soil Water Retention Curve by Developing Pseudo-continuous Pedotransfer Functions

Authors

1 Department of Soil Science and Engineering, Faculty of Agriculture, Bu-Ali Sina University, Hamedan, Iran.

2 Faculty member, Bu-Ali Sina University, Hamedan

10.22034/ws.2024.60933.2556

Abstract

To date, point and parametric pedotransfer functions have been developed with many methods to estimate the soil water retention curve (SWRC), but the random forest (RF) method, together with certain input variables, has not yet been used in any study to develop pseudo-continuous pedotransfer functions. A total of 120 soil samples were collected from Tehran and Hamedan provinces, and their physical properties were measured. Ten pseudo-continuous pedotransfer functions were developed using linear regression and RF. Soil water suction, soil texture, percentage of clay and sand, bulk density, geometric mean and geometric standard deviation of particle diameter, and water content at field capacity (FC) and permanent wilting point (PWP) were used in various combinations to estimate the SWRC. Using soil suction as the only input variable in linear regression produced a model with acceptable results (R2 of 0.675 and 0.674 in the training and validation stages, respectively). Using the percentage of clay and sand as predictors improved the estimates by 1.5% to 25.0%. Bulk density significantly improved estimation accuracy, by 6.9% to 13.1%. Unlike PWP, FC improved estimation accuracy, by 3.5% to 24.4%. The distribution of error (RMSE) over the soil texture triangle depended on the type of input variables and on the method used to develop the functions. In all pseudo-continuous functions, estimation accuracy based on RMSE was significantly and considerably higher with RF than with linear regression, by 22% to 46%.

Article Title [English]

Evaluation of the Capability of Regression and Random Forest Methods to Estimate Soil Water Retention Curve by Developing Pseudo-continuous Pedotransfer Functions

Author [English]

  • Reza Kiani 1
1 Dept. of Soil Science and Engineering, Faculty of Agriculture, Bu-Ali Sina University, Hamedan, Iran.
Abstract [English]

Background and Objectives
Direct measurement of the soil water retention curve (SWRC) is time-consuming and expensive, so it is not easily applied at large scales. Researchers therefore use pedotransfer functions (PTFs) to estimate it. Various point and parametric pedotransfer functions have been developed with numerous methods to estimate the SWRC, each with its own drawbacks. Few methods, however, have been used to develop pseudo-continuous pedotransfer functions. The random forest (RF) method has not yet been used in any study to create pseudo-continuous pedotransfer functions, and some variables have never been used as predictors in such functions. The objectives of this article are therefore to investigate the potential of the RF method for creating pseudo-continuous pedotransfer functions, to compare its performance with linear regression, and to examine whether the performance of these functions can be improved by using the geometric mean and geometric standard deviation of particle diameter, field capacity (FC), and permanent wilting point (PWP) as predictors.
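The abstract does not state how the geometric mean and geometric standard deviation of particle diameter are computed. One common formulation derives them from the clay, silt, and sand fractions using a representative diameter for each size class. The sketch below assumes that formulation; the class diameters, the function name, and the example values are illustrative and are not taken from the study.

```python
import math

# Assumed representative diameters (mm) for the three size classes.
# These values are an assumption for illustration, not from the study.
CLASS_DIAMETER_MM = {"clay": 0.001, "silt": 0.026, "sand": 1.025}


def geometric_particle_stats(clay_pct, silt_pct, sand_pct):
    """Return (dg, Sg): geometric mean diameter in mm and the
    dimensionless geometric standard deviation."""
    fractions = {
        "clay": clay_pct / 100.0,
        "silt": silt_pct / 100.0,
        "sand": sand_pct / 100.0,
    }
    if abs(sum(fractions.values()) - 1.0) > 1e-6:
        raise ValueError("clay, silt and sand must sum to 100 percent")

    # Mass-weighted mean of ln(diameter).
    a = sum(f * math.log(CLASS_DIAMETER_MM[k]) for k, f in fractions.items())
    # Mass-weighted mean of ln(diameter)^2.
    second_moment = sum(
        f * math.log(CLASS_DIAMETER_MM[k]) ** 2 for k, f in fractions.items()
    )
    # Clamp at zero to guard against tiny negative rounding errors.
    b = math.sqrt(max(second_moment - a * a, 0.0))
    return math.exp(a), math.exp(b)


# Hypothetical loam-like sample: 20% clay, 40% silt, 40% sand.
dg, sg = geometric_particle_stats(20.0, 40.0, 40.0)
```

Under these assumptions, a soil made of a single size class has Sg equal to 1, and Sg grows as the mass is spread across classes.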
Methodology
A total of 120 disturbed and undisturbed soil samples were collected from the provinces of Tehran and Hamedan. Soil texture, bulk density, and the soil water retention curve over the range of 0 to 15000 hPa were measured. Pseudo-continuous pedotransfer functions were then created using linear regression and random forest. Soil water matric suction, soil texture, percentage of clay and sand, bulk density, geometric mean and geometric standard deviation of particle diameter, and moisture content at FC and PWP were used in various combinations to estimate the soil water retention curve. The accuracy and reliability of the resulting functions were compared between the two methods and within each method.
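In a pseudo-continuous pedotransfer function, matric suction is itself a predictor, so each measured point on a retention curve becomes one training row. The sketch below shows one way to build such rows and to fit the simplest case, a linear regression with suction as the only input. The field names, the sample values, and the log10(suction + 1) transform (used here so that 0 hPa is defined) are assumptions for illustration, not the study's actual data or preprocessing.

```python
import math


def to_pseudo_continuous_rows(samples):
    """Expand each soil sample into one row per measured suction level."""
    rows = []
    for sample in samples:
        for suction_hpa, theta in sample["retention"]:
            rows.append({
                # Assumed transform; +1 keeps saturation (0 hPa) finite.
                "log_suction": math.log10(suction_hpa + 1.0),
                "clay": sample["clay"],
                "sand": sample["sand"],
                "bulk_density": sample["bulk_density"],
                "theta": theta,  # volumetric water content (target)
            })
    return rows


def fit_simple_linear(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical samples; the numbers are made up for illustration.
samples = [
    {"clay": 30.0, "sand": 25.0, "bulk_density": 1.35,
     "retention": [(0, 0.48), (100, 0.38), (330, 0.33), (15000, 0.18)]},
    {"clay": 10.0, "sand": 70.0, "bulk_density": 1.55,
     "retention": [(0, 0.40), (100, 0.22), (330, 0.15), (15000, 0.06)]},
]

rows = to_pseudo_continuous_rows(samples)
slope, intercept = fit_simple_linear(
    [r["log_suction"] for r in rows],
    [r["theta"] for r in rows],
)
```

The same row table could be passed, with the additional columns, to a multiple regression or a random forest implementation; only the single-predictor case is fitted here to keep the sketch dependency-free.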
Findings
Using soil water matric suction as the only input variable for estimating moisture at different matric suctions was not effective in the RF method, and no model was created. In the linear regression method, however, it produced a model with acceptable results (R2 values of 0.675 and 0.674 for the training and validation stages, respectively), which can be used when no additional information is available. Including soil texture in the linear regression method significantly improved the accuracy of estimates, by 5.4% and 5.3% in the training and validation stages, respectively. In the third function, using the percentage of clay and sand alongside soil water matric suction as predictors improved SWRC estimation by 1.5% to 25.0% across the training and validation stages of both RF and linear regression, compared with the second function. In the fourth function, adding bulk density as a predictor significantly improved accuracy, by 6.9% to 13.1%, because bulk density is an indicator of soil structure. Using FC improved estimation accuracy by 3.5% to 24.4%, because FC is a point on the SWRC and so supplies the models with direct information. Using PWP as a predictor, by contrast, did not significantly improve estimation accuracy. Using the geometric mean (dg) and geometric standard deviation (Sg) instead of the percentage of clay and sand did not lead to noticeable improvements. In the linear regression method, the error distribution across the soil texture triangle showed no consistent dependence on soil texture: in pedotransfer functions 1, 2, 4, 7, and 8 the highest error values were obtained in coarse-textured soils, whereas in pedotransfer functions 5, 6, 9, and 10 the lowest error values were associated with coarse-textured soils.
The error distribution across the soil texture triangle depended on the type of input variables and on the method used to create the pedotransfer functions. In all pseudo-continuous pedotransfer functions, the accuracy of estimates in both the training and validation stages was significantly and noticeably higher with the RF method than with linear regression, by 22% to 46%.
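The abstract reports accuracy gains as percentages based on RMSE but does not give the formula. A minimal sketch, assuming the gain is the relative reduction in RMSE with respect to a reference model, is shown below; the function names and the numbers are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def percent_rmse_reduction(rmse_reference, rmse_new):
    """Relative RMSE reduction (%) of a new model versus a reference.

    Assumed definition: 100 * (reference - new) / reference.
    """
    return 100.0 * (rmse_reference - rmse_new) / rmse_reference


# Hypothetical water contents (cm3/cm3) and two sets of predictions.
observed = [0.45, 0.35, 0.28, 0.15]
pred_reference = [0.40, 0.38, 0.24, 0.20]
pred_new = [0.44, 0.36, 0.27, 0.17]

gain = percent_rmse_reduction(
    rmse(observed, pred_reference),
    rmse(observed, pred_new),
)
```

With this definition, a positive value means the new model has the lower RMSE, and a value of 0 means the two models perform equally.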
Conclusion
An acceptable pseudo-continuous pedotransfer function was developed using linear regression with soil water matric suction as the sole predictor. Investigating whether a similar relationship can be established with state-of-the-art estimation methods may reduce the reliance on numerous soil water retention curve models. Using more detailed information, such as particle size distribution and FC, is recommended for estimating the SWRC with pseudo-continuous pedotransfer functions. Because the error distribution across the soil texture triangle depends on the type of input variables and on the method used, selecting an appropriate combination of input variables and method is important when creating pseudo-continuous pedotransfer functions for estimating the SWRC. The random forest method was significantly superior to linear regression, and using soil water matric suction, percentage of clay and sand, bulk density, and FC as predictors with the RF method yielded the best SWRC estimates.

Keywords [English]

  • Geometric standard deviation
  • Linear regression
  • Pseudo-continuous pedotransfer functions
  • Random forest
  • Soil moisture capacity