Estimation of Daily Reference Evapotranspiration Using Random Forest Optimized by Genetic Algorithm

Authors

1 Ph.D. Student, Dept. of Soil Sci. Eng., Faculty of Agric., University of Tabriz, Iran

2 Assist. Prof., Dept. of Water Eng., Faculty of Agric., University of Tabriz, Iran

3 Prof., Dept. of Soil Sci. Eng., Faculty of Agric., University of Tabriz, Iran

4 Dept. of Water Eng., Faculty of Agric., University of Tabriz, Iran

Abstract

Background and Objectives: Reference evapotranspiration (ET0) is an important parameter in the interactions among soil, vegetation, atmosphere, surface energy and water. Direct measurement of evapotranspiration is costly and time-consuming. On the other hand, modeling this complex process, in which many variables interact, is not feasible without multiple assumptions. The FAO Penman-Monteith method is used across a wide range of climatic and environmental conditions, but one of its weaknesses is its dependence on many meteorological parameters. It is therefore necessary to develop methods that require fewer meteorological variables yet estimate ET0 with suitable accuracy. Accordingly, the present study attempts to estimate ET0 with acceptable accuracy using machine learning models.
Methodology: Daily meteorological data for the period 2000-2020, including maximum and minimum air temperature (Tmax, Tmin), mean temperature (T), wind speed (U2), average relative humidity (RH), maximum and minimum relative humidity (RHmax, RHmin) and sunshine hours (n), were obtained from three stations in East Azerbaijan province (Tabriz, Sarab and Maragheh). Six scenarios were defined as input combinations. The random forest (RF) method was then applied in two forms: a single random forest, and a random forest whose effective parameters were optimized by the genetic algorithm (GA-RF). With the FAO Penman-Monteith model taken as the reference, the machine learning models were calibrated and validated for estimating ET0 at the studied stations. The performance of empirical equations in three groups, based on temperature (Hargreaves, Blaney-Criddle and Romanenko), radiation (Irmak) and mass transfer (Meyer), was also investigated. For the machine learning methods, 75% of the data were used for calibration and 25% for validation. Finally, using the statistical criteria of correlation coefficient (CC), scatter index (SI) and Willmott’s index of agreement (WI), a suitable machine learning method was identified for estimating reference evapotranspiration, and the most suitable combination of meteorological parameters for ET0 estimation was suggested.
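The three evaluation criteria can be illustrated with a minimal Python sketch. The abstract does not state the formulas, so the definitions below are the commonly used ones and are assumptions here: CC as the Pearson correlation, SI as the root mean square error divided by the mean of the reference values, and WI as Willmott's original index of agreement. The function names and the chronological form of the 75%/25% split are likewise hypothetical, chosen only for illustration.

```python
import math


def correlation_coefficient(reference, estimated):
    """Pearson correlation between reference and estimated ET0 (assumed CC definition)."""
    n = len(reference)
    mean_ref = sum(reference) / n
    mean_est = sum(estimated) / n
    cov = sum((r - mean_ref) * (e - mean_est) for r, e in zip(reference, estimated))
    dev_ref = math.sqrt(sum((r - mean_ref) ** 2 for r in reference))
    dev_est = math.sqrt(sum((e - mean_est) ** 2 for e in estimated))
    return cov / (dev_ref * dev_est)


def scatter_index(reference, estimated):
    """RMSE normalized by the mean reference value (assumed SI definition)."""
    n = len(reference)
    rmse = math.sqrt(sum((e - r) ** 2 for r, e in zip(reference, estimated)) / n)
    return rmse / (sum(reference) / n)


def willmott_index(reference, estimated):
    """Willmott's index of agreement; 1 means perfect agreement (assumed WI definition)."""
    mean_ref = sum(reference) / len(reference)
    numerator = sum((e - r) ** 2 for r, e in zip(reference, estimated))
    denominator = sum(
        (abs(e - mean_ref) + abs(r - mean_ref)) ** 2
        for r, e in zip(reference, estimated)
    )
    return 1.0 - numerator / denominator


def split_calibration_validation(records, calibration_fraction=0.75):
    """Split records in order into calibration and validation parts.

    A chronological split is an assumption; the abstract gives only the 75%/25% ratio.
    """
    cut = int(len(records) * calibration_fraction)
    return records[:cut], records[cut:]
```

Under these definitions, a lower SI and a CC and WI closer to 1 indicate closer agreement between the model estimates and the FAO Penman-Monteith reference values.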
Findings: The results showed that at all studied stations scenario 6 performed best, both for the single random forest (RF) and for the random forest optimized by the genetic algorithm (GA-RF). The meteorological parameters of this scenario are minimum and maximum air temperature, minimum and maximum relative humidity, sunshine hours and wind speed. Optimizing the RF-6 parameters with the genetic algorithm improved the statistical criteria at Tabriz station (CC from 0.990 to 0.991, SI from 0.103 to 0.098). At Sarab station, CC increased from 0.980 to 0.982, SI decreased from 0.140 to 0.132 and WI increased from 0.989 to 0.990. At Maragheh station, CC increased from 0.990 to 0.991, SI decreased from 0.103 to 0.098 and WI remained unchanged at 0.995. In general, the decreasing trend of the scatter index for the RF method from scenario 1 to scenario 6 can be attributed to the increasing number of input parameters. Among the three groups of empirical methods (air temperature, radiation and mass transfer), the temperature-based Blaney-Criddle method performed best. At all studied stations, the GA-RF model performed better than the empirical methods. In addition, GA-RF-5, which uses meteorological parameters similar to those of the Blaney-Criddle method, provided accurate ET0 estimations.
Conclusion: Determining daily evapotranspiration, and consequently accurately estimating plant water requirements, provides the basis for the proper design of irrigation systems by reducing installation costs and supporting a suitable program for the use of water resources in the agricultural sector. In the present study, meteorological data from the Tabriz, Sarab and Maragheh stations were used to evaluate the ability of the machine learning methods RF and GA-RF to estimate reference evapotranspiration. The results showed high accuracy for RF-6 and GA-RF-6 at all studied stations, and for Blaney-Criddle among the empirical models. In more detail, the genetic algorithm improved model accuracy by reducing the scatter index of GA-RF scenarios 1, 4, 5 and 6 at the Tabriz and Maragheh stations and of scenarios 1, 5 and 6 at the Sarab station. It can be concluded that both the RF and GA-RF models provided the most accurate estimates of daily reference evapotranspiration in East Azerbaijan province.

Keywords