The effect of meteorological parameters and satellite images in estimating daily reference evapotranspiration using a random forest model optimized with genetic algorithm

Authors

1 PhD student, Department of Water Science and Engineering, Faculty of Agriculture, University of Tabriz

2 University of Tabriz

3 Prof. Department of Remote Sensing and GIS, Faculty of Planning and Environmental Sciences, University of Tabriz

Abstract

Background and Objectives: Water resources management, especially irrigation practice, relies heavily on reference evapotranspiration (ET0). ET0 is the rate of evapotranspiration from a hypothetical reference surface with an assumed surface resistance of 70 s m-1, a height of 0.12 m, and an albedo of 0.23. The FAO-56 Penman-Monteith (P-M FAO-56) approach is the most commonly used method for calculating ET0. Although the P-M FAO-56 method is well established, it is inconvenient to apply because it requires a large amount of meteorological data from standard meteorological observation stations. Where complete climate data are unavailable, a model that needs fewer climatic inputs is highly desirable. Remote sensing methods have therefore been developed and improved over time to estimate ET0 at various spatial scales. In parallel, the research community has become increasingly interested in estimating ET0 with artificial intelligence (AI) models coupled with metaheuristic algorithms.
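As a point of reference for the paragraph above, the following is a minimal Python sketch of the standard daily P-M FAO-56 equation. The function names and the sample inputs are illustrative assumptions and do not come from this study; net radiation (Rn), soil heat flux (G), the vapour pressures (es, ea), and the psychrometric constant (gamma) are assumed to be precomputed from station data.

```python
import math


def saturation_vapour_pressure(t_celsius):
    """Saturation vapour pressure in kPa at air temperature t_celsius (deg C)."""
    return 0.6108 * math.exp(17.27 * t_celsius / (t_celsius + 237.3))


def slope_vapour_pressure_curve(t_celsius):
    """Slope of the saturation vapour pressure curve in kPa per deg C."""
    return 4098.0 * saturation_vapour_pressure(t_celsius) / (t_celsius + 237.3) ** 2


def fao56_pm_et0(rn, g, t_mean, u2, es, ea, gamma):
    """Daily reference evapotranspiration in mm/day.

    rn, g  : net radiation and soil heat flux (MJ m-2 day-1)
    t_mean : mean daily air temperature at 2 m (deg C)
    u2     : wind speed at 2 m (m/s)
    es, ea : saturation and actual vapour pressure (kPa)
    gamma  : psychrometric constant (kPa per deg C)
    """
    delta = slope_vapour_pressure_curve(t_mean)
    radiation_term = 0.408 * delta * (rn - g)
    aerodynamic_term = gamma * 900.0 / (t_mean + 273.0) * u2 * (es - ea)
    return (radiation_term + aerodynamic_term) / (delta + gamma * (1.0 + 0.34 * u2))


# Illustrative inputs only (not data from the study area).
et0 = fao56_pm_et0(rn=14.33, g=0.14, t_mean=30.2, u2=2.0,
                   es=4.42, ea=2.85, gamma=0.0674)
print(round(et0, 2))
```

The number of required inputs in this one function illustrates why the method is hard to apply where station records are incomplete.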
Methodology: This study estimates daily reference evapotranspiration (ET0) with two data-driven models, random forest (RF) and RF hybridized with genetic algorithm optimization (GA-RF), using different combinations of inputs from meteorological station data and MODIS satellite imagery. It also evaluates the correlation of the input variables with ET0 and investigates whether a simple and accurate machine learning model can be trained on satellite image data when meteorological data are scarce or absent. ET0 was determined for the period 2003-2021 using land surface temperature (LST) and leaf area index (LAI) acquired from the MODIS sensor, together with Tabriz meteorological station data: maximum and minimum air temperature (Tmax, Tmin), average air temperature (T), wind speed (U2), average relative humidity (RH), maximum and minimum relative humidity (RHmax, RHmin), and sunshine hours (n). Daily LST for the study area was extracted from the Terra (MOD11A1) and Aqua (MYD11A1) products. Because cloud cover left gaps in the LST values, the Terra and Aqua LST data were combined. LAI was obtained from the MODIS MCD15A3H version 6.1 product, which provides four-day data from the Terra and Aqua satellites. The standard P-M FAO-56 method was taken as the base method for calculating daily reference evapotranspiration. The sets of input parameters were selected according to the cross-correlation of each parameter with the reference evapotranspiration obtained from the P-M FAO-56 equation.
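The abstract does not specify how the genetic algorithm is coupled to the random forest. One common arrangement, sketched below as an assumption, is to let the GA search over RF hyperparameters and score each candidate by its prediction error. The search space, the operator choices (tournament selection, uniform crossover, random-reset mutation, elitism), and the stand-in fitness function are all hypothetical; in practice the fitness would be something like the cross-validated RMSE of a random forest trained with the candidate hyperparameters.

```python
import random

# Hypothetical integer search space for three RF hyperparameters.
SEARCH_SPACE = {
    "n_estimators": (50, 500),
    "max_depth": (2, 30),
    "min_samples_leaf": (1, 20),
}


def random_individual(rng):
    return {k: rng.randint(lo, hi) for k, (lo, hi) in SEARCH_SPACE.items()}


def crossover(a, b, rng):
    # Uniform crossover: each gene is copied from one parent at random.
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in SEARCH_SPACE}


def mutate(individual, rng, rate=0.2):
    # Random-reset mutation: redraw a gene within its bounds.
    child = dict(individual)
    for k, (lo, hi) in SEARCH_SPACE.items():
        if rng.random() < rate:
            child[k] = rng.randint(lo, hi)
    return child


def tournament(scored, rng, k=3):
    # Pick k scored candidates at random and return the one with lowest error.
    return min(rng.sample(scored, k), key=lambda s: s[0])[1]


def genetic_search(fitness, pop_size=20, generations=30, seed=0):
    """Minimise fitness(params); returns (best_error, best_params)."""
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(pop_size)]
    best = None
    for _ in range(generations):
        scored = sorted(((fitness(ind), ind) for ind in population),
                        key=lambda s: s[0])
        if best is None or scored[0][0] < best[0]:
            best = scored[0]
        # Elitism: keep the current best, breed the rest of the population.
        population = [scored[0][1]] + [
            mutate(crossover(tournament(scored, rng),
                             tournament(scored, rng), rng), rng)
            for _ in range(pop_size - 1)
        ]
    return best


def toy_fitness(params):
    # Stand-in for a model error; NOT a real random forest evaluation.
    return (((params["n_estimators"] - 300) / 100) ** 2
            + ((params["max_depth"] - 12) / 5) ** 2
            + ((params["min_samples_leaf"] - 3) / 5) ** 2)


best_error, best_params = genetic_search(toy_fitness)
print(best_error, best_params)
```

Replacing `toy_fitness` with a function that trains and validates a random forest would turn this into a GA-tuned RF, at the cost of one model fit per candidate per generation.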
Findings: ET0 estimates from the two data-driven models, standalone random forest (RF) and RF hybridized with a genetic algorithm (GA-RF), were compared with ET0 calculated by the P-M FAO-56 equation. The results indicated that all of the studied input variables are highly correlated with the target variable. Against the P-M FAO-56 reference evapotranspiration, average air temperature showed the highest correlation (R2=0.903) and wind speed the lowest (R2=0.282). Of the two MODIS parameters, LAI and LST, LST had the higher correlation with ET0 (R2=0.865). Twelve scenarios for estimating ET0 were evaluated, each with a different set of input parameters. The first ten scenarios were defined according to the correlation between the parameters and ET0; the eleventh scenario is based only on satellite images, and the twelfth is based solely on weather station data. GA-RF-10 (R2=0.976, RMSE=0.200, MAPE=11.373, and MBE=0.028), which includes all input parameters, outperformed the other models. Among the standalone random forest models, RF-10 was the most accurate (R2=0.949, RMSE=0.293, MAPE=16.442, and MBE=0.017). Comparing scenario 11 (satellite image data) with scenario 12 (meteorological station data), scenario 12 was more accurate for both the RF (R2=0.922, RMSE=0.357, MAPE=20.712, and MBE=0.009) and GA-RF (R2=0.944, RMSE=0.306, MAPE=17.037, and MBE=0.013) models. Although satellite image parameters alone estimated ET0 less accurately than meteorological parameters alone, including them in the ET0 estimation produced more acceptable results, which demonstrates the importance of satellite image parameters.
Thus, satellite data can be useful for estimating ET0 and are recommended, particularly in areas without meteorological stations.
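The four statistics reported above can be computed as in the following sketch. The abstract does not give the formulas, so the definitions here are assumptions: R2 is taken as the coefficient of determination (some studies report the squared Pearson correlation instead), MAPE is expressed in percent, and MBE is the mean of predicted minus observed values. The sample numbers are illustrative only.

```python
import math


def evaluate(observed, predicted):
    """Return R2, RMSE, MAPE (percent), and MBE for paired series."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)   # total sum of squares
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        # MAPE is undefined when an observed value is zero.
        "MAPE": 100.0 * sum(abs(e / o) for e, o in zip(errors, observed)) / n,
        "MBE": sum(errors) / n,
    }


# Illustrative values only (not results from the study).
scores = evaluate(observed=[2.0, 4.0, 6.0], predicted=[2.5, 4.0, 5.5])
print(scores)
```

A positive MBE under this sign convention indicates that the model overestimates ET0 on average.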

Keywords