Estimation of Daily Evaporation from Class A Pan using Five Data Mining Methods (Case Study: Tabriz Synoptic Station)

Authors

1 Former M.Sc. student, Dept. of Soil Eng., Faculty of Agric., University of Tabriz, Iran

2 Assist. Prof, Dept. of Water Eng., Faculty of Agric., University of Tabriz, Iran

3 Prof, Dept. of Soil Eng., Faculty of Agric., University of Tabriz, Iran

4 Dept. of Water Eng., Faculty of Agric., University of Tabriz, Iran

Abstract

Background and Objectives: Evaporation is one of the main components of the hydrological cycle and one of the most influential climatic variables in arid areas such as Iran. Accurate estimation of the evaporation rate plays an important role in sustainable development and optimal management of water resources. Evaporation is an essential process that depends on meteorological variables such as solar radiation, air temperature, wind speed, relative humidity and atmospheric pressure, which in turn are related to the topography and climate of the region. The class A pan is one of the standard, direct tools for measuring evaporation and is used all over the world because it is easy to apply. However, at most stations accurate evaporation recording is not practical because of instrument limitations and maintenance problems. In addition, the temporal and spatial coverage of evaporation stations is limited compared with that of meteorological stations. Given these problems, using meteorological variables to estimate the rate of pan evaporation is useful. The impact of different climatic factors on changes in pan evaporation in different regions has not been fully understood, so relatively accurate estimation and prediction of this phenomenon is an effective step in the relevant fields. In recent years, a variety of intelligent systems and soft computing techniques, such as data mining methods, have been developed for estimating pan evaporation.
Methodology: In this study, meteorological data from the Tabriz station for the period 2003 to 2018 were used to estimate evaporation from the class A pan. For this purpose, simple correlations between the meteorological variables and class A pan evaporation were computed. Based on these correlations, at the studied station minimum temperature and relative humidity were inversely related to evaporation, whereas maximum and mean temperature were directly related to it. Ten combined input scenarios were then defined, and modeling was performed using support vector regression (SVR), Gaussian process regression (GPR), M5 tree, random forest (RF) and linear regression (LR) methods. In this study, 70% of the data were selected for training and 30% for testing. Finally, the performance of each method in estimating evaporation was evaluated using the root mean squared error (RMSE), mean absolute error (MAE), Nash-Sutcliffe coefficient (NS) and Akaike information criterion (AIC).
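As an illustration of the four evaluation criteria named above, the following is a minimal sketch in plain Python. The abstract does not state the formulas the authors used, so the definitions here are the conventional ones; in particular, the AIC is written in its common least-squares form, n·ln(SSE/n) + 2k, which is an assumption. The function names, the parameter count and the sample values are hypothetical and are not data from the study.

```python
import math


def rmse(obs, pred):
    """Root mean squared error, in the units of the observations (e.g. mm/day)."""
    n = len(obs)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / n)


def mae(obs, pred):
    """Mean absolute error, in the units of the observations."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def nash_sutcliffe(obs, pred):
    """Nash-Sutcliffe coefficient: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def aic(obs, pred, n_params):
    """AIC in the least-squares form n*ln(SSE/n) + 2k (assumed, not from the paper)."""
    n = len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    return n * math.log(sse / n) + 2 * n_params


# Hypothetical daily pan evaporation values (mm/day), for illustration only.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]

print("RMSE:", rmse(observed, predicted))
print("MAE:", mae(observed, predicted))
print("NS:", nash_sutcliffe(observed, predicted))
print("AIC:", aic(observed, predicted, n_params=2))
```

Lower RMSE, MAE and AIC and a higher NS indicate a better model. Because the AIC penalizes the number of parameters, it allows scenarios with different numbers of input variables to be compared on a more even footing.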
Findings: The results showed that the GPR10 model (RMSE = 1.90 mm/day, MAE = 1.48 mm/day, NS = 0.81) and the SVR10 model (RMSE = 1.92 mm/day, MAE = 1.51 mm/day, NS = 0.8) performed reasonably well in estimating daily evaporation from the class A pan. The GPR method estimated daily evaporation with the lowest error and the highest accuracy in all defined scenarios, and the SVR model ranked second. The statistical results for the random forest model were weaker even than those of linear regression. In general, scenario 10, which includes all meteorological variables, gave the best results among the defined scenarios, and scenario 1, which uses only minimum temperature as input, gave the weakest. Scenarios 6 to 10 were more accurate and had lower error, while the model structures with the fewest variables were the least accurate. Wind speed and solar radiation were identified as the most effective factors in estimating the evaporation rate from the class A pan.
Conclusion: Evaporation is one of the important processes and causes the loss of half of the precipitation in arid and semi-arid regions. Accordingly, knowing and modeling the amount of evaporation, one of the most important hydrological variables, is of great importance in agricultural research and in water and soil studies, and accurate estimation of this phenomenon is essential. In this study, meteorological data from the Tabriz station were used to assess the capability of machine learning methods. Evaporation was estimated using five data mining methods: SVR, GPR, M5, RF and LR. The evaluation criteria indicated that the GPR and SVR models using all meteorological variables performed more accurately than the others. Both are therefore recommended for estimating evaporation from the class A pan.

Keywords