Simulation Using Machine Learning and Multiple Linear Regression in Hydraulic Engineering

Authors

1 Deputy of Technical and Engineering, Plan and Budget Organization, Iran

2 Ph.D. student, Lorestan University

3 Department of Water Engineering, Kermanshah Branch, Islamic Azad University, Kermanshah, Iran

Abstract

Extended Abstract
Background and Objectives
Artificial intelligence (AI) models are powerful tools for modeling complex nonlinear problems, as numerous studies have shown, and they have been applied in many fields, including engineering and medicine. Their ease of use, speed, and accuracy compared with analytical and numerical methods have made them increasingly popular among researchers. Because water resources management is one of the major challenges of human life today, this study investigates the performance of AI and regression models on water resources problems. Many studies have addressed the modeling and parametric analysis of water resources; in this study, machine learning models were used to simulate the qualitative and quantitative parameters of water. The models used are the Self-Adaptive Extreme Learning Machine (SAELM), the Least Squares Support Vector Machine (LSSVM), the Adaptive Neuro-Fuzzy Inference System (ANFIS), and Multiple Linear Regression (MLR), all applied to predicting changes in hydrogeological parameters.
With the growing global population, access to safe drinking water has become one of the most important challenges. In Iran, which lies in a semi-arid region with low rainfall, this risk is felt more than ever. One serious problem is the leakage of salinity into groundwater resources. This study therefore simulates the dynamic flow of salinity leaking into the freshwater resources of the coastal aquifer using AI and statistical models, and reports the simulation results and the accuracy of the models. The study area is the Mighan Wetland and the Mighan aquifer in Markazi Province, where annual rainfall is low.
According to records from the synoptic and rain gauge stations in the region, rainfall ranges from a maximum of 461 mm in the northeast to a minimum of 208 mm in the center of the Arak plain. The hydraulic outlet of the aquifer to the Mighan plain is located in the center of the plain. Water entering the Mighan plain leaves the system by evaporation from the water table. Because the lake water is saline, observation wells were used for sampling. The wells are located in an area called Vismeh, near the lake.

Methodology
The qualitative and quantitative parameters considered in this study were salinity, total dissolved solids (TDS), chloride ion (Cl), sampling time (t), electrical conductivity (EC), and groundwater level (GWL). Four models were used for the simulation: the Adaptive Neuro-Fuzzy Inference System (ANFIS), the Least Squares Support Vector Machine (LSSVM), the Self-Adaptive Extreme Learning Machine (SAELM), and Multiple Linear Regression (MLR). Data from 173 months of sampling were used; 70% of the samples were used to train the models and 30% to test them.
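
The 70/30 split and the MLR fit can be illustrated with a minimal Python sketch. It is not the authors' implementation: the data are synthetic stand-ins, the two predictors are hypothetical, the split is assumed to be chronological (the abstract does not say whether it was chronological or random), and the function names are introduced only for illustration.

```python
import random

def split_train_test(rows, train_fraction=0.7):
    """Split rows into a leading training part and a trailing test part.
    Assumption: a chronological split; the source does not specify this."""
    n_train = int(len(rows) * train_fraction)
    return rows[:n_train], rows[n_train:]

def fit_mlr(X, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.
    Returns [intercept, coef_1, coef_2, ...]."""
    A = [[1.0] + list(row) for row in X]  # design matrix with intercept column
    k = len(A[0])
    # Augmented system [X^T X | X^T y]
    M = [[sum(a[i] * a[j] for a in A) for j in range(k)]
         + [sum(a[i] * t for a, t in zip(A, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [v - f * w for v, w in zip(M[r], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]

def predict_mlr(beta, X):
    """Evaluate the fitted linear model on new predictor rows."""
    return [beta[0] + sum(b * x for b, x in zip(beta[1:], row)) for row in X]

# Synthetic stand-in for 173 monthly samples (NOT the study's data):
# two hypothetical predictors and a noise-free linear target.
random.seed(0)
N_MONTHS = 173
X = [[random.uniform(0, 10), random.uniform(0, 5)] for _ in range(N_MONTHS)]
y = [2.0 + 0.5 * a - 1.5 * b for a, b in X]

train, test = split_train_test(list(zip(X, y)))
beta = fit_mlr([x for x, _ in train], [t for _, t in train])
pred = predict_mlr(beta, [x for x, _ in test])
print(len(train), len(test))
```

With 173 monthly samples, a 70% training fraction gives 121 training and 52 test samples under this rounding rule.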

Findings
The simulation was performed with the artificial intelligence models and the regression model, and the results showed that the artificial intelligence models were more accurate. An uncertainty analysis was then carried out using the Wilson score method without continuity correction. The method uses the prediction error (ei), the mean prediction error (Mean), and the standard deviation of the error (Se). A positive mean error means that the model overestimates the target variable, and a negative mean error means that it underestimates it. The uncertainty analysis was carried out at a significance level of 5%. Overestimated and underestimated performance are abbreviated OS and US, respectively.
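
The quantities named above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' procedure: the error is assumed to be predicted minus observed (consistent with a positive mean indicating overestimation), Se is taken as the sample standard deviation, and the sample values are made up. The abstract does not say how the errors are converted into a proportion for the Wilson score interval, so `wilson_interval` is shown only in its standard form for a binomial proportion, with z = 1.96 for the 5% significance level.

```python
import math
import statistics

def error_summary(observed, predicted):
    """Return (errors, mean error, standard deviation of error, bias label).
    Assumption: e_i = predicted_i - observed_i."""
    errors = [p - o for o, p in zip(observed, predicted)]
    mean = statistics.mean(errors)
    se = statistics.stdev(errors)  # sample standard deviation (assumed)
    if mean > 0:
        bias = "OS"   # overestimated
    elif mean < 0:
        bias = "US"   # underestimated
    else:
        bias = "unbiased"
    return errors, mean, se, bias

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval without continuity correction for a proportion."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

# Made-up values for illustration only
observed = [10.0, 12.0, 11.0, 13.0]
predicted = [10.5, 12.5, 10.8, 13.6]
errors, mean, se, bias = error_summary(observed, predicted)
print(round(mean, 3), bias)

# Standard form of the interval for 5 successes in 10 trials
low, high = wilson_interval(5, 10)
print(round(low, 4), round(high, 4))
```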

Conclusions
The results showed that the different models were successful in predicting the water parameters. To evaluate the accuracy of the models comprehensively, their performance was assessed with five approaches: 1) evaluation of predictions with accuracy charts, 2) performance evaluation with mathematical indices, 3) uncertainty analysis with the Wilson score method without continuity correction, 4) accuracy evaluation with error distribution charts, and 5) performance evaluation with discrepancy ratio (DR) charts. The results of each approach are summarized below.
Approach 1- Sixteen prediction accuracy charts were drawn; those for the most accurate models are shown in Figures 4 to 7. The most accurate simulation of the groundwater parameters was obtained with the SAELM model for GWL. The superior models were SAELM for GWL and EC, LSSVM for TDS, and MLR for salinity.
Approach 2- According to the performance indices, SAELM was the best model for simulating EC and GWL, LSSVM was the most accurate model for TDS, and MLR was the best model for salinity.
Approach 3- Uncertainty analysis was performed using the Wilson score method. It showed that the SAELM model underestimated (US), whereas the other superior models overestimated (OS).
Approach 4- Based on the error distribution charts, the SAELM and MLR models were the most accurate.
Approach 5- Based on the discrepancy ratio, the SAELM and MLR models were the most accurate in the simulation.
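
Approaches 2 and 5 can be illustrated with a short Python sketch. The abstract does not name the performance indices or define the discrepancy ratio, so the choices here are assumptions: root mean square error (RMSE), mean absolute error (MAE), and the Pearson correlation coefficient (R) stand in for the indices, and DR is taken as predicted divided by observed, so that values near 1 indicate an accurate prediction (some studies use the logarithm of this ratio instead). The sample values are made up.

```python
import math

def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)

def mae(observed, predicted):
    """Mean absolute error."""
    n = len(observed)
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / n

def pearson_r(observed, predicted):
    """Pearson correlation coefficient between observed and predicted values."""
    n = len(observed)
    mo = sum(observed) / n
    mp = sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    so = math.sqrt(sum((o - mo) ** 2 for o in observed))
    sp = math.sqrt(sum((p - mp) ** 2 for p in predicted))
    return cov / (so * sp)

def discrepancy_ratio(observed, predicted):
    """Assumed definition: DR_i = predicted_i / observed_i (observed must be nonzero)."""
    return [p / o for o, p in zip(observed, predicted)]

# Made-up values for illustration only
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.2, 3.8, 6.3, 7.6]

rmse_val = rmse(observed, predicted)
mae_val = mae(observed, predicted)
r_val = pearson_r(observed, predicted)
dr = discrepancy_ratio(observed, predicted)
print(round(rmse_val, 4), round(mae_val, 4), round(r_val, 4))
print([round(v, 2) for v in dr])
```

Under the assumed definition, a DR above 1 corresponds to an overestimated sample and a DR below 1 to an underestimated one.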

Keywords