Modeling and estimation of river suspended sediment load using support vector regression and the group method of data handling

Document Type : Research Paper

Author

Assistant Professor, Soil Conservation and Watershed Management Research Department, Markazi Agricultural and Natural Resources Research and Education Center, AREEO, Arak, Iran. Email: amir_24619@yahoo.com; Tel: 09188630923

Abstract

Introduction
Erosion and sediment deposition cause irreparable damage to hydraulic structures and water projects. Examples include sediment accumulation behind dams and the resulting loss of useful storage volume, the destruction of structures, reduced conveyance capacity, and higher maintenance costs for irrigation canals. Sediment transport also affects water quality indicators for drinking and agricultural use. Accurate estimation of river suspended sediment load is therefore important for water resources engineering, environmental management and water quality. Hydrological basin models have not estimated suspended sediment efficiently because of the many factors involved. Most simulation studies of suspended sediment rely only on the discharge at the basin outlet, and their results demonstrate the inefficiency of this approach. Developing new sediment estimation methods that are both sufficiently accurate and easy to use is therefore valuable. Intelligent systems based on fuzzy logic and neural networks are now widely applied to water engineering problems, including sedimentation, because they can model complex nonlinear phenomena. Given the importance of sediment transport for the optimal use of water resources and the design of dams, an accurate method for estimating suspended sediment load in rivers is needed.
Materials and Methods
The purpose of this research is to evaluate and compare two methods, the support vector machine (SVM) and the group method of data handling (GMDH), for estimating the suspended sediment load at the Pol Doab station on the Qarachai River in Markazi Province, and to compare them with the results of the sediment rating curve (SRC). For this purpose, daily discharge, sediment, temperature and rainfall data from the Shazand Pol Doab station were used, and 13 scenarios with different combinations of these parameters were defined. The results of the two methods were then compared with each other and with the results of the sediment rating curve, and the best method was suggested. Data and information were collected through library and field studies and a review of related sources. Records of temperature, rainfall, daily average discharge and daily measured sediment over a long-term statistical period of 40 years at the Pol Doab hydrometric station were obtained from the Meteorology and Regional Water Department of Markazi Province. The data were categorized and converted into the input format of the models. The sediment rating curve was drawn from the discharge and corresponding sediment values, and its equation was obtained. Appropriate patterns of input variables were selected by trial and error. Because these parameters are time series, the input patterns of the soft computing models should be designed around time lags, as in time series analysis and forecasting. The model was then run for each input and output pattern. In the next step, the most appropriate time lag of the input parameters was selected, namely the one giving a higher coefficient of determination (R2) and a lower root mean square error (RMSE). In this research, 70% of the data was used for training and 30% for validation and testing.
Finally, the two data-mining methods were compared with each other and with the sediment rating curve and the observational data.
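The lag-based input patterns and the 70/30 split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the one-step lag, the chronological (unshuffled) split, and the sample values are all assumptions. The chosen inputs, current discharge plus discharge and sediment from the previous step, follow the combination the Conclusions report as scenario 6.

```python
def make_lagged_patterns(discharge, sediment, lag=1):
    """Build input/target pairs: [Q(t), Q(t-lag), S(t-lag)] -> S(t)."""
    inputs, targets = [], []
    for t in range(lag, len(sediment)):
        inputs.append([discharge[t], discharge[t - lag], sediment[t - lag]])
        targets.append(sediment[t])
    return inputs, targets


def chronological_split(inputs, targets, train_percent=70):
    """Keep the first train_percent of the patterns for training, the rest for testing."""
    n_train = len(inputs) * train_percent // 100
    train = (inputs[:n_train], targets[:n_train])
    test = (inputs[n_train:], targets[n_train:])
    return train, test


# Made-up illustrative values, not data from the study.
discharge = [5.0, 7.5, 12.0, 9.0, 6.0, 4.5, 8.0, 15.0, 11.0, 7.0, 5.5]
sediment = [20.0, 45.0, 130.0, 70.0, 30.0, 15.0, 50.0, 210.0, 110.0, 40.0, 25.0]

X, y = make_lagged_patterns(discharge, sediment, lag=1)
(train_X, train_y), (test_X, test_y) = chronological_split(X, y)
```

Splitting in time order rather than at random keeps future observations out of the training set, which is one common choice for time series; the abstract does not state which kind of split was used.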
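As a rough sketch of the baseline and of the two evaluation metrics, the code below fits a rating curve and scores it. It assumes the common power-law form Qs = a * Q^b, fitted by least squares on log-transformed values; the abstract does not give the actual equation. The R2 shown is the 1 - SS_res/SS_tot definition, which may differ from the one used in the paper, and all names and sample values are hypothetical.

```python
import math


def fit_rating_curve(discharge, sediment):
    """Fit Qs = a * Q**b by ordinary least squares in log-log space (assumed form)."""
    xs = [math.log(q) for q in discharge]
    ys = [math.log(s) for s in sediment]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def rmse(observed, predicted):
    """Root mean square error, in the same units as the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, defined here as 1 - SS_res / SS_tot."""
    mean_o = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_o) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Synthetic values that follow Qs = 3 * Q**2 exactly, not data from the study.
discharge = [1.0, 2.0, 4.0, 8.0]
sediment = [3.0 * q ** 2 for q in discharge]

a, b = fit_rating_curve(discharge, sediment)
predicted = [a * q ** b for q in discharge]
curve_rmse = rmse(sediment, predicted)
curve_r2 = r_squared(sediment, predicted)
```

The same `rmse` and `r_squared` helpers could score any model's predictions against the observed sediment, which is how the three methods would be placed on a common footing.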
Results and Discussion
The results indicate acceptable performance of the methods used to predict suspended sediment amounts. Comparing the GMDH, SVR and sediment rating curve (SRC) models shows that GMDH and SVR are superior in predicting suspended sediment values for input pattern number 6. The GMDH model performed acceptably, with the highest coefficient of determination (R2 = 0.99) and the lowest root mean square error (83 tons per day). The GMDH model can therefore serve as a powerful and fast model for suspended sediment. Both data-mining methods estimate the river's suspended sediment load far more efficiently and accurately than the sediment rating curve, and can be used as an alternative to it. However, because of climate change and droughts, industrial development, land use change and changes in watershed morphology, the results cannot be applied indefinitely, and the models need to be updated periodically. Another weakness of the models is that, as the number of layers increases, accuracy improves but the relationships produced between the input and output variables become very complicated.
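The growth in complexity with added layers can be seen from the textbook form of a GMDH neuron. The abstract does not state which partial description was used, so the quadratic two-input polynomial below is an assumption; the symbols x_i, x_j and a_0 to a_5 are introduced here for illustration.

```latex
\hat{y} = a_0 + a_1 x_i + a_2 x_j + a_3 x_i x_j + a_4 x_i^2 + a_5 x_j^2
```

Each layer feeds the outputs of such neurons into the next layer as new inputs, so after k layers the overall relationship can be a polynomial of degree up to 2^k in the original variables.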
Conclusions
The results of modeling with SVM showed that scenario number 6, which includes discharge at the current time step plus discharge and sediment with a one-step time lag, performs better than the other scenarios, with the highest coefficient of determination (R2 = 0.98) and the lowest root mean square error (RMSE = 185 kg per day). In the next step, the best input pattern selected with the SVM model was used as the input of the GMDH model. The results were compared with those of the SVM model and the sediment rating curve. They show the acceptable performance of the GMDH model, which had the highest R2 (0.99) and the lowest RMSE (83 kg per day) of the three methods. Both data-mining methods provide much better results than the sediment rating curve. The coefficients and functions used to calibrate the intelligent models in this research can be very useful for estimating suspended sediment at nearby ungauged stations with similar geological and hydrological conditions at the regional level.

Keywords

Main Subjects