Abstract
Prediction of river water quality indicators (RWQIs) using artificial intelligence (AI)–based hybrid soft computing techniques can provide the forecasts required for efficient river health planning and management. This study describes the development of a novel AI-based relative weighted ensemble (AIRWE) hybrid model for predicting two critical RWQIs: biochemical oxygen demand (BOD) and total coliform (TC). The study involved comprehensive water quality (WQ) monitoring at 30 locations along the Damodar River to establish baseline data and delineate the WQ. Representative input features showing a strong association with BOD and TC were identified using Spearman’s rank-coupled orthogonal linear transformation (SOT). The relative weighted ensemble (RWE) method was applied to determine the relative weights of the base learners in the AIRWE model. Statistical analysis showed that the developed model was more efficient and accurate than traditional techniques for predicting BOD (R², 0.97; RMSE, 0.06; MAE, 0.04) and TC (R², 0.98; RMSE, 0.06; MAE, 0.05). The t-statistic (BOD, 0.02; TC, 0.47) was lower than the critical value t_crit (1.672), indicating that the predictions were unbiased. The SOT technique removed data noise and multicollinearity, whereas RWE curtailed the limitations of the individual models and produced more reliable predictions. The model achieved 97% accuracy with high precision (96%) in classifying river water quality for various end uses. The study offers researchers, scientists, and decision-makers a novel approach for modeling and predicting various environmental attributes.
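The abstract does not give the formula used to derive the relative weights, so the following is only a minimal sketch of one common weighting scheme, not the authors' RWE method. It assumes each base learner's weight is proportional to the inverse of its validation RMSE; the function names (`relative_weights`, `ensemble_predict`) and the data are hypothetical, chosen for illustration. It also shows the three error metrics reported above (R², RMSE, MAE).

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def relative_weights(y_val, base_preds):
    """ASSUMPTION: weight of each base learner is proportional to 1/RMSE
    on validation data, normalised so that the weights sum to 1."""
    inverse_errors = [1.0 / max(rmse(y_val, p), 1e-12) for p in base_preds]
    total = sum(inverse_errors)
    return [w / total for w in inverse_errors]

def ensemble_predict(weights, base_preds):
    """Weighted average of the base learners' predictions, sample by sample."""
    n_samples = len(base_preds[0])
    return [sum(w * p[i] for w, p in zip(weights, base_preds))
            for i in range(n_samples)]

# Hypothetical validation targets and two base-learner outputs (not study data).
y_val = [2.0, 3.0, 4.0, 5.0]
pred_a = [2.1, 3.1, 4.1, 5.1]   # smaller errors, so it should get the larger weight
pred_b = [1.8, 3.2, 3.8, 5.2]   # larger errors

weights = relative_weights(y_val, [pred_a, pred_b])
blended = ensemble_predict(weights, [pred_a, pred_b])

print("weights:", [round(w, 3) for w in weights])
print("RMSE:", round(rmse(y_val, blended), 4),
      "MAE:", round(mae(y_val, blended), 4),
      "R2:", round(r2(y_val, blended), 4))
```

In this toy case the blended prediction has a lower RMSE than either base learner because their errors partly cancel. Scoring the ensemble on the same data used to set the weights is optimistic; a held-out test set would be needed for an honest estimate.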
Data Availability
Data will be made available on request.
Acknowledgements
The authors acknowledge the Indian Institute of Technology (Indian School of Mines), Dhanbad, India, and Harcourt Butler Technical University, Kanpur, India, for providing the research facilities.
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by Suyog Gupta. The first draft of the manuscript was written by Suyog Gupta and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Ethical approval
Not applicable.
Consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Responsible Editor: Marcus Schulz
About this article
Cite this article
Gupta, S., Gupta, S.K. Development of AI-based hybrid soft computing models for prediction of critical river water quality indicators. Environ Sci Pollut Res 31, 27829–27845 (2024). https://doi.org/10.1007/s11356-024-32984-w
DOI: https://doi.org/10.1007/s11356-024-32984-w