Abstract
Water security has attracted worldwide attention, and water quality assessment is central to ensuring it. In China, the South-North Water Transfer Middle Route is an important water supply route for the Beijing-Tianjin-Hebei region and is crucial to the region's economic development and public health. An effective water quality prediction method is therefore essential to prevent water quality degradation and pollution along the route. In this paper, using water quality data from 13 automatic monitoring stations on the Middle Route, we propose a water quality prediction method based on k-nearest-neighbor probability rough sets and the PSO-LSTM algorithm. Specifically, we first introduce a novel model of k-nearest-neighbor probabilistic rough sets (KNPRSs) by combining the k-nearest-neighbor algorithm with probabilistic rough sets. We then develop an attribute reduction approach based on KNPRSs, which eliminates redundant attributes in water quality assessment and retains the valuable feature attributes. Next, we propose a water quality prediction method based on an LSTM neural network whose hyperparameters are adaptively optimized by the PSO algorithm to improve prediction accuracy. Finally, we select the historical data of three automatic monitoring stations along the route, take six water quality indicators with practical forecasting value as prediction targets, and conduct a comparison experiment.
The experimental results show that the KNPRSs-PSO-LSTM model can fully extract the key characteristics of water quality attributes and can predict different target indicators at different stations. The model is reliable, stable and efficient, effectively improves prediction accuracy, and can be applied to water quality forecasting and early warning for the South-North Water Transfer Middle Route.
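To illustrate how a PSO algorithm can adaptively tune hyperparameters, the following is a minimal sketch of a basic global-best particle swarm optimizer. It is not the authors' implementation. The function names (`pso_minimize`, `toy_validation_error`), the two tuned quantities (hidden units and log10 learning rate), the search bounds, and the swarm settings are all assumptions made for illustration. The objective is a hypothetical stand-in for "train an LSTM with these hyperparameters and return its validation error", so that the example runs without a deep learning library.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=60,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` over a box with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the box, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position and value.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    # The swarm shares the best position found by any particle.
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull to personal best + pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move, then clamp the coordinate to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_validation_error(params):
    """Hypothetical stand-in for an LSTM's validation error (assumption).

    Pretends the best setting is 64 hidden units and a learning rate of 1e-3.
    """
    hidden_units, log10_lr = params
    return ((hidden_units - 64.0) / 64.0) ** 2 + (log10_lr + 3.0) ** 2


if __name__ == "__main__":
    # Assumed search box: hidden units in [8, 256], log10(lr) in [-5, -1].
    search_bounds = [(8.0, 256.0), (-5.0, -1.0)]
    best, err = pso_minimize(toy_validation_error, search_bounds)
    print("best hyperparameters:", best, "objective:", err)
```

In a real pipeline, the toy objective would be replaced by a function that builds and trains the LSTM on the reduced attribute set and returns a validation error. Integer-valued hyperparameters such as the number of hidden units would need rounding before use.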
Data Availability and Access
The data that has been used is confidential.
Acknowledgements
The work described in this paper was supported by grants from the National Natural Science Foundation of China (Grant nos. 11971365 and 11571010) and the Key Project of Guangxi Natural Science Foundation (Grant No. 2023GXNSFDA026006).
Author information
Contributions
Minrui Huang: Methodology, Investigation, Writing (original draft). Bao Qing Hu: Methodology, Writing (review and editing). Haibo Jiang: Methodology, Writing (review and editing). Bo Wen Fang: Methodology, Investigation, Writing (review).
Ethics declarations
Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Ethical and informed consent for data used
This article does not contain any studies with human participants or animals performed by any of the authors.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Huang, M., Hu, B.Q., Jiang, H. et al. A water quality prediction method based on k-nearest-neighbor probability rough sets and PSO-LSTM. Appl Intell 53, 31106–31128 (2023). https://doi.org/10.1007/s10489-023-05024-2
DOI: https://doi.org/10.1007/s10489-023-05024-2