Abstract
Hydrologists often encounter the problem of missing values in rainfall and runoff databases. They tend to use the normal ratio or distance power method to deal with it. However, these methods are time consuming and, most of the time, less accurate. In this paper, two neighbor-based imputation methods, namely K-nearest neighbor (KNN) and Gaussian mixture model based KNN imputation (GMM-KNN), were explored for filling gaps in a rainfall and runoff database. Missing entries were inserted randomly into the database at rates of 2%, 5%, 10%, 15% and 20%. The pros and cons of the two methods were compared and discussed. The selected study area is the Bedup Basin, located in Samarahan Division, Sarawak, East Malaysia. The GMM-KNN imputation method was observed to give the best estimation accuracy for the missing rainfall and runoff data.
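The evaluation procedure described in the abstract (hide a known fraction of entries at random, impute them, then compare against the true values) can be illustrated with a minimal sketch. This is not the authors' implementation: it covers plain KNN imputation only and omits the GMM step. The function names (`mask_random`, `knn_impute`, `rmse`), the distance measure, the choice of `k = 3`, the RMSE score and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def mask_random(data, fraction, seed=0):
    """Return a copy of data with about `fraction` of cells set to None,
    plus the list of (row, column) positions that were hidden."""
    rng = random.Random(seed)
    cells = [(i, j) for i in range(len(data)) for j in range(len(data[0]))]
    hidden = rng.sample(cells, int(round(fraction * len(cells))))
    masked = [list(row) for row in data]
    for i, j in hidden:
        masked[i][j] = None
    return masked, hidden


def distance(a, b):
    """Root-mean-square difference over the columns observed in both rows
    (an assumed distance; infinite when the rows share no observed column)."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))


def knn_impute(masked, k=3):
    """Fill each missing cell with the mean of that column over the k nearest
    rows that have the column observed; fall back to the column mean."""
    filled = [list(row) for row in masked]
    for i, row in enumerate(masked):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = sorted(
                (distance(row, other), other[j])
                for r, other in enumerate(masked)
                if r != i and other[j] is not None
            )
            nearest = [v for d, v in candidates[:k] if d < math.inf]
            if not nearest:
                # No comparable neighbor: use all observed values in the column.
                nearest = [v for _, v in candidates]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


def rmse(truth, filled, hidden):
    """Root-mean-square error over the cells that were hidden."""
    if not hidden:
        return 0.0
    total = sum((truth[i][j] - filled[i][j]) ** 2 for i, j in hidden)
    return math.sqrt(total / len(hidden))


# Synthetic stand-in for a rainfall/runoff table (rows = days, columns = gauges).
rng = random.Random(42)
table = [[10.0 + d + rng.random() for _ in range(4)] for d in range(50)]

# Missing-data rates taken from the abstract.
for rate in (0.02, 0.05, 0.10, 0.15, 0.20):
    masked, hidden = mask_random(table, rate, seed=1)
    score = rmse(table, knn_impute(masked, k=3), hidden)
    print(f"{rate:.0%} missing: RMSE = {score:.3f}")
```

Because the hidden cells are known, the error can be computed directly for each missing-data rate, which is what makes a comparison between imputation methods possible.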
Acknowledgments
The authors sincerely acknowledge the Department of Irrigation and Drainage (DID), Sarawak, Malaysia for providing the rainfall and runoff data in this study. The authors wish to thank Universiti Teknologi Malaysia (UTM) under Research University Grant Vot-20H04, Malaysia Research University Network (MRUN) Vot 4L876 and the Fundamental Research Grant Scheme (FRGS) Vot 5F073 supported under Ministry of Education Malaysia for the completion of the research. The works were also supported by the SPEV project, University of Hradec Kralove, FIM, Czech Republic (ID: 2102–2019). We are also grateful for the support of Ph.D. student Sebastien Mambou in consultations regarding application aspects.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Chiu, P.C., Selamat, A., Krejcar, O. (2019). Infilling Missing Rainfall and Runoff Data for Sarawak, Malaysia Using Gaussian Mixture Model Based K-Nearest Neighbor Imputation. In: Wotawa, F., Friedrich, G., Pill, I., Koitz-Hristov, R., Ali, M. (eds) Advances and Trends in Artificial Intelligence. From Theory to Practice. IEA/AIE 2019. Lecture Notes in Computer Science, vol 11606. Springer, Cham. https://doi.org/10.1007/978-3-030-22999-3_3
DOI: https://doi.org/10.1007/978-3-030-22999-3_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-22998-6
Online ISBN: 978-3-030-22999-3
eBook Packages: Computer Science, Computer Science (R0)