Abstract
This paper focuses on applying the main data mining tools, methods, models and technologies to the basic data processing tasks of the telecommunication industry. The main forecasting task in this industry is to predict the class and volume of services that subscribers will need, as well as the capacity of the required engineering equipment. Regression and forecasting models based on the Facebook Prophet module are proposed to account for seasonal effects in the data, and their results are compared with those obtained using an LSTM network. The classification task concerns churn prediction. The authors identify decision trees, random forest, logistic regression, neural networks, support vector machines and gradient boosting as promising methods for classifying subscribers by their preferences and services, as well as by their tendency to churn. A dynamic approach based on dynamic models of survival theory is proposed for predicting the time of churn. Next, the task of forecasting the volume and class of services that subscribers will use while roaming is solved, and the results of applying regression models and data mining methods are presented. All the proposed methods are compared and evaluated using the necessary statistical characteristics, and an interpretation of the modelling results for practical decisions is offered. Finally, a client-service information technology with the functionality needed to solve all these tasks is proposed.
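To make the survival-theory view of churn timing concrete, the sketch below estimates a survival curve from subscriber tenures. It uses the standard Kaplan-Meier estimator as a stand-in; the abstract does not specify which dynamic survival models the authors use, so this is an illustration under that assumption and not their method. The function name `kaplan_meier` and the toy data are hypothetical.

```python
from collections import Counter


def kaplan_meier(durations, churned):
    """Kaplan-Meier survival estimate.

    durations: observed tenure of each subscriber (e.g. in months).
    churned:   True if the subscriber churned at that time,
               False if the observation was censored (still active).
    Returns a list of (time, survival probability) pairs, one per
    time at which at least one churn event occurred.
    """
    events = Counter(t for t, c in zip(durations, churned) if c)
    exits = Counter(durations)      # churned or censored at each time
    at_risk = len(durations)        # subscribers still under observation
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical toy data: eight subscribers, four observed churn events.
durations = [2, 3, 3, 5, 8, 8, 12, 12]
churned = [True, True, False, True, True, False, False, False]
curve = kaplan_meier(durations, churned)
for t, s in curve:
    print(f"month {t}: estimated share of subscribers retained = {s:.3f}")
```

Censored subscribers (those who have not yet churned) still count toward the at-risk total until their last observed month, which is what distinguishes a survival estimate from a plain churn rate.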
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Kuznietsova, N., Bidyuk, P., Kuznietsova, M. (2022). Data Mining Methods, Models and Solutions for Big Data Cases in Telecommunication Industry. In: Babichev, S., Lytvynenko, V. (eds) Lecture Notes in Computational Intelligence and Decision Making. ISDMCI 2021. Lecture Notes on Data Engineering and Communications Technologies, vol 77. Springer, Cham. https://doi.org/10.1007/978-3-030-82014-5_8
DOI: https://doi.org/10.1007/978-3-030-82014-5_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-82013-8
Online ISBN: 978-3-030-82014-5
eBook Packages: Intelligent Technologies and Robotics (R0)