Performance Degradation of Multi-class Classification Model Due to Continuous Evolving Data Streams

  • Conference paper
Innovative Systems for Intelligent Health Informatics (IRICT 2020)

Abstract

Online machine learning plays a pivotal role in the Fourth Industrial Revolution (IR 4.0), which requires real-time analysis (classification or prediction) of streaming data. However, most data streams exhibit nonstationary characteristics, such as concept drift and class imbalance, which adversely affect the accuracy of classification models. Accuracy degrades even further when the two issues occur at the same time (the joint problem). Some efforts have been made in the literature to cope with the joint problem of class imbalance and concept drift in online learning, but the existing solutions are limited to binary classification and do not work for multi-class classification. Moreover, the literature does not establish the exact correlation between the critical factors of concept drift and class imbalance, and the tuning parameters of multi-class classification models that could help improve classification accuracy remain unknown. Resolving the joint problem in online multi-class classification therefore requires determining this correlation and identifying these tuning parameters, which would enable a more dynamic approach to avoiding performance degradation. Accordingly, this study aims to determine the correlation between concept drift and class imbalance, identify the tuning parameters of multi-class classification models, and propose a dynamic solution based on these findings. The proposed dynamic approach could be used in various real-time stream analysis applications based on online machine learning, as required by IR 4.0.
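
To make the joint problem concrete, the following minimal sketch (not the authors' method) simulates a three-class stream with imbalanced class priors and one abrupt concept drift, then measures test-then-train (prequential) accuracy before and after the drift. The synthetic data generator, the class priors, the drift point, and the `OnlineNearestCentroid` learner are all assumptions chosen for illustration; none of them comes from the paper.

```python
import random


def make_stream(n, drift_at, seed=0):
    """Yield (x, y) pairs from a synthetic 3-class stream (hypothetical setup)."""
    rng = random.Random(seed)
    priors = [0.80, 0.15, 0.05]          # class imbalance: class 2 is rare
    before = {0: 0.0, 1: 5.0, 2: 10.0}   # class means before the drift
    after = {0: 5.0, 1: 10.0, 2: 0.0}    # class means after the abrupt drift
    for t in range(n):
        y = rng.choices([0, 1, 2], weights=priors)[0]
        means = before if t < drift_at else after
        yield rng.gauss(means[y], 1.0), y


class OnlineNearestCentroid:
    """Incremental nearest-centroid classifier with no forgetting mechanism."""

    def __init__(self):
        self.count = {}   # samples seen per class
        self.mean = {}    # running mean of the feature per class

    def predict(self, x):
        if not self.mean:
            return None   # nothing learned yet
        return min(self.mean, key=lambda c: abs(x - self.mean[c]))

    def learn(self, x, y):
        n = self.count.get(y, 0) + 1
        m = self.mean.get(y, 0.0)
        self.count[y] = n
        self.mean[y] = m + (x - m) / n   # incremental mean update


def prequential_accuracy(n=4000, drift_at=2000, seed=0):
    """Test on each sample first, then train on it; report accuracy per phase."""
    model = OnlineNearestCentroid()
    hits = {"before": 0, "after": 0}
    totals = {"before": 0, "after": 0}
    for t, (x, y) in enumerate(make_stream(n, drift_at, seed)):
        phase = "before" if t < drift_at else "after"
        hits[phase] += int(model.predict(x) == y)
        totals[phase] += 1
        model.learn(x, y)
    return {p: hits[p] / totals[p] for p in hits}


if __name__ == "__main__":
    print(prequential_accuracy())
```

Because this learner never forgets old samples, the centroids built before the drift keep dominating afterwards, so accuracy in the second phase falls sharply. The rare class also has the fewest samples from which to recover, which hints at why the two issues compound each other.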

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Palli, A.S., Jaafar, J., Hashmani, M.A. (2021). Performance Degradation of Multi-class Classification Model Due to Continuous Evolving Data Streams. In: Saeed, F., Mohammed, F., Al-Nahari, A. (eds) Innovative Systems for Intelligent Health Informatics. IRICT 2020. Lecture Notes on Data Engineering and Communications Technologies, vol 72. Springer, Cham. https://doi.org/10.1007/978-3-030-70713-2_63
