GPU-Accelerated Extreme Learning Machines for Imbalanced Data Streams with Concept Drift

Published: 01 June 2016

Abstract

Mining data streams is one of the most vital fields in the current era of big data. Continuously arriving data may pose various problems connected to their volume, variety or velocity. In this paper we focus on two important difficulties embedded in the nature of data streams: their non-stationary nature and skewed class distributions. Such a scenario requires a classifier that can rapidly adapt to concept drift and remain robust to the class imbalance problem. We propose to use an online version of the Extreme Learning Machine, enhanced with an efficient drift detector and a method to alleviate the bias towards the majority class. We investigate three approaches based on undersampling, oversampling and cost-sensitive adaptation. Additionally, to allow rapid updating of the proposed classifier, we show how to implement online Extreme Learning Machines on a GPU. The proposed approach allows for highly efficient mining of high-speed, drifting and imbalanced data streams, with significant acceleration offered by GPU processing.
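
The full text is not included here, but the core building block the abstract names, the online sequential Extreme Learning Machine, follows a standard recursive least-squares update. The sketch below is a minimal illustration of that generic OS-ELM update in NumPy, not the authors' implementation; the class name OSELM, the tanh activation, and the suggestion to swap in the largely API-compatible CuPy package for GPU execution are assumptions made only for illustration.

    # Minimal OS-ELM sketch (illustrative, not the paper's code).
    # Hidden-layer weights W and biases b are random and fixed; only the output
    # weights beta are learned, first from an initial batch and then recursively
    # from each arriving chunk. Replacing numpy with the largely API-compatible
    # cupy package would run the dense matrix algebra on a GPU.
    import numpy as np

    class OSELM:
        def __init__(self, n_inputs, n_hidden, n_outputs, seed=0):
            rng = np.random.default_rng(seed)
            self.W = rng.standard_normal((n_inputs, n_hidden))  # fixed random input weights
            self.b = rng.standard_normal(n_hidden)               # fixed random biases
            self.beta = np.zeros((n_hidden, n_outputs))          # output weights (learned)
            self.P = None                                         # running (H^T H)^-1 estimate

        def _hidden(self, X):
            return np.tanh(X @ self.W + self.b)                  # hidden-layer activations H

        def fit_initial(self, X0, T0):
            # Initial batch should contain at least n_hidden rows so H^T H is invertible.
            H = self._hidden(X0)
            self.P = np.linalg.inv(H.T @ H)
            self.beta = self.P @ H.T @ T0

        def partial_fit(self, X, T):
            # Recursive least-squares update for one incoming data chunk.
            H = self._hidden(X)
            K = np.linalg.inv(np.eye(H.shape[0]) + H @ self.P @ H.T)
            self.P = self.P - self.P @ H.T @ K @ H @ self.P
            self.beta = self.beta + self.P @ H.T @ (T - H @ self.beta)

        def predict(self, X):
            return self._hidden(X) @ self.beta

Because each partial_fit call reduces to a few dense matrix products and one small matrix inversion, the update maps naturally onto GPU BLAS routines, which is the kind of speed-up the abstract attributes to GPU processing. The imbalance handling (undersampling, oversampling or cost-sensitive weighting of the chunks) and the drift detector described in the abstract would wrap around these updates and are not shown in this sketch.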


Cited By

  • (2020) ELM-NET, a closer to practice approach for classifying the big data using multiple independent ELMs. Cluster Computing 23(2), 735-757. DOI: 10.1007/s10586-019-02957-7. Online publication date: 1 June 2020.


    Published In

    Procedia Computer Science, Volume 80, Issue C
    June 2016
    2452 pages
    ISSN:1877-0509
    EISSN:1877-0509

    Publisher

    Elsevier Science Publishers B. V.

    Netherlands

    Author Tags

    1. Big data
    2. Concept drift
    3. Data streams
    4. Extreme learning machines
    5. GPU
    6. Imbalanced data

    Qualifiers

    • Research-article
