Discussion and review on evolving data streams and concept drift adapting

  • Original Paper
  • Evolving Systems

Abstract

Recent advances in computational intelligent systems have focused on complex problems arising from dynamic environments. In an increasing number of real-world applications, data arrive as streams that may evolve over time, a phenomenon known as concept drift. Handling concept drift is becoming an attractive research topic that spans multiple disciplines, such as machine learning, data mining, ubiquitous knowledge discovery, and statistical decision theory. Consequently, a rich body of literature has been devoted to methods and techniques for handling drifting data. However, this literature is fairly dispersed and does not define guidelines for choosing an appropriate approach for a given application. Hence, the main objective of this survey is to present an accessible account of concept drift issues and related work, in order to help researchers from different disciplines consider concept drift handling in their applications. The survey covers different facets of existing approaches, prompts discussion, and helps readers identify the key criteria that allow them to properly design their own approach. For this purpose, a new categorization of the existing state of the art is presented, together with criticisms, future trends, and not-yet-addressed challenges.





Author information

Corresponding author

Correspondence to Imen Khamassi.


Cite this article

Khamassi, I., Sayed-Mouchaweh, M., Hammami, M. et al. Discussion and review on evolving data streams and concept drift adapting. Evolving Systems 9, 1–23 (2018). https://doi.org/10.1007/s12530-016-9168-2


  • DOI: https://doi.org/10.1007/s12530-016-9168-2
