Abstract
Recent advances in computational intelligent systems have focused on complex problems arising from dynamic environments. In an increasing number of real-world applications, data arrive as streams that may evolve over time, a phenomenon known as concept drift. Handling concept drift is an increasingly active research topic that spans multiple disciplines, such as machine learning, data mining, ubiquitous knowledge discovery, and statistical decision theory. A rich body of literature has therefore been devoted to methods and techniques for handling drifting data. However, this literature is fairly dispersed and does not define guidelines for choosing an appropriate approach for a given application. The main objective of this survey is thus to present an accessible account of concept drift issues and related work, in order to help researchers from different disciplines consider concept drift handling in their applications. The survey covers different facets of existing approaches, stimulates discussion, and helps readers identify the key criteria that allow them to properly design their own approach. For this purpose, a new categorization of the existing state of the art is presented, together with criticisms, future trends, and not-yet-addressed challenges.
Cite this article
Khamassi, I., Sayed-Mouchaweh, M., Hammami, M. et al. Discussion and review on evolving data streams and concept drift adapting. Evolving Systems 9, 1–23 (2018). https://doi.org/10.1007/s12530-016-9168-2
DOI: https://doi.org/10.1007/s12530-016-9168-2