Abstract
In most challenging data analysis applications, data evolve over time and must be analyzed in near real time. Patterns and relations in such data often change as well, so models built to analyze them quickly become obsolete. In machine learning and data mining this phenomenon is referred to as concept drift. The objective is to deploy models that diagnose themselves and adapt to changing data. This chapter provides an application-oriented view of concept drift research, with a focus on supervised learning tasks. First, we overview and categorize application tasks for which the problem of concept drift is particularly relevant. Then we construct a reference framework for positioning application tasks within a spectrum of problems related to concept drift. Finally, we discuss some promising research directions from the application perspective and present recommendations for application-driven concept drift research and development.
We dedicate this chapter to Dr. Alexey Tsymbal, who passed away suddenly and unexpectedly in November 2014 at the age of 39. Alexey contributed to the progress of data mining and medical informatics on several topics, including notable work on handling concept drift.
Acknowledgments
This work was partially supported by the European Commission through the project MAESTRA (Grant number ICT-2013-612944).
Copyright information
© 2016 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Žliobaitė, I., Pechenizkiy, M., Gama, J. (2016). An Overview of Concept Drift Applications. In: Japkowicz, N., Stefanowski, J. (eds) Big Data Analysis: New Algorithms for a New Society. Studies in Big Data, vol 16. Springer, Cham. https://doi.org/10.1007/978-3-319-26989-4_4
DOI: https://doi.org/10.1007/978-3-319-26989-4_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-26987-0
Online ISBN: 978-3-319-26989-4
eBook Packages: Engineering, Engineering (R0)