Abstract
Customer Segmentation aims to identify groups of customers that share similar interest or behaviour. It is an essential tool in marketing and can be used to target customer segments with tailored marketing strategies. Customer segmentation is often based on clustering techniques. This analysis is typically performed as a snapshot analysis where segments are identified at a specific point in time. However, this ignores the fact that customer segments are highly volatile and segments change over time. Once segments change, the entire analysis needs to be repeated and strategies adapted. In this paper we explore stream clustering as a tool to alleviate this problem. We propose a new stream clustering algorithm which allows to identify and track customer segments over time. The biggest challenge is that customer segmentation often relies on the transaction history of a customer. Since this data changes over time, it is necessary to update customers which have already been incorporated into the clustering. We show how to perform this step incrementally, without the need for periodic re-computations. As a result, customer segmentation can be performed continuously, faster and is more scalable. We demonstrate the performance of our algorithm using a large real-life case study.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
Implementation available at: http://www.matthias-carnein.de/userStream. For reproducability, we also show how to apply the algorithm on a public dataset.
References
Aggarwal, C.C., Han, J., Wang, J., Yu, P.S.: A framework for clustering evolving data streams. In: Proceedings of the 29th International Conference on Very Large Data Bases, VLDB 2003, Berlin, Germany, vol. 29, pp. 81–92. VLDB Endowment (2003)
Aggarwal, C.C., Han, J., Wang, J., Yu, P.S.: A framework for projected clustering of high dimensional data streams. In: Proceedings of the Thirtieth International Conference on Very Large Data Bases, VLDB 2004, Toronto, Canada, vol. 30, pp. 852–863. VLDB Endowment (2004)
Bifet, A., Gavalda, R., Holmes, G., Pfahringer, B.: Machine Learning for Data Streams with Practical Examples in MOA. MIT Press, Cambridge (2018)
Buttle, F.: Customer Relationship Management: Concepts and Technologies. Elsevier Butterworth-Heinemann, Oxford (2009)
Carnein, M., Assenmacher, D., Trautmann, H.: An empirical comparison of stream clustering algorithms. In: Proceedings of the ACM International Conference on Computing Frontiers (CF 2017), pp. 361–365. ACM (2017). https://doi.org/10.1145/3075564.3078887
Carnein, M., Trautmann, H.: Evostream - evolutionary stream clustering utilizing idle times. Big Data Res. (2018). https://doi.org/10.1016/j.bdr.2018.05.005
Carnein, M., Trautmann, H.: Optimizing data stream representation: an extensive survey on stream clustering algorithms. Bus. Inf. Syst. Eng. (BISE) (2019). https://doi.org/10.1007/s12599-019-00576-5
Chen, Y., Tu, L.: Density-based clustering for real-time stream data. In: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2007, San Jose, California, USA, pp. 133–142. ACM (2007). https://doi.org/10.1145/1281192.1281210
Hahsler, M., Bolaños, M.: Clustering data streams based on shared density between micro-clusters. IEEE Trans. Knowl. Data Eng. 28(6), 1449–1461 (2016). https://doi.org/10.1109/TKDE.2016.2522412
Kranen, P., Assent, I., Baldauf, C., Seidl, T.: Self-adaptive anytime stream clustering. In: 9th IEEE International Conference on Data Mining (ICDM 2009), pp. 249–258, December 2009. https://doi.org/10.1109/ICDM.2009.47
Lloyd, S.: Least squares quantization in PCM. IEEE Trans. Inf. Theor. 28(2), 129–137 (2006). https://doi.org/10.1109/TIT.1982.1056489
Rousseeuw, P.J., Kaufman, L.: Finding Groups in Data. Wiley, Hoboken (1990)
Schiffman, L.G., Hansen, H., Kanuk, L.L.: Consumer Behaviour: A European Outlook. Pearson Education, London (2008)
Wedel, M., Kamakura, W.A.: Market Segmentation, 2nd edn. Springer, USA (2000). https://doi.org/10.1007/978-1-4615-4651-1
Welford, B.P.: Note on a method for calculating corrected sums of squares and products. Technometrics 4(3), 419–420 (1962)
Zhang, T., Ramakrishnan, R., Livny, M.: BIRCH: a new data clustering algorithm and its applications. Data Min. Knowl. Discov. 1(2), 141–182 (1997)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Carnein, M., Trautmann, H. (2019). Customer Segmentation Based on Transactional Data Using Stream Clustering. In: Yang, Q., Zhou, ZH., Gong, Z., Zhang, ML., Huang, SJ. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2019. Lecture Notes in Computer Science(), vol 11439. Springer, Cham. https://doi.org/10.1007/978-3-030-16148-4_22
Download citation
DOI: https://doi.org/10.1007/978-3-030-16148-4_22
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-16147-7
Online ISBN: 978-3-030-16148-4
eBook Packages: Computer ScienceComputer Science (R0)