Driving Profiles Computation and Monitoring for Car Insurance CRM

Published: 23 August 2016

Abstract

Customer segmentation is one of the most traditional and valued tasks in customer relationship management (CRM). In this article, we explore the problem in the context of the car insurance industry, where the mobility behavior of customers plays a key role: different mobility needs, driving habits, and skills also imply different requirements (level of coverage provided by the insurance) and risks (of accidents). We describe a methodology to extract several indicators describing the driving profile of customers, and we provide a clustering-oriented instantiation of the segmentation problem based on such indicators. Then, we consider the availability of a continuous flow of fresh mobility data sent by the circulating vehicles, aiming at keeping our segments constantly up to date. We tackle a major scalability issue that emerges in this context when the number of customers is large, namely the communication bottleneck, by proposing and implementing a distributed monitoring solution that reduces communication between vehicles and company servers to the essential. We validate the framework on a large database of real mobility data coming from GPS devices on private cars. Finally, we analyze the privacy risks that the proposed approach might involve for the users, providing and evaluating a countermeasure based on data perturbation.
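
The communication-saving idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a simple drift-threshold rule, in which a vehicle transmits its indicator vector only when that vector has moved more than a fixed Euclidean distance from the last vector it reported, and the server assigns each reported vector to the nearest segment centroid. The names `Vehicle`, `observe`, `nearest_segment`, the two-dimensional indicator vectors, the centroids, and the threshold value are all hypothetical.

```python
import math
from typing import List, Optional


class Vehicle:
    """Hypothetical on-board node holding a vector of driving indicators."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold                    # assumed drift tolerance
        self.last_sent: Optional[List[float]] = None  # last vector reported
        self.messages = 0                             # transmissions so far

    def observe(self, indicators: List[float]) -> Optional[List[float]]:
        """Return the vector to transmit, or None if the update stays local."""
        drifted = (
            self.last_sent is None
            or math.dist(indicators, self.last_sent) > self.threshold
        )
        if drifted:
            self.last_sent = list(indicators)
            self.messages += 1
            return self.last_sent
        return None


def nearest_segment(indicators: List[float], centroids: List[List[float]]) -> int:
    """Server side: index of the closest segment centroid (Euclidean)."""
    return min(
        range(len(centroids)),
        key=lambda i: math.dist(indicators, centroids[i]),
    )


# Illustrative data only: two segments and a short stream of indicator vectors.
centroids = [[0.2, 0.1], [0.8, 0.7]]
car = Vehicle(threshold=0.15)
stream = [[0.20, 0.10], [0.22, 0.12], [0.25, 0.15], [0.70, 0.65], [0.72, 0.66]]

assignments = []
for vec in stream:
    sent = car.observe(vec)
    if sent is not None:  # the server only sees vectors that were transmitted
        assignments.append(nearest_segment(sent, centroids))

print(car.messages, assignments)
```

In this toy stream only two of the five updates are transmitted: the first report and the one where the profile moves toward the second segment. A fixed per-vehicle threshold is the simplest possible choice and is used here only to show where the savings come from.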

Supplementary Material

a14-nanni-apndx.pdf (nanni.zip)
Supplemental movie, appendix, image, and software files for Driving Profiles Computation and Monitoring for Car Insurance CRM

    Published In

    ACM Transactions on Intelligent Systems and Technology  Volume 8, Issue 1
    January 2017
    363 pages
    ISSN: 2157-6904
    EISSN: 2157-6912
    DOI: 10.1145/2973184
    Editor: Yu Zheng
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 23 August 2016
    Accepted: 01 April 2016
    Revised: 01 December 2015
    Received: 01 December 2014
    Published in TIST Volume 8, Issue 1

    Author Tags

    1. Driving profiles
    2. distributed clustering
    3. privacy

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Cited By

    • (2022) Impact of Driving Behavior on Commuter’s Comfort During Cab Rides: Towards a New Perspective of Driver Rating. ACM Transactions on Intelligent Systems and Technology 13:6 (1-25). DOI: 10.1145/3523063. Online publication date: 22-Mar-2022
    • (2021) Data Science Workflows for the Cloud/Edge Computing Continuum. Proceedings of the 1st Workshop on Flexible Resource and Application Management on the Edge (41-44). DOI: 10.1145/3452369.3463820. Online publication date: 25-Jun-2021
    • (2021) Data science: a game changer for science and innovation. International Journal of Data Science and Analytics. DOI: 10.1007/s41060-020-00240-2. Online publication date: 19-Apr-2021
    • (2019) Meaningful explanations of black box AI decision systems. Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence (9780-9784). DOI: 10.1609/aaai.v33i01.33019780. Online publication date: 27-Jan-2019
    • (2019) APDS: A framework for discovering movement pattern from trajectory database. International Journal of Distributed Sensor Networks 15:11 (155014771988816). DOI: 10.1177/1550147719888164. Online publication date: 14-Nov-2019
    • (2018) Data science at SoBigData: the European research infrastructure for social mining and big data analytics. International Journal of Data Science and Analytics 6:3 (205-216). DOI: 10.1007/s41060-018-0126-x. Online publication date: 15-May-2018
