Clustering Users by Their Mobility Behavioral Patterns

Published: 20 August 2019

Abstract

The immense stream of data generated by mobile devices in recent years makes it possible to learn more about human behavior and to provide mobile phone users with personalized services. In this work, we identify clusters of users who share similar mobility behavioral patterns. We analyze trajectories of semantic locations to find users who have a similar mobility “lifestyle,” even when they live in different areas. For this task, we propose a new grouping scheme called Lifestyle-Based Clustering (LBC). We represent the mobility of each user by a Markov model and calculate the Jensen–Shannon distances between pairs of users. The pairwise distances are collected in a similarity matrix, which is used for the clustering. To validate the unsupervised clustering task, we develop an entropy-based clustering measure, namely, an index that measures the homogeneity of mobility patterns within clusters of users. The analysis is validated on a real-world dataset that contains the location movements of 50,000 cellular phone users, analyzed over a two-month period.
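
The pipeline outlined in the abstract (a Markov model per user, then pairwise Jensen–Shannon distances) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the Markov order, the smoothing, or how the distance is aggregated over a whole model, so the sketch assumes a first-order chain over a fixed set of semantic locations, a small additive smoothing constant, and a plain average of row-wise distances. All function names and the toy trajectories are hypothetical.

```python
import math
from itertools import combinations


def transition_matrix(sequence, states, smoothing=1e-9):
    """Estimate first-order Markov transition probabilities from one trajectory.

    Assumption: additive smoothing keeps every entry positive, so a state
    that was never left gets a (near-)uniform row.
    """
    index = {s: i for i, s in enumerate(states)}
    n = len(states)
    counts = [[smoothing] * n for _ in range(n)]
    for src, dst in zip(sequence, sequence[1:]):
        counts[index[src]][index[dst]] += 1.0
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total for c in row])
    return matrix


def js_distance(p, q):
    """Jensen-Shannon distance: square root of the base-2 JS divergence."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(divergence, 0.0))  # guard against tiny negatives


def user_distance(P, Q):
    """Assumption: average the row-wise JS distances of two transition matrices."""
    return sum(js_distance(p, q) for p, q in zip(P, Q)) / len(P)


def distance_matrix(models):
    """Symmetric matrix of pairwise user distances with a zero diagonal."""
    n = len(models)
    D = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        D[i][j] = D[j][i] = user_distance(models[i], models[j])
    return D


# Toy semantic trajectories (hypothetical data).
STATES = ["home", "work", "leisure"]
users = {
    "u1": ["home", "work", "home", "work", "leisure", "home"],
    "u2": ["home", "work", "home", "work", "home", "leisure", "home"],
    "u3": ["home", "leisure", "leisure", "home", "leisure", "leisure"],
}
models = [transition_matrix(seq, STATES) for seq in users.values()]
D = distance_matrix(models)
for name, row in zip(users, D):
    print(name, [round(d, 3) for d in row])
```

In this toy example the two commuter-like users (u1, u2) end up closer to each other than either is to u3. The resulting matrix could then be passed to any clustering method that accepts precomputed distances; which method the paper uses is not stated in the abstract.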



Information & Contributors

Information

Published In

ACM Transactions on Knowledge Discovery from Data, Volume 13, Issue 4
August 2019
235 pages
ISSN:1556-4681
EISSN:1556-472X
DOI:10.1145/3343141

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 August 2019
Accepted: 01 March 2019
Revised: 01 March 2019
Received: 01 April 2018
Published in TKDD Volume 13, Issue 4


Author Tags

  1. Clustering trajectories
  2. Clustering evaluation

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • Israeli Ministry of Science and Technology
  • Koret Foundation Digital Living


