Abstract
Numerous methods for detecting communities on social networks have been proposed in recent years. However, the performance and scalability of the algorithms are not enough to work on the real-world large-scale social networks. In this paper, we propose Improved Speaker-listener Label Propagation Algorithm (iSLPA), an efficient and fully distributed method for community detection. It is implemented with Dpark, which is a Python version of Spark and a lightning-fast cluster computing framework. To the best of our knowledge, this is the first attempt at community detection on Dpark. It can automatically work on three kinds of networks: directed networks, undirected networks, and especially bipartite networks. In iSLPA, we propose a new initialization and updating strategy to improve the quality and scalability for detecting communities. And we conduct our experiments on real-world social networks datasets on both benchmark networks and Douban (http://www.douban.com) user datasets. Experimental results demonstrate that iSLPA has a comparable performance than SLPA, and have confirmed our algorithms is very efficient and effective on the overlapping community detection of large-scale networks.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in networks. Phys. Rev. E 69(2), 026113 (2004)
Lancichinetti, A., Fortunato, S.: Community detection algorithms: a comparative analysis (2009) cite arxiv:0908.1062
Lancichinetti, A., Fortunato, S., Kertész, J.: Detecting the overlapping and hierarchical community structure in complex networks. New J. Phys. 11(3), March 2009
Palla, G., Derényi, I., Farkas, I., Vicsek, T.: Uncovering the overlapping community structure of complex networks in nature and society. Nature 435(7043), 814–818 (2005)
Lee, C., Reid, F., McDaid, A., Hurley, N.: Detecting highly overlapping community structure by greedy clique expansion. In: Workshop on Social Network Mining and Analysis. (2010) cite arxiv:1002.1827
Wang, T., Qian, X., Xu, H.: An improved parallel hybrid seed expansion (PHSE) method for detecting highly overlapping communities in social networks. In: Motoda, H., Wu, Z., Cao, L., Zaiane, O., Yao, M., Wang, W. (eds.) ADMA 2013, Part I. LNCS, vol. 8346, pp. 385–396. Springer, Heidelberg (2013)
Lancichinetti, A., Radicchi, F., Ramasco, J.J., Fortunato, S.: Finding statistically significant communities in networks. CoRR abs/1012.2363 (2010)
Rosvall, M., Bergstrom, C.T.: Maps of random walks on complex networks reveal community structure. Proceedings of the National Academy of Sciences 105(4), 1118–1123 (2008)
Ronhovde, P., Nussinov, Z.: Multiresolution community detection for megascale networks by information-based replica correlations. Phys. Rev. E 80, 016109 (2009)
Raghavan, U.N., Albert, R., Kumara, S.: Near linear time algorithm to detect community structures in large-scale networks. Phys. Rev. E 76, 036106 (2007)
Leung, I.X.Y., Hui, P., Liò, P., Crowcroft, J.: Towards real-time community detection in large networks. Physical Review E 79(6), 066107+ (2009)
Liu, X., Murata, T.: Community detection in large-scale bipartite networks. In: IEEE Web Intelligence, pp. 50–57 (2009)
Gregory, S.: Finding overlapping communities in networks by label propagation. New Journal of Physics 12(10), 103018+ (2010)
Liu, X., Murata, T.: How does label propagation algorithm work in bipartite networks? In: Proceedings of the 2009 IEEE/WIC/ACM International Conference on Web Intelligence and International Conference on Intelligent Agent Technology - Workshops, pp. 5–8. IEEE Computer Society, Milan (2009)
Xie, J., Szymanski, B.K., Liu, X.: SLPA: uncovering overlapping communities in social networks via a speaker-listener interaction dynamic process. In: 2011 IEEE 11th International Conference on Data Mining Workshops (ICDMW), pp. 344–349. Vancouver, BC (2011)
Xie, J., Szymanski, B.K.: Towards linear time overlapping community detection in social networks. In: Tan, P.-N., Chawla, S., Ho, C.K., Bailey, J. (eds.) PAKDD 2012, Part II. LNCS, vol. 7302, pp. 25–36. Springer, Heidelberg (2012)
Zachary, W.: An information flow model for conflict and fission in small groups. Journal of Anthropological Research 33, 452–473 (1977)
Adamic, L.A., Glance, N.: The political blogosphere and the 2004 U.S. election: divided they blog. In: Proceedings of the 3rd International Workshop on Link Discovery, LinkKDD 2005, New York, NY, USA, pp. 36–43 (2005)
Davis, A., Gardner, B., Gardner, M.: Deep south: A social anthropological study of caste and class. University of Chicago Press, Chicago (1941)
Harenberg, S., Bello, G., Gjeltema, L., Ranshous, S., Harlalka, J., Seay, R., Padmanabhan, K., Samatova, N.: Community detection in large-scale networks: a survey and empirical evaluation. Wiley Interdisciplinary Reviews: Computational Statistics 6(6), 426–439 (2014)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Wang, T., Qian, X., Wang, X. (2015). Label Propagation Based Community Detection Algorithm with Dpark. In: Thai, M., Nguyen, N., Shen, H. (eds) Computational Social Networks. CSoNet 2015. Lecture Notes in Computer Science(), vol 9197. Springer, Cham. https://doi.org/10.1007/978-3-319-21786-4_10
Download citation
DOI: https://doi.org/10.1007/978-3-319-21786-4_10
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-21785-7
Online ISBN: 978-3-319-21786-4
eBook Packages: Computer ScienceComputer Science (R0)