Abstract
Community detection is one of the most important ways to reflect the structures and mechanisms of a social network. The overlapping communities are more in line with the reality of the social networks. In society, the phenomenon of some members sharing memberships among different communities reflects as overlapping communities in the networks. Dealing with big data networks, it is a challenging and computationally complex problem to detect overlapping communities. In this paper, we propose highly scalable variants of a community-detection algorithm in a parallel manner called Label Propagation with nodes Confidence (PLPAC). We introduce MapReduce into our scheme to process the big data in a parallel manner and guarantee the efficiency of community detection. We implemented the algorithm on artificial networks as well as real networks to evaluate the accuracy and speedup of the proposed method. Experimental results on datasets from different scenarios illustrate that the improved label propagation method outperforms the state-of-the-art methods in terms of accuracy and time efficiency.
Similar content being viewed by others
References
Barber, M.J., Clark, J.W.: Detecting network communities by propagating labels under constraints. Phys. Rev. E Stat. Nonlinear Soft Matter Phys. 80, 026129 (2009)
Belkin, M., Niyogi, P.: Laplacian eigenmaps and spectral techniques for embedding and clustering. Adv. Neural Inf. Proces. Syst. MIT Press. 14, 585–591 (2001)
Clauset, A., Newman, M.E.J., Moore, C.: Finding community structure in very large networks. Phys. Rev. E. 70, 066111 (2004)
Coscia, M., Rossetti, G., Giannotti, F., Pedreschi, D.: Demon: a local-first discovery method for overlapping communities. 615–623 (2012)
Danon, L., Dazguilera, A., Duch, J., Arenas, A.: Comparing community structure identification (2005)
Dean, J., Ghemawat, S.: MapReduce: simplified data processing on large clusters. Commun. ACM. 51, 107–114 (2008)
Dean, J., Ghemawat, S.: MapReduce: a flexible data processing tool. Commun. ACM. 54, 72–77 (2010)
Ester, M., Kriegel, H. P., Xu, X.: A densitybased algorithm for discovering clusters a density-based algorithm for discovering clusters in large spatial databases with noise. In International Conference on Knowledge Discovery and Data Mining 226–231, (1996)
Fortunato, S.: Community detection in graphs. Phys. Rep. 486, 75–174 (2009)
Girvan, M., Newman, M.E.J.: Community structure in social and biological networks. Proc. Natl. Acad. Sci. 99, 7821–7826 (2002)
Girvan, M., Newman, M.E.J.: Community structure in social and biological networks. Proc. Natl. Acad. Sci. U. S. A. 99, 7821–7826 (2002)
Gregory, S.: Finding overlapping communities in networks by label propagation. New J. Phys. 12, 2011–2024 (2001)
Gregory, S.: Finding overlapping communities in networks by label propagation. New J. Phys. 12, 2011–2024 (2010)
He-Li, S., Jian-Bin, H., Yong-Qiang, T., et al.: Detecting overlapping communities in networks via dominant label propagation. Chin. Phys. B. 24, 551–559 (2015)
Hu, Y., Li, M., Zhang, P., Fan, Y., Di, Z.: Community detection by signaling on complex networks. Phys. Rev. E. 78, 016115 (2008)
Jaccard, P.: The distribution of the Flora in the alpine zone. New Phytol. 11, 37–50 (1912)
Jin, S., Yu, P.S., Li, S., Yang, S.: A parallel community structure mining method in big social networks. Math. Probl. Eng. 2015(10), 1–13 (2015)
Konstantin Kuzmin, S., Shah, Y., Szymanski, B. K.: Parallel overlapping community detection with slpa. In International Conference on Social Computing 204–212 (2013)
Leskovec, J., Lang, K.J., Dasgupta, A., et al.: Community structure in large networks: natural cluster sizes and the absence of large well-defined clusters. Internet Math. 6, 29–123 (2009)
Li, S., Lou, H., Jiang, W., et al.: Detecting community structure via synchronous label propagation. Neurocomputing. 151, 1063–1075 (2015)
Lusseau, D., Boisseau, S.K., et al.: The bottlenose dolphin community of doubtful sound features a large proportion of long-lasting associations. Behav. Ecol. Sociobiol. 54, 496–405 (2004)
Masdarolomoor, Z., Azmi, R., Aliakbary, S., Riahi, N.: Finding community structure in complex networks using parallel approach. In Ifip International Conference on Embedded and Ubiquitous Computing 474–479 (2011)
Meo, D., Pasquale, F., et al.: Enhancing community detection using a network weighting strategy. Information Sciences an International Journal. 222, 648–668 (2013)
Newman, M.E.J.: Fast algorithm for detecting community structure in networks. Phys. Rev. E. 69, 066–133 (2004)
Newman, M.E.J.: Modularity and community structure in networks. Proc. Natl. Acad. Sci. 103, 8577–8582 (2006)
Newman, M.E.J.: Finding community structure in networks using the eigenvectors of matrices. Phys. Rev. E. 74, 036104 (2006)
Pons, P., Latapy, M.: Computing communities in large networks using random walks (long version). J. Graph Alg. App. 10(2):284–293 (2005)
Pons, P., Latapy, M.: Computing communities in large networks using random walks. J. Graph Algorithms Appl. 10, 191–218 (2006)
Radicchi, F., Castellano, C., Cecconi, F., Loreto, V., Parisi, D.: Defining and identifying communities in networks. Proc. Natl. Acad. Sci. U. S. A. 101, 2658–2663 (2004)
Raghavan, U., Albert, R., Kumara, S.: Near linear time algorithm to detect community structures in large-scale networks. Phys. Rev. E. 76, 036106 (2007)
Schuetz, P., Schuetz, P., Caflisch, A.: Efficient modularity optimization by multistep greedy algorithm and vertex mover refinement. Phys. Rev. E Stat. Nonlinear Soft Matter Phys. 77, 046–112 (2008)
Song, Q., Li, B., Yu, W., Li, J., Shi, B.: Nslpa: A node similarity based label propagation algorithm for real-time community detection. In Ieee/acm International Conference on Utility and Cloud Computing 896–901 (2014)
Ugander, J., Backstrom, L.: Balanced label propagation for partitioning massive graphs. In ACM International Conference on Web Search and Data Mining 507–516 (2013)
Wakita, K., Tsurumi, T.: Finding community structure in mega-scale social networks. In: International Conference on World Wide Web. ACM, 1275–1276 (2007)
White, S., Smyth, P.: A spectral clustering approach to finding communities in graph. In Siam International Conference on Data Mining (2005)
Wu, Z.H., Lin, Y.F., Gregory, S., Wan, H.Y., Tian, S.F.: Balanced multi-label propagation for overlapping community detection in social networks. J. Comput. Sci.Technol. 27, 468–479 (2012)
Xie, J., Szymanski, B. K., Liu, X.: Slpa: Uncovering overlapping communities in social networks via a speaker-listener interaction dynamic process. In IEEE International Conference on Data Mining Workshops 344–349 (2012)
Xu, X., Jäger, J., Kriegel, H.P.: A fast parallel clustering algorithm for large spatial databases. Data Min. Knowl. Disc. 3, 263–290 (1999)
Yang, J., Leskovec, J.: Defining and evaluating network communities based on ground-truth. Knowl. Inf. Syst. 42, 181–213 (2015)
Zachary, W.W.: An information flow model for conflict and fission in small groups. J. Anthropol. Res. 44, 452–474 (1997)
Zhang, J., Ge, S.: A parallel algorithm to find overlapping community structure in directed and weighted complex networks. In Second International Conference on Instrumentation, Measurement, Computer, Communication and Control 1561–1564 (2013)
Zhang, Q., Qiu, Q., Guo, W., et al.: A social community detection algorithm based on parallel grey label propagation. Comput. Netw. 107, 13–143 (2016)
Zhou, H.: Distance, dissimilarity index, and network community structure. Phys. Rev. E. 67, 061–901 (2003)
Zhou, H., Lipowsky, R.: Network brownian motion: a new method to measure vertex-vertex proximity and to identify communities and subcommunities. Lect. Notes Comput. Sci. 3038, 1062–1069 (2004)
Acknowledgments
This work has been supported by the Fundamental Research Funds for the Central Universities 2016YJS029 and the National Natural Science Foundation of China under Grant 61401015. Academic Discipline and Postgraduate Education Project of Beijing Municipal Commission of Education.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Chen, N., Liu, Y., Cheng, J. et al. A novel parallel community detection scheme based on label propagation. World Wide Web 21, 1377–1398 (2018). https://doi.org/10.1007/s11280-017-0519-0
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11280-017-0519-0