Abstract
This paper initiates formal analysis of a simple, distributed algorithm for community detection on networks. We analyze an algorithm that we call Max-LPA, both in terms of its convergence time and in terms of the “quality” of the communities detected. Max-LPA is an instance of a class of community detection algorithms called label propagation algorithms. As far as we know, most analysis of label propagation algorithms thus far has been empirical in nature and in this paper we seek a theoretical understanding of label propagation algorithms. In our main result, we define a clustered version of Erdös-Rényi random graphs with clusters V 1, V 2, …, V k where the probability p, of an edge connecting nodes within a cluster V i is higher than p′, the probability of an edge connecting nodes in distinct clusters. We show that even with fairly general restrictions on p and p′ (\(p = \Omega\left(\frac{1}{n^{1/4-\epsilon}}\right)\) for any ε > 0, p′ = O(p 2), where n is the number of nodes), Max-LPA detects the clusters V 1, V 2, …, V n in just two rounds. Based on this and on empirical results, we conjecture that Max-LPA can correctly and quickly identify communities on clustered Erdös-Rényi graphs even when the clusters are much sparser, i.e., with \(p = \frac{c\log n}{n}\) for some c > 1.
This work was done when the first author (KK) was visiting The University of Iowa on an Indo-US Science and Technology Forum Fellowship. The work of the second author (SP) was partially supported by National Science Foundation grant CCF 0915543.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Albert, R., Barabási, A.-L.: Statistical mechanics of complex networks. Rev. Mod. Phys. 74, 47–97 (2002)
Bender, E.A., Rodney Canfield, E.: The asymptotic number of labeled graphs with given degree sequences. Journal of Combinatorial Theory, Series A 24(3), 296–307 (1978)
Chernoff, H.: A Measure of Asymptotic Efficiency for Tests of a Hypothesis Based on the sum of Observations. The Annals of Mathematical Statistics 23(4), 493–507 (1952)
Cordasco, G., Gargano, L.: Community detection via semi-synchronous label propagation algorithms. In: 2010 IEEE International Workshop on Business Applications of Social Network Analysis (BASNA), pp. 1–8. IEEE (2010)
Elsner, U.: Graph Partitioning - A Survey (1997)
Erdős, P., Rényi, A.: On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci. 5(17) (1960)
Flake, G.W., Lawrence, S., Giles, C.L.: Efficient identification of web communities. In: Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 150–160. ACM (2000)
Fortunato, S.: Community detection in graphs. Physics Reports 486(3-5), 75–174 (2010)
Girvan, M., Newman, M.E.J.: Community structure in social and biological networks. Proceedings of the National Academy of Sciences 99(12), 7821 (2002)
Gregory, S.: Finding overlapping communities using disjoint community detection algorithms. Complex Networks, 47–61 (2009)
Hoeffding, W.: Probability inequalities for sums of bounded random variables. J. Amer. Statist. Assoc. 58, 13–30 (1963)
Kannan, R., Vempala, S., Vetta, A.: On clusterings: Good, bad and spectral. J. ACM 51(3), 497–515 (2004)
Karger, D.R.: Minimum cuts in near-linear time. J. ACM 47(1), 46–76 (2000)
Kernighan, B.W., Lin, S.: An efficient heuristic procedure for partitioning graphs. Bell System Technical Journal 49(2), 291–307 (1970)
Kothapalli, K., Pemmaraju, S.V., Sardeshmukh, V.: On the analysis of a label propagation algorithm for community detection. arXiv preprint (2012)
Leung, I.X.Y., Hui, P., Lio, P., Crowcroft, J.: Towards real-time community detection in large networks. Arxiv preprint arXiv:0808.2633 (2008)
Liu, X., Murata, T.: How does label propagation algorithm work in bipartite networks? In: Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology, vol. 03, pp. 5–8. IEEE Computer Society (2009)
Molloy, M., Reed, B.: A critical point for random graphs with a given degree sequence. Random Structures & Algorithms 6(2-3), 161–180 (1995)
Newman, M.E.J.: Networks: An Introduction. OUP Oxford (2010)
Newman, M.E.J.: The spread of epidemic disease on networks. Physical Review Letters 66, 16128 (2002)
Newman, M.E.J.: The Structure and Function of Complex Networks. SIAM Review 45(2), 167–256 (2003)
Newman, M.E.J., Girvan, M.: Finding and evaluating community structure in networks. Physical Review E 69(2), 26113 (2004)
Peleg, D.: Distributed computing: a locality-sensitive approach, vol. 5. Society for Industrial Mathematics (2000)
Poljak, S., Sura, M.: On periodical behaviour in societies with symmetric influences. Combinatorica 3(1), 119–121 (1983)
Raghavan, U.N., Albert, R., Kumara, S.: Near linear time algorithm to detect community structures in large-scale networks. Physical Review E 76(3), 36106 (2007)
Suaris, P.R., Kedem, G.: An algorithm for quadrisection and its application to standard cell placement. IEEE Transactions on Circuits and Systems 35(3), 294–303 (1988)
Šubelj, L., Bajec, M.: Unfolding network communities by combining defensive and offensive label propagation. In: In Proceedings of the ECML PKDD Workshop on the Analysis of Complex Networks 2010 (ACNE 2010), pp. 87–104 (March 2011)
Watts, D.J., Strogatz, S.H.: Collective dynamics of ‘small-world’ networks. Nature 393, 440–442 (1998)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kothapalli, K., Pemmaraju, S.V., Sardeshmukh, V. (2013). On the Analysis of a Label Propagation Algorithm for Community Detection. In: Frey, D., Raynal, M., Sarkar, S., Shyamasundar, R.K., Sinha, P. (eds) Distributed Computing and Networking. ICDCN 2013. Lecture Notes in Computer Science, vol 7730. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35668-1_18
Download citation
DOI: https://doi.org/10.1007/978-3-642-35668-1_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35667-4
Online ISBN: 978-3-642-35668-1
eBook Packages: Computer ScienceComputer Science (R0)