Overlapping Community Detection Based on Attribute Augmented Graph
<p>Illustration of an attributed network. For attribute matrix <math display="inline"><semantics> <mrow> <mi>X</mi> <mo>∈</mo> <msup> <mi>R</mi> <mrow> <mrow> <mi>n</mi> <mo>×</mo> <mi>d</mi> </mrow> </mrow> </msup> </mrow> </semantics></math>, <math display="inline"><semantics> <mrow> <msub> <mi>X</mi> <mrow> <mi>i</mi> <mi>i</mi> </mrow> </msub> <mrow> <mo>=</mo> <mn>1</mn> </mrow> </mrow> </semantics></math> when vertex <span class="html-italic">i</span> has the attribute <span class="html-italic">m</span>, otherwise <math display="inline"><semantics> <mrow> <msub> <mi>X</mi> <mrow> <mi>i</mi> <mi>m</mi> </mrow> </msub> <mrow> <mo>=</mo> <mn>0</mn> </mrow> </mrow> </semantics></math>.</p> "> Figure 2
<p>Example of augmented attribute graph.</p> "> Figure 3
<p>Framework of OCEA.</p> "> Figure 4
<p>Probability distribution curve of four Facebook egonetworks’ density values.</p> "> Figure 5
<p>Estimation on the number of communities according to the evolution of community centers.</p> "> Figure 6
<p>ONMI values of algorithms with varying network sizes.</p> "> Figure 7
<p>ONMI values of algorithms with varying values of <span class="html-italic">μ</span>.</p> "> Figure 8
<p>ONMI values of algorithms on real-world networks.</p> "> Figure 9
<p>Runtime of the algorithms.</p> ">
Abstract
:1. Introduction
2. Preliminaries
2.1. Problem Definition
- Overlapping community: In a traditional community detection problem, the final partitions do not share any vertices with each other. However, some vertices may be assigned to multiple communities in an overlapping community detection problem. In this paper, we utilize the framework of fuzzy k-medoids [39] to calculate the memberships of every vertex to communities and finally obtain the overlapping communities.
- Attributed network: For an attributed network, the final partitions should satisfy two properties: (1) Topology similarity. The vertices belonging to the same community have more connections with each other than the vertices outside the community. (2) Attribute homogeneity. The vertices whose attribute vectors are close to each other have a high probability to be assigned to the same community. The community partitions should embody both the topology similarity and attribute homogeneity.
2.2. Augmented Attribute Graph
- From structure vertex vi to structure vertex vj:
- From attribute vertices vip and vjq corresponding to the pth attribute of vertex vi and the qth attribute of vertex vj, respectively:
- From attribute vertex vip to structure vertex vj:
- From structure vertex vi to attribute vertex vjq:
3. Algorithms
3.1. Overlapping Community Detection
3.2. Weights Adjustment
3.3. Estimation on the Number of Communities
Function 1. ClusterNumberEstimation. |
Input: Attribute Augmented Graph GA(V, E, A, X), random walk length l, stop probability c, number of epochs T |
Output: Number of Communities k |
|
3.4. OCEA and AOCEA
Algorithm 1 OCEA |
Input: Attribute Augmented Graph GA(V, E, A, X), random walk length l, stop probability c, convergence parameter ε, overlapping parameter γ, number of communities k |
Output: Communities C1, …, Ck |
|
Algorithm 2 AOCEA |
Input: Attribute Augmented Graph GA(V, E, A, F), random walk length l, stop probability c, convergence parameter ε, overlapping parameter γ |
Output: Community C1, …, Ck |
|
4. Experiments
4.1. Datasets
4.2. Evaluation Metric
4.3. Experimental Scheme
- Accuracy of detected communities on synthetic networks: Experiments were conducted on D1 and D2 to compare the accuracy of detected communities. D1 fixes the mixing parameter μ to 0.1 and sets the size of networks from 100 to 700. Meanwhile, D2 fixes the size of the networks to 600 and sets the mixing parameters μ from 0.2 to 0.6.
- Accuracy of detected communities on real-world networks: Experiments were conducted on three networks from Facebook egonetwork set and three paper citation networks: Texas, Washington, and Wisconsin. The accuracy of the detected communities was studied and analyzed.
- Accuracy of estimation on the number of communities: Experiments on Facebook egonetwork set, including all the ten egonetworks, to study the accuracy of the estimation on real-world networks.
- Runtime: Experiments were conducted on D1 to compare the algorithm runtimes. As the size of networks increasing, we analyze the changing runtime trends.
4.4. Results and Analysis
4.4.1. Accuracy on Synthetic Networks
4.4.2. Accuracy on Real-World Networks
4.4.3. Accuracy of Estimation on the Number of Communities
4.4.4. Runtime
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Albert, R.; Barabási, A.-L. Statistical mechanics of complex networks. Rev. Mod. Phys. 2002, 74, 47–97. [Google Scholar] [CrossRef] [Green Version]
- Fortunato, S. Community detection in graphs. Phys. Rep. 2010, 486, 75–174. [Google Scholar] [CrossRef] [Green Version]
- Zachary, W.W. An Information Flow Model for Conflict and Fission in Small Groups. J. Anthropol. Res. 1977, 33, 452–473. [Google Scholar] [CrossRef] [Green Version]
- Wang, F.Y.; Zeng, D.; Carley, K.M.; Mao, W. Social computing: From social informatics to social intelligence. IEEE Intell. Syst. 2007, 22, 79–83. [Google Scholar] [CrossRef]
- Lancichinetti, A.; Fortunato, S. Community detection algorithms: A comparative analysis. Phys. Rev. E 2009, 80, 056117. [Google Scholar] [CrossRef] [Green Version]
- Coscia, M.; Giannotti, F.; Pedreschi, D. A classification for community discovery methods in complex networks. Stat. Anal. Data Min. 2011, 4, 512–546. [Google Scholar] [CrossRef] [Green Version]
- Bothorel, C.; Cruz, J.D.; Magnani, M.; Micenkova, B. Clustering attributed graphs: Models, measures and methods. Netw. Sci. 2015, 3, 408–444. [Google Scholar] [CrossRef] [Green Version]
- Xiao, J.; Ren, H.F.; Xu, X.K. Constructing Real-Life Benchmarks for Community Detection by Rewiring Edges. Complexity 2020, 2020, 7096230. [Google Scholar] [CrossRef] [Green Version]
- Hoffmann, T.; Peel, L.; Lambiotte, R.; Jones, N.S. Community detection in networks without observing edges. Sci. Adv. 2020, 6, eaav1478. [Google Scholar] [CrossRef] [Green Version]
- Hou, C.; He, S.; Tang, K. RoSANE: Robust and scalable attributed network embedding for sparse networks. Neurocomputing 2020, 409, 231–243. [Google Scholar] [CrossRef]
- Sheikh, N.; Kefato, Z.; Montresor, A. gat2vec: Representation learning for attributed graphs. Computing 2019, 101, 187–209. [Google Scholar] [CrossRef]
- Wang, X.; Jin, D.; Cao, X.; Yang, L.; Zhang, W. Semantic community identification in large attribute networks. In Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, AAAI’16, Phoenix, AZ, USA, 12–17 February 2016; Volume 2016, pp. 265–271. [Google Scholar]
- Jin, D.; Li, B.; Jiao, P.; He, D.; Shan, H.; Zhang, W. Modeling with node popularities for autonomous overlapping community detection. ACM Trans. Intell. Syst. Technol. (TIST) 2020, 11, 1–23. [Google Scholar] [CrossRef]
- Džamić, D.; Aloise, D.; Mladenović, N. Ascent–descent variable neighborhood decomposition search for community detection by modularity maximization. Ann. Oper. Res. 2019, 272, 273–287. [Google Scholar] [CrossRef]
- Souravlas, S.; Sifaleras, A.; Katsavounis, S. A parallel algorithm for community detection in social networks, based on path analysis and threaded binary trees. IEEE Access 2019, 7, 20499–20519. [Google Scholar] [CrossRef]
- Souravlas, S.; Sifaleras, A.; Katsavounis, S. Hybrid CPU-GPU Community Detection in Weighted Networks. IEEE Access 2020, 8, 57527–57551. [Google Scholar] [CrossRef]
- Xu, R.; Che, Y.; Wang, X.; Hu, J.; Xie, Y. Stacked auto-encoded based community detection method via an ensemble clustering framework. Inf. Sci. 2020, 526, 151–165. [Google Scholar] [CrossRef]
- Chunaev, P. Community detection in node-attributed social networks: A survey. Comput. Sci. Rev. 2020, 37, 100286. [Google Scholar] [CrossRef]
- Baroni, A.; Conte, A.; Patrignani, M. Efficiently clustering very large attributed graphs. In Proceedings of the 2017 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, Sydney, Australia, 31 July–3 August 2017; Volume 2017, pp. 369–376. [Google Scholar]
- Zhou, Y.; Cheng, H.; Yu, J.X. Graph clustering based on structural/attribute similarities. Proc. VLDB Endow. 2009, 2, 718–729. [Google Scholar] [CrossRef] [Green Version]
- Cheng, H.; Zhou, Y.; Huang, X.; Yu, J.X. Clustering large attributed information networks: An efficient incremental computing approach. Data Min. Knowl. Discov. 2012, 25, 450–477. [Google Scholar] [CrossRef]
- Wang, C.; Pan, S.; Long, G.; Zhu, X.; Jiang, J. MGAE: Marginalized Graph Autoencoder for Graph Clustering. In Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, Singapore, 6–10 November 2017; Volume 2017, pp. 889–898. [Google Scholar]
- Gao, H.; Huang, H. Deep Attributed Network Embedding. In Proceedings of the IJCAI-ECAI-18, Stockholm, Sweden, 13–19 July 2018; Volume 18, pp. 3364–3370. [Google Scholar]
- Reihanian, A.; Feizi-Derakhshi, M.R.; Aghdasi, H.S. Community detection in social networks with node attributes based on multi-objective biogeography based optimization. Eng. Appl. Artif. Intell. 2017, 62, 51–67. [Google Scholar] [CrossRef]
- Li, Z.; Liu, J.; Wu, K. A multiobjective evolutionary algorithm based on structural and attribute similarities for community detection in attributed networks. IEEE Trans. Cybern. 2017, 48, 1963–1976. [Google Scholar] [CrossRef] [PubMed]
- Li, Z.; Pan, Z.; Hu, G.; Li, G.; Zhou, X. Detecting semantic communities in social networks. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 2017, 100, 2507–2512. [Google Scholar] [CrossRef]
- Qin, M.; Jin, D.; Lei, K. Adaptive community detection incorporating topology and content in social networks. Knowl.-Based Syst. 2018, 161, 342–356. [Google Scholar] [CrossRef]
- Yang, J.; McAuley, J.; Leskovec, J. Community detection in networks with node attributes. In Proceedings of the 2013 IEEE 13th International Conference on Data Mining, Dallas, TX, USA, 7–12 December 2013; Volume 2013, pp. 1151–1156. [Google Scholar]
- Bojchevski, A.; Günnemann, S. Bayesian robust attributed graph clustering: Joint learning of partial anomalies and group structure. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 2018, pp. 2738–2745. [Google Scholar]
- Wen, X.; Chen, W.N.; Lin, Y.; Gu, T.; Zhang, H. A maximal clique based multiobjective evolutionary algorithm for overlapping community detection. IEEE Trans. Evol. Comput. 2016, 21, 363–377. [Google Scholar] [CrossRef]
- Zhang, L.; Pan, H.; Su, Y. A mixed representation-based multiobjective evolutionary algorithm for overlapping community detection. IEEE Trans. Cybern. 2017, 47, 2703–2716. [Google Scholar] [CrossRef]
- Bello-Orgaz, G.; Salcedo-Sanz, S.; Camacho, D. A multi-objective genetic algorithm for overlapping community detection based on edge encoding. Inf. Sci. 2018, 462, 290–314. [Google Scholar] [CrossRef]
- Bai, X.; Yang, P.; Shi, X. An overlapping community detection algorithm based on density peaks. Neurocomputing 2017, 226, 7–15. [Google Scholar] [CrossRef]
- Ma, T.; Wang, Y.; Tang, M.; Cao, J.; Tian, Y. LED: A fast overlapping communities detection algorithm based on structural clustering. Neurocomputing 2016, 207, 488–500. [Google Scholar] [CrossRef] [Green Version]
- Whang, J.J.; Gleich, D.F.; Dhillon, I.S. Overlapping community detection using neighborhood-inflated seed expansion. IEEE Trans. Knowl. Data Eng. 2016, 28, 1272–1284. [Google Scholar] [CrossRef]
- Wang, X.; Liu, G.; Li, J. Overlapping community detection based on structural centrality in complex networks. IEEE Access 2017, 5, 25258–25269. [Google Scholar] [CrossRef]
- Deng, X.; Li, G.; Dong, M.; Ota, K. Finding overlapping communities based on Markov chain and link clustering. Peer-to-Peer Netw. Appl. 2017, 10, 411–420. [Google Scholar] [CrossRef] [Green Version]
- Li, Y.; Sha, C.; Huang, X.; Zhang, Y. Community detection in attributed graphs: An embedding approach. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, New Orleans, LA, USA, 2–7 February 2018; Volume 2018, pp. 338–345. [Google Scholar]
- Al-Akhras, M.T. An Efficient Fuzzy K-Medoids Method. World Appl. Sci. J. 2010, 10, 574–583. [Google Scholar]
- McAuley, J.J.; Leskovec, J. Learning to discover social circles in ego networks. NIPS 2012, 2012, 548–556. [Google Scholar]
- Wei, H.; Pan, Z.; Hu, G.; Yang, H.; Li, X.; Zhou, X. Attributed network representation learning via DeepWalk. Intell. Data Anal. 2019, 23, 877–893. [Google Scholar] [CrossRef]
- Lancichinetti, A.; Fortunato, S. Benchmarks for testing community detection algorithms on directed and weighted graphs with overlapping communities. Phys. Rev. E 2009, 80, 016118. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- Huang, B.; Wang, C.; Wang, B. NMLPA: Uncovering Overlapping Communities in Attributed Networks via a Multi-Label Propagation Approach. Sensors 2019, 19, 260. [Google Scholar] [CrossRef] [PubMed] [Green Version]
- McDaid, A.F.; Greene, D.; Hurley, N. Normalized mutual information to evaluate overlapping community finding algorithms. arXiv 2011, arXiv:1110.2515. [Google Scholar]
- MacQueen, J. Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability; University of California Press: Berkeley, CA, USA, 1967; pp. 281–297. [Google Scholar]
Network | Mean | Variance | Standard Deviation |
---|---|---|---|
Facebook_686 | |||
Facebook_414 | |||
Facebook_698 | |||
Facebook_3437 |
Parameter | Meaning |
---|---|
N | number of vertices |
kavg | average degree |
kmax | maximum degree |
μ | mixing parameters |
τ1 | negative exponent of degree |
τ2 | negative exponent of community size |
cmin | minimum community size |
cmax | maximum community size |
on | number of overlapping vertices |
om | maximum of an overlapping vertex belongs to |
Dataset | Parameters |
---|---|
D1 | N = 100–700, kavg = 10, kmax=25, μ = 0.1, cmin = 10, cmax = 50, on = 20, om = 2 |
D2 | N = 600, kavg = 10, kmax = 25, μ = 0.2–0.6, cmin = 10, cmax = 50, on = 20, om = 2 |
Network | N | M | k |
---|---|---|---|
Facebook_698 | 65 | 48 | 13 |
Facebook_348 | 225 | 161 | 14 |
Facebook_0 | 347 | 2533 | 24 |
Texas | 187 | 328 | 5 |
Washington | 230 | 446 | 5 |
Wisconsin | 265 | 530 | 5 |
Real | Initial | Estimation | |
---|---|---|---|
facebook_0 | 24 | 127 | 17 |
facebook_107 | 9 | 417 | 17 |
facebook_1684 | 17 | 335 | 14 |
facebook_1912 | 46 | 338 | 12 |
facebook_3437 | 32 | 230 | 15 |
facebook_348 | 14 | 109 | 14 |
facebook_3980 | 17 | 27 | 14 |
facebook_414 | 7 | 85 | 19 |
facebook_686 | 14 | 76 | 8 |
facebook_698 | 13 | 28 | 16 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Lin, H.; Zhan, Y.; Zhao, Z.; Chen, Y.; Dong, C. Overlapping Community Detection Based on Attribute Augmented Graph. Entropy 2021, 23, 680. https://doi.org/10.3390/e23060680
Lin H, Zhan Y, Zhao Z, Chen Y, Dong C. Overlapping Community Detection Based on Attribute Augmented Graph. Entropy. 2021; 23(6):680. https://doi.org/10.3390/e23060680
Chicago/Turabian StyleLin, Hanyang, Yongzhao Zhan, Zizheng Zhao, Yuzhong Chen, and Chen Dong. 2021. "Overlapping Community Detection Based on Attribute Augmented Graph" Entropy 23, no. 6: 680. https://doi.org/10.3390/e23060680