Abstract
In recent years, many attributed networks have emerged, such as Facebook social networks, protein networks, and academic citation networks. To find, by unsupervised learning, communities whose nodes are tightly connected and share similar attributes, and to improve the accuracy of community detection for better analysis of attributed networks, we propose a two-stage attributed network community detection method that combines network embedding with parameter-free clustering. In the first stage, we build an attributed network embedding framework that integrates common neighbor information and node attributes. We define node similarity in terms of local link information, jointly model it with attribute proximity, and then adopt a distributed algorithm to obtain the embedding vector of each node. In the second stage, the number of communities is determined automatically based on curvature and modularity, and the community detection results are obtained by clustering the embeddings. We compare our method with several representative approaches on real network datasets. The experimental results validate the effectiveness and superiority of our approach.
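To make two ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not give the exact similarity definition or the curvature criterion, so the sketch assumes a plain common-neighbor count as the local link similarity and the standard Newman modularity as the score for comparing candidate partitions. The function names and the toy graph are hypothetical.

```python
from itertools import combinations


def common_neighbor_similarity(adj):
    """Count shared neighbors for every node pair that has at least one.

    Assumption: a raw common-neighbor count stands in for the paper's
    similarity measure, which the abstract does not spell out.
    """
    sim = {}
    for u, v in combinations(sorted(adj), 2):
        shared = len(adj[u] & adj[v])
        if shared:
            sim[(u, v)] = shared
    return sim


def modularity(adj, communities):
    """Newman modularity Q of a partition of an undirected, unweighted graph."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # number of edges
    q = 0.0
    for community in communities:
        # Each internal edge is seen from both endpoints, hence the division by 2.
        internal = sum(1 for u in community for v in adj[u] if v in community) / 2
        degree_sum = sum(len(adj[u]) for u in community)
        q += internal / m - (degree_sum / (2 * m)) ** 2
    return q


# Hypothetical toy graph: two triangles joined by the single edge 2-3.
adj = {
    0: {1, 2},
    1: {0, 2},
    2: {0, 1, 3},
    3: {2, 4, 5},
    4: {3, 5},
    5: {3, 4},
}

sim = common_neighbor_similarity(adj)

# Candidate partitions with different numbers of communities.
candidates = [
    [{0, 1, 2, 3, 4, 5}],
    [{0, 1, 2}, {3, 4, 5}],
]
best = max(candidates, key=lambda part: modularity(adj, part))
```

On this toy graph the two-triangle partition scores higher than the single-community partition, which illustrates how a modularity score can help choose the number of communities without a user-supplied parameter. The curvature-based step mentioned in the abstract is not reproduced here.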
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (Nos. 61773348, 61873240, and 61603340), and in part by the Public Welfare Technology Research Project of Zhejiang Province, China (No. LGG20F020017).
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Xu, XL., Xiao, YY., Yang, XH. et al. Attributed network community detection based on network embedding and parameter-free clustering. Appl Intell 52, 8073–8086 (2022). https://doi.org/10.1007/s10489-021-02779-4
DOI: https://doi.org/10.1007/s10489-021-02779-4