Abstract
Knowing the number of communities in the complex networks can have a great impact on the community detection process and can be considered a critical parameter for most community detection algorithms. Nonnegative matrix tri-factorization (NMTF) is another type of nonnegative matrix factorization (NMF) method that considers the interactions among communities and has high interpretability and accuracy in community detection. Whereas community detection is an NP-hard problem, applying evolutionary algorithms is favored rather than classical methods. In the proposed approach, the number of communities in a network is heuristically examined based on the network topology structure. The genetic algorithm (GA) is one of the most widely accepted optimizing techniques and the oldest evolutionary algorithm which optimizes various issues. In this paper, first, a symmetric nonnegative matrix tri-factorization (SNMTF) is presented to detect communities as an optimization problem in complex networks, because most complex networks are symmetric. The projected gradient descent method is applied to the SNMTF algorithm (PGD-SNMTF) to improve efficiency. Then the GA is used for optimizing the factorization process and estimating the number of communities in the complex networks. Finally, the accuracy of the proposed method is evaluated based on the results obtained and the expected number of communities in the networks. In addition, the proposed approach is compared to other common methods such as elbow, silhouette, rule of thumb, and eigen-gap methods for estimating the number of communities in complex networks. The results of the proposed method indicate high efficiency and accuracy in estimating the number of communities in the investigated networks.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Data availability
No data and material are associated with this article.
Code availability
No codes are associated with this article.
References
Ahookhosh M, Hien LTK, Gillis N, Patrinos P (2021) A Block Inertial Bregman Proximal algorithm for nonsmooth nonconvex problems with application to symmetric nonnegative matrix tri-factorization. J Optim Theory Appl 190(1):234–258. https://doi.org/10.1007/s10957-021-01880-5
American football. https://www.cc.gatech.edu/dimacs10/archive/clustering.shtml
Asadi S, Povh J (2021) A block coordinate descent-based projected gradient algorithm for orthogonal non-negative matrix factorization. Mathematics 9(5):1–22. https://doi.org/10.3390/math9050540
Attea BA et al (2021) A review of heuristics and metaheuristics for community detection in complex networks: current usage, emerging development, and future directions. Swarm Evol Comput 63:100885. https://doi.org/10.1016/j.swevo.2021.100885
Batool F, Hennig C (2021) Clustering with the average silhouette width. Comput Stat Data Anal 158:107190. https://doi.org/10.1016/j.csda.2021.107190
Calmon W, Albi M (2021) Estimating the number of clusters in a ranking data context. Inf Sci (Ny) 546:977–995. https://doi.org/10.1016/j.ins.2020.09.056
Chen K, Lei J (2018) Network cross-validation for determining the number of communities in network data. J Am Stat Assoc 113(521):241–251. https://doi.org/10.1080/01621459.2016.1246365
Chouchani N, Abed M (2020) Online social network analysis: detection of communities of interest. J Intell Inf Syst 54(1):5–21. https://doi.org/10.1007/s10844-018-0522-7
Contiguous USA http://www-cs-faculty.stanford.edu/~uno/sgb.html
Ding C, Li T, Peng W, Park H (2006) “Orthogonal nonnegative matrix tri-factorizations for clustering,” Proc. ACM SIGKDD Int. Conf. Knowl. Discov. Data Min, vol. pp. 126–135, 2006, doi: https://doi.org/10.1145/1150402.1150420
email-Eu-core network.http://snap.stanford.edu/data/email-Eu-core.html
Esposito F (2021) A review on initialization methods for nonnegative matrix factorization: towards omics data experiments. Mathematics. https://doi.org/10.3390/math9091006
Fu W, Perry PO (2020) Estimating the number of clusters using Cross-Validation. J Comput Graph Stat 29(1):162–173. https://doi.org/10.1080/10618600.2019.1647846
Geng J, Bhattacharya A, Pati D (2019) Probabilistic community detection with unknown number of communities. J Am Stat Assoc 114:893–905. https://doi.org/10.1080/01621459.2018.1458618
Guan N, Tao D, Luo Z, Yuan B (2012) NeNMF: an optimal gradient method for nonnegative matrix factorization. IEEE Trans Signal Process 60(6):2882–2898. https://doi.org/10.1109/TSP.2012.2190406
Huang J, Zhang T, Yu W, Zhu J, Cai E (2021) Community detection based on modularized deep nonnegative matrix factorization. Int J Pattern Recognit Artif Intell 35(2):1–17. https://doi.org/10.1142/S0218001421590060
Huang X, Chen D, Ren T, Wang D (2021) A survey of community detection methods in multilayer networks. Springer, US
Jazz musicians. http://deim.urv.cat/~alexandre.arenas/data/welcome.htm
Jin D, He J, Chai B, He D (2021) Semi-supervised community detection on attributed networks using non-negative matrix tri-factorization with node popularity. Front Comput Sci. https://doi.org/10.1007/s11704-020-9203-0
Kasuya T (2017) “Bottlenose Dolphins,” Small Cetaceans of Japan, http://www-personal.umich.edu/~mejn/netdata/
Katoch S, Chauhan SS, Kumar V (2021) A review on genetic algorithm: past, present, and future. Multimed Tools Appl 80(5):8091–8126. https://doi.org/10.1007/s11042-020-10139-6
Koc I (2022) A fast community detection algorithm based on coot bird metaheuristic optimizer in social networks. Eng Appl Artif Intell 114:105202. https://doi.org/10.1016/j.engappai.2022.105202
Kumar S, Panda BS, Aggarwal D (2021) Community detection in complex networks using network embedding and gravitational search algorithm. J Intell Inf Syst 57(1):51–72. https://doi.org/10.1007/s10844-020-00625-6
Le CM, Levina E (2022) Estimating the number of communities by spectral methods. Electron J Stat 16(1):3315–3342. https://doi.org/10.1214/21-EJS1971
Li Z, Chen J, Fu Y, Hu G, Pan Z, Zhang L (2018) Community detection based on regularized semi-nonnegative matrix tri-factorization in signed networks. Mob Networks Appl 23(1):71–79. https://doi.org/10.1007/s11036-017-0883-0
Li N, Lou G, Li D, Gao Z, Wu H (2021) “Estimating the Number of Communities in Complex Network. Proceedings-2021 2nd International Conference on Electronics, Communication and Information Technology (CECIT) 2021, pp. 560–564, https://doi.org/10.1109/CECIT53797.2021.00105
Lin C (2007) Projected gradient methods for nonnegative matrix factorization. Neural Comput 19(10):2756–2779. https://doi.org/10.1162/neco.2007.19.10.2756
Lu H, Shen Z, Sang X, Zhao Q, Lu J (2020) Community detection method using improved density peak clustering and nonnegative matrix factorization. Neurocomputing 415:247–257. https://doi.org/10.1016/j.neucom.2020.07.080
Luo X, Liu Z, Jin L, Zhou Y, Zhou M (2022) Symmetric nonnegative matrix factorization-based community detection models and their convergence analysis. IEEE Trans Neural Networks Learn Syst 33(3):1203–1215. https://doi.org/10.1109/TNNLS.2020.3041360
Maier HR, Razavi S, Kapelan Z, Matott LS, Kasprzyk J, Tolson BA (2018) Introductory overview: optimization using evolutionary algorithms and other metaheuristics. Environ Model Softw 114:195–213. https://doi.org/10.1016/j.envsoft.2018.11.018
Mahini R et al (2014) Optimal number of clusters by measuring similarity among topographies for spatio-temporal ERP analysis. Brain Topogr 35:357
Masud MA, Huang JZ, Wei C, Wang J, Khan I, Zhong M (2018) I-nice: a new approach for identifying the number of clusters and initial cluster centres. Inf Sci (Ny) 466:129–151. https://doi.org/10.1016/j.ins.2018.07.034
Moscato V, Sperlì G (2021) A survey about community detection over On-line Social and Heterogeneous Information Networks. Knowledge-Based Syst 224:107112. https://doi.org/10.1016/j.knosys.2021.107112
Mouton JP, Ferreira M, Helberg ASJ (2020) A comparison of clustering algorithms for automatic modulation classification. Expert Syst Appl 151:113317. https://doi.org/10.1016/j.eswa.2020.113317
Physicians. http://moreno.ss.uci.edu/data.html#ckm
Political books. https://www.cc.gatech.edu/dimacs10/archive/clustering.shtml
Ramesh AC, Srivatsun G (2021) Evolutionary algorithm for overlapping community detection using a merged maximal cliques representation scheme. Appl Soft Comput 112:107746. https://doi.org/10.1016/j.asoc.2021.107746
Sai Krishna TV, Yesu A, Babu, Kiran Kumar R (2018) Determination of optimal clusters for a non-hierarchical clustering paradigm k-means algorithm. Lect Notes Data Eng Commun Technol 9:301–316. https://doi.org/10.1007/978-981-10-6319-0_26
Shi C, Wei B, Wei S, Wang W, Liu H, Liu J (2021) A quantitative discriminant method of elbow point for the optimal number of clusters in clustering algorithm. Eurasip J Wirel Commun Netw. https://doi.org/10.1186/s13638-021-01910-w
Yang J, Lee JY, Choi M, Joo Y (2020) A new approach to determine the optimal number of clusters based on the gap statistic. Springer International Publishing, US
Zachary WW (1977) Zachary karate club network, http://networkrepository.com/soc-karate.php
Zhu J, Li X, Gao C, Wang Z, Kurths J (2021) Unsupervised community detection in attributed networks based on mutual information maximization. New J Phys. https://doi.org/10.1088/1367-2630/ac2fbd
Funding
This research received no external funding.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Jouyban, M., Hosseini, S. A new approach for estimating the number of communities in complex networks using PGD-SNMTF and GA. Evolving Systems 15, 591–609 (2024). https://doi.org/10.1007/s12530-023-09530-z
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s12530-023-09530-z