Abstract
The allocation of boundary points and low-density clusters has become an essential part of clustering research. Most recent improved methods that focus on point allocation do not address how a specific data point should be assigned in light of the data's distribution features. In this article, a rolling iteration clustering model (ROCM) is proposed that assigns specific data points by extracting the features of the data points. In this model, data points are transformed into multiple units with different distribution structures, and the dispersion of each unit is then analyzed to discover representative groups. Sparse data are clustered according to the proposed self-expansion principle, which effectively captures boundary points and assigns points where clusters join. Furthermore, the rolling iteration module avoids over-partitioning and the chaining effect, and discovers clusters with diverse shapes and densities. Experimental results on twenty-two datasets demonstrate the effectiveness of the proposed method, with ROCM outperforming the other state-of-the-art methods it was compared against.
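The abstract does not define how ROCM measures dispersion or sparsity, so the following is only a minimal sketch of the general notion of a low-density point, not the authors' method. It assumes the mean distance to the k nearest neighbours as a stand-in sparsity score; the function names, `k`, `factor`, and the toy data are all hypothetical choices made for illustration.

```python
import math

def knn_mean_distance(points, k):
    """Mean distance from each point to its k nearest neighbours (larger = sparser)."""
    scores = []
    for i, p in enumerate(points):
        # Distances from point i to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

def flag_sparse_points(points, k=2, factor=1.5):
    """Flag points whose score exceeds `factor` times the (upper) median score.

    Assumption for illustration only: this threshold rule is not taken from ROCM.
    """
    scores = knn_mean_distance(points, k)
    median = sorted(scores)[len(scores) // 2]
    return [s > factor * median for s in scores]

# Toy data: a tight group of five points plus one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05), (1.0, 1.0)]
flags = flag_sparse_points(points, k=2, factor=1.5)
```

On this toy input only the isolated point at (1.0, 1.0) is flagged. A density-aware method would then have to decide which cluster, if any, such a point belongs to, which is the assignment problem the abstract refers to.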
Acknowledgements
This work was supported in part by the National Science Foundation of China (No. 61472049 and No. 61572225), the Department of Science and Technology of Jilin Province (No. 20190302071GX and No. 20200201164JC), and the Development and Reform Commission Foundation of Jilin Province (No. 2019C05311).
About this article
Cite this article
Guo, L., Wang, L., Han, X. et al. ROCM: A Rolling Iteration Clustering Model Via Extracting Data Features. Neural Process Lett 55, 3899–3922 (2023). https://doi.org/10.1007/s11063-022-10972-w
DOI: https://doi.org/10.1007/s11063-022-10972-w