Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

MDBSCAN: : A multi-density DBSCAN based on relative density

Published: 25 June 2024 Publication History

Abstract

DBSCAN is a widely used clustering algorithm based on density metrics that can efficiently identify clusters with uniform density. However, if the densities of different clusters are varying, the corresponding clustering results may be not good. To address this issue, we propose a multi-density DBSCAN based on the relative density (MDBSCAN), which can achieve better results on clusters with multiple densities. The intuition of our work is simple but effective, we first divide the dataset into two parts: low density and high density, and then we take a divide and conquer method to deal with the respective parts to avoid them interfering with each other. Specifically, the proposed MDBSCAN consists of three steps: (i) extract the low-density data points in the dataset by relative density. (ii) find natural clusters among the identified low-density data points. (iii) clustering the remaining data points (except the data points of natural clusters in a dataset) by using DBSCAN and assigning the noises (generated by DBSCAN) to the nearest clusters. To verify the effectiveness of the proposed MDBSCAN algorithm, we conduct experiments on ten synthetic datasets and six real-world datasets. Experimental results demonstrate that the proposed MDBSCAN algorithm outperforms the original DBSCAN and six extends of DBSCAN, especially including two state-of-the-art algorithms (DRL-DBSCAN and AMD-DBSCAN) in most cases.

References

[1]
Gormley I.C., Murphy T.B., Raftery A.E., Model-based clustering, Annu. Rev. Stat. Appl. 10 (2023) 573–595.
[2]
Moseley B., Wang J.R., Approximation bounds for hierarchical clustering: Average linkage, bisecting k-means, and local search, J. Mach. Learn. Res. 24 (1) (2023) 1–36.
[3]
Murtagh F., Contreras P., Algorithms for hierarchical clustering: an overview, Wiley Interdiscip. Rev. Data Min. Knowl. Discov. 2 (1) (2012) 86–97.
[4]
Ikotun A.M., Ezugwu A.E., Abualigah L., Abuhaija B., Heming J., K-means clustering algorithms: A comprehensive review, variants analysis, and advances in the era of big data, Inform. Sci. 622 (2023) 178–210.
[5]
Ankerst M., Breunig M.M., Kriegel H.P., Sander J., OPTICS: Ordering points to identify the clustering structure, ACM Sigmod Rec. 28 (2) (1999) 49–60.
[6]
Ester M., Kriegel H.P., Sander J., Xu X., et al., A density-based algorithm for discovering clusters in large spatial databases with noise, Kdd, vol. 96, no. 34, 1996, pp. 226–231.
[7]
Rodriguez A., Laio A., Clustering by fast search and find of density peaks, Science 344 (6191) (2014) 1492–1496.
[8]
Müller E., Graph clustering with graph neural networks, J. Mach. Learn. Res. 24 (2023) 1–21.
[9]
Meila M., Shi J., Learning segmentation by random walks, Adv. Neural Inf. Process. Syst. 13 (2000).
[10]
Schölkopf B., Smola A., Müller K.-R., Nonlinear component analysis as a kernel eigenvalue problem, Neural Comput. 10 (5) (1998) 1299–1319.
[11]
Xiang T., Gong S., Spectral clustering with eigenvector selection, Pattern Recognit. 41 (3) (2008) 1012–1029.
[12]
Ng A., Jordan M., Weiss Y., On spectral clustering: Analysis and an algorithm, Advances in Neural Information Processing Systems, vol. 14, 2001.
[13]
Pei S., Nie F., Wang R., Li X., An efficient density-based clustering algorithm for face groping, Neurocomputing 462 (2021) 331–343.
[14]
Mittal H., Pandey A.C., Pal R., Tripathi A., A new clustering method for the diagnosis of CoVID19 using medical images, Appl. Intell. 51 (2021) 2988–3011.
[15]
Arauz-Garofalo G., Jodar M., Vilanova M., de la Iglesia Rodriguez A., Castillo J., Soler-Ventura A., Oliva R., Vilaseca M., Gay M., Protamine characterization by top-down proteomics: Boosting Proteoform identification with DBSCAN, Proteomes 9 (2) (2021) 21.
[16]
Cheng D., Xu R., Zhang B., Jin R., Fast density estimation for density-based clustering methods, Neurocomputing 532 (2023) 170–182.
[17]
Kim J.H., Choi J.H., Yoo K.H., Nasridinov A., AA-DBSCAN: an approximate adaptive DBSCAN for finding clusters with varying densities, J. Supercomput. 75 (1) (2019) 142–169.
[18]
Bunkhumpornpat C., Sinapiromsaran K., Lursinsap C., DBSMOTE: density-based synthetic minority over-sampling technique, Appl. Intell. 36 (2012) 664–684.
[19]
Xie J., Wu R., Wang H., Chen H., Xu X., Kong Y., Zhang W., Prediction of cardiovascular diseases using weight learning based on density information, Neurocomputing 452 (2021) 566–575.
[20]
R. Zhang, H. Peng, Y. Dou, J. Wu, Q. Sun, Y. Li, J. Zhang, P.S. Yu, Automating DBSCAN via deep reinforcement learning, in: Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022, pp. 2620–2630.
[21]
Wang Z., Ye Z., Du Y., Mao Y., Liu Y., Wu Z., Wang J., AMD-DBSCAN: An adaptive multi-density DBSCAN for datasets of extremely variable density, in: 2022 IEEE 9th International Conference on Data Science and Advanced Analytics, DSAA, IEEE, 2022, pp. 1–10.
[22]
Wang Y., Yang Y., Relative density-based clustering algorithm for identifying diverse density clusters effectively, Neural Comput. Appl. 33 (2021) 10141–10157.
[23]
Campello R.J., Moulavi D., Zimek A., Sander J., Hierarchical density estimates for data clustering, visualization, and outlier detection, ACM Trans. Knowl. Discov. Data (TKDD) 10 (1) (2015) 1–51.
[24]
Lai W., Zhou M., Hu F., Bian K., Song Q., A new DBSCAN parameters determination method based on improved MVO, IEEE Access 7 (2019) 104085–104095.
[25]
Latifi-Pakdehi A., Daneshpour N., DBHC: A DBSCAN-based hierarchical clustering algorithm, Data Knowl. Eng. 135 (2021).
[26]
Smiti A., Elouedi Z., Dbscan-gm: An improved clustering method based on gaussian means and dbscan techniques, in: 2012 IEEE 16th International Conference on Intelligent Engineering Systems, INES, IEEE, 2012, pp. 573–578.
[27]
Kumar K.M., Reddy A.R.M., A fast DBSCAN clustering algorithm by accelerating neighbor searching using groups method, Pattern Recognit. 58 (2016) 39–48.
[28]
Gholizadeh N., Saadatfar H., Hanafi N., K-DBSCAN: An improved DBSCAN algorithm for big data, J Supercomput. 77 (2021) 6214–6235.
[29]
Zhu Y., Ting K.M., Carman M.J., Density-ratio based clustering for discovering clusters with varying densities, Pattern Recognit. 60 (2016) 983–997.
[30]
Darong H., Peng W., Grid-based DBSCAN algorithm with referential parameters, Physics Procedia 24 (2012) 1166–1170.
[31]
Scitovski R., Sabo K., DBSCAN-like clustering method for various data densities, Pattern Anal. Appl. 23 (2) (2020) 541–554.
[32]
Birant D., Kut A., ST-DBSCAN: An algorithm for clustering spatial–temporal data, Data Knowl. Eng. 60 (1) (2007) 208–221.
[33]
Meilă M., Comparing clusterings—an information based distance, J. Multivariate Anal. 98 (5) (2007) 873–895.
[34]
Hubert L., Arabie P., Comparing partitions, J. Classification 2 (1985) 193–218.
[35]
Bryant A., Cios K., RNN-DBSCAN: A density-based clustering algorithm using reverse nearest neighbor density estimates, IEEE Trans. Knowl. Data Eng. 30 (6) (2017) 1109–1121.
[36]
Scitovski R., Sabo K., Martínez-Álvarez F., Ungar Š., Cluster Analysis and Applications, Springer, 2021.

Cited By

View all

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Neurocomputing
Neurocomputing  Volume 576, Issue C
Apr 2024
359 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands

Publication History

Published: 25 June 2024

Author Tags

  1. Relative density
  2. DBSCAN
  3. Multiple density clustering

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 0
    Total Downloads
  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 16 Nov 2024

Other Metrics

Citations

Cited By

View all

View Options

View options

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media