Abstract
The development of external evaluation criteria for soft clustering (SC) has received limited attention: existing methods do not provide a general approach to extend comparison measures to SC, and are unable to account for the uncertainty represented in the results of SC algorithms. In this article, we propose a general method to address these limitations, grounded in a novel interpretation of SC as distributions over hard clusterings; we call the resulting measures distributional measures. We provide an in-depth study of the complexity- and metric-theoretic properties of the proposed approach, and we describe approximation techniques that can make the calculations tractable. Finally, we illustrate our approach through a simple experiment.
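To make the idea of reading a soft clustering as a distribution over hard clusterings more concrete, the following is a minimal sketch and not the construction used in the paper. It assumes, purely for illustration, that a soft clustering is given as a row-stochastic membership matrix, that objects are assigned to clusters independently, and that the quantity of interest is the expected Rand index, estimated by Monte Carlo sampling. The function names `rand_index`, `sample_hard`, and `expected_rand` are hypothetical.

```python
import random
from itertools import combinations


def rand_index(a, b):
    """Rand index between two hard clusterings given as label lists."""
    pairs = list(combinations(range(len(a)), 2))
    # A pair agrees if both clusterings group it together or both separate it.
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def sample_hard(memberships, rng):
    """Draw one hard clustering from a membership matrix.

    Assumption (not from the paper): each object's label is drawn
    independently, with probabilities given by its row.
    """
    return [rng.choices(range(len(row)), weights=row)[0] for row in memberships]


def expected_rand(soft1, soft2, n_samples=2000, seed=0):
    """Monte Carlo estimate of the expected Rand index between two
    soft clusterings, each read as a distribution over hard clusterings."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += rand_index(sample_hard(soft1, rng), sample_hard(soft2, rng))
    return total / n_samples


if __name__ == "__main__":
    # Two fuzzy clusterings of four objects into two clusters (made-up values).
    soft_a = [[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 0.9]]
    soft_b = [[0.7, 0.3], [0.6, 0.4], [0.5, 0.5], [0.1, 0.9]]
    print(expected_rand(soft_a, soft_b))
```

Sampling is used here only because the number of hard clusterings grows exponentially with the number of objects; it is one simple way to keep such an expectation tractable, and the approximation techniques described in the paper may differ.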
Notes
- 1. Due to space constraints, the complete version of all proofs appears online at https://arxiv.org/abs/2206.09827.
- 2.
- 3. The problem is trivially in P w.r.t. the distribution-based representation of \(R_1, R_2\).
- 4.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Campagner, A., Ciucci, D., Denœux, T. (2022). A Distributional Approach for Soft Clustering Comparison and Evaluation. In: Le Hégarat-Mascle, S., Bloch, I., Aldea, E. (eds) Belief Functions: Theory and Applications. BELIEF 2022. Lecture Notes in Computer Science, vol 13506. Springer, Cham. https://doi.org/10.1007/978-3-031-17801-6_1
DOI: https://doi.org/10.1007/978-3-031-17801-6_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-17800-9
Online ISBN: 978-3-031-17801-6
eBook Packages: Computer Science, Computer Science (R0)