A Sober Look at Clustering Stability

Shai Ben-David,
Ulrike von Luxburg &
Dávid Pál

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4005)

Included in the following conference series:

International Conference on Computational Learning Theory

Abstract

Stability is a common tool for verifying the validity of sample-based algorithms. In clustering, it is widely used to tune the parameters of the algorithm, such as the number k of clusters. Despite the popularity of stability in practical applications, there has been very little theoretical analysis of this notion. In this paper we provide a formal definition of stability and analyze some of its basic properties. Quite surprisingly, the conclusion of our analysis is that for large sample sizes, stability is fully determined by the behavior of the objective function that the clustering algorithm aims to minimize. If the objective function has a unique global minimizer, the algorithm is stable; otherwise it is unstable. In particular, we conclude that stability is not a well-suited tool for determining the number of clusters: it is determined by the symmetries of the data, which may be unrelated to clustering parameters. We prove our results for center-based clusterings and for spectral clustering, and support our conclusions with many examples in which the behavior of stability is counter-intuitive.

Author information

Authors and Affiliations

David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Ontario, Canada
Shai Ben-David & Dávid Pál
Fraunhofer IPSI, Darmstadt, Germany
Ulrike von Luxburg

Editor information

Editors and Affiliations

ICREA and Department of Economics, Universitat Pompeu Fabra, Ramon Trias Fargas 25-27, 08005, Barcelona, Spain
Gábor Lugosi
Ruhr-Universität Bochum, Germany
Hans Ulrich Simon

About this paper

Cite this paper

Ben-David, S., von Luxburg, U., Pál, D. (2006). A Sober Look at Clustering Stability. In: Lugosi, G., Simon, H.U. (eds) Learning Theory. COLT 2006. Lecture Notes in Computer Science, vol 4005. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11776420_4

DOI: https://doi.org/10.1007/11776420_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-35294-5
Online ISBN: 978-3-540-35296-9
eBook Packages: Computer Science (R0)
