Nothing Special   »   [go: up one dir, main page]

skip to main content
article
Free access

Characterization, Stability and Convergence of Hierarchical Clustering Methods

Published: 01 August 2010 Publication History

Abstract

We study hierarchical clustering schemes under an axiomatic view. We show that within this framework, one can prove a theorem analogous to one of Kleinberg (2002), in which one obtains an existence and uniqueness theorem instead of a non-existence result. We explore further properties of this unique scheme: stability and convergence are established. We represent dendrograms as ultrametric spaces and use tools from metric geometry, namely the Gromov-Hausdorff distance, to quantify the degree to which perturbations in the input metric space affect the result of hierarchical methods.

References

[1]
S. Ben-David, U. von Luxburg, and D. Pál. A sober look at clustering stability. In Gábor Lugosi and Hans-Ulrich Simon, editors, COLT, volume 4005 of Lecture Notes in Computer Science, pages 5-19. Springer, 2006. ISBN 3-540-35294-5.
[2]
F. L. Bookstein, B. Chernoff, R. L. Elder, J. M. Humphries Jr., G. R. Smith, and R. E. Strauss. Morphometrics in Evolutionary Biology: the Geometry of Size and Shape Change, with Examples from Fishes. Academy of Natural Sciences of Philadelphia., 1985.
[3]
M. R. Bridson and A. Haefliger. Metric Spaces of Non-positive Curvature, volume 319 of Grundlehren der Mathematischen Wissenschaften {Fundamental Principles of Mathematical Sciences}. Springer-Verlag, Berlin, 1999. ISBN 3-540-64324-9.
[4]
D. Burago, Y. Burago, and S. Ivanov. A Course in Metric Geometry, volume 33 of AMS Graduate Studies in Math. American Mathematical Society, 2001.
[5]
G. Carlsson and F. Mémoli. Persistent clustering and a theorem of J. Kleinberg. ArXiv e-prints, August 2008.
[6]
G. Carlsson and F. Mémoli. Classifying clustering schemes. Technical report, Stanford, 2009.
[7]
G. Carlsson and F. Mémoli. Multiparameter clustering methods. In Claus Weihs Hermann Locarek-Junge, editor, 'Classification as a Tool for Research'. Proceedings of the 11th IFCS Biennial Conference and 33rd Annual Conference of the Gesellschaft fr Klassifikation e.V., Berlin-Heidelberg-New York, to appear 2009. Springer.
[8]
M. Gromov. Metric Structures for Riemannian and non-Riemannian spaces. Modern Birkhäuser Classics. Birkhäuser Boston Inc., Boston, MA, english edition, 2007. ISBN 978-0-8176-4582-3; 0-8176-4582-9. Based on the 1981 French original, With appendices by M. Katz, P. Pansu and S. Semmes, Translated from the French by Sean Michael Bates.
[9]
M. Gromov. Hyperbolic groups. In Essays in Group Theory, volume 8 of Math. Sci. Res. Inst. Publ., pages 75-263. Springer, New York, 1987.
[10]
J. A. Hartigan. Consistency of single linkage for high-density clusters. J. Amer. Statist. Assoc., 76 (374):388-394, 1981. ISSN 0162-1459.
[11]
J. A. Hartigan. Statistical theory in clustering. J. Classification, 2(1):63-76, 1985. ISSN 0176-4268.
[12]
A. K. Jain and R. C. Dubes. Algorithms for Clustering Data. Prentice Hall Advanced Reference Series. Prentice Hall Inc., Englewood Cliffs, NJ, 1988. ISBN 0-13-022278-X.
[13]
N. Jardine and R. Sibson. Mathematical Taxonomy. JohnWiley & Sons Ltd., London, 1971. Wiley Series in Probability and Mathematical Statistics.
[14]
D. G. Kendall, D. Barden, T. K. Carne, and H. Le. Shape and Shape Theory. Wiley Series in Probability and Statistics. John Wiley & Sons Ltd., Chichester, 1999. ISBN 0-471-96823-4.
[15]
J. M. Kleinberg. An impossibility theorem for clustering. In Suzanna Becker, Sebastian Thrun, and Klaus Obermayer, editors, NIPS, pages 446-453. MIT Press, 2002. ISBN 0-262-02550-7.
[16]
G. N. Lance and W. T. Williams. A general theory of classificatory sorting strategies 1. Hierarchical systems. Computer Journal, 9(4):373-380, February 1967. ISSN 0010-4620.
[17]
W. Stuetzle. Estimating the cluster type of a density by analyzing the minimal spanning tree of a sample. J. Classification, 20(1):25-47, 2003. ISSN 0176-4268.
[18]
U. von Luxburg and S. Ben-David. Towards a statistical theory of clustering. Technical report, PASCAL workshop on clustering, London, 2005.
[19]
D. Wishart. Mode analysis: a generalization of nearest neighbor which reduces chaining effects. In Numerical Taxonomy, pages 282-311. Academic Press, 1969.
[20]
R. Zadeh and S. Ben-David. A uniqueness theorem for clustering. In Proceedings of the Twenty-Fifth Conference Annual Conference on Uncertainty in Artificial Intelligence (UAI- 09), page 8, Corvallis, Oregon, 2009. AUAI Press.

Cited By

View all
  • (2024)Fitting Distances by Tree Metrics Minimizing the Total Error within a Constant FactorJournal of the ACM10.1145/363945371:2(1-41)Online publication date: 10-Apr-2024
  • (2024)Forecasting financial market structure from network features using machine learningKnowledge and Information Systems10.1007/s10115-024-02095-666:8(4497-4525)Online publication date: 1-Aug-2024
  • (2024)Curvature Sets Over Persistence DiagramsDiscrete & Computational Geometry10.1007/s00454-024-00634-072:1(91-180)Online publication date: 1-Jul-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image The Journal of Machine Learning Research
The Journal of Machine Learning Research  Volume 11, Issue
3/1/2010
3637 pages
ISSN:1532-4435
EISSN:1533-7928
Issue’s Table of Contents

Publisher

JMLR.org

Publication History

Published: 01 August 2010
Published in JMLR Volume 11

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)95
  • Downloads (Last 6 weeks)28
Reflects downloads up to 14 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Fitting Distances by Tree Metrics Minimizing the Total Error within a Constant FactorJournal of the ACM10.1145/363945371:2(1-41)Online publication date: 10-Apr-2024
  • (2024)Forecasting financial market structure from network features using machine learningKnowledge and Information Systems10.1007/s10115-024-02095-666:8(4497-4525)Online publication date: 1-Aug-2024
  • (2024)Curvature Sets Over Persistence DiagramsDiscrete & Computational Geometry10.1007/s00454-024-00634-072:1(91-180)Online publication date: 1-Jul-2024
  • (2024)Extracting Persistent Clusters in Dynamic Data via Möbius InversionDiscrete & Computational Geometry10.1007/s00454-023-00590-171:4(1276-1342)Online publication date: 1-Jun-2024
  • (2023)Optimization of inter-group criteria for clustering with minimum size constraintsProceedings of the 37th International Conference on Neural Information Processing Systems10.5555/3666122.3666576(10339-10358)Online publication date: 10-Dec-2023
  • (2023)On the Discrepancy between Kleinberg’s Clustering Axioms and k-Means Clustering Algorithm BehaviorMachine Language10.1007/s10994-023-06308-x112:7(2501-2553)Online publication date: 1-Mar-2023
  • (2023)Forty years of color quantization: a modern, algorithmic surveyArtificial Intelligence Review10.1007/s10462-023-10406-656:12(13953-14034)Online publication date: 27-Apr-2023
  • (2023)The Ultrametric Gromov–Wasserstein DistanceDiscrete & Computational Geometry10.1007/s00454-023-00583-070:4(1378-1450)Online publication date: 1-Dec-2023
  • (2022)Order preserving hierarchical agglomerative clusteringMachine Language10.1007/s10994-021-06125-0111:5(1851-1901)Online publication date: 1-May-2022
  • (2022)Towards continuous consistency axiomApplied Intelligence10.1007/s10489-022-03710-153:5(5635-5663)Online publication date: 30-Jun-2022
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Get Access

Login options

Full Access

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media