Nothing Special   »   [go: up one dir, main page]

skip to main content
article

Fuzzy spectral clustering by PCCA+: application to Markov state models and data classification

Published: 01 June 2013 Publication History

Abstract

Given a row-stochastic matrix describing pairwise similarities between data objects, spectral clustering makes use of the eigenvectors of this matrix to perform dimensionality reduction for clustering in fewer dimensions. One example from this class of algorithms is the Robust Perron Cluster Analysis (PCCA+), which delivers a fuzzy clustering. Originally developed for clustering the state space of Markov chains, the method became popular as a versatile tool for general data classification problems. The robustness of PCCA+, however, cannot be explained by previous perturbation results, because the matrices in typical applications do not comply with the two main requirements: reversibility and nearly decomposability. We therefore demonstrate in this paper that PCCA+ always delivers an optimal fuzzy clustering for nearly uncoupled, not necessarily reversible, Markov chains with transition states.

References

[1]
Bapat RB, Rhagavan TES (1997) Nonnegative matrices and applications. Cambridge University Press, Cambridge.
[2]
Belkin M, Niyogi P (2003) Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput 15:1373-1396.
[3]
Bezdek JC, Ehrlich R, Full W (1984) The fuzzy c-means clustering algorithm. Comput Geosci 10(2-3): 191-203.
[4]
Bowman GR (2012) Coarse-grained Markov chains capture molecular thermodynamics and kinetics in no uncertain terms. arxiv.org/abs/1201.3867.
[5]
Bowman GR, Beauchamp KA, Boxer G, Pande VS (2009) Progress and challenges in the automated construction of Markov state models for full protein systems. J Chem Phys 131(12):124101.
[6]
Brémaud P (1999) Markov Chains: Gibbs Fields, Monte Carlo simulation, and Queues. Number 31 in texts in applied mathematics. Springer, New York.
[7]
Chodera JD, Singhal N, Swope WC, Pande VS, Dill KA (2007) Automatic discovery of metastable states for the construction of Markov models of macromolecular conformational dynamics. J Chem Phys 126(155101).
[8]
Coifman RR, Lafon S (2006) Diffusion maps. Appl Comput Harmon Anal 21:5-30.
[9]
Courtois PJ (1977) Decomposability: Queueing and computer system applications. Academic Press, Orlando.
[10]
Dellnitz M, Junge O (1999) On the approximation of complicated dynamical behavior. SIAM J Numer Anal 36(2):491-515.
[11]
Deuflhard P (2003) From molecular dynamics to conformational dynamics in drug design. In: Kirkilionis M, Krömker S, Rannacher R, Tomi F (eds) Trends in nonlinear analysis. Springer, Berlin, pp 269-287.
[12]
Deuflhard P, Weber M (2005) Robust Perron cluster analysis in conformation dynamics. Linear Algebra Appl 398:161-184.
[13]
Deuflhard P, Huisinga W, Fischer A, Schütte Ch (2000) Identification of almost invariant aggregates in reversible nearly uncoupled Markov chains. Linear Algebra Appl 315:39-59.
[14]
Fackeldey K, Bujotzek A, Weber M (2013) A meshless discretization method for Markov state models applied to explicit water peptide folding simulations. In: Griebel M, Schweitzer MA (eds) Meshfree methods for partial differential equations VI, volume 89 of Lecture Notes in Computational Science and Engineering. Springer, Berlin, pp 141-154.
[15]
Fischer B, Buhmann JM (2002) Data resampling for path based clustering. In: Proceedings of the 24th DAGM symposium on pattern regognition, volume 2449 of Lecture Notes in Computer Science. Springer, London, pp 206-214.
[16]
Fischer I, Poland J (2005) Amplifying the block matrix structure for spectral clustering. In: van Otterlo M, Poel M, Nijholt A (eds) Proceedings of the 14th annual machine learning conference of Belgium and the Netherlands, pp 21-28.
[17]
Fischer B, Zöller T, Buhmann J (2001) Path based pairwise data clustering with application to texture segmentation. In: Energy minimization methods in computer vision and pattern recognition, volume 2134 of Lecture Notes in Computer Science. Springer, Berlin, pp 235-250.
[18]
Frenkel D, Smit B (2002) Understanding molecular simulation: from Algorithms to applications, volume 1 of computational science series. Academic Press, London.
[19]
Halgren T, Nachbar B (1996) Merck molecular force field. IV. Conformational energies and geometries for MMFF94. J Comput Chem 17(5-6):587-615.
[20]
Jimenez R (2008) Fuzzy spectral clustering for identification of rock discontinuity sets. Rock Mech Rock Eng 41:929-939.
[21]
Kannan R, Vempala S, Vetter A (2004) On clustering: good, bad and spectral. J ACM 51:497-515.
[22]
Kato T (1984) Perturbation theory for linear operators. Springer, Berlin.
[23]
Kijima M (1997) Markov processes for stochastic modeling. Chapman and Hall, Stochastic Modeling Series.
[24]
Korenblum D, Shalloway D (2003) Macrostate data clustering. Phys Rev E 67:056704.
[25]
Kube S, Deuflhard P (2006) Errata on "Robust Perron Cluster Analysis in Conformation Dynamics". December. http://www.zib.de/susanna.roeblitz
[26]
Kube S, Weber M (2006) Coarse grained molecular kinetics. ZIB-Report 06-35, Zuse Institute Berlin.
[27]
Kube S, Weber M (2007) A coarse graining method for the identification of transition rates between molecular conformations. J Chem Phys 126(2).
[28]
Lehoucq RB, Sorensen DC (1996) Deflation techniques for an implicitly re-started Arnoldi iteration. SIAM J Matrix Anal Appl 17(4):789-821.
[29]
Metzner P, Weber M, Schütte C (2010) Observation uncertainty in reversible Markov chains. Phys Rev E Stat Nonlinear Soft Matter Phys 82:031114.
[30]
Meyer CD (1989) Stochastic complementation, uncoupling Markov chains, and the theory of nearly reducible systems. SIAM Rev 31(2):240-272.
[31]
Nadler B, Lafon S, Coifman RR, Kevrekidis IG (2005) Diffusion maps, spectral clustering and eigenfunctions of Fokker-Planck operators. In: Advances in neural information processing systems, vol. 18. MIT Press, Cambridge, pp 955-962.
[32]
Nadler B, Lafon S, Coifman RR, Kevrekidis IG (2006) Diffusion maps, spectral clustering and reaction coordinates of dynamical systems. Appl Comput Harmon Anal 21(1):113-127.
[33]
Ng A, Jordan M, Weiss Y (2002) On spectral clustering: analysis and an algorithm. In: Dietterich TG, Becker S, Ghahramani Z (eds) Advances in neural information processing systems, vol 14. MIT Press, Cambridge, pp 849-856.
[34]
Prinz J-H, Wu H, Sarich M, Keller B, Fischbach M, Held M, Chodera JD, Schütte Ch, Noé F (2011) Markov models of molecular kinetics: Generation and validation. J Chem Phys 134:174105.
[35]
Röblitz S (2008) Statistical error estimation and grid-free hierarchical refinement in conformation dynamics. Doctoral thesis, Department of Mathematics and Computer Science, Freie Universität Berlin. http://www.diss.fu-berlin.de/diss/receive/FUDISS_thesis_000000008079
[36]
Röblitz S, Weber M (2009) Fuzzy spectral clustering by PCCA+. In: Mucha H-J, Ritter G (eds) Classification and clustering: models, software and applications, number 26 in WIAS Report, Berlin. WIAS Berlin, WIAS Berlin, pp 73-79.
[37]
Sarich M, Noé F, Schütte Ch (2010) On the approximation quality of Markov state models. Multiscale Model Simul 8(4):1154-1177.
[38]
Schütte Ch (1999) Conformational dynamics: modelling, theory, algorithms, and application to biomolecules. Habilitation thesis, Department of Mathematics and Computer Science, Freie Universität Berlin.
[39]
Shi J, Malik J (2000) Normalized cuts and image segmentation. IEE Trans Pattern Anal Mach Intell 22(8):888-905.
[40]
Sleijpen GLG, van der Vorst HA (1996) A Jacobi-Davidson iteration method for linear eigenvalue problems. SIAM J Matrix Anal Appl 17(2):401-425.
[41]
Stewart GW (1984) On the structure of nearly uncoupled Markov chains. In: Iazeolla G, Courtois PJ, Hordijk A (eds) Mathematical computer performance and reliability. Elsevier, New York, pp 287-302.
[42]
Stewart GW, Ji-guang Sun (1990) Matrix perturbation theory. Computer science and scientific computing. Academic Press, Boston.
[43]
von Luxburg U (2007) A tutorial on spectral clustering. Stat Comput 17:395-416.
[44]
Weber M (2003) Improved Perron cluster analysis. ZIB-Report 03-04, Zuse Institute Berlin (ZIB).
[45]
WeberM(2006) Meshless methods in conformation dynamics. Doctoral thesis, Department of Mathematics and Computer Science, Freie Universität Berlin. Verlag Dr. Hut, München.
[46]
Weber M (2013) Adaptive spectral clustering in molecular simulations. In: Giusti A, Ritter G, Vichi M (eds) Classification and data mining. Springer, Berlin, pp 147-154.
[47]
Weber M, Galliat T (2002) Characterization of transition states in conformational dynamics using Fuzzy sets. ZIB-Report 02-12, Zuse Institute Berlin.
[48]
Weber M, Rungsarityotin W, Schliep A (2006) An indicator for the number of clusters using a linear map to simplex structure. In: Spiliopoulou M, Kruse R, Borgelt C, Nürnberger A, Gaul W (eds) From data and information analysis to knowledge engineering, studies in classification, data analysis, and knowledge organization. Springer, Berlin, pp 103-110.
[49]
White B, Shalloway D (2009) Efficient uncertainty minimization for fuzzy spectral clustering. Phys Rev E 80:056704.
[50]
Zhao F, Liu H, Jiao L (2011) Spectral clustering with fuzzy similarity measure. Digit Signal Process 21:701-709.

Cited By

View all
  • (2024)Spectral clustering of Markov chain transition matrices with complex eigenvaluesJournal of Computational and Applied Mathematics10.1016/j.cam.2024.115791444:COnline publication date: 1-Jul-2024
  • (2024)Uncovering Dynamic Structures Within Cyclic Attractors of Asynchronous Boolean Networks with Spectral ClusteringComputational Methods in Systems Biology10.1007/978-3-031-71671-3_16(226-246)Online publication date: 17-Sep-2024
  • (2023)Implicit transfer operator learningProceedings of the 37th International Conference on Neural Information Processing Systems10.5555/3666122.3667704(36449-36462)Online publication date: 10-Dec-2023
  1. Fuzzy spectral clustering by PCCA+: application to Markov state models and data classification

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image Advances in Data Analysis and Classification
    Advances in Data Analysis and Classification  Volume 7, Issue 2
    June 2013
    112 pages

    Publisher

    Springer-Verlag

    Berlin, Heidelberg

    Publication History

    Published: 01 June 2013

    Author Tags

    1. 60J10
    2. 62H30
    3. 65K05
    4. Molecular simulations
    5. Perron eigenvalues
    6. Perturbation theory

    Qualifiers

    • Article

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)0
    • Downloads (Last 6 weeks)0
    Reflects downloads up to 26 Nov 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)Spectral clustering of Markov chain transition matrices with complex eigenvaluesJournal of Computational and Applied Mathematics10.1016/j.cam.2024.115791444:COnline publication date: 1-Jul-2024
    • (2024)Uncovering Dynamic Structures Within Cyclic Attractors of Asynchronous Boolean Networks with Spectral ClusteringComputational Methods in Systems Biology10.1007/978-3-031-71671-3_16(226-246)Online publication date: 17-Sep-2024
    • (2023)Implicit transfer operator learningProceedings of the 37th International Conference on Neural Information Processing Systems10.5555/3666122.3667704(36449-36462)Online publication date: 10-Dec-2023
    • (undefined)Fuzzy density based clustering method: Soft DBSCAN-GM2016 IEEE 8th International Conference on Intelligent Systems (IS)10.1109/IS.2016.7737459(443-448)

    View Options

    View options

    Login options

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media