Abstract
Although very widely used in unsupervised data mining, most clustering methods suffer from instability of the resulting clusters with respect to the initialization of the algorithm (as in k-means, for example). Here we show that this problem can be tackled elegantly and efficiently by meta-clustering the clusters produced in several different runs of the algorithm, especially if “soft” clustering algorithms (such as Nonnegative Matrix Factorization) are used at both the object level and the meta-level. The essential difference from other meta-clustering approaches is that our algorithm detects frequently occurring sub-clusters (rather than complete clusters) in the various runs, which allows it to outperform existing algorithms. Additionally, we show how to perform two-way meta-clustering, i.e. how to take both the object and sample dimensions of clusters into account simultaneously, a feature that is essential, for example, for biclustering gene expression data, but that has not been considered before.
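The one-way idea summarized in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the abstract does not give its update rules, normalization, or matching procedure. The sketch assumes the standard Lee and Seung multiplicative updates for the Frobenius objective, a data matrix X with objects in rows, and cluster prototypes taken as the rows of H. The function names `nmf` and `meta_cluster`, the unit-length row normalization, and all parameter defaults are hypothetical choices made for illustration.

```python
import numpy as np


def nmf(X, k, n_iter=200, seed=0, eps=1e-9):
    """Approximate nonnegative X (n x m) as W @ H, with W (n x k) and H (k x m).

    Uses Lee-Seung multiplicative updates for the Frobenius objective.
    The result depends on the random initialization selected by `seed`.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        # eps keeps the denominators strictly positive.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def meta_cluster(X, k, n_runs=10, n_iter=200):
    """Soft-cluster the soft clusters obtained from several NMF runs.

    Assumption for this sketch: each row of H is one cluster prototype.
    """
    # Object level: n_runs factorizations that differ only in initialization.
    prototypes = [nmf(X, k, n_iter=n_iter, seed=r)[1] for r in range(n_runs)]
    stacked = np.vstack(prototypes)  # shape (n_runs * k, m)
    # Scale every prototype to unit length so that runs are comparable.
    stacked = stacked / (np.linalg.norm(stacked, axis=1, keepdims=True) + 1e-12)
    # Meta level: factor the stacked prototypes with NMF again.
    # Rows of G are meta-cluster prototypes; A holds the soft membership
    # of each run-level cluster in each meta-cluster.
    A, G = nmf(stacked, k, n_iter=n_iter, seed=n_runs)
    return A, G
```

Calling `meta_cluster(X, k=3, n_runs=4)` on a nonnegative 20 x 8 matrix returns A with shape (12, 3) and G with shape (3, 8). Because the meta-level factorization is itself soft, a run-level cluster may load on several meta-clusters, which is one way parts of clusters that recur across runs could surface. The two-way variant mentioned in the abstract is not covered by this sketch.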
Keywords
- Nonnegative Matrix Factorization
- Nonnegative Matrix
- Nonnegativity Constraint
- Average Match
- Cluster Prototype
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Badea, L. (2005). Clustering and Metaclustering with Nonnegative Matrix Decompositions. In: Gama, J., Camacho, R., Brazdil, P.B., Jorge, A.M., Torgo, L. (eds) Machine Learning: ECML 2005. ECML 2005. Lecture Notes in Computer Science, vol 3720. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11564096_7
DOI: https://doi.org/10.1007/11564096_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29243-2
Online ISBN: 978-3-540-31692-3
eBook Packages: Computer Science, Computer Science (R0)