Abstract
We consider the problem of minimizing an objective function that depends on an orthonormal matrix. This situation is encountered, for example, when looking for common principal components. The Flury method is a popular approach but is not effective for higher-dimensional problems. We obtain several simple majorization–minimization (MM) algorithms that provide solutions to this problem and are effective in higher dimensions. We use mixture model-based clustering applications to illustrate our MM algorithms. We then use simulated data to compare them with other approaches in terms of convergence and computation time.
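To make the idea concrete, the following is a minimal sketch of one possible MM iteration over orthonormal matrices. It is not taken from the article. It assumes the objective has the form f(D) = Σ_g tr(W_g D A_g⁻¹ Dᵀ), where each W_g is a symmetric positive semi-definite matrix and each A_g is a fixed diagonal matrix with positive entries. The function names (`cpc_objective`, `mm_step`, `mm_minimize`) are hypothetical. Because W_g − ω_g I is negative semi-definite when ω_g is the largest eigenvalue of W_g, the objective can be majorized by a function that is linear in D, and that surrogate is minimized in closed form through a singular value decomposition.

```python
import numpy as np

def cpc_objective(D, W_list, A_list):
    """Assumed objective: sum_g tr(W_g D A_g^{-1} D^T).

    Each entry of A_list is a 1-D array holding the diagonal of A_g.
    """
    return sum(np.trace(W @ D @ np.diag(1.0 / a) @ D.T)
               for W, a in zip(W_list, A_list))

def mm_step(D, W_list, A_list):
    """One majorization-minimization update of the orthonormal matrix D."""
    p = D.shape[0]
    G = np.zeros((p, p))
    for W, a in zip(W_list, A_list):
        # eigvalsh returns eigenvalues in ascending order; take the largest.
        omega = np.linalg.eigvalsh(W)[-1]
        # Half the gradient of the concave part tr((W - omega I) D A^{-1} D^T).
        G += (W - omega * np.eye(p)) @ D @ np.diag(1.0 / a)
    # The linear surrogate tr(G^T D) is minimized over orthonormal D
    # at D = -U V^T, where G = U S V^T.
    U, _, Vt = np.linalg.svd(G)
    return -U @ Vt

def mm_minimize(W_list, A_list, n_iter=200, tol=1e-10):
    """Iterate mm_step from the identity until the decrease falls below tol."""
    p = W_list[0].shape[0]
    D = np.eye(p)
    history = [cpc_objective(D, W_list, A_list)]
    for _ in range(n_iter):
        D = mm_step(D, W_list, A_list)
        history.append(cpc_objective(D, W_list, A_list))
        if history[-2] - history[-1] < tol:
            break
    return D, history
```

Under these assumptions each update keeps D orthonormal and cannot increase the objective, which is the defining property of an MM scheme. In a mixture-model setting the diagonal matrices A_g would typically be re-estimated between updates; the sketch holds them fixed for simplicity.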
References
Absil P-A, Mahony R, Sepulchre R (2008) Optimization algorithms on matrix manifolds. Princeton University Press, Princeton
Andrews JL, McNicholas PD (2012) Model-based clustering, classification, and discriminant analysis via mixtures of multivariate t-distributions. Stat Comput 22(5):1021–1029
Arnold S, Phillips P (1999) Hierarchical comparison of genetic variance-covariance matrices. II. Coastal-inland divergence in the garter snake, Thamnophis elegans. Evolution 53:1516–1527
Banfield JD, Raftery AE (1993) Model-based Gaussian and non-Gaussian clustering. Biometrics 49(3):803–821
Biernacki C, Celeux G, Govaert G, Langrognet F (2006) Model-based cluster analysis and discriminant analysis with the MIXMOD software. Comput Stat Data Anal 51:587–600
Boik RJ (2003) Principal component models for correlation matrices. Biometrika 90:679–701
Boik RJ (2002) Spectral models for covariance matrices. Biometrika 89:159–182
Bouveyron C, Girard S, Schmid C (2007) High-dimensional data clustering. Comput Stat Data Anal 52:502–519
Browne RP, McNicholas PD (2012) Orthogonal Stiefel manifold optimization for eigen-decomposed covariance parameter estimation in mixture models. Stat Comput. To appear. doi:10.1007/s11222-012-9364-2
Browne RP, McNicholas PD (2013) mixture: Mixture models for clustering and classification. R package version 1.0
Celeux G, Govaert G (1995) Gaussian parsimonious clustering models. Pattern Recogn 28(5):781–793
Dasgupta A, Raftery AE (1998) Detecting features in spatial point processes with clutter via model-based clustering. J Am Stat Assoc 93:294–302
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J Royal Stat Soc Series B 39(1):1–38
Flury BW, Gautschi W (1984) Common principal components in k groups. J Am Stat Assoc 79(388):892–898
Flury BW, Gautschi W (1986) An algorithm for simultaneous orthogonal transformation of several positive definite symmetric matrices to nearly diagonal form. SIAM J Sci Stat Comput 7(1):169–184
Hunter D (2004) MM algorithms for generalized Bradley-Terry models. Ann Stat 32:386–408
Hunter D, Lange K (2000) Quantile regression via an MM algorithm. J Comput Graph Stat 9:60–77
Kiers H (2002) Setting up alternating least squares and iterative majorization algorithms for solving various matrix optimization problems. Comput Stat Data Anal 41:157–170
Klingenberg C, Neuenschwander B, Flury B (1996) Ontogeny and individual variation: Analysis of patterned covariance matrices with common principal components. Syst Biol 45:135–150
Krzanowski WJ (1990) Between-group analysis with heterogeneous covariance matrices: The common principal component model. J Classif 7:81–98
Kulkarni B, Rao G (2000) The common principal components approach for clustering under multiple sampling. J Indian Soc Agric Stat 53:1–11
Lebret R, Iovleff S, Langrognet F (2012) Rmixmod: MIXture MODelling Package. R package version 1.1.1
Lefkovitch LP (2004) Consensus principal components. Biometrical J 35:567–580
Merbouha A, Mkhadri A (2004) Regularization of the location model in discrimination with mixed discrete and continuous variables. Comput Stat Data Anal 45:463–576
Oksanen J, Huttunen P (1989) Finding a common ordination for several data sets by individual differences scaling. Plant Ecol 83:137–145
R Development Core Team (2012) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna
Schott J (1998) Estimating correlation matrices that have common eigenvectors. Comput Stat Data Anal 27:445–459
Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6(2):461–464
Sengupta S, Boyle J (1998) Using common principal components for comparing GCM simulations. J Climate 11:816–830
von Mises R, Pollaczek-Geiringer H (1929) Praktische Verfahren der Gleichungsauflösung. Zeitschrift für Angewandte Mathematik und Mechanik 9(1):58–77
Yang K, Shahabi C (2006) An efficient k nearest neighbor search for multivariate time series. Info Comput 205:65–98
Acknowledgments
The authors gratefully acknowledge the helpful comments of two anonymous reviewers and a guest editor. This work was supported by the University Research Chair in Computational Statistics.
Cite this article
Browne, R.P., McNicholas, P.D. Estimating common principal components in high dimensions. Adv Data Anal Classif 8, 217–226 (2014). https://doi.org/10.1007/s11634-013-0139-1
DOI: https://doi.org/10.1007/s11634-013-0139-1
Keywords
- Clustering
- Common principal components
- GPCM
- Flury
- Minimization
- Maximization
- Mixture
- Mixture models
- Model-based clustering
- MM algorithm