Abstract
Recent advances in the matrix-variate model-based clustering literature reflect the growing interest in this kind of data modeling. In this framework, finite mixture models constitute a powerful clustering technique, although they tend to suffer from overparameterization because of the large number of parameters to be estimated. To cope with this issue, parsimonious matrix-variate normal mixtures have recently been proposed in the literature. However, for many real phenomena the tails of the mixture components of such models are lighter than required, with a direct effect on the corresponding fitting results. Thus, in this paper we introduce a family of 196 parsimonious mixture models based on the matrix-variate tail-inflated normal distribution, an elliptical heavy-tailed generalization of the matrix-variate normal distribution. Parsimony is achieved by applying the well-known eigen-decomposition of the component scale matrices, as well as by allowing the tailedness parameters of the mixture components to be tied across groups. An AECM algorithm for parameter estimation is presented. The proposed models are then fitted to simulated and real data, and comparisons with parsimonious matrix-variate normal mixtures are provided.
References
Biernacki, C., Celeux, G., Govaert, G.: Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models. Comput. Stat. Data Anal. 41(3–4), 561–575 (2003)
Browne, R.P., McNicholas, P.D.: Estimating common principal components in high dimensions. Adv. Data Anal. Classific. 8(2), 217–226 (2014)
Celeux, G., Govaert, G.: Gaussian parsimonious clustering models. Pattern Recognit. 28(5), 781–793 (1995)
Dempster, A.P., Laird, N.M., Rubin, D.B.: Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B (Methodol.) 39(1), 1–22 (1977)
Doğru, F.Z., Bulut, Y.M., Arslan, O.: Finite mixtures of matrix variate t distributions. Gazi Univ. J. Sci. 29(2), 335–341 (2016)
Farcomeni, A., Punzo, A.: Robust model-based clustering with mild and gross outliers. Test 29(4), 989–1007 (2020)
Gallaugher, M.P.B., McNicholas, P.D.: Finite mixtures of skewed matrix variate distributions. Pattern Recognit. 80, 83–93 (2018)
Gupta, A.K., Varga, T., Bodnar, T.: Elliptically Contoured Models in Statistics and Portfolio Theory. Springer, New York (2013)
Leisch, F.: FlexMix: a general framework for finite mixture models and latent class regression in R. J. Stat. Softw. 11(8), 1–18 (2004)
McLachlan, G.J., Krishnan, T.: The EM Algorithm and Extensions. Wiley (2007)
McLachlan, G.J., Peel, D.: Finite Mixture Models. Wiley, New York (2000)
Melnykov, V., Melnykov, I.: Initializing the EM algorithm in Gaussian mixture models with an unknown number of components. Comput. Stat. Data Anal. 56(6), 1381–1395 (2012)
Melnykov, V., Zhu, X.: On model-based clustering of skewed matrix data. J. Multivar. Anal. 167, 181–194 (2018)
Melnykov, V., Zhu, X.: Studying crime trends in the USA over the years 2000–2012. Adv. Data Anal. Classific. 13(1), 325–341 (2019)
Meng, X.L., Van Dyk, D.: The EM algorithm-an old folk-song sung to a fast new tune. J. R. Stat. Soc. Ser. B (Stat. Methodol.) 59(3), 511–567 (1997)
Michael, S., Melnykov, V.: An effective strategy for initializing the EM algorithm in finite mixture models. Adv. Data Anal. Classific. 10(4), 563–583 (2016)
Sarkar, S., Zhu, X., Melnykov, V., Ingrassia, S.: On parsimonious models for modeling matrix data. Comput. Stat. Data Anal. 142, 106822 (2020)
Schwarz, G.: Estimating the dimension of a model. Ann. Stat. 6(2), 461–464 (1978)
Tomarchio, S.D., Punzo, A., Bagnato, L.: Two new matrix-variate distributions with application in model-based clustering. Comput. Stat. Data Anal. 152, 107050 (2020)
Tomarchio, S.D., Gallaugher, M.P.B., Punzo, A., McNicholas, P.D.: Mixtures of matrix-variate contaminated normal distributions. J. Comput. Graph. Stat. 31(2), 413–421 (2022)
Tomarchio, S.D., McNicholas, P.D., Punzo, A.: Matrix normal cluster-weighted models. J. Classific. 38(3), 556–575 (2021)
Viroli, C.: Finite mixtures of matrix normal distributions for classifying three-way data. Stat. Comput. 21(4), 511–522 (2011)
Viroli, C.: Model based clustering for three-way data structures. Bayesian Anal. 6(4), 573–602 (2011)
Zhu, X., Melnykov, V.: MatTransMix: an R package for clustering matrices. R package version 0.1.15 (2021)
Appendices
Appendix A
Let \(\ddot{\textbf{V}}=\displaystyle \sum _{g=1}^G \ddot{\textbf{V}}_g\), where \(\ddot{\textbf{V}}_g = \displaystyle \sum _{i=1}^N \ddot{z}_{ig}\ddot{w}_{ig}\left( \textbf{X}_{i}-\ddot{\textbf{M}}_g\right) \dot{\boldsymbol{\Psi }}_g^{-1}\left( \textbf{X}_{i}-\ddot{\textbf{M}}_g\right) '\). Then, we have the following updates:
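The weighted row-scatter matrices \(\ddot{\textbf{V}}_g\) defined above drive all of the updates that follow. A minimal NumPy sketch of their computation is given below; the function and variable names (and the array shapes) are our own illustrative assumptions, not code from the paper.

```python
import numpy as np

def row_scatter(X, z, w, M, Psi_inv):
    """Weighted row-scatter matrices V_g (p x p), one per group.

    X: (N, p, r) data array; z: (N, G) posterior memberships;
    w: (N, G) tail weights; M: (G, p, r) component means;
    Psi_inv: (G, r, r) inverses of the column scale matrices.
    """
    N, p, r = X.shape
    G = z.shape[1]
    V = np.zeros((G, p, p))
    for g in range(G):
        for i in range(N):
            R = X[i] - M[g]                       # (p, r) residual matrix
            V[g] += z[i, g] * w[i, g] * R @ Psi_inv[g] @ R.T
    return V
```

With `V = row_scatter(...)`, the EII update, for example, is simply `lam = np.trace(V.sum(axis=0)) / (p * r * N)`.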
- Model EII
$$\begin{aligned} {\ddot{\lambda }} = \frac{{{\,\text{ tr }}\left\{ \ddot{\textbf{V}}\right\} }}{prN}; \end{aligned}$$
- Model VII
$$\begin{aligned} \ddot{\lambda }_g = \frac{{{\,\text{ tr }}\left\{ \ddot{\textbf{V}}_g\right\} }}{pr \displaystyle \sum _{i=1}^N \ddot{z}_{ig}}; \end{aligned}$$
- Model EEI
$$\begin{aligned} \ddot{\boldsymbol{\Delta }} =\frac{\text {diag}\left( \ddot{\textbf{V}}\right) }{ \left| \text {diag}\left( \ddot{\textbf{V}}\right) \right| ^\frac{1}{p}} \quad \text {and} \quad \ddot{\lambda } = \frac{\left| \text {diag}\left( \ddot{\textbf{V}}\right) \right| ^\frac{1}{p}}{rN}; \end{aligned}$$
- Model VEI
$$\begin{aligned} \ddot{\boldsymbol{\Delta }} = \frac{\text {diag}\left( \displaystyle \sum \limits _{g=1}^G \dot{\lambda }_g^{-1}\ddot{\textbf{V}}_g\right) }{\left| \text {diag}\left( \displaystyle \sum \limits _{g=1}^G \dot{\lambda }_g^{-1}\ddot{\textbf{V}}_g\right) \right| ^\frac{1}{p}} \quad \text {and} \quad \ddot{\lambda }_g = \frac{{{\,\text{ tr }}\left\{ \ddot{\boldsymbol{\Delta }}^{-1} \ddot{\textbf{V}}_g \right\} }}{pr\displaystyle \sum _{i=1}^N \ddot{z}_{ig}}; \end{aligned}$$
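The dotted \(\dot{\lambda }_g\) in the VEI update indicates that \(\boldsymbol{\Delta }\) and the \(\lambda _g\) are updated alternately, each using the other's previous iterate. A minimal NumPy sketch of this alternation, with assumed names and shapes, is:

```python
import numpy as np

def vei_updates(V, p, r, Nz, n_iter=20):
    """Alternate the VEI updates of Delta (shared shape) and lambda_g.

    V: (G, p, p) group scatter matrices; Nz: (G,) sums of z_ig per group.
    The paper's dot/double-dot notation becomes previous/current iterate.
    Delta is stored as the vector of its diagonal entries.
    """
    G = len(V)
    lam = np.ones(G)                              # initial volume parameters
    for _ in range(n_iter):
        S = sum(V[g] / lam[g] for g in range(G))  # lambda-weighted pooled scatter
        d = np.diag(S)
        Delta = d / np.prod(d) ** (1.0 / p)       # unit-determinant diagonal
        lam = np.array([np.trace(np.diag(1.0 / Delta) @ V[g])
                        / (p * r * Nz[g]) for g in range(G)])
    return Delta, lam
```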
- Model EVI
$$\begin{aligned} \ddot{\boldsymbol{\Delta }}_g = \frac{\text {diag}\left( \ddot{\textbf{V}}_g\right) }{\left| \text {diag}\left( \ddot{\textbf{V}}_g\right) \right| ^\frac{1}{p}} \quad \text {and} \quad \ddot{\lambda } = \frac{\displaystyle \sum \limits _{g=1}^G\left| \text {diag}\left( \ddot{\textbf{V}}_g\right) \right| ^\frac{1}{p}}{rN}; \end{aligned}$$
- Model VVI
$$\begin{aligned} \ddot{\boldsymbol{\Delta }}_g = \frac{\text {diag}\left( \ddot{\textbf{V}}_g\right) }{\left| \text {diag}\left( \ddot{\textbf{V}}_g\right) \right| ^\frac{1}{p}} \quad \text {and} \quad \ddot{\lambda }_g = \frac{\left| \text {diag}\left( \ddot{\textbf{V}}_g\right) \right| ^\frac{1}{p}}{r\displaystyle \sum _{i=1}^N \ddot{z}_{ig}}; \end{aligned}$$
- Model EEE
$$\begin{aligned} \ddot{\boldsymbol{\Sigma }}= \frac{\ddot{\textbf{V}}}{rN}; \end{aligned}$$
- Model VEE
$$\begin{aligned} \ddot{\boldsymbol{\Gamma }}\ddot{\boldsymbol{\Delta }}\ddot{\boldsymbol{\Gamma }}' = \frac{\displaystyle \sum \limits _{g=1}^G \dot{\lambda }_g^{-1}\ddot{\textbf{V}}_g}{\left| \displaystyle \sum \limits _{g=1}^G \dot{\lambda }_g^{-1}\ddot{\textbf{V}}_g\right| ^{\frac{1}{p}}} \quad \text {and} \quad \ddot{\lambda }_g = \frac{{{\,\text{ tr }}\left\{ (\ddot{\boldsymbol{\Gamma }}\ddot{\boldsymbol{\Delta }}\ddot{\boldsymbol{\Gamma }}')^{-1} \ddot{\textbf{V}}_g \right\} }}{pr\displaystyle \sum _{i=1}^N \ddot{z}_{ig}}; \end{aligned}$$
- Model EVE
For this model, there is no analytical solution for \(\boldsymbol{\Gamma }\). Thus, an iterative minorization-maximization (MM) algorithm [2] is implemented. Specifically, the following surrogate function is defined
$$\begin{aligned} f\left( \boldsymbol{\Gamma }\right) = \sum \limits _{g=1}^G \,\text{ tr }\left\{ \textbf{V}_g\boldsymbol{\Gamma }\dot{\boldsymbol{\Delta }}_{g}^{-1}\boldsymbol{\Gamma }'\right\} \le S + \,\text{ tr }\left\{ \dot{\textbf{F}}\boldsymbol{\Gamma }\right\} , \end{aligned}$$where \(S\) is a constant and \(\dot{\textbf{F}} = \displaystyle \sum _{g=1}^G\left( \dot{\boldsymbol{\Delta }}_{g}^{-1} \dot{\boldsymbol{\Gamma }}' \textbf{V}_g - e_g \dot{\boldsymbol{\Delta }}_{g}^{-1} \dot{\boldsymbol{\Gamma }}'\right) \), with \(e_g\) being the largest eigenvalue of \(\textbf{V}_g\). The update of \(\boldsymbol{\Gamma }\) is given by \(\ddot{\boldsymbol{\Gamma }} = \dot{\textbf{G}} \dot{\textbf{H}}'\), where \(\dot{\textbf{G}}\) and \(\dot{\textbf{H}}\) are obtained from the singular value decomposition of \(\dot{\textbf{F}}\). This process is repeated until a specified convergence criterion is met, and the estimate \(\ddot{\boldsymbol{\Gamma }}\) is taken from the last iteration. Then, we obtain
$$\begin{aligned} {\ddot{{\boldsymbol{\Delta }}}}_{g} = \frac{{\text {diag}}\left( {\ddot{{\boldsymbol{\Gamma }}}'} {\ddot{{\textbf{V}}}}_{g} {\ddot{{\boldsymbol{\Gamma }}}}\right) }{{\left| {\text {diag}}\left( {\ddot{{\boldsymbol{\Gamma }}}'}{\ddot{{\textbf{V}}}}_{g} {\ddot{{\boldsymbol{\Gamma }}}}\right) \right| ^{\frac{1}{p}}}} \quad \text {and} \quad {\ddot{\lambda }} = \frac{\sum \limits _{g=1}^{G}{\text {tr}}\left( \ddot{\boldsymbol{\Gamma }} \ddot{\boldsymbol{\Delta }}_{g}^{-1} \ddot{\boldsymbol{\Gamma }}'\ddot{\textbf{V}}_{g}\right) }{prN}; \end{aligned}$$
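The MM iteration for \(\boldsymbol{\Gamma }\) in the EVE model can be sketched as follows. This is a minimal NumPy illustration of the surrogate-based scheme described above, not the paper's implementation; names and shapes are our assumptions, and \(\boldsymbol{\Delta }_g\) is stored as a vector of diagonal entries.

```python
import numpy as np

def mm_update_gamma(V, Delta, Gamma, n_iter=50, tol=1e-8):
    """MM iteration for the common orthogonal matrix Gamma (EVE model).

    V: (G, p, p) group scatter matrices; Delta: (G, p) diagonal entries
    of the current Delta_g; Gamma: (p, p) orthogonal starting value.
    """
    p = Gamma.shape[0]
    for _ in range(n_iter):
        F = np.zeros((p, p))
        for g in range(len(V)):
            e_g = np.linalg.eigvalsh(V[g]).max()  # largest eigenvalue of V_g
            Dinv = np.diag(1.0 / Delta[g])
            F += Dinv @ Gamma.T @ V[g] - e_g * Dinv @ Gamma.T
        U, _, Vt = np.linalg.svd(F)
        Gamma_new = U @ Vt                        # Gamma = G H' from SVD of F
        if np.linalg.norm(Gamma_new - Gamma) < tol:
            Gamma = Gamma_new
            break
        Gamma = Gamma_new
    return Gamma
```

By construction each update is a product `U @ Vt` of SVD factors, so the returned matrix is always orthogonal.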
- Model VVE
Similarly to the EVE case, there is no analytical solution for \(\boldsymbol{\Gamma }\), and the MM algorithm described above is implemented. Then, we have
$$\begin{aligned} \ddot{\boldsymbol{\Delta }}_g = \frac{\text {diag}\left( \ddot{\boldsymbol{\Gamma }}' \ddot{\textbf{V}}_g \ddot{\boldsymbol{\Gamma }}\right) }{\left| \text {diag}\left( \ddot{\boldsymbol{\Gamma }}' \ddot{\textbf{V}}_g \ddot{\boldsymbol{\Gamma }}\right) \right| ^\frac{1}{p}} \quad \text {and} \quad \ddot{\lambda }_g = \frac{\left| \text {diag}\left( \ddot{\boldsymbol{\Gamma }}' \ddot{\textbf{V}}_g \ddot{\boldsymbol{\Gamma }}\right) \right| ^{\frac{1}{p}}}{r\displaystyle \sum _{i=1}^N \ddot{z}_{ig}}; \end{aligned}$$
- Model EEV
Consider the eigen-decomposition \(\ddot{\textbf{V}}_g=\ddot{\textbf{L}}_g \ddot{\textbf{D}}_g \ddot{\textbf{L}}_g'\), where the diagonal matrix \(\ddot{\textbf{D}}_g\) contains the eigenvalues in descending order and the orthogonal matrix \(\ddot{\textbf{L}}_g\) the corresponding eigenvectors. Then, we obtain
$$\begin{aligned} \ddot{\boldsymbol{\Gamma }}_g=\ddot{\textbf{L}}_g , \quad \ddot{\boldsymbol{\Delta }} = \frac{\displaystyle \sum \limits _{g=1}^G \ddot{\textbf{D}}_g}{\left| \displaystyle \sum \limits _{g=1}^G \ddot{\textbf{D}}_g\right| ^\frac{1}{p}} \quad \text {and} \quad \ddot{\lambda } = \frac{\left| \displaystyle \sum \limits _{g=1}^G \ddot{\textbf{D}}_g\right| ^\frac{1}{p}}{rN}; \end{aligned}$$
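The EEV updates from the eigen-decompositions can be sketched in NumPy as below; function and variable names are illustrative assumptions, and \(\boldsymbol{\Delta }\) is returned as the vector of its diagonal entries.

```python
import numpy as np

def eev_updates(V, r, N):
    """EEV updates from eigen-decompositions V_g = L_g D_g L_g'.

    V: (G, p, p) symmetric scatter matrices; r: number of columns;
    N: total sample size (sum of all z_ig).
    """
    G, p, _ = V.shape
    Gammas, Dsum = [], np.zeros(p)
    for g in range(G):
        evals, evecs = np.linalg.eigh(V[g])       # ascending eigenvalues
        order = np.argsort(evals)[::-1]           # reorder to descending
        Gammas.append(evecs[:, order])            # Gamma_g = L_g
        Dsum += evals[order]
    det_term = np.prod(Dsum) ** (1.0 / p)         # |sum_g D_g|^(1/p)
    Delta = Dsum / det_term                       # unit-determinant diagonal
    lam = det_term / (r * N)
    return Gammas, Delta, lam
```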
- Model VEV
By using the same eigen-decomposition as for the EEV model, we have
$$\begin{aligned} \ddot{\boldsymbol{\Gamma }}_g=\ddot{\textbf{L}}_g , \quad \ddot{\boldsymbol{\Delta }} = \frac{\displaystyle \sum \limits _{g=1}^G \dot{\lambda }_g^{-1} \ddot{\textbf{D}}_g}{\left| \displaystyle \sum \limits _{g=1}^G \dot{\lambda }_g^{-1} \ddot{\textbf{D}}_g\right| ^\frac{1}{p}} \quad \text {and} \quad \ddot{\lambda }_g = \frac{{{\,\text{ tr }}\left\{ \ddot{\textbf{D}}_g \ddot{\boldsymbol{\Delta }}^{-1} \right\} }}{pr\displaystyle \sum _{i=1}^N \ddot{z}_{ig}}; \end{aligned}$$
- Model EVV
$$\begin{aligned} \ddot{\boldsymbol{\Gamma }}_g\ddot{\boldsymbol{\Delta }}_g\ddot{\boldsymbol{\Gamma }}_g' = \frac{\ddot{\textbf{V}}_g}{\left| \ddot{\textbf{V}}_g\right| ^{\frac{1}{p}}} \quad \text {and} \quad \ddot{\lambda } = \frac{\displaystyle \sum \limits _{g=1}^G \left| \ddot{\textbf{V}}_g\right| ^{\frac{1}{p}}}{rN}; \end{aligned}$$
- Model VVV
$$\begin{aligned} \ddot{\boldsymbol{\Sigma }}_g = \frac{\ddot{\textbf{V}}_g}{r\displaystyle \sum _{i=1}^N \ddot{z}_{ig}}. \end{aligned}$$
Appendix B
Let \(\ddot{\textbf{W}}=\displaystyle \sum _{g=1}^G \ddot{\textbf{W}}_g\), where \(\ddot{\textbf{W}}_g = \displaystyle \sum _{i=1}^N \ddot{z}_{ig}\ddot{w}_{ig}\left( \textbf{X}_{i}-\ddot{\textbf{M}}_{g}\right) '\ddot{\boldsymbol{\Sigma }}_g^{-1}\left( \textbf{X}_{i}-\ddot{\textbf{M}}_{g}\right) \). Except for the II model, for which no parameters need to be estimated, we have the following updates:
- Model EI
$$\begin{aligned} \ddot{\boldsymbol{\Delta }} = \frac{\text {diag}\left( \ddot{\textbf{W}}\right) }{\left| \text {diag}\left( \ddot{\textbf{W}}\right) \right| ^\frac{1}{r}}; \end{aligned}$$
- Model VI
$$\begin{aligned} \ddot{\boldsymbol{\Delta }}_g = \frac{\text {diag}\left( \ddot{\textbf{W}}_g\right) }{\left| \text {diag}\left( \ddot{\textbf{W}}_g\right) \right| ^\frac{1}{r}}; \end{aligned}$$
- Model EE
$$\begin{aligned} \ddot{\boldsymbol{\Psi }} = \frac{\ddot{\textbf{W}}}{\left| \ddot{\textbf{W}} \right| ^\frac{1}{r}}; \end{aligned}$$
- Model VE
As for the EVE and VVE models, there is no analytical solution for \(\boldsymbol{\Gamma }\), and an MM algorithm of the type described for the EVE model is implemented, after replacing \(\ddot{\textbf{V}}_g\) with \(\ddot{\textbf{W}}_g\). Then, we have
$$\begin{aligned} \ddot{\boldsymbol{\Delta }}_g= \frac{\text {diag}\left( \ddot{\boldsymbol{\Gamma }}' \ddot{\textbf{W}}_g \ddot{\boldsymbol{\Gamma }}\right) }{\left| \text {diag}\left( \ddot{\boldsymbol{\Gamma }}' \ddot{\textbf{W}}_g \ddot{\boldsymbol{\Gamma }}\right) \right| ^\frac{1}{r}}; \end{aligned}$$
- Model EV
By using the same approach as for the EEV and VEV models, after replacing \(\ddot{\textbf{V}}_g\) with \(\ddot{\textbf{W}}_g\), we have
$$\begin{aligned} \ddot{\boldsymbol{\Gamma }}_g=\ddot{\textbf{L}}_g \quad \text {and} \quad \ddot{\boldsymbol{\Delta }} = \frac{\displaystyle \sum \limits _{g=1}^G \ddot{\textbf{D}}_g}{\left| \displaystyle \sum \limits _{g=1}^G \ddot{\textbf{D}}_g\right| ^\frac{1}{r}}; \end{aligned}$$
- Model VV
$$\begin{aligned} \ddot{\boldsymbol{\Psi }}_g = \frac{\ddot{\textbf{W}}_g}{\left| \ddot{\textbf{W}}_g\right| ^\frac{1}{r}}. \end{aligned}$$
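The determinant normalization used throughout Appendix B guarantees \(\left| \ddot{\boldsymbol{\Psi }}_g\right| = 1\), which resolves the identifiability issue between the row and column scale matrices. A minimal NumPy sketch for the VV update (illustrative names only):

```python
import numpy as np

def vv_psi_update(W):
    """VV update: Psi_g = W_g / |W_g|^(1/r), so that |Psi_g| = 1.

    W: (G, r, r) column-scatter matrices, assumed positive definite.
    """
    r = W.shape[1]
    dets = np.linalg.det(W) ** (1.0 / r)          # |W_g|^(1/r), one per group
    return W / dets[:, None, None]
```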
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Tomarchio, S.D., Punzo, A., Bagnato, L. (2022). On the Use of the Matrix-Variate Tail-Inflated Normal Distribution for Parsimonious Mixture Modeling. In: Salvati, N., Perna, C., Marchetti, S., Chambers, R. (eds) Studies in Theoretical and Applied Statistics . SIS 2021. Springer Proceedings in Mathematics & Statistics, vol 406. Springer, Cham. https://doi.org/10.1007/978-3-031-16609-9_24
Print ISBN: 978-3-031-16608-2
Online ISBN: 978-3-031-16609-9