Sparse representation of precision matrices used in GMMs

Branko Brkljač¹,
Marko Janev²,
Radovan Obradović¹,
Danilo Rapaić¹,
Nebojša Ralević¹ &
Vladimir Crnojević¹

Abstract

The paper presents a novel precision matrix modeling technique for Gaussian Mixture Models (GMMs), based on the concept of sparse representation. The representation coefficients of each precision matrix (inverse covariance), together with an accompanying overcomplete matrix dictionary, are learned by minimizing a functional with two components: the first is the sum of Kullback-Leibler (KL) divergences between the initial and the target GMM, and the second is a sparse regularizer of the coefficients. Compared to existing approaches for approximate GMM modeling, such as the popular subspace-based representation methods, the proposed model achieves a notably better trade-off between the representation error and the computational (memory) complexity. This is achieved under the assumption that the training data in the recognition system utilizing the GMM have an inherent sparseness property, which enables an approximate representation using only one dictionary and a significantly smaller number of coefficients. The proposed model is experimentally compared with the Subspace Precision and Mean (SPAM) model, a state-of-the-art instance of subspace-based representation models, using both data from a real Automatic Speech Recognition (ASR) system and specially designed sets of synthetic data.
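The functional described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it only evaluates an objective of this general shape without performing any learning. The following are assumptions made for illustration: each target precision matrix is a linear combination of symmetric dictionary atoms, the component means are left unchanged (so the mean term of the Gaussian KL divergence vanishes), the KL terms are summed without weights, and the regularizer is an ℓ₁ penalty with weight `rho`. The names `kl_same_mean`, `sparse_objective`, `atoms`, `coeffs` and `rho` are hypothetical.

```python
import numpy as np

def kl_same_mean(P0, P1):
    """KL( N(mu, inv(P0)) || N(mu, inv(P1)) ) for precision matrices P0, P1.

    Assumes both Gaussians share the same mean. Returns inf when P1 is not
    positive definite, since it is then not a valid precision matrix.
    """
    d = P0.shape[0]
    try:
        np.linalg.cholesky(P1)           # positive-definiteness check
    except np.linalg.LinAlgError:
        return float("inf")
    M = np.linalg.solve(P0, P1)          # inv(P0) @ P1
    _, logdet = np.linalg.slogdet(M)     # log det(P1) - log det(P0)
    return 0.5 * (np.trace(M) - d - logdet)

def sparse_objective(P_init, coeffs, atoms, rho):
    """Sum of per-component KL divergences plus an l1 penalty on the coefficients.

    P_init: (M, d, d) initial precision matrices
    coeffs: (M, K) representation coefficients (hypothetical layout)
    atoms:  (K, d, d) symmetric dictionary atoms (hypothetical layout)
    """
    # Target precision of component m: sum_k coeffs[m, k] * atoms[k]
    P_hat = np.einsum("mk,kij->mij", coeffs, atoms)
    kl_sum = sum(kl_same_mean(P0, P1) for P0, P1 in zip(P_init, P_hat))
    return kl_sum + rho * np.abs(coeffs).sum()

# Toy data: one 2x2 precision matrix and a dictionary with two atoms.
atoms = np.array([[[1.0, 0.0], [0.0, 1.0]],
                  [[0.0, 1.0], [1.0, 0.0]]])
P_init = np.array([[[2.0, 0.5], [0.5, 2.0]]])

exact = np.array([[2.0, 0.5]])    # reproduces P_init exactly, two nonzeros
sparse = np.array([[2.0, 0.0]])   # one nonzero, small representation error

obj_exact = sparse_objective(P_init, exact, atoms, rho=0.1)
obj_sparse = sparse_objective(P_init, sparse, atoms, rho=0.1)
```

In this toy case the exact representation has a zero KL term but pays the penalty for two coefficients, while the sparser representation accepts a small KL error in exchange for a smaller penalty. Which of the two has the lower objective depends on the chosen `rho`.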

Notes

¹ Usual term in image processing when working with sparse image representations, see e.g. [1, 11, 12, 32].
² $\lVert {x}\rVert _{l_{0}}$ represents the number of nonzero coefficients in vector $x \in {\mathbb {R}^{n}}$, while $\lVert {x}\rVert _{l_{1}}$ represents its convex relaxation.
³ There, a distinction was made between inner product and scalar multiplications, and the number of scalar and vector additions was also considered.
⁴ The sub-gradient ∂f(x) of a convex function $f: D\subset \mathbb {R}^{d} \rightarrow \mathbb {R}$ at a point x ∈ D is defined as the set of all $a \in \mathbb {R}^{d}$ such that $f(y)-f(x)\ge {\langle a\vert y-x \rangle }_{\mathbb {R}^{d}}$ for all y ∈ D.
⁵ Note that each symmetric matrix P can be written as $P = Q{\Lambda }Q^{T}$, where Λ is a diagonal matrix whose entries are the eigenvalues of P, and Q is an orthogonal matrix from COE.
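The quantities in the notes on the $l_0$ and $l_1$ norms and on the sub-gradient can be made concrete with a small sketch. It assumes the standard sub-gradient inequality $f(y)-f(x)\ge \langle a \vert y-x \rangle$ and uses $f(x)=\lVert x\rVert_{l_1}$, for which the elementwise sign of x (taking 0 where a coordinate is zero) is one valid sub-gradient. All function names are hypothetical.

```python
def l0(x):
    """Number of nonzero coefficients in x."""
    return sum(1 for v in x if v != 0)

def l1(x):
    """Sum of absolute values of the coefficients in x."""
    return sum(abs(v) for v in x)

def sign_subgradient(x):
    """One sub-gradient of the l1 norm at x: elementwise sign, 0 at zeros."""
    return [(v > 0) - (v < 0) for v in x]

def satisfies_subgradient_inequality(f, a, x, y, tol=1e-12):
    """Check f(y) - f(x) >= <a, y - x> for a single pair of points x, y."""
    inner = sum(ai * (yi - xi) for ai, xi, yi in zip(a, x, y))
    return f(y) - f(x) >= inner - tol

x = [1.5, 0.0, -2.0]
a = sign_subgradient(x)            # [1, 0, -1]
test_points = [[0.0, 0.0, 0.0], [3.0, -1.0, 2.0], [-1.5, 4.0, -2.0]]
checks = [satisfies_subgradient_inequality(l1, a, x, y) for y in test_points]
```

Here `l0(x)` counts two nonzero entries while `l1(x)` sums their magnitudes. The inequality holds at every test point because $\lVert y\rVert_{l_1} \ge \langle a \vert y \rangle$ whenever all entries of a lie in [-1, 1].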

References

1. Aharon M, Elad M, Bruckstein A (2006) K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation. IEEE Trans Signal Process 54(11):4311–4322
2. Axelrod S, Gopinath R, Olsen P (2002) Modeling with a subspace constraint on inverse covariance matrices. In: Proceedings of the ISCA international conference on spoken language processing, pp 2177–2180
3. Axelrod S, Goel V, Gopinath RA, Olsen PA, Visweswariah K (2005) Subspace constrained Gaussian mixture models for speech recognition. IEEE Trans Speech Audio Process 13(6):1144–1160
4. Bertolami R, Bunke H (2008) Hidden Markov model-based ensemble methods for offline handwritten text line recognition. Pattern Recog 41(11):3452–3460
5. Boyd SP, Vandenberghe L (2004) Convex optimization. Cambridge University Press
6. Burget L, Schwarz P, Agarwal M, Akyazi P, Kai F, Glembek O, Goel N, Karafiát M, Povey D, Rastrow A, Rose RC, Thomas S (2010) Multilingual acoustic modeling for speech recognition based on subspace Gaussian mixture models. In: Proceedings of the IEEE international conference on acoustics speech and signal processing, pp 4334–4337
7. Cai R, Hao Z, Wen W, Wang L (2013) Regularized Gaussian mixture model based discretization for gene expression data association mining. Appl Intell 39(3):607–613
8. Chen J, Zhang B, Cao H, Prasad R, Natarajan P (2012a) Applying discriminatively optimized feature transform for HMM-based off-line handwriting recognition. In: Proceedings of the IEEE international conference on frontiers in handwriting recognition, pp 219–224
9. Chen L, Mao X, Wei P, Xue Y, Ishizuka M (2012b) Mandarin emotion recognition combining acoustic and emotional point information. Appl Intell 37(4):602–612
10. Dharanipragada S, Visweswariah K (2006) Gaussian mixture models with covariances or precisions in shared multiple subspaces. IEEE Trans Speech Audio Process 14(4):1255–1266
11. Elad M (2010) Sparse and redundant representations: from theory to applications in signal and image processing. Springer Verlag
12. Elad M, Figueiredo MAT, Ma Y (2010) On the role of sparse and redundant representations in image processing. Proc IEEE 98(6):972–982
13. Gales MJF (1999) Semi-tied covariance matrices for hidden Markov models. IEEE Trans Speech Audio Process 7(3):272–281
14. Gopinath RA (1998) Maximum likelihood modeling with Gaussian distributions for classification. In: Proceedings of the IEEE international conference on acoustics, speech and signal processing, vol 2, pp 661–664
15. Günter S, Bunke H (2004) HMM-based handwritten word recognition: on the optimization of the number of states, training iterations and Gaussian components. Pattern Recog 37(10):2069–2079
16. Hershey JR, Olsen PA (2007) Approximating the Kullback Leibler divergence between Gaussian mixture models. In: Proceedings of the IEEE international conference on acoustics, speech and signal processing, vol 4, pp 317–320
17. Horn RA, Johnson CR (1990) Matrix analysis. Cambridge University Press
18. Hörster E, Lienhart R, Slaney M (2008) Continuous visual vocabulary models for pLSA-based scene recognition. In: Proceedings of the ACM international conference on content-based image and video retrieval, pp 319–328
19. Inoue N, Shinoda K (2012) A fast and accurate video semantic-indexing system using fast MAP adaptation and GMM supervectors. IEEE Trans Multimedia 14(4):1196–1205
20. Janev M, Pekar D, Jakovljević N, Delić V (2010) Eigenvalues driven Gaussian selection in continuous speech recognition using HMMs with full covariance matrices. Appl Intell 33(2):107–116
21. Kannan A, Ostendorf M, Rohlicek JR (1994) Maximum likelihood clustering of Gaussian mixtures for speech recognition. IEEE Trans Speech Audio Process 2(3):453–455
22. Liwicki M, Bunke H (2009) Combining diverse on-line and off-line systems for handwritten text line recognition. Pattern Recog 42(12):3254–3263
23. Mezzadri F (2007) How to generate random matrices from the classical compact groups. AMS Not 54(5):592–604
24. Nocedal J, Wright SJ (1999) Numerical optimization. Springer Verlag
25. Olsen PA, Gopinath RA (2004) Modeling inverse covariance matrices by basis expansion. IEEE Trans Speech Audio Process 12(1):37–46
26. Perkins S, Theiler J (2003) Online feature selection using Grafting. In: Proceedings of the IMLS international conference on machine learning, vol 20, pp 592–599
27. Perkins S, Lacker K, Theiler J (2003) Grafting: fast, incremental feature selection by gradient descent in function space. J Mach Learn Res 3:1333–1356
28. Popović B, Janev M, Pekar D, Jakovljević N, Gnjatović M, Sečujski M, Delić V (2012) A novel split-and-merge algorithm for hierarchical clustering of Gaussian mixture models. Appl Intell 37(3):377–389
29. Povey D (2009) A tutorial-style introduction to subspace Gaussian mixture models for speech recognition. Tech. Rep. MSR-TR-2009-111. Microsoft Research, Redmond, WA
30. Povey D, Burget L, Agarwal M, Akyazi P, Feng K, Ghoshal A, Glembek O, Goel NK, Karafiát M, Rastrow A, Rose RC, Schwarz P, Thomas S (2010) Subspace Gaussian mixture models for speech recognition. In: Proceedings of the IEEE international conference on acoustics speech and signal processing, pp 4330–4333
31. Povey D, Burget L, Agarwal M, Akyazi P, Kai F, Ghoshal A, Glembek O, Goel N, Karafiát M, Rastrow A, Rose RC, Schwarz P, Thomas S (2011) The subspace Gaussian mixture model: a structured model for speech recognition. Comput Speech Lang 25(2):404–439
32. Rubinstein R, Bruckstein AM, Elad M (2010) Dictionaries for sparse representation modeling. Proc IEEE 98(6):1045–1057
33. Schmidt M, Fung G, Rosales R (2009) Optimization methods for ℓ₁-regularization. Tech. Rep. TR-2009-19, University of British Columbia
34. Spall JC (2003) Introduction to stochastic search and optimization: estimation, simulation and control. Wiley
35. Trefethen LN, Bau D (1997) Numerical linear algebra, vol 50. SIAM
36. Vanhoucke V, Sankar A (2004) Mixtures of inverse covariances. IEEE Trans Speech Audio Process 12(3):250–264
37. Wang Y, Huo Q (2009) Modeling inverse covariance matrices by expansion of tied basis matrices for online handwritten Chinese character recognition. Pattern Recog 42(12):3296–3302
38. Webb AR (2002) Statistical pattern recognition. Wiley
39. Wright SJ, Nowak RD, Figueiredo MAT (2009) Sparse reconstruction by separable approximation. IEEE Trans Signal Process 57(7):2479–2493

Acknowledgements

This research has been supported by the Ministry of Science and Technology of the Republic of Serbia, as part of projects III44003, III43002 and TR32035.

Author information

Authors and Affiliations

Faculty of Technical Sciences (FTN), University of Novi Sad, Trg Dositeja Obradovića 6, 21000, Novi Sad, Republic of Serbia
Branko Brkljač, Radovan Obradović, Danilo Rapaić, Nebojša Ralević & Vladimir Crnojević
Mathematical Institute of the Serbian Academy of Sciences and Arts, Kneza Mihaila 36, 11001, Belgrade, Republic of Serbia
Marko Janev

Corresponding author

Correspondence to Branko Brkljač.

About this article

Cite this article

Brkljač, B., Janev, M., Obradović, R. et al. Sparse representation of precision matrices used in GMMs. Appl Intell 41, 956–973 (2014). https://doi.org/10.1007/s10489-014-0581-6

Published: 26 August 2014
Issue Date: October 2014
DOI: https://doi.org/10.1007/s10489-014-0581-6
