Sparse regression mixture modeling with the multi-kernel relevance vector machine

Konstantinos Blekas¹ & Aristidis Likas¹

Abstract

A regression mixture model is proposed in which each mixture component is a multi-kernel version of the Relevance Vector Machine (RVM). The mixture model exploits the enhanced modeling capability of RVMs that stems from their embedded sparsity-enforcing properties. To deal with the problem of selecting kernel parameters, a weighted multi-kernel scheme is employed, where the weights are estimated during training. The mixture model is trained using the maximum a posteriori approach: the Expectation Maximization (EM) algorithm is applied, which offers closed-form update equations for the model parameters. Moreover, an incremental learning methodology is presented that tackles the parameter initialization problem of the EM algorithm, along with a BIC-based model selection methodology to estimate the proper number of mixture components. We provide comparative experimental results on various artificial and real benchmark datasets that empirically illustrate the efficiency of the proposed mixture model.
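
To make the ingredients named in the abstract concrete, the following is a minimal Python sketch of an EM-trained regression mixture over a weighted multi-kernel design matrix, with a BIC score for comparing numbers of components. It is an illustration under stated assumptions, not the authors' algorithm: the kernel weights `u` are fixed here rather than estimated during training, a single ridge penalty stands in for the RVM's sparsity-inducing prior, initialization is random rather than incremental, and BIC uses the common form -2 log L + P log N with a plain parameter count. All function and parameter names (`rbf_kernel`, `composite_design`, `fit_mixture`, `ridge`) are hypothetical.

```python
import numpy as np

def rbf_kernel(x, centers, width):
    """Gaussian (RBF) kernel matrix between 1-D inputs and kernel centers."""
    d2 = (x[:, None] - centers[None, :]) ** 2
    return np.exp(-d2 / (2.0 * width ** 2))

def composite_design(x, centers, widths, u):
    """Weighted sum of kernel matrices, one per candidate width.
    Assumption: the weights u are given; the paper estimates them in training."""
    return sum(u_s * rbf_kernel(x, centers, w) for u_s, w in zip(u, widths))

def fit_mixture(Y, Phi, K, ridge=1e-3, n_iter=100, seed=0):
    """EM for a K-component regression mixture over N curves of length T.
    Assumption: a fixed ridge penalty replaces the RVM's learned sparse prior."""
    rng = np.random.default_rng(seed)
    N, T = Y.shape
    M = Phi.shape[1]
    R = rng.dirichlet(np.ones(K), size=N)        # random soft assignments
    for _ in range(n_iter):
        # M-step: closed-form updates for mixing weights, regression
        # coefficients and noise variance of each component
        Nk = R.sum(axis=0) + 1e-12               # guard against empty components
        pi = Nk / Nk.sum()
        W = np.empty((K, M))
        s2 = np.empty(K)
        for k in range(K):
            A = Nk[k] * (Phi.T @ Phi) + ridge * np.eye(M)
            W[k] = np.linalg.solve(A, Phi.T @ (R[:, k] @ Y))
            sq = ((Y - Phi @ W[k]) ** 2).sum(axis=1)
            s2[k] = max((R[:, k] @ sq) / (Nk[k] * T), 1e-8)
        # E-step: posterior responsibilities, normalized in log space
        sq = ((Y[:, None, :] - (W @ Phi.T)[None, :, :]) ** 2).sum(axis=2)
        logp = np.log(pi) - 0.5 * T * np.log(2 * np.pi * s2) - 0.5 * sq / s2
        m = logp.max(axis=1, keepdims=True)
        log_norm = m + np.log(np.exp(logp - m).sum(axis=1, keepdims=True))
        R = np.exp(logp - log_norm)
    loglik = float(log_norm.sum())
    n_params = K * (M + 1) + (K - 1)             # assumption: plain count
    bic = -2.0 * loglik + n_params * np.log(N)   # lower is better in this form
    return R, W, s2, pi, bic
```

In this sketch, one would call `fit_mixture` for several values of `K` on the same `Y` and `Phi` and keep the value with the lowest `bic`; the paper's incremental scheme instead grows the mixture one component at a time.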

References

Alon J, Sclaroff S, Kollios G, Pavlovic V (2003) Discovering clusters in motion time-series data. In: Proceedings of IEEE conference on computer vision and pattern recognition (CVPR), pp 375–381
Antonini G, Thiran J (2006) Counting pedestrians in video sequences using trajectory clustering. IEEE Trans Circuits Syst Video Technol 16(8):1008–1020
Bishop C (2006) Pattern recognition and machine learning. Springer, Berlin
Blekas K, Likas A (2012) The mixture of multi-kernel relevance vector machines model. In: International Conference on Data Mining (ICDM), pp 111–120
Blekas K, Nikou C, Galatsanos N, Tsekos NV (2008) A regression mixture model with spatial constraints for clustering spatiotemporal data. Int J Artif Intell Tools 17(5):1023–1041
Chudova D, Gaffney S, Mjolsness E, Smyth P (2003) Mixture models for translation-invariant clustering of sets of multi-dimensional curves. In: Proceedings of the Ninth ACM SIGKDD international conference on knowledge discovery and data mining. ACM, Washington, pp 79–88
Dempster A, Laird N, Rubin D (1977) Maximum likelihood from incomplete data via the EM algorithm. J Royal Stat Soc B 39:1–38
DeSarbo W, Cron W (1988) A maximum likelihood methodology for clusterwise linear regression. J Classif 5(1):249–282
Ding H, Trajcevski G, Scheuermann P, Wang X, Keogh E (2008) Querying and mining of time series data: experimental comparison of representations and distance measures. In: PVLDB, pp 1542–1552
Fraley C, Raftery AE (1998) Bayesian regularization for normal mixture estimation and model-based clustering. Comput J 41:578–588
Gaffney S, Smyth P (2003) Curve clustering with random effects regression mixtures. In: Bishop CM, Frey BJ (eds) Proceedings of the ninth international workshop on artificial intelligence and statistics
Girolami M, Rogers S (2005) Hierarchic Bayesian models for kernel learning. In: International conference on machine learning (ICML’05), pp 241–248
Gonen M, Alpaydin E (2011) Multiple kernel learning algorithms. J Mach Learn Res 12:2211–2268
Gunn S, Kandola J (2002) Structural modelling with sparse kernels. Mach Learn 48:137–163
Harrell F (2001) Regression modeling strategies. With applications to linear models, logistic regression and survival analysis. Springer, New York
Hu M, Chen Y, Kwok J (2009) Building sparse multiple-kernel SVM classifiers. IEEE Trans Neural Netw 20(5):827–839
Keogh E, Lin J, Truppel W (2005) Clustering of time series subsequences is meaningless: implications for past and future research. Knowl Inf Syst KAIS 2:154–177
Keogh E, Xi X, Wei L, Ratanamahatana C (2006) The UCR time series classification/clustering homepage: www.cs.ucr.edu/~eamonn/timeseriesdata/
Li J, Barron A (2000) Mixture density estimation. In: Advances in neural information processing systems, Vol 12. The MIT Press, Cambridge, pp 279–285
Liao T (2005) Clustering of time series data: a survey. Patt Recognit 38:1857–1874
McLachlan G, Peel D (2000) Finite mixture models. Wiley, New York
Nocedal J, Wright SJ (1999) Numerical optimization. Springer, New York
Pelekis N, Kopanakis I, Kotsifakos E, Frentzos E, Theodoridis Y (2011) Clustering uncertain trajectories. Knowl Inf Syst KAIS 28:117–147
Rakthanmanon T, Campana B et al (2013) Addressing big data time series: mining trillions of time series subsequences under dynamic time warping. ACM Trans Knowl Discov Data 7(3):1–31
Schmolck A, Everson R (2007) Smooth relevance vector machine: a smoothness prior extension of the RVM. Mach Learn 68(2):107–135
Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6:461–464
Seeger M (2008) Bayesian inference and optimal design for the sparse linear model. J Mach Learn Res 9:759–813
Shi J, Wang B (2008) Curve prediction and clustering with mixtures of Gaussian process functional regression models. Stat Comput 18:267–283
Smyth P (1997) Clustering sequences with hidden Markov models. In: Advances in neural information processing systems, pp 648–654
Tipping M (2001) Sparse Bayesian learning and the relevance vector machine. J Mach Learn Res 1:211–244
Ueda N, Nakano R, Ghahramani Z, Hinton G (2000) SMEM algorithm for mixture models. Neural Comput 12(9):2109–2128
Vlassis N, Likas A (2001) A greedy EM algorithm for Gaussian mixture learning. Neural Process Lett 15:77–87
Wasserman L (2000) Bayesian model selection and model averaging. J Math Psychol 44(1):92–107
Williams B, Toussaint M, Storkey A (2008) Modelling motion primitives and their timing in biologically executed movements. In: Advances in neural information processing systems, vol 15, pp 1547–1554
Williams O, Blake A, Cipolla R (2005) Sparse Bayesian learning for efficient visual tracking. IEEE Trans Pattern Anal Mach Intell 27(8):1292–1304
Xiong Y, Yeung D-Y (2002) Mixtures of ARMA models for model-based time series clustering. In: IEEE international conference on data mining (ICDM), pp 717–720
Zhong M (2006) A variational method for learning sparse Bayesian regression. Neurocomputing 69:2351–2355

Acknowledgments

This paper substantially improves and extends our previous work presented in [4]. This manuscript is dedicated to the memory of our friend and colleague Professor Nikolaos P. Galatsanos who contributed significantly to the research and preparation of this work.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, University of Ioannina, P.O. Box 1186, 45110 Ioannina, Greece
Konstantinos Blekas & Aristidis Likas

Corresponding author

Correspondence to Konstantinos Blekas.

About this article

Cite this article

Blekas, K., Likas, A. Sparse regression mixture modeling with the multi-kernel relevance vector machine. Knowl Inf Syst 39, 241–264 (2014). https://doi.org/10.1007/s10115-013-0704-0

Received: 01 February 2013
Revised: 01 August 2013
Accepted: 15 October 2013
Published: 30 October 2013
Issue Date: May 2014
DOI: https://doi.org/10.1007/s10115-013-0704-0