

Sparse regression mixture modeling with the multi-kernel relevance vector machine

  • Regular Paper
  • Published in: Knowledge and Information Systems

Abstract

A regression mixture model is proposed where each mixture component is a multi-kernel version of the Relevance Vector Machine (RVM). The model exploits the enhanced modeling capability of RVMs due to their embedded sparsity-enforcing properties. To deal with the problem of selecting kernel parameters, a weighted multi-kernel scheme is employed, where the weights are estimated during training. The mixture model is trained using the maximum a posteriori (MAP) approach, where the Expectation-Maximization (EM) algorithm yields closed-form update equations for the model parameters. Moreover, an incremental learning methodology is presented that tackles the parameter initialization problem of the EM algorithm, along with a BIC-based model selection methodology to estimate the proper number of mixture components. We provide comparative experimental results on various artificial and real benchmark datasets that empirically illustrate the efficiency of the proposed mixture model.
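To make the weighted multi-kernel idea concrete, the sketch below (a simplification, not the authors' algorithm) builds a composite design matrix from several RBF kernels combined with fixed illustrative weights, then fits it with ridge regression as a crude stand-in for the RVM's sparse Bayesian estimation. In the actual model the kernel weights, the per-coefficient sparse priors, and the mixture responsibilities are all estimated jointly by EM; every name and numeric value here is an assumption for illustration only.

```python
import numpy as np

def rbf_kernel(x, centers, gamma):
    # Gaussian (RBF) basis: exp(-gamma * (x - c)^2) for each center c
    return np.exp(-gamma * (x[:, None] - centers[None, :]) ** 2)

def multi_kernel_design(x, centers, gammas, weights):
    # Weighted combination of base kernels with different widths;
    # in the paper these weights are estimated during training,
    # here they are fixed for illustration.
    K = sum(w * rbf_kernel(x, centers, g) for w, g in zip(weights, gammas))
    return np.hstack([np.ones((len(x), 1)), K])  # prepend a bias column

# Toy 1-D regression data
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(x.size)

# Composite design matrix from three RBF widths (illustrative values)
Phi = multi_kernel_design(x, x, gammas=(1.0, 10.0, 100.0),
                          weights=(0.2, 0.3, 0.5))

# Ridge fit as a stand-in for the RVM's sparse Bayesian weight posterior
lam = 1e-3
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(Phi.shape[1]), Phi.T @ y)
mse = float(np.mean((Phi @ w - y) ** 2))
print(Phi.shape, round(mse, 4))
```

In the real RVM, per-coefficient precision hyperparameters drive most coefficients to zero, which is the embedded sparsity the abstract refers to; the ridge penalty above shrinks coefficients but does not sparsify them.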


References

  1. Alon J, Sclaroff S, Kollios G, Pavlovic V (2003) Discovering clusters in motion time-series data. In: Proceedings of IEEE conference on computer vision and pattern recognition (CVPR), pp 375–381

  2. Antonini G, Thiran J (2006) Counting pedestrians in video sequences using trajectory clustering. IEEE Trans Circuits Syst Video Technol 16(8):1008–1020

  3. Bishop C (2006) Pattern recognition and machine learning. Springer, Berlin

  4. Blekas K, Likas A (2012) The mixture of multi-kernel relevance vector machines model. In: International Conference on Data Mining (ICDM), pp 111–120

  5. Blekas K, Nikou C, Galatsanos N, Tsekos NV (2008) A regression mixture model with spatial constraints for clustering spatiotemporal data. Int J Artif Intell Tools 17(5):1023–1041

  6. Chudova D, Gaffney S, Mjolsness E, Smyth P (2003) Mixture models for translation-invariant clustering of sets of multi-dimensional curves. In: Proceedings of the Ninth ACM SIGKDD international conference on knowledge discovery and data mining. ACM, Washington, pp 79–88

  7. Dempster A, Laird N, Rubin D (1977) Maximum likelihood from incomplete data via the EM algorithm. J Royal Stat Soc B 39:1–38

  8. DeSarbo W, Cron W (1988) A maximum likelihood methodology for clusterwise linear regression. J Classif 5(1):249–282

  9. Ding H, Trajcevski G, Scheuermann P, Wang X, Keogh E (2008) Querying and mining of time series data: experimental comparison of representations and distance measures. In: PVLDB, pp 1542–1552

  10. Fraley C, Raftery AE (1998) Bayesian regularization for normal mixture estimation and model-based clustering. Comput J 41:578–588

  11. Gaffney S, Smyth P (2003) Curve clustering with random effects regression mixtures. In: Bishop CM, Frey BJ (eds) Proceedings of the ninth international workshop on artificial intelligence and statistics

  12. Girolami M, Rogers S (2005) Hierarchic Bayesian models for kernel learning. In: International conference on machine learning (ICML’05), pp 241–248

  13. Gonen M, Alpaydin E (2011) Multiple kernel learning algorithms. J Mach Learn Res 12:2211–2268

  14. Gunn S, Kandola J (2002) Structural modelling with sparse kernels. Mach Learn 48:137–163

  15. Harrell F (2001) Regression modeling strategies. With applications to linear models, logistic regression and survival analysis. Springer, New York

  16. Hu M, Chen Y, Kwok J (2009) Building sparse multiple-kernel SVM classifiers. IEEE Trans Neural Netw 20(5):827–839

  17. Keogh E, Lin J, Truppel W (2005) Clustering of time series subsequences is meaningless: implications for past and future research. Knowl Inf Syst 2:154–177

  18. Keogh E, Xi X, Wei L, Ratanamahatana C (2006) The UCR time series classification/clustering homepage. www.cs.ucr.edu/~eamonn/timeseriesdata/

  19. Li J, Barron A (2000) Mixture density estimation. In: Advances in neural information processing systems, Vol 12. The MIT Press, Cambridge, pp 279–285

  20. Liao T (2005) Clustering of time series data: a survey. Pattern Recognit 38:1857–1874

  21. McLachlan G, Peel D (2000) Finite mixture models. Wiley, New York

  22. Nocedal J, Wright SJ (1999) Numerical optimization. Springer, New York

  23. Pelekis N, Kopanakis I, Kotsifakos E, Frentzos E, Theodoridis Y (2011) Clustering uncertain trajectories. Knowl Inf Syst 28:117–147

  24. Rakthanmanon T, Campana B et al (2013) Addressing big data time series: mining trillions of time series subsequences under dynamic time warping. ACM Trans Knowl Discov Data 7(3):1–31

  25. Schmolck A, Everson R (2007) Smooth relevance vector machine: a smoothness prior extension of the RVM. Mach Learn 68(2):107–135

  26. Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6:461–464

  27. Seeger M (2008) Bayesian inference and optimal design for the sparse linear model. J Mach Learn Res 9:759–813

  28. Shi J, Wang B (2008) Curve prediction and clustering with mixtures of Gaussian process functional regression models. Stat Comput 18:267–283

  29. Smyth P (1997) Clustering sequences with hidden Markov models. In: Advances in neural information processing systems, pp 648–654

  30. Tipping M (2001) Sparse Bayesian learning and the relevance vector machine. J Mach Learn Res 1:211–244

  31. Ueda N, Nakano R, Ghahramani Z, Hinton G (2000) SMEM algorithm for mixture models. Neural Comput 12(9):2109–2128

  32. Vlassis N, Likas A (2001) A greedy EM algorithm for Gaussian mixture learning. Neural Process Lett 15:77–87

  33. Wasserman L (2000) Bayesian model selection and model averaging. J Math Psychol 44(1):92–107

  34. Williams B, Toussaint M, Storkey A (2008) Modelling motion primitives and their timing in biologically executed movements. In: Advances in neural information processing systems, vol 15, pp 1547–1554

  35. Williams O, Blake A, Cipolla R (2005) Sparse Bayesian learning for efficient visual tracking. IEEE Trans Pattern Anal Mach Intell 27(8):1292–1304

  36. Xiong Y, Yeung D-Y (2002) Mixtures of ARMA models for model-based time series clustering. In: IEEE international conference on data mining (ICDM), pp 717–720

  37. Zhong M (2006) A variational method for learning sparse Bayesian regression. Neurocomputing 69:2351–2355


Acknowledgments

This paper substantially improves and extends our previous work presented in [4]. This manuscript is dedicated to the memory of our friend and colleague Professor Nikolaos P. Galatsanos who contributed significantly to the research and preparation of this work.

Author information

Corresponding author

Correspondence to Konstantinos Blekas.


About this article

Cite this article

Blekas, K., Likas, A. Sparse regression mixture modeling with the multi-kernel relevance vector machine. Knowl Inf Syst 39, 241–264 (2014). https://doi.org/10.1007/s10115-013-0704-0
