
On Spectral Learning

Published: 01 March 2010

Abstract

In this paper, we study the problem of learning a matrix W from a set of linear measurements. Our formulation consists of solving an optimization problem which involves regularization with a spectral penalty term, that is, a penalty term which is a function of the spectrum of the covariance of W. Instances of this problem in machine learning include multi-task learning, collaborative filtering and multi-view learning, among others. Our goal is to elucidate the form of the optimal solution of spectral learning. The theory of spectral learning relies on the von Neumann characterization of orthogonally invariant norms and their association with symmetric gauge functions. Using this tool we formulate a representer theorem for spectral regularization and specialize it to several useful examples, such as Schatten p-norms, trace norm and spectral norm, which should prove useful in applications.
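The spectral penalties named in the abstract are all functions of a matrix's singular values. As an illustrative sketch (not code from the paper; the function name `schatten_norm` is our own), the Schatten p-norm family can be computed via the SVD, with the trace norm, Frobenius norm, and spectral norm recovered as the cases p = 1, p = 2, and p = ∞:

```python
import numpy as np

def schatten_norm(W, p):
    """Schatten p-norm of W: the l_p norm of its singular values.

    p=1 gives the trace (nuclear) norm, p=2 the Frobenius norm,
    and p=np.inf the spectral (operator) norm.
    """
    s = np.linalg.svd(W, compute_uv=False)
    if np.isinf(p):
        return s.max()
    return (s ** p).sum() ** (1.0 / p)

# A rank-1 matrix has a single nonzero singular value, so every
# Schatten p-norm takes the same value on it.
W = np.outer([1.0, 2.0], [3.0, 4.0])
print(schatten_norm(W, 1))       # trace norm
print(schatten_norm(W, 2))       # Frobenius norm
print(schatten_norm(W, np.inf))  # spectral norm
```

Regularizing with the trace norm (p = 1) penalizes the singular values in an l1 fashion and thus encourages low-rank solutions, which is what links this penalty family to the multi-task learning and collaborative filtering applications mentioned above.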




Published In

JMLR Volume 11
Publisher: JMLR.org



Cited By

  • (2019) Learning with optimal interpolation norms. Numerical Algorithms 81(2):695-717. DOI: 10.1007/s11075-018-0568-1
  • (2018) A comparative study of pairwise learning methods based on kernel ridge regression. Neural Computation 30(8):2245-2283. DOI: 10.1162/neco_a_01096
  • (2015) Calibrated multivariate regression with application to neural semantic basis discovery. The Journal of Machine Learning Research 16(1):1579-1606. DOI: 10.5555/2789272.2886800
  • (2014) A unifying view of representer theorems. In Proceedings of the 31st International Conference on Machine Learning, Volume 32, pages II-748-II-756. DOI: 10.5555/3044805.3044976
  • (2014) Asymmetric and Category Invariant Feature Transformations for Domain Adaptation. International Journal of Computer Vision 109(1-2):28-41. DOI: 10.1007/s11263-014-0719-3
  • (2014) Harnessing Lab Knowledge for Real-World Action Recognition. International Journal of Computer Vision 109(1-2):60-73. DOI: 10.1007/s11263-014-0717-5
  • (2013) Vector-valued reproducing kernel Banach spaces with applications to multi-task learning. Journal of Complexity 29(2):195-215. DOI: 10.1016/j.jco.2012.09.002
  • (2013) Regularizers for structured sparsity. Advances in Computational Mathematics 38(3):455-489. DOI: 10.1007/s10444-011-9245-9
  • (2013) Finite rank kernels for multi-task learning. Advances in Computational Mathematics 38(2):427-439. DOI: 10.1007/s10444-011-9244-x
  • (2010) Inductive regularized learning of kernel functions. In Proceedings of the 23rd International Conference on Neural Information Processing Systems, Volume 1, pages 946-954. DOI: 10.5555/2997189.2997295
