research-article

Active learning for online Bayesian matrix factorization

Authors:

Lawrence Carin

KDD '12: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining

Pages 325 - 333

https://doi.org/10.1145/2339530.2339584

Published: 12 August 2012

Abstract

The problem of large-scale online matrix completion is addressed via a Bayesian approach. The proposed method learns a factor analysis (FA) model for large matrices, based on a small number of observed matrix elements, and leverages the statistical model to actively select which new matrix entries/observations would be most informative if they could be acquired, to improve the model; the model inference and active learning are performed in an online setting. In the context of online learning, a greedy, fast and provably near-optimal algorithm is employed to sequentially maximize the mutual information between past and future observations, taking advantage of submodularity properties. Additionally, a simpler procedure, which directly uses the posterior parameters learned by the Bayesian approach, is shown to achieve slightly lower estimation quality, with far less computational effort. Inference is performed using a computationally efficient online variational Bayes (VB) procedure. Competitive results are obtained in a very large collaborative filtering problem, namely the Yahoo! Music ratings dataset.
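The contrast between greedy information-driven selection and a cheaper posterior-based ranking can be illustrated with a small sketch. This is not the paper's objective (mutual information between past and future observations); it is a simplified stand-in that assumes a linear-Gaussian model in which one row's latent factor vector u has posterior covariance S, each candidate entry j has a known column factor v_j, and the noise variance is fixed. All function names, the toy factors, and the noise value are hypothetical.

```python
import math

def quad_form(v, S):
    """Return v^T S v for a vector v and a symmetric matrix S (nested lists)."""
    k = len(v)
    return sum(v[i] * S[i][j] * v[j] for i in range(k) for j in range(k))

def info_gain(v, S, noise_var):
    """Mutual information (nats) between u ~ N(m, S) and one noisy
    observation r = v^T u + eps, with eps ~ N(0, noise_var)."""
    return 0.5 * math.log(1.0 + quad_form(v, S) / noise_var)

def condition_on(v, S, noise_var):
    """Posterior covariance of u after observing r = v^T u + eps (rank-one update)."""
    k = len(v)
    Sv = [sum(S[i][j] * v[j] for j in range(k)) for i in range(k)]
    denom = noise_var + sum(v[i] * Sv[i] for i in range(k))
    return [[S[i][j] - Sv[i] * Sv[j] / denom for j in range(k)] for i in range(k)]

def greedy_select(candidates, S, noise_var, budget):
    """Pick up to `budget` candidate indices, each maximizing the marginal gain."""
    S = [row[:] for row in S]  # work on a copy of the covariance
    remaining = list(range(len(candidates)))
    chosen, gains = [], []
    while remaining and len(chosen) < budget:
        best = max(remaining, key=lambda j: info_gain(candidates[j], S, noise_var))
        gains.append(info_gain(candidates[best], S, noise_var))
        S = condition_on(candidates[best], S, noise_var)  # account for the new pick
        chosen.append(best)
        remaining.remove(best)
    return chosen, gains

def variance_ranking(candidates, S, budget):
    """Cheaper baseline: rank once by predictive variance v^T S v, no re-conditioning."""
    order = sorted(range(len(candidates)), key=lambda j: -quad_form(candidates[j], S))
    return order[:budget]

# Toy inputs (hypothetical): three candidate entries in a 2-factor model.
candidates = [[1.0, 0.0], [1.0, 0.1], [0.0, 0.9]]
prior_cov = [[1.0, 0.0], [0.0, 1.0]]

chosen, gains = greedy_select(candidates, prior_cov, noise_var=0.1, budget=2)
one_shot = variance_ranking(candidates, prior_cov, budget=2)
print(chosen, one_shot)
```

On these toy inputs the greedy rule selects candidates 1 and 2, because conditioning on candidate 1 makes the nearly collinear candidate 0 almost uninformative, whereas the one-shot ranking selects candidates 1 and 0. In this simplified setting the information gain is monotone and submodular, so the classic greedy guarantee of at least a (1 - 1/e) fraction of the optimal value applies; whether the same constant holds for the paper's own criterion is not stated in the abstract.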

Index Terms

Active learning for online Bayesian matrix factorization
1. Information systems
  1. Information systems applications
    1. Data mining

Published In

KDD '12: Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining

August 2012

1616 pages

ISBN:9781450314626

DOI:10.1145/2339530

General Chair: Qiang Yang (Hong Kong University of Science and Technology)

Program Chairs: Deepak Agarwal (LinkedIn), Jian Pei (Simon Fraser University)

Copyright © 2012 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

KDD '12: The 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

August 12 - 16, 2012

Beijing, China

Acceptance Rates

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%

Bibliometrics

Total Citations: 30
Total Downloads: 672
Downloads (last 12 months): 24
Downloads (last 6 weeks): 0

Reflects downloads up to 03 Mar 2025
