research-article

Sparse online topic models

Authors:

Bo Zhang

WWW '13: Proceedings of the 22nd international conference on World Wide Web

Pages 1489 - 1500

https://doi.org/10.1145/2488388.2488518

Published: 13 May 2013

Abstract

Topic models have shown great promise in discovering latent semantic structures in complex data corpora, ranging from text documents and web news articles to images, videos, and even biological data. To handle massive data collections and dynamic text streams, probabilistic online topic models such as online latent Dirichlet allocation (OLDA) have recently been developed. However, because of its normalization constraints, OLDA can be ineffective at controlling the sparsity of the discovered representations, a desirable property for learning interpretable semantic patterns, especially when the total number of topics is large. In contrast, sparse topical coding (STC) has been introduced as a non-probabilistic topic model that effectively discovers sparse latent patterns by using sparsity-inducing regularization. Unfortunately, STC cannot scale to very large datasets or deal with online text streams, partly because of its batch learning procedure. In this paper, we present a sparse online topic model, which directly controls the sparsity of latent semantic patterns by imposing sparsity-inducing regularization and learns the topical dictionary with an online algorithm. The online algorithm is efficient and guaranteed to converge. Extensive empirical results for the sparse online topic model, as well as its collapsed and supervised extensions, on a large-scale Wikipedia dataset and the medium-sized 20Newsgroups dataset demonstrate appealing performance.
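The two ingredients named in the abstract, sparsity-inducing regularization on the latent codes and an online update of the topical dictionary, can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the loss or the update rules, so the sketch assumes a squared reconstruction loss, an L1 penalty handled by a non-negative soft-threshold step, and a projected stochastic gradient update that keeps each topic on the probability simplex. All function names, step sizes, and the synthetic document stream are hypothetical.

```python
import random

def soft_threshold_nonneg(x, lam):
    """Proximal step for lam * |x| with x >= 0: shrink by lam, clip at zero."""
    return [max(v - lam, 0.0) for v in x]

def project_simplex(v):
    """Euclidean projection onto the probability simplex (sort-based method)."""
    u = sorted(v, reverse=True)
    cumsum, theta = 0.0, 0.0
    for j, uj in enumerate(u, start=1):
        cumsum += uj
        t = (cumsum - 1.0) / j
        if uj - t > 0:  # holds for j = 1..rho, so the last hit gives theta
            theta = t
    return [max(x - theta, 0.0) for x in v]

def reconstruct(s, beta):
    """Reconstruction s @ beta for a code s (length K) and dictionary beta (K x V)."""
    return [sum(s[k] * beta[k][n] for k in range(len(beta)))
            for n in range(len(beta[0]))]

def encode(doc, beta, lam, steps=50, lr=0.1):
    """Proximal gradient on 0.5 * ||doc - s @ beta||^2 + lam * ||s||_1, s >= 0.

    Assumes a squared loss and a fixed step size (illustrative choices only).
    """
    K, V = len(beta), len(doc)
    s = [0.0] * K
    for _ in range(steps):
        recon = reconstruct(s, beta)
        resid = [recon[n] - doc[n] for n in range(V)]
        grad = [sum(resid[n] * beta[k][n] for n in range(V)) for k in range(K)]
        s = soft_threshold_nonneg([s[k] - lr * grad[k] for k in range(K)], lr * lam)
    return s

def online_step(doc, beta, lam, rho):
    """Process one document: infer its sparse code, then update the dictionary in place."""
    s = encode(doc, beta, lam)
    recon = reconstruct(s, beta)
    resid = [recon[n] - doc[n] for n in range(len(doc))]
    for k in range(len(beta)):
        if s[k] > 0:  # topics with a zero code receive no gradient
            row = [beta[k][n] - rho * s[k] * resid[n] for n in range(len(doc))]
            beta[k] = project_simplex(row)
    return s

def run_stream(num_docs=200, K=3, V=6, lam=0.05, seed=0):
    """Feed a synthetic stream of word-count vectors through the online update."""
    rng = random.Random(seed)
    beta = [project_simplex([rng.random() for _ in range(V)]) for _ in range(K)]
    for t in range(1, num_docs + 1):
        doc = [float(rng.randint(0, 3)) for _ in range(V)]
        online_step(doc, beta, lam, rho=0.1 / t ** 0.5)  # decaying step size
    return beta

if __name__ == "__main__":
    for row in run_stream():
        print([round(x, 3) for x in row])
```

The soft-threshold step can set code entries to exactly zero, which is the sense in which an L1 penalty controls sparsity directly; a representation constrained to sum to one has no comparable knob. Each document is seen once and then discarded, so memory use does not grow with the length of the stream.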

Information

Published In

WWW '13: Proceedings of the 22nd international conference on World Wide Web

May 2013

1628 pages

ISBN:9781450320351

DOI:10.1145/2488388

General Chairs: Daniel Schwabe (PUC-Rio - Brazil), Virgílio Almeida (UFMG - Brazil), Hartmut Glaser (CGI.br - Brazil)

Program Chairs: Ricardo Baeza-Yates (Yahoo! Labs - Spain & Chile), Sue Moon (KAIST - South Korea)

Copyright © 2013 Copyright is held by the International World Wide Web Conference Committee (IW3C2).

Sponsors

NICBR: Núcleo de Informação e Coordenação do Ponto BR
CGIBR: Comitê Gestor da Internet no Brasil

In-Cooperation

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

WWW '13: 22nd International World Wide Web Conference

May 13 - 17, 2013

Rio de Janeiro, Brazil

Acceptance Rates

WWW '13 Paper Acceptance Rate 125 of 831 submissions, 15%;

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
