
LDA Revisited: Entropy, Prior and Convergence

Published: 24 October 2016
DOI: 10.1145/2983323.2983794

Abstract

Inference algorithms for latent Dirichlet allocation (LDA), whether for small or big data, can be broadly categorized into expectation-maximization (EM), variational Bayes (VB), and collapsed Gibbs sampling (GS). A unified understanding of these different inference algorithms remains an important open problem. In this paper, we revisit the three algorithms from the entropy perspective and show that EM achieves the best predictive perplexity (a standard accuracy metric for LDA) by directly minimizing the cross entropy between the observed word distribution and LDA's predictive distribution. Moreover, EM can change the entropy of LDA's predictive distribution by tuning LDA's priors, such as the Dirichlet hyperparameters and the number of topics, so as to minimize the cross entropy with the observed word distribution. Finally, we propose the adaptive EM (AEM) algorithm, which converges faster and is more accurate than the state-of-the-art SparseLDA [20] and AliasLDA [12] on data sets and LDA models from small to big. The core idea is that the number of active topics, measured by the residuals between E-steps at successive iterations, decreases significantly, leading to amortized O(1) time complexity in the number of topics. The open-source code of AEM is available on GitHub.
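
As a concrete illustration of the two quantities the abstract relies on, the NumPy sketch below is a minimal, hypothetical rendering, not the authors' released AEM code. predictive_perplexity exponentiates the cross entropy between a document's observed word distribution and LDA's predictive distribution theta @ phi (the quantity EM minimizes above), and adaptive_e_step sketches the residual idea: responsibilities are re-evaluated only for topics that changed by more than a threshold since the previous E-step, which is what pushes the per-token cost toward amortized O(1) in the number of topics. The function names and the tol threshold are illustrative assumptions, not taken from the paper.

    import numpy as np

    def predictive_perplexity(doc_counts, theta, phi):
        """Perplexity = exp(cross entropy) between the observed word distribution
        of a document and LDA's predictive distribution p(w | d) = theta @ phi.

        doc_counts : (V,) word counts of one document
        theta      : (K,) document-topic proportions (sums to 1)
        phi        : (K, V) topic-word distributions (rows sum to 1)
        """
        p_w = theta @ phi                        # predictive word distribution, shape (V,)
        obs = doc_counts / doc_counts.sum()      # observed (empirical) word distribution
        seen = doc_counts > 0
        cross_entropy = -np.sum(obs[seen] * np.log(p_w[seen]))
        return np.exp(cross_entropy)

    def adaptive_e_step(word_ids, theta, phi, resp, active, tol=1e-3):
        """One adaptive E-step over a document's tokens.

        Only the topics in each token's `active` index set are re-evaluated;
        topics whose responsibility residual between successive iterations falls
        below `tol` are frozen at their previous value and skipped afterwards.
        Assumes smoothed (strictly positive) theta and phi.
        """
        for i, w in enumerate(word_ids):
            a = active[i]                        # still-active topic indices for token i
            if a.size == 0:
                continue                         # all topics converged for this token
            new = theta[a] * phi[a, w]           # unnormalized responsibilities on the active set
            new *= resp[i, a].sum() / new.sum()  # renormalize to the mass the active set held before
            residual = np.abs(new - resp[i, a])
            resp[i, a] = new
            active[i] = a[residual > tol]        # shrink the active set for the next iteration
        return resp, active

On the first iteration every topic would be active (e.g. active[i] = np.arange(K)) and resp initialized uniformly; as the active sets shrink over iterations, later E-steps touch only the handful of topics per token that are still changing, while the M-step re-estimates theta and phi from resp as usual.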

References

[1] A. Ahmed, M. Aly, J. Gonzalez, S. Narayanamurthy, and A. Smola. Scalable inference in latent variable models. In WSDM, pages 123--132, 2012.
[2] S. Arora, R. Ge, Y. Halpern, D. Mimno, A. Moitra, D. Sontag, Y. Wu, and M. Zhu. A practical algorithm for topic modeling with provable guarantees. In ICML, 2013.
[3] A. Asuncion, M. Welling, P. Smyth, and Y. W. Teh. On smoothing and inference for topic models. In UAI, pages 27--34, 2009.
[4] D. M. Blei, A. Y. Ng, and M. I. Jordan. Latent Dirichlet allocation. J. Mach. Learn. Res., 3:993--1022, 2003.
[5] N. de Freitas and K. Barnard. Bayesian latent semantic analysis of multimedia databases. Technical report, University of British Columbia, 2001.
[6] E. Gaussier and C. Goutte. Relation between PLSA and NMF and implications. In SIGIR, pages 15--19, 2005.
[7] M. Girolami and A. Kabán. On an equivalence between PLSI and LDA. In SIGIR, pages 433--434, 2003.
[8] T. L. Griffiths and M. Steyvers. Finding scientific topics. Proc. Natl. Acad. Sci., 101:5228--5235, 2004.
[9] M. Hoffman, D. Blei, and F. Bach. Online learning for latent Dirichlet allocation. In NIPS, pages 856--864, 2010.
[10] T. Hofmann. Unsupervised learning by probabilistic latent semantic analysis. Machine Learning, 42:177--196, 2001.
[11] D. D. Lee and H. S. Seung. Learning the parts of objects by non-negative matrix factorization. Nature, 401:788--791, 1999.
[12] A. Q. Li, A. Ahmed, S. Ravi, and A. J. Smola. Reducing the sampling complexity of topic models. In KDD, pages 891--900, 2014.
[13] X. Liu, J. Zeng, X. Yang, J. Yan, and Q. Yang. Scalable parallel EM algorithms for latent Dirichlet allocation in multi-core systems. In WWW, pages 669--679, 2015.
[14] K. P. Murphy. Machine Learning: A Probabilistic Perspective. The MIT Press, 2012.
[15] I. Porteous, D. Newman, A. Ihler, A. Asuncion, P. Smyth, and M. Welling. Fast collapsed Gibbs sampling for latent Dirichlet allocation. In KDD, pages 569--577, 2008.
[16] J. Tang, Z. Meng, X. Nguyen, Q. Mei, and M. Zhang. Understanding the limiting factors of topic modeling via posterior contraction analysis. In ICML, 2014.
[17] Y. W. Teh, M. I. Jordan, M. J. Beal, and D. M. Blei. Hierarchical Dirichlet processes. Journal of the American Statistical Association, 101(476):1566--1581, 2006.
[18] H. Wallach, D. Mimno, and A. McCallum. Rethinking LDA: Why priors matter. In NIPS, pages 1973--1981, 2009.
[19] Y. Wang, X. Zhao, Z. Sun, H. Yan, L. Wang, Z. Jin, L. Wang, Y. Gao, C. Law, and J. Zeng. Peacock: Learning long-tail topic features for industrial applications. ACM Transactions on Intelligent Systems and Technology, 6(4):47, 2015.
[20] L. Yao, D. Mimno, and A. McCallum. Efficient methods for topic model inference on streaming document collections. In KDD, pages 937--946, 2009.
[21] J. Yuan, F. Gao, Q. Ho, W. Dai, J. Wei, X. Zheng, E. P. Xing, T.-Y. Liu, and W.-Y. Ma. LightLDA: Big topic models on modest compute clusters. In WWW, pages 1351--1361, 2015.
[22] J. Zeng. A topic modeling toolbox using belief propagation. J. Mach. Learn. Res., 13:2233--2236, 2012.
[23] J. Zeng, W. K. Cheung, and J. Liu. Learning topic models by belief propagation. IEEE Trans. Pattern Anal. Mach. Intell., 35(5):1121--1134, 2013.
[24] J. Zeng, Z.-Q. Liu, and X.-Q. Cao. Fast online EM for big topic modeling. IEEE Trans. Knowl. Data Eng., arXiv:1210.2179 [cs.LG], 2015.
[25] K. Zhai, J. Boyd-Graber, and N. Asadi. Mr. LDA: A flexible large scale topic modeling package using variational inference in MapReduce. In WWW, pages 879--888, 2012.

Cited By

  • (2018) Weighting schemes based on EM algorithm for LDA. International Journal of High Performance Systems Architecture, 8(1-2):94--104. DOI: 10.5555/3271844.3271853. Online publication date: 1-Jan-2018.

Published In

CIKM '16: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management
October 2016
2566 pages
ISBN:9781450340731
DOI:10.1145/2983323

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. adaptive EM algorithms
  2. big data
  3. convergence
  4. entropy
  5. latent Dirichlet allocation
  6. prior

Qualifiers

  • Research-article

Conference

CIKM'16: ACM Conference on Information and Knowledge Management
October 24-28, 2016
Indianapolis, Indiana, USA

Acceptance Rates

CIKM '16 paper acceptance rate: 160 of 701 submissions, 23%.
Overall acceptance rate: 1,861 of 8,427 submissions, 22%.
