DOI: 10.1145/2488388.2488518
research-article

Sparse online topic models

Published: 13 May 2013

Abstract

Topic models have shown great promise in discovering latent semantic structures from complex data corpora, ranging from text documents and web news articles to images, videos, and even biological data. To deal with massive data collections and dynamic text streams, probabilistic online topic models such as online latent Dirichlet allocation (OLDA) have recently been developed. However, due to normalization constraints, OLDA can be ineffective in controlling the sparsity of discovered representations, a desirable property for learning interpretable semantic patterns, especially when the total number of topics is large. In contrast, sparse topical coding (STC) has been successfully introduced as a non-probabilistic topic model for effectively discovering sparse latent patterns by using sparsity-inducing regularization. Unfortunately, STC cannot scale to very large datasets or deal with online text streams, partly due to its batch learning procedure. In this paper, we present a sparse online topic model, which directly controls the sparsity of latent semantic patterns by imposing sparsity-inducing regularization and learns the topical dictionary by an online algorithm. The online algorithm is efficient and guaranteed to converge. Extensive empirical results for the sparse online topic model, as well as its collapsed and supervised extensions, on a large-scale Wikipedia dataset and the medium-sized 20Newsgroups dataset demonstrate appealing performance.
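
The contrast the abstract draws between normalization constraints and sparsity-inducing regularization can be illustrated with a small sketch. This is not the paper's algorithm: it only compares rescaling a non-negative weight vector to sum to one against a non-negative soft-thresholding step, which is the proximal operator of an L1 penalty under a non-negativity constraint. The function names, the weight vector, and the penalty value `lam` are all hypothetical and chosen for illustration.

```python
def soft_threshold_nonneg(values, lam):
    """Proximal step for lam * ||x||_1 with x >= 0: shrink by lam, clip at zero."""
    return [max(0.0, v - lam) for v in values]


def normalize(values):
    """Rescale non-negative values so that they sum to one."""
    total = sum(values)
    return [v / total for v in values]


def count_zeros(values):
    """Number of entries that are exactly zero."""
    return sum(1 for v in values if v == 0.0)


# Hypothetical unnormalized topic weights for one document (assumption).
raw_weights = [0.05, 0.9, 0.02, 0.4, 0.01]

# L1-style shrinkage sets every weight at or below lam to exactly zero.
sparse_code = soft_threshold_nonneg(raw_weights, lam=0.1)

# Normalization keeps every positive weight positive, so no entry becomes zero.
normalized = normalize(raw_weights)

print("zeros after soft-thresholding:", count_zeros(sparse_code))
print("zeros after normalization:", count_zeros(normalized))
```

In this toy setting the shrinkage step zeroes the three small weights, while normalization only rescales them, which is one simplified way to read the abstract's remark that normalization makes sparsity hard to control directly.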


    Published In

    WWW '13: Proceedings of the 22nd International Conference on World Wide Web
    May 2013
    1628 pages
    ISBN: 9781450320351
    DOI: 10.1145/2488388

    Sponsors

    • NICBR: Núcleo de Informação e Coordenação do Ponto BR
    • CGIBR: Comitê Gestor da Internet no Brasil

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. large-scale data
    2. online learning
    3. sparse latent representations
    4. topic models

    Qualifiers

    • Research-article

    Conference

    WWW '13: 22nd International World Wide Web Conference
    May 13 - 17, 2013
    Rio de Janeiro, Brazil

    Acceptance Rates

    WWW '13 Paper Acceptance Rate 125 of 831 submissions, 15%;
    Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
