DOI: 10.1145/1568296.1568311

Discovering voter preferences in blogs using mixtures of topic models

Published: 23 July 2009

Abstract

In this paper we propose a new approach to capturing a blog's inclination towards a particular election candidate from its content, and to explaining why that inclination may arise. The method relies on labeled "ground truth" speeches from the election candidates and on a collection of noisy blogs that are not labeled in any way. In this unsupervised learning scenario, we used probabilistic topic models to cluster the ground truth documents for each candidate into underlying latent themes. The same topic models were then applied to the blog collection, and the "orientation" of each blog towards the different themes of the candidates' speeches was measured using the KL divergence between topic distributions over the overlapping vocabularies. We used four models for this theme matching: a baseline topic model, and three variants that weight the baseline topic model by the positive, negative, and neutral sentiments of the topics. We then used a collaborative objective function to combine the candidate preferences assigned to the blogs under the four models, using an Expectation Maximization algorithm. The novelty of our method lies in its use of unannotated data and in its combination of the views of different "experts" explaining the same phenomenon.
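
The theme-matching step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the smoothing constant `eps`, the direction of the divergence (blog topic relative to candidate theme), and the toy word distributions are all assumptions made for illustration. The sketch restricts two word distributions to their shared vocabulary, renormalizes them, and picks the candidate theme with the smallest KL divergence.

```python
import math


def restrict_and_renormalize(dist, vocab, eps=1e-9):
    """Project a word->probability dict onto `vocab`, smooth, and renormalize.

    `eps` is an assumed smoothing constant that keeps every probability
    positive so the logarithm in the KL divergence is always defined.
    """
    smoothed = {w: dist.get(w, 0.0) + eps for w in vocab}
    total = sum(smoothed.values())
    return {w: p / total for w, p in smoothed.items()}


def kl_divergence(p, q):
    """KL(p || q) for two distributions defined over the same keys."""
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)


def closest_theme(blog_topic, candidate_themes):
    """Return (divergence, candidate, theme_index) with the smallest divergence.

    Returns None when the blog topic shares no words with any theme.
    """
    best = None
    for candidate, themes in candidate_themes.items():
        for idx, theme in enumerate(themes):
            # Only words present in both distributions are compared.
            shared = sorted(set(blog_topic) & set(theme))
            if not shared:
                continue
            p = restrict_and_renormalize(blog_topic, shared)
            q = restrict_and_renormalize(theme, shared)
            score = kl_divergence(p, q)
            if best is None or score < best[0]:
                best = (score, candidate, idx)
    return best


# Hypothetical toy data: one theme per candidate, one topic from a blog.
candidate_themes = {
    "candidate_a": [{"tax": 0.6, "jobs": 0.3, "trade": 0.1}],
    "candidate_b": [{"tax": 0.1, "jobs": 0.2, "war": 0.7}],
}
blog_topic = {"tax": 0.5, "jobs": 0.4, "sports": 0.1}

result = closest_theme(blog_topic, candidate_themes)
print(result)
```

In this toy example the blog topic is matched to the theme of `candidate_a`, whose relative weights on the shared words "tax" and "jobs" are closer to the blog's. The sentiment-weighted variants and the EM-based combination of the four models are not sketched here, because the abstract does not specify their form.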

Cited By

  • (2012) Textual predictors of bill survival in congressional committees. Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 793-802. DOI: 10.5555/2382029.2382157. Online publication date: 3-Jun-2012
  • (2012) Sentiment analysis of online news text: a case study of appraisal theory. Online Information Review 36:6, pages 858-878. DOI: 10.1108/14684521211287936. Online publication date: 23-Nov-2012

Published In

AND '09: Proceedings of The Third Workshop on Analytics for Noisy Unstructured Text Data
July 2009
127 pages
ISBN:9781605584966
DOI:10.1145/1568296

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. KL divergence
  2. blogs
  3. sentiments
  4. social network
  5. topic models

Qualifiers

  • Research-article

Conference

AND '09

Acceptance Rates

AND '09 Paper Acceptance Rate: 15 of 22 submissions, 68%
Overall Acceptance Rate: 15 of 22 submissions, 68%
