research-article

Topic Modelling for Identification of Vaccine Reactions in Twitter

Published: 29 January 2019

Abstract

Background: Detection of vaccine safety signals depends on established reporting systems, in which there is inevitably a lag between an adverse reaction to a vaccine, the reporting of it, and the subsequent processing of reports. It is therefore desirable to detect safety signals earlier, ideally in close to real time. Extensive use of social media has provided a platform for sharing and seeking health-related information, and the immediacy of social media conversations means that they are an ideal candidate for early detection of vaccine safety signals. The objective of this study is to evaluate topic models for identifying user posts on Twitter that are most likely to contain vaccine safety signals. This is an initial step in the overall research to determine whether reliable vaccine safety signals can be detected in social media streams. The techniques focused on identifying the model design and number of topics that best revealed documents containing vaccine safety signals, to assist with dimension reduction and subsequent labelling of the text data. The study compared Gensim LDA, MALLET, and jLDADMM DMM models to determine the most effective model for detecting vaccine safety signals, assisted by an evaluation process that used an adjusted F-scoring technique over a labelled subset of the documents.
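
The evaluation idea in the abstract, scoring topics against a labelled subset of documents, can be illustrated with a minimal sketch. The abstract does not define the "adjusted" F-score, so the code below uses the standard F-beta score as a stand-in. The function names (`f_beta`, `score_topics`), the input layout, and the rule that each document belongs to a single dominant topic are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict


def f_beta(tp, fp, fn, beta=1.0):
    """Standard F-beta score from raw counts; returns 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def score_topics(dominant_topic, labels, beta=1.0):
    """Score each topic as if "assigned to this topic" predicted a safety signal.

    dominant_topic: dict mapping doc id -> topic id (output of any topic model; hypothetical layout)
    labels: dict mapping doc id -> True/False for the labelled subset only
    """
    total_pos = sum(1 for is_signal in labels.values() if is_signal)
    tp = defaultdict(int)  # labelled signal documents per topic
    fp = defaultdict(int)  # labelled non-signal documents per topic
    for doc_id, is_signal in labels.items():
        topic = dominant_topic[doc_id]
        if is_signal:
            tp[topic] += 1
        else:
            fp[topic] += 1
    topics = set(tp) | set(fp)
    # Signal documents that landed in other topics count as false negatives.
    return {t: f_beta(tp[t], fp[t], total_pos - tp[t], beta) for t in topics}
```

Under these assumptions, the same scoring could be repeated for each candidate model and each number of topics, and the best per-topic score compared across configurations. Setting `beta` above 1 would weight recall more heavily, which may suit a screening step where missed signals are costly.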

Published In

ACM Other conferences
ACSW '19: Proceedings of the Australasian Computer Science Week Multiconference
January 2019
486 pages
ISBN: 9781450366038
DOI: 10.1145/3290688
In-Cooperation

  • CORE - Computing Research and Education
  • Macquarie University-Sydney

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Social media
  2. Topic modelling
  3. Twitter
  4. Vaccine safety surveillance

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ACSW 2019
ACSW 2019: Australasian Computer Science Week 2019
January 29 - 31, 2019
Sydney, NSW, Australia

Acceptance Rates

ACSW '19 Paper Acceptance Rate 61 of 141 submissions, 43%;
Overall Acceptance Rate 61 of 141 submissions, 43%

Cited By

  • (2023) Visualizing Change and Correlation of Topics With LDA and Agglomerative Clustering on COVID-19 Vaccine Tweets. IEEE Access 11 (51647-51656). DOI: 10.1109/ACCESS.2023.3278979. Online publication date: 2023
  • (2022) Topic Modelling Application for Determining Competitiveness Factors of the Small Business Firms. International Journal of Social Science and Business 6:2 (174-182). DOI: 10.23887/ijssb.v6i2.43164. Online publication date: 28-Jun-2022
  • (2022) Vaccine Adverse Event Mining of Twitter Conversations: 2-Phase Classification Study. JMIR Medical Informatics 10:6 (e34305). DOI: 10.2196/34305. Online publication date: 16-Jun-2022
  • (2022) Understanding Public Sentiment Towards a Public Rally Using Text and Social Media Analytic. 2022 IEEE International Conference on Computing (ICOCO) (16-20). DOI: 10.1109/ICOCO56118.2022.10031692. Online publication date: 14-Nov-2022
  • (2022) Latent Semantic Analysis based Real-world Application of Topic Modeling: A Review Study. 2022 Second International Conference on Artificial Intelligence and Smart Energy (ICAIS) (1142-1149). DOI: 10.1109/ICAIS53314.2022.9742848. Online publication date: 23-Feb-2022
  • (2022) Information systems for vaccine safety surveillance. Human Vaccines & Immunotherapeutics 18:6. DOI: 10.1080/21645515.2022.2100173. Online publication date: 26-Sep-2022
  • (2022) Evolving Consumer Responses to Social Issue Campaigns: A Data-Mining Case of COVID-19 Ads on YouTube. Journal of Interactive Advertising 22:2 (195-206). DOI: 10.1080/15252019.2022.2063770. Online publication date: 15-Jun-2022
  • (2021) Exploring the Expression Differences Between Professionals and Laypeople Toward the COVID-19 Vaccine: Text Mining Approach. Journal of Medical Internet Research 23:8 (e30715). DOI: 10.2196/30715. Online publication date: 27-Aug-2021
  • (2021) Themes, communities and influencers of online probiotics chatter: A retrospective analysis from 2009-2017. PLOS ONE 16:10 (e0258098). DOI: 10.1371/journal.pone.0258098. Online publication date: 21-Oct-2021
  • (2021) Exploring public perceptions of the COVID-19 vaccine online from a cultural perspective. Telematics and Informatics 65:C. DOI: 10.1016/j.tele.2021.101712. Online publication date: 1-Dec-2021
