A systematic review of the use of topic models for short text social media analysis

Published: 01 May 2023

Abstract

Recently, research on short text topic models has addressed the challenges of social media datasets. These models are typically evaluated using automated measures. However, recent work suggests that these evaluation measures do not indicate whether the topics produced can yield meaningful insights for those examining social media data. Efforts to address this issue, including gauging the alignment between automated and human evaluation tasks, are hampered by a lack of knowledge about how researchers use topic models. Further problems could arise if researchers do not construct topic models optimally or use them in a way that exceeds the models’ limitations. These scenarios threaten the validity of topic model development and the insights produced by researchers employing topic modelling as a methodology. Yet there is currently little information about how and why topic models are used in applied research. We therefore performed a systematic literature review of 189 articles in which topic modelling was used for social media analysis, to understand how and why researchers use topic models for this purpose. Our results suggest that the development of topic models is not aligned with the needs of those who use them for social media analysis. We found that researchers use topic models sub-optimally and that there is a lack of methodological support for building and interpreting topics. We offer a set of recommendations for topic model researchers to address these problems and bridge the gap between development and applied research on short text topic models.

References

[1]
Abd-Alrazaq A, Alhuwail D, Househ M, Hamdi M, Shah Z, et al. Top concerns of tweeters during the covid-19 pandemic: infoveillance study J Med Internet Res 2020 22 4 19016
[2]
Abdul-Rahman M, Chan EH, Wong MS, Irekponor VE, and Abdul-Rahman MO A framework to simplify pre-processing location-based social media big data for sustainable urban planning and management Cities 2021 109 102986
[3]
Agarwal AK, Wong V, Pelullo AM, Guntuku S, Polsky D, Asch DA, Muruako J, and Merchant RM Online reviews of specialized drug treatment facilities–identifying potential drivers of high and low patient satisfaction J Gen Intern Med 2020 35 6 1647-1653
[4]
Albalawi R, Yeap TH, and Benyoucef M Using topic modeling methods for short-text data: A comparative analysis Frontiers in Artificial Intelligence 2020 3 42
[5]
Alghamdi R and Alfalqi K A survey of topic modeling in text mining Int J Adv Comput Sci Appl 2015 6 1 1-10
[6]
Al-Ramahi MA, Liu J, and El-Gayar OF Discovering design principles for health behavioral change support systems: A text mining approach ACM Transactions on Management Information Systems (TMIS) 2017 8 2–3 1-24
[7]
Alshalan R, Al-Khalifa H, Alsaeed D, Al-Baity H, and Alshalan S Detection of hate speech in COVID-19-related tweets in the Arab region: deep learning and topic modeling approach J Med Internet Res 2020 22 12 22609
[8]
Amin MH, Mohamed EK, and Elragal A Corporate disclosure via social media: a data science approach Online Information Review 2020 40 1 278-298
[9]
Arun R, Suresh V, Madhavan CV, and Murthy MN On finding the natural number of topics with Latent Dirichlet Allocation: Some observations 2010 Pacific-Asia Conference on Knowledge Discovery and Data Mining 2010 Springer 391-402
[10]
Aslett K, Webb Williams N, Casas A, Zuidema W, and Wilkerson J What was the problem in Parkland? using social media to measure the effectiveness of issue frames Policy Studies Journal 2020 50 1 266-289
[11]
Bahja M and Safdar GA Unlink the link between COVID-19 and 5G networks: an NLP and SNA based approach IEEE Access 2020 8 209127-209137
[12]
Bail CA, Argyle LP, Brown TW, Bumpus JP, Chen H, Hunzaker MF, Lee J, Mann M, Merhout F, and Volfovsky A Exposure to opposing views on social media can increase political polarization Proc Natl Acad Sci 2018 115 37 9216-9221
[13]
Berg S, König T, and Koster A-K Political opinion formation as epistemic practice: The hashtag assemblage of #metwo Media and Communication 2020 8 4 84-95
[14]
Bérubé M, Tang T-U, Fortin F, Ozalp S, Williams ML, and Burnap P Social media forensics applied to assessment of post-critical incident social reaction: the case of the 2017 Manchester Arena terrorist attack Forensic Sci Int 2020 313 110364
[15]
Bhatia S, Lau JH, and Baldwin T Topic intrusion for automatic topic model evaluation Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP) 2018 EMNLP 844-849
[16]
Bird S and Loper E NLTK: the natural language toolkit 2004 Association for Computational Linguistics
[17]
Blei DM and Lafferty JD Dynamic topic models Proceeding of the 23rd international conference on machine learning 2006 IEEE 113-120
[18]
Blei DM, Ng AY, and Jordan MI Latent dirichlet allocation J Mach Learn Res 2003 3 1 993-1022
[19]
Booth A Cochrane or cock-eyed? How should we conduct systematic reviews of qualitative research? Qualitative Evidence-Based Practice Conference ‘Taking a Critical Stance’ 2001 Education-line
[20]
Bose T, Illina I, Fohr D (2021) Generalisability of topic models in cross-corpora abusive language detection. In: 2021 Workshop on NLP4IF: Censorship, Disinformation, and Propaganda, North American Chapter of the Association for Computational Linguistics
[21]
Brown NM Methodological cyborg as black feminist technology: constructing the social self using computational digital autoethnography and social media Cult Stud Crit Methodol 2019 19 1 55-67
[22]
Cai M, Shah N, Li J, Chen W-H, Cuomo RE, Obradovich N, and Mackey TK Identification and characterization of tweets related to the 2015 Indiana HIV outbreak: A retrospective infoveillance study Plos one 2020 15 8 0235150
[23]
Cao J, Xia T, Li J, Zhang Y, and Tang S A density-based method for adaptive LDA model selection Neurocomputing 2009 72 7–9 1775-1781
[24]
Carlson J and Harris K Quantifying and contextualizing the impact of bioRxiv preprints through automated social media audience segmentation PLoS Biology 2020 18 9 3000860
[25]
Cesare N, Oladeji O, Ferryman K, Wijaya D, Hendricks-Muñoz KD, Ward A, and Nsoesie EO Discussions of miscarriage and preterm births on Twitter Paediatric and perinatal epidemiology 2020 34 5 544-552
[26]
Chae BK The evolution of the Internet of Things (IoT): A computational text analysis Telecommunications Policy 2019 43 10
[27]
Chan M-pS, Jamieson KH, and Albarracin D Prospective associations of regional social media messages with attitudes and actual vaccination: A big data and survey study of the influenza vaccine in the United States Vaccine 2020 38 40 6236-6247
[28]
Chang J, Gerrish S, Wang C, Boyd-Graber JL, and Blei DM Reading tea leaves: how humans interpret topic models Proceedings of the 23rd Annual Conference on Neural Information Processing Systems 2009 IEEE 288-296
[29]
Charmaz K Teaching theory construction with initial grounded theory tools: A reflection on lessons and learning Qualitative health research 2015 25 12 1610-1622
[30]
Chauhan U and Shah A Topic modeling using latent Dirichlet allocation: A survey ACM Computing Surveys (CSUR) 2021 54 7 1-35
[31]
Chen T-H, Thomas SW, and Hassan AE A survey on the use of topic models when mining software repositories Empirical Software Engineering 2016 21 5 1843-1919
[32]
Chen L, Lu X, Yuan J, Luo J, Luo J, Xie Z, and Li D A social media study on the associations of flavored electronic cigarettes with health symptoms: Observational study Journal of Medical Internet Research 2020 22 6 17496
[33]
Cheng X, Yan X, Lan Y, and Guo J BTM: Topic modeling over short texts IEEE Transactions on Knowledge and Data Engineering 2014 26 12 2928-2941
[34]
Colicchia C and Strozzi F Supply chain risk management: a new methodology for a systematic literature review 2012 Supply Chain Management An International Journal
[35]
Creswell JW, Klassen AC, Plano Clark VL, Smith KC, et al. Best practices for mixed methods research in the health sciences Bethesda (Maryland): Natl Inst Health 2011 2013 541-545
[36]
Cuello-Garcia C, Pérez-Gaxiola G, and Amelsvoort L Social media can have an impact on how we manage and investigate the COVID-19 pandemic Journal of clinical epidemiology 2020 127 198-201
[37]
Curiskis SA, Drake B, Osborn TR, and Kennedy PJ An evaluation of document clustering and topic modelling in two online social networks: Twitter and Reddit Info Process Manag 2019 57 102034
[38]
Deng Q, Gao Y, Wang C, and Zhang H Detecting information requirements for crisis communication from social media data: an interactive topic modeling approach International J Disaster Risk Reduct 2020 50 101692
[39]
Denyer D and Tranfield D Buchanan DA and Bryman A Producing a systematic review The Sage Handbook of Organizational Research Methods 2009 USA Sage Publications Ltd 671-689
[40]
Deveaud R, SanJuan E, and Bellot P Accurate and effective latent concept modeling for ad hoc information retrieval Document numérique 2014 17 1 61-84
[41]
Doogan C and Buntine W Topic model or topic twaddle? re-evaluating semantic interpretability measures Proceedings of the 2021 conference of the North American chapter of the association for computational linguistics: human language technologies 2021 Association for Computational Linguistics 3824-3848
[42]
Doogan C, Buntine W, Linger H, and Brunt S Public perceptions and attitudes toward COVID-19 nonpharmaceutical interventions across six countries: a topic modeling analysis of Twitter data J Med Internet Res 2020 22 9 21419
[43]
Dyda A, Shah Z, Surian D, Martin P, Coiera E, Dey A, Leask J, and Dunn AG HPV vaccine coverage in Australia and associations with HPV vaccine information exposure among Australian Twitter users Human Vaccines Immunother 2019 15 7–8 1488-1495
[44]
El-Bassel N, Hochstatter KR, Slavin MN, Yang C, Zhang Y, and Muresan S Harnessing the power of social media to understand the impact of COVID-19 on people who use drugs during lockdown and social distancing J Addict Med 2021 2021 10
[45]
Erfanian PY, Cami BR, and Hassanpour H An evolutionary event detection model using the matrix decomposition oriented Dirichlet process Expert Systems with Applications 2022 189
[46]
Eysenbach G et al. Infodemiology and infoveillance: framework for an emerging set of public health informatics methods to analyze search, communication and publication behavior on the internet J Med Internet Res 2009 11 1 1157
[47]
Feldhege J, Moessner M, and Bauer S Who says what? Content and participation characteristics in an online depression community Journal of Affective Disorders 2020 263 521-527
[48]
Fischer-Preßler D, Schwemmer C, and Fischbach K Collective sense-making in times of crisis: Connecting terror management theory with Twitter user reactions to the Berlin terrorist attack Computers in Human Behavior 2019 100 138-151
[49]
Gobbo E, Fontanella S, Sarra A, and Fontanella L Emerging topics in Brexit debate on Twitter around the deadlines Social Indicators Research 2021 156 2 669-688
[50]
Greene D, O’Callaghan D, and Cunningham P How many topics? Stability analysis for topic models 2014 joint European conference on machine learning and knowledge discovery in databases (ECML-PKDD) 2014 Springer 498-513
[51]
Gregoriades A and Pampaka M Electronic word of mouth analysis for new product positioning evaluation Electronic Commerce Research and Applications 2020 42
[52]
Griffiths TL and Steyvers M Finding scientific topics Proceedings of the National academy of Sciences 2004 101 1 5228-5235
[53]
Gurajala S, Dhaniyala S, and Matthews JN Understanding public response to air quality using tweet analysis Social Media + Society 2019 5 3 1-14
[54]
Ha T, Beijnon B, Kim S, Lee S, and Kim JH Examining user perceptions of smartwatch through dynamic topic modeling Telematics and Informatics 2017 34 7 1262-1273
[55]
Hacker J, Brocke J, Handali J, Otto M, and Schneider J Virtually in this together-how web-conferencing systems enabled a new virtual togetherness during the COVID-19 crisis European Journal of Information Systems 2020 29 5 563-584
[56]
Haghighi NN, Liu XC, Wei R, Li W, and Shao H Using Twitter data for transit performance assessment: a framework for evaluating transit riders’ opinions about quality of service Public Transport 2018 10 2 363-377
[57]
Han AT, Laurian L, and Dewald J Plans versus political priorities: Lessons from municipal election candidates’ social media communications J Am Plan Assoc 2020 2020 1-17
[58]
Hannigan TR, Haans RF, Vakili K, Tchalian H, Glaser VL, Wang MS, Kaplan S, and Jennings PD Topic modeling in management research: Rendering new theory from textual data Academy of Management Annals 2019 13 2 586-632
[59]
Harrando I, Lisena P, and Troncy R Apples to Apples: a systematic evaluation of topic models Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021) 2021 INCOMA Ltd. 483-493
[60]
Hemmatian B, Sloman SJ, Priva UC, and Sloman SA Think of the consequences: A decade of discourse about same-sex marriage Behavior research methods 2019 51 4 1565-1585
[61]
Hemsley J, Erickson I, Jarrahi MH, and Karami A Digital nomads, coworking, and other expressions of mobile work on Twitter First Monday 2020 2020 10
[62]
Hoffman M, Bach F, and Blei D Online learning for Latent Dirichlet Allocation Advances in Neural Information Processing Systems 2010 23 856-864
[63]
Hong L, Davison BD (2010) Empirical study of topic modeling in Twitter. Proceedings of the first workshop on social media analytics, pp. 80–88
[64]
Hoyle AM, Goel P, and Resnik P Improving neural topic models using knowledge distillation Proceeding of the 2020 conference on Empirical Methods in Natural Language Processing (EMNLP) 2020 EMNLP 1752-1771
[65]
Hu Y, Deng C, and Zhou Z A semantic and sentiment analysis on online neighborhood reviews for understanding the perceptions of people toward their living environments Annals of the American Association of Geographers 2019 109 4 1052-1073
[66]
Huang J, Peng M, Li P, Hu Z, and Xu C Improving biterm topic model with word embeddings World Wide Web 2020 23 6 3099-3124
[67]
Hwang Y, Kim HJ, Choi HJ, and Lee J Exploring abnormal behavior patterns of online users with emotional eating behavior: topic modeling study J Med Internet Res 2020 22 3 15700
[68]
Ibrahim NF and Wang X Decoding the sentiment dynamics of online retailing customers: Time series analysis of social media Computers in Human Behavior 2019 96 32-45
[69]
Ibrahim NF and Wang X A text analytics approach for online retailing service improvement: Evidence from Twitter Decision Support Systems 2019 121 37-50
[70]
Jacobi C, Van Atteveldt W, and Welbers K Quantitative analysis of large amounts of journalistic texts using topic modelling Digital Journalism 2016 4 1 89-106
[71]
Jamison A, Broniatowski DA, Smith MC, Parikh KS, Malik A, Dredze M, and Quinn SC Adapting and extending a typology to identify vaccine misinformation on Twitter American Journal of Public Health 2020 110 S3 331-339
[72]
Jelodar H, Wang Y, Yuan C, Feng X, Jiang X, Li Y, and Zhao L Latent Dirichlet Allocation (LDA) and topic modeling: Models, applications, a survey Multimedia Tools and Applications 2019 78 11 15169-15211
[73]
Jenkins A, Croitoru A, Crooks AT, and Stefanidis A Crowdsourcing a collective sense of place PloS One 2016 11 4 0152932
[74]
Jeong B, Yoon J, and Lee J-M Social media mining for product planning: A product opportunity mining approach based on topic modeling and sentiment analysis International Journal of Information Management 2019 48 280-290
[75]
Jin Y, Zhao H, Liu M, Du L, and Buntine W Neural attention-aware hierarchical topic model Proceedings of the 2021Conference on Empirical Methods in Natural Language Processing (EMNLP) 2021 USA EMNLP 1042-1052
[76]
Jónsso E An evaluation of topic modelling techniques for Twitter 2016 ACM
[77]
Joo S, Lu K, and Lee T Analysis of content topics, user engagement and library factors in public library social media based on text mining Online Info Rev 2020 44 258
[78]
Kar AK What affects usage satisfaction in mobile payments? Modelling user generated content to develop the ‘digital service usage satisfaction model’ Information Systems Frontiers 2020 23 5 1341-1361
[79]
Kirilenko AP, Stepchenkova SO, and Dai X Automated topic modeling of tourist reviews: Does the Anna Karenina principle apply? Tourism Management 2021 83
[80]
Kitazawa K and Hale SA Social media and early warning systems for natural disasters: a case study of Typhoon Etau in Japan Int J Disaster Risk Reduct 2021 52 101926
[81]
Kitchenham BA, Dyba T, and Jorgensen M Evidence-based software engineering Proceedings of the 26th international conference on software engineering 2004 IEEE 273-281
[82]
Kitchenham B, Brereton OP, Budgen D, Turner M, Bailey J, and Linkman S Systematic literature reviews in software engineering-a systematic literature review Information and software technology 2009 51 1 7-15
[83]
Kjellin PE and Liu Y A survey on interactivity in topic models International Journal of Advanced Computer Science and Applications 2016 7 4 456-461
[84]
Kurten S and Beullens K #Coronavirus: monitoring the Belgian Twitter discourse on the severe acute respiratory syndrome coronavirus 2 pandemic Cyberpsychology, Behavior, and Social Networking 2021 24 2 117-122
[85]
Kwon KH, Chadha M, and Wang F Proximity and networked news public: Structural topic modeling of global Twitter conversations about the 2017 Quebec mosque shooting International Journal of Communication 2019 13 2652-2675
[86]
Lau JH, Newman D, and Baldwin T Machine reading tea leaves: Automatically evaluating topic coherence and topic model quality Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics 2014 ACM 530-539
[87]
Le GM, Radcliffe K, Lyles C, Lyson HC, Wallace B, Sawaya G, Pasick R, Centola D, and Sarkar U Perceptions of cervical cancer prevention on Twitter uncovered by different sampling strategies PloS One 2019 14 2 0211931
[88]
Lee TY, Smith A, Seppi K, Elmqvist N, Boyd-Graber J, and Findlater L The human touch: How non-expert users perceive, interpret, and fix topic models International Journal of Human-Computer Studies 2017 105 28-42
[89]
Li P, Cho H, Qin Y, and Chen A #MeToo as a connective movement: Examining the frames adopted in the anti-sexual harassment movement in China Social Science Computer Review 2020 39 5 1030-1049
[90]
Li Y, Cai M, Qin S, and Lu X Depressive emotion detection and behavior analysis of men who have sex with men via social media Frontiers in Psychiatry 2020 11 830
[91]
Liang B, Wang Y, and Tsou M-H A fitness theme may mitigate regional prevalence of overweight and obesity: Evidence from Google search and tweets Journal of Health Communication 2019 24 9 683-692
[92]
Likhitha S, Harish B, and Kumar HK A detailed survey on topic modeling for document and short text data International Journal of Computer Applications 2019 178 39 1-9
[93]
Lima BN, Balducci P, Passos RP, Novelli C, Fileni CHP, Vieira F, Camargo LB, and Junior GdBV Artificial Intelligence based on fuzzy logic for the analysis of human movement in healthy people: A systematic review Artificial Intelligence Review 2021 54 2 1507-1523
[94]
Liu X A big data approach to examining social bots on Twitter J Serv Market 2019 11 1-10
[95]
Liu X Analyzing the impact of user-generated content on B2B Firms’ stock performance: Big data analysis with machine learning methods Industrial Marketing Management 2020 86 30-39
[96]
Liu L and Tang L A survey of statistical topic model for multi-label classification Proceedings of the 26th international conference on geoinformatics 2018 IEEE 1-5
[97]
Lock O and Pettit C Social media as passive geo-participation in transportation planning-How effective are topic modeling and sentiment analysis in comparison with citizen surveys? Geo-spatial Info Sci 2020 23 4 275-292
[98]
Loper E and Bird S NLTK: The Natural Language Toolkit 2002 Association for Computational Linguistics
[99]
Low DM, Rumker L, Talkar T, Torous J, Cecchi G, and Ghosh SS Natural Language Processing reveals vulnerable mental health support groups and heightened health anxiety on Reddit during COVID-19: Observational study Journal of Medical Internet Research 2020 22 10 22635
[100]
Mazarura J and de Waal A A comparison of the performance of Latent Dirichlet Allocation and the Dirichlet Multinomial Mixture Model on short text 2016 Pattern Recognition Association of South Africa and Robotics and Mechatronics International Conference (PRASA-RobMech) 2016 IEEE 1-6
[101]
McCallum AK MALLET: A Machine Learning for Language Toolkit 2002 MALLET
[102]
Medford RJ, Saleh SN, Sumarsono A, Perl TM, and Lehmann CU An infodemic: Leveraging high-volume Twitter data to understand early public sentiment for the Coronavirus disease 2019 outbreak Open Forum Infect dis 2020 7 7 1-10
[103]
Mehrotra R, Sanner S, Buntine W, and Xie L Improving LDA topic models for microblogs via tweet pooling and automatic labeling Proceeding of the 36th International ACM SIGIR conference on research and development in information retrieval 2013 ACM 889-892
[104]
Meyer TR, Balague D, Camacho-Collados M, Li H, Khuu K, Brantingham PJ, and Bertozzi AL A year in Madrid as described through the analysis of geotagged Twitter data Environment and Planning B: Urban Analytics and City Science 2019 46 9 1724-1740
[105]
Moher D, Liberati A, Tetzlaff J, Altman DG, et al. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement International Journal of Surgery 2010 8 5 336-341
[106]
Mostafa MM and Nebot NR The Arab image in Spanish social media: A Twitter sentiment analytics approach Journal of Intercultural Communication Research 2020 49 2 133-155
[107]
Mulunda CK, Wagacha PW, and Muchemi L Review of trends in topic modeling techniques, tools, inference algorithms and applications Proceedings of the 5th International Conference on Soft Computing and Machine Intelligence (ISCMI) 2018 IEEE 28-37
[108]
Murzintcev N (2020) ldatuning: tuning of the latent dirichlet allocation models parameters. version 1.0.2
[109]
Murashka V, Liu J, and Peng Y Fitspiration on Instagram: identifying topic clusters in user comments to posts with objectification features Health Commun 2020 2020 1-12
[110]
Nguyen D, Liakata M, DeDeo S, Eisenstein J, Mimno D, Tromble R, and Winters J How we do things with words: analyzing text as social and cultural data Front Artif Intell 2020 62 1-10
[111]
Nizzoli L, Tardelli S, Avvenuti M, Cresci S, Tesconi M, and Ferrara E Charting the landscape of online cryptocurrency manipulation IEEE Access 2020 8 113230-113245
[112]
Nobles AL, Leas EC, Latkin CA, Dredze M, Strathdee SA, and Ayers JW HIV: Alignment of HIV-related visual content on Instagram with public health priorities in the US AIDS and Behav 2020 2020 1-9
[113]
Nolasco D and Oliveira J Mining social influence in science and vice-versa: A topic correlation approach International Journal of Information Management 2020 51
[114]
Nugroho R, Paris C, Nepal S, Yang J, and Zhao W A survey of recent methods on deriving topics from Twitter: algorithm to evaluation Knowl Info Syst 2020 62 2485-2519
[115]
Okon E, Rachakonda V, Hong HJ, Callison-Burch C, and Lipoff JB Natural Language Processing of Reddit data to evaluate dermatology patient experiences and therapeutics Journal of the American Academy of Dermatology 2020 83 3 803-808
[116]
Pang PC-I, McKay D, Chang S, Chen Q, Zhang X, and Cui L Privacy concerns of the Australian My Health Record: Implications for other large-scale opt-out personal health records Information Processing & Management 2020 57 6
[117]
Pavlova A and Berkers P “Mental health” as defined by Twitter: Frames, emotions, stigma Health Commun 2020 2020 1-11
[118]
Peres R, Talwar S, Alter L, Elhanan M, and Friedmann Y Narrowband influencers and global icons: Universality and media compatibility in the communication patterns of political leaders worldwide Journal of International Marketing 2020 28 1 48-65
[119]
Pousti H, Urquhart C, and Linger H Researching the virtual: A framework for reflexivity in qualitative social media research Information Systems Journal 2021 31 3 356-383
[120]
Pruss D, Fujinuma Y, Daughton AR, Paul MJ, Arnot B, Albers Szafir D, and Boyd-Graber J Zika discourse in the Americas: A multilingual topic analysis of Twitter PloS one 2019 14 5 0216922
[121]
Puschmann C, Ausserhofer J, and Šlerka J Converging on a nativist core? Comparing issues on the Facebook pages of the Pegida movement and the alternative for Germany Euro J Commun 2020 35 3 230-248
[122]
Qi B, Costin A, and Jia M A framework with efficient extraction and analysis of Twitter data for evaluating public opinions on transportation services Travel Behaviour and sSciety 2020 21 10-23
[123]
Qiang J, Qian Z, Li Y, Yuan Y, and Wu X Short text topic modeling techniques, applications, and performance: a survey IEEE Trans Knowl Data Eng 2020 2020 19
[124]
Rana TA, Cheah YN, and Letchmunan S Topic modeling in sentiment analysis: a systematic review J ICT Res Appl 2016 10 1 76-93
[125]
Rashman L, Withers E, and Hartley J Organizational learning and knowledge in public service organizations: A systematic review of the literature International journal of management reviews 2009 11 4 463-494
[126]
Řehůřek P and Sojka P Software Framework for Topic Modelling with Large Corpora Proceedings of the 7thFrameworks 2010 ELRA 45-50
[127]
Reyes-Menendez A, Saura JR, and Filipe F Marketing challenges in the #MeToo era: Gaining business insights using an exploratory sentiment analysis Heliyon 2020 6 3 03626
[128]
Roberts ME, Stewart BM, Tingley D, Lucas C, Leder-Luis J, Gadarian SK, Albertson B, and Rand DG Structural topic models for open-ended survey responses American Journal of Political Science 2014 58 4 1064-1082
[129]
Rosen A and Ihara I Giving you more characters to express yourself 2017 Twitter
[130]
Schofield A and Mimno D Comparing apples to apple: The effects of stemmers on topic models Transactions of the Association for Computational Linguistics 2016 4 287-300
[131]
Schofield A, Magnusson M, Thompson L, and Mimno D Understanding text pre-processing for Latent Dirichlet Allocation Proceedings of the 15th conference of the European chapter of the association for computational linguistics (EACL) 2017 EACL 432-436
[132]
Steuber F, Schoenfeld M, and Rodosek GD Topic modeling of short texts using anchor words International Conference on Web Intelligence, Mining And Semantics 2020 Association for Computing Machinery 210-219
[133]
Sun X, Liu X, Li B, Duan Y, Yang H, and Hu J Exploring topic models in software engineering data analysis: A survey Proceedings of the 17th IEEE/ACIS International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing (SNPD) 2016 IEEE 357-362
[134]
Surian D, Nguyen DQ, Kennedy G, Johnson M, Coiera E, and Dunn AG Characterizing Twitter discussions about HPV vaccines using topic modeling and community detection Journal of Medical Internet Research 2016 18 8 6045
[135]
Svartzman GG, Ramirez-Marquez JE, and Barker K Social media analytics to connect system performability and quality of experience, with an application to Citibike Computers & Industrial Engineering 2020 139
[136]
Thorson K, Medeiros M, Cotter K, Chen Y, Rodgers K, Bae A, and Baykaldi S Platform civics: Facebook in the local information infrastructure Digital Journalism 2020 8 10 1231-1257
[137]
Titov I, McDonald R (2008) Modeling online reviews with multi-grain topic models. Proceedings of the 17th international conference on the world wide web, pp. 111–120
[138]
Tommasel A and Godoy D Short-text feature construction and selection in social media data: A survey Artificial Intelligence Review 2018 49 3 301-338
[139]
Tranfield D, Denyer D, and Smart P Towards a methodology for developing evidence-informed management knowledge by means of systematic review British Journal of Management 2003 14 3 207-222
[140]
Valdez D, Ten Thij M, Bathina K, Rutter LA, and Bollen J Social media insights into US mental health during the COVID-19 pandemic: longitudinal analysis of Twitter data J Med Internet Res 2020 22 12 21418
[141]
Vaughan M Talking about tax: the discursive distance between 38 Degrees and GetUp Journal of Information Technology & Politics 2020 17 2 114-129
[142]
Vayansky I and Kumar SA A review of topic modeling methods Information Systems 2020 94
[143]
Wang J, Zhou Y, Zhang W, Evans R, and Zhu C Concerns expressed by Chinese social media users during the COVID-19 pandemic: Content analysis of Sina Weibo microblogging data Journal of Medical Internet Research 2020 22 11 22152
[144]
Wicke P and Bolognesi MM Framing COVID-19: How we conceptualize and discuss the pandemic on Twitter PloS one 2020 15 9 0240010
[145]
Wong A, Ho S, Olusanya O, Antonini MV, and Lyness D The use of social media and online communications in times of pandemic COVID-19 Journal of the Intensive Care Society 2021 22 3 255-260
[146]
Wu W, Li J, He Z, Ye X, Zhang J, Cao X, and Qu H Tracking spatio-temporal variation of geo-tagged topics with social media in China: A case study of 2016 Hefei rainstorm International Journal of Disaster Risk Reduction 2020 50
[147]
Wu X, Li C, Zhu Y, and Miao Y Short text topic modeling with topic distribution quantization and negative sampling decoder Proceeding of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) 2020 IEEE 1772-1782
[148]
Wu Z, Zhang Y, Chen Q, and Wang H Attitude of Chinese public towards municipal solid waste sorting policy: A text mining study Science of The Total Environment 2021 756
[149]
Xia L, Luo D, Zhang C, and Wu Z A survey of topic models in text classification Proceedings of the 2nd International Conference on Artificial Intelligence and Big Data (ICAIBD) 2019 IEEE 244-250
[150]
Xin Y and MacEachren AM Characterizing traveling fans: a workflow for event-oriented travel pattern analysis using Twitter data Int J Geograp Info Sci 2020 34 12 2497-2516
[151]
Xu S and Xiong Y Setting socially mediated engagement parameters: A topic modeling and text analytic approach to examining polarized discourses on Gillette’s campaign Public Relations Review 2020 46 5
[152]
Xu S and Zhou A Hashtag homophily in Twitter network: examining a controversial cause-related marketing campaign Comput Human Behav 2020 102 87-96
[153]
Xu Z, Lachlan K, Ellis L, and Rainear AM Understanding public opinion in different disaster stages: A case study of Hurricane Irma Internet Research 2019 30 2 695-709
[154]
Xue J, Chen J, Chen C, Zheng C, Li S, and Zhu T Public discourse and sentiment during the COVID-19 pandemic: Using Latent Dirichlet Allocation for topic modeling on Twitter PloS One 2020 15 9 0239441
[155]
Xue J, Chen J, Hu R, Chen C, Zheng C, Su Y, and Zhu T Twitter discussions and emotions about the COVID-19 pandemic: Machine learning approach Journal of Medical Internet Research 2020 22 11 20550
[156]
Xue J, Chen J, Chen C, Hu R, and Zhu T The hidden pandemic of family violence during COVID-19: Unsupervised learning of tweets Journal of Medical Internet Research 2020 22 11 24361
[157]
Yan X, Guo J, Lan Y, and Cheng X A biterm topic model for short texts Proceedings of the 22nd international conference on the world wide web 2013 ACM 1445-1456
[158]
Yan Y, Chen J, and Wang Z Mining public sentiments and perspectives from geotagged social media data for appraising the post-earthquake recovery of tourism destinations Applied Geography 2020 123
[159]
Yao L, Mimno D, and McCallum A Efficient methods for topic model inference on streaming document collections Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining 2009 ACM 937-946
[160]
Yin J and Wang J A Dirichlet Multinomial Mixture model-based approach for short text clustering ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2014 ACM 233-242
[161]
Yu L, Jiang W, Ren Z, Xu S, Zhang L, and Hu X Detecting changes in attitudes toward depression on Chinese social media: A text analysis Journal of affective disorders 2021 280 354-363
[162]
Zhai W, Peng Z-R, and Yuan F Examine the effects of neighborhood equity on disaster situational awareness: Harness machine learning and geotagged Twitter data International Journal of Disaster Risk Reduction 2020 48
[163]
Zhang H, Wheldon C, Dunn AG, Tao C, Huo J, Zhang R, Prosperi M, Guo Y, and Bian J Mining Twitter to assess the determinants of health behavior toward Human Papillomavirus vaccination in the United States J Am Med Info Assoc 2020 27 2 225-235
[164]
Zhang T, Shen S, Cheng C, Su K, and Zhang X A topic model based framework for identifying the distribution of demand for relief supplies using social media data Int J Geograp Info Sci 2021 2021 1-22
[165]
Zhao H, Du L, Buntine W, and Liu G MetaLDA: a topic model that efficiently incorporates meta information 2017 IEEE International Conference on Data Mining (ICDM) 2017 IEEE 635-644
[166]
Zhao H, Du L, Buntine WL, and Liu G Leveraging external information in topic modelling Knowl Info Syst 2019 61 2 661-693
[167]
Zhao H, Phung D, Jin Y, Du L, and Buntine W Topic modelling meets deep neural networks: a survey Proceedings of the 30th International joint conference on artificial intelligence (IJCAI-21) 2021 IJCAI
[168]
Zhao X, Wang D, Zhao Z, Liu W, Lu C, and Zhuang F A neural topic model with word vectors and entity vectors for short texts Information Processing & Management 2021 58 2
[169]
Zheng P and Shahin S Live tweeting live debates: How Twitter reflects and refracts the US political climate in a campaign season Information, Communication & Society 2020 23 3 337-357
[170]
Zhou H, Yu H, and Hu R Topic evolution based on the probabilistic topic model: A review Frontiers of Computer Science 2017 11 5 786-802
[171]
Zhou Y and Na J-C A comparative analysis of Twitter users who tweeted on psychology and political science journal articles Online Information Review 2019 43 7 1188-1208
[172]
Zhu B, Zheng X, Liu H, Li J, and Wang P Analysis of spatiotemporal characteristics of big data on social media sentiment with COVID-19 epidemic topics Chaos Solitons Fractals 2020 140 110123
[173]
Zou L and Song WW LDA-TM: a two-step approach to Twitter topic data clustering 2016 IEEE International Conference on Cloud Computing and Big Data Analysis (ICCCBDA) 2016 IEEE 342-347
[174]
Zuo Y, Zhao J, and Xu K Word network topic model: a simple but general solution for short and imbalanced texts Knowledge and Information Systems 2016 48 2 379-398

Cited By

  • (2024) A Topic Modeling Approach Towards Understanding the Discourse between Religion and Videogames on Reddit. Proceedings of the ACM on Human-Computer Interaction 8:CHI PLAY (1-44). DOI: 10.1145/3677054. Online publication date: 15-Oct-2024
  • (2024) Hate speech detection in social media. WIREs Computational Statistics 16:2. DOI: 10.1002/wics.1648. Online publication date: 11-Mar-2024

Published In

Artificial Intelligence Review  Volume 56, Issue 12
Dec 2023
1600 pages

Publisher

Kluwer Academic Publishers

United States

Publication History

Published: 01 May 2023
Accepted: 14 March 2023

Author Tags

  1. Topic model
  2. Social media
  3. Short text
  4. Twitter
  5. NLP
  6. LDA

Qualifiers

  • Research-article
