DOI: 10.1145/3543873.3587643
research-article
Public Access

Claim Extraction and Dynamic Stance Detection in COVID-19 Tweets

Published: 30 April 2023

Abstract

Today's information ecosystem is noisy and rife with messages that mix objective claims with subjective remarks or reactions. Any automated system that intends to capture the social, cultural, or political zeitgeist must be able to analyze the claims as well as the remarks. Given the deluge of such messages on social media and their tremendous power to shape our perceptions, there has never been a greater need to automate these analyses, which play a pivotal role in fact-checking, opinion mining, understanding opinion trends, and other downstream tasks of social consequence. In this noisy ecosystem, not all claims are worth checking for veracity. A check-worthy claim, moreover, must be accurately distilled from the subjective remarks surrounding it. Finally, and especially for understanding opinion trends, it is important to understand the stance of the remarks or reactions towards that specific claim. To this end, we introduce a COVID-19 Twitter dataset and present a three-stage process to (i) determine whether a given Tweet is check-worthy, and if so, (ii) identify which portion of the Tweet ought to be checked for veracity, and finally, (iii) determine the author’s stance towards the claim in that Tweet, thus introducing the novel task of topic-agnostic stance detection.
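The control flow of the three-stage process can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the models, so each stage is passed in as a plain function, and the names `analyze_tweet`, `TweetAnalysis`, the `toy_*` stand-ins, and the stance labels `"support"` and `"against"` are all hypothetical. The toy heuristics exist only to make the example runnable.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class TweetAnalysis:
    text: str
    check_worthy: bool
    claim: Optional[str] = None   # portion of the Tweet to check for veracity
    stance: Optional[str] = None  # author's stance towards that claim


def analyze_tweet(
    text: str,
    is_check_worthy: Callable[[str], bool],
    extract_claim: Callable[[str], str],
    detect_stance: Callable[[str, str], str],
) -> TweetAnalysis:
    """Chain the three stages; later stages run only for check-worthy Tweets."""
    # Stage (i): decide whether the Tweet is check-worthy at all.
    if not is_check_worthy(text):
        return TweetAnalysis(text=text, check_worthy=False)
    # Stage (ii): isolate the portion of the Tweet that should be verified.
    claim = extract_claim(text)
    # Stage (iii): stance of the author towards the extracted claim.
    stance = detect_stance(text, claim)
    return TweetAnalysis(text=text, check_worthy=True, claim=claim, stance=stance)


# Hypothetical stand-ins for trained models (assumptions, for illustration only).
def toy_is_check_worthy(text: str) -> bool:
    # Assumption: treat any Tweet containing a number as check-worthy.
    return any(ch.isdigit() for ch in text)


def toy_extract_claim(text: str) -> str:
    # Assumption: take the first sentence as the claim.
    return text.split(".")[0].strip()


def toy_detect_stance(text: str, claim: str) -> str:
    # Assumption: a crude keyword cue; a real system would condition on the claim.
    return "against" if "fake" in text.lower() else "support"


if __name__ == "__main__":
    result = analyze_tweet(
        "Masks cut transmission by 50%. Yeah right, fake news!",
        toy_is_check_worthy,
        toy_extract_claim,
        toy_detect_stance,
    )
    print(result)
```

Passing the stages as functions keeps the sketch independent of any particular classifier or sequence labeler, so each stand-in could be swapped for a trained model without changing the pipeline.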

Published In

WWW '23 Companion: Companion Proceedings of the ACM Web Conference 2023
April 2023
1567 pages
ISBN:9781450394192
DOI:10.1145/3543873
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. COVID-19
  2. Claim Extraction
  3. Stance Detection

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

WWW '23: The ACM Web Conference 2023
April 30 - May 4, 2023
Austin, TX, USA

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
