Misinformation Detection Algorithms and Fairness across Political Ideologies: The Impact of Article Level Labeling

Published: 30 April 2023

Abstract

Multiple recent efforts have used large-scale data and computational models to automatically detect misinformation in online news articles. Given the potential impact of misinformation on democracy, many of these efforts have also used the political ideology of these articles to better model misinformation and to study political bias in such algorithms. However, almost all such efforts have used source-level labels for credibility and political alignment, thereby assigning the same credibility and political alignment label to all articles from the same source (e.g., the New York Times or Breitbart). Here, we report on the impact of using journalistic best practices to label individual news articles for their credibility and political alignment. We found that while source-level labels are decent proxies for political alignment labeling, they are very poor proxies, almost the same as flipping a coin, for credibility ratings. Next, we study the implications of such source-level labeling on downstream processes such as the development of automated misinformation detection algorithms and political fairness audits therein. We find that automated misinformation detection and fairness algorithms can be suitably revised to support their intended goals, but they might require different assumptions and methods than those appropriate under source-level labeling. These results suggest caution in generalizing recent findings on misinformation detection and the political bias therein. On a positive note, this work shares a new dataset of journalistic-quality, individually labeled articles and an approach for misinformation detection and fairness audits.
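The two measurements at the heart of the abstract, how well source-level labels proxy article-level labels and how evenly a detector errs across political alignments, can be illustrated with a short sketch. The data and function names below are illustrative assumptions, not the paper's dataset or method:

```python
# Sketch of the two audit quantities described in the abstract.
# All labels here are toy values; the paper's own dataset and
# measures (e.g., its specific fairness criterion) may differ.

def proxy_agreement(source_labels, article_labels):
    """Fraction of articles whose source-level credibility label
    matches the article-level (journalist-assigned) label."""
    matches = sum(s == a for s, a in zip(source_labels, article_labels))
    return matches / len(article_labels)

def fpr_gap(y_true, y_pred, group):
    """Absolute difference in false-positive rates between the
    'left' and 'right' groups: one common group-fairness measure
    for a misinformation detector (1 = flagged as misinformation)."""
    def fpr(g):
        negatives = [p for t, p, gr in zip(y_true, y_pred, group)
                     if gr == g and t == 0]
        return sum(negatives) / len(negatives)
    return abs(fpr("left") - fpr("right"))

# Toy data: credibility inherited from the outlet vs. judged per article.
src = [1, 1, 0, 0, 1, 0]
art = [1, 0, 0, 1, 1, 1]
print(proxy_agreement(src, art))  # 0.5, i.e. "coin flip" territory
```

A source-level proxy that agrees with per-article judgments only half the time carries no usable signal for credibility, which is exactly the abstract's "flipping a coin" observation; the `fpr_gap` helper shows one way a downstream fairness audit could quantify uneven errors across ideologies.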



      Published In

      WebSci '23: Proceedings of the 15th ACM Web Science Conference 2023
      April 2023
      373 pages
ISBN: 9798400700897
DOI: 10.1145/3578503

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. algorithmic fairness
      2. article level labeling
      3. misinformation detection
      4. political bias

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

WebSci '23: 15th ACM Web Science Conference 2023
April 30 - May 1, 2023
Austin, TX, USA

      Acceptance Rates

      Overall Acceptance Rate 245 of 933 submissions, 26%


Cited By

• (2024) Leveraging Large Language Models (LLMs) to Support Collaborative Human-AI Online Risk Data Annotation. SSRN Electronic Journal. DOI: 10.2139/ssrn.4774771. Online publication date: 2024.
• (2024) YouTube and Conspiracy Theories: A Longitudinal Audit of Information Panels. Proceedings of the 35th ACM Conference on Hypertext and Social Media, 273-284. DOI: 10.1145/3648188.3675128. Online publication date: 10-Sep-2024.
• (2024) Personally Targeted Risk vs. Humor: How Online Risk Perceptions of Youth vs. Third-Party Annotators Differ based on Privately Shared Media on Instagram. Proceedings of the 23rd Annual ACM Interaction Design and Children Conference, 1-13. DOI: 10.1145/3628516.3655799. Online publication date: 17-Jun-2024.
• (2024) Systemization of Knowledge (SoK): Creating a Research Agenda for Human-Centered Real-Time Risk Detection on Social Media Platforms. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-21. DOI: 10.1145/3613904.3642315. Online publication date: 11-May-2024.
• (2024) Leveraging Prompt-Based Large Language Models: Predicting Pandemic Health Decisions and Outcomes Through Social Media Language. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1-20. DOI: 10.1145/3613904.3642117. Online publication date: 11-May-2024.
• (2023) AI Fairness in Data Management and Analytics: A Review on Challenges, Methodologies and Applications. Applied Sciences 13(18), 10258. DOI: 10.3390/app131810258. Online publication date: 13-Sep-2023.