
Crossmod: A Cross-Community Learning-based System to Assist Reddit Moderators

Published: 07 November 2019

Abstract

In this paper, we introduce a novel sociotechnical moderation system for Reddit called Crossmod. Through formative interviews with 11 active moderators from 10 different subreddits, we learned about the limitations of currently available automated tools, and how a new system could extend their capabilities. Developed out of these interviews, Crossmod makes its decisions based on cross-community learning, an approach that leverages a large corpus of previous moderator decisions via an ensemble of classifiers. Finally, we deployed Crossmod in a controlled environment, simulating real-time conversations from two large subreddits with over 10M subscribers each. To evaluate Crossmod's moderation recommendations, 4 moderators reviewed comments scored by Crossmod that had been drawn randomly from existing threads. Crossmod achieved an overall accuracy of 86% when detecting comments that would be removed by moderators, with high recall (over 87.5%). Additionally, moderators reported that they would have removed 95.3% of the comments flagged by Crossmod; however, 98.3% of these comments were still online at the time of this writing (i.e., not removed by the current moderation system). To the best of our knowledge, Crossmod is the first open source, AI-backed sociotechnical moderation system to be designed using participatory methods.
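The cross-community learning idea described above can be sketched in a few lines: each classifier stands in for one source community's removal norms, and a comment is flagged when enough of them agree it would be removed. This is a minimal illustrative sketch, not Crossmod's actual implementation; the function names, the keyword-rule stand-ins, and the 0.75 threshold are all hypothetical (the real system uses classifiers trained on large corpora of prior moderator decisions).

```python
from typing import Callable, List

def agreement_score(comment: str,
                    classifiers: List[Callable[[str], bool]]) -> float:
    """Fraction of per-community classifiers voting to remove the comment."""
    votes = [clf(comment) for clf in classifiers]
    return sum(votes) / len(votes)

def flag_for_review(comment: str,
                    classifiers: List[Callable[[str], bool]],
                    threshold: float = 0.75) -> bool:
    """Flag the comment for moderator review if cross-community
    agreement meets the (hypothetical) threshold."""
    return agreement_score(comment, classifiers) >= threshold

# Trivial keyword rules stand in for models trained on each source
# community's removal history.
ensemble = [
    lambda c: "idiot" in c.lower(),
    lambda c: "stupid" in c.lower(),
    lambda c: any(w in c.lower() for w in ("idiot", "stupid")),
    lambda c: False,  # a community whose norms would not remove this
]

print(flag_for_review("You are an idiot and stupid.", ensemble))   # True
print(flag_for_review("Thanks for the helpful answer!", ensemble)) # False
```

The key design point the sketch preserves is that no single community's model decides alone: flagging requires agreement across communities, which is how the paper frames "cross-community learning."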




    Information

    Published In

    Proceedings of the ACM on Human-Computer Interaction, Volume 3, Issue CSCW
    November 2019
    5026 pages
    EISSN: 2573-0142
    DOI: 10.1145/3371885
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. ai
    2. community norms
    3. machine learning
    4. mixed initiative
    5. moderation
    6. online communities
    7. online governance
    8. open source
    9. participatory design
    10. sociotechnical systems

    Qualifiers

    • Research-article

