DOI:10.1145/3331184.3331262

Hate Speech Detection is Not as Easy as You May Think: A Closer Look at Model Validation

Published: 18 July 2019

Abstract

Hate speech is an important problem that seriously affects the dynamics and usefulness of online social communities. Large-scale social platforms are currently investing significant resources into automatically detecting and classifying hateful content, without much success. On the other hand, the results reported by state-of-the-art systems indicate that supervised approaches achieve almost perfect performance, but only within specific datasets. In this work, we analyze this apparent contradiction between the existing literature and actual applications. We closely study the experimental methodology used in prior work and its generalizability to other datasets. Our findings reveal methodological issues, as well as an important dataset bias. As a consequence, the performance claims of the current state of the art have been significantly overestimated. The problems we found are mostly related to data overfitting and sampling issues. We discuss the implications for current research and re-run the experiments to give a more accurate picture of the current state-of-the-art methods.

Supplementary Material

MP4 File (cite3-11h40-d1.mp4)

Published In

SIGIR'19: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2019
1512 pages
ISBN:9781450361729
DOI:10.1145/3331184
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. deep learning
  2. experimental evaluation
  3. hate speech classification
  4. social media

Qualifiers

  • Research-article

Funding Sources

  • Fondecyt
  • Millennium Science Initiative of the Ministry of Economy, Development and Tourism of Chile

Conference

SIGIR '19

Acceptance Rates

SIGIR'19 Paper Acceptance Rate 84 of 426 submissions, 20%;
Overall Acceptance Rate 792 of 3,983 submissions, 20%
