Hate Speech Detection is Not as Easy as You May Think: A Closer Look at Model Validation

Authors:

Barbara Poblete

SIGIR'19: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval

Pages 45 - 54

https://doi.org/10.1145/3331184.3331262

Published: 18 July 2019

Abstract

Hate speech is an important problem that seriously affects the dynamics and usefulness of online social communities. Large-scale social platforms currently invest significant resources in automatically detecting and classifying hateful content, without much success. Meanwhile, the results reported for state-of-the-art systems indicate that supervised approaches achieve almost perfect performance, but only within specific datasets. In this work, we analyze this apparent contradiction between the existing literature and actual applications. We closely study the experimental methodology used in prior work and its generalizability to other datasets. Our findings reveal methodological issues as well as substantial dataset bias. As a consequence, performance claims for the current state of the art are significantly overestimated. The problems we found relate mostly to data overfitting and sampling issues. We discuss the implications for current research and re-run the experiments to give a more accurate picture of current state-of-the-art methods.
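The abstract attributes the overestimated scores to data overfitting and sampling issues without giving details. As a rough illustration of one way a sampling step can contaminate an evaluation, the sketch below oversamples a minority class before splitting and counts how many test items also appear in the training set. This scenario is an assumption chosen for illustration, not a description of the experiments in the paper; the toy data, the 10% class ratio, and the helper names `split`, `oversample`, and `leaked` are all hypothetical.

```python
import random

def split(items, test_fraction, rng):
    """Shuffle a copy of items and split it into (train, test)."""
    shuffled = items[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def oversample(items, label, factor):
    """Duplicate every item carrying `label` so it appears `factor` times."""
    extra = [it for it in items if it[1] == label] * (factor - 1)
    return items + extra

def leaked(train, test):
    """Count test items whose exact (id, label) pair also appears in train."""
    seen = set(train)
    return sum(1 for it in test if it in seen)

rng = random.Random(0)

# Hypothetical toy data: (message id, label), 10% "hate" and 90% "none".
data = [(i, "hate" if i < 100 else "none") for i in range(1000)]

# Flawed order: oversample first, then split. Copies of the same
# message can land on both sides of the split.
train_a, test_a = split(oversample(data, "hate", 3), 0.2, rng)

# Safer order: split first, then oversample only the training part.
train_b, test_b = split(data, 0.2, rng)
train_b = oversample(train_b, "hate", 3)

leak_a = leaked(train_a, test_a)
leak_b = leaked(train_b, test_b)
print("test items also seen in training (oversample, then split):", leak_a)
print("test items also seen in training (split, then oversample):", leak_b)
```

In the first ordering a classifier can score well on the test set simply by memorizing duplicated messages, which is one way an in-dataset result could look far better than performance on unseen data.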

Index Terms

Hate Speech Detection is Not as Easy as You May Think: A Closer Look at Model Validation
1. Computing methodologies
  1. Machine learning
    1. Cross-validation
    2. Machine learning approaches
2. Information systems
  1. World Wide Web
    1. Web searching and information discovery
      1. Social tagging

Information & Contributors

Information

Published In

SIGIR'19: Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval

July 2019

1512 pages

ISBN: 9781450361729

DOI: 10.1145/3331184

General Chairs:
Benjamin Piwowarski (CNRS - Sorbonne Universite, France)
Max Chevalier (Universite de Toulouse, CNRS, France)
Eric Gaussier (Universite Grenoble Alpes, CNRS, France)

Program Chairs:
Yoelle Maarek (Amazon Research, Israel)
Jian-Yun Nie (University of Montreal, Canada)
Falk Scholer (RMIT University, Australia)

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Fondecyt
Millennium Science Initiative of the Ministry of Economy Development and Tourism of Chile

Conference

SIGIR '19

Sponsor:

SIGIR

SIGIR '19: The 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval

July 21 - 25, 2019

Paris, France

Acceptance Rates

SIGIR'19 Paper Acceptance Rate 84 of 426 submissions, 20%;

Overall Acceptance Rate 792 of 3,983 submissions, 20%
