Investigating the effect of combining GRU neural networks with handcrafted features for religious hatred detection on Arabic Twitter space

Abstract

Religious hatred is a serious problem on Arabic Twitter and has the potential to ignite terrorism and hate crimes beyond cyberspace. To the best of our knowledge, this is the first research effort investigating the problem of recognizing Arabic tweets that use inflammatory and dehumanizing language to promote hatred and violence against people on the basis of religious beliefs. In this work, we create the first public Arabic dataset of tweets annotated for religious hate speech detection. We also create three public Arabic lexicons of religion-related terms along with their hate scores. We then present a thorough analysis of the labeled dataset, reporting the most targeted religious groups and the countries of origin of hateful and non-hateful tweets. The labeled dataset is then used to train seven classification models using lexicon-based, n-gram-based, and deep-learning-based approaches. These models are evaluated on a new, unseen dataset to assess the generalization ability of the developed classifiers. While using Gated Recurrent Units (GRU) with pre-trained word embeddings provides the best precision (0.76) and $F_1$ score (0.77), training the same neural network on additional temporal, user, and content features provides state-of-the-art performance in terms of recall (0.84).
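
The abstract compares classifiers by precision, recall, and $F_1$ score. As a minimal sketch of how these three metrics relate for a binary hateful/non-hateful task, the following computes them from gold and predicted labels. The function name `precision_recall_f1` and the toy label lists are illustrative assumptions, not the authors' evaluation code or data.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall and F1 for the `positive` class.

    Hypothetical helper for illustration; labels are plain Python values.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Toy labels (made up for this example): 1 = hateful, 0 = not hateful.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 1, 0, 1, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Because $F_1$ is a harmonic mean, a model can raise recall (as the feature-augmented network does here) without necessarily improving $F_1$ if precision drops at the same time.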

Author information

Authors and Affiliations

Taibah University, Medina, Saudi Arabia
Nuha Albadi
Taif University, Taif, Saudi Arabia
Maram Kurdi
University of Colorado, Boulder, USA
Nuha Albadi, Maram Kurdi & Shivakant Mishra

Corresponding author

Correspondence to Nuha Albadi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Albadi, N., Kurdi, M. & Mishra, S. Investigating the effect of combining GRU neural networks with handcrafted features for religious hatred detection on Arabic Twitter space. Soc. Netw. Anal. Min. 9, 41 (2019). https://doi.org/10.1007/s13278-019-0587-5

Received: 12 December 2018
Revised: 20 July 2019
Accepted: 26 July 2019
Published: 05 August 2019
DOI: https://doi.org/10.1007/s13278-019-0587-5
