Abstract
Offensive content, such as verbal attacks, demeaning comments, or hate speech, has become widespread on social media. Detecting it automatically is an important and challenging task. Although many approaches have been proposed for high-resource languages, the detection of offensive content in Dialectal Arabic (DA) remains under-explored. The task has recently attracted growing interest among Natural Language Processing (NLP) researchers, yet only a limited number of annotated datasets have been introduced, covering single or multiple coarse-grained dialects. In this paper, we introduce the Offensive Moroccan Comments Dataset (OMCD), the first dataset for offensive language detection in the Moroccan dialect. First, we present the data collection steps, the statistical analysis, and the annotation guidelines of the dataset. Then, we evaluate several state-of-the-art Machine Learning (ML) and Deep Learning (DL) models on OMCD. Finally, we highlight the impact of emojis on the evaluated models for offensive language detection.
Notes
The dataset is publicly available at: https://github.com/kabilessefar/OMCD-Offensive-Moroccan-Comments-Dataset.
About this article
Cite this article
Essefar, K., Ait Baha, H., El Mahdaouy, A. et al. OMCD: Offensive Moroccan Comments Dataset. Lang Resources & Evaluation 57, 1745–1765 (2023). https://doi.org/10.1007/s10579-023-09663-2
DOI: https://doi.org/10.1007/s10579-023-09663-2