Volume 21, Issue 1 (6-2024) | JSDP 2024, 21(1): 125-142



Nourollahi S F, Baradaran R, Amirkhani H. Domain adaptation-based method for improving generalization of hate speech detection models. JSDP 2024; 21 (1) : 10
URL: http://jsdp.rcisp.ac.ir/article-1-1341-en.html
Qom University
Abstract:
With the growth of activity on social media, hate speech online is increasing, which makes recognizing hate in cyberspace an important problem. Domain adaptation is one of the main challenges in this task and in natural language processing in general: in many problems, performance drops when the domain changes, and hate speech detection is no exception. In this research, we try to increase the generalizability of hate speech detection models by using domain adaptation methods. For this purpose, we use Transformer-based methods, including domain adversarial training and mixture of experts, together with multi-source training. Experiments are conducted on four hate speech datasets. First, we evaluate the models in an in-domain, single-source setting. Next, adding other domains to the training set lowers the results, which indicates negative transfer. We then run the out-of-domain tests, first in a single-source setting with the DistilBERT model, where changing the domain significantly reduces the results. To strengthen the model's domain adaptation in the out-of-domain setting, we train on several sources, which improves the results in about half of the cases, although the improvement is not significant. Finally, we try to increase the domain adaptation power of the models with Transformer-based methods, including domain adversarial training and mixture of experts, which increases performance in 87% of the multi-source out-of-domain tests. These methods are also effective in the in-domain tests. An important factor that sometimes causes a significant drop in results is the datasets themselves: similarity between the data and between the distributions of some domains increases the model's domain adaptation power, and dissimilarity reduces it.
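The four evaluation settings named in the abstract can be laid out as a small enumeration. This is only a sketch under stated assumptions: the abstract does not name the four datasets, so the domain labels `A` to `D` are hypothetical placeholders, and the function `evaluation_setups` is illustrative rather than the authors' code. It also assumes that each multi-source setting uses all available domains, which the abstract does not specify.

```python
# Hypothetical placeholder names: the abstract reports four datasets
# but does not name them.
DOMAINS = ["A", "B", "C", "D"]


def evaluation_setups(domains):
    """Return (setting, train_domains, test_domain) triples.

    Assumption: multi-source settings train on every available domain;
    the abstract does not say which source combinations were used.
    """
    setups = []
    for test in domains:
        others = tuple(d for d in domains if d != test)
        # In-domain, single source: train and test on the same domain.
        setups.append(("in-domain/single-source", (test,), test))
        # In-domain, multi-source: the other domains are added to training.
        setups.append(("in-domain/multi-source", tuple(domains), test))
        # Out-of-domain, single source: train on one other domain.
        for source in others:
            setups.append(("out-of-domain/single-source", (source,), test))
        # Out-of-domain, multi-source: train on all other domains,
        # holding the test domain out entirely.
        setups.append(("out-of-domain/multi-source", others, test))
    return setups


if __name__ == "__main__":
    for setting, train, test in evaluation_setups(DOMAINS):
        print(f"{setting:30s} train={','.join(train):8s} test={test}")
```

Under these assumptions, each test domain yields six configurations. The out-of-domain settings differ from the in-domain ones only in that the test domain never appears among the training domains.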
Article number: 10
Type of Study: Applicable | Subject: Paper
Received: 2022/09/24 | Accepted: 2024/02/25 | Published: 2024/08/3 | ePublished: 2024/08/3


Rights and permissions
Creative Commons License This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
