Abstract
Given the dynamic nature of toxic language use, automated methods for detecting toxic spans are likely to encounter distributional shift. To explore this phenomenon, we evaluate three approaches for detecting toxic spans under cross-domain conditions: lexicon-based, rationale extraction, and fine-tuned language models. Our findings indicate that a simple method using off-the-shelf lexicons performs best in the cross-domain setup. The cross-domain error analysis suggests that (1) rationale extraction methods are prone to false negatives, while (2) language models, despite performing best for the in-domain case, recall fewer explicitly toxic words than lexicons and are prone to certain types of false positives. Our code is publicly available at: https://github.com/sfschouten/toxic-cross-domain.
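To make the lexicon-based approach concrete, the following is a minimal sketch of how a word list can be turned into toxic-span predictions. It is not the authors' implementation: the toy lexicon, the function name `lexicon_toxic_spans`, the regex tokenisation, and the choice to output character offsets are all assumptions made for illustration.

```python
import re

# Hypothetical toy lexicon for illustration only; a real setup would load
# an off-the-shelf lexicon resource instead.
TOY_LEXICON = {"idiot", "stupid"}


def lexicon_toxic_spans(text, lexicon):
    """Return the character offsets of every word in `text` found in `lexicon`.

    Matching is case-insensitive and operates on simple word tokens
    (an assumption; other tokenisation schemes are possible).
    """
    offsets = []
    for match in re.finditer(r"\w+", text):
        if match.group().lower() in lexicon:
            # Mark every character of the matched word as toxic.
            offsets.extend(range(match.start(), match.end()))
    return offsets


if __name__ == "__main__":
    print(lexicon_toxic_spans("You are an idiot.", TOY_LEXICON))
```

Because such a matcher relies only on a fixed word list and not on training data from a particular domain, its behaviour does not depend on which dataset it is evaluated on, which is one plausible reading of why it transfers well across domains.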
Acknowledgements
This research was supported by Huawei Finland through the DreamsLab project. All content represented the opinions of the authors, which were not necessarily shared or endorsed by their respective employers and/or sponsors.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Schouten, S.F., Barbarestani, B., Tufa, W., Vossen, P., Markov, I. (2023). Cross-Domain Toxic Spans Detection. In: Métais, E., Meziane, F., Sugumaran, V., Manning, W., Reiff-Marganiec, S. (eds) Natural Language Processing and Information Systems. NLDB 2023. Lecture Notes in Computer Science, vol 13913. Springer, Cham. https://doi.org/10.1007/978-3-031-35320-8_40
DOI: https://doi.org/10.1007/978-3-031-35320-8_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-35319-2
Online ISBN: 978-3-031-35320-8
eBook Packages: Computer Science, Computer Science (R0)