
Towards a Comprehensive Understanding and Accurate Evaluation of Societal Biases in Pre-Trained Transformers

Andrew Silva, Pradyumna Tambwekar, Matthew Gombolay


Abstract
The ease of access to pre-trained transformers has enabled developers to leverage large-scale language models to build exciting applications for their users. While such pre-trained models offer convenient starting points for researchers and developers, there is little consideration for the societal biases captured within these models, risking the perpetuation of racial, gender, and other harmful biases when these models are deployed at scale. In this paper, we investigate gender and racial bias across ubiquitous pre-trained language models, including GPT-2, XLNet, BERT, RoBERTa, ALBERT, and DistilBERT. We evaluate bias within pre-trained transformers using three metrics: WEAT, sequence likelihood, and pronoun ranking. We conclude with an experiment demonstrating the ineffectiveness of word-embedding techniques, such as WEAT, signaling the need for more robust bias testing in transformers.
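For readers unfamiliar with the first metric, the following is a minimal, illustrative sketch of the WEAT effect size as defined by Caliskan et al. (2017). It assumes embedding vectors for the target and attribute words have already been extracted; how those vectors are obtained from a given transformer is not shown here and is an assumption, not the paper's exact procedure.

```python
# Illustrative WEAT effect-size computation over pre-extracted word vectors.
# X, Y: lists of embedding vectors for two target sets (e.g., male vs. female terms).
# A, B: lists of embedding vectors for two attribute sets (e.g., career vs. family terms).
import numpy as np

def cosine(u, v):
    # Cosine similarity between two 1-D vectors.
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def association(w, A, B):
    # s(w, A, B): mean similarity of w to attribute set A minus its mean similarity to B.
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    # d = (mean_x s(x, A, B) - mean_y s(y, A, B)) / std_{w in X ∪ Y} s(w, A, B)
    s_X = [association(x, A, B) for x in X]
    s_Y = [association(y, A, B) for y in Y]
    return (np.mean(s_X) - np.mean(s_Y)) / np.std(s_X + s_Y, ddof=1)
```

A larger positive effect size indicates a stronger association of the first target set with the first attribute set, which is the quantity the paper's word-embedding-based bias tests report.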
Anthology ID:
2021.naacl-main.189
Volume:
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Month:
June
Year:
2021
Address:
Online
Editors:
Kristina Toutanova, Anna Rumshisky, Luke Zettlemoyer, Dilek Hakkani-Tur, Iz Beltagy, Steven Bethard, Ryan Cotterell, Tanmoy Chakraborty, Yichao Zhou
Venue:
NAACL
Publisher:
Association for Computational Linguistics
Pages:
2383–2389
URL:
https://aclanthology.org/2021.naacl-main.189
DOI:
10.18653/v1/2021.naacl-main.189
Cite (ACL):
Andrew Silva, Pradyumna Tambwekar, and Matthew Gombolay. 2021. Towards a Comprehensive Understanding and Accurate Evaluation of Societal Biases in Pre-Trained Transformers. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 2383–2389, Online. Association for Computational Linguistics.
Cite (Informal):
Towards a Comprehensive Understanding and Accurate Evaluation of Societal Biases in Pre-Trained Transformers (Silva et al., NAACL 2021)
PDF:
https://aclanthology.org/2021.naacl-main.189.pdf
Video:
https://aclanthology.org/2021.naacl-main.189.mp4
Data
SWAG