DOI: 10.1145/3461702.3462557
research-article
Open access

Measuring Model Biases in the Absence of Ground Truth

Published: 30 July 2021

Abstract

The measurement of bias in machine learning often focuses on model performance across identity subgroups (such as "man" and "woman") with respect to ground-truth labels. However, these methods do not directly measure the associations that a model may have learned, for example between labels and identity subgroups. Further, measuring a model's bias in this way requires a fully annotated evaluation dataset, which may not be readily available in practice.
We present an elegant mathematical solution that tackles both issues simultaneously, using image classification as a working example. By treating a classification model's predictions for a given image as a set of labels analogous to a "bag of words", we rank the biases that a model has learned with respect to different identity labels. We use "man" and "woman" as a concrete example of an identity label set (although this set need not be binary), and present rankings of the labels that are most biased towards one identity or the other. We demonstrate how the statistical properties of different association metrics can lead to different rankings of the most "gender biased" labels, and conclude that normalized pointwise mutual information (nPMI) is the most useful in practice. Finally, we announce an open-source nPMI visualization tool built on TensorBoard.
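
To make the "bag of words" idea concrete, here is a minimal Python sketch, not the authors' implementation. It assumes each image's predictions are available as a set of label strings, estimates probabilities from co-occurrence counts, and uses the standard definition nPMI(x, y) = ln(p(x, y) / (p(x) p(y))) / (-ln p(x, y)). Ranking labels by the difference between their nPMI with two identity labels is one plausible reading of the approach described above; the function names, the gap-based ranking, and the toy data are illustrative assumptions.

```python
import math
from collections import Counter


def npmi(p_xy, p_x, p_y):
    """Normalized PMI in [-1, 1], computed from probability estimates."""
    if p_xy == 0.0:
        return -1.0  # conventional limit: the two labels never co-occur
    if p_xy == 1.0:
        return 0.0   # degenerate case: both labels appear on every image
    return math.log(p_xy / (p_x * p_y)) / -math.log(p_xy)


def rank_by_npmi_gap(predictions, identity_a, identity_b):
    """Rank labels by nPMI(label, identity_a) - nPMI(label, identity_b).

    `predictions` is a list of label collections, one per image.
    No ground-truth annotations are used, only the model's own outputs.
    """
    n = len(predictions)
    label_count = Counter()  # number of images on which each label appears
    pair_count = Counter()   # number of images with both (label, identity)
    for labels in predictions:
        labels = set(labels)
        label_count.update(labels)
        for identity in (identity_a, identity_b):
            if identity in labels:
                pair_count.update((y, identity) for y in labels if y != identity)

    def score(y, identity):
        return npmi(pair_count[(y, identity)] / n,
                    label_count[y] / n,
                    label_count[identity] / n)

    gaps = {
        y: score(y, identity_a) - score(y, identity_b)
        for y in label_count
        if y not in (identity_a, identity_b)
    }
    # Most associated with identity_a first, most associated with identity_b last.
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


# Hypothetical predictions for six images (made up for illustration only).
predictions = [
    {"man", "suit"},
    {"man", "suit"},
    {"woman", "dress"},
    {"woman", "suit"},
    {"man", "bicycle"},
    {"woman", "bicycle"},
]
ranking = rank_by_npmi_gap(predictions, "man", "woman")
for label, gap in ranking:
    print(f"{label}: {gap:+.3f}")
```

In this sketch a positive gap means the label co-occurs more strongly with the first identity label, a negative gap with the second, and a gap near zero indicates no measured skew. With small counts the estimates are noisy, so a real analysis would likely need a minimum-frequency threshold.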

Supplementary Material

ZIP File (aiespp034aux.zip)

Published In

AIES '21: Proceedings of the 2021 AAAI/ACM Conference on AI, Ethics, and Society
July 2021
1077 pages
ISBN: 9781450384735
DOI: 10.1145/3461702
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. bias
  2. datasets
  3. fairness
  4. image tagging
  5. information extraction
  6. model analysis
  7. stereotypes

Conference

AIES '21

Acceptance Rates

Overall Acceptance Rate 61 of 162 submissions, 38%

Article Metrics

  • Downloads (last 12 months): 567
  • Downloads (last 6 weeks): 84
Reflects downloads up to 23 Sep 2024.

Cited By

  • (2023) Algorithmic Fairness. Annual Review of Financial Economics 15:1 (565-593). DOI: 10.1146/annurev-financial-110921-125930. Online publication date: 1-Nov-2023
  • (2023) Algorithmic Censoring in Dynamic Learning Systems. Proceedings of the 3rd ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization (1-20). DOI: 10.1145/3617694.3623247. Online publication date: 30-Oct-2023
  • (2023) Handling Bias in Toxic Speech Detection: A Survey. ACM Computing Surveys 55:13s (1-32). DOI: 10.1145/3580494. Online publication date: 13-Jul-2023
  • (2023) Fake it Till You Make it: Learning Transferable Representations from Synthetic ImageNet Clones. 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (8011-8021). DOI: 10.1109/CVPR52729.2023.00774. Online publication date: Jun-2023
  • (2023) Measurement and Mitigation of Bias in Artificial Intelligence: A Narrative Literature Review for Regulatory Science. Clinical Pharmacology & Therapeutics 115:4 (687-697). DOI: 10.1002/cpt.3117. Online publication date: 12-Dec-2023
  • (2022) Scaling Vision Transformers. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (1204-1213). DOI: 10.1109/CVPR52688.2022.01179. Online publication date: Jun-2022
  • (2022) Algorithmic fairness datasets: the story so far. Data Mining and Knowledge Discovery 36:6 (2074-2152). DOI: 10.1007/s10618-022-00854-z. Online publication date: 1-Nov-2022
