When Biased Humans Meet Debiased AI: A Case Study in College Major Recommendation

Published: 11 September 2023

Abstract

There is currently a surge of interest in fair Artificial Intelligence (AI) and Machine Learning (ML) research that aims to mitigate discriminatory bias in AI algorithms, e.g., along the lines of gender, age, and race. While most research in this domain focuses on developing fair AI algorithms, in this work we examine the challenges that arise when humans and fair AI interact. Our results show that, due to an apparent conflict between human preferences and fairness, a fair AI algorithm on its own may be insufficient to achieve its intended results in the real world. Using college major recommendation as a case study, we build a fair AI recommender by employing gender-debiasing machine learning techniques. Our offline evaluation showed that the debiased recommender makes fairer career recommendations without sacrificing predictive accuracy. Nevertheless, an online user study of more than 200 college students revealed that participants on average prefer the original biased system over the debiased system. Specifically, we found that perceived gender disparity is a determining factor in the acceptance of a recommendation. In other words, we cannot fully address the gender bias issue in AI recommendations without also addressing gender bias in humans. We conducted a follow-up survey to gain additional insights into the effectiveness of various design options that can help participants overcome their own biases. Our results suggest that making fair AI explainable is crucial for increasing its adoption in the real world.
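
The abstract mentions "gender-debiasing machine learning techniques" without specifying them. As one concrete illustration of the general idea, below is a minimal sketch, assuming a matrix-factorization recommender trained with an extra penalty that shrinks the per-major gap in mean predicted score between two gender groups (a demographic-parity-style term). The synthetic data, the penalty form, and all identifiers (R, g, lam_fair, ...) are illustrative assumptions, not the paper's actual method.

```python
# Minimal sketch of fairness-penalized recommendation (NOT the paper's
# exact method): matrix factorization on a synthetic student-by-major
# interaction matrix, plus a penalty on the per-major gap in mean
# predicted score between two gender groups. All names and numbers here
# are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_students, n_majors, k = 200, 20, 8

# Synthetic 0/1 interactions (1 = student chose/liked the major) and a
# binary protected attribute per student (e.g., self-reported gender).
R = (rng.random((n_students, n_majors)) < 0.1).astype(float)
g = rng.integers(0, 2, size=n_students)

U = 0.1 * rng.standard_normal((n_students, k))  # student factors
V = 0.1 * rng.standard_normal((n_majors, k))    # major factors
lr, reg, lam_fair = 0.02, 0.01, 1.0

# Signed weights so that w @ scores equals, for each major, the mean
# predicted score of group 1 minus that of group 0.
w = np.where(g == 1, 1.0 / (g == 1).sum(), -1.0 / (g == 0).sum())

for epoch in range(201):
    scores = U @ V.T
    err = scores - R                  # gradient of the squared-error term
    gap = w @ scores                  # per-major group gap, shape (n_majors,)
    # Gradient of lam_fair * sum(gap**2) w.r.t. scores is
    # 2 * lam_fair * outer(w, gap); add it to the reconstruction gradient.
    dS = err + 2.0 * lam_fair * np.outer(w, gap)
    U -= lr * (dS @ V + reg * U)
    V -= lr * (dS.T @ U + reg * V)
    if epoch % 50 == 0:
        print(f"epoch {epoch:3d}  mse={np.mean(err**2):.4f}  "
              f"mean|gap|={np.mean(np.abs(gap)):.4f}")
```

In sketches of this kind, lam_fair controls the accuracy-fairness trade-off; the paper's offline finding that debiasing need not sacrifice accuracy corresponds to settings where the penalty can be driven down with little increase in reconstruction error.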

      Published In

ACM Transactions on Interactive Intelligent Systems, Volume 13, Issue 3
September 2023, 263 pages
ISSN: 2160-6455
EISSN: 2160-6463
DOI: 10.1145/3623489

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 11 September 2023
Online AM (Accepted Manuscript): 01 August 2023
      Accepted: 17 July 2023
      Revised: 11 July 2023
      Received: 20 June 2022
      Published in TIIS Volume 13, Issue 3

      Author Tags

      1. AI
      2. machine learning
      3. fairness
      4. gender bias
      5. career recommendation

      Qualifiers

      • Research-article

      Funding Sources

      • U.S. Department of Commerce, National Institute of Standards and Technology
      • National Science Foundation
