When Biased Humans Meet Debiased AI: A Case Study in College Major Recommendation

Published: 11 September 2023

Abstract

There is currently a surge of interest in fair Artificial Intelligence (AI) and Machine Learning (ML) research that aims to mitigate discriminatory bias in AI algorithms, e.g., along the lines of gender, age, and race. While most research in this domain focuses on developing fair AI algorithms, in this work we examine the challenges that arise when humans and fair AI interact. Our results show that, due to an apparent conflict between human preferences and fairness, a fair AI algorithm on its own may be insufficient to achieve its intended results in the real world. Using college major recommendation as a case study, we build a fair AI recommender by employing gender-debiasing machine learning techniques. Our offline evaluation showed that the debiased recommender makes fairer career recommendations without sacrificing predictive accuracy. Nevertheless, an online user study of more than 200 college students revealed that participants on average prefer the original biased system over the debiased system. Specifically, we found that perceived gender disparity is a determining factor in the acceptance of a recommendation. In other words, we cannot fully address the gender bias issue in AI recommendations without also addressing gender bias in humans. We conducted a follow-up survey to gain additional insights into the effectiveness of various design options that can help participants overcome their own biases. Our results suggest that making fair AI explainable is crucial for increasing its adoption in the real world.
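
The abstract mentions "gender-debiasing machine learning techniques" without specifying them. As one concrete illustration of the general idea, below is a minimal sketch, assuming a matrix-factorization recommender trained with an extra penalty that shrinks the per-major gap in mean predicted score between two gender groups (a demographic-parity-style term). The synthetic data, the penalty form, and all identifiers (R, g, lam_fair, ...) are illustrative assumptions, not the paper's actual method.

```python
# Minimal sketch of fairness-penalized recommendation (NOT the paper's
# exact method): matrix factorization on a synthetic student-by-major
# interaction matrix, plus a penalty on the per-major gap in mean
# predicted score between two gender groups. All names and numbers here
# are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_students, n_majors, k = 200, 20, 8

# Synthetic 0/1 interactions (1 = student chose/liked the major) and a
# binary protected attribute per student (e.g., self-reported gender).
R = (rng.random((n_students, n_majors)) < 0.1).astype(float)
g = rng.integers(0, 2, size=n_students)

U = 0.1 * rng.standard_normal((n_students, k))  # student factors
V = 0.1 * rng.standard_normal((n_majors, k))    # major factors
lr, reg, lam_fair = 0.02, 0.01, 1.0

# Signed weights so that w @ scores equals, for each major, the mean
# predicted score of group 1 minus that of group 0.
w = np.where(g == 1, 1.0 / (g == 1).sum(), -1.0 / (g == 0).sum())

for epoch in range(201):
    scores = U @ V.T
    err = scores - R                  # gradient of the squared-error term
    gap = w @ scores                  # per-major group gap, shape (n_majors,)
    # Gradient of lam_fair * sum(gap**2) w.r.t. scores is
    # 2 * lam_fair * outer(w, gap); add it to the reconstruction gradient.
    dS = err + 2.0 * lam_fair * np.outer(w, gap)
    U -= lr * (dS @ V + reg * U)
    V -= lr * (dS.T @ U + reg * V)
    if epoch % 50 == 0:
        print(f"epoch {epoch:3d}  mse={np.mean(err**2):.4f}  "
              f"mean|gap|={np.mean(np.abs(gap)):.4f}")
```

In sketches of this kind, lam_fair controls the accuracy-fairness trade-off; the paper's offline finding that debiasing need not sacrifice accuracy corresponds to settings where the penalty can be driven down with little increase in reconstruction error.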

      Published In

ACM Transactions on Interactive Intelligent Systems, Volume 13, Issue 3
September 2023, 263 pages
ISSN: 2160-6455
EISSN: 2160-6463
DOI: 10.1145/3623489

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 11 September 2023
Online AM (Accepted Manuscript): 01 August 2023
      Accepted: 17 July 2023
      Revised: 11 July 2023
      Received: 20 June 2022
      Published in TIIS Volume 13, Issue 3

      Author Tags

      1. AI
      2. machine learning
      3. fairness
      4. gender bias
      5. career recommendation

      Qualifiers

      • Research-article

      Funding Sources

      • U.S. Department of Commerce, National Institute of Standards and Technology
      • National Science Foundation
