DOI: 10.1145/3447535.3462487
Research Article · Open Access

You’d Better Stop! Understanding Human Reliance on Machine Learning Models under Covariate Shift

Published: 22 June 2021

Abstract

Decision-making aids powered by machine learning models are becoming increasingly prevalent on the web today. However, when applied to data drawn from a distribution different from the training data (i.e., when covariate shift occurs), machine learning models often suffer performance degradation and may provide misleading recommendations to human decision-makers. In this paper, we conduct a randomized experiment to investigate how people rely on machine learning models to make decisions under covariate shift. Surprisingly, we find that people rely on machine learning models more when making decisions on out-of-distribution data than on in-distribution data. Moreover, while increasing people’s awareness of the model’s possible performance disparity across different data helps decrease their over-reliance on the model under covariate shift, enabling people to visualize the data distributions and the model’s performance does not seem to help. We conclude by discussing the implications of our results.
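To make the failure mode concrete: under covariate shift the input distribution p(x) changes between training and deployment while the labeling rule p(y | x) stays fixed, so a model can degrade sharply even though nothing about the model or the task definition has changed. The sketch below is a minimal, hypothetical illustration of this (our own synthetic data and scikit-learn setup, not the task, data, or model used in the paper’s experiment):

```python
# Minimal illustration of covariate shift (hypothetical setup, not the
# paper's experiment): p(x) changes between training and deployment,
# while the ground-truth labeling rule p(y | x) stays fixed.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def label(x):
    # Fixed ground-truth rule shared by both distributions. It is nearly
    # linear near the origin but not in the shifted region.
    return (np.sin(x[:, 0]) + x[:, 1] > 0).astype(int)

# Train where the data lives: features drawn near the origin.
x_train = rng.normal(0.0, 1.0, size=(2000, 2))
model = LogisticRegression().fit(x_train, label(x_train))

# In-distribution test set vs. a covariate-shifted one: the first
# feature's mean moves from 0 to 3, but the labeling rule is unchanged.
x_iid = rng.normal(0.0, 1.0, size=(1000, 2))
x_ood = rng.normal([3.0, 0.0], 1.0, size=(1000, 2))

print(f"in-distribution accuracy:     {model.score(x_iid, label(x_iid)):.2f}")
print(f"out-of-distribution accuracy: {model.score(x_ood, label(x_ood)):.2f}")
```

In runs of this sketch, in-distribution accuracy should stay high while accuracy in the shifted region falls toward chance, and nothing in the model’s point predictions signals the drop. That silent gap is what makes calibrated human reliance difficult in the out-of-distribution regime the paper studies.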

Supplementary Material

MP4 File (PS3.3_Chun-WeiChiang_YoudBetterStop-UnderstandingHumanReliance_onMachineLearningModels_underCovariateShift.mp4)
You’d Better Stop! Understanding Human Reliance on Machine Learning Models under Covariate Shift




Published In

WebSci '21: Proceedings of the 13th ACM Web Science Conference 2021
June 2021
328 pages
ISBN: 9781450383301
DOI: 10.1145/3447535
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Machine Learning
  2. appropriate reliance
  3. covariate shift
  4. human-AI interaction

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

WebSci '21: 13th ACM Web Science Conference 2021
June 21–25, 2021
Virtual Event, United Kingdom

Acceptance Rates

Overall acceptance rate: 245 of 933 submissions (26%)


Cited By

  • (2024) You Can Only Verify When You Know the Answer: Feature-Based Explanations Reduce Overreliance on AI for Easy Decisions, but Not for Hard Ones. Proceedings of Mensch und Computer 2024, 156–170. https://doi.org/10.1145/3670653.3670660 (online: 1 Sep 2024)
  • (2024) Does More Advice Help? The Effects of Second Opinions in AI-Assisted Decision Making. Proceedings of the ACM on Human-Computer Interaction 8, CSCW1, 1–31. https://doi.org/10.1145/3653708 (online: 26 Apr 2024)
  • (2024) To Err Is AI! Debugging as an Intervention to Facilitate Appropriate Reliance on AI Systems. Proceedings of the 35th ACM Conference on Hypertext and Social Media, 98–105. https://doi.org/10.1145/3648188.3675130 (online: 10 Sep 2024)
  • (2024) Enhancing AI-Assisted Group Decision Making through LLM-Powered Devil's Advocate. Proceedings of the 29th International Conference on Intelligent User Interfaces, 103–119. https://doi.org/10.1145/3640543.3645199 (online: 18 Mar 2024)
  • (2024) "This is not a data problem": Algorithms and Power in Public Higher Education in Canada. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 1–14. https://doi.org/10.1145/3613904.3642451 (online: 11 May 2024)
  • (2024) AI Pilot in the Cockpit: An Investigation of Public Acceptance. International Journal of Human–Computer Interaction, 1–14. https://doi.org/10.1080/10447318.2024.2301856 (online: 12 Jan 2024)
  • (2023) How Time Pressure in Different Phases of Decision-Making Influences Human-AI Collaboration. Proceedings of the ACM on Human-Computer Interaction 7, CSCW2, 1–26. https://doi.org/10.1145/3610068 (online: 4 Oct 2023)
  • (2023) How Stated Accuracy of an AI System and Analogies to Explain Accuracy Affect Human Reliance on the System. Proceedings of the ACM on Human-Computer Interaction 7, CSCW2, 1–29. https://doi.org/10.1145/3610067 (online: 4 Oct 2023)
  • (2023) Appropriate Reliance on AI Advice: Conceptualization and the Effect of Explanations. Proceedings of the 28th International Conference on Intelligent User Interfaces, 410–422. https://doi.org/10.1145/3581641.3584066 (online: 27 Mar 2023)
  • (2023) Toward Supporting Perceptual Complementarity in Human-AI Collaboration via Reflection on Unobservables. Proceedings of the ACM on Human-Computer Interaction 7, CSCW1, 1–20. https://doi.org/10.1145/3579628 (online: 16 Apr 2023)
