You'd better stop! Understanding human reliance on machine learning models under covariate shift
Decision-making aids powered by machine learning models are becoming increasingly prevalent on the web today. However, when applied to a new distribution of data that is different from the training data (i.e., when covariate shift occurs), machine learning models often suffer from performance degradation and may provide misleading recommendations to human decision-makers. In this paper, we conduct a randomized experiment to investigate how people rely on machine learning models to make decisions under covariate shift. Surprisingly, we find that people rely on machine learning models more when making decisions on out-of-distribution data than on in-distribution data. Moreover, while increasing people’s awareness of the machine learning model’s possible performance disparity on different data helps decrease people’s over-reliance on the model under covariate shift, enabling people to visualize the data distributions and the model’s performance does not seem to help. We conclude by discussing the implications of our results.