Abstract
With the development of Human-AI Collaboration in Classification (HAI-CC), integrating user and AI predictions becomes challenging due to the complexity of the decision-making process, which has three options: 1) AI autonomously classifies; 2) learning to complement, where AI collaborates with users; and 3) learning to defer, where AI defers to users. Despite their interconnected nature, these options have been studied in isolation rather than as components of a unified system. In this paper, we address this weakness with a novel HAI-CC methodology, called Learning to Complement and to Defer to Multiple Users (LECODU). LECODU not only combines learning-to-complement and learning-to-defer strategies, but also estimates the optimal number of users to engage in the decision process. The training of LECODU maximises classification accuracy and minimises the collaboration cost associated with user involvement. Comprehensive evaluations across real-world and synthesised datasets demonstrate LECODU's superior performance compared to state-of-the-art HAI-CC methods. Remarkably, even when relying on unreliable users with high rates of label noise, LECODU exhibits significant improvement over both human decision-makers alone and AI alone. Code is available at https://github.com/zhengzhang37/LECODU.git.
Supported by the Engineering and Physical Sciences Research Council (EPSRC) through grant EP/Y018036/1.
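To make the three-option decision process concrete, the sketch below shows one plausible way to set it up in PyTorch: a small gating network chooses among 2M+1 collaboration modes (AI alone, complement with k user predictions, or defer to k user predictions) via a Gumbel-softmax relaxation, and the training loss trades classification accuracy against the expected number of users consulted. All module names, the fusion rule (simple averaging), and the exact loss form are illustrative assumptions, not the authors' implementation; see the linked repository for the actual method.

```python
# Minimal sketch of a LECODU-style decision process, under the
# assumptions stated above. Not the official implementation:
# https://github.com/zhengzhang37/LECODU.git
import torch
import torch.nn as nn
import torch.nn.functional as F


class CollaborationGate(nn.Module):
    """Picks one of 2M+1 collaboration modes from the AI's logits:
    mode 0        -> AI predicts alone,
    modes 1..M    -> complement: fuse AI prediction with k user labels,
    modes M+1..2M -> defer: use k user labels only.
    (Gating on AI logits alone is a simplification for illustration.)"""

    def __init__(self, num_classes: int, max_users: int):
        super().__init__()
        self.max_users = max_users
        self.gate = nn.Linear(num_classes, 2 * max_users + 1)

    def forward(self, ai_logits: torch.Tensor) -> torch.Tensor:
        # Gumbel-softmax gives a differentiable, approximately one-hot
        # choice over the discrete collaboration modes.
        return F.gumbel_softmax(self.gate(ai_logits), tau=1.0, hard=False)


def collaboration_loss(mode_probs, ai_logits, user_onehots, targets, beta=0.1):
    """Cross-entropy of the mode-weighted prediction, plus a penalty
    proportional to the expected number of users consulted."""
    M = user_onehots.shape[1]
    ai_prob = ai_logits.softmax(dim=-1)
    preds = [ai_prob]                            # mode 0: AI alone
    for k in range(1, M + 1):                    # complement modes
        preds.append(0.5 * ai_prob + 0.5 * user_onehots[:, :k].mean(dim=1))
    for k in range(1, M + 1):                    # defer modes
        preds.append(user_onehots[:, :k].mean(dim=1))
    combined = (mode_probs.unsqueeze(-1) * torch.stack(preds, dim=1)).sum(dim=1)
    ce = F.nll_loss(combined.clamp_min(1e-8).log(), targets)
    # Users consulted per mode: 0, then 1..M for complement, 1..M for defer.
    users_per_mode = torch.tensor([0] + list(range(1, M + 1)) * 2,
                                  dtype=mode_probs.dtype,
                                  device=mode_probs.device)
    cost = (mode_probs * users_per_mode).sum(dim=1).mean()
    return ce + beta * cost


if __name__ == "__main__":
    B, C, M = 4, 10, 3  # batch size, classes, max users (toy values)
    gate = CollaborationGate(num_classes=C, max_users=M)
    ai_logits = torch.randn(B, C)
    user_onehots = F.one_hot(torch.randint(0, C, (B, M)), C).float()
    targets = torch.randint(0, C, (B,))
    loss = collaboration_loss(gate(ai_logits), ai_logits, user_onehots, targets)
    loss.backward()
    print(f"loss = {loss.item():.3f}")
```

At test time, the most probable mode would determine whether to query users at all and, if so, how many, corresponding to the estimate of the optimal number of users described in the abstract; the penalty weight beta controls the accuracy-versus-collaboration-cost trade-off.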
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, Z., Ai, W., Wells, K., Rosewarne, D., Do, T.T., Carneiro, G. (2025). Learning to Complement and to Defer to Multiple Users. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15114. Springer, Cham. https://doi.org/10.1007/978-3-031-72992-8_9
DOI: https://doi.org/10.1007/978-3-031-72992-8_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72991-1
Online ISBN: 978-3-031-72992-8
eBook Packages: Computer Science, Computer Science (R0)