
Learning to Complement and to Defer to Multiple Users

  • Conference paper
  • First Online:
Computer Vision – ECCV 2024 (ECCV 2024)

Abstract

Human-AI Collaboration in Classification (HAI-CC) makes the integration of user and AI predictions challenging because the decision-making process offers three options: 1) the AI classifies autonomously; 2) learning to complement, where the AI collaborates with users; and 3) learning to defer, where the AI defers to users. Despite their interconnected nature, these options have been studied in isolation rather than as components of a unified system. In this paper, we address this weakness with a novel HAI-CC methodology, called Learning to Complement and to Defer to Multiple Users (LECODU). LECODU not only combines learning-to-complement and learning-to-defer strategies, but also estimates the optimal number of users to engage in the decision process. The training of LECODU maximises classification accuracy and minimises the collaboration costs associated with user involvement. Comprehensive evaluations across real-world and synthesized datasets demonstrate LECODU's superior performance compared to state-of-the-art HAI-CC methods. Remarkably, even when relying on unreliable users with high rates of label noise, LECODU exhibits significant improvement over both human decision-makers alone and AI alone. Code is available at https://github.com/zhengzhang37/LECODU.git.
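The three-option decision process described in the abstract can be sketched as a simple gate: use the AI alone, combine AI and user predictions (complement), or hand the decision to the users (defer), tracking how many users were engaged as the collaboration cost. The confidence thresholds `tau_auto` and `tau_defer` and the equal-weight combination below are hypothetical illustrations for exposition only, not LECODU's learned gating module.

```python
import numpy as np

def hai_cc_decision(ai_probs, user_preds, num_classes,
                    tau_auto=0.9, tau_defer=0.5):
    """Illustrative three-way HAI-CC gate.

    Returns (predicted class, number of users consulted). The thresholds
    and the 50/50 mixing weight are made-up values for illustration.
    """
    conf = ai_probs.max()
    if conf >= tau_auto:
        # Option 1: AI classifies autonomously -> zero collaboration cost.
        return int(ai_probs.argmax()), 0
    # Turn the engaged users' hard labels into an empirical distribution.
    user_probs = np.bincount(user_preds, minlength=num_classes) / len(user_preds)
    if conf >= tau_defer:
        # Option 2: complement -- combine AI and user predictions.
        combined = 0.5 * ai_probs + 0.5 * user_probs
        return int(combined.argmax()), len(user_preds)
    # Option 3: defer -- the users' consensus decides.
    return int(user_probs.argmax()), len(user_preds)
```

In LECODU itself the choice among the three options and the number of users to consult are learned jointly, trading accuracy against collaboration cost, rather than fixed by thresholds as in this sketch.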

Supported by the Engineering and Physical Sciences Research Council (EPSRC) through grant EP/Y018036/1.




Author information


Corresponding author

Correspondence to Gustavo Carneiro.


Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 165 KB)


Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Zhang, Z., Ai, W., Wells, K., Rosewarne, D., Do, TT., Carneiro, G. (2025). Learning to Complement and to Defer to Multiple Users. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15114. Springer, Cham. https://doi.org/10.1007/978-3-031-72992-8_9

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-72992-8_9


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-72991-1

  • Online ISBN: 978-3-031-72992-8

  • eBook Packages: Computer Science, Computer Science (R0)
