Abstract
Effective confidence estimation is desirable for image classification tasks such as clinical diagnosis from medical images. However, it is well known that modern neural networks are often over-confident in their predictions. Deep Ensemble (DE) is one of the state-of-the-art methods for estimating reliable confidence. In this work, we observe that DE sometimes harms confidence estimation because it outputs relatively lower confidence for correctly classified samples. Motivated by the observation that a doctor often refers to other doctors' opinions to adjust the confidence in his or her own decision, we propose a simple but effective post-hoc confidence estimation method called Deep Model Reference (DMR). Specifically, DMR employs one individual model to make the decision, while a group of individual models helps estimate the confidence for that decision. Rigorous proof and extensive empirical evaluations show that DMR achieves superior performance in confidence estimation compared to DE and other state-of-the-art methods, making trustworthy image classification more practical. Source code is available at https://openi.pcl.ac.cn/OpenMedIA/MICCAI2024_DMR.
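The contrast between DE and DMR described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the source code link for that). The abstract does not state how the reference group's outputs are aggregated, so the rule used here, averaging the softmax probability that the reference models assign to the deciding model's predicted class, is an assumption, as are the function names `deep_ensemble` and `model_reference`.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one logit vector."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def deep_ensemble(all_logits: Sequence[Sequence[float]]) -> Tuple[int, float]:
    """DE baseline: average the members' probabilities, predict the argmax,
    and report the averaged probability of that class as confidence."""
    probs = [softmax(l) for l in all_logits]
    n_classes = len(probs[0])
    mean = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    pred = max(range(n_classes), key=mean.__getitem__)
    return pred, mean[pred]


def model_reference(
    decider_logits: Sequence[float],
    reference_logits: Sequence[Sequence[float]],
) -> Tuple[int, float]:
    """Sketch of the DMR idea: a single model makes the decision, and a group
    of reference models scores that decision.

    ASSUMPTION: confidence is the mean probability the reference models
    assign to the decider's predicted class; the paper may aggregate
    differently.
    """
    decider_probs = softmax(decider_logits)
    pred = max(range(len(decider_probs)), key=decider_probs.__getitem__)
    ref_probs = [softmax(l) for l in reference_logits]
    conf = sum(p[pred] for p in ref_probs) / len(ref_probs)
    return pred, conf


# Hypothetical logits for a 3-class problem and three reference models.
refs = [[2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [2.0, 0.0, 0.0]]
print(deep_ensemble(refs))
print(model_reference([2.0, 0.0, 0.0], refs))
```

The difference from DE in this sketch is only where the predicted class comes from: DE takes it from the averaged distribution, whereas here it comes from one chosen model. Whether the deciding model is also included in the reference group is left to the caller.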
Y. Qiu—Work not related to position at Amazon.
Acknowledgments
This work is supported in part by the National Natural Science Foundation of China (grant No. 62071502), the Major Key Project of PCL (grant No. PCL2023A09), and the Guangdong Excellent Youth Team Program (grant No. 2023B1515040025).
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zheng, Y., Qiu, Y., Che, H., Chen, H., Zheng, WS., Wang, R. (2024). Deep Model Reference: Simple Yet Effective Confidence Estimation for Image Classification. In: Linguraru, M.G., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2024. MICCAI 2024. Lecture Notes in Computer Science, vol 15010. Springer, Cham. https://doi.org/10.1007/978-3-031-72117-5_17
DOI: https://doi.org/10.1007/978-3-031-72117-5_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72116-8
Online ISBN: 978-3-031-72117-5
eBook Packages: Computer Science, Computer Science (R0)