Abstract
Segmentation foundation models, e.g., the Segment Anything Model (SAM), have attracted increasing interest in the medical imaging community. Early pioneering studies primarily concentrated on assessing and improving SAM's performance in terms of overall accuracy and efficiency, yet little attention has been given to fairness considerations. This oversight raises questions about the potential for performance biases that could mirror those found in task-specific deep learning models such as nnU-Net. In this paper, we explore the fairness dilemma concerning large segmentation foundation models. We prospectively curate a benchmark dataset of 3D MRI and CT scans covering the liver, kidneys, spleen, lungs, and aorta from a total of 1056 healthy subjects with expert segmentations. Crucially, we document demographic details such as gender, age, and body mass index (BMI) for each subject to facilitate a nuanced fairness analysis. We test state-of-the-art foundation models for medical image segmentation, including the original SAM, medical SAM, and SAT-Nano (Segment Anything in medical scenarios driven by Text prompts), to evaluate segmentation efficacy across different demographic groups and identify disparities. Our comprehensive analysis, which accounts for various confounding factors, reveals significant fairness concerns within these foundation models. Moreover, our findings highlight not only disparities in overall segmentation metrics, such as the Dice Similarity Coefficient (DSC), but also significant variations in the spatial distribution of segmentation errors, offering empirical evidence of the nuanced challenges in ensuring fairness in medical image segmentation.
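To make the group-wise evaluation described in the abstract concrete, below is a minimal, hypothetical Python sketch of how per-subject Dice Similarity Coefficients might be aggregated by a demographic attribute to quantify a fairness gap. The function names (dice_coefficient, group_dice_gap) and the maximum-minus-minimum group gap statistic are illustrative assumptions, not the paper's actual evaluation code.

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-8) -> float:
    """Dice Similarity Coefficient (DSC) between two binary segmentation masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    intersection = np.logical_and(pred, gt).sum()
    return (2.0 * intersection + eps) / (pred.sum() + gt.sum() + eps)

def group_dice_gap(dice_scores, groups) -> float:
    """Largest difference in mean DSC between any two demographic groups.

    Illustrative fairness-gap statistic (assumption); the paper may use
    other disparity measures and confounder adjustments.
    """
    means = {g: np.mean([d for d, gg in zip(dice_scores, groups) if gg == g])
             for g in set(groups)}
    return float(max(means.values()) - min(means.values()))

# Toy example: per-subject DSC values paired with a gender attribute.
scores = [0.92, 0.88, 0.95, 0.81]
genders = ["F", "M", "F", "M"]
print(group_dice_gap(scores, genders))  # ≈ 0.09 (mean F = 0.935, mean M = 0.845)
```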
Q. Li and Y. Zhang—Equal contribution.
Acknowledgments
We thank Liguo Jia and Xiaoqing Qiao for their annotation work on the in-house dataset in this study. This study was supported in part by the National Natural Science Foundation of China (No. 62331021, No. 62201263), the Shanghai Sailing Program under Grant 22YF1409300, the China Computer Federation (CCF)-Baidu Open Fund under Grant CCF-BAIDU 202316, and the International Science and Technology Cooperation Program under the 2023 Shanghai Action Plan for Science under Grant 23410710400.
Ethics declarations
Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Li, Q. et al. (2024). An Empirical Study on the Fairness of Foundation Models for Multi-Organ Image Segmentation. In: Linguraru, M.G., et al. Medical Image Computing and Computer Assisted Intervention – MICCAI 2024. MICCAI 2024. Lecture Notes in Computer Science, vol 15012. Springer, Cham. https://doi.org/10.1007/978-3-031-72390-2_41
DOI: https://doi.org/10.1007/978-3-031-72390-2_41
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72389-6
Online ISBN: 978-3-031-72390-2
eBook Packages: Computer Science, Computer Science (R0)