Abstract
The U-Net model, introduced in 2015, is established as the state-of-the-art architecture for medical image segmentation, along with variants such as UNet++, nnU-Net, and V-Net. Vision transformers made a breakthrough in computer vision in 2021. Since then, many transformer-based and hybrid architectures (combining convolutional blocks and transformer blocks) have been proposed for image segmentation, challenging the predominance of U-Net. In this paper, we ask whether transformers could overtake U-Net for medical image segmentation. We compare SegFormer, one of the most popular transformer architectures for segmentation, with U-Net on three publicly available medical image datasets that cover various modalities and organs: segmentation of cardiac structures in ultrasound images from the CAMUS challenge, segmentation of polyps in endoscopy images, and segmentation of instruments in colonoscopy images from the MedAI challenge. We compare the two models on several metrics (segmentation performance, training time) and show that SegFormer can be a true competitor to U-Net and should be carefully considered for future medical image segmentation tasks.
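As an illustration of the kind of comparison described above, the following minimal sketch computes an overlap score between a predicted mask and a ground-truth mask and measures elapsed wall-clock time. The abstract does not name the exact metrics used, so the Dice coefficient, the helper names `dice_score` and `timed`, and the toy masks are all assumptions made for illustration, not the authors' evaluation code.

```python
import time
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice overlap between two binary masks (1.0 means perfect agreement).

    Assumption: Dice is used here only as a common example of a
    segmentation performance metric; the abstract does not specify one.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    # eps keeps the ratio defined when both masks are empty.
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

def timed(fn, *args, **kwargs):
    """Hypothetical helper: run fn and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# Toy 4x4 masks standing in for a ground truth and a model prediction.
target = np.array([[0, 0, 0, 0],
                   [0, 1, 1, 0],
                   [0, 1, 1, 0],
                   [0, 0, 0, 0]])
pred = np.array([[0, 0, 0, 0],
                 [0, 1, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 0]])

score, seconds = timed(dice_score, pred, target)
print(f"Dice: {score:.3f}, elapsed: {seconds:.6f} s")
```

In a real comparison, the same scoring function would be applied to the outputs of both models on a held-out test set, and the timing helper would wrap each model's training loop rather than a single metric call.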
Acknowledgments
The authors acknowledge the support of the French Agence Nationale de la Recherche (ANR), under grant Project-ANR-21-CE23-0013 (project MediSEG).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Sourget, T., Hasany, S.N., Mériaudeau, F., Petitjean, C. (2024). Can SegFormer be a True Competitor to U-Net for Medical Image Segmentation?. In: Waiter, G., Lambrou, T., Leontidis, G., Oren, N., Morris, T., Gordon, S. (eds) Medical Image Understanding and Analysis. MIUA 2023. Lecture Notes in Computer Science, vol 14122. Springer, Cham. https://doi.org/10.1007/978-3-031-48593-0_8
DOI: https://doi.org/10.1007/978-3-031-48593-0_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-48592-3
Online ISBN: 978-3-031-48593-0
eBook Packages: Computer Science (R0)