Abstract
Accurate segmentation of fetal brain MRI images is essential for the diagnosis and treatment of fetal diseases. Manual segmentation is laborious, time-consuming, and error-prone, while automated segmentation remains challenging owing to (1) the variation in shape and size of brain structures among patients, (2) the subtle changes caused by congenital diseases, and (3) the complex anatomy of the brain. To yield satisfactory results, it is critical to effectively capture both long-range dependencies and the correlations among training samples. Recently, several transformer-based models have been proposed and have achieved good performance in segmentation tasks. However, the self-attention blocks embedded in transformers often neglect the latent relationships among different samples. In addition, a model may produce biased results because of the unbalanced data distribution in the training dataset. We propose a novel unbalanced weighted Unet equipped with a new ExSwin transformer block that addresses these concerns by effectively capturing long-range dependencies and correlations among different samples. We design a deeper encoder to facilitate feature extraction and preserve more semantic details. In addition, an adaptive weight adjusting method dynamically adjusts the loss weights of different classes to optimize the learning direction and extract more features from under-learned classes. Extensive experiments on the FeTA dataset demonstrate the effectiveness of our model, which achieves better results than state-of-the-art approaches.
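The abstract does not give the formula behind the adaptive weight adjusting method, so the following is only a minimal sketch of the general idea, under stated assumptions: each class weight is assumed to grow with that class's shortfall from a perfect validation Dice score, and the weights are rescaled to average 1. The function names, the Dice-based rule, and the example scores are hypothetical and are not taken from the paper.

```python
import math


def adaptive_class_weights(dice_per_class, eps=1e-6):
    """Assumed rule (not the paper's): weight ~ (1 - Dice), rescaled to mean 1.

    Classes that are currently segmented poorly (low Dice) receive a larger
    loss weight, so the next training epoch focuses more on them.
    """
    shortfall = [1.0 - d + eps for d in dice_per_class]
    total = sum(shortfall)
    n = len(shortfall)
    return [n * s / total for s in shortfall]


def weighted_ce_loss(prob_rows, labels, weights):
    """Class-weighted cross-entropy averaged over pixels.

    prob_rows: per-pixel lists of predicted class probabilities.
    labels:    per-pixel ground-truth class indices.
    weights:   per-class loss weights.
    """
    total = 0.0
    for probs, label in zip(prob_rows, labels):
        # Clamp the probability to avoid log(0).
        total += -weights[label] * math.log(max(probs[label], 1e-12))
    return total / len(labels)


# Hypothetical per-class Dice scores measured after one epoch.
dice = [0.9, 0.5, 0.7]
weights = adaptive_class_weights(dice)

# Two toy pixels with predicted probabilities over three classes.
probs = [[0.8, 0.1, 0.1], [0.2, 0.3, 0.5]]
labels = [0, 1]
loss = weighted_ce_loss(probs, labels, weights)
print(weights, loss)
```

In a training loop, the weights would be recomputed at a chosen interval (for example, once per epoch) from the latest per-class scores, so the emphasis shifts as under-learned classes catch up.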
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China (No. 61973221), the Natural Science Foundation of Guangdong Province, China (No. 2019A1515011165), the Innovation and Technology Fund-Mainland-Hong Kong Joint Funding Scheme (ITF-MHKJFS) (No. MHP/014/20), and the Project of Strategic Importance grant of The Hong Kong Polytechnic University (No. 1-ZE2Q).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Wen, Y., Liang, C., Lin, J., Wu, H., Qin, J. (2023). ExSwin-Unet: An Unbalanced Weighted Unet with Shifted Window and External Attentions for Fetal Brain MRI Image Segmentation. In: Karlinsky, L., Michaeli, T., Nishino, K. (eds) Computer Vision – ECCV 2022 Workshops. ECCV 2022. Lecture Notes in Computer Science, vol 13803. Springer, Cham. https://doi.org/10.1007/978-3-031-25066-8_18
DOI: https://doi.org/10.1007/978-3-031-25066-8_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25065-1
Online ISBN: 978-3-031-25066-8
eBook Packages: Computer Science, Computer Science (R0)