Abstract
Medical images often have low contrast and blurred boundaries between different tissues or between tissues and lesions. Because labeling medical images is laborious and requires expert knowledge, labeled data are expensive or simply unavailable. UNet has achieved great success in medical image segmentation. However, the pooling layers in its downsampling path tend to discard important information such as location information, and the locality of the convolution operation makes it difficult to learn global, long-range semantic interactions. The usual remedy is to collect more training data or to enlarge the training set through augmentation. However, large medical datasets are hard to obtain, and augmentation may increase the training burden. In this work, we propose a 2D medical image segmentation network with a convolutional capsule encoder and a multi-scale local co-occurrence module. To extract more local detail and contextual information, the capsule encoder is introduced to learn the target location and the relationship between the part and the whole. Multi-scale features are fused by a new attention mechanism, which captures global information to selectively emphasize salient, task-relevant features and suppress background noise. This attention mechanism also helps preserve the information discarded by the pooling layers of the network. In addition, we propose a multi-scale local co-occurrence algorithm that better learns the context and dependencies between different regions of an image. Experimental results on the Liver, ISIC and BraTS2019 datasets show that our network outperforms UNet and other previous medical image segmentation networks under the same experimental conditions.
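The claim that pooling discards location information can be illustrated with a small sketch. The code below is not taken from the paper. It uses a hypothetical helper, `max_pool_2x2`, written with NumPy, and two toy 4x4 inputs chosen for illustration. A single bright pixel placed at two different positions inside the same 2x2 window yields identical pooled outputs, so the exact position cannot be recovered after pooling.

```python
import numpy as np


def max_pool_2x2(x):
    """Non-overlapping 2x2 max pooling on a 2D array with even height and width.

    Hypothetical helper for illustration only; not the paper's implementation.
    """
    h, w = x.shape
    # Split each axis into (block index, offset inside block), then take the
    # maximum over the two offset axes.
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))


# Two toy inputs: one bright pixel at different positions in the same window.
a = np.zeros((4, 4))
a[0, 0] = 1.0
b = np.zeros((4, 4))
b[1, 1] = 1.0

pooled_a = max_pool_2x2(a)
pooled_b = max_pool_2x2(b)

# The inputs differ, yet the pooled feature maps are identical.
inputs_differ = not np.array_equal(a, b)
same_after_pooling = np.array_equal(pooled_a, pooled_b)
```

Here both `inputs_differ` and `same_after_pooling` evaluate to `True`, which is the kind of positional ambiguity that the capsule encoder and attention mechanism described in the abstract are intended to reduce.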
Data availability
The data that support the findings of this study are available from LiTS (Liver Tumor Segmentation Challenge), ISIC 2018 (Skin Lesion Analysis Towards Melanoma Detection) and BraTS 2019 (Multimodal Brain Tumor Segmentation Challenge). Restrictions apply to the availability of these data, which were used under licence for the current study, and so they are not publicly available. The data are, however, available from the authors upon reasonable request and with permission of LiTS, ISIC and BraTS.
Code availability
The code can be made available on request.
Funding
This work was sponsored by the Natural Science Foundation of Shanghai under Grant No. 22ZR1443700.
Author information
Contributions
Wang Yongxiong provided the overall idea for the paper, Qin Chendong conducted the theoretical derivation and all related experiments, Qin Chendong and Zhang Jiapeng wrote the main text of the paper, and all authors reviewed the manuscript.
Ethics declarations
Ethics approval
Not applicable.
Competing interests
The authors declare no competing interests.
Conflicts of interest
We declare that we have no conflict of interest.
Additional information
Communicated by Bin Xiao.
About this article
Cite this article
Qin, C., Wang, Y. & Zhang, J. CMLCNet: medical image segmentation network based on convolution capsule encoder and multi-scale local co-occurrence. Multimedia Systems 30, 220 (2024). https://doi.org/10.1007/s00530-024-01430-9
DOI: https://doi.org/10.1007/s00530-024-01430-9