Abstract
While endoscopy is routinely used for surveillance, its high operator dependence demands robust automated image-analysis methods. Automated segmentation of regions of interest (ROIs), which include lesions, inflammation, and instruments, can help mitigate this operator dependence. Most supervised methods are trained on the available ground-truth mask samples alone. This work proposes a joint training approach, UNet-eVAE, that couples a UNet with a variational auto-encoder (VAE) to improve endoscopic image segmentation by exploiting the original samples, the predicted masks, and the ground-truth masks. In the proposed UNet-eVAE, the VAE uses the masks to constrain ROI-specific feature representations through reconstruction as an auxiliary task. The fine-grained spatial information from the VAE is fused with the UNet decoder to enrich the feature representations and improve segmentation performance. Our experimental results on both colonoscopy and ureteroscopy datasets demonstrate that the proposed architecture learns robust representations and generalises to unseen samples while improving over the baseline.
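The joint training idea in the abstract — a segmentation loss on the UNet prediction plus auxiliary VAE terms (mask reconstruction and a KL regulariser on the latent) — can be sketched as a combined objective. The following is a minimal NumPy illustration, not the authors' implementation: the use of a soft Dice loss for both terms and the weighting factors `lam` and `beta` are assumptions for illustration only.

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    # Soft Dice loss between a predicted probability mask and a binary mask.
    inter = np.sum(pred * target)
    return 1.0 - (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)

def kl_divergence(mu, logvar):
    # KL( N(mu, diag(exp(logvar))) || N(0, I) ) for a diagonal-Gaussian
    # VAE latent, in closed form.
    return -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar))

def joint_loss(pred_mask, gt_mask, recon_mask, mu, logvar, lam=1.0, beta=1.0):
    # Hypothetical joint objective: UNet segmentation loss plus the VAE
    # auxiliary terms (mask reconstruction + KL regulariser).
    seg = dice_loss(pred_mask, gt_mask)      # UNet prediction vs. ground truth
    rec = dice_loss(recon_mask, gt_mask)     # VAE mask reconstruction
    kl = kl_divergence(mu, logvar)           # latent-space regulariser
    return seg + lam * (rec + beta * kl)
```

With a perfect prediction, a perfect reconstruction, and a standard-normal latent (`mu = logvar = 0`), every term vanishes and the joint loss is zero; training jointly minimises all three, so the VAE's reconstruction task constrains the shared ROI features rather than being optimised in isolation.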
Acknowledgement
We would like to thank Boston Scientific for funding this project (Grant No. DFR04690). SG and BT are funded by BSC; BB is funded by the EndoMapper Horizon 2020 FET project (GA 863146); SA and JR were supported by the NIHR Oxford Biomedical Research Centre.
Copyright information
© 2022 Springer Nature Switzerland AG
Cite this paper
Gupta, S., Ali, S., Xu, Z., Bhattarai, B., Turney, B., Rittscher, J. (2022). UNet-eVAE: Iterative Refinement Using VAE Embodied Learning for Endoscopic Image Segmentation. In: Lian, C., Cao, X., Rekik, I., Xu, X., Cui, Z. (eds) Machine Learning in Medical Imaging. MLMI 2022. Lecture Notes in Computer Science, vol 13583. Springer, Cham. https://doi.org/10.1007/978-3-031-21014-3_17
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-21013-6
Online ISBN: 978-3-031-21014-3
eBook Packages: Computer Science (R0)