UViT-Seg: An Efficient ViT and U-Net-Based Framework for Accurate Colorectal Polyp Segmentation in Colonoscopy and WCE Images

Yassine Oukdach¹,
Anass Garbaz¹,
Zakaria Kerkaou¹,
Mohamed El Ansari²,
Lahcen Koutti¹,
Ahmed Fouad El Ouafdi¹ &
Mouna Salihoun³

Abstract

Colorectal cancer (CRC) is one of the most prevalent cancers worldwide. Accurate localization of colorectal polyps in endoscopy images is pivotal for timely detection and removal, and thus contributes significantly to CRC prevention. Manually analyzing the images produced by gastrointestinal screening technologies is tedious for doctors, so computer vision-assisted detection could serve as an efficient tool for polyp segmentation. Numerous efforts have been dedicated to automating polyp localization, with most studies relying on convolutional neural networks (CNNs) to learn features from polyp images. Despite their success in polyp segmentation, CNNs are significantly limited in precisely determining polyp location and shape because they rely solely on local features. Because gastrointestinal images vary widely in both high- and low-level features, a framework that can learn both kinds of features is desirable. This paper introduces UViT-Seg, a framework for polyp segmentation in gastrointestinal images. Built on an encoder-decoder architecture, UViT-Seg employs two distinct feature extraction methods: a vision transformer in the encoder captures long-range semantic information, while a CNN module integrating squeeze-excitation and dual attention mechanisms captures low-level features and focuses on critical image regions. Experimental evaluations on five public datasets (CVC clinic, ColonDB, Kvasir-SEG, ETIS LaribDB, and Kvasir Capsule-SEG) demonstrate UViT-Seg’s effectiveness in polyp localization. To confirm its generalization performance, the model is also tested on datasets not used in training. Benchmarked against common segmentation methods and state-of-the-art polyp segmentation approaches, the proposed model yields promising results. For instance, it achieves a mean Dice coefficient of 0.915 and a mean intersection over union of 0.902 on the CVC Colon dataset. Furthermore, UViT-Seg is efficient, requiring fewer computational resources for both training and testing, which makes it well suited to real-world deployment.
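The Dice coefficient and intersection over union (IoU) quoted in the abstract are standard overlap metrics between a predicted mask and a ground-truth mask. The following is a minimal sketch of how such scores are commonly computed for binary masks and averaged over a dataset. It is not the authors' evaluation code: the function names `dice_and_iou` and `mean_scores`, the smoothing term `eps`, and the toy masks are assumptions made for illustration, and plain Python lists are used only to keep the example dependency-free.

```python
from typing import Iterable, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # 2-D binary mask: 1 = polyp pixel, 0 = background


def dice_and_iou(pred: Mask, target: Mask, eps: float = 1e-7) -> Tuple[float, float]:
    """Return (Dice, IoU) for two binary masks with the same number of pixels."""
    p = [int(v) for row in pred for v in row]
    t = [int(v) for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of pixels")
    inter = sum(a * b for a, b in zip(p, t))  # pixels that are 1 in both masks
    total = sum(p) + sum(t)                   # |pred| + |target|
    union = total - inter                     # |pred OR target|
    # eps (an assumed smoothing term) avoids division by zero on empty masks.
    dice = (2 * inter + eps) / (total + eps)
    iou = (inter + eps) / (union + eps)
    return dice, iou


def mean_scores(pairs: Iterable[Tuple[Mask, Mask]]) -> Tuple[float, float]:
    """Average per-image Dice and IoU over (prediction, ground truth) pairs."""
    scores = [dice_and_iou(p, t) for p, t in pairs]
    if not scores:
        raise ValueError("no mask pairs given")
    n = len(scores)
    return sum(d for d, _ in scores) / n, sum(i for _, i in scores) / n


# Toy example: 2 of 3 predicted pixels overlap 2 of 3 ground-truth pixels.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
dice, iou = dice_and_iou(pred, target)
print(round(dice, 3), round(iou, 3))  # prints 0.667 0.5
```

For a single image the two metrics are tied by IoU = Dice / (2 - Dice), so IoU never exceeds Dice. Dataset-level means, such as those quoted in the abstract, need not satisfy that identity exactly because they are averaged per image.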

Data Availability

All the datasets used in this study are indicated and cited correctly in the paper.

Funding

This work was supported by the Ministry of National Education by Vocational Training; in part by the Higher Education and Scientific Research through the Ministry of Industry, Trade, and Green and Digital Economy; in part by the Digital Development Agency (ADD); and in part by the National Center for Scientific and Technical Research (CNRST) under Project ALKHAWARIZMI/2020/20.

Author information

Authors and Affiliations

LabSIV, Department of Computer Science, Faculty of Sciences, Ibnou Zohr University, Agadir, 80000, Morocco
Yassine Oukdach, Anass Garbaz, Zakaria Kerkaou, Lahcen Koutti & Ahmed Fouad El Ouafdi
Informatics and Applications Laboratory, Department of Computer Sciences, Faculty of Science, Moulay Ismail University, B.P 11201, Meknès, 52000, Morocco
Mohamed El Ansari
Faculty of Medicine and Pharmacy of Rabat, Mohammed V University of Rabat, Rabat, 10000, Morocco
Mouna Salihoun

Contributions

Y.O., A.G., Z.K., M.E., L.K., and A.F.E. wrote the main manuscript text. All authors reviewed the manuscript.

Corresponding author

Correspondence to Yassine Oukdach.

Ethics declarations

Ethics Approval

Institutional Review Board approval was obtained.

Consent to Participate

All study participants provided written informed consent.

Consent for Publication

The manuscript contains no identifiable individual data or images that would require consent to publish from any participant.

Competing Interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Oukdach, Y., Garbaz, A., Kerkaou, Z. et al. UViT-Seg: An Efficient ViT and U-Net-Based Framework for Accurate Colorectal Polyp Segmentation in Colonoscopy and WCE Images. J Digit Imaging. Inform. med. 37, 2354–2374 (2024). https://doi.org/10.1007/s10278-024-01124-8

Received: 29 January 2024
Revised: 01 April 2024
Accepted: 13 April 2024
Published: 26 April 2024
Issue Date: October 2024
DOI: https://doi.org/10.1007/s10278-024-01124-8
