Multi-Scale Hybrid Network for Polyp Detection in Wireless Capsule Endoscopy and Colonoscopy Images
Figure 1. Framework of the traditional SSD.
Figure 2. Flowchart of the proposed Hyb-SSDNet method for small polyp detection in WCE images.
Figure 3. Modified inception structure.
Figure 4. The mid-fusion framework. Input feature maps are created at two successive inception modules of the inception v4 network to encode contextual information; all scale sizes are 35 × 35 × 384. Extracted features are passed to the mSE-Network to generate score maps reflecting the importance of features at different positions and scales. The weighted features are then concatenated and normalized to complete the network process.
Figure 5. Overview of the Hyb-SSDNet architecture with a 299 × 299 × 3 input size and inception v4 as the backbone. Features from two successive modified inception-A layers (S_L1, S_L2) are fused by a mid-fusion block producing an intermediate feature representation (S′_HL).
Figure 6. Detailed structure of the Hyb-SSDNet network.
Figure 7. Example of WCE polyp images (a–c).
Figure 8. Example of CVC-ClinicDB polyp images (a–c).
Figure 9. Example of ETIS-Larib polyp images (a–c).
Figure 10. Precision vs. recall for (a) the WCE test set, (b) the CVC-ClinicDB test set, and (c) the ETIS-Larib test set using the Hyb-SSDNet framework.
Figure 11. Qualitative comparison between FSSD300 (a,c,e,g) and the proposed Hyb-SSDNet (b,d,f,h) on the WCE polyp test set. Ground-truth bounding boxes are drawn in green; predicted boxes with an IoU of 0.5 or higher are drawn in red.
Figure 12. Qualitative comparison between FSSD300 (a,c,e,g) and the proposed Hyb-SSDNet (b,d,f,h) on the CVC-ClinicDB polyp test set. Ground-truth bounding boxes are drawn in green; predicted boxes with an IoU of 0.5 or higher are drawn in red.
Figure 13. Qualitative comparison between FSSD300 (a,c,e,g) and the proposed Hyb-SSDNet (b,d,f,h) on the ETIS-Larib polyp test set. Ground-truth bounding boxes are drawn in green; predicted boxes with an IoU of 0.5 or higher are drawn in red.
Abstract
1. Introduction
- The application of a lightweight version of the inception v4 architecture as the backbone to improve the SSD detector's ability to detect small polyps.
- A weighted mid-fusion block over two adjacent modules of the modified inception-A replaces the Conv4_3 layer of the VGG16 network, thereby tackling the problem of missing target details.
- The number of filters in the first convolution layers of the stem part of the inception v4 backbone was increased from 32 to 64 to capture more patterns and relevant information, similar to the original SSD.
- The inception v4 model is modified by reducing layers to achieve faster inference while keeping the computational cost under control.
- Hyb-SSDNet uses a weighted mid-fusion block and adds new convolution layers to construct the multi-scale feature pyramid, unlike the conventional SSD.
- The robustness of the Hyb-SSDNet model is verified through repeated experiments on three well-known datasets in the field (WCE, CVC-ClinicDB, and ETIS-Larib) for a fair comparison with state-of-the-art methods.
- The advantages and limitations of the proposed framework are discussed.
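The weighted mid-fusion step in the contributions above can be sketched numerically: two 35 × 35 × 384 inception-A feature maps are gated by squeeze-and-excitation-style channel scores, concatenated, and L2-normalized. This is a minimal NumPy sketch under stated assumptions, not the trained network; the mSE weights are randomly initialized for illustration, and the function names (`mse_weights`, `mid_fusion`) are hypothetical.

```python
import numpy as np

def mse_weights(fmap, reduction=16, seed=0):
    """SE-style channel scores: global average pool -> bottleneck -> sigmoid.
    The bottleneck weights are random placeholders, not learned values."""
    c = fmap.shape[-1]
    z = fmap.mean(axis=(0, 1))                       # squeeze: (C,)
    rng = np.random.default_rng(seed)
    w1 = rng.standard_normal((c, c // reduction)) * 0.01
    w2 = rng.standard_normal((c // reduction, c)) * 0.01
    h = np.maximum(z @ w1, 0.0)                      # ReLU bottleneck
    return 1.0 / (1.0 + np.exp(-(h @ w2)))           # excite: (C,) in (0, 1)

def mid_fusion(f1, f2):
    """Gate each map by its channel scores, concatenate, then L2-normalize
    each spatial position's feature vector."""
    g1 = f1 * mse_weights(f1)                        # broadcast over H, W
    g2 = f2 * mse_weights(f2)
    fused = np.concatenate([g1, g2], axis=-1)        # (35, 35, 768)
    norm = np.linalg.norm(fused, axis=-1, keepdims=True) + 1e-12
    return fused / norm

f1 = np.random.rand(35, 35, 384).astype(np.float32)
f2 = np.random.rand(35, 35, 384).astype(np.float32)
out = mid_fusion(f1, f2)
print(out.shape)  # (35, 35, 768)
```

The L2 normalization mirrors the role of the L2.Norm setting in the ablation study: it keeps the concatenated features from the two branches on a comparable scale before the detection head.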
2. Literature Review
2.1. Hand-Crafted and CNN Methods
2.2. SSD-Based and Other Object Detectors
2.3. Single-Shot Multibox Detector (SSD)
2.4. Feature Pyramids Hierarchy
3. Materials and Methods
3.1. Mid-Fusion Block
mSE-Network
3.2. Multi-Scale Feature Pyramid
4. Experiments
4.1. Datasets
4.2. Evaluation Indexes
4.3. Experimental Setup
4.3.1. Experimental Environment Configuration
4.3.2. Model Training
4.4. Results and Discussion
4.4.1. Ablation Studies
4.4.2. Comparison with the State-of-the-Art Method
4.4.3. Visualization of Detection Results
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Garrido, A.; Sont, R.; Dghoughi, W.; Marcoval, S.; Romeu, J.; Fernández-Esparrach, G.; Belda, I.; Guardiola, M. Automatic Polyp Detection Using Microwave Endoscopy for Colorectal Cancer Prevention and Early Detection: Phantom Validation. IEEE Access 2021, 9, 148048–148059.
- Dulf, E.H.; Bledea, M.; Mocan, T.; Mocan, L. Automatic Detection of Colorectal Polyps Using Transfer Learning. Sensors 2021, 21, 5704.
- Charfi, S.; El Ansari, M.; Balasingham, I. Computer-aided diagnosis system for ulcer detection in wireless capsule endoscopy images. IET Image Process. 2019, 13, 1023–1030.
- Lafraxo, S.; El Ansari, M. GastroNet: Abnormalities Recognition in Gastrointestinal Tract through Endoscopic Imagery using Deep Learning Techniques. In Proceedings of the 2020 8th International Conference on Wireless Networks and Mobile Communications (WINCOM), Reims, France, 27–29 October 2020; pp. 1–5.
- Souaidi, M.; Abdelouahad, A.A.; El Ansari, M. A fully automated ulcer detection system for wireless capsule endoscopy images. In Proceedings of the 2017 International Conference on Advanced Technologies for Signal and Image Processing (ATSIP), Fez, Morocco, 22–24 May 2017; pp. 1–6.
- Souaidi, M.; Charfi, S.; Abdelouahad, A.A.; El Ansari, M. New features for wireless capsule endoscopy polyp detection. In Proceedings of the 2018 International Conference on Intelligent Systems and Computer Vision (ISCV), Fez, Morocco, 2–4 April 2018; pp. 1–6.
- Charfi, S.; El Ansari, M. Computer-aided diagnosis system for colon abnormalities detection in wireless capsule endoscopy images. Multimed. Tools Appl. 2018, 77, 4047–4064.
- Souaidi, M.; Abdelouahed, A.A.; El Ansari, M. Multi-scale completed local binary patterns for ulcer detection in wireless capsule endoscopy images. Multimed. Tools Appl. 2019, 78, 13091–13108.
- Charfi, S.; El Ansari, M. A locally based feature descriptor for abnormalities detection. Soft Comput. 2020, 24, 4469–4481.
- Souaidi, M.; El Ansari, M. Automated Detection of Wireless Capsule Endoscopy Polyp Abnormalities with Deep Transfer Learning and Support Vector Machines. In Proceedings of the International Conference on Advanced Intelligent Systems for Sustainable Development; Springer: Berlin/Heidelberg, Germany, 2020; pp. 870–880.
- Lafraxo, S.; Ansari, M.E.; Charfi, S. MelaNet: An effective deep learning framework for melanoma detection using dermoscopic images. Multimed. Tools Appl. 2022, 81, 16021–16045.
- Lafraxo, S.; El Ansari, M. CoviNet: Automated COVID-19 Detection from X-rays using Deep Learning Techniques. In Proceedings of the 2020 6th IEEE Congress on Information Science and Technology (CiSt), Agadir, Morocco, 5–12 June 2021; pp. 489–494.
- Lafraxo, S.; Ansari, M.E. Regularized Convolutional Neural Network for Pneumonia Detection Trough Chest X-Rays. In Proceedings of the International Conference on Advanced Intelligent Systems for Sustainable Development; Springer: Berlin/Heidelberg, Germany, 2020; pp. 887–896.
- Xu, L.; Xie, J.; Cai, F.; Wu, J. Spectral Classification Based on Deep Learning Algorithms. Electronics 2021, 10, 1892.
- Nogueira-Rodríguez, A.; Domínguez-Carbajales, R.; Campos-Tato, F.; Herrero, J.; Puga, M.; Remedios, D.; Rivas, L.; Sánchez, E.; Iglesias, Á.; Cubiella, J.; et al. Real-time polyp detection model using convolutional neural networks. Neural Comput. Appl. 2021, 34, 10375–10396.
- Mohammed, A.; Yildirim, S.; Farup, I.; Pedersen, M.; Hovde, Ø. Y-net: A deep convolutional neural network for polyp detection. arXiv 2018, arXiv:1806.01907.
- Chen, X.; Zhang, K.; Lin, S.; Dai, K.F.; Yun, Y. Single Shot Multibox Detector Automatic Polyp Detection Network Based on Gastrointestinal Endoscopic Images. Comput. Math. Methods Med. 2021, 2021, 2144472.
- Souaidi, M.; El Ansari, M. A New Automated Polyp Detection Network MP-FSSD in WCE and Colonoscopy Images based Fusion Single Shot Multibox Detector and Transfer Learning. IEEE Access 2022, 10, 47124–47140.
- Girshick, R.; Donahue, J.; Darrell, T.; Malik, J. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 580–587.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 2016, 39, 1137–1149.
- Jifeng, D.; Yi, L.; Kaiming, H.; Jian, S. Object Detection via Region-Based Fully Convolutional Networks. arXiv 2016, arXiv:1605.06409.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
- Liu, W.; Anguelov, D.; Erhan, D.; Szegedy, C.; Reed, S.; Fu, C.Y.; Berg, A.C. SSD: Single shot multibox detector. In Proceedings of the European Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2016; pp. 21–37.
- Hasanpour, S.H.; Rouhani, M.; Fayyaz, M.; Sabokrou, M.; Adeli, E. Towards principled design of deep convolutional networks: Introducing simpnet. arXiv 2018, arXiv:1802.06205.
- Li, Z.; Zhou, F. FSSD: Feature fusion single shot multibox detector. arXiv 2017, arXiv:1712.00960.
- Szegedy, C.; Ioffe, S.; Vanhoucke, V.; Alemi, A.A. Inception-v4, inception-resnet and the impact of residual connections on learning. In Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, San Francisco, CA, USA, 4–9 February 2017.
- Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
- Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1–9.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Shin, H.C.; Lu, L.; Kim, L.; Seff, A.; Yao, J.; Summers, R.M. Interleaved text/image deep mining on a very large-scale radiology database. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1090–1099.
- Chen, B.L.; Wan, J.J.; Chen, T.Y.; Yu, Y.T.; Ji, M. A self-attention based faster R-CNN for polyp detection from colonoscopy images. Biomed. Signal Process. Control 2021, 70, 103019.
- Pacal, I.; Karaman, A.; Karaboga, D.; Akay, B.; Basturk, A.; Nalbantoglu, U.; Coskun, S. An efficient real-time colonic polyp detection with YOLO algorithms trained by using negative samples and large datasets. Comput. Biol. Med. 2021, 141, 105031.
- Hong-Tae, C.; Ho-Jun, L.; Kang, H.; Yu, S. SSD-EMB: An Improved SSD Using Enhanced Feature Map Block for Object Detection. Sensors 2021, 21, 2842.
- Qadir, H.A.; Shin, Y.; Solhusvik, J.; Bergsland, J.; Aabakken, L.; Balasingham, I. Polyp detection and segmentation using mask R-CNN: Does a deeper feature extractor CNN always perform better? In Proceedings of the 2019 13th International Symposium on Medical Information and Communication Technology (ISMICT), Oslo, Norway, 8–10 May 2019; pp. 1–6.
- Jia, X.; Mai, X.; Cui, Y.; Yuan, Y.; Xing, X.; Seo, H.; Xing, L.; Meng, M.Q.H. Automatic polyp recognition in colonoscopy images using deep learning and two-stage pyramidal feature prediction. IEEE Trans. Autom. Sci. Eng. 2020, 17, 1570–1584.
- Tashk, A.; Nadimi, E. An innovative polyp detection method from colon capsule endoscopy images based on a novel combination of RCNN and DRLSE. In Proceedings of the 2020 IEEE Congress on Evolutionary Computation (CEC), Glasgow, UK, 19–24 July 2020; pp. 1–6.
- Misawa, M.; Kudo, S.e.; Mori, Y.; Hotta, K.; Ohtsuka, K.; Matsuda, T.; Saito, S.; Kudo, T.; Baba, T.; Ishida, F.; et al. Development of a computer-aided detection system for colonoscopy and a publicly accessible large colonoscopy video database (with video). Gastrointest. Endosc. 2021, 93, 960–967.
- Liu, M.; Jiang, J.; Wang, Z. Colonic polyp detection in endoscopic videos with single shot detection based deep convolutional neural network. IEEE Access 2019, 7, 75058–75066.
- Zhang, X.; Chen, F.; Yu, T.; An, J.; Huang, Z.; Liu, J.; Hu, W.; Wang, L.; Duan, H.; Si, J. Real-time gastric polyp detection using convolutional neural networks. PLoS ONE 2019, 14, e0214133.
- Jeong, J.; Park, H.; Kwak, N. Enhancement of SSD by concatenating feature maps for object detection. arXiv 2017, arXiv:1705.09587.
- Zhai, S.; Shang, D.; Wang, S.; Dong, S. DF-SSD: An improved SSD object detection algorithm based on DenseNet and feature fusion. IEEE Access 2020, 8, 24344–24357.
- Wang, T.; Shen, F.; Deng, H.; Cai, F.; Chen, S. Smartphone imaging spectrometer for egg/meat freshness monitoring. Anal. Methods 2022, 14, 508–517.
- Lin, T.Y.; Dollár, P.; Girshick, R.; He, K.; Hariharan, B.; Belongie, S. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2117–2125.
- Fu, C.Y.; Liu, W.; Ranga, A.; Tyagi, A.; Berg, A.C. DSSD: Deconvolutional single shot detector. arXiv 2017, arXiv:1701.06659.
- Russakovsky, O.; Deng, J.; Su, H.; Krause, J.; Satheesh, S.; Ma, S.; Huang, Z.; Karpathy, A.; Khosla, A.; Bernstein, M.; et al. Imagenet large scale visual recognition challenge. Int. J. Comput. Vis. 2015, 115, 211–252.
- Pan, H.; Jiang, J.; Chen, G. TDFSSD: Top-down feature fusion single shot MultiBox detector. Signal Process. Image Commun. 2020, 89, 115987.
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141.
- Roy, A.G.; Navab, N.; Wachinger, C. Recalibrating fully convolutional networks with spatial and channel “squeeze and excitation” blocks. IEEE Trans. Med. Imaging 2018, 38, 540–549.
- Prasath, V.S. Polyp detection and segmentation from video capsule endoscopy: A review. J. Imaging 2016, 3, 1.
- Bernal, J.; Sánchez, F.J.; Fernández-Esparrach, G.; Gil, D.; Rodríguez, C.; Vilariño, F. WM-DOVA maps for accurate polyp highlighting in colonoscopy: Validation vs. saliency maps from physicians. Comput. Med. Imaging Graph. 2015, 43, 99–111.
- WE, O. ETIS-Larib Polyp DB. Available online: https://polyp.grand-challenge.org/EtisLarib/ (accessed on 27 March 2022).
- Angermann, Q.; Bernal, J.; Sánchez-Montes, C.; Hammami, M.; Fernández-Esparrach, G.; Dray, X.; Romain, O.; Sánchez, F.J.; Histace, A. Towards real-time polyp detection in colonoscopy videos: Adapting still frame-based methodologies for video sequences analysis. In Computer Assisted and Robotic Endoscopy and Clinical Image-Based Procedures; Springer: Berlin/Heidelberg, Germany, 2017; pp. 29–41.
- Silva, J.; Histace, A.; Romain, O.; Dray, X.; Granado, B. Toward embedded detection of polyps in wce images for early diagnosis of colorectal cancer. Int. J. Comput. Assist. Radiol. Surg. 2014, 9, 283–293.
- Picard, R.R.; Cook, R.D. Cross-validation of regression models. J. Am. Stat. Assoc. 1984, 79, 575–583.
- Jha, D.; Ali, S.; Tomar, N.K.; Johansen, H.D.; Johansen, D.; Rittscher, J.; Riegler, M.A.; Halvorsen, P. Real-time polyp detection, localization and segmentation in colonoscopy using deep learning. IEEE Access 2021, 9, 40496–40510.
- Chatfield, K.; Simonyan, K.; Vedaldi, A.; Zisserman, A. Return of the devil in the details: Delving deep into convolutional nets. arXiv 2014, arXiv:1405.3531.
- Mash, R.; Borghetti, B.; Pecarina, J. Improved aircraft recognition for aerial refueling through data augmentation in convolutional neural networks. In Proceedings of the International Symposium on Visual Computing; Springer: Berlin/Heidelberg, Germany, 2016; pp. 113–122.
- Ma, W.; Wang, X.; Yu, J. A Lightweight Feature Fusion Single Shot Multibox Detector for Garbage Detection. IEEE Access 2020, 8, 188577–188586.
- Shin, Y.; Qadir, H.A.; Aabakken, L.; Bergsland, J.; Balasingham, I. Automatic colon polyp detection using region based deep cnn and post learning approaches. IEEE Access 2018, 6, 40950–40962.
- Liu, X.; Guo, X.; Liu, Y.; Yuan, Y. Consolidated domain adaptive detection and localization framework for cross-device colonoscopic images. Med. Image Anal. 2021, 71, 102052.
- Wang, D.; Zhang, N.; Sun, X.; Zhang, P.; Zhang, C.; Cao, Y.; Liu, B. Afp-net: Realtime anchor-free polyp detection in colonoscopy. In Proceedings of the 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI), Portland, OR, USA, 4–6 November 2019; pp. 636–643.
- Qadir, H.A.; Shin, Y.; Solhusvik, J.; Bergsland, J.; Aabakken, L.; Balasingham, I. Toward real-time polyp detection using fully CNNs for 2D Gaussian shapes prediction. Med. Image Anal. 2021, 68, 101897.
- Pacal, I.; Karaboga, D. A robust real-time deep learning based automatic polyp detection system. Comput. Biol. Med. 2021, 134, 104519.
- Krenzer, A.; Banck, M.; Makowski, K.; Hekalo, A.; Fitting, D.; Troya, J.; Sudarevic, B.; Zoller, W.G.; Hann, A.; Puppe, F. A Real-Time Polyp Detection System with Clinical Application in Colonoscopy Using Deep Convolutional Neural Networks. 2022. Available online: https://www.researchsquare.com/article/rs-1310139/latest.pdf (accessed on 13 August 2022).
Training Data | Testing Data | Mid-Fusion Block | Weighted Feature Fusion | Normalization | Feature Fusion | mAP (%)
---|---|---|---|---|---|---
WCE | WCE | Inception-A (×2) | × | × | Concatenation | 89.85
WCE | WCE | Inception-A (×2) | ✓ | × | Summation | 88.69
WCE | WCE | Stem & Inception-A (×1) | × | B.Norm | Concatenation | 88.42
WCE | WCE | Stem & Inception-A (×1) | ✓ | × | Summation | 89.28
WCE | WCE | Inception-A (×2) | ✓ | L2.Norm | Average | 90.11
WCE | WCE | Inception-A (×2) | ✓ | B.Norm | Average | 89.96
WCE | WCE | Inception-A (×2) | ✓ | L2.Norm | Concatenation | 93.29
WCE | WCE | Stem & Inception-A (×2) | × | B.Norm | Concatenation | 89.75
WCE | WCE | Stem & Inception-A (×2) | ✓ | × | Summation | 91.42
WCE | WCE | Stem & Inception-A (×2) | ✓ | L2.Norm | Average | 92.53
CVC-ClinicDB & ETIS-Larib | CVC-ClinicDB | Inception-A (×2) | × | × | Concatenation | 88.25
CVC-ClinicDB & ETIS-Larib | CVC-ClinicDB | Inception-A (×2) | ✓ | × | Summation | 87.37
CVC-ClinicDB & ETIS-Larib | CVC-ClinicDB | Stem & Inception-A (×1) | × | B.Norm | Concatenation | 88.21
CVC-ClinicDB & ETIS-Larib | CVC-ClinicDB | Stem & Inception-A (×1) | ✓ | × | Summation | 88.98
CVC-ClinicDB & ETIS-Larib | CVC-ClinicDB | Inception-A (×2) | ✓ | L2.Norm | Average | 89.74
CVC-ClinicDB & ETIS-Larib | CVC-ClinicDB | Inception-A (×2) | ✓ | B.Norm | Average | 88.65
CVC-ClinicDB & ETIS-Larib | CVC-ClinicDB | Inception-A (×2) | ✓ | L2.Norm | Concatenation | 91.93
CVC-ClinicDB & ETIS-Larib | CVC-ClinicDB | Stem & Inception-A (×2) | × | B.Norm | Concatenation | 88.87
CVC-ClinicDB & ETIS-Larib | CVC-ClinicDB | Stem & Inception-A (×2) | ✓ | × | Summation | 89.68
CVC-ClinicDB & ETIS-Larib | CVC-ClinicDB | Stem & Inception-A (×2) | ✓ | L2.Norm | Average | 90.14
CVC-ClinicDB & ETIS-Larib | ETIS-Larib | Inception-A (×2) | × | × | Concatenation | 88.49
CVC-ClinicDB & ETIS-Larib | ETIS-Larib | Inception-A (×2) | ✓ | × | Summation | 88.03
CVC-ClinicDB & ETIS-Larib | ETIS-Larib | Stem & Inception-A (×1) | × | B.Norm | Concatenation | 87.82
CVC-ClinicDB & ETIS-Larib | ETIS-Larib | Stem & Inception-A (×1) | ✓ | × | Summation | 89.46
CVC-ClinicDB & ETIS-Larib | ETIS-Larib | Inception-A (×2) | ✓ | L2.Norm | Average | 89.00
CVC-ClinicDB & ETIS-Larib | ETIS-Larib | Inception-A (×2) | ✓ | B.Norm | Average | 88.94
CVC-ClinicDB & ETIS-Larib | ETIS-Larib | Inception-A (×2) | ✓ | L2.Norm | Concatenation | 91.10
CVC-ClinicDB & ETIS-Larib | ETIS-Larib | Stem & Inception-A (×2) | × | B.Norm | Concatenation | 87.73
CVC-ClinicDB & ETIS-Larib | ETIS-Larib | Stem & Inception-A (×2) | ✓ | × | Summation | 89.41
CVC-ClinicDB & ETIS-Larib | ETIS-Larib | Stem & Inception-A (×2) | ✓ | L2.Norm | Average | 90.05
Training Data | Method | Backbone | Input Size | Pre-Train | FPS | mAP@0.5 (%)
---|---|---|---|---|---|---
WCE | SSD300 | VGG16 | 300 × 300 × 3 | ✓ | 46 | 77.2
WCE | SSD300 | ResNet-101 | 300 × 300 × 3 | ✓ | 47.3 | 81.65
WCE | SSD500 | VGG16 | 500 × 500 × 3 | ✓ | 19 | 79.45
WCE | SSD500 | ResNet-101 | 500 × 500 × 3 | ✓ | 20 | 84.95
WCE | FSSD300 | VGG16 | 300 × 300 × 3 | ✓ | 65.9 | 89.78
WCE | FSSD500 | VGG16 | 500 × 500 × 3 | ✓ | 69.6 | 88.71
WCE | DF-SSD300 [41] | DenseNet-S-32-1 | 300 × 300 × 3 | ✓ | 11.6 | 91.24
WCE | L_SSD [58] | ResNet-101 | 224 × 224 × 3 | ✓ | 40 | 89.98
WCE | MP-FSSD [18] | VGG16 | 300 × 300 × 3 | ✓ | 62.57 | 93.4
WCE | Hyb-SSDNet (ours) | Inception v4 | 299 × 299 × 3 | ✓ | 44.5 | 93.29

Training Data | Method | Backbone | Input Size | Pre-Train | FPS | mAP@0.5 CVC-ClinicDB (%) | mAP@0.5 ETIS-Larib (%)
---|---|---|---|---|---|---|---
CVC-ClinicDB & ETIS-Larib | SSD300 | VGG16 | 300 × 300 × 3 | ✓ | 46 | 74.5 | 74.12
CVC-ClinicDB & ETIS-Larib | SSD300 | ResNet-101 | 300 × 300 × 3 | ✓ | 47.3 | 78.85 | 75.73
CVC-ClinicDB & ETIS-Larib | SSD500 | VGG16 | 500 × 500 × 3 | ✓ | 19 | 78.38 | 75.45
CVC-ClinicDB & ETIS-Larib | SSD500 | ResNet-101 | 500 × 500 × 3 | ✓ | 20 | 82.74 | 80.14
CVC-ClinicDB & ETIS-Larib | FSSD300 | VGG16 | 300 × 300 × 3 | ✓ | 65.9 | 87.26 | 86.3
CVC-ClinicDB & ETIS-Larib | FSSD500 | VGG16 | 500 × 500 × 3 | ✓ | 69.6 | 87.54 | 86.92
CVC-ClinicDB & ETIS-Larib | DF-SSD300 [41] | DenseNet-S-32-1 | 300 × 300 × 3 | ✓ | 11.6 | 89.92 | 86.84
CVC-ClinicDB & ETIS-Larib | L_SSD [58] | ResNet-101 | 224 × 224 × 3 | ✓ | 40 | 88.18 | 87.23
CVC-ClinicDB & ETIS-Larib | MP-FSSD [18] | VGG16 | 300 × 300 × 3 | ✓ | 62.57 | 89.82 | 90
CVC-ClinicDB & ETIS-Larib | Hyb-SSDNet (ours) | Inception v4 | 299 × 299 × 3 | ✓ | 44.5 | 91.93 | 91.10
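The mAP@0.5 criterion used throughout these comparisons counts a prediction as a true positive when its intersection-over-union (IoU) with a ground-truth box reaches 0.5, as in the qualitative figures. A minimal sketch of that matching rule (function names are illustrative, boxes are (x1, y1, x2, y2) corner coordinates):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)   # overlap area (0 if disjoint)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def is_true_positive(pred_box, gt_box, thresh=0.5):
    """Detection counted as correct when IoU with ground truth >= thresh."""
    return iou(pred_box, gt_box) >= thresh
```

For example, a prediction shifted by half its width against a same-sized ground-truth box yields an IoU of 1/3 and would be counted as a false positive under this threshold.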
Training Dataset | Method | Testing Dataset | Backbone Network | Pre-Train | Precision | Recall | F1 Score
---|---|---|---|---|---|---|---
WCE images | Hyb-SSDNet (ours) | WCE images | Inception v4 | ✓ | 93.29% (mAP) | 89.4% (mAR) | 91.5% (mAF)
ETIS-Larib + CVC-ClinicDB | Hyb-SSDNet (ours) | CVC-ClinicDB | Inception v4 | ✓ | 91.93% (mAP) | 89.5% (mAR) | 90.8% (mAF)
ETIS-Larib + CVC-ClinicDB | Hyb-SSDNet (ours) | ETIS-Larib | Inception v4 | ✓ | 91.10% (mAP) | 87% (mAR) | 89% (mAF)
WCE + CVC-ClinicDB | Souaidi et al., 2022 [18] | ETIS-Larib | VGG16 | ✓ | 90.02% (mAP) | × | ×
CVC-ClinicDB + ETIS-Larib | Shin et al., 2018 [59] | ETIS-Larib | Inception ResNet | ✓ | 92.2% | 69.7% | 79.4%
SUN + PICCOLO + CVC-ClinicDB | Ishak et al., 2021 [32] | ETIS-Larib | YOLOv3 | ✓ | 90.61% | 91.04% | 90.82%
CVC-ClinicDB | Liu et al., 2021 [60] | ETIS-Larib | ResNet-101 | ✓ | 77.80% | 87.50% | 82.40%
GIANA 2017 | Wang et al., 2019 [61] | ETIS-Larib | AFP-Net (VGG16) | ✓ | 88.89% | 80.7% | 84.63%
CVC-ClinicDB | Qadir et al., 2021 [62] | ETIS-Larib | ResNet34 | ✓ | 86.54% | 86.12% | 86.33%
CVC-ClinicDB | Pacal and Karaboga, 2021 [63] | ETIS-Larib | CSPDarkNet53 | ✓ | 91.62% | 82.55% | 86.85%
CVC-ClinicDB | Wang et al., 2019 [61] | ETIS-Larib | Faster R-CNN (VGG16) | × | 88.89% | 80.77% | 84.63%
CVC-VideoClinicDB | Krenzer et al., 2022 [64] | CVC-VideoClinicDB | YOLOv5 | × | 73.21% (mAP) | × | 79.55%
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Souaidi, M.; El Ansari, M. Multi-Scale Hybrid Network for Polyp Detection in Wireless Capsule Endoscopy and Colonoscopy Images. Diagnostics 2022, 12, 2030. https://doi.org/10.3390/diagnostics12082030