Weakly supervised multilabel classification for semantic interpretation of endoscopy video frames

Michael D. Vasilakakis¹,
Dimitris Diamantis¹,
Evaggelos Spyrou¹,
Anastasios Koulaouzidis² &
…
Dimtris K. Iakovidis ORCID: orcid.org/0000-0002-5027-5323¹

Abstract

Several studies have addressed the problem of abnormality detection in medical images using computer-based systems. The impact of such systems on clinical practice and on society can be high, considering that they can contribute to the reduction of medical errors and the associated adverse events. Today, most of these systems are based on binary classification algorithms that are “strongly” supervised, in the sense that the abnormal training images need to be annotated in detail, i.e., with pixel-level annotations indicating the location of the abnormalities. However, this approach usually does not take into account the diversity of the image content, which may include a variety of structures and artifacts. In the context of gastrointestinal video-endoscopy, addressed in this study, the semantics of the normal contents of endoscopic video frames include normal mucosal tissues, bubbles, debris and the hole of the lumen, whereas abnormal video frames may include additional semantics corresponding to lesions or blood. This observation motivated us to investigate various multi-label classification methods, aiming at a richer semantic interpretation of the endoscopic images. Among them, an image-saliency-enabled bag-of-words approach and a convolutional neural network architecture enabling multi-scale feature extraction (MM-CNN) are presented. Weakly supervised learning is implemented using only semantic-level annotations, i.e., meaningful keywords, thus avoiding the need for resource-demanding pixel-wise annotation of the training images. Experiments were performed on a diverse set of wireless capsule endoscopy images. The results validate that weakly supervised multi-label classification can provide enhanced discrimination of gastrointestinal abnormalities, with the MM-CNN method providing the best performance.
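As a rough illustration of what semantic-level (keyword) annotation means for a multi-label classifier, the following minimal Python sketch encodes per-frame keywords as multi-hot target vectors and derives an abnormality decision from predicted labels. It is not the authors' implementation: the label identifiers, the 0.5 threshold, the frame names and the helper functions are all hypothetical, and the classifiers named in the abstract (the bag-of-words model and MM-CNN) are not reproduced here.

```python
from typing import Dict, List, Set

# Label vocabulary based on the semantics named in the abstract.
# The short identifiers themselves are an assumption of this sketch.
LABELS: List[str] = ["mucosa", "bubbles", "debris", "lumen", "lesion", "blood"]
ABNORMAL: Set[str] = {"lesion", "blood"}


def encode(keywords: Set[str]) -> List[int]:
    """Turn a frame's keyword set into a multi-hot target vector."""
    unknown = keywords - set(LABELS)
    if unknown:
        raise ValueError(f"unknown keywords: {sorted(unknown)}")
    return [1 if label in keywords else 0 for label in LABELS]


def decode(scores: List[float], threshold: float = 0.5) -> Set[str]:
    """Keep every label whose per-label score reaches the threshold."""
    return {label for label, s in zip(LABELS, scores) if s >= threshold}


def is_abnormal(predicted: Set[str]) -> bool:
    """A frame counts as abnormal if any abnormal label is predicted."""
    return bool(predicted & ABNORMAL)


# Hypothetical annotations: frame id -> keywords, with no pixel-level masks.
annotations: Dict[str, Set[str]] = {
    "frame_001": {"mucosa", "bubbles"},
    "frame_002": {"mucosa", "lumen", "blood"},
}
targets = {fid: encode(kw) for fid, kw in annotations.items()}
```

In this sketch, any multi-label model that outputs one score per label could be trained against `targets`; `decode` followed by `is_abnormal` then collapses its per-label output into the binary normal/abnormal decision that strongly supervised systems predict directly.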

Notes

KID dataset: https://is-innovation.eu/kid/.

Acknowledgements

The research presented in this paper was financially supported by the project “Klearchos Koulaouzidis” Grant no. 5151 and the Special Account of Research Grants of the University of Thessaly, Greece.

Author information

Authors and Affiliations

Department of Computer Science and Biomedical Informatics, University of Thessaly, Papasiopoulou 2-4, 35131, Lamia, Greece
Michael D. Vasilakakis, Dimitris Diamantis, Evaggelos Spyrou & Dimtris K. Iakovidis
Endoscopy Unit, The Royal Infirmary of Edinburgh, Edinburgh, UK
Anastasios Koulaouzidis

Corresponding author

Correspondence to Dimtris K. Iakovidis.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Vasilakakis, M.D., Diamantis, D., Spyrou, E. et al. Weakly supervised multilabel classification for semantic interpretation of endoscopy video frames. Evolving Systems 11, 409–421 (2020). https://doi.org/10.1007/s12530-018-9236-x

Received: 05 January 2018
Accepted: 17 May 2018
Published: 25 May 2018
Issue Date: September 2020
DOI: https://doi.org/10.1007/s12530-018-9236-x
