
De-redundancy in wireless capsule endoscopy video sequences using correspondence matching and motion analysis

Published in: Multimedia Tools and Applications

Abstract

Removing redundancy (de-redundancy) from wireless capsule endoscopy (WCE) video is a challenging task. This paper proposes a scheme, called SS-VCF-Der, that applies flow field estimation between successive WCE frames to motion analysis of WCE imaging, and then addresses the WCE de-redundancy problem based on the results of that analysis. To this end, we exploit a self-supervised technique that learns interframe visual correspondence representations from large amounts of raw WCE video without manual supervision and predicts the flow field. Our key idea is to use the natural spatio-temporal coherence in color and the cycle consistency in time of WCE videos as free supervisory signals to learn WCE visual correspondence relations from scratch. We call this procedure self-supervised visual correspondence flow learning (SS-VCF). At training time, we optimize the model with three losses: a forward-backward cycle-consistency loss, a visual similarity loss, and a color loss. At test time, we use the learned representation to generate a flow field that describes pixel movement between two successive WCE frames. From the estimated flow fields we then compute the motion intensity between successive frames, and our proposed de-redundancy method, SS-VCF-MI, selects frames with distinct scene changes in a local neighborhood as key frames, achieving de-redundancy. Extensive experiments on our collected WCE-2019-Video dataset show that the scheme achieves promising results, verifying its effectiveness for visual correspondence representation and redundancy removal in WCE videos.
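The motion-analysis step described above can be sketched as follows. This is a minimal illustration, not the paper's exact SS-VCF-MI formulation: the helper names `motion_intensity` and `select_key_frames`, the neighborhood window of five frame pairs, and the `ratio` threshold are all illustrative assumptions.

```python
import numpy as np

def motion_intensity(flow):
    # flow: (H, W, 2) array of per-pixel displacements between two
    # successive frames. Summarize motion as the mean flow magnitude.
    return float(np.mean(np.linalg.norm(flow, axis=-1)))

def select_key_frames(flows, ratio=1.5):
    # flows[i] is the estimated flow field between frame i and frame i+1.
    # Mark frame i+1 as a key frame when its motion intensity clearly
    # exceeds the mean intensity of a local neighborhood of frame pairs,
    # i.e. a distinct local scene change (illustrative criterion).
    intensities = [motion_intensity(f) for f in flows]
    keys = [0]  # always keep the first frame
    for i, m in enumerate(intensities):
        lo, hi = max(0, i - 2), min(len(intensities), i + 3)
        local_mean = np.mean(intensities[lo:hi])
        if m > ratio * local_mean:
            keys.append(i + 1)
    return keys
```

Frames whose indices are not returned are treated as redundant and dropped; in the actual scheme the flow fields would come from the learned SS-VCF model rather than a classical estimator.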



Notes

  1. We do not distinguish between the terms unsupervised and self-supervised, as both refer to learning without human supervision. In this paper, however, we use the term self-supervised learning for WCE video representation.


Acknowledgements

This work is supported in part by the Scientific Research Foundation of Chongqing University of Technology (0103210650), in part by the National Key Research and Development Program of China (Grant No. 2017YFB0802400), in part by the National Natural Science Foundation of China research fund (61672115), in part by the Chongqing Social Undertakings and Livelihood Security Science and Technology Innovation Project Special Program (cstc2017shmsA30003), and in part by the Humanity and Social Science Youth Foundation, Ministry of Education (Grant No. 17YJCZH043). In addition, we thank Juan Zhou and her colleagues from the Second Affiliated Hospital, Third Military Medical University, for the helpful discussions and suggestions. We also thank the Chongqing Jinshan Science & Technology (Group) Co., Ltd., for providing vital support with raw WCE videos. We would also like to thank the anonymous reviewers for their helpful comments which have led to many improvements in this paper.

Author information


Corresponding author

Correspondence to Chunxiao Ye.

Ethics declarations

Competing Interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Lan, L., Ye, C., Liao, C. et al. De-redundancy in wireless capsule endoscopy video sequences using correspondence matching and motion analysis. Multimed Tools Appl 83, 21171–21195 (2024). https://doi.org/10.1007/s11042-023-15530-7

