Computer Science > Computer Vision and Pattern Recognition

arXiv:2403.17915 (cs)

[Submitted on 26 Mar 2024 (v1), last revised 20 Aug 2024 (this version, v4)]

Title: Leveraging Near-Field Lighting for Monocular Depth Estimation from Endoscopy Videos

Authors: Akshay Paruchuri, Samuel Ehrenstein, Shuxian Wang, Inbar Fried, Stephen M. Pizer, Marc Niethammer, Roni Sengupta

Abstract: Monocular depth estimation in endoscopy videos can enable assistive and robotic surgery to obtain better coverage of the organ and detection of various health issues. Despite promising progress on mainstream, natural image depth estimation, techniques perform poorly on endoscopy images due to a lack of strong geometric features and challenging illumination effects. In this paper, we utilize the photometric cues, i.e., the light emitted from an endoscope and reflected by the surface, to improve monocular depth estimation. We first create two novel loss functions with supervised and self-supervised variants that utilize a per-pixel shading representation. We then propose a novel depth refinement network (PPSNet) that leverages the same per-pixel shading representation. Finally, we introduce teacher-student transfer learning to produce better depth maps from both synthetic data with supervision and clinical data with self-supervision. We achieve state-of-the-art results on the C3VD dataset while estimating high-quality depth maps from clinical data. Our code, pre-trained models, and supplementary materials can be found on our project page.
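
To illustrate why near-field lighting carries a depth cue, here is a minimal sketch of a shading value at a single surface point. It is not the paper's formulation: the function name `per_pixel_shading`, the point light placed at the camera centre, the inverse-square falloff, and the unit-albedo Lambertian surface are all assumptions made for this example.

```python
import math


def per_pixel_shading(point, normal, light_pos=(0.0, 0.0, 0.0)):
    """Shading at one surface point under a near-field point light.

    Assumptions (illustrative only, not taken from the paper): the light
    sits at `light_pos` (by default the camera centre), its intensity falls
    off with the inverse square of distance, and the surface is Lambertian
    with unit albedo.
    """
    # Vector from the surface point towards the light source.
    to_light = [l - p for l, p in zip(light_pos, point)]
    dist = math.sqrt(sum(c * c for c in to_light))
    if dist == 0.0:
        raise ValueError("surface point coincides with the light source")
    light_dir = [c / dist for c in to_light]

    # Normalise the surface normal so the dot product is a true cosine.
    normal_len = math.sqrt(sum(c * c for c in normal))
    if normal_len == 0.0:
        raise ValueError("surface normal must be non-zero")
    unit_normal = [c / normal_len for c in normal]

    # Inverse-square attenuation: closer surfaces receive more light.
    attenuation = 1.0 / (dist * dist)

    # Lambertian cosine term, clamped so back-facing surfaces get no light.
    cos_term = max(0.0, sum(n * l for n, l in zip(unit_normal, light_dir)))
    return attenuation * cos_term


if __name__ == "__main__":
    # A surface patch facing the camera at depth 2, then at depth 4.
    print(per_pixel_shading((0.0, 0.0, 2.0), (0.0, 0.0, -1.0)))
    print(per_pixel_shading((0.0, 0.0, 4.0), (0.0, 0.0, -1.0)))
```

Under these assumptions, doubling the distance to a camera-facing patch quarters its shading value, so brightness varies with depth in a way that distant, uniform lighting would not produce.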

Comments:	Accepted to ECCV 2024. 27 pages, 8 tables, 8 figures. Updated to include reference to clinical dataset
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2403.17915 [cs.CV]
	(or arXiv:2403.17915v4 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2403.17915

Submission history

From: Akshay Paruchuri
[v1] Tue, 26 Mar 2024 17:52:23 UTC (3,223 KB)
[v2] Tue, 16 Jul 2024 06:44:04 UTC (2,810 KB)
[v3] Thu, 18 Jul 2024 04:27:38 UTC (3,053 KB)
[v4] Tue, 20 Aug 2024 18:17:30 UTC (3,053 KB)
