research-article

Open access

Fast depth densification for occlusion-aware augmented reality

Authors:

Aleksander Holynski,

Johannes Kopf

ACM Transactions on Graphics (TOG), Volume 37, Issue 6

Article No.: 194, Pages 1 - 11

https://doi.org/10.1145/3272127.3275083

Published: 04 December 2018

Abstract

Current AR systems only track sparse geometric features but do not compute depth for all pixels. For this reason, most AR effects are pure overlays that can never be occluded by real objects. We present a novel algorithm that propagates sparse depth to every pixel in near real time. The produced depth maps are spatio-temporally smooth but exhibit sharp discontinuities at depth edges. This enables AR effects that can fully interact with and be occluded by the real scene. Our algorithm uses a video and a sparse SLAM reconstruction as input. It starts by estimating soft depth edges from the gradient of optical flow fields. Because optical flow is unreliable near occlusions, we compute forward and backward flow fields and fuse the resulting depth edges using a novel reliability measure. We then localize the depth edges by thinning and aligning them with image edges. Finally, we optimize the propagated depth smoothly but encourage discontinuities at the recovered depth edges. We present results for numerous real-world examples and demonstrate the effectiveness for several occlusion-aware AR video effects. To quantitatively evaluate our algorithm, we characterize the properties that make depth maps desirable for AR applications, and present novel evaluation metrics that capture how well these are satisfied. Our results compare favorably to a set of competitive baseline algorithms in this context.
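The core idea in the abstract (propagate sparse depth smoothly, but let it break where the optical flow jumps) can be illustrated with a small sketch. This is not the authors' implementation: the function names `link_weight` and `densify`, the exponential weighting, the parameters `beta` and `data_weight`, and the Gauss-Seidel solver are all assumptions chosen for brevity. The forward/backward flow fusion, edge thinning, image-edge alignment, and temporal smoothness described in the abstract are omitted.

```python
import math


def link_weight(f0, f1, beta):
    """Smoothness weight between two neighbouring pixels (hypothetical form).

    A large jump in optical flow between the pixels is treated as a soft
    depth edge, which drives the weight towards zero.
    """
    jump = math.hypot(f1[0] - f0[0], f1[1] - f0[1])
    return math.exp(-beta * jump)


def densify(sparse, flow, beta=8.0, data_weight=10.0, iters=500):
    """Fill in depth for every pixel from a few sparse samples.

    sparse[y][x] is a depth value or None; flow[y][x] is a (u, v) tuple.
    Minimises a weighted least-squares energy (data term at the samples,
    edge-aware smoothness between 4-neighbours) with Gauss-Seidel sweeps.
    """
    h, w = len(sparse), len(sparse[0])
    known = [v for row in sparse for v in row if v is not None]
    init = sum(known) / len(known)
    depth = [[init if v is None else v for v in row] for row in sparse]
    for _ in range(iters):
        for y in range(h):
            for x in range(w):
                num = den = 0.0
                if sparse[y][x] is not None:
                    # Data term: stay close to the sparse sample.
                    num += data_weight * sparse[y][x]
                    den += data_weight
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w:
                        # Smoothness term, weakened across flow jumps.
                        wt = link_weight(flow[y][x], flow[ny][nx], beta)
                        num += wt * depth[ny][nx]
                        den += wt
                depth[y][x] = num / den
    return depth


# Toy scene: a near object on the left (large flow), background on the right.
H, W = 4, 6
flow = [[(2.0, 0.0) if x < 3 else (0.5, 0.0) for x in range(W)] for _ in range(H)]
sparse = [[None] * W for _ in range(H)]
sparse[1][0] = 1.0  # one sparse sample on the near object
sparse[2][5] = 4.0  # one sparse sample on the background

dense = densify(sparse, flow)
for row in dense:
    print(" ".join(f"{d:.2f}" for d in row))
```

In this toy scene the two sparse samples spread over their own side of the flow discontinuity, so the left three columns stay close to 1.0 and the right three close to 4.0, with a sharp step between them. Setting `beta=0.0` removes the edge term and yields a smooth ramp across the whole grid instead, which is the behaviour that prevents clean occlusion boundaries.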

Index Terms

Fast depth densification for occlusion-aware augmented reality
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Reconstruction
        Video segmentation
      2. Image and video acquisition
        Computational photography
  2. Computer graphics
    1. Graphics systems and interfaces
      1. Mixed / augmented reality

Published In

ACM Transactions on Graphics Volume 37, Issue 6

December 2018

1401 pages

ISSN: 0730-0301

EISSN: 1557-7368

DOI: 10.1145/3272127

Editor:
Takeo Igarashi
The University of Tokyo, Japan

Copyright © 2018 Owner/Author.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States
