research-article
Open access

Fast depth densification for occlusion-aware augmented reality

Published: 04 December 2018

Abstract

Current AR systems track only sparse geometric features and do not compute depth for all pixels. For this reason, most AR effects are pure overlays that can never be occluded by real objects. We present a novel algorithm that propagates sparse depth to every pixel in near real time. The produced depth maps are spatio-temporally smooth but exhibit sharp discontinuities at depth edges. This enables AR effects that can fully interact with and be occluded by the real scene. Our algorithm uses a video and a sparse SLAM reconstruction as input. It starts by estimating soft depth edges from the gradient of optical flow fields. Because optical flow is unreliable near occlusions, we compute forward and backward flow fields and fuse the resulting depth edges using a novel reliability measure. We then localize the depth edges by thinning them and aligning them with image edges. Finally, we optimize the propagated depth to be smooth while encouraging discontinuities at the recovered depth edges. We present results for numerous real-world examples and demonstrate the algorithm's effectiveness for several occlusion-aware AR video effects. To quantitatively evaluate our algorithm, we characterize the properties that make depth maps desirable for AR applications and present novel evaluation metrics that capture how well these properties are satisfied. Our results compare favorably to a set of competitive baseline algorithms in this context.
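
The final propagation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a tiny grid, treats the sparse SLAM samples as hard constraints, and uses plain Gauss-Seidel iterations as a stand-in solver. The function name `densify_depth`, the weight formula, and the `eps` parameter are hypothetical choices made for illustration only. Each unknown pixel is repeatedly set to the weighted average of its four neighbours, and the weight falls to almost zero wherever a soft depth edge is present, so depth spreads smoothly within a surface but does not leak across an edge.

```python
def densify_depth(sparse_depth, edge_strength, iterations=500, eps=1e-4):
    """Propagate sparse depth samples to every pixel of a small grid.

    sparse_depth:  dict mapping (row, col) -> known depth (kept fixed).
    edge_strength: 2D list with values in [0, 1]; 1 marks a confident depth edge.
    Both the weighting scheme and the solver are illustrative assumptions.
    """
    height, width = len(edge_strength), len(edge_strength[0])
    # Initialise unknown pixels with the mean of the known samples.
    mean_depth = sum(sparse_depth.values()) / len(sparse_depth)
    depth = [[sparse_depth.get((r, c), mean_depth) for c in range(width)]
             for r in range(height)]

    for _ in range(iterations):
        for r in range(height):
            for c in range(width):
                if (r, c) in sparse_depth:
                    continue  # sparse samples act as hard constraints
                weighted_sum = 0.0
                weight_total = 0.0
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if 0 <= nr < height and 0 <= nc < width:
                        # Smoothness weight drops to eps next to a depth edge.
                        w = eps + 1.0 - max(edge_strength[r][c],
                                            edge_strength[nr][nc])
                        weighted_sum += w * depth[nr][nc]
                        weight_total += w
                # Gauss-Seidel update: weighted average of the neighbours.
                depth[r][c] = weighted_sum / weight_total
    return depth


# Toy scene: a 5x7 grid with a vertical depth edge in column 3.
edges = [[1.0 if c == 3 else 0.0 for c in range(7)] for r in range(5)]
samples = {(2, 1): 1.0, (2, 5): 5.0}  # one near sample, one far sample
dense = densify_depth(samples, edges)
```

In this toy scene the pixels to the left of the edge settle close to the near sample and those to the right settle close to the far sample. Setting every entry of `edges` to zero would instead blend the two depths smoothly across the grid, which is the behaviour the recovered depth edges are meant to prevent.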

Published In

ACM Transactions on Graphics, Volume 37, Issue 6
December 2018
1401 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/3272127
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. 3D reconstruction
  2. augmented reality
  3. depth estimation
  4. simultaneous localization and mapping
  5. video analysis

Qualifiers

  • Research-article

Article Metrics

  • Downloads (last 12 months): 401
  • Downloads (last 6 weeks): 76

Reflects downloads up to 23 Nov 2024

Cited By

  • (2024) From Real to Virtual: Exploring Replica-Enhanced Environment Transitions along the Reality-Virtuality Continuum. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (1-13). DOI: 10.1145/3613904.3642844. Online publication date: 11-May-2024
  • (2024) ARTiST: Automated Text Simplification for Task Guidance in Augmented Reality. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (1-24). DOI: 10.1145/3613904.3642772. Online publication date: 11-May-2024
  • (2024) RGB Guided ToF Imaging System: A Survey of Deep Learning-Based Methods. International Journal of Computer Vision 132:11 (4954-4991). DOI: 10.1007/s11263-024-02089-5. Online publication date: 1-Nov-2024
  • (2024) Are Multi-view Edges Incomplete for Depth Estimation? International Journal of Computer Vision 132:7 (2639-2673). DOI: 10.1007/s11263-023-01890-y. Online publication date: 1-Jul-2024
  • (2024) Feature distribution normalization network for multi-view stereo. The Visual Computer. DOI: 10.1007/s00371-024-03334-1. Online publication date: 17-Apr-2024
  • (2024) OGNI-DC: Robust Depth Completion with Optimization-Guided Neural Iterations. Computer Vision – ECCV 2024 (78-95). DOI: 10.1007/978-3-031-72646-0_5. Online publication date: 29-Sep-2024
  • (2023) Environment-Aware Rendering and Interaction in Web-Based Augmented Reality. Journal of Imaging 9:3 (63). DOI: 10.3390/jimaging9030063. Online publication date: 8-Mar-2023
  • (2023) Integrating Both Parallax and Latency Compensation into Video See-through Head-mounted Display. IEEE Transactions on Visualization and Computer Graphics 29:5 (2826-2836). DOI: 10.1109/TVCG.2023.3247460. Online publication date: 27-Feb-2023
  • (2023) Occlusion Handling in Augmented Reality: Past, Present and Future. IEEE Transactions on Visualization and Computer Graphics 29:2 (1590-1609). DOI: 10.1109/TVCG.2021.3117866. Online publication date: 1-Feb-2023
  • (2023) Integrated Registration and Occlusion Handling Based on Deep Learning for Augmented-Reality-Assisted Assembly Instruction. IEEE Transactions on Industrial Informatics 19:5 (6825-6835). DOI: 10.1109/TII.2022.3189428. Online publication date: May-2023
