
DOI: 10.1145/3474085.3475595

Fast, High-Quality Hierarchical Depth-Map Super-Resolution

Published: 17 October 2021

Abstract

The low spatial resolution of acquired depth maps is a major drawback of most RGBD sensors, yet many scenarios call for the fast acquisition of high-resolution, high-quality depth maps. One way to obtain higher-quality depth maps is super-resolution; however, edge preservation is challenging, and artifacts such as depth confusion and blurring are easily introduced near boundaries. To address this, we propose a method for fast, high-quality hierarchical depth-map super-resolution (HDS). In our method, a high-resolution RGB image is degraded layer by layer to guide the bilateral filtering of the depth map. To improve the quality of the upsampled depth map, we construct a feature-based bilateral filter (FBF) for the interpolation, using the extracted RGB shallow and multi-layer features. To accelerate the process, we perform filtering only near depth boundaries and through matrix operations. We also extend our HDS model to a Classification-based Hierarchical Depth-map Super-resolution (C-HDS) model, in which a context-aware trilateral filter reduces the contributions of unreliable neighbors to each missing depth location. Experimental results show that the proposed method generates high-resolution depth maps significantly faster than existing methods, while also significantly improving depth quality over the current state of the art, especially for large-scale 16× super-resolution.
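The guided-filtering principle the abstract describes (weighting low-resolution depth samples by both spatial distance and similarity in a high-resolution RGB guide) can be illustrated with a minimal joint bilateral upsampling sketch. This is the classical baseline technique, not the paper's FBF or C-HDS pipeline, and every function and parameter name below is illustrative:

```python
import numpy as np

def joint_bilateral_upsample(depth_lr, guide_hr, radius=2, sigma_s=1.0, sigma_r=0.1):
    """Upsample a low-res depth map using a high-res (grayscale) guide image.

    Each high-res output pixel is a weighted average of nearby low-res depth
    samples: a spatial Gaussian times a range Gaussian on guide-intensity
    difference, so depth is smoothed within regions but not across RGB edges.
    Brute-force loops for clarity; the paper instead restricts filtering to
    depth boundaries and uses matrix operations for speed.
    """
    H, W = guide_hr.shape
    h, w = depth_lr.shape
    sy, sx = H / h, W / w  # upsampling factors per axis
    out = np.zeros((H, W), dtype=np.float64)
    for y in range(H):
        for x in range(W):
            cy, cx = y / sy, x / sx          # position in low-res coordinates
            y0, x0 = round(cy), round(cx)    # nearest low-res sample
            wsum, vsum = 0.0, 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y0 + dy, x0 + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        # spatial weight, measured in low-res coordinates
                        ws = np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2)
                                    / (2 * sigma_s ** 2))
                        # range weight from the high-res guide image
                        gy = min(int(yy * sy), H - 1)
                        gx = min(int(xx * sx), W - 1)
                        wr = np.exp(-((guide_hr[y, x] - guide_hr[gy, gx]) ** 2)
                                    / (2 * sigma_r ** 2))
                        wsum += ws * wr
                        vsum += ws * wr * depth_lr[yy, xx]
            out[y, x] = vsum / wsum if wsum > 0 else depth_lr[y0, x0]
    return out
```

A small `sigma_r` makes the range term dominate, so a depth discontinuity aligned with an RGB edge stays sharp instead of being blurred across the boundary; this is the edge-preservation behavior the paper's feature-based and trilateral filters refine further.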



Published In

MM '21: Proceedings of the 29th ACM International Conference on Multimedia
October 2021
5796 pages
ISBN:9781450386517
DOI:10.1145/3474085
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. context-aware trilateral filter
  2. edge preservation
  3. feature-based bilateral filter
  4. hierarchical depth-map super-resolution

Qualifiers

  • Research-article

Funding Sources

  • EPSRC CAMERA

Conference

MM '21: ACM Multimedia Conference
October 20 - 24, 2021
Virtual Event, China

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%


Cited By
  • Unpaired Depth Super-Resolution in the Wild. IEEE Access 12 (2024), 123322–123338. https://doi.org/10.1109/ACCESS.2024.3444452
  • Enhancement of guided thermal image super-resolution approaches. Neurocomputing 573:C (16 May 2024). https://doi.org/10.1016/j.neucom.2023.127197
  • Guided Depth Map Super-Resolution: A Survey. ACM Computing Surveys 55, 14s (17 July 2023), 1–36. https://doi.org/10.1145/3584860
  • Improved Upsampling Based Depth Image Super-Resolution Reconstruction. IEEE Access 11 (2023), 46782–46792. https://doi.org/10.1109/ACCESS.2023.3274966
