Abstract
Augmented Reality applications are set to revolutionize the smartphone industry due to the integration of RGB-D sensors into mobile devices. Given the large number of smartphone users, efficient storage and transmission of RGB-D data is of paramount interest to the research community. While video coding standards such as HEVC and H.264/AVC exist for compressing the RGB/texture component, the coding of depth data is still an area of active research. This paper presents a method for coding depth videos, captured from mobile RGB-D sensors, by planar segmentation. The segmentation algorithm is based on Markov Random Field assumptions on depth data and is solved using Graph Cuts. While all prior works based on this approach are restricted to single images under noise-free conditions, this paper presents an efficient solution to planar segmentation in noisy depth videos. Also presented is a unique method to encode depth based on its segmented planar representation. Experiments on depth captured from a noisy sensor (Microsoft Kinect) show superior Rate-Distortion performance over the 3D extension of the HEVC codec.
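The abstract does not spell out the encoder, so the following is only a minimal sketch of why a planar representation is compact, not the authors' codec. It relies on a standard pinhole-camera fact: for a planar surface, inverse depth 1/Z is an affine function of the pixel coordinates (u, v), so one segment can be summarized by three numbers. The function names (`solve3`, `fit_inverse_depth_plane`, `decode_depth`) and the synthetic patch are hypothetical, and the segmentation itself (MRF with Graph Cuts) is assumed to have already produced the pixel set.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int]


def solve3(A: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    M = [[float(x) for x in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate segment: pixels are collinear or too few")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(3):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][3] / M[i][i] for i in range(3)]


def fit_inverse_depth_plane(pixels: Sequence[Pixel],
                            depths: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares fit of 1/Z = a*u + b*v + c over one segment."""
    AtA = [[0.0] * 3 for _ in range(3)]
    Atb = [0.0] * 3
    for (u, v), z in zip(pixels, depths):
        if z <= 0:  # skip invalid (missing) depth samples
            continue
        row = (float(u), float(v), 1.0)
        for i in range(3):
            Atb[i] += row[i] / z
            for j in range(3):
                AtA[i][j] += row[i] * row[j]
    a, b, c = solve3(AtA, Atb)
    return a, b, c


def decode_depth(params: Tuple[float, float, float],
                 pixels: Sequence[Pixel]) -> List[float]:
    """Reconstruct depth at each pixel from the three plane parameters."""
    a, b, c = params
    return [1.0 / (a * u + b * v + c) for u, v in pixels]


# Synthetic 4x4 planar patch (illustrative values, not data from the paper).
true_params = (0.001, -0.002, 0.5)
pixels = [(u, v) for v in range(4) for u in range(4)]
depths = decode_depth(true_params, pixels)

params = fit_inverse_depth_plane(pixels, depths)
decoded = decode_depth(params, pixels)
max_err = max(abs(d - z) for d, z in zip(decoded, depths))
```

In this toy setting the 16 depth samples are replaced by three parameters plus the segment's shape. With real sensor noise the fit is only approximate, which is where a rate-distortion trade-off of the kind evaluated in the paper would come in.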
Notes
where T denotes matrix transpose.
Cite this article
Kumar, S.H., Ramakrishnan, K.R. Depth compression via planar segmentation. Multimed Tools Appl 78, 6529–6558 (2019). https://doi.org/10.1007/s11042-018-6327-4
DOI: https://doi.org/10.1007/s11042-018-6327-4