DOI: 10.1145/3581783.3613847
research-article

YOGA: Yet Another Geometry-based Point Cloud Compressor

Published: 27 October 2023

Abstract

We propose YOGA (Yet Another Geometry-based Point Cloud Compressor), a learning-based point cloud compressor. It is flexible, supporting separable lossy compression of geometry and color attributes and variable-rate coding with a single neural model. It is highly efficient, significantly outperforming the latest G-PCC standard both quantitatively and qualitatively, e.g., with 25% BD-BR gains when PCQM (Point Cloud Quality Metric) is used as the distortion measure. It is also lightweight, with a runtime similar to that of the G-PCC codec, owing to the use of sparse convolution and parallel entropy coding. To this end, YOGA adopts a unified end-to-end learning-based backbone for separate geometry and attribute compression. The backbone uses a two-layer structure: at the base layer, a downscaled thumbnail point cloud is encoded using G-PCC; at the enhancement layer, multiscale sparse convolutions are stacked on top of the G-PCC compressed priors to characterize spatial correlations and compactly represent the full-resolution sample. In addition, YOGA integrates adaptive quantization and an entropy model group to enable variable-rate control, as well as adaptive filters for better quality restoration.
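
The two-layer idea can be illustrated on geometry alone with a minimal sketch. This is not the authors' implementation: it assumes integer voxel coordinates and a power-of-two downscaling factor, it omits the G-PCC base-layer codec and the sparse-convolution network entirely, and the function names (`downscale_geometry`, `upscale_candidates`, `occupancy_labels`) are hypothetical. It only shows what a thumbnail is and what information an enhancement layer would still have to convey to recover the full-resolution geometry.

```python
import numpy as np


def downscale_geometry(coords: np.ndarray, factor: int) -> np.ndarray:
    """Thumbnail geometry: integer-divide voxel coordinates and drop duplicates."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    coarse = np.floor_divide(coords, factor)
    return np.unique(coarse, axis=0)


def upscale_candidates(coarse: np.ndarray, factor: int) -> np.ndarray:
    """List every full-resolution voxel covered by each thumbnail voxel."""
    axis = np.arange(factor)
    offsets = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)
    offsets = offsets.reshape(-1, 3)
    return (coarse[:, None, :] * factor + offsets[None, :, :]).reshape(-1, 3)


def occupancy_labels(candidates: np.ndarray, coords: np.ndarray) -> np.ndarray:
    """Mark which candidate voxels are occupied in the original point cloud.

    In a layered codec, this is the kind of information the enhancement
    layer must represent on top of the base-layer thumbnail.
    """
    occupied = {tuple(p) for p in coords.tolist()}
    return np.array([tuple(c) in occupied for c in candidates.tolist()], dtype=bool)


if __name__ == "__main__":
    # Made-up toy coordinates, for illustration only.
    points = np.array([[0, 0, 0], [1, 0, 0], [2, 3, 1], [7, 7, 7]])
    thumbnail = downscale_geometry(points, factor=2)
    candidates = upscale_candidates(thumbnail, factor=2)
    labels = occupancy_labels(candidates, points)
    print(thumbnail.shape, candidates.shape, int(labels.sum()))
```

In this toy setting the thumbnail is cheap to store, and the occupancy labels are the residual detail; a learned enhancement layer would aim to code such detail compactly by predicting it from the decoded thumbnail.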

Supplemental Material

MP4 File
Recently, the growth of new applications has created an urgent need for efficient and flexible point cloud compression. We propose YOGA (Yet Another Geometry-based Point Cloud Compressor), a learning-based method. YOGA adopts a unified end-to-end learning-based backbone for separate geometry and attribute compression. The backbone uses a two-layer structure: at the base layer, a downscaled thumbnail point cloud is encoded using G-PCC; at the enhancement layer, multiscale sparse convolutions are stacked on top of the G-PCC compressed priors to exploit spatial correlations and compactly represent the full-resolution sample. In addition, YOGA integrates adaptive quantization and an entropy model group to enable variable-rate control, as well as adaptive filters for better quality restoration. YOGA significantly outperforms the latest G-PCC standard both quantitatively and qualitatively, e.g., with 25% BD-BR gains when PCQM (Point Cloud Quality Metric) is used as the distortion measure.
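
For readers unfamiliar with BD-BR, the following is a minimal sketch of the usual Bjøntegaard delta-rate calculation: fit a cubic polynomial to log-rate as a function of quality for each codec, average both fits over the shared quality range, and convert the difference to a percentage. The function name `bd_rate` and the sample numbers are assumptions for illustration; they are not taken from the paper, and the paper's exact evaluation procedure may differ. The quality axis is assumed to be higher-is-better, so a lower-is-better score would need to be transformed first.

```python
import numpy as np


def bd_rate(rate_anchor, quality_anchor, rate_test, quality_test) -> float:
    """Average bitrate change (%) of the test codec relative to the anchor.

    Negative values mean the test codec needs fewer bits for equal quality.
    """
    log_ra = np.log(np.asarray(rate_anchor, dtype=float))
    log_rt = np.log(np.asarray(rate_test, dtype=float))
    qa = np.asarray(quality_anchor, dtype=float)
    qt = np.asarray(quality_test, dtype=float)

    # Cubic fit of log-rate as a function of quality, one fit per codec.
    poly_a = np.polyfit(qa, log_ra, 3)
    poly_t = np.polyfit(qt, log_rt, 3)

    # Only compare over the quality interval that both curves cover.
    lo = max(qa.min(), qt.min())
    hi = min(qa.max(), qt.max())
    if hi <= lo:
        raise ValueError("rate-distortion curves do not overlap in quality")

    # Mean log-rate of each fit over [lo, hi], via the antiderivative.
    int_a = np.polyint(poly_a)
    int_t = np.polyint(poly_t)
    mean_a = (np.polyval(int_a, hi) - np.polyval(int_a, lo)) / (hi - lo)
    mean_t = (np.polyval(int_t, hi) - np.polyval(int_t, lo)) / (hi - lo)

    return float((np.exp(mean_t - mean_a) - 1.0) * 100.0)


if __name__ == "__main__":
    # Made-up rate-quality points: the test codec uses 25% fewer bits
    # than the anchor at every quality level.
    quality = [30.0, 33.0, 36.0, 39.0]
    anchor_rate = [0.10, 0.20, 0.40, 0.80]
    test_rate = [0.75 * r for r in anchor_rate]
    print(bd_rate(anchor_rate, quality, test_rate, quality))
```

Because the made-up test curve spends 25% fewer bits at every quality level, the printed value should come out close to -25, which corresponds to what is commonly reported as a 25% BD-BR gain.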

Information & Contributors

Information

Published In

MM '23: Proceedings of the 31st ACM International Conference on Multimedia
October 2023
9913 pages
ISBN:9798400701085
DOI:10.1145/3581783
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. attribute
  2. geometry
  3. layered coding
  4. point cloud compression

Qualifiers

  • Research-article

Conference

MM '23: The 31st ACM International Conference on Multimedia
October 29 - November 3, 2023
Ottawa ON, Canada

Acceptance Rates

Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 568
  • Downloads (Last 6 weeks): 42
Reflects downloads up to 19 Nov 2024

Cited By

  • (2024) Laplacian Matrix Learning for Point Cloud Attribute Compression with Ternary Search-Based Adaptive Block Partition. Proceedings of the 32nd ACM International Conference on Multimedia, 10412-10420. DOI: 10.1145/3664647.3681615. Online publication date: 28-Oct-2024
  • (2024) An End-to-End ConvLSTM-based Method for Point Cloud Streaming Compression. 2024 International Conference on Advanced Robotics and Mechatronics (ICARM), 382-387. DOI: 10.1109/ICARM62033.2024.10715920. Online publication date: 8-Jul-2024
  • (2024) Variable-Rate Point Cloud Geometry Compression Based on Feature Adjustment and Interpolation. 2024 Data Compression Conference (DCC), 63-72. DOI: 10.1109/DCC58796.2024.00014. Online publication date: 19-Mar-2024
  • (2024) Fast Point Cloud Geometry Compression with Context-Based Residual Coding and INR-Based Refinement. Computer Vision – ECCV 2024, 270-288. DOI: 10.1007/978-3-031-73113-6_16. Online publication date: 21-Nov-2024
  • (2023) GRNet: Geometry Restoration for G-PCC Compressed Point Clouds Using Auxiliary Density Signaling. IEEE Transactions on Visualization and Computer Graphics 30:10, 6740-6753. DOI: 10.1109/TVCG.2023.3336936. Online publication date: 27-Nov-2023
