End-to-End Variable-Rate Image Compression with Bi-Resolution Spatial-Channel Context Aggregation

Authors:

Huashan Sun

MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia

Article No.: 70, Pages 1 - 7

https://doi.org/10.1145/3595916.3626444

Published: 01 January 2024

Abstract

Recently, neural network-based image compression techniques have demonstrated remarkable compression performance. Context-adaptive entropy models greatly improve rate-distortion (R-D) performance by effectively capturing spatial redundancy in latent representations. However, latent representations still contain some spatial correlations (e.g., the same spatial structure), which need to be eliminated by further processing. In addition, many compression models are single-rate models, which makes it difficult to cover a wide range of bitrates. To address these issues, we propose a novel variable-rate image compression algorithm that efficiently leverages bi-resolution spatial-channel information through learned mechanisms. We first propose a BRP network that divides the latent representations and side information into high-resolution (HR) and low-resolution (LR) components, eliminating the spatial redundancy at the same location. To combine spatial and channel context, we propose a BSC context model, which includes a decreasing-granularity checkerboard pattern and channel grouping based on a cosine slicing strategy. To cover a wide range of bitrates, we take a weight map as input to control bit allocation, achieving multiple compression rates. Experimental results show that our method provides a better rate-distortion trade-off than BPG, JPEG, and other recent deep learning-based image compression methods.
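The abstract does not spell out the decreasing-granularity pattern, so the following is only a minimal sketch of the plain two-pass checkerboard split that checkerboard context models generally build on, not the BSC context model itself. The function names (`checkerboard_masks`, `split_plane`) and the choice of which parity is coded first are assumptions made for illustration.

```python
def checkerboard_masks(height, width):
    """Return (anchor, non_anchor) boolean grids for a height x width latent plane.

    Assumption: positions where (row + col) is even are the anchors, i.e. the
    half that would be coded first without spatial context.
    """
    anchor = [[(r + c) % 2 == 0 for c in range(width)] for r in range(height)]
    non_anchor = [[not is_anchor for is_anchor in row] for row in anchor]
    return anchor, non_anchor


def split_plane(plane):
    """Split one 2D latent channel into two passes, zero-filled elsewhere."""
    height, width = len(plane), len(plane[0])
    anchor, _ = checkerboard_masks(height, width)
    first_pass = [[plane[r][c] if anchor[r][c] else 0 for c in range(width)]
                  for r in range(height)]
    second_pass = [[0 if anchor[r][c] else plane[r][c] for c in range(width)]
                   for r in range(height)]
    return first_pass, second_pass


# Toy 2 x 3 latent plane (hypothetical values).
plane = [[1, 2, 3],
         [4, 5, 6]]
first_pass, second_pass = split_plane(plane)
# first_pass  -> [[1, 0, 3], [0, 5, 0]]
# second_pass -> [[0, 2, 0], [4, 0, 6]]
```

In a two-pass scheme of this kind, the second half can be conditioned on the already decoded first half, which is what lets the context model exploit spatial neighbours while keeping decoding parallel within each pass.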

Cited By

Huang Q, Liu W, Lu H, Wang Y (2024). Spatial-Temporal Motion Compensation for Learned Video Compression. 2024 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 3019-3024. Online publication date: 6-Oct-2024.
https://doi.org/10.1109/SMC54092.2024.10831688
Ji K (2024). Learned Image Compression with Transformer-CNN Mixed Structures and Spatial Checkerboard Context. 2024 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 1391-1397. Online publication date: 6-Oct-2024.
https://doi.org/10.1109/SMC54092.2024.10831319

Index Terms

End-to-End Variable-Rate Image Compression with Bi-Resolution Spatial-Channel Context Aggregation
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems

Information & Contributors

Information

Published In

MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia

December 2023

745 pages

ISBN:9798400702051

DOI:10.1145/3595916

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

the Fundamental Research Funds of China for the Central Universities
the Jiangsu Higher Education Reform Research Project
the 2022 Undergraduate Practice Teaching Reform Research Project of Hohai University
the 14th Five-Year Plan for Educational Science of Jiangsu Province
the Key Research and Development Program of Yunnan Province
the Postgraduate Research & Practice Innovation Program of Jiangsu Province

Conference

MMAsia '23

Sponsor:

SIGMM

MMAsia '23: ACM Multimedia Asia

December 6 - 8, 2023

Tainan, Taiwan

Acceptance Rates

Overall Acceptance Rate 59 of 204 submissions, 29%

Bibliometrics

Total Citations: 2
Total Downloads: 109
Downloads (Last 12 months): 61
Downloads (Last 6 weeks): 2

Reflects downloads up to 15 Feb 2025
