DOI: 10.1145/3595916.3626444
research-article

End-to-End Variable-Rate Image Compression with Bi-Resolution Spatial-Channel Context Aggregation

Published: 01 January 2024

Abstract

Neural network-based image compression has recently demonstrated remarkable compression performance. Context-adaptive entropy models greatly improve rate-distortion (R-D) performance by capturing spatial redundancy in the latent representations. However, the latent representations still contain residual spatial correlations (e.g., the same spatial structure) that further processing must eliminate. In addition, many compression models are single-rate models, which makes it difficult to cover a wide range of bitrates. To address these issues, we propose a novel variable-rate image compression algorithm that efficiently leverages bi-resolution spatial-channel information through learned mechanisms. We first propose a BRP network that divides the latent representations and side information into high-resolution (HR) and low-resolution (LR) components, eliminating spatial redundancy at the same location. To combine spatial and channel context, we then propose a BSC context model, which includes a decreasing-granularity checkerboard pattern and channel grouping based on a cosine slicing strategy. To cover a wide range of bitrates, the model takes a weight map as input to control bit allocation, achieving multiple compression rates. Experimental results show that our method provides a better rate-distortion trade-off than BPG, JPEG, and other recent deep learning-based image compression methods.
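For readers unfamiliar with checkerboard context modelling, the following is a minimal sketch of a plain two-pass checkerboard split of one latent plane into "anchor" and "non-anchor" positions. It is an illustration only: the function names `checkerboard_mask` and `split_plane` are hypothetical, and the abstract does not specify the decreasing-granularity variant, the cosine slicing of channels, or the BRP network, so none of those are reproduced here.

```python
# Illustrative sketch only: a basic checkerboard split, NOT the paper's
# decreasing-granularity pattern. All names here are hypothetical.

def checkerboard_mask(height, width):
    """Return a 2-D list of booleans; True marks anchor positions.

    A position (r, c) is an anchor when r + c is even, which gives the
    familiar alternating checkerboard layout.
    """
    return [[(r + c) % 2 == 0 for c in range(width)] for r in range(height)]


def split_plane(plane):
    """Split one latent plane (list of rows) into anchor and non-anchor parts.

    Positions that do not belong to a part are filled with 0, so the two
    parts sum element-wise back to the original plane.
    """
    mask = checkerboard_mask(len(plane), len(plane[0]))
    anchor = [[v if m else 0 for v, m in zip(row, mrow)]
              for row, mrow in zip(plane, mask)]
    non_anchor = [[0 if m else v for v, m in zip(row, mrow)]
                  for row, mrow in zip(plane, mask)]
    return anchor, non_anchor


if __name__ == "__main__":
    plane = [[1, 2, 3, 4],
             [5, 6, 7, 8]]
    anchor, non_anchor = split_plane(plane)
    print(anchor)      # anchor positions keep their values
    print(non_anchor)  # remaining positions keep their values
```

In a two-pass scheme of this kind, the anchor positions are typically coded first, and the non-anchor positions are then coded using the already-decoded anchors as spatial context, which allows each pass to run in parallel.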

Cited By

  • (2024) Spatial-Temporal Motion Compensation for Learned Video Compression. 2024 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 3019-3024. DOI: 10.1109/SMC54092.2024.10831688. Online publication date: 6-Oct-2024
  • (2024) Learned Image Compression with Transformer-CNN Mixed Structures and Spatial Checkerboard Context. 2024 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 1391-1397. DOI: 10.1109/SMC54092.2024.10831319. Online publication date: 6-Oct-2024

    Published In

    MMAsia '23: Proceedings of the 5th ACM International Conference on Multimedia in Asia
    December 2023
    745 pages
    ISBN:9798400702051
    DOI:10.1145/3595916

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. deep network
    2. image compression
    3. lossy compression
    4. multi-resolution

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • the Fundamental Research Funds of China for the Central Universities
    • the Jiangsu Higher Education Reform Research Project
    • the 2022 Undergraduate Practice Teaching Reform Research Project of Hohai University
    • the 14th Five-Year Plan for Educational Science of Jiangsu Province
    • the Key Research and Development Program of Yunnan Province
    • the Postgraduate Research & Practice Innovation Program of Jiangsu Province

    Conference

    MMAsia '23
    MMAsia '23: ACM Multimedia Asia
    December 6 - 8, 2023
    Tainan, Taiwan

    Acceptance Rates

    Overall Acceptance Rate 59 of 204 submissions, 29%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 61
    • Downloads (Last 6 weeks): 2
    Reflects downloads up to 15 Feb 2025
