CNN-Based DCT-Like Transform for Image Compression

Dong Liu, Haichuan Ma, Zhiwei Xiong & Feng Wu

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10705)

Included in the following conference series:

International Conference on Multimedia Modeling


Abstract

This paper presents a block transform for image compression that is inspired by the discrete cosine transform (DCT) but obtained by training convolutional neural network (CNN) models. Specifically, we combine convolution, nonlinear mapping, and a linear transform to form a nonlinear transform together with a nonlinear inverse transform. The transform, quantization, and inverse transform are trained jointly to optimize overall rate-distortion performance. For training, we propose to estimate the rate by the $\ell_1$-norm of the quantized coefficients. We also explore different combinations of linear and nonlinear transforms and inverse transforms. Experimental results show that the proposed CNN-based transform achieves higher compression efficiency than the fixed DCT and significantly outperforms JPEG at low bit rates.
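
To make the training objective described in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It uses a fixed orthonormal 8×8 DCT in place of the learned CNN transform, uniform scalar quantization, and the sum of absolute values of the quantized coefficients as the rate estimate. The function names (`dct_matrix`, `rd_loss`), the quantization step `step`, and the trade-off weight `lam` are all assumptions introduced for illustration. How the quantizer is handled during gradient-based training is not stated in the abstract, so the sketch only evaluates the objective.

```python
import numpy as np


def dct_matrix(n: int = 8) -> np.ndarray:
    """Return the orthonormal DCT-II basis as an n x n matrix."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)       # DC row uses a different scale factor
    return c


def rd_loss(block: np.ndarray, step: float = 16.0, lam: float = 0.01):
    """Rate-distortion objective for one square block (illustrative only).

    distortion: mean squared error between the block and its reconstruction
    rate proxy: l1-norm of the quantized coefficients
    """
    c = dct_matrix(block.shape[0])
    coeffs = c @ block @ c.T            # forward 2-D transform
    q = np.round(coeffs / step)         # uniform scalar quantization
    recon = c.T @ (q * step) @ c        # dequantization and inverse transform
    distortion = float(np.mean((block - recon) ** 2))
    rate_proxy = float(np.sum(np.abs(q)))
    return distortion + lam * rate_proxy, distortion, rate_proxy


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    block = rng.uniform(0.0, 255.0, size=(8, 8))  # hypothetical pixel block
    loss, distortion, rate_proxy = rd_loss(block)
    print(f"loss={loss:.3f} distortion={distortion:.3f} rate_proxy={rate_proxy:.0f}")
```

In this sketch a larger `step` typically lowers the rate proxy and raises the distortion, which is the trade-off that `lam` weighs. In the paper the forward and inverse transforms are learned models rather than the fixed matrix used here.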


Notes

1. http://r0k.us/graphics/kodak/
2. https://hevc.hhi.fraunhofer.de/svn/svn_HEVCSoftware/tags/HM-16.15/
3. https://github.com/tensorflow/models/tree/master/compression. This network has no entropy coding because the authors do not provide one.


Acknowledgment

This work was supported by the Natural Science Foundation of China (NSFC) under Grant 61772483, Grant 61390512, and Grant 61425026, and by the Fundamental Research Funds for the Central Universities under Grant WK3490000001.

Author information

Authors and Affiliations

CAS Key Laboratory of Technology in Geo-Spatial Information Processing and Application System, University of Science and Technology of China, Hefei, China
Dong Liu, Haichuan Ma, Zhiwei Xiong & Feng Wu


Corresponding author

Correspondence to Dong Liu.

Editor information

Editors and Affiliations

Alpen-Adria-Universität Klagenfurt, Klagenfurt, Austria
Klaus Schoeffmann
Chulalongkorn University, Bangkok, Thailand
Thanarat H. Chalidabhongse
City University of Hong Kong, Hong Kong, China
Chong Wah Ngo
Chulalongkorn University, Bangkok, Thailand
Supavadee Aramvith
Dublin City University, Dublin, Ireland
Noel E. O’Connor
Gwangju Institute of Science and Technology, Gwangju, Korea (Republic of)
Yo-Sung Ho
Tampere University of Technology, Tampere, Finland
Moncef Gabbouj
Rutgers University, Piscataway, New Jersey, USA
Ahmed Elgammal


About this paper

Cite this paper

Liu, D., Ma, H., Xiong, Z., Wu, F. (2018). CNN-Based DCT-Like Transform for Image Compression. In: Schoeffmann, K., et al. MultiMedia Modeling. MMM 2018. Lecture Notes in Computer Science, vol 10705. Springer, Cham. https://doi.org/10.1007/978-3-319-73600-6_6


DOI: https://doi.org/10.1007/978-3-319-73600-6_6
Published: 13 January 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73599-3
Online ISBN: 978-3-319-73600-6
eBook Packages: Computer Science, Computer Science (R0)
