DOI: 10.5555/3130379.3130424
Research article · Free access

Adaptive weight compression for memory-efficient neural networks

Published: 27 March 2017

Abstract

Neural networks generally require significant memory capacity and bandwidth to store and access their large numbers of synaptic weights. This paper presents an application of JPEG image encoding to compress the weights by exploiting the spatial locality and smoothness of the weight matrix. To minimize the loss of accuracy caused by JPEG encoding, we propose to adaptively control the quantization factor of the JPEG algorithm depending on the error sensitivity (gradient) of each weight: weight blocks with higher sensitivity are compressed less aggressively to preserve accuracy. The adaptive compression reduces the memory requirement, which in turn yields higher performance and lower energy in neural network hardware. Simulation of inference hardware for a multilayer perceptron on the MNIST dataset shows up to 42X compression with less than 1% loss of recognition accuracy, resulting in 3X higher effective memory bandwidth and ∼19X lower system energy.
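The core idea — JPEG-style block DCT with a quantization step modulated by per-block gradient sensitivity — could be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the sensitivity-to-step mapping, the `base_q` value, and the uniform (table-free) quantization are all simplifications introduced here.

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II basis matrix (the transform underlying JPEG's 8x8 blocks).
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0, :] *= 1 / np.sqrt(2)
    return C * np.sqrt(2 / n)

def compress_block(block, q_step, C):
    # Forward 2D DCT, uniform quantization, then reconstruction
    # (dequantize + inverse DCT). For orthonormal C, the inverse is C.T.
    coeffs = C @ block @ C.T
    quantized = np.round(coeffs / q_step)
    return C.T @ (quantized * q_step) @ C

def adaptive_compress(weights, grads, base_q=0.05, n=8):
    # Hypothetical policy: blocks with larger mean |gradient| (higher error
    # sensitivity) get a finer quantization step, i.e. are compressed less.
    # Assumes the weight matrix dimensions are multiples of the block size n.
    C = dct_matrix(n)
    out = np.empty_like(weights)
    sens = np.abs(grads)
    mean_sens = sens.mean() + 1e-12
    for i in range(0, weights.shape[0], n):
        for j in range(0, weights.shape[1], n):
            blk = weights[i:i + n, j:j + n]
            g = sens[i:i + n, j:j + n].mean()
            q = base_q / (1.0 + g / mean_sens)  # more sensitive -> smaller step
            out[i:i + n, j:j + n] = compress_block(blk, q, C)
    return out
```

In the paper the quantization factor also controls the stored bit volume; here only the accuracy side (quantize-and-reconstruct) is modeled, which is enough to see how finer steps for high-gradient blocks bound the reconstruction error where it matters most.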




Published In

DATE '17: Proceedings of the Conference on Design, Automation & Test in Europe
March 2017, 1814 pages
Publisher: European Design and Automation Association, Leuven, Belgium


      Author Tags

      1. JPEG
      2. MLP
      3. compression
      4. memory-efficient
      5. neural network
      6. weight



      Citations

      Cited By

      View all
      • (2021)An Energy-Efficient Inference Method in Convolutional Neural Networks Based on Dynamic Adjustment of the Pruning LevelACM Transactions on Design Automation of Electronic Systems10.1145/346097226:6(1-20)Online publication date: 1-Aug-2021
      • (2021)A Reconfigurable Multiplier for Signed Multiplications with Asymmetric Bit-WidthsACM Journal on Emerging Technologies in Computing Systems10.1145/344621317:4(1-16)Online publication date: 30-Jun-2021
      • (2021)Lane CompressionACM Transactions on Embedded Computing Systems10.1145/343181520:2(1-26)Online publication date: 18-Mar-2021
      • (2019)ApproxLPProceedings of the 56th Annual Design Automation Conference 201910.1145/3316781.3317774(1-6)Online publication date: 2-Jun-2019
