DOI: 10.5555/3130379.3130424
Research article · Free access

Adaptive weight compression for memory-efficient neural networks

Published: 27 March 2017

Abstract

Neural networks generally require significant memory capacity and bandwidth to store and access their large numbers of synaptic weights. This paper presents an application of JPEG image encoding to compress the weights by exploiting the spatial locality and smoothness of the weight matrix. To minimize the loss of accuracy caused by JPEG encoding, we propose to adaptively control the quantization factor of the JPEG algorithm depending on the error sensitivity (gradient) of each weight: weight blocks with higher sensitivity are compressed less aggressively to preserve accuracy. The adaptive compression reduces the memory requirement, which in turn yields higher performance and lower energy in neural network hardware. Simulation of inference hardware for a multilayer perceptron on the MNIST dataset shows up to 42X compression with less than 1% loss of recognition accuracy, resulting in 3X higher effective memory bandwidth and ∼19X lower system energy.
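The core idea — JPEG-style block DCT with a quantization step modulated by per-block gradient sensitivity — could be sketched as follows. This is a minimal illustration under assumptions, not the authors' implementation: the sensitivity-to-step mapping, the `base_q` value, and the uniform (table-free) quantization are all simplifications introduced here.

```python
import numpy as np

def dct_matrix(n=8):
    # Orthonormal DCT-II basis matrix (the transform underlying JPEG's 8x8 blocks).
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    C[0, :] *= 1 / np.sqrt(2)
    return C * np.sqrt(2 / n)

def compress_block(block, q_step, C):
    # Forward 2D DCT, uniform quantization, then reconstruction
    # (dequantize + inverse DCT). For orthonormal C, the inverse is C.T.
    coeffs = C @ block @ C.T
    quantized = np.round(coeffs / q_step)
    return C.T @ (quantized * q_step) @ C

def adaptive_compress(weights, grads, base_q=0.05, n=8):
    # Hypothetical policy: blocks with larger mean |gradient| (higher error
    # sensitivity) get a finer quantization step, i.e. are compressed less.
    # Assumes the weight matrix dimensions are multiples of the block size n.
    C = dct_matrix(n)
    out = np.empty_like(weights)
    sens = np.abs(grads)
    mean_sens = sens.mean() + 1e-12
    for i in range(0, weights.shape[0], n):
        for j in range(0, weights.shape[1], n):
            blk = weights[i:i + n, j:j + n]
            g = sens[i:i + n, j:j + n].mean()
            q = base_q / (1.0 + g / mean_sens)  # more sensitive -> smaller step
            out[i:i + n, j:j + n] = compress_block(blk, q, C)
    return out
```

In the paper the quantization factor also controls the stored bit volume; here only the accuracy side (quantize-and-reconstruct) is modeled, which is enough to see how finer steps for high-gradient blocks bound the reconstruction error where it matters most.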




Published In

DATE '17: Proceedings of the Conference on Design, Automation & Test in Europe
March 2017, 1814 pages
Publisher: European Design and Automation Association, Leuven, Belgium


      Author Tags

      1. JPEG
      2. MLP
      3. compression
      4. memory-efficient
      5. neural network
      6. weight



      Citations

      Cited By

      View all
      • (2021)An Energy-Efficient Inference Method in Convolutional Neural Networks Based on Dynamic Adjustment of the Pruning LevelACM Transactions on Design Automation of Electronic Systems10.1145/346097226:6(1-20)Online publication date: 1-Aug-2021
      • (2021)A Reconfigurable Multiplier for Signed Multiplications with Asymmetric Bit-WidthsACM Journal on Emerging Technologies in Computing Systems10.1145/344621317:4(1-16)Online publication date: 30-Jun-2021
      • (2021)Lane CompressionACM Transactions on Embedded Computing Systems10.1145/343181520:2(1-26)Online publication date: 18-Mar-2021
      • (2019)ApproxLPProceedings of the 56th Annual Design Automation Conference 201910.1145/3316781.3317774(1-6)Online publication date: 2-Jun-2019
