Abstract
The implementation of neural network models is expensive in hardware. To reduce this cost, many optimization techniques have been explored in the literature, and approximate computing is one of them. With careful design, neural networks built with approximate arithmetic units can achieve accuracy similar to, or even better than, that of networks using conventional exact arithmetic units, while offering significant improvements in energy efficiency. This chapter presents a comprehensive survey of approximate arithmetic units applied to efficient neural network computation. Because multipliers are more expensive than adders, the focus is placed on approximate multipliers designed for neural network computation. The design methodologies of approximate multipliers and their performance in neural network computation are discussed. A general discussion then summarizes the current findings, presents the design challenges, and proposes potential directions for future research.
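To make the idea of an approximate multiplier concrete, the following minimal Python sketch models Mitchell's logarithmic multiplication, a classic approximation scheme of the kind surveyed in this chapter. It is an illustrative software model only (the function names are ours), not a description of any specific hardware design discussed later.

def mitchell_log2(x):
    # Mitchell's linear approximation of log2(x) for a positive integer x:
    # with x = 2**k * (1 + f), approximate log2(x) by k + f.
    k = x.bit_length() - 1          # position of the leading one (integer part)
    f = (x - (1 << k)) / (1 << k)   # remaining bits as a fraction (mantissa part)
    return k + f

def mitchell_multiply(a, b):
    # Approximate a * b by adding the approximate logarithms and
    # converting back with the same linear antilog approximation.
    if a == 0 or b == 0:
        return 0
    s = mitchell_log2(a) + mitchell_log2(b)
    k = int(s)
    f = s - k
    return round((1 + f) * (1 << k))

# Example: mitchell_multiply(3, 5) returns 14, while the exact product is 15.

The worst-case relative error of Mitchell's method is about 11%, an error level that many neural network workloads tolerate; hardware implementations trade this error for the removal of the partial-product array, which is where the energy savings come from.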
Acknowledgements
The authors would like to thank Ocean University of China, University of Saskatchewan, and the Natural Sciences and Engineering Research Council of Canada (NSERC) for their financial support for the related projects and the writing of this chapter.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Zhang, H., Asadikouhanjani, M., Han, J., Subbian, D., Ko, SB. (2022). Approximate Computing for Efficient Neural Network Computation: A Survey. In: Liu, W., Lombardi, F. (eds) Approximate Computing. Springer, Cham. https://doi.org/10.1007/978-3-030-98347-5_16
DOI: https://doi.org/10.1007/978-3-030-98347-5_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-98346-8
Online ISBN: 978-3-030-98347-5
eBook Packages: Computer Science, Computer Science (R0)