qLUT: Input-Aware Quantized Table Lookup for Energy-Efficient Approximate Accelerators

Published: 27 September 2017

Abstract

Approximate computing has emerged as a popular design paradigm for optimizing the performance and energy consumption of error-resilient applications in domains such as machine learning, graphics, and data analytics. Numerous techniques for approximate computing have been proposed at different layers of the system stack, from circuits to architecture to software. In this work, we propose a new technique, called quantized table lookup, for approximating the meta-functions used in the core computational kernels of error-resilient applications. In contrast to prior work that directly approximates the functionality of the meta-functions, the proposed technique instead approximates the input data to the meta-functions by quantizing them to a much smaller set of values that we call quantized inputs. The small number of quantized inputs enables us to completely replace the energy-intensive arithmetic units in the meta-function with small and energy-efficient lookup tables (called quantized lookup tables or qLUT) that contain precomputed output values corresponding to the quantized inputs. The proposed approximation technique is not only highly generic, but also inherently quality-configurable and input-aware. Quality-configurability and input-awareness are achieved by modulating the size of the qLUT and by judiciously selecting the values of the quantized inputs based on the statistics of the original input data. To evaluate the proposed technique, we have implemented the dominant meta-functions of nine error-resilient application benchmarks as quantized table lookup based hardware accelerators using 45nm technology. Experimental results demonstrate average energy savings of 46% at the application level for minimal (<1%) loss in output quality.
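The idea in the abstract can be illustrated with a small software sketch. This is not the authors' hardware design or their selection procedure: the quantile-based choice of quantized inputs, the nearest-level lookup, and all function names (choose_levels, build_qlut, qlut_eval, mean_abs_error) are assumptions made purely for illustration, and tanh merely stands in for an arbitrary meta-function.

```python
import bisect
import math


def choose_levels(samples, table_size):
    """Pick up to `table_size` quantized inputs from the empirical
    quantiles of `samples` (an assumed, illustrative selection rule)."""
    ordered = sorted(samples)
    n = len(ordered)
    # Take the middle sample of each of `table_size` equal-population bins.
    picks = [ordered[min(n - 1, int((i + 0.5) * n / table_size))]
             for i in range(table_size)]
    return sorted(set(picks))


def build_qlut(func, levels):
    """Precompute the function output for every quantized input."""
    return [func(x) for x in levels]


def nearest_index(x, levels):
    """Index of the quantized input closest to x (levels must be sorted)."""
    pos = bisect.bisect_left(levels, x)
    if pos == 0:
        return 0
    if pos == len(levels):
        return len(levels) - 1
    # Choose whichever neighbour is closer to x.
    return pos if levels[pos] - x < x - levels[pos - 1] else pos - 1


def qlut_eval(x, levels, table):
    """Approximate func(x) with a table lookup instead of arithmetic."""
    return table[nearest_index(x, levels)]


def mean_abs_error(func, samples, levels, table):
    """Average absolute error of the lookup over `samples`."""
    return sum(abs(func(x) - qlut_eval(x, levels, table))
               for x in samples) / len(samples)


# Stand-in input data and meta-function (both hypothetical).
samples = [i / 100 for i in range(-300, 301)]
errors = {}
for size in (4, 16, 64):
    levels = choose_levels(samples, size)
    table = build_qlut(math.tanh, levels)
    errors[size] = mean_abs_error(math.tanh, samples, levels, table)
    print(f"table size {size:3d}: mean abs error {errors[size]:.4f}")
```

In this sketch the table size plays the role of the quality knob, and deriving the levels from sample data is one simple way to make the table input-aware; a real design could use a different selection rule.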


Published In

ACM Transactions on Embedded Computing Systems, Volume 16, Issue 5s
Special Issue ESWEEK 2017, CASES 2017, CODES + ISSS 2017 and EMSOFT 2017
October 2017
1448 pages
ISSN: 1539-9087
EISSN: 1558-3465
DOI: 10.1145/3145508
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 September 2017
Accepted: 01 June 2017
Revised: 01 June 2017
Received: 01 April 2017
Published in TECS Volume 16, Issue 5s

Author Tags

  1. Accelerators
  2. Approximate computing
  3. Lookup table
  4. Low-power design

Qualifiers

  • Research-article
  • Research
  • Refereed

Cited By

  • (2024) Privacy Preserving Function Evaluation Using Lookup Tables with Word-Wise FHE. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences E107.A:8 (1163-1177). DOI: 10.1587/transfun.2023EAP1114. Online publication date: 1-Aug-2024
  • (2024) ReApprox-PIM: Reconfigurable Approximate Lookup-Table (LUT)-Based Processing-in-Memory (PIM) Machine Learning Accelerator. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 43:8 (2288-2300). DOI: 10.1109/TCAD.2024.3367822. Online publication date: 1-Aug-2024
  • (2024) Approximate Computing: Concepts, Architectures, Challenges, Applications, and Future Directions. IEEE Access 12 (146022-146088). DOI: 10.1109/ACCESS.2024.3467375. Online publication date: 2024
  • (2023) Energy-Efficient Approximate Edge Inference Systems. ACM Transactions on Embedded Computing Systems 22:4 (1-50). DOI: 10.1145/3589766. Online publication date: 31-Mar-2023
  • (2023) Efficient Table-based Function Approximation on FPGAs Using Interval Splitting and BRAM Instantiation. ACM Transactions on Embedded Computing Systems 22:4 (1-24). DOI: 10.1145/3580737. Online publication date: 25-Jan-2023
  • (2023) Adaptive Approximate Accelerators with Controlled Quality Using Machine Learning. Design and Applications of Emerging Computer Systems (501-529). DOI: 10.1007/978-3-031-42478-6_19. Online publication date: 17-Aug-2023
  • (2023) Efficient Hardware Acceleration of Emerging Neural Networks for Embedded Machine Learning: An Industry Perspective. Embedded Machine Learning for Cyber-Physical, IoT, and Edge Computing (121-172). DOI: 10.1007/978-3-031-19568-6_5. Online publication date: 1-Oct-2023
  • (2022) Approximate Down-Sampling Strategy for Power-Constrained Intelligent Systems. IEEE Access 10 (7073-7081). DOI: 10.1109/ACCESS.2022.3142292. Online publication date: 2022
  • (2021) Value Similarity Extensions for Approximate Computing in General-Purpose Processors. 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE) (481-486). DOI: 10.23919/DATE51398.2021.9474264. Online publication date: 1-Feb-2021
  • (2021) Region of Interest-Based Parameter Optimization for Approximate Image Processing on FPGAs. International Journal of Networking and Computing 11:2 (438-462). DOI: 10.15803/ijnc.11.2_438. Online publication date: 2021
