research-article

IMAC: A Pre-Multiplier And Integrated Reduction Based Multiply-And-Accumulate Unit

Authors:

Bindu G. Gowda,

Madhav Rao

GLSVLSI '23: Proceedings of the Great Lakes Symposium on VLSI 2023

Pages 503 - 508

https://doi.org/10.1145/3583781.3590265

Published: 05 June 2023

Abstract

Multiply-and-accumulate (MAC) units are primarily used for convolution operations in signal and image processing workloads. In a conventional MAC unit, compressors are applied at the partial-product reduction stages to produce the multiplier output bits, which are then accumulated by an extra adder unit. This paper proposes an integrated approach in which the accumulation operand of the MAC unit is fed directly into the partial-product matrix (PPM) before the product bits are evaluated. This Integrated Multiplier-and-Accumulate (IMAC) approach saves the additional adder unit and instead extends the compressor stage, which is already used to reduce the partial-product bits of the multiplier. Exact and approximate compressor-based IMAC architectures were designed and evaluated through ASIC and FPGA flows. Five versions of the inexact IMAC design were independently compared with traditional one-level and two-level approximation in MAC designs. The proposed design is found to be hardware efficient when compared with state-of-the-art MAC units. The error metrics of the IMAC designs were comparable to or better than those of separately designed approximate multipliers followed by exact or approximate adder units. An image blending application was used to measure the quality metrics. The proposed IMAC design files are made freely available to the research and development community.
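
The integration idea in the abstract can be illustrated with a small bit-level model. The following is only a minimal sketch under stated assumptions, not the authors' design: it models an unsigned n-bit multiplier, injects the accumulation operand as one extra row of the partial-product matrix, and reduces the columns with exact 3:2 compressors (full adders) only. All function names (`partial_product_columns`, `inject_addend`, `reduce_columns`, `imac`) and the default width `n=8` are hypothetical choices made for illustration; the compressor structures and approximate variants evaluated in the paper are not reproduced here.

```python
def partial_product_columns(a, b, n):
    """Return cols[k] = list of weight-2**k partial-product bits of a*b (unsigned, n-bit)."""
    cols = [[] for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            cols[i + j].append(((a >> i) & 1) & ((b >> j) & 1))
    return cols


def inject_addend(cols, c, width):
    """Place the bits of the accumulation operand c into the matrix as one extra row."""
    for k in range(width):
        cols[k].append((c >> k) & 1)


def reduce_columns(cols):
    """Apply exact 3:2 compressors until every column holds at most two bits."""
    cols = [list(col) for col in cols]
    while any(len(col) > 2 for col in cols):
        nxt = [[] for _ in range(len(cols) + 1)]
        for k, col in enumerate(cols):
            i = 0
            while len(col) - i >= 3:
                x, y, z = col[i:i + 3]
                nxt[k].append(x ^ y ^ z)                          # sum bit, same weight
                nxt[k + 1].append((x & y) | (y & z) | (x & z))    # carry bit, next weight
                i += 3
            nxt[k].extend(col[i:])  # leftover bits pass through unchanged
        cols = nxt
    return cols


def imac(a, b, c, n=8):
    """Compute a*b + c with c reduced inside the partial-product matrix (assumed model)."""
    cols = partial_product_columns(a, b, n)
    inject_addend(cols, c, 2 * n)
    cols = reduce_columns(cols)
    # The two remaining rows go to a single final carry-propagate addition.
    row0 = sum(col[0] << k for k, col in enumerate(cols) if len(col) > 0)
    row1 = sum(col[1] << k for k, col in enumerate(cols) if len(col) > 1)
    return row0 + row1


if __name__ == "__main__":
    print(imac(13, 11, 200), 13 * 11 + 200)
```

In this model the only carry-propagate addition is the final two-row sum that a plain multiplier needs anyway; the accumulation operand costs one extra bit per column in the reduction stage rather than a separate adder, which is the trade-off the abstract describes.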

Cited By

Devi D, Ajay Kumar G, G Gowda B, Rao M (2024). Integrated MAC-based Systolic Arrays: Design and Performance Evaluation. Proceedings of the Great Lakes Symposium on VLSI 2024, pages 292-295. Online publication date: 12-Jun-2024.
https://dl.acm.org/doi/10.1145/3649476.3658797
Devi D, Kumar G, Gowda B, Rao M (2024). Performance-Aware Design of Approximate Integrated MAC Factored Systolic Array Accelerators. 2024 25th International Symposium on Quality Electronic Design (ISQED), pages 1-8. Online publication date: 3-Apr-2024.
https://doi.org/10.1109/ISQED60706.2024.10528695
Fakir K, Mande S (2024). Design and Implementation of High Performance Energy Efficient MAC Unit for Emerging Applications. 2024 International Conference on Innovation and Novelty in Engineering and Technology (INNOVA), pages 1-4. Online publication date: 20-Dec-2024.
https://doi.org/10.1109/INNOVA63080.2024.10846950

Index Terms

IMAC: A Pre-Multiplier And Integrated Reduction Based Multiply-And-Accumulate Unit
1. Hardware
  1. Integrated circuits
    1. Logic circuits
      1. Arithmetic and datapath circuits
      2. Combinational circuits
    2. Reconfigurable logic and FPGAs
  2. Very large scale integration design
    1. Application-specific VLSI designs
      1. Application specific integrated circuits

Information & Contributors

Information

Published In

GLSVLSI '23: Proceedings of the Great Lakes Symposium on VLSI 2023

June 2023

731 pages

ISBN:9798400701252

DOI:10.1145/3583781

General Chairs: Himanshu Thapliyal (University of Tennessee, Knoxville, USA), Ronald DeMara (University of Central Florida, USA)

Program Chairs: Inna Partin-Vaisband (University of Illinois Chicago, USA), Srinivas Katkoori (University of South Florida, USA)

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGDA: ACM Special Interest Group on Design Automation

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

GLSVLSI '23

Sponsor:

SIGDA

GLSVLSI '23: Great Lakes Symposium on VLSI 2023

June 5 - 7, 2023

Knoxville, TN, USA

Acceptance Rates

Overall Acceptance Rate 312 of 1,156 submissions, 27%

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 4

Total Downloads: 113

Downloads (Last 12 months): 29

Downloads (Last 6 weeks): 1

Reflects downloads up to 03 Mar 2025

Citations

Cited By

Gowda B, N R, C P, Nandi P, Rao M (2023). ApproxCNN: Evaluation Of CNN With Approximated Layers Using In-Exact Multipliers. 2023 IEEE 41st International Conference on Computer Design (ICCD), pages 46-53. Online publication date: 6-Nov-2023.
https://doi.org/10.1109/ICCD58817.2023.00017
