research-article

Low-Power Multiple-Precision Iterative Floating-Point Multiplier with SIMD Support

Authors: Dimitri Tan, Carl E. Lemonds, Michael J. Schulte

IEEE Transactions on Computers, Volume 58, Issue 2

Pages 175 - 187

https://doi.org/10.1109/TC.2008.203

Published: 01 February 2009

Abstract

The demand for improved SIMD floating-point performance on general-purpose x86-compatible microprocessors is rising. At the same time, there is a conflicting demand in the low-power computing market for a reduction in power consumption. Along with this, there is the absolute necessity of backward compatibility for x86-compatible microprocessors, which includes the support of x87 scientific floating-point instructions. The combined effect is that there is a need for low-power, low-cost floating-point units that are still capable of delivering good SIMD performance while maintaining full x86 functionality. This paper presents the design of an x86-compatible floating-point multiplier (FPM) that is compliant with the IEEE-754 Standard for Binary Floating-Point Arithmetic [12] and is specifically tailored to provide good SIMD performance in a low-cost, low-power solution while maintaining full x87 backward compatibility. The FPM efficiently supports multiple precisions using an iterative rectangular multiplier. The FPM can perform two parallel single-precision multiplies every cycle with a latency of two cycles, one double-precision multiply every two cycles with a latency of four cycles, or one extended-double-precision multiply every three cycles with a latency of five cycles. The iterative FPM also supports division, square-root, and transcendental functions. Compared to a previous design with similar functionality, the proposed iterative FPM has 60 percent less area and 59 percent less dynamic power dissipation.
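The idea of an iterative rectangular multiplier mentioned in the abstract can be illustrated with a short software model. This is only a minimal sketch under stated assumptions, not the authors' design: the function name `iterative_rect_multiply` and the 27-bit slice width `chunk_bits` are hypothetical, chosen so that the number of passes for 24-, 53-, and 64-bit significands (single, double, and x87 extended-double precision) comes out as 1, 2, and 3. It does not model partial-product reduction, rounding, exponent handling, or SIMD lane splitting.

```python
# Software model of an iterative "rectangular" multiply: a narrow
# width x chunk_bits array is reused over several passes instead of
# building a full width x width array.
# ASSUMPTION: chunk_bits=27 is an illustrative value, not taken from the paper.

# Significand widths in bits, including the leading integer bit.
SIGNIFICAND_BITS = {"single": 24, "double": 53, "extended": 64}


def iterative_rect_multiply(a, b, width, chunk_bits=27):
    """Return (a * b, passes) for unsigned `width`-bit significands.

    Each pass multiplies the full multiplicand `a` by one `chunk_bits`-wide
    slice of the multiplier `b` and adds the shifted result to an accumulator.
    """
    if not (0 <= a < (1 << width) and 0 <= b < (1 << width)):
        raise ValueError("operands must be unsigned `width`-bit integers")

    passes = -(-width // chunk_bits)       # ceiling division
    chunk_mask = (1 << chunk_bits) - 1
    acc = 0
    for i in range(passes):
        # Select the i-th slice of the multiplier, least significant first.
        chunk = (b >> (i * chunk_bits)) & chunk_mask
        # One pass through the rectangular array, aligned to the slice position.
        acc += (a * chunk) << (i * chunk_bits)
    return acc, passes


if __name__ == "__main__":
    for name, width in SIGNIFICAND_BITS.items():
        x = (1 << width) - 1               # largest significand of this width
        product, passes = iterative_rect_multiply(x, x, width)
        assert product == x * x
        print(f"{name}: {width}-bit significand, {passes} pass(es)")
```

Under these assumptions, wider formats simply take more passes through the same array, which is the general trade-off an iterative design makes: less hardware in exchange for a longer issue interval at higher precisions.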

Cited By

Ha D, Zhang Y, Kao C, Hughes C, Ro W, Tseng H (2024). M3XU: Achieving High-Precision and Complex Matrix Multiplication with Low-Precision MXUs. Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, pp. 1-16. DOI: 10.1109/SC41406.2024.00016. Online publication date: 17-Nov-2024.
https://dl.acm.org/doi/10.1109/SC41406.2024.00016
Zhang H, Chen D, Ko S (2020). New Flexible Multiple-Precision Multiply-Accumulate Unit for Deep Neural Network Training and Inference. IEEE Transactions on Computers 69:1, pp. 26-38. DOI: 10.1109/TC.2019.2936192. Online publication date: 3-Jan-2020.
https://dl.acm.org/doi/10.1109/TC.2019.2936192
Hanuman C, Kamala J, Aruna A (2020). Implementation of high precision/low latency FP divider using Urdhva–Tiryakbhyam multiplier for SoC applications. Design Automation for Embedded Systems 24:2, pp. 111-125. DOI: 10.1007/s10617-019-09225-2. Online publication date: 1-Jun-2020.
https://dl.acm.org/doi/10.1007/s10617-019-09225-2

Index Terms

Low-Power Multiple-Precision Iterative Floating-Point Multiplier with SIMD Support
1. Hardware
2. Mathematics of computing
  1. Mathematical analysis
    1. Numerical analysis
      1. Arbitrary-precision arithmetic
      2. Interval arithmetic

Recommendations

Hardware Designs for Decimal Floating-Point Addition and Related Operations

Decimal arithmetic is often used in commercial, financial, and Internet-based applications. Due to the growing importance of decimal floating-point (DFP) arithmetic, the IEEE 754 Draft Standard for Floating-Point Arithmetic (IEEE P754) includes ...

Decimal Floating-Point Multiplication

Decimal multiplication is important in many commercial applications including financial analysis, banking, tax calculation, currency conversion, insurance, and accounting. This paper presents the design of two decimal floating-point multipliers: one ...

A 13.3ns double-precision floating-point ALU and multiplier

ICCD '95: Proceedings of the 1995 International Conference on Computer Design: VLSI in Computers and Processors

One-bit pre-shifting before alignment shift, normalization with anticipated leading '1' bit and pre-rounding techniques have been developed for a floating-point arithmetic logic unit (ALU). In addition, carry select addition and pre-rounding techniques ...

Information & Contributors

Information

Published In

IEEE Transactions on Computers Volume 58, Issue 2

February 2009

144 pages

ISSN:0018-9340

Publisher

IEEE Computer Society

United States

Publication History

Published: 01 February 2009

Bibliometrics & Citations

Article Metrics

Total Citations: 7
Total Downloads: 0
Downloads (Last 12 months): 0
Downloads (Last 6 weeks): 0

Reflects downloads up to 16 Nov 2024

Citations

Cited By

(2018). Throughput enhancement of SISO parallel LTE turbo decoders using floating point turbo decoding algorithm. International Journal of Wireless and Mobile Computing 15:1, pp. 58-66. DOI: 10.5555/3282783.3282791. Online publication date: 1-Jan-2018.
https://dl.acm.org/doi/10.5555/3282783.3282791
Del Barrio A, Bagherzadeh N, Hermida R (2014). Ultra-low-power adder stage design for exascale floating point units. ACM Transactions on Embedded Computing Systems 13:3s, pp. 1-24. DOI: 10.1145/2567932. Online publication date: 28-Mar-2014.
https://dl.acm.org/doi/10.1145/2567932
Wu K, Kuang S, Yu K (2013). An exact method for estimating maximum errors of multi-mode floating-point iterative booth multiplier. International Journal of Computational Science and Engineering 8:4, pp. 306-315. DOI: 10.1504/IJCSE.2013.057295. Online publication date: 1-Oct-2013.
https://dl.acm.org/doi/10.1504/IJCSE.2013.057295
Kuang S, Wu K, Yu K (2013). Energy-Efficient Multiple-Precision Floating-Point Multiplier for Embedded Applications. Journal of Signal Processing Systems 72:1, pp. 43-55. DOI: 10.1007/s11265-012-0695-1. Online publication date: 1-Jul-2013.
https://dl.acm.org/doi/10.1007/s11265-012-0695-1
