A Floating-Point Unit for 4D Vector Inner Product with Reduced Latency

Published: 01 July 2009

Abstract

This paper presents the algorithm and implementation of a new high-performance functional unit for the floating-point four-dimensional vector inner product (4D dot product; DP4), which is the most frequently performed operation in 3D graphics applications. The proposed IEEE-compliant DP4 unit computes Z = AB + CD + EF + GH in one path and keeps the intermediate rounding, using IEEE-754 round-to-nearest-even. In the proposed architecture, the intermediate rounding is merged with shift alignment, and the intermediate carry-propagated addition and normalization are omitted to reduce latency. The proposed DP4 unit is implemented in 0.18-μm CMOS technology and has a 12.8-ns critical path delay, which is 45.5 percent lower than that of a previous DP4 implementation using discrete multipliers and adders. The proposed DP4 unit also reduces the cycle time of 3D graphics applications by 12.4 percent on average compared to the usual 3D graphics FPU based on four-way multiply-add-fused units.
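
To make "keeps the intermediate rounding" concrete, the following is a minimal software sketch of a DP4 reference model, not the paper's hardware algorithm. It assumes single-precision (binary32) operands and a balanced summation order (AB + CD) + (EF + GH); the abstract states neither, and the function names `round_f32`, `dp4_intermediate_rounding`, and `dp4_exact` are hypothetical. Each multiply and add is rounded to nearest even, as a chain of discrete IEEE multipliers and adders would do, and the exact rational value is computed alongside for comparison.

```python
import struct
from fractions import Fraction


def round_f32(x: float) -> float:
    """Round a binary64 value to the nearest binary32 value (ties to even)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dp4_intermediate_rounding(a, b, c, d, e, f, g, h):
    """Z = AB + CD + EF + GH, rounding after every multiply and every add.

    Assumptions (not taken from the abstract): binary32 operands and the
    balanced order (AB + CD) + (EF + GH). Inputs must already be exactly
    representable in binary32. Computing each step in binary64 and then
    rounding to binary32 gives the same result as a native binary32 operation.
    """
    ab, cd = round_f32(a * b), round_f32(c * d)
    ef, gh = round_f32(e * f), round_f32(g * h)
    return round_f32(round_f32(ab + cd) + round_f32(ef + gh))


def dp4_exact(a, b, c, d, e, f, g, h):
    """The same sum of products as an exact rational number (no rounding)."""
    return (Fraction(a) * Fraction(b) + Fraction(c) * Fraction(d)
            + Fraction(e) * Fraction(f) + Fraction(g) * Fraction(h))


# Operands chosen so that rounding the product AB changes the final result.
x = 1.0 + 2.0 ** -12   # exactly representable in binary32
y = 1.0 + 2.0 ** -11   # exactly representable in binary32
z_rounded = dp4_intermediate_rounding(x, x, -1.0, y, 0.0, 0.0, 0.0, 0.0)
z_exact = dp4_exact(x, x, -1.0, y, 0.0, 0.0, 0.0, 0.0)
```

Here `x * x` equals 1 + 2^-11 + 2^-24, which lies exactly halfway between two binary32 values and rounds to 1 + 2^-11, so `z_rounded` is `0.0` while `z_exact` is 2^-24. A unit that keeps intermediate rounding is expected to reproduce the first value, which is what separates it from a design that rounds only once at the end.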

Published In

IEEE Transactions on Computers, Volume 58, Issue 7
July 2009
144 pages

Publisher

IEEE Computer Society

United States

Author Tags

  1. 3D graphics
  2. DP4
  3. Floating-point arithmetic
  4. Graphics processors
  5. Vector inner product

Qualifiers

  • Research-article

Cited By

  • (2018) Homogeneous stream processors with embedded special function units for high-utilization programmable shaders. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 20:9 (1691-1704). DOI: 10.1109/TVLSI.2011.2161499. Online publication date: 29-Dec-2018
  • (2018) A mobile 3-D display processor with a bandwidth-saving subdivider. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 20:6 (1082-1093). DOI: 10.1109/TVLSI.2011.2150253. Online publication date: 29-Dec-2018
  • (2018) A dual-shader 3-D graphics processor with fast 4-D vector inner product units and power-aware texture cache. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 19:4 (525-537). DOI: 10.1109/TVLSI.2009.2038922. Online publication date: 29-Dec-2018
  • (2013) Self-Alignment Schemes for the Implementation of Addition-Related Floating-Point Operators. ACM Transactions on Reconfigurable Technology and Systems 6:1 (1-21). DOI: 10.1145/2457443.2457444. Online publication date: 1-May-2013
