10.5555/2648668.2648716
research-article

REEL: reducing effective execution latency of floating point operations

Published: 04 September 2013

Abstract

The height of the dynamic dependence graph of a program, as executed by a processor, sets a lower bound on its execution time. This height can be decreased by reducing the effective execution latency of operations that form dependence chains in the graph. In this paper, we propose a technique called REEL that reduces the overall latency of chains of dependent floating point (FP) operations by increasing the throughput of computation. REEL comprises a high-throughput floating point unit (HFP) that allows early issue of an FP Add that depends on another FP Add or FP Multiply. This is complemented by instruction scheduler modifications that allow early issue of dependent FP Adds, and by novel checker logic that corrects any precision errors. Unlike conventional static operation fusion, such as fused Multiply-Add (FMA), REEL requires no changes to the instruction set to use the new hardware, and no recompilation. Furthermore, unlike ISA-level FMA, our technique produces bit-compatible results while boosting the performance of Add-Add dependence pairs in addition to Multiply-Add pairs. Our evaluation of REEL using the CFP2006 benchmarks shows an average performance gain of 7.6% and a maximum performance gain of 17%, while consuming 1.2% less energy.
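
To make the link between dependence-graph height and effective latency concrete, the following is a minimal sketch and not the paper's model. It computes the longest dependence chain of a small FP operation graph, first with ordinary latencies and then with a shorter latency seen by a dependent FP Add that issues early. The cycle counts, the `graph_height` function, and the example chain are all assumptions made for illustration; none of them come from the paper.

```python
from typing import Dict, List, Tuple

# Hypothetical latencies in cycles, chosen for illustration only.
BASE_LATENCY: Dict[str, int] = {"fadd": 3, "fmul": 5}
# Assumed latency that a dependent FP Add observes when it may issue early.
EARLY_LATENCY: Dict[str, int] = {"fadd": 2, "fmul": 4}

# (name, kind, names of the producer operations it depends on)
Op = Tuple[str, str, List[str]]


def graph_height(ops: List[Op], early_issue: bool = False) -> int:
    """Return the length in cycles of the longest dependence chain.

    `ops` must be in program order, so every producer precedes its consumers.
    Resource limits are ignored; only data dependences constrain issue time.
    """
    start: Dict[str, int] = {}
    kind_of: Dict[str, str] = {}
    height = 0
    for name, kind, deps in ops:
        issue = 0
        for dep in deps:
            latency = BASE_LATENCY[kind_of[dep]]
            # Only a dependent FP Add benefits from early issue in this sketch.
            if early_issue and kind == "fadd":
                latency = EARLY_LATENCY[kind_of[dep]]
            issue = max(issue, start[dep] + latency)
        start[name] = issue
        kind_of[name] = kind
        # The operation's own result still takes its full latency to complete.
        height = max(height, issue + BASE_LATENCY[kind])
    return height


# A Multiply followed by three chained Adds: ((a * b) + c) + d) + e
CHAIN: List[Op] = [
    ("m", "fmul", []),
    ("s1", "fadd", ["m"]),
    ("s2", "fadd", ["s1"]),
    ("s3", "fadd", ["s2"]),
]

if __name__ == "__main__":
    print(graph_height(CHAIN))                    # 14
    print(graph_height(CHAIN, early_issue=True))  # 11
```

Under these assumed numbers the chain shrinks from 14 to 11 cycles without any change to the operations themselves, which is the sense in which a lower effective latency reduces the height of the graph.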


Published In

ISLPED '13: Proceedings of the 2013 International Symposium on Low Power Electronics and Design
September 2013
440 pages
ISBN: 9781479912353

Publisher

IEEE Press

Acceptance Rates

Overall Acceptance Rate 398 of 1,159 submissions, 34%
