Quantifying acceleration: power/performance trade-offs of application kernels in hardware

Authors:

Brandon Reagen, Yakun Sophia Shao, Gu-Yeon Wei, David Brooks

ISLPED '13: Proceedings of the 2013 International Symposium on Low Power Electronics and Design

Pages 395 - 400

Published: 04 September 2013

Abstract

As the traditional performance gains of technology scaling diminish, one of the most promising directions is building special-purpose, fixed-function hardware blocks, commonly referred to as accelerators. Accelerators have become prevalent in industrial SoC designs for their potential to deliver high performance at low power. In this work, we explore thousands of implementations of classical software workloads in hardware. This detailed design-space search of hardware accelerators gives architects a quantitative way to reason about the differences between implementations. The exploration shows that the space is full of poor design choices. By analyzing each benchmark, we show which workloads provide the most performance when implemented in hardware under a fixed power budget, and we explain which design techniques work best for each workload.

Published In

ISLPED '13: Proceedings of the 2013 International Symposium on Low Power Electronics and Design

September 2013

440 pages

ISBN: 9781479912353

General Chairs: Pai Chou (UC Irvine / NTHU Taiwan), Ru Huang (Peking University)

Program Chairs: Yuan Xie (Penn State / AMD), Tanay Karnik (Intel)

Publisher

IEEE Press

Qualifiers

Research-article

Conference

ISLPED'13: International Symposium on Low Power Electronics and Design

Sponsor: SIGDA

September 4-6, 2013, Beijing, China

Acceptance Rates

Overall Acceptance Rate 398 of 1,159 submissions, 34%
