Nothing Special   »   [go: up one dir, main page]

skip to main content
10.5555/2648668.2648759acmconferencesArticle/Chapter ViewAbstractPublication PagesislpedConference Proceedingsconference-collections
research-article

Quantifying acceleration: power/performance trade-offs of application kernels in hardware

Published: 04 September 2013 Publication History

Abstract

As the traditional performance gains of technology scaling diminish, one of the most promising directions is building special purpose fixed function hardware blocks, commonly referred to as accelerators. Accelerators have become prevalent in industrial SoC designs for their low power, high performance potential. In this work we explore thousands of implementations of classical software workloads in hardware. This thorough, detailed design space search of hardware accelerators gives architects a quantitative way to reason about the differences in implementations. The exploration presented in this work shows that the space is full of poor design choices. By thoroughly analyzing each benchmark, we show which provide the most performance when implemented in hardware given a fixed power budget and explain which design techniques work best for each workload.

References

[1]
E. S. Chung, P. A. Milder, J. C. Hoe, and K. Mai. Single-Chip Heterogeneous Computing: Does the Future Include Custom Logic, FPGAs, and GPGPUs. In MICRO, 2010.
[2]
J. Cong, M. A. Ghodrat, M. Gill, B. Grigorian, and G. Reinman. Charm: a composable heterogeneous accelerator-rich microprocessor. In Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design, ISLPED '12, pages 379--384, New York, NY, USA, 2012. ACM.
[3]
J. Cong, K. Gururaj, and G. Han. Synthesis of reconfigurable high-performance multicore systems. In FPGA, 2009.
[4]
A. Danalis, G. Marin, C. McCurdy, J. Meredith, P. Roth, K. Spafford, V. Tipparaju, and J. Vetter. The Scalable HeterOgeneous Computing (SHOC) Benchmark Suite. In GPGPU, 2010.
[5]
V. Govindaraju, C.-H. Ho, and K. Sankaralingam. Dynamically specialized datapaths for energy efficient computing. In HPCA, 2011.
[6]
R. Hameed, W. Qadeer, M. Wachs, O. Azizi, A. Solomatnikov, B. C. Lee, S. Richardson, C. Kozyrakis, and M. Horowitz. Understanding sources of inefficiency in general-purpose chips. In ISCA, 2010.
[7]
H.-Y. Liu and L. P. Carloni. On learning-based methods for design-space exploration with high-level synthesis. In Proceedings of the 50th Annual Design Automation Conference, DAC '13, pages 50:1--50:7, New York, NY, USA, 2013. ACM.
[8]
M. J. Lyons, M. Hempstead, G.-Y. Wei, and D. Brooks. The accelerator store: A shared memory framework for accelerator-based systems. volume 8, pages 48:1--48:22, New York, NY, USA, 2012. ACM.
[9]
G. Venkatesh, J. Sampson, N. Goulding, S. Garcia, V. Bryksin, J. Lugo-Martinez, S. Swanson, and M. B. Taylor. Conservation cores: reducing the energy of mature computations. In ASPLOS, 2010.

Cited By

View all

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
ISLPED '13: Proceedings of the 2013 International Symposium on Low Power Electronics and Design
September 2013
440 pages
ISBN:9781479912353

Sponsors

Publisher

IEEE Press

Publication History

Published: 04 September 2013

Check for updates

Author Tags

  1. accelerator
  2. design space exploration
  3. power performance trade-offs

Qualifiers

  • Research-article

Conference

ISLPED'13
Sponsor:

Acceptance Rates

Overall Acceptance Rate 398 of 1,159 submissions, 34%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)4
  • Downloads (Last 6 weeks)0
Reflects downloads up to 18 Nov 2024

Other Metrics

Citations

Cited By

View all

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media