Article
DOI: 10.5555/2606265.2606942

On the Programmability and Performance of Heterogeneous Platforms

Published: 15 December 2013

Abstract

General-purpose computing on an ever-broadening array of parallel devices has led to an increasingly complex and multi-dimensional landscape with respect to programmability and performance optimization. The growing diversity of parallel architectures presents many challenges to the domain scientist, including device selection, programming model, and level of investment in optimization. All of these choices influence the balance between programmability and performance. In this paper, we characterize the performance achievable across a range of optimizations, along with their programmability, for multi- and many-core platforms (specifically, an Intel Sandy Bridge CPU, an Intel Xeon Phi co-processor, and an NVIDIA Kepler K20 GPU) in the context of an n-body molecular-modeling application called GEM. Our systematic approach to optimization delivers implementations with speed-ups of 194.98×, 885.18×, and 1020.88× on the CPU, Xeon Phi, and GPU, respectively, over the naive serial version. Beyond the speed-ups, we characterize the incremental optimization of the code from naive serial to fully hand-tuned on each platform through four distinct phases of increasing complexity, exposing the strengths and weaknesses of the programming models offered by each platform.
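To make the baseline concrete, the following is a minimal sketch of what a naive serial all-pairs computation of this general kind might look like, together with the speed-up ratio the abstract reports. It is an illustration only: the function names `naive_potential` and `speedup` are hypothetical, and the bare Coulomb term `q / r` in arbitrary units is an assumption standing in for GEM's actual potential formula, which the abstract does not give.

```python
import math


def naive_potential(vertices, atoms):
    """Serial all-pairs sum: potential at each vertex from every atom.

    vertices: list of (x, y, z) points where the potential is evaluated.
    atoms:    list of (x, y, z, charge) tuples.
    Assumption: a bare Coulomb term q / r in arbitrary units, used here
    as a stand-in and not as GEM's actual formula.
    """
    potentials = []
    for vx, vy, vz in vertices:
        total = 0.0
        for ax, ay, az, q in atoms:
            # Euclidean distance between the vertex and the atom.
            r = math.sqrt((vx - ax) ** 2 + (vy - ay) ** 2 + (vz - az) ** 2)
            total += q / r
        potentials.append(total)
    return potentials


def speedup(t_serial, t_optimized):
    """Speed-up as the ratio of serial runtime to optimized runtime."""
    return t_serial / t_optimized
```

The doubly nested loop does work proportional to the number of vertices times the number of atoms, and each vertex's sum is independent of the others. That independence is what makes this class of computation a natural candidate for the threading, vectorization, and GPU offload that the paper evaluates.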

Published In

ICPADS '13: Proceedings of the 2013 International Conference on Parallel and Distributed Systems
December 2013
725 pages
ISBN: 9781479920815

Publisher

IEEE Computer Society

United States

Author Tags

  1. AVX
  2. CUDA
  3. GPU
  4. Intel MIC
  5. NVIDIA Kepler K20
  6. OpenACC
  7. Xeon Phi
  8. optimization
  9. performance
  10. programmability

Qualifiers

  • Article

Cited By

  • (2019) On the Portability of CPU-Accelerated Applications via Automated Source-to-Source Translation. Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, pp. 1-8. DOI: 10.1145/3293320.3293338. Online publication date: 14-Jan-2019.
  • (2018) REPLICA MBTAC. The Journal of Supercomputing, 74:5, pp. 1911-1933. DOI: 10.1007/s11227-017-2199-z. Online publication date: 1-May-2018.
  • (2016) Cross-Accelerator Performance Profiling. Proceedings of the XSEDE16 Conference on Diversity, Big Data, and Science at Scale, pp. 1-8. DOI: 10.1145/2949550.2949567. Online publication date: 17-Jul-2016.
  • (2015) Examining recent many-core architectures and programming models using SHOC. Proceedings of the 6th International Workshop on Performance Modeling, Benchmarking, and Simulation of High Performance Computing Systems, pp. 1-12. DOI: 10.1145/2832087.2832090. Online publication date: 15-Nov-2015.
