DOI: 10.5555/3019120
WACCPD '16: Proceedings of the Third International Workshop on Accelerator Programming Using Directives
2016 Proceeding
Publisher:
  • IEEE Press
Conference:
SC16: The International Conference for High Performance Computing, Networking, Storage and Analysis Salt Lake City Utah November 13 - 18, 2016
ISBN:
978-1-5090-6152-5
Published:
13 November 2016
Sponsors:
SIGHPC, IEEE-CS\DATC
Abstract

One of the hard realities is that hardware continues to evolve very rapidly, with diverse memory subsystems, cores with different ISAs, and accelerators of varied types. The HPC community is in constant need of sophisticated software tools and techniques to port legacy code to these emerging platforms. Maintaining a single code base while achieving a performance-portable solution continues to pose a daunting task. Directive-based programming models such as OpenACC and OpenMP tackle this issue by offering scientists a high-level approach to accelerate scientific applications and develop performance-portable solutions. This enables accelerators to be first-class citizens for HPC!

To address the rapid pace of hardware evolution, developers continue to explore and add richer features to the various (parallel) programming standards. Domain scientists continue to explore the programming and tools space while preparing themselves for future Exascale systems.

This workshop aims to solicit papers that explore innovative language features and their implementations, compilation and runtime scheduling techniques, performance optimization strategies, autotuning tools that explore the optimization space, and so on.

WACCPD has been one of the major forums for bringing together the users, developers and tools community to share their knowledge and experiences of using directives and similar approaches to program emerging complex systems.

research-article
Acceleration of element-by-element kernel in unstructured implicit low-order finite-element earthquake simulation using OpenACC on Pascal GPUs
Pages 1–12

The element-by-element computation used in matrix-vector multiplications is the key kernel for attaining high performance in unstructured implicit low-order finite-element earthquake simulations. We accelerate this CPU-based element-by-element kernel by ...

research-article
Towards achieving performance portability using directives for accelerators
Pages 13–24

In this paper we explore the performance portability of directives provided by OpenMP 4 and OpenACC to program various types of node architectures with attached accelerators, both self-hosted multicore and offload multicore/GPU. Our goal is to examine ...

research-article
A modern memory management system for OpenMP
Pages 25–35

Modern computers with multi-/many-core processors and accelerators feature a sophisticated and deep memory hierarchy, potentially including distinct main memory, high-bandwidth memory, texture memory and scratchpad memory. The performance ...

research-article
An extension of OpenACC directives for out-of-core stencil computation with temporal blocking
Pages 36–45

In this paper, aiming at realizing directive-based temporal blocking for out-of-core stencil computation, we present an extension of OpenACC directives and a source-to-source translator capable of accelerating out-of-core stencil computation on a ...

research-article
OpenACC cache directive: opportunities and optimizations
Pages 46–56

OpenACC's programming model presents a simple interface to programmers, offering a trade-off between performance and development effort. OpenACC relies on compiler technologies to generate efficient code and optimize for performance. Among the difficult ...

research-article
Identifying and scheduling loop chains using directives
Pages 57–67

Exposing opportunities for parallelization while explicitly managing data locality is the primary challenge to porting and optimizing existing computational science simulation codes to improve performance and accuracy. OpenMP provides many mechanisms ...

research-article
Exploring compiler optimization opportunities for the OpenMP 4.x accelerator model on a POWER8+GPU platform
Pages 68–78

While GPUs are increasingly popular for high-performance computing, optimizing the performance of GPU programs is a time-consuming and non-trivial process in general. This complexity stems from the low abstraction level of standard GPU programming ...

research-article
A portable, high-level graph analytics framework targeting distributed, heterogeneous systems
Pages 79–88

As the HPC and Big Data communities continue to converge, heterogeneous and distributed systems are becoming commonplace. In order to take advantage of the immense computing power of these systems, distributing data efficiently and leveraging ...

Contributors
  • University of Delaware
  • Helmholtz-Center Dresden-Rossendorf

Acceptance Rates

Overall Acceptance Rate 7 of 14 submissions, 50%
Year          Submitted  Accepted  Rate
WACCPD '15    14         7         50%
Overall       14         7         50%