- Sponsor: SIGHPC
One of the hard realities is that hardware continues to evolve very rapidly, with diverse memory subsystems, cores with different ISAs, and accelerators of varied types. The HPC community is in constant need of sophisticated software tools and techniques to port legacy code to these emerging platforms. Maintaining a single code base while achieving a performance-portable solution remains a daunting task. Directive-based programming models such as OpenACC and OpenMP tackle this issue by offering scientists a high-level approach to accelerate scientific applications and develop performance-portable solutions. This enables accelerators to be first-class citizens for HPC!
To address the rapid pace of hardware evolution, developers continue to explore and add richer features to the various (parallel) programming standards. Domain scientists continue to explore the programming and tools space while preparing themselves for future Exascale systems.
This workshop solicits papers that explore innovative language features and their implementations, compilation and runtime scheduling techniques, performance optimization strategies, and autotuning tools that explore the optimization space.
WACCPD has been one of the major forums for bringing together the users, developers and tools community to share their knowledge and experiences of using directives and similar approaches to program emerging complex systems.
Proceeding Downloads
Acceleration of element-by-element kernel in unstructured implicit low-order finite-element earthquake simulation using OpenACC on Pascal GPUs
The element-by-element computation used in matrix-vector multiplications is the key kernel for attaining high performance in unstructured implicit low-order finite-element earthquake simulations. We accelerate this CPU-based element-by-element kernel by ...
Towards achieving performance portability using directives for accelerators
- M. Graham Lopez,
- Verónica Vergara Larrea,
- Wayne Joubert,
- Oscar Hernandez,
- Azzam Haidar,
- Stanimire Tomov,
- Jack Dongarra
In this paper we explore the performance portability of directives provided by OpenMP 4 and OpenACC to program various types of node architectures with attached accelerators, both self-hosted multicore and offload multicore/GPU. Our goal is to examine ...
A modern memory management system for OpenMP
Modern computers with multi-/many-core processors and accelerators feature a sophisticated and deep memory hierarchy, potentially including distinct main memory, high-bandwidth memory, texture memory and scratchpad memory. The performance ...
An extension of OpenACC directives for out-of-core stencil computation with temporal blocking
In this paper, aiming at realizing directive-based temporal blocking for out-of-core stencil computation, we present an extension of OpenACC directives and a source-to-source translator capable of accelerating out-of-core stencil computation on a ...
OpenACC cache directive: opportunities and optimizations
OpenACC's programming model presents a simple interface to programmers, offering a trade-off between performance and development effort. OpenACC relies on compiler technologies to generate efficient code and optimize for performance. Among the difficult ...
Identifying and scheduling loop chains using directives
Exposing opportunities for parallelization while explicitly managing data locality is the primary challenge to porting and optimizing existing computational science simulation codes to improve performance and accuracy. OpenMP provides many mechanisms ...
Exploring compiler optimization opportunities for the OpenMP 4.x accelerator model on a POWER8+GPU platform
While GPUs are increasingly popular for high-performance computing, optimizing the performance of GPU programs is a time-consuming and non-trivial process in general. This complexity stems from the low abstraction level of standard GPU programming ...
A portable, high-level graph analytics framework targeting distributed, heterogeneous systems
As the HPC and Big Data communities continue to converge, heterogeneous and distributed systems are becoming commonplace. In order to take advantage of the immense computing power of these systems, distributing data efficiently and leveraging ...
Recommendations
Towards achieving performance portability using directives for accelerators
WACCPD '16: Proceedings of the Third International Workshop on Accelerator Programming Using Directives
Acceptance Rates
| Year | Submitted | Accepted | Rate |
|---|---|---|---|
| WACCPD '15 | 14 | 7 | 50% |
| Overall | 14 | 7 | 50% |