Guest editors' note
This special issue gathers selected papers from the Workshop on Clusters, Clouds and Data for Scientific Computing, held at La Maison des Contes in Dareizé, France, on October 3–6, 2016.
Evolution of a minimal parallel programming model
We take a historical approach to our presentation of self-scheduled task parallelism, a programming model with its origins in early irregular and nondeterministic computations encountered in automated theorem proving and logic programming. We show how ...
Topology-aware job mapping
A Resource and Job Management System (RJMS) is a crucial piece of system software in the HPC stack. It is responsible for efficiently delivering computing power to applications in supercomputing environments. Its main intelligence relies on resource selection ...
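To make "resource selection" concrete, here is a minimal sketch of one possible topology-aware strategy; the node-to-switch map and the greedy heuristic below are illustrative assumptions, not the algorithm evaluated in the article:

```python
# Minimal illustration of topology-aware node selection (hypothetical
# cluster layout): prefer allocations that span as few switches as possible.

node_to_switch = {                      # assumed node -> leaf-switch map
    "n01": "sw0", "n02": "sw0", "n03": "sw0", "n04": "sw0",
    "n05": "sw1", "n06": "sw1", "n07": "sw1", "n08": "sw1",
}

def select_nodes(free_nodes, count):
    """Greedily pick `count` free nodes, filling the fullest switch first."""
    by_switch = {}
    for node in free_nodes:
        by_switch.setdefault(node_to_switch[node], []).append(node)
    chosen = []
    for _, nodes in sorted(by_switch.items(), key=lambda kv: -len(kv[1])):
        chosen.extend(nodes[: count - len(chosen)])
        if len(chosen) == count:
            return chosen
    return None                          # not enough free nodes

print(select_nodes(["n01", "n02", "n05", "n06", "n07"], 3))
# -> ['n05', 'n06', 'n07']: one switch used instead of two
```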
BOAST
- Brice Videau,
- Kevin Pouget,
- Luigi Genovese,
- Thierry Deutsch,
- Dimitri Komatitsch,
- Frédéric Desprez,
- Jean-François Méhaut
The portability of real high-performance computing (HPC) applications to new platforms is an open and very delicate problem. In particular, the performance portability of the underlying computing kernels is problematic, as they need to be tuned for each and ...
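To illustrate the kind of per-platform tuning involved (BOAST generates and benchmarks kernel variants from a dedicated DSL; the toy autotuner below is only a sketch of the general idea), one can time several variants of a kernel on the target machine and keep the fastest:

```python
# Rough sketch of empirical kernel autotuning: benchmark candidate variants
# of a kernel on the target platform and keep the fastest one.
# (Illustrative only; not BOAST's actual DSL or code generator.)
import time
import numpy as np

def matmul_blocked(a, b, block):
    """Toy blocked matrix multiply; `block` is the tuning parameter."""
    n = a.shape[0]
    c = np.zeros((n, n))
    for i in range(0, n, block):
        for j in range(0, n, block):
            for k in range(0, n, block):
                c[i:i+block, j:j+block] += (
                    a[i:i+block, k:k+block] @ b[k:k+block, j:j+block]
                )
    return c

n = 256
a, b = np.random.rand(n, n), np.random.rand(n, n)
timings = {}
for block in (16, 32, 64, 128):          # candidate tuning values
    start = time.perf_counter()
    matmul_blocked(a, b, block)
    timings[block] = time.perf_counter() - start
print("fastest block size on this platform:", min(timings, key=timings.get))
```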
Task-based programming in COMPSs to converge from HPC to big data
Task-based programming has proven to be a suitable model for high-performance computing (HPC) applications. Different implementations have been good demonstrators of this fact and have promoted the acceptance of task-based programming in the OpenMP ...
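For readers new to the model, the sketch below shows the general shape of task-based code using only Python's standard library; it illustrates the paradigm, not the COMPSs API itself, which derives tasks and their data dependencies from annotations:

```python
# Generic task-based pattern: independent function invocations become tasks
# that a runtime executes in parallel, with later stages waiting on earlier
# results. (Paradigm illustration only, not COMPSs.)
from concurrent.futures import ProcessPoolExecutor

def preprocess(chunk):
    return [x * 2 for x in chunk]

def partial_sum(chunk):
    return sum(chunk)

if __name__ == "__main__":
    data = [list(range(i, i + 1000)) for i in range(0, 4000, 1000)]
    with ProcessPoolExecutor() as pool:
        cleaned = pool.map(preprocess, data)       # independent tasks in parallel
        partials = pool.map(partial_sum, cleaned)  # second stage, per chunk
        print(sum(partials))
```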
Anatomy of machine learning algorithm implementations in MPI, Spark, and Flink
With the ever-increasing need to analyze large amounts of data for useful insights, it is essential to develop complex parallel machine learning algorithms that can scale with data size and the number of parallel processes. These algorithms need to run on ...
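A building block common to data-parallel implementations in all three frameworks is the global reduction of per-partition results; the mpi4py sketch below shows the pattern for a hypothetical distributed mean:

```python
# Communication pattern shared by data-parallel ML: each rank computes a
# partial statistic, then a global reduction combines them (mpi4py sketch).
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

local_data = np.random.rand(1_000_000)        # this rank's shard of the data
global_sum = comm.allreduce(local_data.sum(), op=MPI.SUM)
global_count = comm.allreduce(local_data.size, op=MPI.SUM)

if rank == 0:
    print("global mean:", global_sum / global_count)
# Run with, e.g.: mpirun -n 4 python mean_allreduce.py
```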
DARE
- Charalampos Chalios,
- Giorgis Georgakoudis,
- Konstantinos Tovletoglou,
- George Karakonstantis,
- Hans Vandierendonck,
- Dimitrios S. Nikolopoulos
Power consumption and reliability of memory components are two of the most important hurdles in realizing exascale systems. Dynamic random access memory (DRAM) scaling projections predict significant performance and power penalties due to the conventional ...
Resilient co-scheduling of malleable applications
Recently, the benefits of co-scheduling several applications have been demonstrated in a fault-free context, both in terms of performance and energy savings. However, large-scale computer systems are confronted by frequent failures, and resilience ...
Software-defined environments for science and engineering
Service-based access models coupled with recent advances in application deployment technologies are enabling opportunities for realizing highly customized software-defined environments that can achieve new levels of efficiencies and can support emerging ...
Co-scheduling Amdahl applications on cache-partitioned systems
Cache-partitioned architectures allow subsections of the shared last-level cache (LLC) to be exclusively reserved for some applications. This technique dramatically limits interactions between applications that are concurrently executing on a multicore ...
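Here, "Amdahl applications" are workloads modeled by Amdahl's law: if a fraction s of the work is serial, the speedup on p cores is

```latex
% Amdahl's law: speedup of an application whose serial fraction is s,
% executed on p cores.
S(p) = \frac{1}{s + \frac{1-s}{p}} \le \frac{1}{s}
```

so speedup saturates at 1/s no matter how many cores an application receives.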
A failure detector for HPC platforms
- George Bosilca,
- Aurelien Bouteiller,
- Amina Guermouche,
- Thomas Herault,
- Yves Robert,
- Pierre Sens,
- Jack Dongarra
Building an infrastructure for exascale applications requires, in addition to many other key components, a stable and efficient failure detector. This article describes the design and evaluation of a robust failure detector that can maintain and ...
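For background, the classic heartbeat approach (sketched generically below; the article's detector is a more elaborate protocol that must itself tolerate failures) suspects a process once its heartbeats stop arriving within a timeout:

```python
# Generic heartbeat failure detector: a process is suspected once its last
# heartbeat is older than a timeout. (Background sketch only.)
import time

class HeartbeatDetector:
    def __init__(self, timeout_s=5.0):
        self.timeout_s = timeout_s
        self.last_seen = {}            # process id -> last heartbeat time

    def heartbeat(self, pid):
        self.last_seen[pid] = time.monotonic()

    def suspected(self):
        now = time.monotonic()
        return [pid for pid, t in self.last_seen.items()
                if now - t > self.timeout_s]

det = HeartbeatDetector(timeout_s=0.1)
det.heartbeat("rank-0")
det.heartbeat("rank-1")
time.sleep(0.2)
det.heartbeat("rank-0")                # rank-0 is alive, rank-1 went silent
print(det.suspected())                 # -> ['rank-1']
```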
The future of scientific workflows
- Ewa Deelman,
- Tom Peterka,
- Ilkay Altintas,
- Christopher D. Carothers
- Kerstin Kleese van Dam,
- Kenneth Moreland,
- Manish Parashar,
- Lavanya Ramakrishnan,
- Michela Taufer,
- Jeffrey Vetter
Today's computational, experimental, and observational sciences rely on computations that involve many related tasks. The success of a scientific mission often hinges on the computer automation of these workflows. In April 2015, the US Department of ...
Reducing the energy consumption of large-scale computing systems through combined shutdown policies with multiple constraints
Large-scale distributed systems (high-performance computing centers, networks, data centers) are expected to consume huge amounts of energy. In order to address this issue, shutdown policies constitute an appealing approach able to dynamically adapt the ...
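The essence of a shutdown policy fits in a few lines (a toy sketch; the article studies combined policies under multiple constraints): power a node down only once it has been idle past the energy break-even time.

```python
# Toy idle-timeout shutdown policy: switch a node off only if it has been
# idle longer than the break-even time (the idle duration after which
# shutting down and rebooting costs less energy than staying idle).
def should_shutdown(idle_s, t_breakeven_s, next_job_eta_s=None):
    if idle_s < t_breakeven_s:
        return False                   # too soon: shutdown would waste energy
    if next_job_eta_s is not None and next_job_eta_s < t_breakeven_s:
        return False                   # a job arrives before shutdown pays off
    return True

print(should_shutdown(idle_s=600, t_breakeven_s=300))            # True
print(should_shutdown(600, 300, next_job_eta_s=120))             # False
```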
Morton ordering of 2D arrays for efficient access to hierarchical memory
This article investigates the recursive Morton ordering of two-dimensional arrays as an efficient way to access hierarchical memory across a range of heterogeneous computer platforms, from manycore devices and multicore processors to clusters and ...
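For reference, the Morton (Z-order) index of element (i, j) interleaves the bits of the two coordinates, so that elements close in 2D stay close in memory; a minimal sketch:

```python
# Morton (Z-order) index of a 2D coordinate: interleave the bits of i and j
# so that nearby (i, j) pairs map to nearby memory addresses.
def morton_index(i, j, bits=16):
    idx = 0
    for b in range(bits):
        idx |= ((i >> b) & 1) << (2 * b + 1)   # bits of i at odd positions
        idx |= ((j >> b) & 1) << (2 * b)       # bits of j at even positions
    return idx

# A 4x4 array visited in Morton order traces a recursive "Z" curve:
order = sorted(((i, j) for i in range(4) for j in range(4)),
               key=lambda ij: morton_index(*ij))
print(order[:8])
# -> [(0, 0), (0, 1), (1, 0), (1, 1), (0, 2), (0, 3), (1, 2), (1, 3)]
```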