User profiles for Hugo Brunie
Hugo Brunie, INRIA. Verified email at inria.fr. Cited by 87
Timemory: modular performance analysis for HPC
HPC has undergone a significant transition toward heterogeneous architectures. This
transition has introduced several issues in code migration to support multiple frameworks for …
Autotuning convolutions is easier than you think
A wide range of scientific and machine learning applications depend on highly optimized
implementations of tensor computations. Exploiting the full capacity of a given processor …
Tuning floating-point precision using dynamic program information and temporal locality
We present a methodology for precision tuning of full applications. These techniques must
select a search space composed of either variables or instructions and provide a scalable …
Parcoach extension for a full-interprocedural collectives verification
The advent to exascale requires more scalable and efficient techniques to help developers
to locate, analyze and correct errors in parallel applications. PARallel COntrol flow Anomaly …
ASGarD: Adaptive Sparse Grid Discretization
…, BT McDaniel, L Mu, T Younkin, H Brunie… - Journal of Open …, 2024 - joss.theoj.org
Many areas of science exhibit physical processes that are described by high dimensional
partial differential equations (PDEs), e.g., the 4D (Dorf et al., 2013), 5D (Candy et al., 2009) and …
PARCOACH Extension for Hybrid Applications with Interprocedural Analysis
Supercomputers are rapidly evolving with now millions of processing units, posing the
questions of their programmability. Despite the emergence of more widespread and functional …
Profile-guided scope-based data allocation method
The complexity of High Performance Computing nodes memory system increases in order
to challenge application growing memory usage and increasing gap between computation …
Efficient convolution optimisation by composing micro-kernels
Optimizing the implementation of tensor computations is essential to exploiting the full
capacity of a given processor architecture on a wide range of scientific and machine learning …
Optimisation des allocations de données pour des applications du Calcul Haute Performance sur une architecture à mémoires hétérogènes
H Brunie - 2019 - theses.hal.science
High Performance Computing, which brings together all the actors responsible for improving
the computational performance of scientific applications on supercomputers, has …
Poseidon: A Source-to-Source Translator for Holistic HPC Optimizations of Ocean Models on Regular Grids
Ocean simulation models often underperform on modern high-performance computing (HPC)
architectures, necessitating costly and time-consuming code rewrites. We introduce …