Automating parallel runtime optimizations using post-mortem analysis
S Krishnan, LV Kale - Proceedings of the 10th international conference …, 1996 - dl.acm.org
Proceedings of the 10th international conference on Supercomputing, 1996 • dl.acm.org
Abstract
Attaining good performance for parallel programs frequently requires substantial expertise and effort, which can be reduced by automated optimization. In this paper we concentrate on run-time optimizations and techniques to automate them without programmer intervention, using post-mortem analysis of parallel program execution. We classify the characteristics of parallel programs with respect to object placement (mapping), scheduling, and communication, then describe techniques to discover these characteristics by post-mortem analysis, present heuristics to choose appropriate optimizations based on these characteristics, and describe techniques to generate concise hints to runtime optimization libraries. Our ideas have been developed in the framework of the Paradise post-mortem analysis tool for the parallel object-oriented language Charm++. We also present results for optimizing simple parallel programs running on the Thinking Machines CM-5.