CRAUL: Compiler and Run‐Time Integration for Adaptation under Load

S Ioannidis, U Rencuzogullari, R Stets, S Dwarkadas
Scientific Programming, 1999 - Wiley Online Library
Clusters of workstations provide a cost‐effective, high‐performance parallel computing environment. These environments, however, are often shared by multiple users, or may consist of heterogeneous machines. As a result, parallel applications executing in these environments must operate despite unequal computational resources. For maximum performance, applications should automatically adapt execution to maximize use of the available resources. Ideally, this adaptation should be transparent to the application programmer. In this paper, we present CRAUL (Compiler and Run‐Time Integration for Adaptation Under Load), a system that dynamically balances computational load in a parallel application. Our target run‐time is software‐based distributed shared memory (SDSM). SDSM is a good target for parallelizing compilers since it reduces compile‐time complexity by providing data caching and other support for dynamic load balancing. CRAUL combines compile‐time support to identify data access patterns with a run‐time system that uses the access information to intelligently distribute the parallel workload in loop‐based programs. The distribution is chosen according to the relative power of the processors, so as to minimize SDSM overhead and maximize locality. We have evaluated the resulting load distribution in the presence of three types of load: purely computational load, combined computational and memory‐intensive load, and network load. CRAUL performs within 5–23% of ideal in the presence of load, and is able to improve on naive compiler‐based work distribution that does not take locality into account, even in the absence of load.
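
The idea of distributing loop iterations according to the relative power of the processors can be illustrated with a minimal Python sketch. This is not CRAUL's algorithm: the function names `relative_power` and `partition_iterations` are hypothetical, relative power is assumed to be the inverse of a measured time per iteration, and the sketch ignores the SDSM overhead and compiler-provided access information that the abstract says the real system uses. Contiguous blocks are used here only as a simple stand-in for locality.

```python
def relative_power(seconds_per_iteration):
    """Assumption: a processor's power is the inverse of its measured
    time per loop iteration, normalized so the fastest processor is 1.0."""
    if not seconds_per_iteration or any(t <= 0 for t in seconds_per_iteration):
        raise ValueError("need a positive time for every processor")
    fastest = min(seconds_per_iteration)
    return [fastest / t for t in seconds_per_iteration]


def partition_iterations(n_iters, power):
    """Split range(n_iters) into one contiguous [start, end) block per
    processor, with block sizes proportional to the given powers."""
    if n_iters < 0:
        raise ValueError("n_iters must be non-negative")
    if not power or any(p < 0 for p in power) or sum(power) <= 0:
        raise ValueError("power must be non-negative with a positive sum")
    total = sum(power)
    bounds = []
    start = 0
    cumulative = 0.0
    for i, p in enumerate(power):
        cumulative += p
        # Rounding the cumulative share keeps the boundaries monotone;
        # the last block always ends at n_iters so no iteration is lost.
        if i == len(power) - 1:
            end = n_iters
        else:
            end = round(n_iters * cumulative / total)
        bounds.append((start, end))
        start = end
    return bounds


if __name__ == "__main__":
    # Hypothetical measurements: processors 1 and 2 are loaded and run
    # each iteration twice as slowly as processor 0.
    power = relative_power([1.0, 2.0, 2.0])
    print(partition_iterations(100, power))
```

With these hypothetical measurements, the unloaded processor receives half of the iterations and each loaded processor a quarter. A run-time system in the spirit of the abstract would re-measure and re-partition as load changes, which this sketch leaves out.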