MagPIe: MPI's collective communication operations for clustered wide area systems

T Kielmann, RFH Hofman, HE Bal, A Plaat… - Proceedings of the …, 1999 - dl.acm.org
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice …, 1999 - dl.acm.org
Writing parallel applications for computational grids is a challenging task. To achieve good performance, algorithms designed for local area networks must be adapted to the differences in link speeds. Collective operations, such as broadcast and reduce, are an important class of such algorithms. We have developed MagPIe, a library of collective communication operations optimized for wide area systems. MagPIe's algorithms send the minimal amount of data over the slow wide area links and incur only a single wide area latency. Using our system, existing MPI applications can be run unmodified on geographically distributed systems. On moderate cluster sizes, using a wide area latency of 10 milliseconds and a bandwidth of 1 MByte/s, MagPIe executes operations up to 10 times faster than MPICH, a widely used MPI implementation; application kernels improve by up to a factor of 4. Due to the structure of our algorithms, MagPIe's advantage increases for higher wide area latencies.
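
To make the "single wide area latency" claim concrete, the following is a minimal sketch of a latency-only cost model for broadcast. It is not MagPIe's code. It assumes ranks are laid out in equal-sized contiguous clusters, that a message costs only the latency of the link it crosses (send overhead and bandwidth are ignored), and that the cluster-aware variant sends once from the root to a coordinator in each remote cluster and then broadcasts locally. The function names, the 50 microsecond LAN latency, and the cluster layout are illustrative assumptions; only the 10 millisecond wide area latency comes from the abstract.

```python
def cluster_of(rank, cluster_size):
    """Cluster index of a rank, assuming contiguous equal-sized clusters."""
    return rank // cluster_size


def link_latency(a, b, cluster_size, lan, wan):
    """Latency of one message: LAN inside a cluster, WAN between clusters."""
    same = cluster_of(a, cluster_size) == cluster_of(b, cluster_size)
    return lan if same else wan


def binomial_broadcast_time(nprocs, cluster_size, lan, wan):
    """Completion time of a cluster-unaware binomial-tree broadcast from rank 0.

    Each rank receives from the rank obtained by clearing its lowest set bit.
    """
    arrival = [0] * nprocs
    for rank in range(1, nprocs):
        parent = rank & (rank - 1)  # clear lowest set bit; parent < rank
        arrival[rank] = arrival[parent] + link_latency(
            parent, rank, cluster_size, lan, wan
        )
    return max(arrival)


def cluster_aware_broadcast_time(nprocs, cluster_size, lan, wan):
    """Completion time when the root sends once to each remote cluster.

    The first rank of every cluster acts as its coordinator (an assumption).
    Coordinators receive directly from rank 0 over the WAN; all other ranks
    receive over the LAN via a binomial tree rooted at their coordinator.
    """
    arrival = [0] * nprocs
    for rank in range(1, nprocs):
        local = rank % cluster_size
        if local == 0:
            arrival[rank] = arrival[0] + wan  # one WAN hop from the root
        else:
            parent = (rank - local) + (local & (local - 1))
            arrival[rank] = arrival[parent] + lan
    return max(arrival)


if __name__ == "__main__":
    LAN_US, WAN_US = 50, 10_000  # microseconds; the LAN value is assumed
    print(binomial_broadcast_time(32, 8, LAN_US, WAN_US))
    print(cluster_aware_broadcast_time(32, 8, LAN_US, WAN_US))
```

With 4 clusters of 8 processes, the binomial tree in this model puts two wide area hops on its longest path (rank 0 to 16, then 16 to 24), while the cluster-aware variant never exceeds one. The gap grows linearly with the wide area latency, which is consistent with the abstract's remark that the advantage increases for higher latencies.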