MPI Support for Multi-core Architectures: Optimized Shared Memory Collectives

Richard L. Graham¹ &
Galen Shipman¹

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 5205)

Included in the following conference series:

European Parallel Virtual Machine / Message Passing Interface Users’ Group Meeting

Abstract

With local core counts on the rise, taking advantage of shared memory to optimize collective operations can improve performance. We study several on-host, shared-memory-optimized algorithms for MPI_Bcast, MPI_Reduce, and MPI_Allreduce, using tree-based and reduce-scatter algorithms. For small-data operations, where synchronization costs are relatively large, fan-in/fan-out algorithms generally perform best. For large messages, data manipulation constitutes the largest cost, and reduce-scatter algorithms are best for reductions. These optimizations improve performance by up to a factor of three. Memory- and cache-sharing effects require deliberate process layout and careful radix selection for tree-based methods.
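The two algorithm families named in the abstract can be illustrated with a small single-process simulation. This is a sketch under stated assumptions, not the authors' implementation: plain Python lists stand in for per-process shared-memory buffers, the function names (`tree_children`, `fan_in_reduce`, `reduce_scatter_allreduce`) are hypothetical, the tree is assumed to be rooted at rank 0 with the children of rank r at r*radix+1 through r*radix+radix, and the reduction operation is assumed to be a sum.

```python
from typing import List


def tree_children(rank: int, size: int, radix: int) -> List[int]:
    """Children of `rank` in a radix-`radix` tree rooted at rank 0 (assumed layout)."""
    first = rank * radix + 1
    return [c for c in range(first, first + radix) if c < size]


def fan_in_reduce(buffers: List[List[float]], radix: int) -> List[float]:
    """Fan-in: sum every rank's buffer into rank 0 along the tree."""
    size = len(buffers)
    partial = [list(b) for b in buffers]  # each rank's working copy
    # A child always has a higher rank than its parent, so visiting ranks in
    # descending order ensures each child is finished before its parent reads it.
    for rank in range(size - 1, -1, -1):
        for child in tree_children(rank, size, radix):
            partial[rank] = [a + b for a, b in zip(partial[rank], partial[child])]
    return partial[0]


def reduce_scatter_allreduce(buffers: List[List[float]]) -> List[List[float]]:
    """Allreduce as reduce-scatter followed by allgather."""
    size, n = len(buffers), len(buffers[0])
    # Split the vector into `size` contiguous segments; rank r owns segment r.
    bounds = [(r * n) // size for r in range(size + 1)]
    # Reduce-scatter phase: each owner sums only its own segment across all ranks.
    segments = []
    for owner in range(size):
        lo, hi = bounds[owner], bounds[owner + 1]
        segments.append([sum(b[i] for b in buffers) for i in range(lo, hi)])
    # Allgather phase: every rank ends up with all reduced segments concatenated.
    full = [x for seg in segments for x in seg]
    return [list(full) for _ in range(size)]
```

In the fan-in version the number of tree levels, and so the number of synchronization steps, shrinks as the radix grows, while each parent handles more children; that is one way to read the abstract's remark about radix selection. In the reduce-scatter version each rank reduces only 1/size of the vector, which spreads the data-manipulation cost that the abstract says dominates for large messages.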

Research sponsored by the Mathematical, Information, and Computational Sciences Division, Office of Advanced Scientific Computing Research, U.S. Department of Energy, under Contract No. DE-AC05-00OR22725 with UT-Battelle, LLC.

Author information

Authors and Affiliations

Oak Ridge National Laboratory, Oak Ridge, TN, USA
Richard L. Graham & Galen Shipman

Editor information

Alexey Lastovetsky, Tahar Kechadi, Jack Dongarra

About this paper

Cite this paper

Graham, R.L., Shipman, G. (2008). MPI Support for Multi-core Architectures: Optimized Shared Memory Collectives. In: Lastovetsky, A., Kechadi, T., Dongarra, J. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 2008. Lecture Notes in Computer Science, vol 5205. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87475-1_21

DOI: https://doi.org/10.1007/978-3-540-87475-1_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-87474-4
Online ISBN: 978-3-540-87475-1
eBook Packages: Computer Science, Computer Science (R0)
