MPI Support for Multi-core Architectures: Optimized Shared Memory Collectives

  • Conference paper
Recent Advances in Parallel Virtual Machine and Message Passing Interface (EuroPVM/MPI 2008)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 5205)

Abstract

With local core counts on the rise, taking advantage of shared memory to optimize collective operations can improve performance. We study several on-host, shared-memory-optimized algorithms for MPI_Bcast, MPI_Reduce, and MPI_Allreduce, using tree-based and reduce-scatter algorithms. For small-data operations, where synchronization costs are relatively large, fan-in/fan-out algorithms generally perform best. For large messages, data manipulation constitutes the largest cost, and reduce-scatter algorithms are best for reductions. These optimizations improve performance by up to a factor of three. Memory and cache sharing effects require deliberate process layout and careful radix selection for tree-based methods.
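The two algorithm families named in the abstract can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the authors' implementation: it uses plain Python lists in place of shared-memory segments, performs no real synchronization, assumes a commutative and associative reduction operator, and the function names `fan_in_fan_out_allreduce` and `reduce_scatter_allreduce` are hypothetical. The tree layout (parent of rank r is (r - 1) // radix) is likewise an assumption chosen for brevity.

```python
from operator import add


def fan_in_fan_out_allreduce(values, radix=2, op=add):
    """Simulate an allreduce over a tree with the given radix.

    values[i] is the contribution of simulated rank i. Assumes `op` is
    commutative and associative (hypothetical helper, not a real MPI call).
    """
    n = len(values)
    partial = list(values)
    # Fan-in: walk ranks from highest to lowest so that every child has
    # been folded into a rank before that rank is folded into its parent.
    for rank in range(n - 1, 0, -1):
        parent = (rank - 1) // radix
        partial[parent] = op(partial[parent], partial[rank])
    # Fan-out: each rank copies the final value from its parent.
    result = [None] * n
    result[0] = partial[0]
    for rank in range(1, n):
        result[rank] = result[(rank - 1) // radix]
    return result


def reduce_scatter_allreduce(vectors, op=add):
    """Simulate a reduce-scatter followed by an allgather.

    vectors[r] is the input vector of simulated rank r; all vectors are
    assumed to have the same length.
    """
    n = len(vectors)
    length = len(vectors[0])
    # Rank r owns the slice bounds[r]:bounds[r + 1] of the result.
    bounds = [r * length // n for r in range(n + 1)]
    owned = []
    for r in range(n):
        lo, hi = bounds[r], bounds[r + 1]
        # Reduce-scatter step: rank r reduces only its own slice.
        chunk = list(vectors[0][lo:hi])
        for src in range(1, n):
            chunk = [op(a, b) for a, b in zip(chunk, vectors[src][lo:hi])]
        owned.append(chunk)
    # Allgather step: every rank ends up with all reduced slices.
    full = [x for chunk in owned for x in chunk]
    return [list(full) for _ in range(n)]
```

In the tree variant, the radix trades tree depth (the number of synchronization steps) against the number of children each parent must combine, which is one way to read the abstract's remark about radix selection. In the reduce-scatter variant, each simulated rank reduces only its own slice of the data, so the arithmetic is spread across ranks rather than concentrated at a root.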

Research sponsored by the Mathematical, Information, and Computational Sciences Division, Office of Advanced Scientific Computing Research, U.S. Department of Energy, under Contract No. DE-AC05-00OR22725 with UT-Battelle, LLC.

Editor information

Alexey Lastovetsky, Tahar Kechadi, Jack Dongarra

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Graham, R.L., Shipman, G. (2008). MPI Support for Multi-core Architectures: Optimized Shared Memory Collectives. In: Lastovetsky, A., Kechadi, T., Dongarra, J. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 2008. Lecture Notes in Computer Science, vol 5205. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-87475-1_21

  • DOI: https://doi.org/10.1007/978-3-540-87475-1_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-87474-4

  • Online ISBN: 978-3-540-87475-1

  • eBook Packages: Computer Science, Computer Science (R0)
