DOI: 10.1109/CLUSTER.2012.75
Article

Designing an Offloaded Nonblocking MPI_Allgather Collective Using CORE-Direct

Published: 24 September 2012

Abstract

Collective communication operations in the Message Passing Interface (MPI) consume a significant amount of time at scale, degrading the performance of scientific applications. Optimizing collectives is key to application performance and scalability. This paper focuses on hiding the latency of the allgather collective by efficiently offloading it to the networking hardware. We have investigated the use of Mellanox CORE-Direct offloading technology to progress communication within the collective independently of the host, in order to achieve high communication/computation overlap. This study evaluates several design options for the nonblocking allgather collective and discusses implementations of offloaded Standard Exchange, Ring and Bruck algorithms in flat and hierarchical communicators under single-port and k-port modelling. We have applied our findings to improve the performance of a redesigned Radix Sort application kernel. Performance results suggest that our offloaded nonblocking allgather compares favourably to the blocking variant (with improvements of up to 68% for medium messages in a hierarchical collective) while providing high overlap capability. Multiport modelling is shown to be beneficial, especially in a flat communicator. Radix Sort enjoys up to 40% improvement in its runtime.
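To make the algorithm names in the abstract concrete, the following is a minimal, host-side simulation of the textbook Ring and Bruck allgather schedules under a single-port model. It is only a sketch of the communication patterns as commonly described; it does not use MPI or CORE-Direct and is not the paper's offloaded implementation. The function names `ring_allgather` and `bruck_allgather` and the list-based buffers are assumptions made for illustration.

```python
def ring_allgather(blocks):
    """Simulate a ring allgather; blocks[r] is rank r's contribution.

    Returns (buffers, steps), where buffers[r][i] is rank i's block as
    seen by rank r. Hypothetical helper, not an MPI call.
    """
    p = len(blocks)
    buffers = [[None] * p for _ in range(p)]
    for r in range(p):
        buffers[r][r] = blocks[r]
    steps = 0
    for step in range(p - 1):
        # In each step, rank r forwards the block that originated at rank
        # (r - step) mod p to its right neighbour (r + 1) mod p.
        # All sends are collected first so the step behaves as simultaneous.
        sends = []
        for r in range(p):
            origin = (r - step) % p
            sends.append(((r + 1) % p, origin, buffers[r][origin]))
        for dst, origin, data in sends:
            buffers[dst][origin] = data
        steps += 1
    return buffers, steps


def bruck_allgather(blocks):
    """Simulate a Bruck-style allgather; same return shape as above."""
    p = len(blocks)
    # local[r] holds blocks in the order r, r+1, r+2, ... (mod p).
    local = [[blocks[r]] for r in range(p)]
    dist, steps = 1, 0
    while dist < p:
        # Rank r receives from rank (r + dist) mod p as many blocks as it
        # still needs, capped at the number of blocks it already holds.
        count = min(dist, p - dist)
        incoming = [local[(r + dist) % p][:count] for r in range(p)]
        for r in range(p):
            local[r].extend(incoming[r])
        dist *= 2
        steps += 1
    # Final local rotation so that index i holds rank i's block.
    buffers = [[None] * p for _ in range(p)]
    for r in range(p):
        for j, data in enumerate(local[r]):
            buffers[r][(r + j) % p] = data
    return buffers, steps


if __name__ == "__main__":
    data = ["a", "b", "c", "d", "e"]
    print(ring_allgather(data))
    print(bruck_allgather(data))
```

In this simulation the ring schedule takes p - 1 steps while the Bruck schedule takes a number of steps logarithmic in p. That is one general reason the choice of algorithm can depend on message size and communicator layout, though the abstract itself does not detail these trade-offs.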

Cited By

  • (2018) A Dedicated Message Matching Mechanism for Collective Communications. Workshop Proceedings of the 47th International Conference on Parallel Processing, pp. 1-10. DOI: 10.1145/3229710.3229712. Online publication date: 13-Aug-2018

Published In

CLUSTER '12: Proceedings of the 2012 IEEE International Conference on Cluster Computing
September 2012
630 pages
ISBN: 9780769548074

Publisher

IEEE Computer Society

United States

Author Tags

  1. MPI
  2. allgather
  3. collective communication
  4. coredirect
  5. message passing
  6. offloading

Qualifiers

  • Article
