DOI: 10.1109/ICPP.2011.29
Article

Kernel Assisted Collective Intra-node MPI Communication among Multi-Core and Many-Core CPUs

Published: 13 September 2011

Abstract

Shared memory is among the most common approaches to implementing message passing within multicore nodes. However, current shared memory techniques do not scale with increasing numbers of cores and expanding memory hierarchies, most notably when handling large data transfers and collective communication. Neglecting the underlying hardware topology, using copy-in/copy-out memory transfer operations, and overloading the memory subsystem with one-to-many types of operations are some of the most common mistakes in today's shared memory implementations. Unfortunately, they all negatively impact the performance and scalability of MPI libraries, and therefore of applications. In this paper, we present several kernel-assisted intra-node collective communication techniques that address these three issues on many-core systems. We also present a new Open MPI collective communication component that uses the KNEM Linux module for direct inter-process memory copying. Our Open MPI component implements several novel strategies to decrease the number of intermediate memory copies and improve data locality in order to diminish both cache pollution and memory pressure. Experimental results show that our KNEM-enabled Open MPI collective component can outperform state-of-the-art MPI libraries (Open MPI and MPICH2) on synthetic benchmarks, resulting in a significant improvement for a typical graph application.
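The cost of copy-in/copy-out transfers mentioned in the abstract can be illustrated with a toy byte-count model. The sketch below is not taken from the paper: the function names and the accounting are assumptions made for illustration, and the model ignores cache effects, NUMA placement, and pipelining. It only counts how many bytes cross the memory subsystem when one process sends a message to several receivers on the same node, either through a shared staging buffer or by a direct single copy of the kind a kernel module such as KNEM makes possible.

```python
"""Toy model (illustrative assumption, not the paper's method):
bytes moved by a one-to-many transfer among processes on one node."""


def copy_in_copy_out_bytes(message_bytes: int, receivers: int) -> int:
    """Shared-memory staging: the sender copies the message into a shared
    buffer once (copy-in), then each receiver copies it out (copy-out)."""
    copy_in = message_bytes
    copy_out = message_bytes * receivers
    return copy_in + copy_out


def single_copy_bytes(message_bytes: int, receivers: int) -> int:
    """Direct transfer: each receiver copies straight from the sender's
    buffer, so no intermediate staging copy is made."""
    return message_bytes * receivers


if __name__ == "__main__":
    message = 1 << 20  # 1 MiB, an arbitrary example size
    for receivers in (1, 7, 47):
        staged = copy_in_copy_out_bytes(message, receivers)
        direct = single_copy_bytes(message, receivers)
        print(f"{receivers:2d} receivers: staged={staged} B, direct={direct} B")
```

In this model a point-to-point transfer moves twice the message size when staged and exactly the message size when copied directly. For one-to-many operations the staging copy is amortized over the receivers, which is consistent with the abstract treating memory-subsystem overload and data locality as separate problems from the copy count.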

Published In

ICPP '11: Proceedings of the 2011 International Conference on Parallel Processing
September 2011
796 pages
ISBN: 9780769545103

Publisher

IEEE Computer Society, United States

Author Tags

1. MPI
2. NUMA
3. collective communication
4. kernel
5. many-core
6. multi-core
7. shared memory

Cited By

• (2023) Synchronizing MPI Processes in Space and Time. Proceedings of the 30th European MPI Users' Group Meeting, pp. 1-11. DOI: 10.1145/3615318.3615325. Online publication date: 11-Sep-2023
• (2023) Impact of Cache Coherence on the Performance of Shared-Memory based MPI Primitives: A Case Study for Broadcast on Intel Xeon Scalable Processors. Proceedings of the 52nd International Conference on Parallel Processing, pp. 295-305. DOI: 10.1145/3605573.3605616. Online publication date: 7-Aug-2023
• (2022) Designing Hierarchical Multi-HCA Aware Allgather in MPI. Workshop Proceedings of the 51st International Conference on Parallel Processing, pp. 1-10. DOI: 10.1145/3547276.3548524. Online publication date: 29-Aug-2022
• (2018) Framework for scalable intra-node collective operations using shared memory. Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, pp. 1-12. DOI: 10.5555/3291656.3291695; 10.1109/SC.2018.00032. Online publication date: 11-Nov-2018
• (2018) Cooperative rendezvous protocols for improved performance and overlap. Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, pp. 1-13. DOI: 10.5555/3291656.3291694; 10.1109/SC.2018.00031. Online publication date: 11-Nov-2018
• (2017) Formal modeling and performance evaluation of a run-time rank remapping technique in Broadcast, Allgather and Allreduce MPI collective operations. Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, pp. 963-972. DOI: 10.1109/CCGRID.2017.32. Online publication date: 14-May-2017
• (2016) Numerical weather model BRAMS evaluation on many-core architectures. International Journal of Computational Science and Engineering 12:4, pp. 330-340. DOI: 10.1504/IJCSE.2016.076940. Online publication date: 1-Jan-2016
• (2016) Architectural support for efficient message passing on shared memory multi-cores. Journal of Parallel and Distributed Computing 95:C, pp. 92-106. DOI: 10.1016/j.jpdc.2016.02.005. Online publication date: 1-Sep-2016