DOI: 10.1145/3176364.3176367

Recent experiences in using MPI-3 RMA in the DASH PGAS runtime

Published: 31 January 2018

Abstract

The Partitioned Global Address Space (PGAS) programming model has become a viable alternative to traditional message passing using MPI. The DASH project provides a PGAS abstraction entirely based on C++11. The underlying DASH RunTime, DART, provides communication and management functionality transparently to the user. In order to facilitate incremental transitions of existing MPI-parallel codes, the development of DART has focused on creating a PGAS runtime based on the MPI-3 RMA standard. From an MPI-RMA user perspective, this paper outlines our recent experiences in the development of DART and presents insights into the issues we faced and how we attempted to solve them, including memory allocation, memory consistency, and communication latencies. We implemented a set of benchmarks for global memory allocation latency within the framework of the OSU micro-benchmark suite and present allocation and communication latency measurements for different global memory allocation strategies under three different MPI implementations.
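
As background for readers unfamiliar with MPI-3 RMA, the following minimal sketch illustrates the kind of one-sided primitives a PGAS runtime such as DART builds on: collective allocation of globally accessible memory with MPI_Win_allocate, a passive-target epoch opened with MPI_Win_lock_all, one-sided puts completed with MPI_Win_flush, and MPI_Win_sync before local reads. The sketch is illustrative only; it is not code from the paper or from DART, and the payload values and the rank-0-writes-to-all pattern are arbitrary assumptions.

    /* Minimal MPI-3 RMA sketch (illustrative only, not DART code):
     * rank 0 writes one integer into every rank's globally accessible
     * window slot using one-sided puts under passive-target synchronization. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);

        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        /* Collectively allocate one int of globally accessible memory per rank. */
        int *baseptr;
        MPI_Win win;
        MPI_Win_allocate(sizeof(int), sizeof(int), MPI_INFO_NULL,
                         MPI_COMM_WORLD, &baseptr, &win);

        /* Open a passive-target epoch on all ranks. */
        MPI_Win_lock_all(0, win);

        if (rank == 0) {
            for (int target = 0; target < size; ++target) {
                int value = 100 + target;  /* arbitrary payload (assumption) */
                MPI_Put(&value, 1, MPI_INT, target, 0, 1, MPI_INT, win);
                MPI_Win_flush(target, win);  /* remote completion before reusing 'value' */
            }
        }

        MPI_Barrier(MPI_COMM_WORLD);  /* targets wait until the puts have completed */
        MPI_Win_sync(win);            /* synchronize public and private window copies */

        printf("rank %d sees value %d\n", rank, baseptr[0]);

        MPI_Win_unlock_all(win);
        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }

Built with an MPI-3 implementation (e.g., via mpicc) and launched with mpirun, each rank should print the value rank 0 put into its slot; the global memory allocation strategies and latencies benchmarked in the paper concern calls of this MPI_Win_allocate kind.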

Cited By

  • (2019) Using MPI-3 RMA for Active Messages. In 2019 IEEE/ACM Workshop on Exascale MPI (ExaMPI), pp. 47-56. DOI: 10.1109/ExaMPI49596.2019.00011. Online publication date: Nov 2019.
  • (2019) Global Task Data-Dependencies in PGAS Applications. In High Performance Computing, pp. 312-329. DOI: 10.1007/978-3-030-20656-7_16. Online publication date: 17 May 2019.

    Published In

    HPCAsia '18 Workshops: Proceedings of Workshops of HPC Asia
    January 2018
    86 pages
    ISBN: 9781450363471
    DOI: 10.1145/3176364

    Sponsors

    • IPSJ: Information Processing Society of Japan

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. DASH
    2. MPI-RMA
    3. PGAS
    4. communication latency
    5. global memory allocation
    6. partitioned global address space

    Qualifiers

    • Research-article

    Funding Sources

    • German Research Foundation (DFG)

    Conference

    HPC Asia 2018 WS: Workshops of HPC Asia 2018
    Sponsor: IPSJ
    January 31, 2018
    Chiyoda, Tokyo, Japan

    Acceptance Rates

    Overall Acceptance Rate 69 of 143 submissions, 48%
