
DOI: 10.1145/1504176.1504213

MPIWiz: subgroup reproducible replay of MPI applications

Published: 14 February 2009

Abstract

Message Passing Interface (MPI) is a widely used standard for managing coarse-grained concurrency on distributed computers. Debugging parallel MPI applications, however, has always been a particularly challenging task due to their high degree of concurrent execution and non-deterministic behavior. Deterministic replay is a potentially powerful technique for addressing these challenges, with existing MPI replay tools adopting either data-replay or order-replay approaches. Unfortunately, each approach has its tradeoffs. Data-replay generates substantial log sizes by recording every communication message. Order-replay generates small logs, but requires all processes to be replayed together. We believe that these drawbacks are the primary reasons that inhibit the wide adoption of deterministic replay as the critical enabler of cyclic debugging of MPI applications.
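To put rough, hypothetical numbers on this trade-off: a run exchanging one million 10 KB messages would leave a data-replay log of roughly 10 GB, while order-replay, logging only a few bytes of ordering metadata per message, would stay in the low megabytes; the price is that replaying any single process then requires re-executing all of them.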
This paper describes subgroup reproducible replay (SRR), a hybrid deterministic replay method that provides the benefits of both data-replay and order-replay while balancing their trade-offs. SRR divides all processes into disjoint groups. It records the contents of messages crossing group boundaries, as in data-replay, but records only message orderings for communication within a group, as in order-replay. In this way, SRR exploits the communication locality of traffic patterns in MPI applications. During replay, developers can then replay each group individually. SRR reduces recording overhead by logging only the ordering, not the contents, of intra-group messages, and reduces replay overhead by limiting the size of each replay group. Exposing these trade-offs gives the user the control needed to make deterministic replay practical for MPI applications.
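As a rough illustration of this recording rule (a sketch, not the authors' implementation), the following C fragment intercepts MPI_Recv through the standard PMPI profiling interface; the fixed group map (group_of), the GROUP_SIZE constant, and the per-rank log files are assumptions introduced for the example:

    /* Hedged sketch of SRR-style recording via the PMPI profiling
       interface. The static group partition and log-file layout are
       assumptions for this example, not details of MPIWiz itself. */
    #include <mpi.h>
    #include <stdio.h>

    #define GROUP_SIZE 8                   /* assumed static partition */
    static FILE *order_log, *data_log;

    static int group_of(int rank) { return rank / GROUP_SIZE; }

    int MPI_Init(int *argc, char ***argv)
    {
        int rc = PMPI_Init(argc, argv), me;
        char name[64];
        PMPI_Comm_rank(MPI_COMM_WORLD, &me);
        snprintf(name, sizeof name, "order.%d.log", me);
        order_log = fopen(name, "w");
        snprintf(name, sizeof name, "data.%d.log", me);
        data_log = fopen(name, "wb");
        return rc;
    }

    /* Every receive logs its resolved (source, tag) so wildcard matches
       can be replayed; only cross-group receives also log the payload. */
    int MPI_Recv(void *buf, int count, MPI_Datatype type, int src,
                 int tag, MPI_Comm comm, MPI_Status *status)
    {
        MPI_Status st;
        int rc = PMPI_Recv(buf, count, type, src, tag, comm, &st);
        int me, size;
        PMPI_Comm_rank(comm, &me);
        PMPI_Type_size(type, &size);
        fprintf(order_log, "%d %d\n", st.MPI_SOURCE, st.MPI_TAG);
        if (group_of(me) != group_of(st.MPI_SOURCE))   /* crosses groups */
            fwrite(buf, (size_t)size, (size_t)count, data_log);
        if (status != MPI_STATUS_IGNORE) *status = st;
        return rc;
    }

Linking this wrapper ahead of the MPI library shadows MPI_Recv while PMPI_Recv still reaches the real implementation, so no application source changes are needed for the sketch either.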
We have implemented a prototype, MPIWiz, to demonstrate and evaluate SRR. MPIWiz employs a replay framework that allows transparent binary instrumentation of both library and system calls. As a result, MPIWiz replays MPI applications with no source code modification or relinking, and handles non-determinism in both MPI calls and OS system calls. Our preliminary results show that MPIWiz reduces recording overhead by over a factor of four relative to data-replay, without requiring the entire application to be replayed as in order-replay. Recording increases execution time by 27%, while the application can be replayed in just 53% of its base execution time.
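To illustrate how a subgroup can be replayed in isolation, a companion sketch follows (same assumed group map and logs as above): it makes wildcard receives deterministic by forcing the recorded match, and feeds cross-group payloads from the data log, since processes outside the replayed group are not running.

    /* Hedged sketch of the replay side under the same assumptions as
       the recording sketch (group_of() and the two per-rank logs).
       Cross-group senders are not re-executed during subgroup replay,
       so their message contents come back from the data log. */
    int MPI_Recv(void *buf, int count, MPI_Datatype type, int src,
                 int tag, MPI_Comm comm, MPI_Status *status)
    {
        int me, rec_src, rec_tag, size;
        PMPI_Comm_rank(comm, &me);
        if (fscanf(order_log, "%d %d", &rec_src, &rec_tag) != 2)
            return MPI_ERR_OTHER;      /* ordering log exhausted */
        src = rec_src;                 /* force the recorded match, making
                                          MPI_ANY_SOURCE deterministic */
        if (tag == MPI_ANY_TAG) tag = rec_tag;
        if (group_of(me) != group_of(src)) {
            PMPI_Type_size(type, &size);
            if (fread(buf, (size_t)size, (size_t)count, data_log)
                    != (size_t)count)
                return MPI_ERR_OTHER;  /* payload log exhausted */
            if (status != MPI_STATUS_IGNORE) {
                status->MPI_SOURCE = rec_src;
                status->MPI_TAG    = rec_tag;
            }
            return MPI_SUCCESS;        /* message replayed from the log */
        }
        return PMPI_Recv(buf, count, type, src, tag, comm, status);
    }

Intra-group receives still execute through PMPI_Recv, but with the recorded source and tag substituted, so messages are delivered in the recorded order.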






Published In

PPoPP '09: Proceedings of the 14th ACM SIGPLAN symposium on Principles and practice of parallel programming
February 2009
322 pages
ISBN: 9781605583976
DOI: 10.1145/1504176
  • ACM SIGPLAN Notices, Volume 44, Issue 4 (PPoPP '09), April 2009, 294 pages
    ISSN: 0362-1340, EISSN: 1558-1160
    DOI: 10.1145/1594835
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.


Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. distributed debugging
  2. message passing interface
  3. non-determinism
  4. record and replay

Qualifiers

  • Research-article

Conference

PPoPP '09

Acceptance Rates

Overall Acceptance Rate 230 of 1,014 submissions, 23%



