
Simulation of MPI applications with time-independent traces

Published: 10 April 2015

Abstract

Analyzing and understanding the performance behavior of parallel applications on parallel computing platforms is a long-standing concern in the High Performance Computing community. When the targeted platforms are not available, simulation is a reasonable approach to obtain objective performance indicators and explore various hypothetical scenarios. In the context of applications implemented with the Message Passing Interface, two simulation approaches have been proposed, on-line simulation and off-line simulation, each with its own advantages and drawbacks. In this work, we present an off-line simulation framework, that is, one that simulates the execution of an application based on event traces obtained from an actual execution. The main novelty of this work, compared with previously proposed off-line simulators, is that the traces that drive the simulation can be acquired on large, distributed, heterogeneous, and non-dedicated platforms. This increased scalability of trace acquisition is achieved by enforcing that traces contain no time-related information. Moreover, our framework is based on a state-of-the-art scalable, fast, and validated simulation kernel. We introduce the notion of performing off-line simulation from time-independent traces, propose and evaluate several trace acquisition strategies, describe our simulation framework, and assess its quality in terms of trace acquisition scalability, simulation accuracy, and simulation time. Copyright © 2014 John Wiley & Sons, Ltd.
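To make the core idea concrete, the sketch below is a hypothetical illustration, not the paper's actual trace format or simulation model: each trace event records only a volume of computation (flops) or communication (bytes), never a timestamp, so the same trace can be acquired on any machine, dedicated or not, and replayed against any platform description. The action names, trace layout, and platform numbers are invented for this example; the actual framework replays its traces on top of the SimGrid simulation kernel rather than the naive latency-plus-bandwidth model used here.

```python
# Minimal sketch of replaying a time-independent trace for one MPI rank.
# Everything below (format, action names, platform numbers) is assumed
# for illustration only.

# Hypothetical trace: each event carries an action and a volume
# (flops or bytes), but no time-related information.
trace_rank0 = [
    ("compute", 4.2e9),          # 4.2 Gflop of local computation
    ("send", 1, 8_000_000),      # 8 MB sent to rank 1
    ("recv", 1, 8_000_000),      # 8 MB received from rank 1
    ("compute", 1.1e9),
]

# Toy platform description (arbitrary assumed values).
FLOP_RATE = 2.5e9    # flop/s of the simulated core
LATENCY = 1.0e-5     # s, per-message link latency
BANDWIDTH = 1.25e9   # byte/s, link bandwidth (~10 Gb/s)

def replay(trace):
    """Derive a simulated time for one rank from volumes alone,
    ignoring contention and synchronization between ranks."""
    clock = 0.0
    for event in trace:
        if event[0] == "compute":
            clock += event[1] / FLOP_RATE
        else:  # send or recv: simple latency + size/bandwidth model
            _, _peer, size = event
            clock += LATENCY + size / BANDWIDTH
    return clock

print(f"rank 0 simulated time: {replay(trace_rank0):.4f} s")
```

Because the trace itself carries no timings, replaying it against a different platform description predicts the application's behavior on that hypothetical platform, which is precisely what makes acquisition on heterogeneous, non-dedicated machines harmless.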



Published In

Concurrency and Computation: Practice & Experience, Volume 27, Issue 5
April 2015
306 pages

Publisher

John Wiley and Sons Ltd.

United Kingdom


Author Tags

  1. MPI
  2. performance prediction
  3. simulation

Qualifiers

  • Article

Cited By

  • (2022) Simulation-based optimization and sensibility analysis of MPI applications. Journal of Parallel and Distributed Computing 166(C): 111-125. DOI: 10.1016/j.jpdc.2022.04.002. Online publication date: 1-Aug-2022.
  • (2019) SketchDLC. ACM Transactions on Architecture and Code Optimization 16(2): 1-26. DOI: 10.1145/3312570. Online publication date: 18-Apr-2019.
  • (2017) Simulating MPI Applications: The SMPI Approach. IEEE Transactions on Parallel and Distributed Systems 28(8): 2387-2400. DOI: 10.1109/TPDS.2017.2669305. Online publication date: 1-Aug-2017.
  • (2017) Predictive communication modeling for HPC applications. Cluster Computing 20(3): 2725-2747. DOI: 10.1007/s10586-017-0821-8. Online publication date: 1-Sep-2017.
