
Simulation of MPI applications with time-independent traces

Published: 10 April 2015

Abstract

Analyzing and understanding the performance behavior of parallel applications on parallel computing platforms is a long-standing concern in the High Performance Computing community. When the targeted platforms are not available, simulation is a reasonable approach to obtain objective performance indicators and explore various hypothetical scenarios. In the context of applications implemented with the Message Passing Interface, two simulation approaches have been proposed, on-line simulation and off-line simulation, each with its own advantages and drawbacks. In this work, we present an off-line simulation framework, that is, one that simulates the execution of an application based on event traces obtained from an actual execution. The main novelty of this work, compared with previously proposed off-line simulators, is that the traces that drive the simulation can be acquired on large, distributed, heterogeneous, and non-dedicated platforms. This increased scalability of trace acquisition is achieved by enforcing that traces contain no time-related information. Moreover, our framework is based on a state-of-the-art scalable, fast, and validated simulation kernel. We introduce the notion of performing off-line simulation from time-independent traces, propose and evaluate several trace acquisition strategies, describe our simulation framework, and assess its quality in terms of trace acquisition scalability, simulation accuracy, and simulation time. Copyright © 2014 John Wiley & Sons, Ltd.
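To make the core idea concrete, the sketch below is a hypothetical illustration, not the paper's actual trace format or simulation model: each trace event records only a volume of computation (flops) or communication (bytes), never a timestamp, so the same trace can be acquired on any machine, dedicated or not, and replayed against any platform description. The action names, trace layout, and platform numbers are invented for this example; the actual framework replays its traces on top of the SimGrid simulation kernel rather than the naive latency-plus-bandwidth model used here.

```python
# Minimal sketch of replaying a time-independent trace for one MPI rank.
# Everything below (format, action names, platform numbers) is assumed
# for illustration only.

# Hypothetical trace: each event carries an action and a volume
# (flops or bytes), but no time-related information.
trace_rank0 = [
    ("compute", 4.2e9),          # 4.2 Gflop of local computation
    ("send", 1, 8_000_000),      # 8 MB sent to rank 1
    ("recv", 1, 8_000_000),      # 8 MB received from rank 1
    ("compute", 1.1e9),
]

# Toy platform description (arbitrary assumed values).
FLOP_RATE = 2.5e9    # flop/s of the simulated core
LATENCY = 1.0e-5     # s, per-message link latency
BANDWIDTH = 1.25e9   # byte/s, link bandwidth (~10 Gb/s)

def replay(trace):
    """Derive a simulated time for one rank from volumes alone,
    ignoring contention and synchronization between ranks."""
    clock = 0.0
    for event in trace:
        if event[0] == "compute":
            clock += event[1] / FLOP_RATE
        else:  # send or recv: simple latency + size/bandwidth model
            _, _peer, size = event
            clock += LATENCY + size / BANDWIDTH
    return clock

print(f"rank 0 simulated time: {replay(trace_rank0):.4f} s")
```

Because the trace itself carries no timings, replaying it against a different platform description predicts the application's behavior on that hypothetical platform, which is precisely what makes acquisition on heterogeneous, non-dedicated machines harmless.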



Published In

Concurrency and Computation: Practice & Experience, Volume 27, Issue 5
April 2015
306 pages

Publisher

John Wiley and Sons Ltd.

United Kingdom


Author Tags

  1. MPI
  2. performance prediction
  3. simulation

Qualifiers

  • Article

Cited By

  • (2022) Simulation-based optimization and sensibility analysis of MPI applications. Journal of Parallel and Distributed Computing 166(C): 111-125. DOI: 10.1016/j.jpdc.2022.04.002. Online publication date: 1-Aug-2022.
  • (2019) SketchDLC. ACM Transactions on Architecture and Code Optimization 16(2): 1-26. DOI: 10.1145/3312570. Online publication date: 18-Apr-2019.
  • (2017) Simulating MPI Applications: The SMPI Approach. IEEE Transactions on Parallel and Distributed Systems 28(8): 2387-2400. DOI: 10.1109/TPDS.2017.2669305. Online publication date: 1-Aug-2017.
  • (2017) Predictive communication modeling for HPC applications. Cluster Computing 20(3): 2725-2747. DOI: 10.1007/s10586-017-0821-8. Online publication date: 1-Sep-2017.
