DOI: 10.1145/1375527.1375537
research-article

Preserving time in large-scale communication traces

Published: 07 June 2008

Abstract

Analyzing the performance of large-scale scientific applications is becoming increasingly difficult due to the sheer volume of performance data gathered. Recent work on scalable communication tracing applies online interprocess compression to address this problem. Yet analysis of communication traces requires knowledge about time progression that cannot trivially be encoded in a scalable manner during compression. We develop scalable time-stamp encoding schemes for communication traces. At the same time, our work contributes novel insights into the scalable representation of time-stamped data. We show that our representations capture sufficient information to enable what-if explorations of architectural variations and analysis of path-based timing irregularities without requiring excessive disk space. We evaluate how well several time-stamped, compressed MPI trace approaches support accurate timed replay of communication events. Our lossless traces are orders of magnitude smaller, if not near constant size, regardless of the number of nodes, while preserving timing information suitable for application tuning or for assessing the requirements of future procurements. Our results show that time-preserving tracing without loss of communication information can scale with the number of nodes and time steps, a result without precedent.

Published In

ICS '08: Proceedings of the 22nd annual international conference on Supercomputing
June 2008
390 pages
ISBN: 9781605581583
DOI: 10.1145/1375527

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. high-performance computing
  2. message passing
  3. tracing

Qualifiers

  • Research-article

Conference

ICS08: International Conference on Supercomputing
June 7 - 12, 2008
Island of Kos, Greece

Acceptance Rates

Overall Acceptance Rate 629 of 2,180 submissions, 29%

Cited By

  • (2023) Structure-Based Communication Trace Compression. Performance Analysis of Parallel Applications for HPC, pp. 43-69. DOI: 10.1007/978-981-99-4366-1_3. Online publication date: 19-Jun-2023
  • (2017) ScalaIOExtrap: Elastic I/O Tracing and Extrapolation. 2017 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 585-594. DOI: 10.1109/IPDPS.2017.45. Online publication date: May-2017
  • (2016) DwarfCode: A Performance Prediction Tool for Parallel Applications. IEEE Transactions on Computers 65:2, pp. 495-507. DOI: 10.1109/TC.2015.2417526. Online publication date: 1-Feb-2016
  • (2016) Structural Clustering: A New Approach to Support Performance Analysis at Scale. 2016 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 484-493. DOI: 10.1109/IPDPS.2016.27. Online publication date: May-2016
  • (2016) Benchmark Generation and Simulation at Extreme Scale. Proceedings of the 20th International Symposium on Distributed Simulation and Real-Time Applications, pp. 9-18. DOI: 10.1109/DS-RT.2016.18. Online publication date: 21-Sep-2016
  • (2016) Automated and dynamic abstraction of MPI application performance. Cluster Computing 19:3, pp. 1105-1137. DOI: 10.1007/s10586-016-0615-4. Online publication date: 1-Sep-2016
  • (2015) HPC I/O trace extrapolation. Proceedings of the 4th Workshop on Extreme Scale Programming Tools, pp. 1-6. DOI: 10.1145/2832106.2832108. Online publication date: 15-Nov-2015
  • (2015) Simulation of MPI applications with time-independent traces. Concurrency and Computation: Practice & Experience 27:5, pp. 1145-1168. DOI: 10.1002/cpe.3278. Online publication date: 10-Apr-2015
  • (2014) A methodology for automatic generation of executable communication specifications from parallel MPI applications. ACM Transactions on Parallel Computing 1:1, pp. 1-30. DOI: 10.1145/2660249. Online publication date: 3-Oct-2014
  • (2014) Cypress. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 143-153. DOI: 10.1109/SC.2014.17. Online publication date: 16-Nov-2014