Abstract
As supercomputers are built from an ever-increasing number of processing elements, the effort required to achieve a substantial fraction of system peak performance keeps growing. Tools are needed that give developers and computing center staff holistic indicators of the resource consumption of applications and of potential performance pitfalls at scale. To use the full potential of a supercomputer today, applications must incorporate multilevel parallelism (threading and message passing) and carefully orchestrate file I/O. As a consequence, performance tools must also be able to monitor these system components in an integrated way and at full machine scale. We present IPM, a modularized monitoring approach for MPI, OpenMP, file I/O, and other event sources.
Acknowledgements
This work was supported by the Bavaria-California Technology Center (BaCaTec) through the project “Performance and Workload Characterization for Multi-Core Supercomputers” and by the NSF under award OCI-0721397. This research was also supported by an allocation of advanced computing resources provided by the National Science Foundation. The computations were performed on Kraken (a Cray XT5) at the National Institute for Computational Sciences.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Fürlinger, K., Wright, N.J., Skinner, D., Klausecker, C., Kranzlmüller, D. (2011). Effective Holistic Performance Measurement at Petascale Using IPM. In: Bischof, C., Hegering, HG., Nagel, W., Wittum, G. (eds) Competence in High Performance Computing 2010. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24025-6_2
DOI: https://doi.org/10.1007/978-3-642-24025-6_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24024-9
Online ISBN: 978-3-642-24025-6
eBook Packages: Computer Science, Computer Science (R0)