DOI: 10.5555/1894122.1894146

Characteristics of the unexpected message queue of MPI applications

Published: 12 September 2010

Abstract

High Performance Computing systems are used on a regular basis to run a myriad of application codes, yet a surprising dearth of information exists about their communication characteristics. Even less information is available about the behavior of the low-level communication libraries, such as the length of MPI Unexpected Message Queues (UMQs) and the length of time unexpected messages spend in these queues. Such information is vital to developing appropriate strategies for handling such data at the library and system level. In this paper we present data on the communication characteristics of three applications: GTC, LSMS, and S3D. We present data on the size of their UMQs, the time spent searching the UMQ, and the length of time unexpected messages spend in these queues. We find that, for the particular inputs used, these applications have widely varying characteristics with regard to UMQ length, and that they exhibit application-specific patterns which persist across scales.
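
An unexpected message, in MPI terms, is one that arrives before the receiver has posted a matching receive; the library must hold it in the UMQ and search that queue whenever a new receive is posted. As a minimal illustrative sketch of how this situation arises (the code below is not from the paper; the file name, message size, tag, and delay are arbitrary choices), the following C/MPI program has rank 0 send while rank 1 deliberately delays its matching MPI_Recv, so the message dwells in rank 1's unexpected message queue:

    /* umq_sketch.c -- minimal sketch (not from the paper) of an unexpected
     * message: the send is issued before the matching receive is posted,
     * so the receiver buffers the message (or its envelope) in its
     * unexpected message queue (UMQ) until MPI_Recv triggers a search for
     * a match on (source, tag, communicator).
     * Build: mpicc umq_sketch.c -o umq_sketch && mpirun -np 2 ./umq_sketch */
    #include <mpi.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        int rank;
        int buf[1024] = {0};   /* small payload; the size is an arbitrary choice */

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {
            /* Sender: the message leaves as soon as the send is posted ... */
            MPI_Send(buf, 1024, MPI_INT, 1, /* tag */ 42, MPI_COMM_WORLD);
        } else if (rank == 1) {
            /* ... but rank 1 has not posted a receive yet, so the message
             * waits in the UMQ for roughly this long (delay is arbitrary). */
            sleep(2);
            /* Posting the receive searches the UMQ for a match; the dwell
             * time and the search cost are the quantities the paper measures. */
            MPI_Recv(buf, 1024, MPI_INT, 0, 42, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("rank 1: matched a message that arrived unexpectedly\n");
        }

        MPI_Finalize();
        return 0;
    }

Whether the full payload or only the envelope is buffered depends on the implementation's eager/rendezvous threshold, but in either case the match is resolved by searching the UMQ when the receive is finally posted.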


Published In

EuroMPI'10: Proceedings of the 17th European MPI Users' Group Meeting on Recent Advances in the Message Passing Interface
September 2010
307 pages
ISBN: 3642156452
  • Editors:
  • Rainer Keller,
  • Edgar Gabriel,
  • Michael Resch,
  • Jack Dongarra

Sponsors

  • NEC
  • Microsoft
  • Cray
  • CISCO
  • IBM

Publisher

Springer-Verlag

Berlin, Heidelberg

Qualifiers

  • Article


Cited By

  • (2019) MPI tag matching performance on ConnectX and ARM. Proceedings of the 26th European MPI Users' Group Meeting, pp. 1-10. DOI: 10.1145/3343211.3343224. Online publication date: 11-Sep-2019.
  • (2018) Enabling callback-driven runtime introspection via MPI_T. Proceedings of the 25th European MPI Users' Group Meeting, pp. 1-10. DOI: 10.1145/3236367.3236370. Online publication date: 23-Sep-2018.
  • (2018) Improving Performance Models for Irregular Point-to-Point Communication. Proceedings of the 25th European MPI Users' Group Meeting, pp. 1-8. DOI: 10.1145/3236367.3236368. Online publication date: 23-Sep-2018.
  • (2018) A Dedicated Message Matching Mechanism for Collective Communications. Workshop Proceedings of the 47th International Conference on Parallel Processing, pp. 1-10. DOI: 10.1145/3229710.3229712. Online publication date: 13-Aug-2018.
  • (2018) The Case for Semi-Permanent Cache Occupancy. Proceedings of the 47th International Conference on Parallel Processing, pp. 1-11. DOI: 10.1145/3225058.3225130. Online publication date: 13-Aug-2018.
  • (2017) Characterizing MPI matching via trace-based simulation. Proceedings of the 24th European MPI Users' Group Meeting, pp. 1-11. DOI: 10.1145/3127024.3127040. Online publication date: 25-Sep-2017.
  • (2016) Accelerating Intercommunication in Highly Parallel Systems. ACM Transactions on Architecture and Code Optimization 13(4), pp. 1-25. DOI: 10.1145/3005717. Online publication date: 2-Dec-2016.
  • (2014) A fast and resource-conscious MPI message queue mechanism for large-scale jobs. Future Generation Computer Systems 30(C), pp. 265-290. DOI: 10.1016/j.future.2013.07.003. Online publication date: 1-Jan-2014.
  • (2013) Significantly reducing MPI intercommunication latency and power overhead in both embedded and HPC systems. ACM Transactions on Architecture and Code Optimization 9(4), pp. 1-25. DOI: 10.1145/2400682.2400710. Online publication date: 20-Jan-2013.
