DOI: 10.1145/224170.224356

Predicting application behavior in large scale shared-memory multiprocessors

Published: 08 December 1995

Abstract

In this paper we present an analytical framework for predicting the performance of parallel programs. The main thrust of this work is to provide a means of treating realistic applications within a single unified framework. Our approach is based on specifying a set of non-linear equations that describe the application, processor configuration, network, and memory operations. These equations are solved iteratively because the application execution rate depends on the communication latencies. The iterative solution technique is efficient: it typically requires only a few iterations to converge. Our modeling methodology achieves a good balance between abstraction and accuracy by accounting for both the time and space dimensions of memory references while maintaining a simple description of the workload. We demonstrate both the practicality and the accuracy of our approach by comparing predicted results with measurements taken on a commercial multiprocessor system. We found that the model faithfully reflects changes in processor speed and in the number and placement of allocated processors.
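The coupling between execution rate and communication latency that motivates the iterative solution can be illustrated with a small fixed-point sketch. This is not the paper's model: the single shared memory module, the open-queue contention formula, the damping factor, and every name below (`solve_fixed_point`, `compute_time`, `service_time`) are assumptions chosen only to show the general shape of such an iteration.

```python
def solve_fixed_point(num_procs, compute_time, service_time,
                      tol=1e-9, max_iter=1000, damping=0.5):
    """Toy model (assumed, not from the paper): each processor computes for
    `compute_time`, then issues one request to a shared memory module whose
    latency grows with the total request rate."""
    latency = service_time  # initial guess: latency with no contention
    for iteration in range(1, max_iter + 1):
        # Execution rate depends on the current latency estimate.
        rate = 1.0 / (compute_time + latency)
        # Latency depends on the load generated at that rate.
        utilization = num_procs * rate * service_time
        if utilization >= 1.0:
            raise RuntimeError("toy model saturated at this estimate")
        new_latency = service_time / (1.0 - utilization)
        # Damped update to keep the iteration stable.
        next_latency = damping * new_latency + (1.0 - damping) * latency
        if abs(next_latency - latency) < tol * latency:
            latency = next_latency
            return latency, 1.0 / (compute_time + latency), iteration
        latency = next_latency
    raise RuntimeError("no convergence within max_iter iterations")


if __name__ == "__main__":
    latency, rate, iterations = solve_fixed_point(8, 10.0, 1.0)
    print(f"latency={latency:.4f} rate={rate:.4f} iterations={iterations}")
```

In this toy setting the latency and the rate are each a function of the other, so neither can be computed directly; the loop alternates between them until the estimate stops changing. Heavily loaded configurations may need stronger damping or a different solver.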


Published In

Supercomputing '95: Proceedings of the 1995 ACM/IEEE conference on Supercomputing
December 1995
875 pages
ISBN:0897918169
DOI:10.1145/224170
  • Chairman: Sid Karin

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. measurement
  2. modeling
  3. performance evaluation
  4. performance prediction

Qualifiers

  • Article

Conference

SC '95
Acceptance Rates

Supercomputing '95 Paper Acceptance Rate 69 of 241 submissions, 29%;
Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Article Metrics

  • Downloads (last 12 months): 13
  • Downloads (last 6 weeks): 6

Reflects downloads up to 04 Oct 2024.

Cited By

  • (1999) Predictive analysis of a wavefront application using LogGP. ACM SIGPLAN Notices 34:8 (141-150). DOI: 10.1145/329366.301117. Online publication date: 1-May-1999.
  • (1999) Predictive analysis of a wavefront application using LogGP. Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming (141-150). DOI: 10.1145/301104.301117. Online publication date: 1-May-1999.
  • (1997) LoPC. ACM SIGPLAN Notices 32:7 (276-287). DOI: 10.1145/263767.263803. Online publication date: 21-Jun-1997.
  • (1997) LoPC. Proceedings of the sixth ACM SIGPLAN symposium on Principles and practice of parallel programming (276-287). DOI: 10.1145/263764.263803. Online publication date: 21-Jun-1997.
