DOI: 10.1007/11532378_13

A code isolator: isolating code fragments from large programs

Published: 22 September 2004

Abstract

In this paper, we describe a tool we have developed called a code isolator. We envision that such a tool will facilitate many software development activities in complex software systems, but we are using it to isolate code segments from large scientific and engineering codes for the purpose of performance tuning. The goal of the code isolator is to provide an executable version of a code segment, together with representative data, that mimics the performance of the code in the full program. The resulting isolated code can be used in performance tuning experiments and requires just a tiny fraction of the execution time of the code when executing within the full program. We describe the analyses and transformations used in a code isolator tool, which we have largely automated in the SUIF compiler. We present a case study of its use with LS-DYNA, a large, widely used engineering application, and demonstrate how the tool derives code that permits performance tuning for cache. We present results comparing L1 cache misses and execution time for the original program and for the isolated program generated by the tool with some manual intervention. We find that the isolated code can be executed 3600 times faster than the original program, and that most of the L1 cache misses are preserved. We identify areas where additional analyses can close the remaining gap in predicting and preserving cache misses in the isolated code.
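
The core idea of the abstract, running a code segment on its own with representative data captured from the full program, can be illustrated with a minimal Python sketch. This is not the SUIF-based tool the paper describes. The names `capture`, `replay`, `scale_rows`, and `full_program` are hypothetical, and the sketch records input values only, making no attempt to reproduce the cache state that the paper's isolator aims to preserve.

```python
import copy

def capture(kernel, store):
    """Wrap `kernel` so that its first call records a deep copy of its inputs."""
    def wrapped(*args):
        if "inputs" not in store:
            store["inputs"] = copy.deepcopy(args)
        return kernel(*args)
    return wrapped

def replay(kernel, store):
    """Run `kernel` on its own, using a fresh copy of the recorded inputs."""
    return kernel(*copy.deepcopy(store["inputs"]))

def scale_rows(matrix, factor):
    # Hypothetical kernel standing in for the code segment to be tuned.
    return [[factor * x for x in row] for row in matrix]

def full_program(kernel):
    # Hypothetical stand-in for the large application that calls the kernel.
    data = [[float(i + j) for j in range(4)] for i in range(3)]
    for step in range(1, 4):
        data = kernel(data, float(step))
    return data

# One instrumented run of the full program records representative inputs.
store = {}
full_program(capture(scale_rows, store))

# Afterwards the kernel can be rerun repeatedly without the full program.
isolated_result = replay(scale_rows, store)
```

In a tuning experiment, `replay` would be called on each candidate variant of the kernel, so only the isolated segment is re-executed rather than the whole application.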

Published In

LCPC'04: Proceedings of the 17th international conference on Languages and Compilers for High Performance Computing
September 2004
484 pages
ISBN: 354028009X
Editors: Rudolf Eigenmann, Zhiyuan Li, Samuel P. Midkiff

Sponsors

  • International Business Machines Corporation
  • National Science Foundation, USA

Publisher

Springer-Verlag

Berlin, Heidelberg

Cited By

  • (2016) KGEN. Procedia Computer Science 80:C (1450-1460). DOI: 10.1016/j.procs.2016.05.466. Online publication date: 1-Jun-2016
  • (2015) CERE. ACM Transactions on Architecture and Code Optimization 12:1 (1-24). DOI: 10.1145/2724717. Online publication date: 16-Apr-2015
  • (2014) Fine-grained Benchmark Subsetting for System Selection. Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization (132-142). DOI: 10.1145/2581122.2544144. Online publication date: 15-Feb-2014
  • (2014) Fine-grained Benchmark Subsetting for System Selection. Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization (132-142). DOI: 10.1145/2544137.2544144. Online publication date: 15-Feb-2014
  • (2012) Portable section-level tuning of compiler parallelized applications. Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (1-11). DOI: 10.5555/2388996.2389009. Online publication date: 10-Nov-2012
  • (2009) Effective source-to-source outlining to support whole program empirical optimization. Proceedings of the 22nd international conference on Languages and Compilers for Parallel Computing (308-322). DOI: 10.1007/978-3-642-13374-9_21. Online publication date: 8-Oct-2009
  • (2005) Trust but verify. Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming (196-205). DOI: 10.1145/1065944.1065971. Online publication date: 15-Jun-2005
