DOI: 10.1109/SC.2004.61
Article

Towards a Systematic, Pragmatic and Architecture-Aware Program Optimization Process for Complex Processors

Published: 06 November 2004

Abstract

Because processor architectures are increasingly complex, it is increasingly difficult to embed accurate machine models within compilers. As a result, compiler efficiency tends to decrease. The current trend is toward top-down approaches: static compilers are progressively augmented with information from the architecture, as in profile-based, iterative or dynamic compilation techniques. For the moment, however, only fairly elementary architectural information is used. In this article, we adopt a bottom-up approach to the architecture complexity issue: we assume we know everything about the behavior of the program on the architecture. We present a manual but systematic process for optimizing a program on a complex processor architecture using extensive dynamic analysis, and we find that a small set of run-time information is sufficient to drive an efficient process. We have experimentally observed on an Alpha 21264 that this approach can yield significant performance improvements on SPEC benchmarks, beyond peak SPEC. We are currently using this approach for optimizing customer applications.

Published In

SC '04: Proceedings of the 2004 ACM/IEEE conference on Supercomputing
November 2004
724 pages
ISBN: 0769521533

Publisher

IEEE Computer Society

United States

Conference

SC '04

Acceptance Rates

SC '04 Paper Acceptance Rate 60 of 200 submissions, 30%;
Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Cited By

  • (2024) Guided Equality Saturation. Proceedings of the ACM on Programming Languages 8:POPL (1727-1758). DOI: 10.1145/3632900. Online publication date: 5-Jan-2024
  • (2022) AutoDSE: Enabling Software Programmers to Design Efficient FPGA Accelerators. ACM Transactions on Design Automation of Electronic Systems 27:4 (1-27). DOI: 10.1145/3494534. Online publication date: 12-Feb-2022
  • (2020) A Collaborative Filtering Approach for the Automatic Tuning of Compiler Optimisations. The 21st ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems (15-25). DOI: 10.1145/3372799.3394361. Online publication date: 16-Jun-2020
  • (2016) Architecture-Adaptive Code Variant Tuning. ACM SIGARCH Computer Architecture News 44:2 (325-338). DOI: 10.1145/2980024.2872411. Online publication date: 25-Mar-2016
  • (2016) Architecture-Adaptive Code Variant Tuning. ACM SIGPLAN Notices 51:4 (325-338). DOI: 10.1145/2954679.2872411. Online publication date: 25-Mar-2016
  • (2016) Architecture-Adaptive Code Variant Tuning. Proceedings of the Twenty-First International Conference on Architectural Support for Programming Languages and Operating Systems (325-338). DOI: 10.1145/2872362.2872411. Online publication date: 25-Mar-2016
  • (2012) Using graph-based program characterization for predictive modeling. Proceedings of the Tenth International Symposium on Code Generation and Optimization (196-206). DOI: 10.1145/2259016.2259042. Online publication date: 31-Mar-2012
  • (2011) Predictive modeling in a polyhedral optimization space. Proceedings of the 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization (119-129). DOI: 10.5555/2190025.2190059. Online publication date: 2-Apr-2011
  • (2010) Practical aggregation of semantical program properties for machine learning based optimization. Proceedings of the 2010 international conference on Compilers, architectures and synthesis for embedded systems (197-206). DOI: 10.1145/1878921.1878951. Online publication date: 24-Oct-2010
  • (2007) Rapidly Selecting Good Compiler Optimizations using Performance Counters. Proceedings of the International Symposium on Code Generation and Optimization (185-197). DOI: 10.1109/CGO.2007.32. Online publication date: 11-Mar-2007
