Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/209936.209946acmconferencesArticle/Chapter ViewAbstractPublication PagesppoppConference Proceedingsconference-collections
Article
Free access

High-level optimization via automated statistical modeling

Published: 01 August 1995 Publication History

Abstract

We develop the use of statistical modeling for portable high-level optimizations such as data layout and algorithm selection. We build the models automatically from profiling information, which ensures robust and accurate models that reflect all aspects of the target platform.
We use the models to select among several data layouts for an iterative PDE solver and to select among several sorting algorithms. The selection is correct more than 99% of the time on each of four platforms. In the few cases it selects suboptimally, the selected implementation performs nearly as well; that is, it always makes at least a very good choice. Correct selection is platform and workload dependent and can improve performance by nearly a factor of three.
We also use the models to optimize parameters of these applications automatically. In all cases, the models predicted the optimal parameter setting, resulting in improvements ranging up to a factor of three.
Finally, we use the models to construct portable high-level libraries, which contain multiple implementations and support for automatic selection and parameter optimization of the fastest implementation for the target platform and workload.

References

[1]
F. Alien, M. Burke, P. Charles, R. Cytron, and J. Ferrante. An Overview of the PTRAN Analysis System for Multiprocessing. Journal of Parallel and Distributed Computing, Volume 5, Number 5, pages 617-640, October 1988.
[2]
T.E. Anderson, D. E. Culler and D. A. Patterson. The Case for NOW. IEEE Micro, to appear, 1995.
[3]
R. Barrett, M. Berry, T. F. Chan, J. DemmeI, J. Donato, J. Dongarra, V. Eijkhout, R. Pozo, C. Romine, H. van der Vorst. Templates for the Solution of Linear Systems: Building Blocks for lterative Methods. Society for Industrial and Applied Mathematics, 1994.
[4]
E.A. Brewer and B. C. Kuszmaul. How to Get Good Performance from the CM-5 Data Network, Proceedings of the 1994 International Parallel Processing Symposium (IPPS '94), April 1994.
[5]
G.E. Blelloch, C. E. Leiserson, B. M. Maggs, C. G. Plaxton, S. J. Smith, and M. Zagha. A Comparison of Sorting Algorithms for the Connection Machine CM-2. Technical Report 222, Thinking Machines Corporation, 1992.
[6]
E. A. Brewer. Portable High-Performance Supercomputing: High-Level Platform-Dependent Optimization. MIT Ph.D. Dissertation, September 1994.
[7]
M. Chen, Y. I. Choo, and J. Li. Compiling Parallel Programs by Optimizing Performance. Journal of Supercomputing, Volume 2, Number 2, pages 171- 207, October 1988.
[8]
D. Culler, R. Karp, D. Patterson, A. Sahay, K. E. Schauser, E. Santos, R. Subramonian, and T. von Eicken. LogP: Towards a Realistic Model of Parallel Computation. In Proceedings of Principles and Practice of Parallel Processing (PPoPP '93), May 1993.
[9]
M. E. Crovella and T. J. LeBlanc. Parallel Performance Prediction Using Lost Cycles Analysis. Proceedings of Supercomputing '94. November 1994.
[10]
R. Cytron. Useful Parallelism in a Multiprocessor Environment. In the Proceedings of the 1985 International Conference on Parallel Processing, August 1985.
[11]
W. J. Dally et al. The J-Machine: A Fine-Grain Concurrent Computer. In G.X. Ritter. editor, Proceedings of the IFIP Congress, pages 1147--1153. North-Holland, August 1989.
[12]
G.C. Fox, M. A. Johnson, G. A. Lyzenga, S. W. Otto, J. K. Salmon, and D. W. Walker. Solving Problems on Concurrent Processors, Volume 1: General Techniques and Regular Problems. Prentice Hall, 1988.
[13]
lntel Corporation. Paragon XP/S Product Overview, 1991.
[14]
FORE Systems. TCA-IO0 TURBOchannel ATM Computer Interface: User's Manual. 1992.
[15]
K. Gallivan, W. Jalby, A. Malony, and H. Wijshoff. Performance Prediction for Parallel Numerical Algorithms. Technical Report, Center for Supercomputing Research and Development, University of Illinois at Urbana-Champaign, 1991.
[16]
J.S. Huang and Y. C. Chow. Parallel Sorting and Data Partitioning by Sampling. in the Proceedings of the 7th Computer Software and Applications Conference (COMPSAC '83), pages 627-631, November 1983.
[17]
D. Loveman, High-Performance Fortran Proposal. Presented at the Rice University High-Performance Fortran Forum, January 27, 1992. {ftp: titan, cs.rice, edu}
[18]
V. Jacobson. Congestion Avoidance and Control. In the Proceedings of SIGCOMM '88, August 1988.
[19]
J. Kubiatowicz and A. Agarwal. Anatomy of a Message in the Alewife Multiprocessor. Proceedings of the 7th A CM international Conference on Supercomputing, July 1993.
[20]
C.E. Leiserson et al. The Network Architecture of the Connection Machine CM-5. Proceedings of the 1992 Symposium on Parallel Algorithms and Architectures (SPAA '92), revised March 21, 1994.
[21]
R. Milner, M. Tofte, and R. Harper. The Definition of Standard ML. MIT Press, 1990.
[22]
G. Nelson. Systems Programming with Modula-3. Prentice Hall, 1991.
[23]
C. D. Polychronopolous et al. Parafrase-2: An Environment for Parallelizing, Partitioning, Synchronizing, and Scheduling Programs on Multiprocessors. In the Proceedings of the 1989 International Conference on Parallel Processing, Volume II, pages 39-48, August 1989.
[24]
J.H. Reif and L. G. Valiant. A Logarithmic Time Sort for Linear Size Networks. Journal of the A CM, Volume 34, Number 1, pages 60-76, January 1987.
[25]
A. Dain Samples. Profile-Driven Compilation. Ph.D. Dissertation, University of California at Berkeley Technical Report UCB/CSD 91/627, April 1991.
[26]
A. Sussman. Model-Driven Mapping onto Distributed- Memory Parallel Computers. Ph.D. Dissei#ation, Carnegie-Mellon University Technical Report CMU- CS-91-187, September 1991.
[27]
A. Sussman. Model-Driven Mapping onto Distributed- Memory Parallel Computers. In the Proceedings of Supercomputing '92, pages 818-829, August 1992.
[28]
C.A. Thekkath, H. M. Levy and E. D. Lazowska. Efficient Support for Multicomputing on ATM Networks. University of Washington Technical Report 93-04-03. Department of Computer Science and Engineering, April 12, 1993.
[29]
T. yon Eicken et al. Active Messages: A Mechanism for Integrated Communication and Computation. Proceedings of the 19th International Symposium on Computer Architecture (ISCA '92), May 1992.
[30]
W. E. Weihl et aI. Prelude: A System for Portable Parallel Software. MIT Technical Report MIT/LCS/ TR-519, October 1991.
[31]
K. Yelick. Programming Models for Irregular Applications. Proceedings of the Second Workshop on Languages, Compilers, and Run-Time Environments for Distributed-Memory Multiprocessors, in A CM SIGPLAN Notices, 28(1), 17-20, january 1993.

Cited By

View all
  • (2020)Transferring Pareto Frontiers across Heterogeneous Hardware EnvironmentsProceedings of the ACM/SPEC International Conference on Performance Engineering10.1145/3358960.3379127(12-23)Online publication date: 20-Apr-2020
  • (2018)Neural networks for predicting algorithm runtime distributionsProceedings of the 27th International Joint Conference on Artificial Intelligence10.5555/3304415.3304620(1442-1448)Online publication date: 13-Jul-2018
  • (2018)Machine Learning in Compiler OptimizationProceedings of the IEEE10.1109/JPROC.2018.2817118106:11(1879-1901)Online publication date: Nov-2018
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
PPOPP '95: Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
August 1995
234 pages
ISBN:0897917006
DOI:10.1145/209936
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 August 1995

Permissions

Request permissions for this article.

Check for updates

Qualifiers

  • Article

Conference

PPoPP95
Sponsor:
PPoPP95: Principles & Practices of Parallel Programming
July 19 - 21, 1995
California, Santa Barbara, USA

Acceptance Rates

Overall Acceptance Rate 230 of 1,014 submissions, 23%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)113
  • Downloads (Last 6 weeks)22
Reflects downloads up to 23 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2020)Transferring Pareto Frontiers across Heterogeneous Hardware EnvironmentsProceedings of the ACM/SPEC International Conference on Performance Engineering10.1145/3358960.3379127(12-23)Online publication date: 20-Apr-2020
  • (2018)Neural networks for predicting algorithm runtime distributionsProceedings of the 27th International Joint Conference on Artificial Intelligence10.5555/3304415.3304620(1442-1448)Online publication date: 13-Jul-2018
  • (2018)Machine Learning in Compiler OptimizationProceedings of the IEEE10.1109/JPROC.2018.2817118106:11(1879-1901)Online publication date: Nov-2018
  • (2016)Statistical Models for Empirical Search-Based Performance TuningThe International Journal of High Performance Computing Applications10.1177/109434200404129318:1(65-94)Online publication date: 26-Jul-2016
  • (2016)Automatic Algorithm Selection in Computational Software Using Machine Learning2016 15th IEEE International Conference on Machine Learning and Applications (ICMLA)10.1109/ICMLA.2016.0064(355-360)Online publication date: Dec-2016
  • (2015)Algorithm runtime predictionProceedings of the 24th International Conference on Artificial Intelligence10.5555/2832747.2832840(4197-4201)Online publication date: 25-Jul-2015
  • (2015)Building Models of Computation of Service-Oriented Software via Monitoring Performance Indicators (Short Paper)Proceedings of the 2015 IEEE 8th International Conference on Service-Oriented Computing and Applications (SOCA)10.1109/SOCA.2015.21(173-179)Online publication date: 19-Oct-2015
  • (2014)Runtime prediction on new architecturesProceedings of the 10th Central and Eastern European Software Engineering Conference in Russia10.1145/2687233.2687236(1-7)Online publication date: 23-Oct-2014
  • (2014)Author retrospective for adaptive reduction parallelization techniquesACM International Conference on Supercomputing 25th Anniversary Volume10.1145/2591635.2591661(59-60)Online publication date: 10-Jun-2014
  • (2014)NitroProceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium10.1109/IPDPS.2014.59(501-512)Online publication date: 19-May-2014
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media