Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/2259016.2259042acmconferencesArticle/Chapter ViewAbstractPublication PagescgoConference Proceedingsconference-collections
research-article

Using graph-based program characterization for predictive modeling

Published: 31 March 2012 Publication History

Abstract

Using machine learning has proven effective at choosing the right set of optimizations for a particular program. For machine learning techniques to be most effective, compiler writers have to develop expressive means of characterizing the program being optimized. The current state-of-the-art techniques for characterizing programs include using a fixed-length feature vector of either source code features extracted during compile time or performance counters collected when running the program. For the problem of identifying optimizations to apply, models constructed using performance counter characterizations of a program have been shown to outperform models constructed using source code features. However, collecting performance counters requires running the program multiple times, and this "dynamic" method of characterizing programs can be specific to inputs of the program. It would be preferable to have a method of characterizing programs that is as expressive as performance counter features, but that is "static" like source code features and therefore does not require running the program.
In this paper, we introduce a novel way of characterizing programs using a graph-based characterization, which uses the program's intermediate representation and an adapted learning algorithm to predict good optimization sequences. To evaluate different characterization techniques, we focus on loop-intensive programs and construct prediction models that drive polyhedral optimizations, such as auto-parallelism and various loop transformation.
We show that our graph-based characterization technique outperforms three current state-of-the-art characterization techniques found in the literature. By using the sequences predicted to be the best by our graph-based model, we achieved up to 73% of the speedup achievable in our search space for a particular platform, whereas we could only achieve up to 59% by other state-of-the-art techniques we evaluated.

References

[1]
MINimal IR space. http://www.assembla.com/wiki/show/minir-dev.
[2]
PoCC. http://www.cse.ohio-state.edu/pouchet/software/pocc.
[3]
Polybench v2.1. http://www-roc.inria.fr/pouchet/software/polybench.
[4]
Shogun. http://www.shogun-toolbox.org/.
[5]
Weka 3. http://www.cs.waikato.ac.nz/ml/weka.
[6]
F. Agakov, E. Bonilla, J. Cavazos, B. Franke, G. Fursin, M. O'Boyle, J. Thomson, M. Toussaint, and C. Williams. Using machine learning to focus iterative optimization. In Proceedings of the International Symposium on Code Generation and Optimization (CGO), pages 295--305, 2006.
[7]
K. M. Borgwardt and H. P. Kriegel. Shortest-path kernels on graphs. In IEEE International Conference on Data Mining (ICDM), pages 74--81, 2005.
[8]
H. Bunke and K. Riesen. A family of novel graph kernels for structural pattern recognition. In Iberoamerican Congress on Pattern Recognition (CIARP), pages 20--31, 2007.
[9]
J. Cavazos, C. Dubach, F. Agakov, E. Bonilla, M. F. O'Boyle, G. Fursin, and O. Temam. Automatic performance model construction for the fast software exploration of new hardware designs. In International Conference on Compilers, Architectures and Synthesis of Embedded Systems (CASES), October 2006.
[10]
J. Cavazos, G. Fursin, F. V. Agakov, E. V. Bonilla, M. F. P. O'Boyle, and O. Temam. Rapidly selecting good compiler optimizations using performance counters. In Proceedings of the International Symposium on Code Generation and Optimization (CGO), pages 185--197, 2007.
[11]
F. de Mesmay, Y. Voronenko, and M. Püschel. Offline library adaptation using automatically generated heuristics. In International Parallel and Distributed Processing Symposium (IPDPS), 2010.
[12]
J. Demme and S. Sethumadhavan. Approximate graph clustering for program characterization. In Proceedings of the International Conference on High-Performance and Embedded Architectures and Compilers (HiPEAC), January 2012.
[13]
C. Dubach, J. Cavazos, B. Franke, G. Fursin, M. F. O'Boyle, and O. Temam. Fast compiler optimisation evaluation using code-feature based performance prediction. In Proceedings of the International Conference on Computing Frontiers (CF), pages 131--142, 2007.
[14]
C. Dubach, T. M. Jones, E. V. Bonilla, G. Fursin, and M. F. O'Boyle. Portable compiler optimization across embedded programs and microarchitectures using machine learning. In Proceedings of the International Symposium on Microarchitecture (MICRO), December 2009.
[15]
G. Fursin and O. Temam. Collective optimization. In Proceedings of the International Conference on High-Performance and Embedded Architectures and Compilers (HiPEAC), January 2009.
[16]
G. Fursin, Y. Kashnikov, A. Memon, Z. Chamski, O. Temam, M. Namolaru, E. Yom-Tov, B. Mendelson, A. Zaks, E. Courtois, F. Bodin, P. Barnard, E. Ashton, E. Bonilla, J. Thomson, C. Williams, and M. O'Boyle. Milepost GCC: machine learning enabled self-tuning compiler. International Journal of Parallel Programming (IJPP), 39: 296--327, 2011.
[17]
T. Gärtner. A survey of kernels for structured data. SIGKDD Exploration Newsletter, 5:49--58, 2003.
[18]
D. Haussler. Convolution kernels on discrete structures. Technical Report UCSC-CRL-99-10, UC Santa Cruz, 1999.
[19]
H. Leather, E. Bonilla, and M. O'Boyle. Automatic feature generation for machine learning based optimizing compilation. In Proceedings of the International Symposium on Code Generation and Optimization (CGO), pages 81--91, 2009.
[20]
A. Monsifrot, F. Bodin, and R. Quiniou. A machine learning approach to automatic production of compiler heuristics. In Proceedings of the 10th International Conference on Artificial Intelligence: Methodology, Systems, and Applications (AIMSA), pages 41--50, 2002.
[21]
D. Parello, O. Temam, A. Cohen, and J.-M. Verdun. Towards a systematic, pragmatic and architecture-aware program optimization process for complex processors. In Proceedings of the 2004 ACM/IEEE Conference on Supercomputing (SC), pages 15--, 2004.
[22]
E. Park, L.-N. Pouchet, J. Cavazos, A. Cohen, and P. Sadayappan. Predictive modeling in a polyhedral optimization space. In Proceedings of the International Symposium on Code Generation and Optimization (CGO), pages 119--129, 2011.
[23]
B. Schölkopf and A. J. Smola. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. MIT Press, 2001.
[24]
B. Schölkopf, K. Tsuda, and J. P. Vert. Kernel Methods in Computational Biology. MIT Press, 2004.
[25]
M. Stephenson and S. P. Amarasinghe. Predicting unroll factors using supervised classification. In Proceedings of the International Symposium on Code Generation and Optimization (CGO), pages 123--134, 2005.
[26]
G. Tournavitis, Z. Wang, B. Franke, and M. F. O'Boyle. Towards a holistic approach to auto-parallelization: integrating profile-driven parallelism detection and machine-learning based mapping. In Proceedings of the International Conference on Programming Language Design and Implementation (PLDI), pages 177--187, 2009.
[27]
S. Triantafyllis, M. Vachharajani, N. Vachharajani, and D. I. August. Compiler optimization-space exploration. In Proceedings of the International Symposium on Code Generation and Optimization (CGO), pages 204--215, 2003.
[28]
S. V. N. Vishwanathan, N. Schraudolph, R. Kondor, and K. Borgwardt. Graph kernels. Journal of Machine Learning Research, 11:1201--1242, 2010.
[29]
Z. Wang and M. F. O'Boyle. Partitioning streaming parallelism for multi-cores: a machine learning based approach. In Proceedings of the International Conference on Parallel Architectures and Compilation Techniques (PACT), pages 307--318, 2010.
[30]
T. Yuki, L. Renganarayanan, S. Rajopadhye, C. Anderson, A. E. Eichenberger, and K. O'Brien. Automatic creation of tile size selection models. In Proceedings of the International Symposium on Code Generation and Optimization (CGO), pages 190--199, 2010.

Cited By

View all
  • (2024)Software Failure Prediction Based On Program State and First-Error CharacteristicsThe Computer Journal10.1093/comjnl/bxae02567:8(2559-2572)Online publication date: 23-Mar-2024
  • (2023)A Generative and Mutational Approach for Synthesizing Bug-Exposing Test Cases to Guide Compiler FuzzingProceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3611643.3616332(1127-1139)Online publication date: 30-Nov-2023
  • (2023)Efficiently Learning Locality Optimizations by Decomposing Transformation DomainsProceedings of the 32nd ACM SIGPLAN International Conference on Compiler Construction10.1145/3578360.3580272(37-49)Online publication date: 17-Feb-2023
  • Show More Cited By

Index Terms

  1. Using graph-based program characterization for predictive modeling

      Recommendations

      Comments

      Please enable JavaScript to view thecomments powered by Disqus.

      Information & Contributors

      Information

      Published In

      cover image ACM Conferences
      CGO '12: Proceedings of the Tenth International Symposium on Code Generation and Optimization
      March 2012
      285 pages
      ISBN:9781450312066
      DOI:10.1145/2259016
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Sponsors

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 31 March 2012

      Permissions

      Request permissions for this article.

      Check for updates

      Author Tags

      1. compiler optimization
      2. graph-based program characterization
      3. iterative compilation
      4. machine learning
      5. support vector machine

      Qualifiers

      • Research-article

      Funding Sources

      Conference

      CGO '12

      Acceptance Rates

      CGO '12 Paper Acceptance Rate 26 of 90 submissions, 29%;
      Overall Acceptance Rate 312 of 1,061 submissions, 29%

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months)26
      • Downloads (Last 6 weeks)2
      Reflects downloads up to 18 Nov 2024

      Other Metrics

      Citations

      Cited By

      View all
      • (2024)Software Failure Prediction Based On Program State and First-Error CharacteristicsThe Computer Journal10.1093/comjnl/bxae02567:8(2559-2572)Online publication date: 23-Mar-2024
      • (2023)A Generative and Mutational Approach for Synthesizing Bug-Exposing Test Cases to Guide Compiler FuzzingProceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering10.1145/3611643.3616332(1127-1139)Online publication date: 30-Nov-2023
      • (2023)Efficiently Learning Locality Optimizations by Decomposing Transformation DomainsProceedings of the 32nd ACM SIGPLAN International Conference on Compiler Construction10.1145/3578360.3580272(37-49)Online publication date: 17-Feb-2023
      • (2023)Multigraph learning for parallelism discovery in sequential programsConcurrency and Computation: Practice and Experience10.1002/cpe.764835:9Online publication date: 13-Feb-2023
      • (2022)Improving Vectorization Heuristics in a Dynamic Compiler with Machine Learning ModelsProceedings of the 14th ACM SIGPLAN International Workshop on Virtual Machines and Intermediate Languages10.1145/3563838.3567679(36-47)Online publication date: 29-Nov-2022
      • (2022)Machine-Learning-Based Self-Optimizing Compiler Heuristics✱Proceedings of the 19th International Conference on Managed Programming Languages and Runtimes10.1145/3546918.3546921(98-111)Online publication date: 14-Sep-2022
      • (2022)Making the Most of Scarce Input Data in Deep Learning-Based Source Code Classification for Heterogeneous Device MappingIEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems10.1109/TCAD.2021.311461741:6(1636-1648)Online publication date: Jun-2022
      • (2022)Multi-View Learning for Parallelism Discovery of Sequential Programs2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)10.1109/IPDPSW55747.2022.00059(295-303)Online publication date: May-2022
      • (2022)Reduced O3 subsequence labelling: a stepping stone towards optimisation sequence predictionConnection Science10.1080/09540091.2022.204476134:1(2860-2877)Online publication date: 1-Mar-2022
      • (2021)Iterative Compilation Optimization Based on Metric Learning and Collaborative FilteringACM Transactions on Architecture and Code Optimization10.1145/348025019:1(1-25)Online publication date: 6-Dec-2021
      • Show More Cited By

      View Options

      Login options

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      Media

      Figures

      Other

      Tables

      Share

      Share

      Share this Publication link

      Share on social media