Research Article | Public Access
DOI: 10.1145/3205289.3205321

Bootstrapping Parameter Space Exploration for Fast Tuning

Published: 12 June 2018

Abstract

Tuning parameters to optimize performance or other metrics of interest, such as energy or variability, can be resource- and time-consuming. The size of the parameter space makes comprehensive exploration infeasible. In this paper, we propose a novel bootstrap scheme, called GEIST, for parameter space exploration that finds performance-optimizing configurations quickly. Our scheme represents the parameter space as a graph whose connectivity guides information propagation from known configurations. Guided by the predictions of a semi-supervised learning method over the parameter graph, GEIST adaptively samples the space and finds desirable configurations using results from only a limited number of experiments. We show the effectiveness of GEIST in selecting application input options, compiler flags, and runtime/system settings for several parallel codes, including LULESH, Kripke, Hypre, and OpenAtom.
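The abstract outlines GEIST's core idea: build a graph over parameter configurations and let measurements from evaluated configurations propagate along its edges to predict which unevaluated configurations look promising, then sample adaptively. The sketch below illustrates that flavor of graph-guided adaptive sampling in Python. It is a minimal illustration under stated assumptions, not the paper's algorithm: the single-parameter-difference edge rule, the neighbor-averaging propagation, the toy_runtime objective, and all function names are hypothetical stand-ins.

```python
import itertools
import random

# Hypothetical parameter space: each key is a tunable, each list its legal values.
PARAMS = {
    "threads": [1, 2, 4, 8],
    "block_size": [16, 32, 64],
    "prefetch": [0, 1],
}

def toy_runtime(cfg):
    """Stand-in for actually running an experiment; lower is better."""
    return (8.0 / cfg["threads"]) + abs(cfg["block_size"] - 32) / 32.0 + 0.2 * cfg["prefetch"]

def build_graph(params):
    """Nodes are configurations; edges join configs differing in exactly one parameter."""
    nodes = [dict(zip(params, vals)) for vals in itertools.product(*params.values())]
    edges = {i: [] for i in range(len(nodes))}
    for i, a in enumerate(nodes):
        for j in range(i + 1, len(nodes)):
            if sum(a[k] != nodes[j][k] for k in params) == 1:
                edges[i].append(j)
                edges[j].append(i)
    return nodes, edges

def propagate(measured, edges, iters=20):
    """Spread measured runtimes to unlabeled nodes by averaging over neighbors."""
    est = dict(measured)
    for _ in range(iters):
        new = dict(est)
        for node, nbrs in edges.items():
            if node in measured:          # measured nodes stay clamped
                continue
            known = [est[n] for n in nbrs if n in est]
            if known:
                new[node] = sum(known) / len(known)
        est = new
    return est

def geist_like_search(params, budget=8, seed=0):
    """Bootstrap with random samples, then greedily sample the best predicted config."""
    random.seed(seed)
    nodes, edges = build_graph(params)
    measured = {}                         # node index -> observed runtime
    for idx in random.sample(range(len(nodes)), 2):
        measured[idx] = toy_runtime(nodes[idx])
    while len(measured) < budget:
        est = propagate(measured, edges)
        candidates = [i for i in est if i not in measured]
        if not candidates:
            break
        nxt = min(candidates, key=est.get)   # predicted-fastest unmeasured config
        measured[nxt] = toy_runtime(nodes[nxt])
    best = min(measured, key=measured.get)
    return nodes[best], measured[best]

if __name__ == "__main__":
    cfg, runtime = geist_like_search(PARAMS)
    print("best configuration found:", cfg, "with runtime", round(runtime, 3))
```

In a real tuner, toy_runtime would be replaced by an actual application or benchmark run, and the neighbor-averaging step by the semi-supervised learner the paper describes.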

Information & Contributors

Information

Published In

ICS '18: Proceedings of the 2018 International Conference on Supercomputing
June 2018
407 pages
ISBN: 9781450357838
DOI: 10.1145/3205289
© 2018 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 June 2018

Author Tags

  1. autotuning
  2. performance
  3. sampling
  4. semi-supervised learning

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICS '18

Acceptance Rates

Overall Acceptance Rate: 629 of 2,180 submissions, 29%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 113
  • Downloads (Last 6 weeks): 16
Reflects downloads up to 21 Nov 2024

Citations

Cited By

  • (2024) ytopt: Autotuning Scientific Applications for Energy Efficiency at Large Scales. Concurrency and Computation: Practice and Experience. DOI: 10.1002/cpe.8322. Online publication date: 30-Oct-2024.
  • (2023) CoTuner: A Hierarchical Learning Framework for Coordinately Optimizing Resource Partitioning and Parameter Tuning. Proceedings of the 52nd International Conference on Parallel Processing, 317-326. DOI: 10.1145/3605573.3605578. Online publication date: 7-Aug-2023.
  • (2023) Performance Optimization using Multimodal Modeling and Heterogeneous GNN. Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing, 45-57. DOI: 10.1145/3588195.3592984. Online publication date: 7-Aug-2023.
  • (2023) Scalable Tuning of (OpenMP) GPU Applications via Kernel Record and Replay. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 1-14. DOI: 10.1145/3581784.3607098. Online publication date: 12-Nov-2023.
  • (2023) Efficiently Learning Locality Optimizations by Decomposing Transformation Domains. Proceedings of the 32nd ACM SIGPLAN International Conference on Compiler Construction, 37-49. DOI: 10.1145/3578360.3580272. Online publication date: 17-Feb-2023.
  • (2023) Transfer-learning-based Autotuning using Gaussian Copula. Proceedings of the 37th International Conference on Supercomputing, 37-49. DOI: 10.1145/3577193.3593712. Online publication date: 21-Jun-2023.
  • (2022) Tackling Climate Change with Machine Learning. ACM Computing Surveys 55(2), 1-96. DOI: 10.1145/3485128. Online publication date: 7-Feb-2022.
  • (2021) Bootstrapping in-situ workflow auto-tuning via combining performance models of component applications. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 1-15. DOI: 10.1145/3458817.3476197. Online publication date: 14-Nov-2021.
  • (2021) Bliss: auto-tuning complex applications using a pool of diverse lightweight learning models. Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, 1280-1295. DOI: 10.1145/3453483.3454109. Online publication date: 19-Jun-2021.
  • (2021) Autotuning PolyBench benchmarks with LLVM Clang/Polly loop optimization pragmas using Bayesian optimization. Concurrency and Computation: Practice and Experience 34(20). DOI: 10.1002/cpe.6683. Online publication date: 8-Nov-2021.
