Research Article | Public Access
DOI: 10.1145/3205289.3205321

Bootstrapping Parameter Space Exploration for Fast Tuning

Published: 12 June 2018

Abstract

Tuning parameters to optimize performance or other metrics of interest, such as energy or variability, can be resource- and time-consuming. The size of the parameter space makes comprehensive exploration infeasible. In this paper, we propose a novel bootstrap scheme, called GEIST, for parameter space exploration that finds performance-optimizing configurations quickly. Our scheme represents the parameter space as a graph whose connectivity guides information propagation from known configurations. Guided by the predictions of a semi-supervised learning method over the parameter graph, GEIST adaptively samples the space and finds desirable configurations using results from only a limited number of experiments. We show the effectiveness of GEIST in selecting application input options, compiler flags, and runtime/system settings for several parallel codes, including LULESH, Kripke, Hypre, and OpenAtom.
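The abstract outlines GEIST's core idea: build a graph over parameter configurations and let measurements from evaluated configurations propagate along its edges to predict which unevaluated configurations look promising, then sample adaptively. The sketch below illustrates that flavor of graph-guided adaptive sampling in Python. It is a minimal illustration under stated assumptions, not the paper's algorithm: the single-parameter-difference edge rule, the neighbor-averaging propagation, the toy_runtime objective, and all function names are hypothetical stand-ins.

```python
import itertools
import random

# Hypothetical parameter space: each key is a tunable, each list its legal values.
PARAMS = {
    "threads": [1, 2, 4, 8],
    "block_size": [16, 32, 64],
    "prefetch": [0, 1],
}

def toy_runtime(cfg):
    """Stand-in for actually running an experiment; lower is better."""
    return (8.0 / cfg["threads"]) + abs(cfg["block_size"] - 32) / 32.0 + 0.2 * cfg["prefetch"]

def build_graph(params):
    """Nodes are configurations; edges join configs differing in exactly one parameter."""
    nodes = [dict(zip(params, vals)) for vals in itertools.product(*params.values())]
    edges = {i: [] for i in range(len(nodes))}
    for i, a in enumerate(nodes):
        for j in range(i + 1, len(nodes)):
            if sum(a[k] != nodes[j][k] for k in params) == 1:
                edges[i].append(j)
                edges[j].append(i)
    return nodes, edges

def propagate(measured, edges, iters=20):
    """Spread measured runtimes to unlabeled nodes by averaging over neighbors."""
    est = dict(measured)
    for _ in range(iters):
        new = dict(est)
        for node, nbrs in edges.items():
            if node in measured:          # measured nodes stay clamped
                continue
            known = [est[n] for n in nbrs if n in est]
            if known:
                new[node] = sum(known) / len(known)
        est = new
    return est

def geist_like_search(params, budget=8, seed=0):
    """Bootstrap with random samples, then greedily sample the best predicted config."""
    random.seed(seed)
    nodes, edges = build_graph(params)
    measured = {}                         # node index -> observed runtime
    for idx in random.sample(range(len(nodes)), 2):
        measured[idx] = toy_runtime(nodes[idx])
    while len(measured) < budget:
        est = propagate(measured, edges)
        candidates = [i for i in est if i not in measured]
        if not candidates:
            break
        nxt = min(candidates, key=est.get)   # predicted-fastest unmeasured config
        measured[nxt] = toy_runtime(nodes[nxt])
    best = min(measured, key=measured.get)
    return nodes[best], measured[best]

if __name__ == "__main__":
    cfg, runtime = geist_like_search(PARAMS)
    print("best configuration found:", cfg, "with runtime", round(runtime, 3))
```

In a real tuner, toy_runtime would be replaced by an actual application or benchmark run, and the neighbor-averaging step by the semi-supervised learner the paper describes.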

Information & Contributors

Information

Published In

ICS '18: Proceedings of the 2018 International Conference on Supercomputing
June 2018
407 pages
ISBN: 9781450357838
DOI: 10.1145/3205289
© 2018 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 June 2018

Author Tags

  1. autotuning
  2. performance
  3. sampling
  4. semi-supervised learning

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICS '18

Acceptance Rates

Overall Acceptance Rate: 629 of 2,180 submissions, 29%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 113
  • Downloads (Last 6 weeks): 16
Reflects downloads up to 21 Nov 2024

Citations

Cited By

  • (2024) ytopt: Autotuning Scientific Applications for Energy Efficiency at Large Scales. Concurrency and Computation: Practice and Experience. DOI: 10.1002/cpe.8322. Online publication date: 30-Oct-2024.
  • (2023) CoTuner: A Hierarchical Learning Framework for Coordinately Optimizing Resource Partitioning and Parameter Tuning. Proceedings of the 52nd International Conference on Parallel Processing, 317-326. DOI: 10.1145/3605573.3605578. Online publication date: 7-Aug-2023.
  • (2023) Performance Optimization using Multimodal Modeling and Heterogeneous GNN. Proceedings of the 32nd International Symposium on High-Performance Parallel and Distributed Computing, 45-57. DOI: 10.1145/3588195.3592984. Online publication date: 7-Aug-2023.
  • (2023) Scalable Tuning of (OpenMP) GPU Applications via Kernel Record and Replay. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 1-14. DOI: 10.1145/3581784.3607098. Online publication date: 12-Nov-2023.
  • (2023) Efficiently Learning Locality Optimizations by Decomposing Transformation Domains. Proceedings of the 32nd ACM SIGPLAN International Conference on Compiler Construction, 37-49. DOI: 10.1145/3578360.3580272. Online publication date: 17-Feb-2023.
  • (2023) Transfer-learning-based Autotuning using Gaussian Copula. Proceedings of the 37th International Conference on Supercomputing, 37-49. DOI: 10.1145/3577193.3593712. Online publication date: 21-Jun-2023.
  • (2022) Tackling Climate Change with Machine Learning. ACM Computing Surveys 55(2), 1-96. DOI: 10.1145/3485128. Online publication date: 7-Feb-2022.
  • (2021) Bootstrapping in-situ workflow auto-tuning via combining performance models of component applications. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 1-15. DOI: 10.1145/3458817.3476197. Online publication date: 14-Nov-2021.
  • (2021) Bliss: auto-tuning complex applications using a pool of diverse lightweight learning models. Proceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation, 1280-1295. DOI: 10.1145/3453483.3454109. Online publication date: 19-Jun-2021.
  • (2021) Autotuning PolyBench benchmarks with LLVM Clang/Polly loop optimization pragmas using Bayesian optimization. Concurrency and Computation: Practice and Experience 34(20). DOI: 10.1002/cpe.6683. Online publication date: 8-Nov-2021.
