DOI: 10.5555/3454287.3455167

Practical two-step look-ahead Bayesian optimization

Published: 08 December 2019

Abstract

Expected improvement and other acquisition functions widely used in Bayesian optimization rest on a "one-step" assumption: they value objective function evaluations as if no future evaluations will be performed. Because we usually evaluate over multiple steps, this assumption may leave substantial room for improvement. Existing theory gives acquisition functions that look multiple steps into the future, but calculating them requires solving a high-dimensional continuous-state, continuous-action Markov decision process (MDP), and fast exact solutions of this MDP remain out of reach of today's methods. As a result, previous two- and multi-step lookahead Bayesian optimization algorithms are either too expensive to implement in most practical settings or resort to heuristics that may fail to fully realize the promise of two-step lookahead. This paper proposes a computationally efficient algorithm that provides an accurate solution to the two-step lookahead Bayesian optimization problem in seconds to at most several minutes of computation per batch of evaluations. The resulting acquisition function offers increased query efficiency and robustness compared with previous two- and multi-step lookahead methods in both single-threaded and batch experiments, unlocking the value of two-step lookahead in practice. We demonstrate this with extensive experiments on synthetic test functions and real-world problems.
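
To make the "one-step" versus "two-step" distinction concrete, here is the standard way the two criteria are written for a maximization problem, consistent with the abstract's description though not necessarily the paper's exact notation. With n observations, let E_n denote expectation under the current posterior and f_n^* the best value observed so far; then

    \mathrm{EI}_n(x) = \mathbb{E}_n\big[(f(x) - f_n^*)^+\big],
    \mathrm{TwoStep}_n(x) = \mathrm{EI}_n(x) + \mathbb{E}_n\Big[\max_{x'} \mathrm{EI}_{n+1}(x')\Big],

where the outer expectation in the second term is over the random outcome y of evaluating at x, and EI_{n+1} is expected improvement under the posterior updated with (x, y). The inner maximization over x' must be solved for every realization of y, which is what turns the two-step criterion into an MDP-style computation rather than a closed-form one.

The sketch below is a naive nested Monte Carlo estimator of this criterion with a zero-mean Gaussian process surrogate, written in NumPy/SciPy. The RBF kernel, its fixed hyperparameters, the fantasy-sample count, and the candidate-grid inner maximization are all illustrative assumptions; this is the brute-force baseline whose cost motivates a faster algorithm, not the paper's method.

    import numpy as np
    from scipy.stats import norm

    def rbf_kernel(A, B, lengthscale=0.2, variance=1.0):
        # Squared-exponential kernel between the rows of A and B.
        d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
        return variance * np.exp(-0.5 * d2 / lengthscale**2)

    def gp_posterior(X, y, Xq, noise=1e-6):
        # Posterior mean and standard deviation of a zero-mean GP at query points Xq.
        L = np.linalg.cholesky(rbf_kernel(X, X) + noise * np.eye(len(X)))
        Kq = rbf_kernel(X, Xq)
        mu = Kq.T @ np.linalg.solve(L.T, np.linalg.solve(L, y))
        v = np.linalg.solve(L, Kq)
        var = np.clip(rbf_kernel(Xq, Xq).diagonal() - np.sum(v**2, 0), 1e-12, None)
        return mu, np.sqrt(var)

    def expected_improvement(mu, sigma, best):
        # Closed-form one-step EI for maximization: E[(f(x) - best)^+].
        z = (mu - best) / sigma
        return (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)

    def two_step_acquisition(x, X, y, candidates, n_fantasies=32, seed=0):
        # Monte Carlo estimate of EI_n(x) + E_n[ max_{x'} EI_{n+1}(x') ].
        rng = np.random.default_rng(seed)
        mu, sigma = gp_posterior(X, y, x[None, :])
        first = expected_improvement(mu, sigma, y.max())[0]
        second = 0.0
        for f in mu[0] + sigma[0] * rng.standard_normal(n_fantasies):
            # "Fantasize" the next observation, refit, and solve the inner problem on a grid.
            X2, y2 = np.vstack([X, x]), np.append(y, f)
            mu2, sig2 = gp_posterior(X2, y2, candidates)
            second += expected_improvement(mu2, sig2, y2.max()).max()
        return first + second / n_fantasies

    # Example: score one point of a 1-D problem with 5 observations.
    rng = np.random.default_rng(1)
    X = rng.uniform(0.0, 1.0, (5, 1))
    y = np.sin(6.0 * X[:, 0])
    grid = np.linspace(0.0, 1.0, 101)[:, None]
    print(two_step_acquisition(np.array([0.5]), X, y, grid))

Maximizing two_step_acquisition over x with an outer optimizer yields a (slow) two-step policy: each acquisition evaluation costs n_fantasies posterior updates plus inner maximizations, which is precisely the expense the abstract says previous methods could not avoid.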

Cited By

  • (2024) Adaptive Bayesian Optimization Algorithm for Unpredictable Business Environments. Proceedings of the 2024 8th International Conference on Intelligent Systems, Metaheuristics & Swarm Intelligence, pp. 78-85. https://doi.org/10.1145/3665065.3665078. Online publication date: 24 April 2024.

Published In

NIPS'19: Proceedings of the 33rd International Conference on Neural Information Processing Systems, December 2019, 15947 pages.

Publisher: Curran Associates Inc., Red Hook, NY, United States

Qualifiers

  • Chapter
  • Research
  • Refereed limited
