DOI: 10.5555/3454287.3455167

Practical two-step look-ahead Bayesian optimization

Published: 08 December 2019

Abstract

Expected improvement and other acquisition functions widely used in Bayesian optimization rest on a "one-step" assumption: they value objective function evaluations as if no future evaluations will be performed. Because we usually evaluate over multiple steps, this assumption may leave substantial room for improvement. Existing theory gives acquisition functions that look multiple steps into the future, but calculating them requires solving a high-dimensional continuous-state, continuous-action Markov decision process (MDP), and fast exact solutions of this MDP remain out of reach of today's methods. As a result, previous two- and multi-step lookahead Bayesian optimization algorithms are either too expensive to implement in most practical settings or resort to heuristics that may fail to fully realize the promise of two-step lookahead. This paper proposes a computationally efficient algorithm that provides an accurate solution to the two-step lookahead Bayesian optimization problem in seconds to at most several minutes of computation per batch of evaluations. The resulting acquisition function offers increased query efficiency and robustness compared with previous two- and multi-step lookahead methods in both single-threaded and batch experiments, unlocking the value of two-step lookahead in practice. We demonstrate this with extensive experiments on synthetic test functions and real-world problems.
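
To make the "one-step" versus "two-step" distinction concrete, here is the standard way the two criteria are written for a maximization problem, consistent with the abstract's description though not necessarily the paper's exact notation. With n observations, let E_n denote expectation under the current posterior and f_n^* the best value observed so far; then

    \mathrm{EI}_n(x) = \mathbb{E}_n\big[(f(x) - f_n^*)^+\big],
    \mathrm{TwoStep}_n(x) = \mathrm{EI}_n(x) + \mathbb{E}_n\Big[\max_{x'} \mathrm{EI}_{n+1}(x')\Big],

where the outer expectation in the second term is over the random outcome y of evaluating at x, and EI_{n+1} is expected improvement under the posterior updated with (x, y). The inner maximization over x' must be solved for every realization of y, which is what turns the two-step criterion into an MDP-style computation rather than a closed-form one.

The sketch below is a naive nested Monte Carlo estimator of this criterion with a zero-mean Gaussian process surrogate, written in NumPy/SciPy. The RBF kernel, its fixed hyperparameters, the fantasy-sample count, and the candidate-grid inner maximization are all illustrative assumptions; this is the brute-force baseline whose cost motivates a faster algorithm, not the paper's method.

    import numpy as np
    from scipy.stats import norm

    def rbf_kernel(A, B, lengthscale=0.2, variance=1.0):
        # Squared-exponential kernel between the rows of A and B.
        d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
        return variance * np.exp(-0.5 * d2 / lengthscale**2)

    def gp_posterior(X, y, Xq, noise=1e-6):
        # Posterior mean and standard deviation of a zero-mean GP at query points Xq.
        L = np.linalg.cholesky(rbf_kernel(X, X) + noise * np.eye(len(X)))
        Kq = rbf_kernel(X, Xq)
        mu = Kq.T @ np.linalg.solve(L.T, np.linalg.solve(L, y))
        v = np.linalg.solve(L, Kq)
        var = np.clip(rbf_kernel(Xq, Xq).diagonal() - np.sum(v**2, 0), 1e-12, None)
        return mu, np.sqrt(var)

    def expected_improvement(mu, sigma, best):
        # Closed-form one-step EI for maximization: E[(f(x) - best)^+].
        z = (mu - best) / sigma
        return (mu - best) * norm.cdf(z) + sigma * norm.pdf(z)

    def two_step_acquisition(x, X, y, candidates, n_fantasies=32, seed=0):
        # Monte Carlo estimate of EI_n(x) + E_n[ max_{x'} EI_{n+1}(x') ].
        rng = np.random.default_rng(seed)
        mu, sigma = gp_posterior(X, y, x[None, :])
        first = expected_improvement(mu, sigma, y.max())[0]
        second = 0.0
        for f in mu[0] + sigma[0] * rng.standard_normal(n_fantasies):
            # "Fantasize" the next observation, refit, and solve the inner problem on a grid.
            X2, y2 = np.vstack([X, x]), np.append(y, f)
            mu2, sig2 = gp_posterior(X2, y2, candidates)
            second += expected_improvement(mu2, sig2, y2.max()).max()
        return first + second / n_fantasies

    # Example: score one point of a 1-D problem with 5 observations.
    rng = np.random.default_rng(1)
    X = rng.uniform(0.0, 1.0, (5, 1))
    y = np.sin(6.0 * X[:, 0])
    grid = np.linspace(0.0, 1.0, 101)[:, None]
    print(two_step_acquisition(np.array([0.5]), X, y, grid))

Maximizing two_step_acquisition over x with an outer optimizer yields a (slow) two-step policy: each acquisition evaluation costs n_fantasies posterior updates plus inner maximizations, which is precisely the expense the abstract says previous methods could not avoid.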

Cited By

  • (2024) Adaptive Bayesian Optimization Algorithm for Unpredictable Business Environments. Proceedings of the 2024 8th International Conference on Intelligent Systems, Metaheuristics & Swarm Intelligence, pp. 78-85. https://doi.org/10.1145/3665065.3665078. Online publication date: 24 April 2024.

Published In

NIPS'19: Proceedings of the 33rd International Conference on Neural Information Processing Systems, December 2019, 15947 pages.

Publisher: Curran Associates Inc., Red Hook, NY, United States

Qualifiers

  • Chapter
  • Research
  • Refereed limited
