Abstract
Bayesian optimisation is a widely used technique for finding the optima of black-box functions in a sample-efficient way. When several optimisation tasks run concurrently, it may be possible to transfer knowledge across them in a multi-task setting and improve efficiency further. Transferring knowledge requires estimating task similarity, which in turn requires good knowledge of the objective functions. However, in a multi-task Bayesian optimisation setting the number of observations for each function can be small, especially at the beginning, making reliable computation of task similarities difficult. In this paper, we propose a novel multi-task Bayesian optimisation method that uses an information-theoretic approach to transfer knowledge across tasks and to handle the uncertainty of the similarity measurements in a unified framework. Each optimisation task draws on the other tasks via a mixture model over the location of the optimum, obtained by appropriately combining the distributions over optimum locations of the individual tasks. The distribution over the optimum location of an individual task is available because each objective function is modelled by a Gaussian process. The weights of the mixture distribution are computed from the similarities between pairs of these distributions (measured via KL divergence) and then down-weighted by the uncertainty in that knowledge. That is, we encourage transfer between two tasks only when both are confident about a high similarity measure, and discourage it when they are not confident, even if the measured similarity is high. We evaluate our proposed method on both synthetic functions and a set of hyperparameter tuning tasks, and demonstrate its effectiveness against state-of-the-art algorithms.
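To make the weighting idea concrete, the following is a minimal sketch in Python (NumPy only) of one plausible reading of the abstract: approximate each task's distribution over its optimum location by sampling functions from a Gaussian-process posterior on a grid, then form mixture weights that combine similarity (symmetric KL divergence between these distributions) with confidence (an entropy-based down-weighting). The helper names (`rbf_kernel`, `optimum_distribution`, `mixture_weights`), the symmetric-KL similarity, and the entropy-based confidence factor are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2, variance=1.0):
    """Squared-exponential kernel on 1-D inputs."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def optimum_distribution(x_obs, y_obs, grid, n_samples=2000, noise=1e-4):
    """Monte-Carlo estimate of p(x* = g) for each g in `grid`:
    sample GP posterior functions and record where each attains its maximum."""
    K = rbf_kernel(x_obs, x_obs) + noise * np.eye(len(x_obs))
    K_s = rbf_kernel(grid, x_obs)
    K_ss = rbf_kernel(grid, grid)
    K_inv = np.linalg.inv(K)
    mu = K_s @ K_inv @ y_obs
    cov = K_ss - K_s @ K_inv @ K_s.T
    f = np.random.multivariate_normal(
        mu, cov + 1e-8 * np.eye(len(grid)), size=n_samples)
    counts = np.bincount(f.argmax(axis=1), minlength=len(grid))
    # Small smoothing keeps the KL divergence finite.
    return (counts + 1e-12) / (counts.sum() + 1e-12 * len(grid))

def kl(p, q):
    return float(np.sum(p * np.log(p / q)))

def mixture_weights(dists, target):
    """Weights for transferring into task `target`: large when a source's
    optimum distribution is similar (small symmetric KL) AND confidently
    estimated (low entropy). The entropy factor is an assumption made
    here for illustration."""
    w = []
    for t, p in enumerate(dists):
        if t == target:
            w.append(1.0)
            continue
        sim = np.exp(-0.5 * (kl(p, dists[target]) + kl(dists[target], p)))
        conf = np.exp(-np.sum(-p * np.log(p)))  # down-weight uncertain sources
        w.append(sim * conf)
    w = np.array(w)
    return w / w.sum()

# Toy usage: two 1-D tasks on [0, 1] with nearby optima.
grid = np.linspace(0.0, 1.0, 100)
p1 = optimum_distribution(np.array([0.1, 0.4, 0.7]),
                          np.array([0.2, 1.0, 0.3]), grid)
p2 = optimum_distribution(np.array([0.2, 0.45, 0.8]),
                          np.array([0.1, 0.9, 0.2]), grid)
print(mixture_weights([p1, p2], target=0))
```

The product `sim * conf` realises the abstract's key point: a source task contributes strongly to the mixture only when its distribution over the optimum is both similar to the target's and confidently estimated; a diffuse (uncertain) source is discouraged from transferring even if its measured similarity happens to be high.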
Acknowledgment
This research was partially funded by the Australian Government through the Australian Research Council (ARC). Prof Venkatesh is the recipient of an ARC Australian Laureate Fellowship (FL170100006).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Ramachandran, A., Gupta, S., Rana, S., Venkatesh, S. (2019). Information-Theoretic Multi-task Learning Framework for Bayesian Optimisation. In: Liu, J., Bailey, J. (eds.) AI 2019: Advances in Artificial Intelligence. Lecture Notes in Computer Science, vol. 11919. Springer, Cham. https://doi.org/10.1007/978-3-030-35288-2_40
DOI: https://doi.org/10.1007/978-3-030-35288-2_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-35287-5
Online ISBN: 978-3-030-35288-2
eBook Packages: Computer Science, Computer Science (R0)