
Information-Theoretic Multi-task Learning Framework for Bayesian Optimisation

  • Conference paper
AI 2019: Advances in Artificial Intelligence (AI 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11919)


Abstract

Bayesian optimisation is a widely used technique for finding the optima of black-box functions in a sample-efficient way. When several optimisation tasks/functions run concurrently, it may be possible to transfer knowledge between them in a multi-task setting and improve efficiency further. Transferring knowledge requires estimating task similarity, which in turn requires good knowledge of the objective functions. However, in a multi-task Bayesian optimisation setting the number of observations for each function can be small, especially at the beginning, making reliable computation of task similarities difficult. In this paper, we propose a novel multi-task Bayesian optimisation method that uses an information-theoretic approach to transfer knowledge across tasks and handle the uncertainty of similarity measurements in a unified framework. Each optimisation task receives contributions from the other tasks via a mixture model on the location of the optimum, formed by appropriately combining the distributions over the optimal location of the individual tasks. These per-task distributions are available because the objective functions are modelled using Gaussian processes. The weights of the mixture distribution are computed from the similarity between two distributions (measured via KL divergence) and then discounted by the uncertainty in that knowledge. That is, we encourage transfer of knowledge only when two tasks are confident about their high similarity, and discourage it when they are not, even if the measured similarity is high. We evaluate our proposed method against state-of-the-art algorithms and demonstrate its effectiveness on both synthetic functions and a set of hyperparameter tuning tasks.
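To make the transfer mechanism concrete, below is a minimal sketch in Python of the general idea, not the authors' implementation: each task's distribution over its optimum location is approximated by sampling argmax locations from Gaussian process posterior draws on a grid, and each task's mixture weights over the other tasks are formed from a KL-divergence-based similarity discounted by a confidence score. The exp(-KL) similarity map, the histogram discretisation, the confidences argument, and all function names here are illustrative assumptions; the paper's exact weighting and uncertainty handling are not specified by the abstract.

    # Illustrative sketch of KL-similarity-based transfer weights for
    # multi-task Bayesian optimisation. Names and the exact weighting
    # scheme are assumptions, not the paper's formulation.
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF

    def optimum_location_samples(gp, grid, n_draws=200, seed=0):
        """Approximate p(x* | data): draw functions from the GP posterior
        on a discrete grid and record where each draw is maximal."""
        draws = gp.sample_y(grid, n_samples=n_draws, random_state=seed)
        return grid[np.argmax(draws, axis=0)]

    def histogram_density(samples, bins):
        """Discretise optimiser samples into a normalised histogram; a
        small floor keeps the KL divergence below finite."""
        hist, _ = np.histogram(samples, bins=bins)
        p = hist.astype(float) + 1e-9
        return p / p.sum()

    def kl_divergence(p, q):
        return float(np.sum(p * np.log(p / q)))

    def mixture_weights(densities, confidences):
        """Row i mixes the tasks' optimum-location distributions for task
        i: task j's weight is exp(-KL(p_i || p_j)), discounted by task
        j's confidence in [0, 1], then the row is normalised to one."""
        k = len(densities)
        w = np.zeros((k, k))
        for i in range(k):
            for j in range(k):
                sim = np.exp(-kl_divergence(densities[i], densities[j]))
                w[i, j] = sim * confidences[j]
            w[i] /= w[i].sum()
        return w

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        grid = np.linspace(0.0, 1.0, 100).reshape(-1, 1)
        bins = np.linspace(0.0, 1.0, 21)
        densities = []
        for shift in (0.30, 0.35, 0.80):           # three toy 1-D tasks
            X = rng.uniform(0.0, 1.0, (6, 1))      # a few observations each
            y = -((X[:, 0] - shift) ** 2)          # toy black-box objective
            gp = GaussianProcessRegressor(kernel=RBF(0.2), alpha=1e-6).fit(X, y)
            xs = optimum_location_samples(gp, grid)
            densities.append(histogram_density(xs[:, 0], bins))
        conf = np.array([1.0, 1.0, 1.0])           # placeholder confidences
        print(np.round(mixture_weights(densities, conf), 2))

In this toy run, the two tasks with nearby optima (shifts 0.30 and 0.35) assign each other noticeably larger weights than either assigns the third task, which is the qualitative behaviour the abstract describes.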

Acknowledgment

This research was partially funded by the Australian Government through the Australian Research Council (ARC). Prof Venkatesh is the recipient of an ARC Australian Laureate Fellowship (FL170100006).

Author information

Corresponding author

Correspondence to Anil Ramachandran.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Ramachandran, A., Gupta, S., Rana, S., Venkatesh, S. (2019). Information-Theoretic Multi-task Learning Framework for Bayesian Optimisation. In: Liu, J., Bailey, J. (eds) AI 2019: Advances in Artificial Intelligence. AI 2019. Lecture Notes in Computer Science (LNAI), vol 11919. Springer, Cham. https://doi.org/10.1007/978-3-030-35288-2_40

  • DOI: https://doi.org/10.1007/978-3-030-35288-2_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-35287-5

  • Online ISBN: 978-3-030-35288-2

  • eBook Packages: Computer Science, Computer Science (R0)
