Abstract
This paper explores the theoretical basis of the covariance matrix adaptation evolution strategy (CMA-ES) from the information geometry viewpoint.
To establish a theoretical foundation for the CMA-ES, we focus on the geometric structure of a Riemannian manifold of probability distributions equipped with the Fisher metric. We define a function on this manifold, namely the expectation of the fitness over the sampling distribution, and regard the goal of updating the parameters of the sampling distribution in the CMA-ES as the maximization of this expected fitness. We investigate steepest-ascent learning for expected fitness maximization, where the steepest-ascent direction is given by the natural gradient, that is, the product of the inverse of the Fisher information matrix and the conventional gradient of the function.
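In standard notation, this setup reads as follows (a restatement of the above, not an additional result): writing $J(\theta) = \mathbb{E}_{p_\theta}[f(X)]$ for the expected fitness under the sampling distribution $p_\theta$, steepest-ascent learning iterates

$$\theta \leftarrow \theta + \eta\, \tilde{\nabla} J(\theta), \qquad \tilde{\nabla} J(\theta) = F(\theta)^{-1}\, \nabla_{\theta} J(\theta),$$

where $F(\theta)$ is the Fisher information matrix of $p_\theta$ and $\eta > 0$ is the learning rate.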
Our first result is that, under certain parameterizations of the multivariate normal distribution, the natural gradient of the expected fitness can be obtained without inverting the Fisher information matrix. We find that the update of the distribution parameters in the CMA-ES coincides with natural gradient learning for expected fitness maximization. Our second result is a derivation of the range of learning rates for which a step in the exact natural gradient direction improves the parameters in terms of expected fitness. From the close relation between the CMA-ES and natural gradient learning, we see that the default learning rates in the CMA-ES appear suitable for monotone improvement of the expected fitness. Finally, we discuss the relation to the expectation-maximization framework and provide an information-geometric interpretation of the CMA-ES.
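As a purely illustrative sketch (not the authors' algorithm), the following NumPy code performs one Monte Carlo natural-gradient ascent step for the (mean, covariance) parameterization of a multivariate normal. The closed-form gradients in the docstring correspond to the inversion-free expression referred to above; the sample size lam, the learning rates eta_m and eta_C, and the baseline subtraction are illustrative assumptions, and eta_C must be small enough to keep C positive definite.

```python
import numpy as np

def natural_gradient_step(m, C, fitness, lam=20, eta_m=1.0, eta_C=0.2, rng=None):
    """One Monte Carlo natural-gradient ascent step on J(m, C) = E[f(X)]
    for X ~ N(m, C).  In the (m, C) parameterization the natural gradient
    has a closed form, so the Fisher matrix is never inverted explicitly:
        grad_m = E[f(X) (X - m)]
        grad_C = E[f(X) ((X - m)(X - m)^T - C)]
    """
    rng = np.random.default_rng() if rng is None else rng
    x = rng.multivariate_normal(m, C, size=lam)       # lam samples, shape (lam, n)
    f = fitness(x)                                    # fitness values, shape (lam,)
    f = f - f.mean()                                  # baseline: reduces MC variance
    d = x - m                                         # deviations from the mean
    grad_m = d.T @ f / lam                            # MC estimate of grad_m
    grad_C = np.einsum('i,ij,ik->jk', f, d, d) / lam  # the -C term cancels exactly
                                                      # because the weights are centered
    return m + eta_m * grad_m, C + eta_C * grad_C
```

For instance, iterating this step with fitness = lambda x: -np.sum(x**2, axis=1) drives the mean toward the optimum at the origin while shrinking the covariance.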
Cite this article
Akimoto, Y., Nagata, Y., Ono, I. et al. Theoretical Foundation for CMA-ES from Information Geometry Perspective. Algorithmica 64, 698–716 (2012). https://doi.org/10.1007/s00453-011-9564-8