Nothing Special   »   [go: up one dir, main page]

skip to main content
article

The Estimation of Distributions and the Minimum Relative Entropy Principle

Published: 01 January 2005 Publication History

Abstract

Estimation of Distribution Algorithms (EDA) have been proposed as an extension of genetic algorithms. In this paper we explain the relationship of EDA to algorithms developed in statistics, artificial intelligence, and statistical physics. The major design issues are discussed within a general interdisciplinary framework. It is shown that maximum entropy approximations play a crucial role. All proposed algorithms try to minimize the Kullback-Leibler divergence KLD between the unknown distribution p(x) and a class q(x) of approximations. However, the Kullback-Leibler divergence is not symmetric. Approximations which suppose that the function to be optimized is additively decomposed (ADF) minimize KLD(q||p), the methods which learn the approximate model from data minimize KLD(p||q). This minimization is identical to maximizing the log-likelihood. In the paper three classes of algorithms are discussed. FDA uses the ADF to compute an approximate factorization of the unknown distribution. The factors are marginal distributions, whose values are computed from samples. The second class is represented by the Bethe-Kikuchi approach which has recently been rediscovered in statistical physics. Here the values of the marginals are computed from a difficult constrained minimization problem. The third class learns the factorization from the data. We analyze our learning algorithm LFDA in detail. It is shown that learning is faced with two problems: first, to detect the important dependencies be- tween the variables, and second, to create an acyclic Bayesian network of bounded clique size.

References

[1]
Almond, R. G. (1995). Graphical Belief Modelling. Chapman & Hall, London.
[2]
Bertelé, U. and Brioschi, F. (1972). Nonserial Dynamic Programming. Academic Press, New York.
[3]
Bethe, H. (1935). Statistical theory of superlattices. Proc. Roy. Soc. London A, 150:552-558.
[4]
Cover, T. M. and Thomas, J. (1989). Elements of Information Theory. Wiley, New York.
[5]
Csiszár, I. (1975). I-divergence geometry of probability distributions and minimization problems. Annals of Probability, 3:146-158.
[6]
Huang, C. and Darwiche, A. (1996). Inference in belief networks: A procedural guide. International Journal of Approximate Reasoning, 15(3):225-263.
[7]
Jaynes, E. T. (1957). Information theory and statistical mechanics. Phys. Rev, 6:620-643.
[8]
Jaynes, E. T. (1978). Where do we stand on maximum entropy? In Levine, R. D. and Tribus, M., editors, The Maximum Entropy Formalism. MIT Press, Cambridge.
[9]
Jensen, F. V. and Jensen, F. (1994). Optimal junction trees. In Proceedings of the 10th Conference on Uncertainty in Artificial Intelligence, pages 360-366, Seattle.
[10]
Jordan, M. I. (1999). Learning in Graphical Models. MIT Press, Cambrigde.
[11]
Kikuchi, R. (1951). A theory of cooperative phenomena. Phys.Review, 115:988-1003.
[12]
Kullback, S. (1968). Probability densities with given marginals. Annals of Mathematical Statistics, 39(4):1236-1243.
[13]
Larrañaga, P. and Lozano, J. (2002). Estimation of Distribution Algorithms: A New Tool for Evolutionary Optimization. Kluwer Academic Press, Boston.
[14]
Lauritzen, S. L. (1996). Graphical Models. Clarendon Press, Oxford.
[15]
Lin, S. and Kernighan, B. (1973). An effective heuristic algorithm for the traveling-salesman. Operations Research, 21:498-516.
[16]
Martelli, A. and Montari, U. (1972). Nonserial dynamic programming: On the optimal strategy of variable elimination for the rectangular lattice. Jour. Math. Analyis and Applications, 40:226-242.
[17]
Mühlenbein, H. and Mahnig, T. (1999). FDA - a scalable evolutionary algorithm for the optimization of additively decomposed functions. Evolutionary Computation, 7(4):353-376.
[18]
Mühlenbein, H. and Mahnig, T. (2000). Evolutionary algorithms: From recombination to search distributions. In Kallel, L., Naudts, B., and Rogers, A., editors, Theoretical Aspects of Evolutionary Computing, Natural Computing, pages 137-176. Springer Verlag, Berlin.
[19]
Mühlenbein, H. and Mahnig, T. (2001). Evolutionary computation and beyond. In Uesaka, Y., Kanerva, P., and Asoh, H., editors, Foundations of Real-World Intelligence, pages 123-188. CSLI Publications, Stanford, California.
[20]
Mühlenbein, H. and Mahnig, T. (2002a). Evolutionary optimization and the estimation of search distributions with applications to graph bipartitioning. Journal of Approximate Reasoning, 31(3):157-192.
[21]
Mühlenbein, H. and Mahnig, T. (2002b). Mathematical analysis of evolutionary algorithms. In Ribeiro, C. C. and Hansen, R, editors, Essays and Surveys in Metaheuristics , Operations Research/Computer Science Interface Series, pages 525-556. Kluwer Academic Publisher, Norwell.
[22]
Mühlenbein, H. and Mahnig, T. (2003). Evolutionary algorithms and the Boltzmann distribution. In Jong, K. D., Poli, R., and Rowe, J. C., editors, Foundations of Genetic Algorithms 7, pages 525-556. Morgan Kaufmann Publishers, San Francisco.
[23]
Mühlenbein, H., Mahnig, T., and Ochoa, A. (1999). Schemata, distributions and graphical models in evolutionary optimization. Journal of Heuristics, 5(2):213-247.
[24]
Mühlenbein, H. and Paaß, G. (1996). From recombination of genes to the estimation of distributions I. binary parameters. In Voigt, H.-M., Ebeling, W., Rechenberg, I., and Schwefel, H.-P, editors, Lecture Notes in Computer Science 1141: Parallel Problem Solving from Nature - PPSN IV), pages 178-187, Springer Verlag, Berlin.
[25]
Ochoa, A., Höns, R., Soto, M., and Mühlenbein, H. (2003). A maximum entropy approach to sampling in EDA - the singly connected case. In Proceedings of the 8th Iberoamerican Congress on Pattern Recognition (CIARP 2003), volume 2905 of Lecture Notes in Computer Science, pages 683-690. Springer Verlag, Berlin.
[26]
Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 7:461-464.
[27]
Xiang, Y., Wong, S., and Cercone, N. (1997). A 'microscopic' study of minimum entropy search in learning decomposable markov networks. Machine Learning, 26:65-92.
[28]
Yedidia, J. S., Freeman, W. T., and Weiss, Y. (2001). Understanding belief propagation and its generalizations. Technical Report 2001-22, Mitsubishi Electric Research Laboratories.
[29]
Yedidia, J. S., Freeman, W. T., and Weiss, Y. (2004). Constructing free energy approximations and generalized belief propagation algorithms. Technical Report 2004-40, Mitsubishi Electric Research Laboratories.

Cited By

View all
  • (2021)Concept Driven Search and Visualization System for Exploring Scientific RepositoriesProceedings of the 3rd ACM India Joint International Conference on Data Science & Management of Data (8th ACM IKDD CODS & 26th COMAD)10.1145/3430984.3430991(395-399)Online publication date: 2-Jan-2021
  • (2017)Bi-level programming model for multi-modal regional bus timetable and vehicle dispatch with stochastic travel timeCluster Computing10.1007/s10586-016-0719-x20:1(401-411)Online publication date: 1-Mar-2017
  • (2012)Higher-order linkage learning in the ECGAProceedings of the 14th annual conference on Genetic and evolutionary computation10.1145/2330163.2330202(265-272)Online publication date: 7-Jul-2012
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Evolutionary Computation
Evolutionary Computation  Volume 13, Issue 1
January 2005
144 pages
ISSN:1063-6560
EISSN:1530-9304
Issue’s Table of Contents

Publisher

MIT Press

Cambridge, MA, United States

Publication History

Published: 01 January 2005
Published in EVOL Volume 13, Issue 1

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 22 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2021)Concept Driven Search and Visualization System for Exploring Scientific RepositoriesProceedings of the 3rd ACM India Joint International Conference on Data Science & Management of Data (8th ACM IKDD CODS & 26th COMAD)10.1145/3430984.3430991(395-399)Online publication date: 2-Jan-2021
  • (2017)Bi-level programming model for multi-modal regional bus timetable and vehicle dispatch with stochastic travel timeCluster Computing10.1007/s10586-016-0719-x20:1(401-411)Online publication date: 1-Mar-2017
  • (2012)Higher-order linkage learning in the ECGAProceedings of the 14th annual conference on Genetic and evolutionary computation10.1145/2330163.2330202(265-272)Online publication date: 7-Jul-2012
  • (2011)Comparison-based complexity of multiobjective optimizationProceedings of the 13th annual conference on Genetic and evolutionary computation10.1145/2001576.2001685(801-806)Online publication date: 12-Jul-2011
  • (2011)Hierarchical allelic pairwise independent functionsProceedings of the 13th annual conference on Genetic and evolutionary computation10.1145/2001576.2001663(633-640)Online publication date: 12-Jul-2011
  • (2010)Enabling the extended compact genetic algorithm for real-parameter optimization by using adaptive discretizationEvolutionary Computation10.1162/evco.2010.18.2.1820218:2(199-228)Online publication date: 1-Jun-2010
  • (2010)Sensibility of linkage information and effectiveness of estimated distributionsEvolutionary Computation10.1162/EVCO_a_0001018:4(547-579)Online publication date: 1-Dec-2010
  • (2010)Multivariate multi-model approach for globally multimodal problemsProceedings of the 12th annual conference on Genetic and evolutionary computation10.1145/1830483.1830544(311-318)Online publication date: 7-Jul-2010
  • (2010)A hybrid evolutionary algorithm based on alopex and estimation of distribution algorithm and its application for optimizationProceedings of the First international conference on Advances in Swarm Intelligence - Volume Part I10.1007/978-3-642-13495-1_67(549-557)Online publication date: 12-Jun-2010
  • (2009)On multivariate genetic systemsProceedings of the 18th international conference on Fuzzy Systems10.5555/1717561.1717568(35-40)Online publication date: 20-Aug-2009
  • Show More Cited By

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media