Nothing Special   »   [go: up one dir, main page]

skip to main content
article

Measures of Diversity for Populations and Distances Between Individuals with Highly Reorganizable Genomes

Published: 01 December 2004 Publication History

Abstract

In this paper we address the problem of defining a measure of diversity for a population of individuals whose genome can be subjected to major reorganizations during the evolutionary process. To this end, we introduce a measure of diversity for populations of strings of variable length defined on a finite alphabet, and from this measure we derive a semi-metric distance between pairs of strings. The definitions are based on counting the number of substrings of the strings, considered first separately and then collectively. This approach is related to the concept of linguistic complexity, whose definition we generalize from single strings to populations. Using the substring count approach we also define a new kind of Tanimoto distance between strings. We show how to extend the approach to representations that are not based on strings and, in particular, to the tree-based representations used in the field of genetic programming. We describe how suffix trees can allow these measures and distances to be implemented with a computational cost that is linear in both space and time relative to the length of the strings and the size of the population. The definitions were devised to assess the diversity of populations having genomes of variable length and variable structure during evolutionary computation runs, but applications in quantitative genomics, proteomics, and pattern recognition can be also envisaged.

References

[1]
Crochemore, M., Hancart, C., and Lecroq, T. (2001). Algorithmique du texte. Vuibert, Paris.]]
[2]
Crochemore, M. and Rytter, W. (2002). Jewels of Stringology. World Scientific, Singapore.]]
[3]
de Jong, E., Watson, R., and Pollack, J. (2001). Reducing bloat and promoting diversity using multi-objective methods. In et al., L. S., editor, GECCO 2001, pages 11-18, San Francisco, CA. Morgan Kaufmann.]]
[4]
Graur, D. and Li, W.-H. (2000). Fundamentals of Molecular Evolution. Sinauer Associates, Sunderland, MA, 2nd edition.]]
[5]
Gusfield, G. (1997). Algorithms on Strings, Trees, and Sequences. Cambridge University Press, Sunderland, MA.]]
[6]
Haykin, S. (1999). Neural Networks: A Comprehensive Foundation. Prentice Hall, Upper Saddle River, NY, 2nd edition.]]
[7]
Keijzer, M. (1996). Efficiently representing populations in genetic programming. In Angeline, P. et al., editor, Advances in Genetic Programming, volume 2, pages 259- 278, CAmbridge, MA. MIT Press.]]
[8]
Keller, R. and Banzhaf, W. (1994). Explicit maintenance of genetic diversity on genospaces. Unpublished manuscript. Available online at Citeseer.]]
[9]
Langdon, W. and Poli, R. (2002). Foundations of Genetic Programming. Springer, Berlin.]]
[10]
Leung, Y., Gao, Y., and Xu, Z. (1997). Degree of population diversity - a perspective on premature convergence in genetic algorithms and its markov chain analysis. IEEE Trans. Neural Networks, 8(5):1165-1176.]]
[11]
Levandowsky, M. and Winter, D. (1971). Distance between sets. Nature, 234(5):34-35.]]
[12]
Lipkus, A. (1999). A proof of the triangle inequality for the tanimoto distance. J. of Mathematical Chemistry, 26:263-265.]]
[13]
Mattiussi, C. and Floreano, D. (2004). Evolution of analog networks using local string alignment on highly reorganizable genomes. In et al., R. Z., editor, Proceedings of the 2004 NASA/DoD Conference on Evolvable Hardware, 24-26 June 2004, Seattle, pages 30-37, Los Alamitos, CA. IEEE Computer Society.]]
[14]
McCreight, E. (1976). A space-economical suffix tree construction algorithm. J. ACM, 23(1):262-272.]]
[15]
Monro, G. (1987). The concept of multiset. Zeitschr. f. math. Logik ung Grundlagen d. Math., 33:171-178.]]
[16]
Morrison, W. and de Jong, K. (2002). Measurement of population diversity. In et al., P. C., editor, EA 2001, volume 2310 of LNCS, pages 31-41, Berlin. Springer.]]
[17]
O'Reilly, U.-M. (1997). Using a distance metric on genetic programs to understand genetic operators. In IEEE International Conference on Systems, Man, and Cybernetics, volume 5, pages 4092-4097.]]
[18]
Sankoff, D. and Kruskal, J. (1983). Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison. Addison-Wesley, Reading, MA.]]
[19]
Shapiro, J. (2002). A 21st century view of evolution. J. Biol. Phys., 28:1-20.]]
[20]
Theodoridis, S. and Koutroumbas, K. (2003). Pattern Recognition. Academic Press, Sand Diego, CA, 2nd edition.]]
[21]
Tomassini, M., Vanneschi, L., Fernández, F., and Galeano, G. (2004). A study of diversity in multipopulation genetic programming. In et al., P. L., editor, EA 2003, Artificial Evolution: 6th International Conference, Marseilles, France, October 27-30, 2003, volume 2936 of LNCS, pages 243-255, Berlin. Springer.]]
[22]
Trifonov, E. (1990). Making sense of the human genome. In Sarma, R. and Sarma, M., editors, Structure and Methods, volume 1, pages 69-77. Adenine Press, New York.]]
[23]
Troyanskaya, O., Arbell, O.and Koren, Y., Landau, G., and Bolshoy, A. (2002). Sequence complexity of prokariotic genomic sequences: A fast algorithm for calculating linguistic complexity. Bioinformatics, 18(5):679-688.]]
[24]
Ukkonen, E. (1995). On-line construction of suffix trees. Algorithmica, 14:249-260.]]
[25]
Wineberg, M. and Oppacher, F. (2000). Enhancing the ga's ability to cope with dynamic environments. In et al., D. W., editor, GECCO 2000, pages 3-10, San Francisco, CA. Morgan Kaufmann.]]
[26]
Wineberg, M. and Oppacher, F. (2003a). Distance between populations. In et al., E. C.-P., editor, GECCO 2003, volume 2724 of LNCS, pages 1481-1492, Berlin. Springer.]]
[27]
Wineberg, M. and Oppacher, F. (2003b). The underlying similarity of diversity measures used in evolutionary computation. In et al., E. C.-P., editor, GECCO 2003, volume 2724 of LNCS, pages 1493-1504, Berlin. Springer.]]

Cited By

View all
  1. Measures of Diversity for Populations and Distances Between Individuals with Highly Reorganizable Genomes

      Recommendations

      Comments

      Please enable JavaScript to view thecomments powered by Disqus.

      Information & Contributors

      Information

      Published In

      cover image Evolutionary Computation
      Evolutionary Computation  Volume 12, Issue 4
      December 2004
      141 pages
      ISSN:1063-6560
      EISSN:1530-9304
      Issue’s Table of Contents

      Publisher

      MIT Press

      Cambridge, MA, United States

      Publication History

      Published: 01 December 2004
      Published in EVOL Volume 12, Issue 4

      Qualifiers

      • Article

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months)1
      • Downloads (Last 6 weeks)0
      Reflects downloads up to 21 Nov 2024

      Other Metrics

      Citations

      Cited By

      View all
      • (2021)Dynamic feature weighting for multi-label classification problemsProgress in Artificial Intelligence10.1007/s13748-021-00237-310:3(283-295)Online publication date: 1-Sep-2021
      • (2018)Balanced-evolution genetic algorithm for combinatorial optimization problemsNatural Computing: an international journal10.1007/s11047-018-9670-517:3(611-639)Online publication date: 1-Sep-2018
      • (2018)Optimal placement of an actuator in a two degrees of freedom system to generate desired vibrationsMicrosystem Technologies10.1007/s00542-018-3711-y24:8(3389-3398)Online publication date: 1-Aug-2018
      • (2017)How to Pack Trapezoids: Exact and Evolutionary AlgorithmsIEEE Transactions on Evolutionary Computation10.1109/TEVC.2016.260900021:3(463-476)Online publication date: 26-May-2017
      • (2015)Efficient genetic algorithms for optimising the location of discrete nodes to cover multiple demand pointsInternational Journal of Metaheuristics10.1504/IJMHEUR.2015.0744294:3/4(244-267)Online publication date: 1-Jan-2015
      • (2014)Generation of diversity in a reaction-diffusion-based controllerArtificial Life10.1162/ARTL_a_0013420:3(319-342)Online publication date: 1-Jul-2014
      • (2013)Exploration and exploitation in evolutionary algorithmsACM Computing Surveys10.1145/2480741.248075245:3(1-33)Online publication date: 3-Jul-2013
      • (2013)Fitness approximation for bot evolution in genetic programmingSoft Computing - A Fusion of Foundations, Methodologies and Applications10.1007/s00500-012-0965-717:8(1479-1487)Online publication date: 1-Aug-2013
      • (2011)Ectropy of diversity measures for populations in Euclidean spaceInformation Sciences: an International Journal10.1016/j.ins.2010.12.004181:11(2316-2339)Online publication date: 1-Jun-2011
      • (2010)Using program data-state scarcity to guide automatic test data generationSoftware Quality Journal10.1007/s11219-009-9083-x18:1(109-144)Online publication date: 1-Mar-2010
      • Show More Cited By

      View Options

      Login options

      Full Access

      View options

      PDF

      View or Download as a PDF file.

      PDF

      eReader

      View online with eReader.

      eReader

      Media

      Figures

      Other

      Tables

      Share

      Share

      Share this Publication link

      Share on social media