Abstract
Genetic Programming (GP) can make an important contribution to explainable artificial intelligence because it can create symbolic expressions as machine learning models. Nevertheless, to be explainable, the expressions must not become too large. This may, however, limit their potential to be accurate. The re-use of subexpressions has the unique potential to mitigate this issue. The Genetic Programming Gene-pool Optimal Mixing Evolutionary Algorithm (GP-GOMEA) is a recent model-based GP approach that has been found particularly capable of evolving small expressions. However, its tree representation offers no explicit mechanisms to re-use subexpressions. By contrast, the graph representation in Cartesian GP (CGP) is natively capable of re-use. For this reason, we introduce CGP-GOMEA, a variant of GP-GOMEA that uses graphs instead of trees. We experimentally compare various configurations of CGP-GOMEA with GP-GOMEA and find that CGP-GOMEA performs on par with GP-GOMEA on three common datasets. Moreover, CGP-GOMEA is found to produce models that re-use subexpressions more often than GP-GOMEA uses duplicate subexpressions. This indicates that CGP-GOMEA has unique added potential, allowing to find even smaller expressions than GP-GOMEA with similar accuracy.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
Code and data can be found at https://github.com/matigekunstintelligentie/CGP-GOMEA.
References
Asuncion, A., Newman, D.: UCI machine learning repository (2007)
Bosman, P.A.N., Thierens, D.: On measures to build linkage trees in LTGA. In: Coello, C.A.C., Cutello, V., Deb, K., Forrest, S., Nicosia, G., Pavone, M. (eds.) PPSN 2012. LNCS, vol. 7491, pp. 276–285. Springer, Heidelberg (2012). https://doi.org/10.1007/978-3-642-32937-1_28
Demšar, J.: Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 7, 1–30 (2006)
Dick, G., Owen, C.A., Whigham, P.A.: Feature standardisation and coefficient optimisation for effective symbolic regression. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 306–314 (2020)
Gronau, I., Moran, S.: Optimal implementations of UPGMA and other common clustering algorithms. Inf. Process. Lett. 104(6), 205–210 (2007)
Keijzer, M.: Improving symbolic regression with interval arithmetic and linear scaling. In: Ryan, C., Soule, T., Keijzer, M., Tsang, E., Poli, R., Costa, E. (eds.) EuroGP 2003. LNCS, vol. 2610, pp. 70–82. Springer, Heidelberg (2003). https://doi.org/10.1007/3-540-36599-0_7
Koza, J.R.: Genetic Programming: On the Programming of Computers by Means of Natural Selection, vol. 1. MIT Press, Cambridge (1992)
Koza, J.R.: Genetic Programming II: Automatic Discovery of Reusable Programs, vol. 17. MIT Press, Cambridge (1994)
Lipton, Z.C.: The mythos of model interpretability: in machine learning, the concept of interpretability is both important and slippery. Queue 16(3), 31–57 (2018)
Miller, J.F., et al.: An empirical study of the efficiency of learning boolean functions using a cartesian genetic programming approach. In: Proceedings of the Genetic and Evolutionary Computation Conference, vol. 2, pp. 1135–1142 (1999)
Miller, J.F.: Cartesian genetic programming: its status and future. Genet. Program Evolvable Mach. 21, 1–40 (2019). https://doi.org/10.1007/s10710-019-09360-6
Poli, R., Banzhaf, W., Langdon, W.B., Miller, J.F., Nordin, P., Fogarty, T.C.: Genetic Programming. Springer (2004)
Pratt, J.W.: Remarks on zeros and ties in the Wilcoxon signed rank procedures. J. Am. Stat. Assoc. 54(287), 655–667 (1959)
Stanley, K.O., Miikkulainen, R.: Evolving neural networks through augmenting topologies. Evol. Comput. 10(2), 99–127 (2002)
Vilone, G., Longo, L.: Explainable artificial intelligence: a systematic review. arXiv preprint arXiv:2006.00093 (2020)
Virgolin, M., Alderliesten, T., Bel, A., Witteveen, C., Bosman, P.A.: Symbolic regression and feature construction with GP-GOMEA applied to radiotherapy dose reconstruction of childhood cancer survivors. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 1395–1402 (2018)
Virgolin, M., Alderliesten, T., Witteveen, C., Bosman, P.A.: Scalable genetic programming by gene-pool optimal mixing and input-space entropy-based building-block learning. In: Proceedings of the Genetic and Evolutionary Computation Conference, pp. 1041–1048 (2017)
Virgolin, M., Alderliesten, T., Witteveen, C., Bosman, P.A.: Improving model-based genetic programming for symbolic regression of small expressions. Evol. Comput. 29(2), 211–237 (2021)
Virgolin, M., De Lorenzo, A., Medvet, E., Randone, F.: Learning a formula of interpretability to learn interpretable formulas. In: Bäck, T., et al. (eds.) PPSN 2020. LNCS, vol. 12270, pp. 79–93. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58115-2_6
Woodward, J.R.: Complexity and cartesian genetic programming. In: Collet, P., Tomassini, M., Ebner, M., Gustafson, S., Ekárt, A. (eds.) EuroGP 2006. LNCS, vol. 3905, pp. 260–269. Springer, Heidelberg (2006). https://doi.org/10.1007/11729976_23
Acknowledgement
This research is part of the research programme Open Competition Domain Science-KLEIN with project number OCENW.KLEIN.111, which is financed by the Dutch Research Council (NWO). We further thank the Maurits en Anna de Kock Foundation for financing a high-performance computing system. We also thank Marco Virgolin aiding in implementing GP-GOMEA, and Dazhuang Liu and Evi Sijben for their fruitful discussions and reviews.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Harrison, J., Alderliesten, T., Bosman, P.A.N. (2022). Gene-pool Optimal Mixing in Cartesian Genetic Programming. In: Rudolph, G., Kononova, A.V., Aguirre, H., Kerschke, P., Ochoa, G., Tušar, T. (eds) Parallel Problem Solving from Nature – PPSN XVII. PPSN 2022. Lecture Notes in Computer Science, vol 13399. Springer, Cham. https://doi.org/10.1007/978-3-031-14721-0_2
Download citation
DOI: https://doi.org/10.1007/978-3-031-14721-0_2
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-14720-3
Online ISBN: 978-3-031-14721-0
eBook Packages: Computer ScienceComputer Science (R0)