Abstract
Genetic programming (GP) approaches are among the state of the art for symbolic regression, the task of constructing symbolic expressions that fit data well. To find highly accurate symbolic expressions, both the expression structure and any contained real-valued constants are important. GP-GOMEA, a modern model-based evolutionary algorithm, is one of the leading algorithms for finding accurate yet compact expressions. However, GP-GOMEA does not perform dedicated constant optimization; instead, it uses ephemeral random constants. Hence, the accuracy of GP-GOMEA may well still be improved by incorporating a constant optimization mechanism. Existing research into mixed discrete-continuous optimization with EAs has shown that a simultaneous, well-integrated approach to optimizing both the discrete and the continuous parts leads to the best results on a variety of problems, especially when there are interactions between these parts. In this paper, we therefore propose a novel approach in which the constants in expressions are optimized at the same time as the expression structure, by merging the real-valued variant of GOMEA with GP-GOMEA. The proposed approach is compared to other ways of handling constants in GP-GOMEA, and in the context of other commonly used techniques such as linear scaling, restarts, and constant tuning after GP optimization. Our results indicate that the novel approach generally performs best and confirm the importance of simultaneous constant optimization during evolution.
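For readers unfamiliar with gene-pool optimal mixing (GOM), the sketch below illustrates the core idea behind the proposed combination: expression-node symbols (discrete genes) and their associated constants (real-valued genes) are changed jointly in a single mixing step, and a change is kept only if the training error does not worsen. This is a minimal conceptual sketch under simplifying assumptions, not the paper's actual algorithm; all names (`gom_pass`, `evaluate`, `linkage_sets`, `sigma`) are illustrative, and the Gaussian constant proposal is a simplified stand-in for RV-GOMEA's model-based sampling.

```python
import numpy as np

def gom_pass(pop_discrete, pop_constants, evaluate, linkage_sets, rng, sigma=0.1):
    """One gene-pool optimal mixing pass over a population of fixed-length
    genotypes that pair a discrete gene (node symbol) with a real-valued
    gene (constant) at every position.

    pop_discrete  : list of arrays of node symbols (one array per solution)
    pop_constants : list of arrays of constants (same shape as pop_discrete)
    evaluate      : callable(symbols, constants) -> training error (lower is better)
    linkage_sets  : list of index lists; genes in one set are changed together
    """
    n = len(pop_discrete)
    errors = np.array([evaluate(d, c) for d, c in zip(pop_discrete, pop_constants)])
    for i in range(n):
        for fos in linkage_sets:
            donor = rng.integers(n)                    # random donor solution
            trial_d = pop_discrete[i].copy()
            trial_c = pop_constants[i].copy()
            for j in fos:
                # discrete part: copy the donor's node symbol
                trial_d[j] = pop_discrete[donor][j]
                # continuous part: resample the matching constant around the
                # population mean (simplified stand-in for RV-GOMEA sampling)
                mu = np.mean([c[j] for c in pop_constants])
                trial_c[j] = rng.normal(mu, sigma)
            err = evaluate(trial_d, trial_c)
            if err <= errors[i]:                       # accept only non-worsening changes
                pop_discrete[i], pop_constants[i], errors[i] = trial_d, trial_c, err
    return pop_discrete, pop_constants, errors
```

The point of changing structure and constants in the same accepted-or-rejected step, rather than in separate phases, is to exploit interactions between them: a new subexpression is often only beneficial once its constants are adjusted accordingly.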
Acknowledgements
This research was funded by the European Commission within the HORIZON Programme (TRUST AI Project, Contract No.: 952060).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Koch, J., Alderliesten, T., Bosman, P.A.N. (2024). Simultaneous Model-Based Evolution of Constants and Expression Structure in GP-GOMEA for Symbolic Regression. In: Affenzeller, M., et al. Parallel Problem Solving from Nature – PPSN XVIII. PPSN 2024. Lecture Notes in Computer Science, vol 15148. Springer, Cham. https://doi.org/10.1007/978-3-031-70055-2_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-70054-5
Online ISBN: 978-3-031-70055-2