Nothing Special   »   [go: up one dir, main page]

Skip to main content

Simultaneous Model-Based Evolution of Constants and Expression Structure in GP-GOMEA for Symbolic Regression

  • Conference paper
  • First Online:
Parallel Problem Solving from Nature – PPSN XVIII (PPSN 2024)

Abstract

Genetic programming (GP) approaches are among the state-of-the-art for symbolic regression, the task of constructing symbolic expressions that fit well with data. To find highly accurate symbolic expressions, both the expression structure and any contained real-valued constants, are important. GP-GOMEA, a modern model-based evolutionary algorithm, is one of the leading algorithms for finding accurate, yet compact expressions. Yet, GP-GOMEA does not perform dedicated constant optimization, but rather uses ephemeral random constants. Hence, the accuracy of GP-GOMEA may well still be improved upon by the incorporation of a constant optimization mechanism. Existing research into mixed discrete-continuous optimization with EAs has shown that a simultaneous and well-integrated approach to optimizing both discrete and continuous parts, leads to the best results on a variety of problems, especially when there are interactions between these parts. In this paper, we therefore propose a novel approach where constants in expressions are optimized at the same time as the expression structure by merging the real-valued variant of GOMEA with GP-GOMEA. The proposed approach is compared to other forms of handling constants in GP-GOMEA, and in the context of other commonly used techniques such as linear scaling, restarts, and constant tuning after GP optimization. Our results indicate that our novel approach generally performs best and confirms the importance of simultaneous constant optimization during evolution.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 64.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 79.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

References

  1. Alonso, C.L., Montaña, J.L., Borges, C.E.: Evolution strategies for constants optimization in genetic programming. In: 2009 21st IEEE International Conference on Tools with Artificial Intelligence, pp. 703–707 (2009). https://doi.org/10.1109/ICTAI.2009.35

  2. Athanasios Tsanas, A.X.: Energy efficiency (2012). https://doi.org/10.24432/C51307

  3. Benavoli, A., Corani, G., Demšar, J., Zaffalon, M.: Time for a change: a tutorial for comparing multiple classifiers through bayesian analysis. J. Mach. Learn. Res. 18(1), 2653–2688 (2017)

    Google Scholar 

  4. Bouter, A., Alderliesten, T., Witteveen, C., Bosman, P.A.N.: Exploiting linkage information in real-valued optimization with the real-valued gene-pool optimal mixing evolutionary algorithm. In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2017), pp. 705–712. Association for Computing Machinery, New York (2017). https://doi.org/10.1145/3071178.3071272

  5. Cerny, B.M., Nelson, P.C., Zhou, C.: Using differential evolution for symbolic regression and numerical constant creation. In: Proceedings of the 10th Annual Conference on Genetic and Evolutionary Computation (GECCO08). ACM (2008). https://doi.org/10.1145/1389095.1389331

  6. Dushatskiy, A., Virgolin, M., Bouter, A., Thierens, D., Bosman, P.A.N.: Parameterless gene-pool optimal mixing evolutionary algorithms (2021). https://doi.org/10.48550/ARXIV.2109.05259

  7. Fisher, R.A.: On the interpretation of \(\cal{X}^2\) from contingency tables, and the calculation of p. J. Roy. Statist. Soc. 85(1), 87 (1922). https://doi.org/10.2307/2340521

  8. Gronau, I., Moran, S.: Optimal implementations of upgma and other common clustering algorithms. Inf. Process. Lett. 104(6), 205–210 (2007). https://doi.org/10.1016/j.ipl.2007.07.002

  9. Harrison, J., Virgolin, M., Alderliesten, T., Bosman, P.: Mini-batching, gradient-clipping, first- versus second-order: what works in gradient-based coefficient optimisation for symbolic regression? In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2023), pp. 1127–1136. Association for Computing Machinery, New York (2023). https://doi.org/10.1145/3583131.3590368

  10. Howard, L., D’Angelo, D.: The ga-p: a genetic algorithm and genetic programming hybrid. IEEE Expert 10(3), 11–15 (1995). https://doi.org/10.1109/64.393137

  11. Yeh, I.-C.: Concrete compressive strength (1998). https://doi.org/10.24432/C5PK67

  12. Gerritsma, J.R.O.: Yacht hydrodynamics (1981). https://doi.org/10.24432/C5XG7R

  13. Keijzer, M.: Scaled symbolic regression. Genet. Program. Evol. Mach. 5(3), 259–269 (2004). https://doi.org/10.1023/b:genp.0000030195.77571.f9

  14. Kommenda, M., Burlacu, B., Kronberger, G., Affenzeller, M.: Parameter identification for symbolic regression using nonlinear least squares. Genetic Programming and Evolvable Machines 21(3), 471–501 (2019). https://doi.org/10.1007/s10710-019-09371-3

  15. Koza, J.: Genetic programming as a means for programming computers by natural selection. Statist. Comput. 4(2) (1994). https://doi.org/10.1007/bf00175355

  16. Kronberger, G.: Local optimization often is ill-conditioned in genetic programming for symbolic regression. In: 2022 24th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC), pp. 304–310 (2022). https://doi.org/10.1109/SYNASC57785.2022.00055

  17. La Cava, W., et al.: Contemporary symbolic regression methods and their relative performance. In: Vanschoren, J., Yeung, S. (eds.) Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks, vol. 1. Curran (2021), https://datasets-benchmarks-proceedings.neurips.cc/paper_files/paper/2021/file/c0c7c76d30bd3dcaefc96f40275bdc0a-Paper-round1.pdf

  18. Lundberg, S.M., Lee, S.I.: A unified approach to interpreting model predictions. In: Guyon, I., et al. (eds.) Advances in Neural Information Processing Systems, vol. 30, pp. 4765–4774. Curran Associates, Inc. (2017). http://papers.nips.cc/paper/7062-a-unified-approach-to-interpreting-model-predictions.pdf

  19. Meurer, A., et al.: Sympy: symbolic computing in python. PeerJ Comput. Sci. 3, e103 (2017). https://doi.org/10.7717/peerj-cs.103

  20. Mukherjee, S., Eppstein, M.J.: Differential evolution of constants in genetic programming improves efficacy and bloat. In: Proceedings of the 14th Annual Conference Companion on Genetic and Evolutionary Computation (GECCO 2012), pp. 625–626. Association for Computing Machinery, New York (2012). https://doi.org/10.1145/2330784.2330891

  21. Nicolau, M., Agapitos, A.: Choosing function sets with better generalisation performance for symbolic regression models. Genet. Program. Evol. Mach. 22(1), 73–100 (2020). https://doi.org/10.1007/s10710-020-09391-4

  22. Paszke, A., et al.: PyTorch: an imperative style, high-performance deep learning library. Curran Associates Inc., Red Hook (2019)

    Google Scholar 

  23. Rockett, P.: Constant optimization and feature standardization in multiobjective genetic programming. Genet. Program. Evol. Mach. 23(1), 37–69 (2021). https://doi.org/10.1007/s10710-021-09410-y

  24. Rudin, C.: Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat. Mach. Intell. 1(5), 206–215 (2019). https://doi.org/10.1038/s42256-019-0048-x

  25. Sadowski, K.L., Thierens, D., Bosman, P.A.: Gambit: a parameterless model-based evolutionary algorithm for mixed-integer problems. Evolution. Comput. 26(1), 117–143 (2018). https://doi.org/10.1162/evco_a_00206

  26. Sharman, K.: Evolving signal processing algorithms by genetic programming. In: 1st International Conference on Genetic Algorithms in Engineering Systems: Innovations and Applications (GALESIA). IEE (1995). https://doi.org/10.1049/cp:19951094

  27. Sijben, E.M.C., Alderliesten, T., Bosman, P.A.N.: Multi-modal multi-objective model-based genetic programming to find multiple diverse high-quality models. In: Proceedings of the Genetic and Evolutionary Computation Conference (GECCO 2022), pp. 440–448. Association for Computing Machinery, New York (2022). https://doi.org/10.1145/3512290.3528850

  28. Thomas Brooks, D.P.: Airfoil self-noise (1989). https://doi.org/10.24432/C5VW2C

  29. Virgolin, M., Alderliesten, T., Witteveen, C., Bosman, P.A.N.: Improving model-based genetic programming for symbolic regression of small expressions. Evol. Comput. 29(2), 211–237 (2021)

    Article  Google Scholar 

  30. Virgolin, M., Bosman, P.A.N.: Coefficient mutation in the gene-pool optimal mixing evolutionary algorithm for symbolic regression. In: Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO 2022), pp. 2289–2297. Association for Computing Machinery, New York (2022). https://doi.org/10.1145/3520304.3534036

  31. Virgolin, M., Medvet, E., Alderliesten, T., Bosman, P.A.N.: Less is more: a call to focus on simpler models in genetic programming for interpretable machine learning (2022). https://doi.org/10.48550/ARXIV.2204.02046

Download references

Acknowledgements

This research was funded by the European Commission within the HORIZON Programme (TRUST AI Project, Contract No.: 952060).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Johannes Koch .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Koch, J., Alderliesten, T., Bosman, P.A.N. (2024). Simultaneous Model-Based Evolution of Constants and Expression Structure in GP-GOMEA for Symbolic Regression. In: Affenzeller, M., et al. Parallel Problem Solving from Nature – PPSN XVIII. PPSN 2024. Lecture Notes in Computer Science, vol 15148. Springer, Cham. https://doi.org/10.1007/978-3-031-70055-2_15

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-70055-2_15

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-70054-5

  • Online ISBN: 978-3-031-70055-2

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics