Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3583133.3596327acmconferencesArticle/Chapter ViewAbstractPublication PagesgeccoConference Proceedingsconference-collections
research-article

Priors for symbolic regression

Published: 24 July 2023 Publication History

Abstract

When choosing between competing symbolic models for a data set, a human will naturally prefer the "simpler" expression or the one which more closely resembles equations previously seen in a similar context. This suggests a non-uniform prior on functions, which is, however, rarely considered within a symbolic regression (SR) framework. In this paper we develop methods to incorporate detailed prior information on both functions and their parameters into SR. Our prior on the structure of a function is based on a n-gram language model, which is sensitive to the arrangement of operators relative to one another in addition to the frequency of occurrence of each operator. We also develop a formalism based on the Fractional Bayes Factor to treat numerical parameter priors in such a way that models may be fairly compared though the Bayesian evidence, and explicitly compare Bayesian, Minimum Description Length and heuristic methods for model selection. We demonstrate the performance of our priors relative to literature standards on benchmarks and a real-world dataset from the field of cosmology.

Supplementary Material

ZIP File (p2402-bartlett-suppl.zip)
Supplemental material.

References

[1]
Deaglan J. Bartlett, Harry Desmond, and Pedro G. Ferreira. 2022. Exhaustive Symbolic Regression. arXiv e-prints, Article arXiv:2211.11461 (Nov. 2022), arXiv:2211.11461 pages. arXiv:2211.11461 [astro-ph.CO]
[2]
Sören Becker, Michal Klein, Alexander Neitz, Giambattista Parascandolo, and Niki Kilbertus. 2022. Discovering ordinary differential equations that govern time-series. arXiv e-prints, Article arXiv:2211.02830 (Nov. 2022), arXiv:2211.02830 pages. arXiv:2211.02830 [cs.LG]
[3]
James O. Berger and Luis R. Pericchi. 1996. The Intrinsic Bayes Factor for Model Selection and Prediction. J. Amer. Statist. Assoc. 91, 433 (1996), 109--122. arXiv:https://www.tandfonline.com/doi/pdf/10.1080/01621459.1996.10476668
[4]
Jose M. Bernardo. 1979. Reference Posterior Distributions for Bayesian Inference. Journal of the Royal Statistical Society: Series B (Methodological) 41, 2 (1979), 113--128. arXiv:https://rss.onlinelibrary.wiley.com/doi/pdf/10.1111/j.2517-6161.1979.tb01066.x
[5]
Luca Biggio, Tommaso Bendinelli, Alexander Neitz, Aurelien Lucchi, and Giambattista Parascandolo. 2021. Neural Symbolic Regression that scales. In Proceedings of the 38th International Conference on Machine Learning (Proceedings of Machine Learning Research, Vol. 139), Marina Meila and Tong Zhang (Eds.). PMLR, 936--945. https://proceedings.mlr.press/v139/biggio21a.html
[6]
G. F. Bomarito, P. E. Leser, N. C. M. Strauss, K. M. Garbrecht, and J. D. Hochhalter. 2022. Bayesian Model Selection for Reducing Bloat and Overfitting in Genetic Programming for Symbolic Regression. In Proceedings of the Genetic and Evolutionary Computation Conference Companion (Boston, Massachusetts) (GECCO '22). Association for Computing Machinery, New York, NY, USA, 526--529.
[7]
G. F. Bomarito, P. E. Leser, N. C. M. Strauss, K. M. Garbrecht, and J. D. Hochhalter. 2023. Automated learning of interpretable models with quantified uncertainty. Computer Methods in Applied Mechanics and Engineering 403 (Jan. 2023), 115732. arXiv:2205.01626 [cs.NE]
[8]
Jure Brence, Ljupčo Todorovski, and Sašo Džeroski. 2021. Probabilistic grammars for equation discovery. Knowledge-Based Systems 224 (2021), 107077.
[9]
Per A Brodtkorb and John D'Errico. 2015. numdifftools 0.9.11. https://github.com/pbrod/numdifftools.
[10]
Dillon Brout et al. 2022. The Pantheon+ Analysis: Cosmological Constraints. arXiv e-prints, Article arXiv:2202.04077 (Feb. 2022), arXiv:2202.04077 pages. arXiv:2202.04077 [astro-ph.CO]
[11]
Qi Chen, Bing Xue, and Mengjie Zhang. 2022. Rademacher Complexity for Enhancing the Generalization of Genetic Programming for Symbolic Regression. IEEE Transactions on Cybernetics 52, 4 (2022), 2382--2395.
[12]
J.H. Conway, N.J.A. Sloane, E. Bannai, R.E. Borcherds, J. Leech, S.P. Norton, A.M. Odlyzko, R.A. Parker, L. Queen, and B.B. Venkov. 2013. Sphere Packings, Lattices and Groups. Springer New York. https://books.google.fr/books?id=hoTjBwAAQBAJ
[13]
T. M. Cover and J. A. Thomas. 1991. Elements of Information Theory (2nd ed.). Wiley.
[14]
Miles Cranmer. 2020. PySR: Fast & Parallelized Symbolic Regression in Python/Julia.
[15]
Miles Cranmer, Alvaro Sanchez-Gonzalez, Peter Battaglia, Rui Xu, Kyle Cranmer, David Spergel, and Shirley Ho. 2020. Discovering Symbolic Models from Deep Learning with Inductive Biases. NeurIPS 2020 (2020). arXiv:2006.11287 [cs.LG]
[16]
Stéphane d'Ascoli, Pierre-Alexandre Kamienny, Guillaume Lample, and François Charton. 2022. Deep Symbolic Regression for Recurrent Sequences. arXiv e-prints, Article arXiv:2201.04600 (Jan. 2022), arXiv:2201.04600 pages. arXiv:2201.04600 [cs.LG]
[17]
E. David. 1989. Genetic Algorithms in Search, Optimization and Machine Learning. Addison-Wesley.
[18]
R.P. Feynman, R.B. Leighton, and M. Sands. 2015. The Feynman Lectures on Physics, Vol. I: The New Millennium Edition: Mainly Mechanics, Radiation, and Heat. Number v. 1. Basic Books. https://books.google.fr/books?id=d76DBQAAQBAJ
[19]
R.P. Feynman, R.B. Leighton, M.L. Sands, and M.A. Gottlieb. 2006. The Feynman Lectures on Physics. Number v. 2 in The Feynman Lectures on Physics. Pearson/Addison-Wesley. https://books.google.fr/books?id=AbruAAAAMAAJ
[20]
R.P. Feynman, R.B. Leighton, M.L. Sands, and M.A. Gottlieb. 2006. The Feynman Lectures on Physics: Quantum mechanics. Pearson/Addison-Wesley. https://books.google.fr/books?id=_6XvAAAAMAAJ
[21]
William A. Gale and Geoffrey Sampson. 1995. Good-turing frequency estimation without tears. Journal of Quantitative Linguistics 2, 3 (1995), 217--237. arXiv:https://doi.org/10.1080/09296179508590051
[22]
H. Goldstein, C.P. Poole, and J.L. Safko. 2002. Classical Mechanics. Addison Wesley. https://books.google.fr/books?id=tJCuQgAACAAJ
[23]
I. J. GOOD. 1953. THE POPULATION FREQUENCIES OF SPECIES AND THE ESTIMATION OF POPULATION PARAMETERS. Biometrika 40, 3--4 (12 1953), 237--264. arXiv:https://academic.oup.com/biomet/article-pdf/40/3-4/237/492571/40-3-4-237.pdf
[24]
P. Grunwald. 2007. The Minimum Description Length Principle. MIT Press.
[25]
Peter Grünwald and Teemu Roos. 2019. Minimum Description Length Revisited. arXiv e-prints, Article arXiv:1908.08484 (Aug. 2019), arXiv:1908.08484 pages. arXiv:1908.08484 [stat.ME]
[26]
Roger Guimerà, Ignasi Reichardt, Antoni Aguilar-Mogas, Francesco A. Massucci, Manuel Miranda, Jordi Pallarès, and Marta Sales-Pardo. 2020. A Bayesian machine scientist to aid in the solution of challenging scientific problems. Science Advances 6, 5 (2020), eaav6971. arXiv:https://www.science.org/doi/pdf/10.1126/sciadv.aav6971
[27]
R. Haupt and S. Haupt. 2004. Practical genetic algorithms (2nd ed.). Wyley.
[28]
Hitoshi Iba, Hugo De Garis, and Taisuke Sato. 1994. Genetic programming using a minimum description length principle. Advances in genetic programming 1 (1994), 265--284.
[29]
John David Jackson. 1999. Classical electrodynamics (3rd ed. ed.). Wiley, New York, NY. http://cdsweb.cern.ch/record/490457
[30]
Harold Jeffreys. 1946. An Invariant Form for the Prior Probability in Estimation Problems. Proceedings of the Royal Society of London. Series A, Mathematical and Physical Sciences 186, 1007 (1946), 453--461. http://www.jstor.org/stable/97883
[31]
Pierre-Alexandre Kamienny, Stéphane d'Ascoli, Guillaume Lample, and François Charton. 2022. End-to-end symbolic regression with transformers. arXiv e-prints, Article arXiv:2204.10532 (April 2022), arXiv:2204.10532 pages. arXiv:2204.10532 [cs.LG]
[32]
Lukas Kammerer, Gabriel Kronberger, Bogdan Burlacu, Stephan M. Winkler, Michael Kommenda, and Michael Affenzeller. 2021. Symbolic Regression by Exhaustive Search: Reducing the Search Space Using Syntactical Constraints and Efficient Semantic Structure Deduplication. arXiv e-prints, Article arXiv:2109.13895 (Sept. 2021), arXiv:2109.13895 pages. arXiv:2109.13895 [cs.LG]
[33]
Slava M. Katz. 1987. Estimation of probabilities from sparse data for the language model component of a speech recognizer. IEEE Trans. Acoust. Speech Signal Process. 35 (1987), 400--401.
[34]
Michael Kommenda, Andreas Beham, Michael Affenzeller, and Gabriel Kronberger. 2021. Complexity Measures for Multi-objective Symbolic Regression. arXiv e-prints, Article arXiv:2109.00238 (Sept. 2021), arXiv:2109.00238 pages. arXiv:2109.00238 [cs.LG]
[35]
Michael Korns. 2011. Accuracy in Symbolic Regression. 129--151.
[36]
Jiří Kubalík, Erik Derner, and Robert Babuška. 2023. Neural Networks for Symbolic Regression. arXiv e-prints, Article arXiv:2302.00773 (Feb. 2023), arXiv:2302.00773 pages. arXiv:2302.00773 [cs.NE]
[37]
Guillaume Lample and François Charton. 2019. Deep Learning for Symbolic Mathematics. arXiv e-prints, Article arXiv:1912.01412 (Dec. 2019), arXiv:1912.01412 pages. arXiv:1912.01412 [cs.SC]
[38]
Aaron D. Lanterman. 2001. Schwarz, Wallace, and Rissanen: Intertwining Themes in Theories of Model Selection. International Statistical Review 69, 2 (2001), 185--212. arXiv:https://onlinelibrary.wiley.com/doi/pdf/10.1111/j.1751-5823.2001.tb00456.x
[39]
Pablo Lemos, Niall Jeffrey, Miles Cranmer, Shirley Ho, and Peter Battaglia. 2022. Rediscovering orbital mechanics with machine learning. arXiv e-prints, Article arXiv:2202.02306 (Feb. 2022), arXiv:2202.02306 pages. arXiv:2202.02306 [astroph.EP]
[40]
F. B. Lempers. 1971. Posterior probabilities of alternative linear models. Some theoretical considerations and empirical experiments. [By] F. B. Lempers. Rotterdam University Press [Rotterdam]. 118 p. pages.
[41]
Christopher D. Manning and Hinrich Schutze. 1999. Foundations of statistical natural language processing. MIT Press, Cambridge, Mass.
[42]
Trent McConaghy. 2011. FFX: Fast, Scalable, Deterministic Symbolic Regression Technology. Springer New York, New York, NY, 235--260.
[43]
James McDermott, David R. White, Sean Luke, Luca Manzoni, Mauro Castelli, Leonardo Vanneschi, Wojciech Jaśkowski, Krzysztof Krawiec, Robin Harper, Kenneth De Jong, and Una May O'Reilly. 2012. Genetic programming needs better benchmarks. In GECCO'12 - Proceedings of the 14th International Conference on Genetic and Evolutionary Computation (GECCO'12 - Proceedings of the 14th International Conference on Genetic and Evolutionary Computation). 791--798. 14th International Conference on Genetic and Evolutionary Computation, GECCO'12 ; Conference date: 07-07-2012 Through 11-07-2012.
[44]
Aaron Meurer et al. 2017. SymPy: symbolic computing in Python. PeerJ Computer Science 3 (Jan. 2017), e103.
[45]
Anthony O'Hagan. 1995. Fractional Bayes Factors for Model Comparison. Journal of the Royal Statistical Society. Series B (Methodological) 57, 1 (1995), 99--138. http://www.jstor.org/stable/2346088
[46]
Adam G. Riess, Wenlong Yuan, Lucas M. Macri, Dan Scolnic, Dillon Brout, Stefano Casertano, David O. Jones, Yukei Murakami, Gagandeep S. Anand, Louise Breuval, Thomas G. Brink, Alexei V. Filippenko, Samantha Hoffmann, Saurabh W. Jha, W. D'arcy Kenworthy, John Mackenty, Benjamin E. Stahl, and WeiKang Zheng. 2022. A Comprehensive Measurement of the Local Value of the Hubble Constant with 1 km s−1 Mpc−1 Uncertainty from the Hubble Space Telescope and the SH0ES Team. ApJ 934, 1, Article L7 (July 2022), L7 pages. arXiv:2112.04510 [astro-ph.CO]
[47]
J. Rissanen. 1978. Modeling by shortest data description. Automatica 14, 5 (1978), 465--471.
[48]
Jorma Rissanen. 1983. A Universal Prior for Integers and Estimation by Minimum Description Length. The Annals of Statistics 11, 2 (1983), 416 -- 431.
[49]
Daniel Rivero and Enrique Fernandez-Blanco. 2019. A New Deterministic Technique for Symbolic Regression. arXiv e-prints, Article arXiv:1908.06754 (Aug. 2019), arXiv:1908.06754 pages. arXiv:1908.06754 [cs.LG]
[50]
M.D. Schwartz. 2014. Quantum Field Theory and the Standard Model. Cambridge University Press. https://books.google.fr/books?id=HbdEAgAAQBAJ
[51]
Gideon Schwarz. 1978. Estimating the Dimension of a Model. Annals of Statistics 6, 2 (July 1978), 461--464.
[52]
Dan Scolnic, Dillon Brout, Anthony Carr, Adam G. Riess, Tamara M. Davis, Arianna Dwomoh, David O. Jones, Noor Ali, Pranav Charvu, Rebecca Chen, Erik R. Peterson, Brodie Popovic, Benjamin M. Rose, Charlotte Wood, Peter J. Brown, Ken Chambers, David A. Coulter, Kyle G. Dettman, Georgios Dimitriadis, Alexei V. Filippenko, Ryan J. Foley, Saurabh W. Jha, Charles D. Kilpatrick, Robert P. Kirshner, Yen-Chen Pan, Armin Rest, Cesar Rojas-Bravo, Matthew R. Siebert, Benjamin E. Stahl, and WeiKang Zheng. 2021. The Pantheon+ Analysis: The Full Dataset and Light-Curve Release. arXiv e-prints, Article arXiv:2112.03863 (Dec. 2021), arXiv:2112.03863 pages. arXiv:2112.03863 [astro-ph.CO]
[53]
Parshin Shojaee, Kazem Meidani, Amir Barati Farimani, and Chandan K. Reddy. 2023. Transformer-based Planning for Symbolic Regression. arXiv e-prints, Article arXiv:2303.06833 (March 2023), arXiv:2303.06833 pages. arXiv:2303.06833 [cs.LG]
[54]
Guido F. Smits and Mark Kotanchek. 2005. Pareto-Front Exploitation in Symbolic Regression. Springer US, Boston, MA, 283--299.
[55]
R.S. Sutton and A.G. Barto. 2018. Reinforcement Learning, second edition: An Introduction. MIT Press. https://books.google.co.uk/books?id=sWV0DwAAQBAJ
[56]
J. Takeuchi. 1997. Characterization of the Bayes estimator and the MDL estimator for exponential families. IEEE Transactions on Information Theory 43, 4 (1997), 1165--1174.
[57]
A. M. Turing. 1950. I.---COMPUTING MACHINERY AND INTELLIGENCE. Mind LIX, 236 (10 1950), 433--460.
[58]
Silviu-Marian Udrescu, Andrew Tan, Jiahai Feng, Orisvaldo Neto, Tailin Wu, and Max Tegmark. 2020. AI Feynman 2.0: Pareto-optimal symbolic regression exploiting graph modularity. arXiv e-prints, Article arXiv:2006.10782 (June 2020), arXiv:2006.10782 pages. arXiv:2006.10782 [cs.LG]
[59]
Silviu-Marian Udrescu and Max Tegmark. 2020. AI Feynman: A physics-inspired method for symbolic regression. Science Advances 6, 16 (April 2020), eaay2631. arXiv:1905.11481 [physics.comp-ph]
[60]
Nguyen Quang Uy, Nguyen Xuan Hoai, Michael O'Neill, Robert I. McKay, and Edgar Galván López. 2011. Semantically-based crossover in genetic programming: application to real-valued symbolic regression. Genet. Program. Evolvable Mach. 12, 2 (2011), 91--119.
[61]
Mojtaba Valipour, Bowen You, Maysum Panju, and Ali Ghodsi. 2021. SymbolicGPT: A Generative Transformer Model for Symbolic Regression. arXiv e-prints, Article arXiv:2106.14131 (June 2021), arXiv:2106.14131 pages. arXiv:2106.14131 [cs.LG]
[62]
Martin Vastl, Jonáš Kulhánek, Jiří Kubalík, Erik Derner, and Robert Babuška. 2022. SymFormer: End-to-end symbolic regression using transformer-based architecture. arXiv e-prints, Article arXiv:2205.15764 (May 2022), arXiv:2205.15764 pages. arXiv:2205.15764 [cs.LG]
[63]
Ekaterina J. Vladislavleva, Guido F. Smits, and Dick den Hertog. 2009. Order of Nonlinearity as a Complexity Measure for Models Generated by Symbolic Regression via Pareto Genetic Programming. IEEE Transactions on Evolutionary Computation 13, 2 (2009), 333--349.
[64]
C. S. Wallace and P. R. Freeman. 1987. Estimation and Inference by Compact Coding. Journal of the Royal Statistical Society. Series B (Methodological) 49, 3 (1987), 240--265. http://www.jstor.org/stable/2985992
[65]
C. S. Wallace and P. R. Freeman. 1992. Single-Factor Analysis by Minimum Message Length Estimation. Journal of the Royal Statistical Society: Series B (Methodological) 54, 1 (1992), 195--209. arXiv:https://rss.onlinelibrary.wiley.com/doi/pdf/10.1111/j.2517-6161.1992.tb01874.x
[66]
Steven Weinberg. 1972. Gravitation and Cosmology: Principles and Applications of the General Theory of Relativity. John Wiley and Sons, New York.
[67]
Matthias Werner, Andrej Junginger, Philipp Hennig, and Georg Martius. 2022. Uncertainty in equation learning. In GECCO '22: Genetic and Evolutionary Computation Conference, Companion Volume, Boston, Massachusetts, USA, July 9 -- 13, 2022, Jonathan E. Fieldsend and Markus Wagner (Eds.). ACM, 2298--2305.
[68]
Tony Worm and Kenneth Chiu. 2013. Prioritized Grammar Enumeration: Symbolic Regression by Dynamic Programming. In Proceedings of the 15th Annual Conference on Genetic and Evolutionary Computation (Amsterdam, The Netherlands) (GECCO '13). Association for Computing Machinery, New York, NY, USA, 1021--1028.

Cited By

View all
  • (2024)Comparing Methods for Estimating Marginal Likelihood in Symbolic RegressionProceedings of the Genetic and Evolutionary Computation Conference Companion10.1145/3638530.3664142(2058-2066)Online publication date: 14-Jul-2024
  • (2024)Optimal inflationary potentialsPhysical Review D10.1103/PhysRevD.109.083524109:8Online publication date: 18-Apr-2024
  • (2024)SYREN-HALOFIT: A fast, interpretable, high-precision formula for the ΛCDM nonlinear matter power spectrumAstronomy & Astrophysics10.1051/0004-6361/202449854686(A150)Online publication date: 5-Jun-2024

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
GECCO '23 Companion: Proceedings of the Companion Conference on Genetic and Evolutionary Computation
July 2023
2519 pages
ISBN:9798400701207
DOI:10.1145/3583133
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 24 July 2023

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. model selection
  2. minimum description length
  3. symbolic regression
  4. equation learning
  5. data analysis
  6. cosmology
  7. language model

Qualifiers

  • Research-article

Funding Sources

Conference

GECCO '23 Companion
Sponsor:

Acceptance Rates

Overall Acceptance Rate 1,669 of 4,410 submissions, 38%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)66
  • Downloads (Last 6 weeks)4
Reflects downloads up to 12 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2024)Comparing Methods for Estimating Marginal Likelihood in Symbolic RegressionProceedings of the Genetic and Evolutionary Computation Conference Companion10.1145/3638530.3664142(2058-2066)Online publication date: 14-Jul-2024
  • (2024)Optimal inflationary potentialsPhysical Review D10.1103/PhysRevD.109.083524109:8Online publication date: 18-Apr-2024
  • (2024)SYREN-HALOFIT: A fast, interpretable, high-precision formula for the ΛCDM nonlinear matter power spectrumAstronomy & Astrophysics10.1051/0004-6361/202449854686(A150)Online publication date: 5-Jun-2024

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media