Computer Science > Neural and Evolutionary Computing

arXiv:2404.05898 (cs)

[Submitted on 8 Apr 2024]

Title:Inexact Simplification of Symbolic Regression Expressions with Locality-sensitive Hashing

Authors:Guilherme Seidyo Imai Aldeia (1), Fabricio Olivetti de Franca (1), William G. La Cava (2 and 3) ((1) Federal University of ABC, (2) Boston Children's Hospital, (3) Harvard Medical School)

Abstract:Symbolic regression (SR) searches for parametric models that accurately fit a dataset, prioritizing simplicity and interpretability. Despite this secondary objective, studies point out that the models are often overly complex due to redundant operations, introns, and bloat that arise during the iterative process, and that this can hinder the search through repeated exploration of bloated segments. Applying a fast heuristic algebraic simplification may not fully simplify the expression, and exact methods can be infeasible depending on the size or complexity of the expressions. We propose a novel agnostic simplification and bloat control method for SR that employs efficient memoization with locality-sensitive hashing (LSH). The idea is that expressions and their sub-expressions traversed during the iterative simplification process are stored in a dictionary using LSH, enabling efficient retrieval of similar structures. We iterate through the expression, replacing subtrees with others of the same hash if they result in a smaller expression. Empirical results show that applying this simplification during evolution performs as well as or better than running without simplification in terms of error minimization, while significantly reducing the number of nonlinear functions. This technique can learn simplification rules that work in general or for a specific problem, and it improves convergence while reducing model complexity.
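
The memoization idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tuple encoding of expressions, the probe inputs `SAMPLE_XS`, and the helper names `evaluate`, `size`, `bucket_key`, and `simplify` are all assumptions made for illustration. In particular, rounding an expression's outputs on a few fixed inputs is used here as a simple stand-in for a locality-sensitive hash, and the paper may use a different hashing scheme.

```python
import math

# Hypothetical encoding: the variable "x", a numeric constant,
# or a tuple (op, child, ...) with op taken from OPS.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "sin": math.sin,
    "cos": math.cos,
}

# Fixed probe inputs (an assumption of this sketch).
SAMPLE_XS = [-1.7, -0.4, 0.3, 1.1, 2.6]


def evaluate(expr, x):
    """Evaluate an expression tree at a single input value."""
    if expr == "x":
        return x
    if isinstance(expr, (int, float)):
        return float(expr)
    op, *args = expr
    return OPS[op](*(evaluate(a, x) for a in args))


def size(expr):
    """Count the nodes in an expression tree."""
    if isinstance(expr, tuple):
        return 1 + sum(size(a) for a in expr[1:])
    return 1


def bucket_key(expr, decimals=6):
    """Round the outputs so that nearly identical behaviours share a key.

    This is a stand-in for locality-sensitive hashing, not the paper's scheme.
    """
    return tuple(round(evaluate(expr, x), decimals) for x in SAMPLE_XS)


def simplify(expr, table):
    """Bottom-up pass: replace a subtree with a smaller one of the same key."""
    if isinstance(expr, tuple):
        expr = (expr[0], *(simplify(a, table) for a in expr[1:]))
    key = bucket_key(expr)
    best = table.get(key)
    if best is None or size(expr) < size(best):
        table[key] = expr  # remember the smallest expression seen for this key
        return expr
    return best


table = {}
simplify(1.0, table)  # seed the table with the constant 1
bloated = ("add", ("mul", "x", ("sub", "x", "x")), "x")  # x * (x - x) + x
pythagorean = (
    "add",
    ("mul", ("sin", "x"), ("sin", "x")),
    ("mul", ("cos", "x"), ("cos", "x")),
)
simplified_bloated = simplify(bloated, table)
simplified_pythagorean = simplify(pythagorean, table)
```

Because two expressions are matched only by their rounded outputs on a handful of inputs, a shared key does not prove algebraic equivalence, which is the sense in which this kind of simplification is inexact. Reusing the same `table` across calls is what lets smaller forms found earlier replace larger ones encountered later.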

Comments:	9 pages, 10 figures, accepted to GECCO-24
Subjects:	Neural and Evolutionary Computing (cs.NE); Machine Learning (cs.LG)
Cite as:	arXiv:2404.05898 [cs.NE]
	(or arXiv:2404.05898v1 [cs.NE] for this version)
	https://doi.org/10.48550/arXiv.2404.05898
Journal reference:	GSI Aldeia, FO de França, WG La Cava. 2024. Inexact Simplification of Symbolic Regression Expressions with Locality-sensitive Hashing. In Genetic and Evolutionary Computation Conference (GECCO '24)
Related DOI:	https://doi.org/10.1145/3638529.3654147

Submission history

From: Guilherme Seidyo Imai Aldeia
[v1] Mon, 8 Apr 2024 22:54:14 UTC (522 KB)
