Computer Science > Artificial Intelligence

arXiv:2308.04041 (cs)

[Submitted on 8 Aug 2023]

Title: InfeRE: Step-by-Step Regex Generation via Chain of Inference

Authors: Shuai Zhang, Xiaodong Gu, Yuting Chen, Beijun Shen


Abstract: Automatically generating regular expressions (abbrev. regexes) from natural language descriptions (NL2RE) is an emerging research area. Prior studies treat a regex as a linear sequence of tokens and generate the final expression autoregressively in a single pass. They do not take into account the step-by-step internal text-matching processes behind the final result, which significantly hinders the efficacy and interpretability of regex generation by neural language models. In this paper, we propose a new paradigm called InfeRE, which decomposes the generation of regexes into chains of step-by-step inference. To enhance robustness, we introduce a self-consistency decoding mechanism that ensembles multiple outputs sampled from different models. We evaluate InfeRE on two publicly available datasets, NL-RX-Turk and KB13, and compare the results with state-of-the-art approaches and the popular tree-based generation approach TRANX. Experimental results show that InfeRE substantially outperforms previous baselines, improving DFA@5 accuracy by 16.3% and 14.7% on the two datasets, respectively. In particular, InfeRE outperforms TRANX by 18.1% and 11.3% in DFA@5 accuracy on the two datasets, respectively.
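
The self-consistency idea mentioned in the abstract can be illustrated with a small sketch. The abstract does not say how candidate outputs are compared or combined, so everything below is an assumption made for illustration: candidates are grouped by how they behave on a handful of probe strings (a cheap stand-in for true regex equivalence), and the largest group wins by majority vote. The function names, the probe strings, and the candidate list are all hypothetical and are not taken from the paper.

```python
import re
from collections import Counter


def behaviour_signature(pattern, probes):
    """Return a tuple of full-match results per probe, or None if the pattern is invalid."""
    try:
        compiled = re.compile(pattern)
    except re.error:
        return None
    return tuple(compiled.fullmatch(p) is not None for p in probes)


def self_consistent_choice(candidates, probes):
    """Pick the candidate whose behaviour on the probes is shared by the most candidates.

    Assumption: agreement on probe strings approximates semantic equivalence.
    It is not a real equivalence check between the underlying automata.
    """
    votes = Counter()
    representative = {}  # first candidate seen for each behaviour signature
    for cand in candidates:
        sig = behaviour_signature(cand, probes)
        if sig is None:  # skip candidates that do not compile
            continue
        votes[sig] += 1
        representative.setdefault(sig, cand)
    if not votes:
        return None
    best_sig, _ = votes.most_common(1)[0]
    return representative[best_sig]


# Hypothetical outputs sampled from several models for "one or more digits".
candidates = [r"[0-9]+", r"\d+", r"[a-z]+", r"\d\d*", r"(("]
probes = ["", "7", "42", "abc", "4a"]
print(self_consistent_choice(candidates, probes))
```

Three of the five candidates behave identically on the probes, so the first of them is returned; the invalid pattern is ignored. A more faithful version would replace the probe-based comparison with an exact equivalence test.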

Comments:	This paper has been accepted by ASE'23
Subjects:	Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as:	arXiv:2308.04041 [cs.AI]
	(or arXiv:2308.04041v1 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2308.04041

Submission history

From: Shuai Zhang
[v1] Tue, 8 Aug 2023 04:37:41 UTC (3,765 KB)
