Computer Science > Software Engineering

arXiv:2407.19055 (cs)

[Submitted on 26 Jul 2024]

Title: Effective Large Language Model Debugging with Best-first Tree Search

Authors: Jialin Song, Jonathan Raiman, Bryan Catanzaro

Abstract: Large Language Models (LLMs) show promise in code generation tasks. However, their code-writing abilities are often limited in scope: while they can successfully implement simple functions, they struggle with more complex tasks. A fundamental difference between how an LLM writes code and how a human programmer does is that an LLM cannot consistently spot and fix bugs. Debugging is a crucial skill for programmers, and it enables iterative code refinement towards a correct implementation. In this work, we propose a novel algorithm that enables LLMs to debug their code via self-reflection and search, in which a model attempts to identify its previous mistakes. Our key contributions are: 1) A best-first tree search algorithm with self-reflections (BESTER) that achieves state-of-the-art Pass@1 on three code generation benchmarks. BESTER maintains its superiority when we measure pass rates while taking into account the additional inference costs incurred by tree search. 2) A novel interpretability study on what self-reflections attend to in buggy programs and how they impact bug fixes, which provides a deeper understanding of the debugging process. 3) An extensive study on when self-reflections are effective in finding bugs.
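The abstract does not spell out how BESTER works, so the following is only a minimal sketch of a generic best-first search over candidate programs, not the authors' implementation. It assumes that candidates are scored by the fraction of test cases they pass, and it replaces the LLM self-reflection and repair step with a hypothetical `propose_fixes` function that applies hand-written edits. All names here (`score`, `best_first_debug`, `propose_fixes`, `max_expansions`) are illustrative.

```python
import heapq
from typing import Callable, List, Tuple

Program = Callable[[int], int]
Test = Tuple[int, int]  # (input, expected output)


def score(program: Program, tests: List[Test]) -> float:
    """Fraction of tests passed; a raised exception counts as a failure."""
    passed = 0
    for x, expected in tests:
        try:
            if program(x) == expected:
                passed += 1
        except Exception:
            pass
    return passed / len(tests)


def best_first_debug(
    initial: Program,
    propose_fixes: Callable[[Program], List[Program]],
    tests: List[Test],
    max_expansions: int = 10,
) -> Program:
    """Always expand the highest-scoring candidate seen so far."""
    counter = 0  # tie-breaker so the heap never compares callables
    initial_score = score(initial, tests)
    frontier = [(-initial_score, counter, initial)]
    best_score, best = initial_score, initial
    for _ in range(max_expansions):
        if not frontier:
            break
        neg_score, _, program = heapq.heappop(frontier)
        if -neg_score == 1.0:
            return program  # every test passes
        for child in propose_fixes(program):
            counter += 1
            child_score = score(child, tests)
            if child_score > best_score:
                best_score, best = child_score, child
            heapq.heappush(frontier, (-child_score, counter, child))
    return best


# Toy stand-in for "reflect on the bug, then propose repairs" (assumption:
# a real system would query an LLM here instead of using fixed edits).
def propose_fixes(program: Program) -> List[Program]:
    return [
        lambda x: max(0, program(x)),   # candidate fix: clamp from below
        lambda x: min(10, program(x)),  # candidate fix: clamp from above
    ]


# Target behaviour: clamp the input to the range [0, 10].
tests = [(-5, 0), (5, 5), (15, 10)]
buggy = lambda x: x  # passes only one of the three tests
fixed = best_first_debug(buggy, propose_fixes, tests)
```

In this toy run the buggy program passes one of three tests, each single repair raises that to two, and the search then expands the better-scoring branch until a candidate passes all three.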

Subjects:	Software Engineering (cs.SE); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2407.19055 [cs.SE]
	(or arXiv:2407.19055v1 [cs.SE] for this version)
	https://doi.org/10.48550/arXiv.2407.19055

Submission history

From: Jialin Song
[v1] Fri, 26 Jul 2024 19:26:00 UTC (249 KB)
