Computer Science > Software Engineering

arXiv:2307.00593 (cs)

[Submitted on 2 Jul 2023 (v1), last revised 8 May 2024 (this version, v3)]

Title: Isolating Compiler Bugs by Generating Effective Witness Programs with Large Language Models

Authors: Haoxin Tu, Zhide Zhou, He Jiang, Imam Nur Bani Yusuf, Yuxian Li, Lingxiao Jiang

Abstract: Compiler bugs pose a significant threat to safety-critical applications, and isolating them promptly and effectively is crucial for assuring compiler quality. However, the limited debugging information available in bug reports complicates compiler bug isolation. Existing approaches recast the problem as test program mutation, but they remain limited by ineffective mutation strategies or high human effort. Drawing inspiration from the recent progress of pre-trained Large Language Models (LLMs), such as ChatGPT, in code generation, we propose a new approach named LLM4CBI that uses LLMs to generate effective test programs for compiler bug isolation. Using LLMs directly for test program mutation, however, may not yield the desired results because of the challenges of formulating precise prompts and selecting specialized prompts. To overcome these challenges, LLM4CBI introduces three new components. First, a program complexity-guided prompt production component leverages data and control flow analysis to identify the most valuable variables and locations in programs for mutation. Second, a memorized prompt selection component adopts reinforcement learning to select specialized prompts for mutating test programs continuously. Third, a test program validation component selects specialized feedback prompts to avoid repeating the same mistakes during the mutation process. Compared with state-of-the-art approaches over 120 real bugs from GCC and LLVM, our evaluation demonstrates the advantages of LLM4CBI: it can isolate 69.70%/21.74% and 24.44%/8.92% more bugs than DiWi and RecBi within Top-1/Top-5 ranked results. We also demonstrate that the LLM component used in LLM4CBI can be easily replaced while still achieving reasonable results.

Comments:	Accepted by IEEE Transactions on Software Engineering
Subjects:	Software Engineering (cs.SE)
Cite as:	arXiv:2307.00593 [cs.SE]
	(or arXiv:2307.00593v3 [cs.SE] for this version)
	https://doi.org/10.48550/arXiv.2307.00593
Related DOI:	https://doi.org/10.1109/TSE.2024.3397822

Submission history

From: Haoxin Tu
[v1] Sun, 2 Jul 2023 15:20:54 UTC (2,132 KB)
[v2] Tue, 23 Apr 2024 17:24:06 UTC (928 KB)
[v3] Wed, 8 May 2024 08:46:17 UTC (929 KB)
