FORGE aims to bring together researchers, practitioners, and educators from the AI and Software Engineering communities to solve the new challenges arising in the era of foundation models.
Proceeding Downloads
Deep Multiple Assertions Generation
Software testing is one of the most crucial parts of the software development life cycle. Developers spend a substantial amount of time and effort on software testing. Recently, there has been growing scholarly interest in the automation of software ...
MeTMaP: Metamorphic Testing for Detecting False Vector Matching Problems in LLM Augmented Generation
Augmented generation techniques such as Retrieval-Augmented Generation (RAG) and Cache-Augmented Generation (CAG) have revolutionized the field by enhancing large language model (LLM) outputs with external knowledge and cached information. However, the ...
Planning to Guide LLM for Code Coverage Prediction
Code coverage serves as a crucial metric to assess testing effectiveness, measuring the degree to which a test suite exercises different facets of the code, such as statements, branches, or paths. Despite its significance, coverage profilers necessitate ...
The Emergence of Large Language Models in Static Analysis: A First Look through Micro-Benchmarks
The application of Large Language Models (LLMs) in software engineering, particularly in static analysis tasks, represents a paradigm shift in the field. In this paper, we investigate the role that current LLMs can play in improving callgraph analysis ...
Reality Bites: Assessing the Realism of Driving Scenarios with Large Language Models
Large Language Models (LLMs) are demonstrating outstanding potential for tasks such as text generation, summarization, and classification. Given that such models are trained on a humongous amount of online knowledge, we hypothesize that LLMs can assess ...
Assessing the Impact of GPT-4 Turbo in Generating Defeaters for Assurance Cases
- Kimya Khakzad Shahandashti,
- Mithila Sivakumar,
- Mohammad Mahdi Mohajer,
- Alvine Boaye Belle,
- Song Wang,
- Timothy Lethbridge
Assurance cases (ACs) are structured arguments that allow verifying that a system's non-functional requirements (e.g., safety, security) are correctly implemented. This helps prevent system failures, which may result in catastrophic ...
Exploring the Impact of the Output Format on the Evaluation of Large Language Models for Code Translation
Code translation between programming languages is a long-existing and critical task in software engineering, facilitating the modernization of legacy systems, ensuring cross-platform compatibility, and enhancing software performance. With the recent ...
Is Attention All You Need? Toward a Conceptual Model for Social Awareness in Large Language Models
Large Language Models (LLMs) are revolutionizing the landscape of Artificial Intelligence (AI) due to recent technological breakthroughs. Their remarkable success in aiding various Software Engineering (SE) tasks through AI-powered tools and assistants ...
An Exploratory Investigation into Code License Infringements in Large Language Model Training Datasets
Does the training of large language models potentially infringe upon code licenses? Furthermore, are there any datasets available that can be safely used for training these models without violating such licenses? In our study, we assess the current ...
Fine Tuning Large Language Model for Secure Code Generation
AI pair programmers, such as GitHub's Copilot, have shown great success in automatic code generation. However, such large language model-based code generation techniques face the risk of introducing security vulnerabilities to codebases. In this work, we ...
Investigating the Performance of Language Models for Completing Code in Functional Programming Languages: A Haskell Case Study
Language model-based code completion models have quickly grown in use, helping thousands of developers write code in many different programming languages. However, research on code completion models typically focuses on imperative languages such as ...
On Evaluating the Efficiency of Source Code Generated by LLMs
Recent years have seen the remarkable capabilities of large language models (LLMs) for code generation. Unlike existing work that evaluates the correctness of the code generated by LLMs, we propose to further evaluate its efficiency. More ...
PathOCL: Path-Based Prompt Augmentation for OCL Generation with GPT-4
The rapid progress of AI-powered programming assistants, such as GitHub Copilot, has facilitated the development of software applications. These assistants rely on large language models (LLMs), which are foundation models (FMs) that support a wide range ...
Creative and Correct: Requesting Diverse Code Solutions from AI Foundation Models
AI foundation models have the capability to produce a wide array of responses to a single prompt, a feature that is highly beneficial in software engineering to generate diverse code solutions. However, this advantage introduces a significant trade-off ...
Commit Message Generation via ChatGPT: How Far Are We?
Commit messages concisely describe code changes in natural language and are important for software maintenance. Various automatic commit message generation approaches have been proposed, such as retrieval-based, learning-based, and hybrid approaches. ...