Editorial: TOSEM Journal in 2025 and Beyond
TOSEM is ACM’s flagship journal for publishing software engineering (SE) research. TOSEM stays true to the foundations of the discipline while meaningfully engaging with the wave of disruptive innovations in the field. In this light, we discuss the plans ...
Automating TODO-missed Methods Detection and Patching
TODO comments are widely used by developers to remind themselves or others about incomplete tasks. In other words, TODO comments are usually associated with temporary or suboptimal solutions. In practice, all the equivalent suboptimal implementations ...
On Process Discovery Experimentation: Addressing the Need for Research Methodology in Process Discovery
Process mining aims to derive insights into business processes from event logs recorded from information systems. Process discovery algorithms construct process models that describe the executed process. With the increasing availability of large-scale ...
Understanding Test Convention Consistency as a Dimension of Test Quality
Unit tests must be readable to help developers understand and evolve production code. Most existing test quality metrics assess test code’s ability to detect bugs. Few metrics focus on test code’s readability. One standard approach to improve readability ...
Test Case Minimization with Quantum Annealers
Quantum annealers are specialized quantum computers for solving combinatorial optimization problems with special quantum computing characteristics, e.g., superposition and entanglement. Theoretically, quantum annealers can outperform classical computers. ...
On the Understandability of Design-Level Security Practices in Infrastructure-as-Code Scripts and Deployment Architectures
Evangelos Ntentos, Nicole Elisabeth Lueger, Georg Simhandl, Uwe Zdun, Simon Schneider, Riccardo Scandariato, Nicolás E. Díaz Ferreyra
Infrastructure as Code (IaC) automates IT infrastructure deployment, which is particularly beneficial for continuous releases, for instance, in the context of microservices and cloud systems. Despite its flexibility in application architecture, ...
MalSensor: Fast and Robust Windows Malware Classification
Driven by substantial profits, the evolution of Portable Executable (PE) malware has posed persistent threats. PE malware classification has been an important research field, and numerous classification methods have been proposed. With the development ...
Reputation Gaming in Crowd Technical Knowledge Sharing
Stack Overflow's incentive system awards users reputation scores to ensure quality. The decentralized nature of the forum may make the incentive system prone to manipulation. This article offers, for the first time, a comprehensive study of the ...
Benchmarking and Categorizing the Performance of Neural Program Repair Systems for Java
Recent years have seen a rise in Neural Program Repair (NPR) systems in the software engineering community, which adopt advanced deep learning techniques to automatically fix bugs. Having a comprehensive understanding of existing systems can facilitate ...
Measuring and Mining Community Evolution in Developer Social Networks with Entropy-Based Indices
This work presents four novel entropy-based indices for measuring the community evolution of developer social networks (DSNs) in open source software (OSS) projects. The proposed indices offer a quantitative measure of community split, shrink, merge, and ...
Fine-Tuning Large Language Models to Improve Accuracy and Comprehensibility of Automated Code Review
As code review is a tedious and costly software quality practice, researchers have proposed several machine learning-based methods to automate the process. The primary focus has been on accuracy, that is, how accurately the algorithms are able to detect ...
Neuron Semantic-Guided Test Generation for Deep Neural Networks Fuzzing
In recent years, significant progress has been made in testing methods for deep neural networks (DNNs) to ensure their correctness and robustness. Coverage-guided criteria, such as neuron-wise, layer-wise, and path-/trace-wise, have been proposed for DNN ...
An Exploratory Study on Machine Learning Model Management
Effective model management is crucial for ensuring performance and reliability in Machine Learning (ML) systems, given the dynamic nature of data and operational environments. However, standard practices are lacking, often resulting in ad hoc approaches. ...
SimClone: Detecting Tabular Data Clones Using Value Similarity
Data clones are defined as multiple copies of the same data among datasets. The presence of data clones between datasets can cause issues such as difficulties in managing data assets and data license violations when using datasets with clones to build AI ...
MarMot: Metamorphic Runtime Monitoring of Autonomous Driving Systems
Autonomous driving systems (ADSs) are complex cyber-physical systems (CPSs) that must ensure safety even in uncertain conditions. Modern ADSs often employ deep neural networks (DNNs), which may not produce correct results in every possible driving ...
History-Driven Fuzzing for Deep Learning Libraries
Recently, many Deep Learning (DL) fuzzers have been proposed for API-level testing of DL libraries. However, they either perform unguided input generation (e.g., not considering the relationship between API arguments when generating inputs) or only ...
Studying the Impact of TensorFlow and PyTorch Bindings on Machine Learning Software Quality
Bindings for machine learning frameworks (such as TensorFlow and PyTorch) allow developers to integrate a framework’s functionality using a programming language different from the framework’s default language (usually Python). In this article, we study ...
Don’t Complete It! Preventing Unhelpful Code Completion for Productive and Sustainable Neural Code Completion Systems
Currently, large pre-trained language models are widely applied in neural code completion systems. Though large code models significantly outperform their smaller counterparts, around 70% of displayed code completions from GitHub Copilot are not accepted ...
CARL: Unsupervised Code-Based Adversarial Attacks for Programming Language Models via Reinforcement Learning
Code-based adversarial attacks play a crucial role in revealing vulnerabilities of software systems. Recently, pre-trained programming language models (PLMs) have demonstrated remarkable success in various significant software engineering tasks, ...
Decision Support Model for Selecting the Optimal Blockchain Oracle Platform: An Evaluation of Key Factors
Smart contract-based applications are executed in a blockchain environment, and they cannot directly access data from external systems, which is required for the service provision of these applications. Instead, smart contracts use agents known as ...
Diversity’s Double-Edged Sword: Analyzing Race’s Effect on Remote Pair Programming Interactions
Remote pair programming is widely used in software development, but no research has examined how race affects these interactions between developers. We embarked on this study due to the historical underrepresentation of Black developers in the tech ...
DiPri: Distance-Based Seed Prioritization for Greybox Fuzzing
Greybox fuzzing is a powerful testing technique. Given a set of initial seeds, greybox fuzzing continuously generates new test inputs to execute the program under test and drives executions with code coverage as feedback. Seed prioritization is an ...