Authors:
Richard V. R. Mariano 1; Geanderson E. dos Santos 2 and Wladmir Cardoso Brandão 1
Affiliations:
1 Department of Computer Science, Pontifical Catholic University of Minas Gerais (PUC Minas), Belo Horizonte, Brazil; 2 Department of Computer Science, Federal University of Minas Gerais (UFMG), Belo Horizonte, Brazil
Keyword(s):
Software Maintenance, Quantitative Changes, Classification, Machine Learning.
Abstract:
Software maintenance is an important stage of software development and contributes to the quality of the software. Previous studies have shown that maintenance activities consume more than 40% of the development effort and most of the software budget. Understanding how these activities are performed can help managers plan and allocate resources in advance. Despite these studies, accurate models for classifying software commits into maintenance activities are still lacking. In this paper, we extend our previous work, in which we proposed improvements to one of the state-of-the-art techniques for classifying software commits. First, we add three features concerning the size of the commit to the state-of-the-art technique. Second, we propose the use of XGBoost, one of the most advanced implementations of boosting tree algorithms, which tends to outperform other machine learning models. Additionally, we present a deep analysis of our model to understand its decisions. Our findings show that our model outperforms the state-of-the-art technique, achieving an accuracy above 77% and a Kappa above 64%.
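The abstract reports two metrics, accuracy and Kappa. As a reading aid, the following minimal sketch shows how both can be computed from true and predicted commit labels, assuming that Kappa refers to Cohen's Kappa (agreement corrected for chance). The category names and the toy labels are hypothetical illustrations, not data or code from the paper.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of commits whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement (the accuracy) and p_e is the agreement
    expected by chance given the marginal label frequencies.
    """
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Counter returns 0 for labels that never appear in the predictions.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        return float("nan")  # undefined when only one label ever occurs
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy labels (assumed category names, not the paper's data).
y_true = ["corrective", "corrective", "adaptive",
          "adaptive", "perfective", "perfective"]
y_pred = ["corrective", "adaptive", "adaptive",
          "adaptive", "perfective", "corrective"]

acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
print(f"accuracy={acc:.3f} kappa={kappa:.3f}")
```

Kappa is lower than accuracy here because it discounts the agreement a classifier would reach by chance, which is why the two figures in the abstract differ.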