research-article

Estimativa do Tempo de Resolução de Issues no GitHub Usando Atributos Textuais e Temporais (Estimating the Resolution Time of GitHub Issues Using Textual and Temporal Attributes)

Authors:

Gláucia Silva,

Giovanni Comarela

SBES '21: Proceedings of the XXXV Brazilian Symposium on Software Engineering

Pages 253 - 262

https://doi.org/10.1145/3474624.3474647

Published: 05 October 2021

Abstract

Estimating issue resolution time is one of the most important steps in software maintenance. Although the subject is covered in the literature, few models target GitHub specifically. The platform is very popular, mainly in the open source context, but its issue tracking system is not bureaucratic and issues are registered in a very simple way, which makes building predictive models even more challenging. This work develops machine learning models to estimate the resolution time of GitHub issues. To handle the data scarcity, we propose textual attributes that capture issue characteristics and temporal attributes that provide information about the timing of issue events. Neural networks were used as classification algorithms and proved to be more suitable for this problem. To validate the proposed models, we compared them with a reference model from the literature using different metrics; the results were positive, with a significant improvement in accuracy.
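As a rough illustration of the framing in the abstract, the sketch below derives a few textual and temporal attributes from a single issue and maps its resolution time to a class label. It is not the authors' feature set or model: the attribute names, the class boundaries in BIN_EDGES_DAYS, and the helper functions are all assumptions made for this example, and the neural network classifier itself is left out.

```python
from datetime import datetime, timezone

# Hypothetical class boundaries in days; the paper's actual classes are not stated here.
BIN_EDGES_DAYS = (1, 7, 30)


def parse_ts(ts: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as '2021-02-08T14:30:00Z'."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def resolution_class(created_at: str, closed_at: str) -> int:
    """Return a class index: 0 = under 1 day, 1 = under 7, 2 = under 30, 3 = 30 or more."""
    days = (parse_ts(closed_at) - parse_ts(created_at)).total_seconds() / 86400
    # Count how many boundaries the resolution time has reached or passed.
    return sum(days >= edge for edge in BIN_EDGES_DAYS)


def extract_features(title: str, body: str, created_at: str) -> dict:
    """Build illustrative textual and temporal attributes for one issue."""
    created = parse_ts(created_at)
    text = f"{title} {body}"
    return {
        # Textual attributes (assumed examples)
        "title_words": len(title.split()),
        "body_words": len(body.split()),
        "has_url": "http://" in text or "https://" in text,
        # Temporal attributes (assumed examples)
        "created_hour": created.hour,
        "created_weekday": created.weekday(),  # Monday is 0
        "is_weekend": created.weekday() >= 5,
    }


if __name__ == "__main__":
    features = extract_features(
        "Crash on startup",
        "App fails. See https://example.com/log",
        "2021-02-08T14:30:00Z",
    )
    label = resolution_class("2021-02-08T14:30:00Z", "2021-02-11T09:00:00Z")
    print(features, label)
```

Feature dictionaries like this one, paired with the class label, could then be fed to any classifier; treating the target as a small set of duration classes rather than a continuous value matches the classification framing mentioned in the abstract.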

Supplementary Material

p253-neto-supplements.zip (supplemental files, 91.41 KB)


Cited By

Qiao Y, Lu X, Wang C, Wang J, Tang W, Li B (2024). Predicting Issue Resolution Time of OSS Using Multiple Features. Journal of Software: Evolution and Process 37:1. Online publication date: 22-Nov-2024.
https://doi.org/10.1002/smr.2746

Information & Contributors

Information

Published In

SBES '21: Proceedings of the XXXV Brazilian Symposium on Software Engineering

September 2021

473 pages

ISBN:9781450390613

DOI:10.1145/3474624

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

SBES '21

SBES '21: Brazilian Symposium on Software Engineering

September 27 - October 1, 2021

Joinville, Brazil

Acceptance Rates

Overall Acceptance Rate 147 of 427 submissions, 34%

Bibliometrics & Citations

Article Metrics

Total Citations: 1
Total Downloads: 83
Downloads (Last 12 months): 5
Downloads (Last 6 weeks): 0

Reflects downloads up to 15 Feb 2025
