Developers assignment for analyzing pull requests

Authors:

Manoel Limeira de Lima Júnior,

Daricélio Moreira Soares,

Alexandre Plastino,

Leonardo Murta

SAC '15: Proceedings of the 30th Annual ACM Symposium on Applied Computing

Pages 1567 - 1572

https://doi.org/10.1145/2695664.2695884

Published: 13 April 2015

Abstract

A new collaboration approach is becoming increasingly common in open-source projects: the pull request model. In this kind of collaboration, developers who do not belong to the core team of a project can submit contributions to the core team. In projects that receive many pull requests, assigning developers to analyze them is a difficult task. In this work, we propose to use data mining techniques, more specifically classification strategies, to suggest the most appropriate developers to analyze a contribution under the pull request model. The experiments were conducted using 21 open-source projects, each one characterized by 14 attributes. The first set of experiments aimed at indicating just one developer to analyze the pull request. The obtained predictive accuracy ranged from 22.45% to 68.27%. The Random Forest classifier achieved the best result in 76% of the projects. From the second set of experiments, we conclude that, when suggesting three developers to analyze a pull request, the chance of identifying the developer who actually analyzed the pull request ranged from 47.33% to 95.47%.
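The two evaluation settings in the abstract (suggesting one developer versus suggesting three) can be read as top-k accuracy with k = 1 and k = 3. The sketch below is only an illustration of that metric, not the authors' implementation: it assumes a classifier that outputs one score per candidate developer, and the function names, developer names, and scores are all hypothetical.

```python
def top_k_developers(scores, k):
    """Return the k developers with the highest score, best first.

    `scores` maps a developer name to a classifier score
    (hypothetical output format, assumed for this sketch).
    """
    return sorted(scores, key=scores.get, reverse=True)[:k]


def top_k_accuracy(all_scores, actual, k):
    """Fraction of pull requests whose actual analyst is among the top k suggestions."""
    hits = sum(
        1
        for scores, developer in zip(all_scores, actual)
        if developer in top_k_developers(scores, k)
    )
    return hits / len(actual)


# Made-up scores for four pull requests and three candidate developers.
all_scores = [
    {"alice": 0.6, "bob": 0.3, "carol": 0.1},
    {"alice": 0.2, "bob": 0.5, "carol": 0.3},
    {"alice": 0.5, "bob": 0.1, "carol": 0.4},
    {"alice": 0.1, "bob": 0.2, "carol": 0.7},
]
# Developer who actually analyzed each pull request (also made up).
actual = ["alice", "carol", "carol", "bob"]

print(top_k_accuracy(all_scores, actual, 1))  # only the single best suggestion counts
print(top_k_accuracy(all_scores, actual, 2))  # either of the two best suggestions counts
```

Widening the suggestion list can only keep or raise this measure, which is consistent with the higher range the abstract reports for three suggestions than for one.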

Index Terms

Developers assignment for analyzing pull requests
1. Software and its engineering
  1. Software creation and management
    1. Collaboration in software development
      1. Programming teams
    2. Software post-development issues
      1. Software version control
  2. Software notations and tools
    1. Software configuration management and version control systems

Information & Contributors

Information

Published In

SAC '15: Proceedings of the 30th Annual ACM Symposium on Applied Computing

April 2015

2418 pages

ISBN:9781450331968

DOI:10.1145/2695664

Conference Chairs:
Roger L. Wainwright (University of Tulsa)
Juan Manuel Corchado (University of Salamanca, Spain)

Program Chairs:
Alessio Bechini (University of Pisa, Italy)
Jiman Hong (Soongsil University, South Korea)

Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGAPP: ACM Special Interest Group on Applied Computing

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

CAPES
CNPq
FAPERJ

Conference

SAC 2015: Symposium on Applied Computing

Sponsor: SIGAPP

April 13 - 17, 2015

Salamanca, Spain

Acceptance Rates

SAC '15 Paper Acceptance Rate 291 of 1,211 submissions, 24%;

Overall Acceptance Rate 1,650 of 6,669 submissions, 25%

Bibliometrics & Citations

Article Metrics

Total Citations: 22
Total Downloads: 301
Downloads (Last 12 months): 8
Downloads (Last 6 weeks): 0

Reflects downloads up to 21 Nov 2024

Citations

Cited By

Rong G, Yu Y, Zhang Y, Zhang H, Shen H, Shao D, Kuang H, Wang M, Wei Z, Xu Y, Wang J (2024). Distilling Quality Enhancing Comments From Code Reviews to Underpin Reviewer Recommendation. IEEE Transactions on Software Engineering, 50:7 (1658-1674). DOI: 10.1109/TSE.2024.3356819. Online publication date: Jul-2024
https://doi.org/10.1109/TSE.2024.3356819
Chakroborti D, Schneider K, Roy C (2024). ReBack: recommending backports in social coding environments. Automated Software Engineering, 31:1. DOI: 10.1007/s10515-024-00416-1. Online publication date: 23-Feb-2024
https://dl.acm.org/doi/10.1007/s10515-024-00416-1
Badampudi D, Unterkalmsteiner M, Britto R (2023). Modern Code Reviews—Survey of Literature and Practice. ACM Transactions on Software Engineering and Methodology, 32:4 (1-61). DOI: 10.1145/3585004. Online publication date: 26-May-2023
https://dl.acm.org/doi/10.1145/3585004
Ahasanuzzaman M, Oliva G, Hassan A (2023). Using knowledge units of programming languages to recommend reviewers for pull requests: an empirical study. Empirical Software Engineering, 29:1. DOI: 10.1007/s10664-023-10421-9. Online publication date: 29-Dec-2023
https://doi.org/10.1007/s10664-023-10421-9
Zhao G, Liu J, Alencar Da Costa D, Zou Y, Shirani P, Onut I, Ng T, Kent K, Başar A, Onut I (2022). Adopting Learning-to-rank Algorithm for Reviewer Recommendation. Proceedings of the 32nd Annual International Conference on Computer Science and Software Engineering (22-31). DOI: 10.5555/3566055.3566059. Online publication date: 15-Nov-2022
https://dl.acm.org/doi/10.5555/3566055.3566059
Seker A, Diri B, Arslan H, Amasyalı M (2022). Open Source Software Development Challenges. Research Anthology on Agile Software, Software Development, and Testing (2134-2164). DOI: 10.4018/978-1-6684-3702-5.ch102. Online publication date: 2022
https://doi.org/10.4018/978-1-6684-3702-5.ch102
Chakroborti D, Schneider K, Roy C, Rastogi A, Tufano R, Bavota G, Arnaoudova V, Haiduc S (2022). Backports. Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension (636-647). DOI: 10.1145/3524610.3527920. Online publication date: 16-May-2022
https://dl.acm.org/doi/10.1145/3524610.3527920
Rong G, Zhang Y, Yang L, Zhang F, Kuang H, Zhang H, Dwyer M, Damian D, Zeller A (2022). Modeling review history for reviewer recommendation. Proceedings of the 44th International Conference on Software Engineering (1381-1392). DOI: 10.1145/3510003.3510213. Online publication date: 21-May-2022
https://dl.acm.org/doi/10.1145/3510003.3510213
Seker A, Diri B, Arslan H, Amasyalı M (2021). Open Source Software Development Challenges. Research Anthology on Usage and Development of Open Source Software (33-62). DOI: 10.4018/978-1-7998-9158-1.ch003. Online publication date: 2021
https://doi.org/10.4018/978-1-7998-9158-1.ch003
Şeker A, Diri B, Arslan H (2021). New Developer Metrics for Open Source Software Development Challenges: An Empirical Study of Project Recommendation Systems. Applied Sciences, 11:3 (920). DOI: 10.3390/app11030920. Online publication date: 20-Jan-2021
https://doi.org/10.3390/app11030920
