DOI: 10.1145/2695664.2695884

Developers assignment for analyzing pull requests

Published: 13 April 2015

Abstract

A new collaboration approach is becoming increasingly common in open-source projects: the pull request model. In this model, developers who do not belong to a project's core team can submit contributions to that team. In projects that receive many pull requests, assigning developers to analyze them is difficult. In this work, we propose using data mining techniques, specifically classification strategies, to suggest the most appropriate developers to analyze a contribution under the pull request model. The experiments were conducted on 21 open-source projects, each one characterized by 14 attributes. The first set of experiments aimed at indicating a single developer to analyze each pull request. The predictive accuracy obtained ranged from 22.45% to 68.27%, and the Random Forest classifier achieved the best result in 76% of the projects. In the second set of experiments, we found that, when three developers are suggested to analyze a pull request, the chance of including the developer who actually analyzed it ranged from 47.33% to 95.47%.
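The two evaluation settings in the abstract (suggesting one developer versus three) correspond to what is commonly called top-1 and top-3 accuracy. The following is a minimal sketch of that measure, not the authors' implementation: it assumes a classifier that outputs a score per candidate developer for each pull request, and the function names, developer names, and scores are all hypothetical.

```python
from typing import Dict, List, Sequence


def top_k_developers(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k highest-scoring developers (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:k]]


def top_k_accuracy(
    all_scores: Sequence[Dict[str, float]],
    actual: Sequence[str],
    k: int,
) -> float:
    """Fraction of pull requests whose actual analyst is among the top k suggestions."""
    if len(all_scores) != len(actual):
        raise ValueError("need exactly one actual developer per pull request")
    if not actual:
        return 0.0
    hits = sum(
        1
        for scores, developer in zip(all_scores, actual)
        if developer in top_k_developers(scores, k)
    )
    return hits / len(actual)


# Hypothetical classifier scores for four pull requests (illustrative only).
example_scores = [
    {"alice": 0.6, "bob": 0.3, "carol": 0.1, "dave": 0.0},
    {"alice": 0.2, "bob": 0.5, "carol": 0.2, "dave": 0.1},
    {"alice": 0.4, "bob": 0.1, "carol": 0.3, "dave": 0.2},
    {"alice": 0.3, "bob": 0.3, "carol": 0.3, "dave": 0.1},
]
# Hypothetical developers who actually analyzed each pull request.
example_actual = ["alice", "carol", "dave", "dave"]

if __name__ == "__main__":
    print("top-1 accuracy:", top_k_accuracy(example_scores, example_actual, 1))
    print("top-3 accuracy:", top_k_accuracy(example_scores, example_actual, 3))
```

On this toy data the top-3 figure is higher than the top-1 figure, which mirrors the direction of the gap the abstract reports between the two sets of experiments. With k = 1 the measure reduces to ordinary classification accuracy.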



Published In

SAC '15: Proceedings of the 30th Annual ACM Symposium on Applied Computing
April 2015
2418 pages
ISBN:9781450331968
DOI:10.1145/2695664

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. distributed software development
  2. pull request assignment
  3. pull-based development

Qualifiers

  • Research-article

Funding Sources

  • CAPES
  • CNPq
  • FAPERJ

Conference

SAC 2015: Symposium on Applied Computing
April 13 - 17, 2015
Salamanca, Spain

Acceptance Rates

SAC '15 Paper Acceptance Rate: 291 of 1,211 submissions (24%)
Overall Acceptance Rate: 1,650 of 6,669 submissions (25%)

Cited By

  • (2024) Distilling Quality Enhancing Comments From Code Reviews to Underpin Reviewer Recommendation. IEEE Transactions on Software Engineering, 50:7 (1658-1674). DOI: 10.1109/TSE.2024.3356819. Online publication date: Jul-2024
  • (2024) ReBack: recommending backports in social coding environments. Automated Software Engineering, 31:1. DOI: 10.1007/s10515-024-00416-1. Online publication date: 23-Feb-2024
  • (2023) Modern Code Reviews—Survey of Literature and Practice. ACM Transactions on Software Engineering and Methodology, 32:4 (1-61). DOI: 10.1145/3585004. Online publication date: 26-May-2023
  • (2023) Using knowledge units of programming languages to recommend reviewers for pull requests: an empirical study. Empirical Software Engineering, 29:1. DOI: 10.1007/s10664-023-10421-9. Online publication date: 29-Dec-2023
  • (2022) Adopting Learning-to-rank Algorithm for Reviewer Recommendation. Proceedings of the 32nd Annual International Conference on Computer Science and Software Engineering (22-31). DOI: 10.5555/3566055.3566059. Online publication date: 15-Nov-2022
  • (2022) Open Source Software Development Challenges. Research Anthology on Agile Software, Software Development, and Testing (2134-2164). DOI: 10.4018/978-1-6684-3702-5.ch102. Online publication date: 2022
  • (2022) Backports. Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension (636-647). DOI: 10.1145/3524610.3527920. Online publication date: 16-May-2022
  • (2022) Modeling review history for reviewer recommendation. Proceedings of the 44th International Conference on Software Engineering (1381-1392). DOI: 10.1145/3510003.3510213. Online publication date: 21-May-2022
  • (2021) Open Source Software Development Challenges. Research Anthology on Usage and Development of Open Source Software (33-62). DOI: 10.4018/978-1-7998-9158-1.ch003. Online publication date: 2021
  • (2021) New Developer Metrics for Open Source Software Development Challenges: An Empirical Study of Project Recommendation Systems. Applied Sciences, 11:3 (920). DOI: 10.3390/app11030920. Online publication date: 20-Jan-2021
