
DOI: 10.1145/3379177.3388904

Action-based Recommendation in Pull-request Development

Published: 16 September 2020

Abstract

Pull request (PR) selection is a challenging task for integrators in pull-based development (PbD), as large open-source projects receive hundreds of PRs every day. Managing these PRs manually consumes integrators' time and resources and may delay the acceptance, response, or rejection of PRs that propose bug fixes or feature enhancements. On the one hand, well-known PbD platforms such as GitHub do not provide built-in recommendation mechanisms to facilitate PR management. On the other hand, prior research on PR recommendation has focused on the likelihood of a PR either being accepted or receiving a response from the integrator. In this paper, we consider both likelihoods to help integrators in the PR selection process by suggesting the appropriate action to take on each PR. To this end, we propose an approach called CARTESIAN (aCceptance And Response classificaTion-based requESt IdentificAtioN), which models PR recommendation in terms of PR actions. In particular, CARTESIAN can recommend three types of PR actions: accept, respond, and reject. We evaluated CARTESIAN on the PRs of 19 popular GitHub projects. The results of our study show that our approach can identify PR actions with an average precision and recall of about 86%. Moreover, our findings show that CARTESIAN outperforms two baseline approaches in the task of PR selection.
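As a rough illustration of how precision and recall can be computed for a three-action recommender (accept, respond, reject), the sketch below scores predicted actions against known ones. It is not the authors' evaluation code: the function names, the toy labels, and the choice of an unweighted (macro) average across actions are all assumptions made for this example, and the abstract does not state which averaging scheme was used.

```python
from collections import Counter

# The three PR actions named in the abstract.
ACTIONS = ("accept", "respond", "reject")


def per_action_precision_recall(true_actions, predicted_actions):
    """Return {action: (precision, recall)} for each PR action.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if len(true_actions) != len(predicted_actions):
        raise ValueError("label lists must have the same length")

    predicted_count = Counter(predicted_actions)  # how often each action was recommended
    actual_count = Counter(true_actions)          # how often each action was correct
    true_positives = Counter(
        truth for truth, guess in zip(true_actions, predicted_actions) if truth == guess
    )

    scores = {}
    for action in ACTIONS:
        # Precision: share of recommendations of this action that were right.
        precision = (
            true_positives[action] / predicted_count[action]
            if predicted_count[action] else 0.0
        )
        # Recall: share of PRs needing this action that were found.
        recall = (
            true_positives[action] / actual_count[action]
            if actual_count[action] else 0.0
        )
        scores[action] = (precision, recall)
    return scores


def macro_average(scores):
    """Unweighted mean of precision and recall over the actions (an assumption)."""
    n = len(scores)
    mean_precision = sum(p for p, _ in scores.values()) / n
    mean_recall = sum(r for _, r in scores.values()) / n
    return mean_precision, mean_recall


# Toy labels invented for this example; they are not data from the study.
truth = ["accept", "accept", "respond", "reject", "reject", "respond"]
guess = ["accept", "respond", "respond", "reject", "accept", "respond"]

scores = per_action_precision_recall(truth, guess)
avg_precision, avg_recall = macro_average(scores)
```

With these toy labels, "reject" is recommended once and correctly (precision 1.0) but only one of the two PRs that should be rejected is found (recall 0.5), which shows why both measures are reported together.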



Published In

ICSSP '20: Proceedings of the International Conference on Software and System Processes
June 2020
208 pages
ISBN:9781450375122
DOI:10.1145/3379177
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Machine learning
  2. Pull Requests recommendation
  3. Software maintenance and evolution

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • National Key Research and Development Program of China

Conference

ICSSP '20
