DOI: 10.5555/3566055.3566059

Adopting Learning-to-rank Algorithm for Reviewer Recommendation

Published: 15 November 2022

Abstract

Software code review is a software quality assurance activity in which one or several developers check the quality of a source code change. The success of a code review depends on finding appropriate reviewers to inspect the code change, e.g., a pull request (PR). Otherwise, it can result in inefficient or low-quality code reviews. To match the expertise of a reviewer with a PR, existing approaches model the expertise of reviewers using different features (e.g., the file path similarity, the textual similarity, and the social connection features). However, the weights of different features are usually handcrafted and customized for each project. This is often time-consuming, as the weights used in one project cannot be propagated to other projects. In this paper, we propose a learning-to-rank (LtR) approach that can automatically learn the best weights of features to recommend reviewers. We empirically study 80 GitHub projects to test the performance of our LtR approach and compare its performance with two baselines. The experiment results show that: (1) applying the maximum aggregation scheme to compute features improves the performance of our LtR approach; (2) the LtR approach outperforms the two baseline models by 28%, 10%, and 12% on average with respect to top-1 accuracy, top-3 accuracy, and Mean Reciprocal Rank, respectively; (3) the semantic similarity feature can be used to recommend appropriate reviewers; and (4) the social connection between the contributors and the reviewers is the most important feature for recommending appropriate reviewers to review PRs.
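To make the idea concrete, the sketch below shows one simple way a learning-to-rank reviewer recommender can work: each (PR, candidate reviewer) pair is described by a feature vector, weights are learned with pairwise hinge-style updates so the true reviewer outscores the others, and candidates are ranked by weighted score. This is a minimal illustrative sketch, not the paper's actual algorithm; the feature values, reviewer names, and training scheme are invented for the example.

```python
# Minimal pairwise learning-to-rank sketch for reviewer recommendation.
# Features per candidate (hypothetical): path similarity, text similarity,
# social connection strength. The paper's real features and LtR model differ.

def rank_reviewers(weights, candidates):
    """Score each candidate reviewer by a weighted sum of features
    and return the candidate names ordered best-first."""
    scored = [(sum(w * f for w, f in zip(weights, feats)), name)
              for name, feats in candidates.items()]
    return [name for _, name in sorted(scored, reverse=True)]

def train_pairwise(prs, n_features, epochs=50, lr=0.1, margin=1.0):
    """Learn feature weights so that, for every PR, the true reviewer
    outscores every other candidate by at least `margin` (hinge update)."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for candidates, true_reviewer in prs:
            pos = candidates[true_reviewer]
            for name, neg in candidates.items():
                if name == true_reviewer:
                    continue
                score_pos = sum(wi * x for wi, x in zip(w, pos))
                score_neg = sum(wi * x for wi, x in zip(w, neg))
                if score_pos - score_neg < margin:
                    # Nudge weights toward the true reviewer's features.
                    w = [wi + lr * (p - n) for wi, p, n in zip(w, pos, neg)]
    return w

def mean_reciprocal_rank(weights, prs):
    """MRR over PRs: reciprocal of the rank of the true reviewer."""
    total = 0.0
    for candidates, true_reviewer in prs:
        ranking = rank_reviewers(weights, candidates)
        total += 1.0 / (ranking.index(true_reviewer) + 1)
    return total / len(prs)

# Toy training data: two PRs, each with candidate feature vectors.
prs = [
    ({"alice": [0.9, 0.8, 0.7], "bob": [0.2, 0.1, 0.3]}, "alice"),
    ({"alice": [0.1, 0.2, 0.1], "carol": [0.8, 0.9, 0.6]}, "carol"),
]
w = train_pairwise(prs, n_features=3)
print(round(mean_reciprocal_rank(w, prs), 2))  # toy data ranks perfectly -> 1.0
```

Because the weights are learned from labeled (PR, reviewer) pairs rather than handcrafted, the same training procedure can be rerun per project, which is the portability problem the abstract describes.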

References

[1]
Alberto Bacchelli and Christian Bird. 2013. Expectations, outcomes, and challenges of modern code review. In Proceedings of the 2013 International Conference on Software Engineering. IEEE Press, 712-721.
[2]
Earl T Barr, Christian Bird, Peter C Rigby, Abram Hindle, Daniel M German, and Premkumar Devanbu. 2012. Cohesive and isolated development with branches. In International Conference on Fundamental Approaches to Software Engineering. Springer, 316-331.
[3]
Leo Breiman. 2001. Random forests. Machine learning 45, 1 (2001), 5-32.
[4]
Daniel Alencar da Costa, Shane McIntosh, Uira Kulesza, and Ahmed E Hassan. 2016. The Impact of Switching to a Rapid Release Cycle on the Integration Delay of Addressed Issues: An Empirical Study of the Mozilla Firefox Project. In 2016 IEEE/ACM 13th Working Conference on Mining Software Repositories (MSR). IEEE, 374-385.
[5]
Manoel Limeira de Lima Junior, Daricelio Moreira Soares, Alexandre Plastino, and Leonardo Murta. 2015. Developers assignment for analyzing pull requests. In Proceedings of the 30th Annual ACM Symposium on Applied Computing. ACM, 1567-1572.
[6]
Michael E Fagan. 1999. Design and code inspections to reduce errors in program development. IBM Systems Journal 38, 2/3 (1999), 258.
[7]
Yoav Goldberg and Omer Levy. 2014. word2vec Explained: deriving Mikolov et al.'s negative-sampling word-embedding method. arXiv preprint arXiv:1402.3722 (2014).
[8]
Georgios Gousios, Andy Zaidman, Margaret-Anne Storey, and Arie Van Deursen. 2015. Work practices and challenges in pull-based development: the integrator's perspective. In Proceedings of the 37th International Conference on Software Engineering-Volume 1. IEEE Press, 358-368.
[9]
Eirini Kalliamvakou, Georgios Gousios, Kelly Blincoe, Leif Singer, Daniel M German, and Daniela Damian. 2014. The promises and perils of mining GitHub. In Proceedings of the 11th Working Conference on Mining Software Repositories. ACM, 92-101.
[10]
Han Kyul Kim, Hyunjoong Kim, and Sungzoon Cho. 2017. Bag-of-concepts: Comprehending document representation through clustering words in distributed representation. Neurocomputing 266 (2017), 336-352.
[11]
Matt Kusner, Yu Sun, Nicholas Kolkin, and Kilian Weinberger. 2015. From word embeddings to document distances. In International Conference on Machine Learning. 957-966.
[12]
Quoc Le and Tomas Mikolov. 2014. Distributed representations of sentences and documents. In International Conference on Machine Learning. 1188-1196.
[13]
Hang Li. 2011. A short introduction to learning to rank. IEICE TRANSACTIONS on Information and Systems 94, 10 (2011), 1854-1862.
[14]
Joseph Lilleberg, Yun Zhu, and Yanqing Zhang. 2015. Support vector machines and word2vec for text classification with semantic features. In 2015 IEEE 14th International Conference on Cognitive Informatics & Cognitive Computing (ICCI*CC). IEEE, 136-140.
[15]
Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S Corrado, and Jeff Dean. 2013. Distributed representations of words and phrases and their compositionality. In Advances in neural information processing systems. 3111-3119.
[16]
Haoran Niu, Iman Keivanloo, and Ying Zou. 2017. Learning to rank code examples for code search engines. Empirical Software Engineering 22, 1 (2017), 259-291.
[17]
Gustavo Pinto, Igor Steinmacher, and Marco Aurelio Gerosa. 2016. More common than you think: An in-depth study of casual contributors. In Software Analysis, Evolution, and Reengineering (SANER), 2016 IEEE 23rd International Conference on, Vol. 1. IEEE, 112-123.
[18]
Ankita Rane and Anand Kumar. 2018. Sentiment Classification System of Twitter Data for US Airline Service Analysis. In 2018 IEEE 42nd Annual Computer Software and Applications Conference (COMPSAC). IEEE, 769-773.
[19]
Shivani Rao and Avinash Kak. 2011. Retrieval from software libraries for bug localization: a comparative study of generic and composite text models. In Pro­ceedings of the 8th Working Conference on Mining Software Repositories. ACM, 43-52.
[20]
Jeanine Romano, Jeffrey D Kromrey, Jesse Coraggio, and Jeff Skowronek. 2006. Appropriate statistics for ordinal level data: Should we really be using t-test and Cohen's d for evaluating group differences on the NSSE and other surveys. In Annual Meeting of the Florida Association of Institutional Research. 1-33.
[21]
Ahmed Tamrawi, Tung Thanh Nguyen, Jafar M Al-Kofahi, and Tien N Nguyen. 2011. Fuzzy set and cache-based approach for bug triaging. In Proceedings of the 19th ACM SIGSOFT Symposium and the 13th European Conference on Foundations of Software Engineering. ACM, 365-375.
[22]
Patanamon Thongtanunam, Chakkrit Tantithamthavorn, Raula Gaikovina Kula, Norihiro Yoshida, Hajimu Iida, and Ken-ichi Matsumoto. 2015. Who should review my code? A file location-based code-reviewer recommendation approach for modern code review. In Software Analysis, Evolution and Reengineering (SANER), 2015 IEEE 22nd International Conference on. IEEE, 141-150.
[23]
Xin Xia, David Lo, Xinyu Wang, and Xiaohu Yang. 2015. Who should review this change?: Putting text and file location analyses together for more accurate recommendations. In Software Maintenance and Evolution (ICSME), 2015 IEEE International Conference on. IEEE, 261-270.
[24]
Xin Xia, David Lo, Xingen Wang, Chenyi Zhang, and Xinyu Wang. 2014. Cross-language bug localization. In Proceedings of the 22nd International Conference on Program Comprehension. ACM, 275-278.
[25]
Jifeng Xuan and Martin Monperrus. 2014. Learning to combine multiple ranking metrics for fault localization. In Software Maintenance and Evolution (ICSME), 2014 IEEE International Conference on. IEEE, 191-200.
[26]
Xin Ye, Razvan Bunescu, and Chang Liu. 2014. Learning to rank relevant files for bug reports using domain knowledge. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering. ACM, 689-699.
[27]
Yue Yu, Huaimin Wang, Gang Yin, and Charles X Ling. 2014. Who should review this pull-request: Reviewer recommendation to expedite crowd collaboration. In Software Engineering Conference (APSEC), 2014 21st Asia-Pacific, Vol. 1. IEEE, 335-342.
[28]
Yue Yu, Huaimin Wang, Gang Yin, and Tao Wang. 2016. Reviewer recommenda­tion for pull-requests in GitHub: What can we learn from code review and bug assignment? Information and Software Technology 74 (2016), 204-218.
[29]
Jerrold H Zar. 1998. Spearman rank correlation. Encyclopedia of Biostatistics (1998).
[30]
Feng Zhang, Ahmed E Hassan, Shane McIntosh, and Ying Zou. 2017. The use of summation to aggregate software metrics hinders the performance of defect prediction models. IEEE Transactions on Software Engineering 43, 5 (2017), 476-491.
[31]
Guoliang Zhao, Daniel Alencar da Costa, and Ying Zou. 2019. Improving the pull requests review process using learning-to-rank algorithms. Empirical Software Engineering (2019). https://doi.org/10.1007/s10664-019-09696-8
[32]
Jian Zhou and Hongyu Zhang. 2012. Learning to rank duplicate bug reports. In Proceedings of the 21st ACM international conference on Information and knowledge management. ACM, 852-861.



Information & Contributors

Information

Published In

CASCON '22: Proceedings of the 32nd Annual International Conference on Computer Science and Software Engineering
November 2022
251 pages

Publisher

IBM Corp.

United States


Author Tags

  1. Code reviewer
  2. Reviewer recommendation
  3. Pull request
  4. Learning-to-rank

Qualifiers

  • Research-article

Acceptance Rates

Overall Acceptance Rate 24 of 90 submissions, 27%

Contributors


Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Total Citations: 0
  • Total Downloads: 22
  • Downloads (Last 12 months): 9
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 21 Nov 2024
