
Distilling Quality Enhancing Comments From Code Reviews to Underpin Reviewer Recommendation

Published: 01 July 2024

Abstract

Code review is an important practice in software development. One of its main objectives is to assure code quality. For this purpose, the efficacy of code review depends on the credibility of reviewers, i.e., reviewers who have demonstrated strong evidence of previously making quality-enhancing comments are more credible than those who have not. Code reviewer recommendation (CRR) is designed to recommend suitable reviewers for a specific objective, in this context the assurance of code quality. Its performance is sensitive to how relevant its training dataset is to this objective: the dataset comprises all reviewers' historical review comments, yet often contains a plethora of comments that are irrelevant to enhancing code quality. Furthermore, recommendation accuracy has been adopted as the sole metric for evaluating a recommender's performance, which is inadequate because it does not take reviewers' relevant credibility into consideration. These two issues form the ground truth problem in CRR, as both originate from the relevance of the dataset used to train and evaluate CRR algorithms. To tackle this problem, we first propose the concept of Quality-Enhancing Review Comments (QERC), which covers three types of comments: change-triggering inline comments, informative general comments, and approve-to-merge comments. We then devise a set of algorithms and procedures to obtain a distilled dataset by applying QERC to the original dataset. We finally introduce a new metric, reviewer's credibility for quality enhancement (RCQE), as a complement to recommendation accuracy for evaluating the performance of recommenders. To validate the proposed QERC-based approach to CRR, we conduct empirical studies using real data from seven projects containing over 82K pull requests and 346K review comments. Results show that: (a) QERC can effectively address the ground truth problem by distilling quality-enhancing comments from the dataset of original code reviews, (b) QERC can help recommenders find highly credible reviewers at a slight cost in recommendation accuracy, and (c) even "wrong" recommendations made with the distilled dataset are likely to be more credible than those made with the original dataset.
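To make the distillation idea concrete, the sketch below shows one way a QERC-style filter and an RCQE-style score could be organized. It is a minimal illustration only, not the paper's published algorithms: the ReviewComment fields, the triggered_change flag, the INFORMATIVE_HINTS keyword heuristic, and the rcqe definition are all assumptions chosen to show the shape of the approach.

```python
from dataclasses import dataclass

# Hypothetical sketch of QERC-style distillation: keep only the three
# comment types the paper names (change-triggering inline comments,
# informative general comments, approve-to-merge comments).
# All heuristics below are illustrative assumptions, not the paper's rules.

@dataclass
class ReviewComment:
    reviewer: str
    kind: str               # "inline", "general", or "approval" (assumed labels)
    text: str
    triggered_change: bool  # did a follow-up commit touch the commented code?
    pr_merged: bool         # was the pull request eventually merged?

# Assumed keyword heuristic for "informative" general comments.
INFORMATIVE_HINTS = ("because", "should", "consider", "instead", "why")

def is_quality_enhancing(c: ReviewComment) -> bool:
    if c.kind == "inline":
        return c.triggered_change   # change-triggering inline comment
    if c.kind == "general":
        # informative general comment (keyword proxy, assumed)
        return any(h in c.text.lower() for h in INFORMATIVE_HINTS)
    if c.kind == "approval":
        return c.pr_merged          # approve-to-merge comment
    return False

def distill(comments: list[ReviewComment]) -> list[ReviewComment]:
    """Filter a raw review dataset down to quality-enhancing comments."""
    return [c for c in comments if is_quality_enhancing(c)]

def rcqe(reviewer: str, raw: list[ReviewComment]) -> float:
    """Toy credibility score: the reviewer's share of quality-enhancing
    comments among all of their comments (assumed definition)."""
    mine = [c for c in raw if c.reviewer == reviewer]
    return len(distill(mine)) / len(mine) if mine else 0.0
```

Under these assumptions, a recommender would be trained on distill(dataset) rather than the raw comment history, and evaluated with both recommendation accuracy and an RCQE-style credibility score.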



Information & Contributors

Information

Published In

cover image IEEE Transactions on Software Engineering
IEEE Transactions on Software Engineering  Volume 50, Issue 7
July 2024
309 pages

Publisher

IEEE Press

Publication History

Published: 01 July 2024

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • 0
    Total Citations
  • 0
    Total Downloads
  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 28 Feb 2025

Other Metrics

Citations

View Options

View options

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media