Rebasing in code review considered harmful: A large-scale empirical investigation

M Paixao, PH Maia - … conference on source code analysis and …, 2019 - ieeexplore.ieee.org
2019 19th international working conference on source code analysis …, 2019 - ieeexplore.ieee.org
Code review has been widely acknowledged as a key quality assurance process in both open-source and industrial software development. Due to the asynchronicity of the code review process, the system's codebase tends to incorporate external commits while a source code change is under review, which creates the need for rebasing operations. External commits have the potential to modify files currently under review, which causes re-work for developers and fatigue for reviewers. Since source code changes observed during code review may be due to external commits, rebasing operations may pose a severe threat to empirical studies that employ code review data. Yet, to the best of our knowledge, there is no empirical study that characterises and investigates rebasing in real-world software systems. Hence, this paper reports an empirical investigation aimed at understanding how frequently rebasing operations occur and what side effects they have on the reviewing process. To this end, we perform an in-depth, large-scale empirical investigation of the code review data of 11 software systems, covering 28,808 code reviews and 99,121 revisions. Our observations indicate that developers need to perform rebasing operations in an average of 75.35% of code reviews. In addition, our data suggest that an average of 34.21% of rebasing operations tend to tamper with the reviewing process. Finally, we propose a methodology to handle rebasing in empirical studies that employ code review data. We show how an empirical study that does not account for rebasing operations may report skewed, biased and inaccurate observations.