Predicting Defectiveness of Software Patches

Published: 08 September 2016 | DOI: 10.1145/2961111.2962601

Abstract

Context: Software code review, as an engineering best practice, refers to the inspection of a code change in order to find possible defects and ensure change quality. Code reviews, however, do not guarantee that all defects are found. Thus, there is a risk that a defective code change in a given patch passes the review process and is submitted.
Goal: In this research, we aim to apply different machine learning algorithms to predict, at submission time, the defectiveness of a patch that has been reviewed.
Method: We built three models using three different machine learning algorithms: Logistic Regression, Naïve Bayes, and a Bayesian Network. To build the models, we considered the factors involved in the review process in terms of Product, Process, and People (3P).
Results: Our empirical results show that the Bayesian Network predicts the defectiveness of the changed code best, with 76% accuracy.
Conclusions: Predicting the defectiveness of changed code is beneficial in making patch release decisions. The Bayesian Network model outperforms the others since it captures the relationships among the factors in the review process.
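
The method above can be illustrated with a minimal sketch. The code below is not the authors' implementation: it trains two of the three classifiers (Logistic Regression and Naïve Bayes, via scikit-learn) on hypothetical 3P patch features and reports cross-validated accuracy. The feature names, data, and labels are invented for illustration, and the paper's Bayesian Network is not available in scikit-learn, so a dedicated library such as pgmpy would be needed for that part.

    # Minimal sketch (not the authors' code): compare two of the paper's three
    # classifiers on hypothetical Product/Process/People (3P) patch features.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB

    rng = np.random.default_rng(0)
    n = 500
    # Hypothetical features: churn (Product), review duration and review
    # rounds (Process), reviewer count (People). All values are synthetic.
    X = np.column_stack([
        rng.poisson(40, n),        # lines changed in the patch
        rng.exponential(2.0, n),   # review duration in days
        rng.integers(1, 5, n),     # number of review rounds
        rng.integers(1, 4, n),     # number of distinct reviewers
    ])
    # Synthetic label: larger patches are more often defective, so the models
    # have a signal to learn. 1 = patch later found defective.
    y = (X[:, 0] + 10 * rng.standard_normal(n) > 45).astype(int)

    for name, model in [("Logistic Regression", LogisticRegression(max_iter=1000)),
                        ("Naive Bayes", GaussianNB())]:
        acc = cross_val_score(model, X, y, cv=10).mean()  # mean fold accuracy
        print(f"{name}: {acc:.2f} mean 10-fold accuracy")
    # The third model, a Bayesian Network, would require a library such as
    # pgmpy, since scikit-learn does not provide one.

Running this prints a mean 10-fold accuracy for each model, mirroring how the three models in the paper could be compared on a shared feature set.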



Published In

ESEM '16: Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement
September 2016
457 pages
ISBN:9781450344272
DOI:10.1145/2961111
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher: Association for Computing Machinery, New York, NY, United States


Author Tags

  1. Code Review Quality
  2. Code review
  3. Defect Prediction
  4. Software Patch Defectiveness

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ESEM '16

Acceptance Rates

ESEM '16 paper acceptance rate: 27 of 122 submissions (22%).
Overall acceptance rate: 130 of 594 submissions (22%).

Cited By

  • A Metric for Questions and Discussions Identifying Concerns in Software Reviews. Software, 1(3):364-380, Sep. 2022. DOI: 10.3390/software1030016
  • Predictive Models in Software Engineering: Challenges and Opportunities. ACM Transactions on Software Engineering and Methodology, 31(3):1-72, Apr. 2022. DOI: 10.1145/3503509
  • Class Change Prediction by Incorporating Community Smell: An Empirical Study. International Journal of Software Engineering and Knowledge Engineering, 32(9):1369-1388, Sep. 2022. DOI: 10.1142/S0218194022500528
  • An empirical study on the effect of community smells on bug prediction. Software Quality Journal, Feb. 2021. DOI: 10.1007/s11219-020-09538-7
