Predicting Defectiveness of Software Patches

Published: 08 September 2016 | DOI: 10.1145/2961111.2962601

Abstract

Context: Software code review, as an engineering best practice, refers to the inspection of a code change in order to find possible defects and ensure change quality. Code reviews, however, do not guarantee that all defects are found. Thus, there is a risk that a defective code change in a given patch passes the review process and is submitted.
Goal: In this research, we aim to apply different machine learning algorithms to predict, at submission time, the defectiveness of a patch that has been reviewed.
Method: We built three models using three different machine learning algorithms: Logistic Regression, Naïve Bayes, and a Bayesian Network. To build the models, we considered the factors involved in the review process in terms of Product, Process, and People (3P).
Results: Our empirical results show that the Bayesian Network predicts the defectiveness of the changed code best, with 76% accuracy.
Conclusions: Predicting the defectiveness of changed code is beneficial in making patch release decisions. The Bayesian Network model outperforms the others since it captures the relationships among the factors in the review process.
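
The method above can be illustrated with a minimal sketch. The code below is not the authors' implementation: it trains two of the three classifiers (Logistic Regression and Naïve Bayes, via scikit-learn) on hypothetical 3P patch features and reports cross-validated accuracy. The feature names, data, and labels are invented for illustration, and the paper's Bayesian Network is not available in scikit-learn, so a dedicated library such as pgmpy would be needed for that part.

    # Minimal sketch (not the authors' code): compare two of the paper's three
    # classifiers on hypothetical Product/Process/People (3P) patch features.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB

    rng = np.random.default_rng(0)
    n = 500
    # Hypothetical features: churn (Product), review duration and review
    # rounds (Process), reviewer count (People). All values are synthetic.
    X = np.column_stack([
        rng.poisson(40, n),        # lines changed in the patch
        rng.exponential(2.0, n),   # review duration in days
        rng.integers(1, 5, n),     # number of review rounds
        rng.integers(1, 4, n),     # number of distinct reviewers
    ])
    # Synthetic label: larger patches are more often defective, so the models
    # have a signal to learn. 1 = patch later found defective.
    y = (X[:, 0] + 10 * rng.standard_normal(n) > 45).astype(int)

    for name, model in [("Logistic Regression", LogisticRegression(max_iter=1000)),
                        ("Naive Bayes", GaussianNB())]:
        acc = cross_val_score(model, X, y, cv=10).mean()  # mean fold accuracy
        print(f"{name}: {acc:.2f} mean 10-fold accuracy")
    # The third model, a Bayesian Network, would require a library such as
    # pgmpy, since scikit-learn does not provide one.

Running this prints a mean 10-fold accuracy for each model, mirroring how the three models in the paper could be compared on a shared feature set.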



Published In

ESEM '16: Proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement
September 2016
457 pages
ISBN:9781450344272
DOI:10.1145/2961111
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher: Association for Computing Machinery, New York, NY, United States


Author Tags

  1. Code Review Quality
  2. Code review
  3. Defect Prediction
  4. Software Patch Defectiveness

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ESEM '16

Acceptance Rates

ESEM '16 paper acceptance rate: 27 of 122 submissions (22%).
Overall acceptance rate: 130 of 594 submissions (22%).

Cited By

  • A Metric for Questions and Discussions Identifying Concerns in Software Reviews. Software, 1(3):364-380, Sep. 2022. DOI: 10.3390/software1030016
  • Predictive Models in Software Engineering: Challenges and Opportunities. ACM Transactions on Software Engineering and Methodology, 31(3):1-72, Apr. 2022. DOI: 10.1145/3503509
  • Class Change Prediction by Incorporating Community Smell: An Empirical Study. International Journal of Software Engineering and Knowledge Engineering, 32(9):1369-1388, Sep. 2022. DOI: 10.1142/S0218194022500528
  • An empirical study on the effect of community smells on bug prediction. Software Quality Journal, Feb. 2021. DOI: 10.1007/s11219-020-09538-7
