DOI: 10.1145/3324884.3416612

Defect prediction guided search-based software testing

Published: 27 January 2021

Abstract

Today, most automated test generators, such as search-based software testing (SBST) techniques, focus on achieving high code coverage. However, high code coverage is not sufficient to maximise the number of bugs found, especially when given a limited testing budget. In this paper, we propose an automated test generation technique that is also guided by the estimated degree of defectiveness of the source code. Parts of the code that are likely to be more defective receive more testing budget than the less defective parts. To measure the degree of defectiveness, we leverage Schwa, a notable defect prediction technique.
We implement our approach in EvoSuite, a state-of-the-art SBST tool for Java. Our experiments on the Defects4J benchmark demonstrate the improved efficiency of defect prediction guided test generation and confirm our hypothesis that spending more of the time budget on likely defective parts increases the number of bugs found within the same overall budget.
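The core idea in the abstract can be sketched as follows. This is a hypothetical illustration, not the paper's actual implementation: Schwa's real model combines revision, fix, and author features, and the logistic time-weighted risk (TWR) formula and constants below follow the Google bug-prediction approach that Schwa builds on; the class names, the per-class floor, and the budget figure are assumptions made for the example.

```python
import math

def time_weighted_risk(normalized_commit_ages):
    """Sum a logistic weight over bug-related commits; ages are
    normalized to [0, 1], where 1.0 is the most recent commit, so
    recent fixes contribute far more risk than old ones."""
    return sum(1.0 / (1.0 + math.exp(-12.0 * t + 12.0))
               for t in normalized_commit_ages)

def allocate_budget(total_seconds, defect_scores, floor_fraction=0.05):
    """Split the overall testing budget across classes proportionally
    to their predicted defectiveness, keeping a small per-class floor
    so low-risk classes still receive some testing time."""
    n = len(defect_scores)
    floor = floor_fraction * total_seconds / n
    remaining = total_seconds - floor * n
    total_score = sum(defect_scores.values()) or 1.0
    return {cls: floor + remaining * score / total_score
            for cls, score in defect_scores.items()}

# Example: a class with recent bug fixes vs. a rarely touched one
# (class names are placeholders, not from the paper).
scores = {
    "FrequentlyFixedClass": time_weighted_risk([0.9, 0.95, 1.0]),
    "StableClass": time_weighted_risk([0.1, 0.3]),
}
budgets = allocate_budget(600.0, scores)  # e.g. a 10-minute total budget
```

Under this sketch, the frequently fixed class absorbs almost all of the budget beyond the floor, which mirrors the paper's hypothesis that concentrating generation time on likely defective code finds more bugs within the same overall budget.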




Published In

ASE '20: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering
December 2020
1449 pages
ISBN: 9781450367684
DOI: 10.1145/3324884
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

In-Cooperation

  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. automated test generation
  2. defect prediction
  3. search-based software testing

Qualifiers

  • Research-article

Funding Sources

  • Australian Research Council

Conference

ASE '20

Acceptance Rates

Overall acceptance rate: 82 of 337 submissions (24%)

Cited By

  • (2024) A Model-Based Approach to Mobile Application Testing. International Journal of Advanced Network, Monitoring and Controls 8:4, 1-10. DOI: 10.2478/ijanmc-2023-0071. Online publication date: 16-Mar-2024
  • (2024) On the Impact of Lower Recall and Precision in Defect Prediction for Guiding Search-based Software Testing. ACM Transactions on Software Engineering and Methodology 33:6, 1-27. DOI: 10.1145/3655022. Online publication date: 27-Jun-2024
  • (2023) Commit-Based Class-Level Defect Prediction for Python Projects. IEICE Transactions on Information and Systems E106.D:2, 157-165. DOI: 10.1587/transinf.2022MPP0003. Online publication date: 1-Feb-2023
  • (2023) API-Knowledge Aware Search-Based Software Testing: Where, What, and How. Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 1320-1332. DOI: 10.1145/3611643.3616269. Online publication date: 30-Nov-2023
  • (2023) An Experimental Assessment of Using Theoretical Defect Predictors to Guide Search-Based Software Testing. IEEE Transactions on Software Engineering 49:1, 131-146. DOI: 10.1109/TSE.2022.3147008. Online publication date: 1-Jan-2023
  • (2023) Industrial applications of software defect prediction using machine learning. Information and Software Technology 159:C. DOI: 10.1016/j.infsof.2023.107192. Online publication date: 10-May-2023
  • (2023) A novel combinatorial testing approach with fuzzing strategy. Journal of Software: Evolution and Process. DOI: 10.1002/smr.2537. Online publication date: 2-Feb-2023
  • (2022) Search-based fairness testing for regression-based machine learning systems. Empirical Software Engineering 27:3. DOI: 10.1007/s10664-022-10116-7. Online publication date: 1-May-2022
  • (2021) Regression Greybox Fuzzing. Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security, 2169-2182. DOI: 10.1145/3460120.3484596. Online publication date: 12-Nov-2021
  • (2021) A Test Case Generation Method of Combinatorial Testing based on τ-way Testing with Adaptive Random Testing. 2021 IEEE International Symposium on Software Reliability Engineering Workshops (ISSREW), 83-90. DOI: 10.1109/ISSREW53611.2021.00048. Online publication date: Oct-2021
