The use of mutation in testing experiments and its sensitivity to external threats

DOI: 10.1145/2001420.2001461
Published: 17 July 2011

Abstract

Mutation analysis has emerged as a standard approach for the empirical assessment of testing techniques: test practitioners judge the cost-effectiveness of testing strategies by the number of mutants the techniques detect. Although fundamental to the rigor of empirical software testing, the use of mutants in the absence of real-world faults has raised the concern of whether mutants and real faults exhibit similar properties. This paper revisits this important concern and reports findings on mutants and on whether these synthetic faults can predict the fault-detection ability of test suites. The results of the controlled experiments conducted in this paper show that mutation, when used in testing experiments, is highly sensitive to external threats caused by several influential factors, including the mutation operators applied, the test suite size, and the programming language. The paper raises awareness about the use of mutation in testing experiments and suggests that any interpretation or generalization of experimental findings based on mutation should be justified with respect to the influential factors involved.
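
For readers unfamiliar with the underlying measure, the following minimal Python sketch shows how a mutation score is computed. The function under test, the hand-written mutants, and the toy test suite are all invented for this illustration; real experiments generate mutants automatically with operator-based tools such as Proteum (for C) or MuJava (for Java).

    # Minimal sketch of mutation analysis. Mutants are hand-written here
    # purely for illustration; tools like Proteum and MuJava derive them
    # automatically by applying mutation operators to the source code.

    def original(a, b):
        # Function under test: absolute difference of two integers.
        return a - b if a > b else b - a

    # Each mutant is the original with one small syntactic change,
    # mimicking common mutation operators.
    def mutant_ror(a, b):    # relational operator replacement: '>' -> '>='
        return a - b if a >= b else b - a

    def mutant_aor(a, b):    # arithmetic operator replacement: '-' -> '+'
        return a + b if a > b else b - a

    def mutant_const(a, b):  # constant perturbation: result off by one
        return (a - b if a > b else b - a) + 1

    MUTANTS = [mutant_ror, mutant_aor, mutant_const]

    # A test suite: (inputs, expected output) pairs.
    TEST_SUITE = [((5, 3), 2), ((3, 5), 2), ((4, 4), 0)]

    def kills(mutant):
        # A mutant is "killed" if at least one test observes a wrong output.
        return any(mutant(*args) != expected for args, expected in TEST_SUITE)

    killed = sum(kills(m) for m in MUTANTS)
    print(f"mutation score: {killed}/{len(MUTANTS)} = {killed / len(MUTANTS):.2f}")

Here the suite kills two of the three mutants, for a score of 0.67. Note that mutant_ror is semantically equivalent to the original, since both branches return 0 when a equals b, so no test suite can ever kill it; such equivalent mutants are one reason reported mutation scores rarely reach 100 percent.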

References

[1] Docjar. http://www.docjar.com.
[2] H. Agrawal, R. DeMillo, B. Hathaway, W. Hsu, E. Krauser, R. Martin, A. Mathur, and E. Spafford. Design of mutant operators for the C programming language. Technical Report SERC-TR-41-P, Department of Computer Science, Purdue University, Lafayette, Indiana, April 2006.
[3] J. H. Andrews, L. C. Briand, and Y. Labiche. Is mutation an appropriate tool for testing experiments? In International Conference on Software Engineering (ICSE), pages 402--411, 2005.
[4] J. H. Andrews, L. C. Briand, Y. Labiche, and A. S. Namin. Using mutation analysis for assessing and comparing testing coverage criteria. IEEE Transactions on Software Engineering, 32(8):608--624, 2006.
[5] J. S. Bradbury, J. R. Cordy, and J. Dingel. An empirical framework for comparing effectiveness of testing and property-based formal analysis. In ACM SIGPLAN/SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE), pages 2--5, 2005.
[6] L. C. Briand, Y. Labiche, and M. M. Sówka. Automated, contract-based user testing of commercial-off-the-shelf components. In International Conference on Software Engineering (ICSE), pages 92--101, 2006.
[7] M. Delamaro and J. Maldonado. A tool for the assessment of test adequacy for C programs. In Proceedings of the Conference on Performability in Computing Systems (PCS 96), pages 79--95, New Brunswick, NJ, July 1996.
[8] H. Do and G. Rothermel. A controlled experiment assessing test case prioritization techniques via mutation faults. In IEEE International Conference on Software Maintenance (ICSM), pages 411--420, 2005.
[9] H. Do and G. Rothermel. On the use of mutation faults in empirical assessments of test case prioritization techniques. IEEE Transactions on Software Engineering, 32:733--752, 2006.
[10] R. A. Fisher. The Design of Experiments. MacMillan, 9th edition, 1971.
[11] P. G. Frankl and S. N. Weiss. An experimental comparison of the effectiveness of the all-uses and all-edges adequacy criteria. In Symposium on Testing, Analysis, and Verification, pages 154--164, 1991.
[12] G. Fraser and A. Zeller. Mutation-driven generation of unit tests and oracles. In International Symposium on Software Testing and Analysis (ISSTA), pages 147--158, 2010.
[13] J. Guilford. Fundamental Statistics in Psychology and Education. McGraw-Hill, New York, 1956.
[14] Y.-S. Ma, J. Offutt, and Y.-R. Kwon. MuJava: a mutation system for Java. In International Conference on Software Engineering (ICSE), pages 827--830, New York, NY, USA, 2006. ACM.
[15] J. Mayer and C. Schneckenburger. An empirical analysis and comparison of random testing techniques. In International Symposium on Empirical Software Engineering (ISESE), pages 105--114, 2006.
[16] C. Murphy, K. Shen, and G. E. Kaiser. Automatic system testing of programs without test oracles. In International Symposium on Software Testing and Analysis (ISSTA), pages 189--200, 2009.
[17] A. S. Namin and J. H. Andrews. The influence of size and coverage on test suite effectiveness. In International Symposium on Software Testing and Analysis (ISSTA), pages 57--68, 2009.
[18] A. S. Namin, J. H. Andrews, and D. J. Murdoch. Sufficient mutation operators for measuring test effectiveness. In International Conference on Software Engineering (ICSE), pages 351--360, 2008.
[19] A. Offutt, A. Lee, G. Rothermel, R. Untch, and C. Zapf. An experimental determination of sufficient mutation operators. ACM Transactions on Software Engineering and Methodology, 5(2):99--118, 1996.
[20] J. Offutt, Y.-S. Ma, and Y. R. Kwon. The class-level mutants of muJava. In International Workshop on Automation of Software Test (AST), pages 78--84, 2006.
[21] A. Pretschner, T. Mouelhi, and Y. L. Traon. Model-based tests for access control policies. In International Conference on Software Testing (ICST), pages 338--347, 2008.
[22] M. J. Rutherford, A. Carzaniga, and A. L. Wolf. Evaluating test suites and adequacy criteria using simulation-based models of distributed systems. IEEE Transactions on Software Engineering, 34(4):452--470, 2008.
[23] S. Sawilowsky and R. Blair. A more realistic look at the robustness and type II error properties of the t test to departures from population normality. Psychological Bulletin, 111:353--360, 1992.
[24] K. R. Walcott, M. L. Soffa, G. M. Kapfhammer, and R. S. Roos. Time-aware test suite prioritization. In International Symposium on Software Testing and Analysis (ISSTA), pages 1--12, 2006.
[25] W. E. Wong, V. Debroy, and B. Choi. A family of code coverage-based heuristics for effective fault localization. Journal of Systems and Software, 83(2):188--208, 2010.
[26] Q. Xie and A. M. Memon. Using a pilot study to derive a GUI model for automated testing. ACM Transactions on Software Engineering and Methodology, 18(2), 2008.
[27] T. Xie. Augmenting automatically generated unit-test suites with regression oracle checking. In European Conference on Object-Oriented Programming (ECOOP), pages 380--403, 2006.
[28] L. Zhang, S.-S. Hou, C. Guo, T. Xie, and H. Mei. Time-aware test-case prioritization using integer linear programming. In International Symposium on Software Testing and Analysis (ISSTA), pages 213--224, 2009.
[29] L. Zhang, S.-S. Hou, J.-J. Hu, T. Xie, and H. Mei. Is operator-based mutant selection superior to random mutant selection? In International Conference on Software Engineering (ICSE), pages 435--444, 2010.

Reviews

Andrew Brooks

A 2005 study [1] (Andrews et al., reference [3] above) reported the results of an experiment suggesting that software mutants exhibit properties similar to those of real faults; this experiment spurred the growth in popularity of mutation testing. The current work is a general replication of that 2005 study. The Proteum tool, with 108 mutation operators, was applied to the C programs and test cases used in the 2005 study. Figure 1(c) shows that hand-seeded faults were harder to detect than mutants, in agreement with the 2005 study. Figure 1(f), however, shows that detecting mutants tended to be harder than detecting real faults, in disagreement with the 2005 study. Experimental materials for Java programs were specially developed for the general replication. Figures 3(a) and 3(b) show that, for Java, mutants were harder to detect than hand-seeded faults, in disagreement with the now-replicated findings for C programs. The authors rightly conclude that there is a need for caution when interpreting and generalizing findings from mutation testing, and that extensive empirical studies are required to fully unravel the relationships between real faults, hand-seeded faults, and mutants. Table 6 reveals that three of the four Java programs had maximum mutation scores of less than 30 percent; some researchers might consider the test suites rather weak and unrepresentative of mutation testing in practice. Despite this criticism, this paper is strongly recommended to those researching software testing and those applying mutation testing as a technique in software quality assurance.

Online Computing Reviews Service
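
The kind of comparison the review describes can be sketched in a few lines. The following Python fragment is a hypothetical setup, not the authors' actual scripts: the detection sets are randomly generated stand-ins, whereas in the real experiments each set would record which tests from the pool detect a given mutant, hand-seeded fault, or real fault. It estimates mean detection ratios for randomly sampled test suites of fixed size, which is how test suite size enters as an influential factor.

    # Hypothetical sketch of the suite-sampling analysis, not the authors'
    # actual scripts. Detection sets are randomly generated stand-ins; in a
    # real study, each set lists the pool tests that detect a given fault.
    import random

    random.seed(0)
    NUM_TESTS = 100  # size of the test pool

    # 50 mutants and 10 (seeded or real) faults, each detected by some tests.
    mutant_sets = [set(random.sample(range(NUM_TESTS), random.randint(1, 30)))
                   for _ in range(50)]
    fault_sets = [set(random.sample(range(NUM_TESTS), random.randint(1, 15)))
                  for _ in range(10)]

    def detection_ratio(suite, detect_sets):
        # Fraction of mutants/faults detected by at least one test in the suite.
        return sum(bool(suite & s) for s in detect_sets) / len(detect_sets)

    def mean_ratio(detect_sets, suite_size, trials=1000):
        # Average detection ratio over many randomly sampled suites of one size.
        total = 0.0
        for _ in range(trials):
            suite = set(random.sample(range(NUM_TESTS), suite_size))
            total += detection_ratio(suite, detect_sets)
        return total / trials

    for size in (5, 10, 20):
        print(f"suite size {size:2d}: "
              f"mutation score {mean_ratio(mutant_sets, size):.2f} vs. "
              f"fault detection {mean_ratio(fault_sets, size):.2f}")

Under this design, whether mutants come out harder or easier to detect than seeded or real faults depends entirely on the detection sets, which is why results like those in Figures 1 and 3 can diverge across programs, languages, and operator sets.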

Published In

ISSTA '11: Proceedings of the 2011 International Symposium on Software Testing and Analysis
July 2011, 394 pages
ISBN: 9781450305624
DOI: 10.1145/2001420

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

  1. experimental design
  2. hand-seeded faults
  3. mutants
  4. mutation testing
  5. real faults
  6. statistical analysis

Qualifiers

  • Research-article

Conference

ISSTA '11

Acceptance Rates

Overall Acceptance Rate 58 of 213 submissions, 27%

Cited By

  • (2024) Mutation Coverage is not Strongly Correlated with Mutation Coverage. Proceedings of the 5th ACM/IEEE International Conference on Automation of Software Test (AST 2024), pages 1-11. DOI: 10.1145/3644032.3644442
  • (2024) Detecting Faults vs. Revealing Failures: Exploring the Missing Link. 2024 IEEE 24th International Conference on Software Quality, Reliability and Security (QRS), pages 115-126. DOI: 10.1109/QRS62785.2024.00021
  • (2024) Subsumption, correctness and relative correctness: Implications for software testing. Science of Computer Programming, article 103177. DOI: 10.1016/j.scico.2024.103177
  • (2024) A new perspective on the competent programmer hypothesis through the reproduction of real faults with repeated mutations. Software Testing, Verification and Reliability. DOI: 10.1002/stvr.1874
  • (2023) Three Forms of Mutant Subsumption: Basic, Strict and Broad. Software Technologies, pages 122-144. DOI: 10.1007/978-3-031-37231-5_6
  • (2023) Mutation-based data augmentation for software defect prediction. Journal of Software: Evolution and Process. DOI: 10.1002/smr.2634
  • (2022) Prioritizing mutants to guide mutation testing. Proceedings of the 44th International Conference on Software Engineering, pages 1743-1754. DOI: 10.1145/3510003.3510187
  • (2022) Inference and test generation using program invariants in chemical reaction networks. Proceedings of the 44th International Conference on Software Engineering, pages 1193-1205. DOI: 10.1145/3510003.3510176
  • (2022) The ratio of equivalent mutants. Journal of Systems and Software, 181. DOI: 10.1016/j.jss.2021.111039
  • (2021) On Understanding Contextual Changes of Failures. 2021 IEEE 21st International Conference on Software Quality, Reliability and Security (QRS), pages 1036-1047. DOI: 10.1109/QRS54544.2021.00112
