Using program data-state scarcity to guide automatic test data generation

Mohammad Alshraideh¹,
Leonardo Bottaci² &
Basel A. Mahafzah¹

260 Accesses
18 Citations
Explore all metrics

Abstract

Finding test data to cover structural test coverage criteria such as branch coverage is largely a manual and hence expensive activity. A potential low cost alternative is to generate the required test data automatically. Search-based test data generation is one approach that has attracted recent interest. This approach is based on the definition of an evaluation or cost function that is able to discriminate between candidate test cases with respect to achieving a given test goal. The cost function is implemented by appropriate instrumentation of the program under test. The candidate test is then executed on the instrumented program. This provides an evaluation of the candidate test in terms of the “distance” between the computation achieved by the candidate test and the computation required to achieve the test goal. Providing the cost function is able to discriminate reliably between candidate tests that are close or far from covering the test goal and the goal is feasible, a search process is able to converge to a solution, i.e., a test case that satisfies the coverage goal. For some programs, however, an informative cost function is difficult to define. The operations performed by these programs are such that the cost function returns a constant value for a very wide range of inputs. A typical example of this problem arises in the instrumentation of branch predicates that depend on the value of a Boolean-valued (flag) variable although the problem is not limited to programs that contain flag variables. Although methods are known for overcoming the problems of flag variables in particular cases, the more general problem of a near constant cost function has not been tackled. This paper presents a new heuristic for directing the search when the cost function at a test goal is not able to differentiate between candidate test inputs. The heuristic directs the search toward test cases that produce rare or scarce data states. Scarce inputs for the cost function are more likely to produce new cost values. The proposed method is evaluated empirically for a number of example programs for which existing methods are inadequate.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

A New Approach for Search Space Reduction and Seeding by Analysis of the Clauses

A Framework for Guided Test Case Generation in Constraint Logic Programming

Legion: Best-First Concolic Testing (Competition Contribution)

References

Alshraideh, M., & Bottaci, L. (2006). Automatic software test data generation for string data using heuristic search with domain specific search operators. Software Testing, Verification and Reliability, 16(3), 175–203.
Article Google Scholar
Baresel, A., Harman, M., Binkley, D., & Korel, B. (2004). Evolutionary testing in the presence of loop-assigned flags: A testability transformation approach. In G. S. Avrunin & G. Rothermel (Eds.), Proceedings of 2004 ACM SIGSOFT international symposium on software testing and analysis, ISSTA 2004, (pp. 108–118). New York: ACM.
Booker, L. B. (1982). Intelligent behavior as an adaptation to the task environment. PhD thesis, Ann Arbor, MI: The University of Michigan.
Bottaci, L. (2002). Instrumenting programs with flag variables for test data search by genetic algorithm. In Proceedings of genetic and evolutionary computation conference (GECCO 2002), (pp. 1337–1342). San Mateo, CA: Morgan Kaufmann.
Bottaci, L. (2003). Predicate expression cost functions to guide evolutionary search for test data. In Proceedings of genetic and evolutionary computation conference (GECCO 2003), (pp. 2455–2464). Chicago, IL: Springer Verlag.
Bottaci, L. (2005). Use of branch cost functions to diversify the search for test data. In Proceedings of the UK software testing workshop (UKTest 2005), (pp. 151–163). UK: University of Sheffield.
Bouchachia, A. (2007). An immune genetic algorithm for software test data generation. In Proceedings of the 7th international conference on hybrid intelligent systems (HIS 2007), (pp. 84–89). Washington, DC.
Bueno, P., Paulo, M. S., Wong, W. E., & Jino, M. (2007). Improving random test sets using the diversity oriented test data generation. In Proceedings of the 2nd international workshop on random testing: co-located with the 22nd IEEE/ACM international conference on automated software engineering (ASE 2007), (pp. 10–17). New York, NY: ACM.
Chen, T. Y., Kuo, F. C., Merkel, R. G., & Tse, T. (2009). Adaptive random testing: The art of test case diversity. Journal of Systems and Software (in press), Corrected Proof:–, doi:10.1016/j.jss.2009.02.022, http://www.sciencedirect.com/science/article/B6V0N-4VRX633-1/2/50b8286d83d4352947ae21e1c6930c00
Collins, R. J., & Jefferson, D. R. (1991). Selection in massively parallel genetic. In Belew, R. K., Booker, L. B. (Eds.), Proceedings of the fourth international on genetic algorithms, (pp. 249–256). San Mateo, CA: Morgan Kaufmann.
Coward, P. D. (1991). Symbolic execution and testing. Information and Software Technique, 33(1), 53–64.
Article Google Scholar
Davis, L. (1991). Handbook of Genetic Algorithms. NY: International Thomson Computer Press.
Google Scholar
de Halleux, J., & Tillmann, N. (2008). Parameterized unit testing with pex. In B. Beckert & R. Hähnle (Eds.), Tests and proofs, proceedings of the second international conference, TAP 2008, (Vol. 4966, pp. 171–181). Springer, Lecture Notes in Computer Science.
De Jong, L. A. (1975). An analysis of the behavior of a class of genetic adaptive systems. PhD thesis, Ann Arbor, MI: The University of Michigan.
De Jong, E. D., Watson, R., & Pollack, J. (2001). Reducing bloat and promoting diversity using multi-objective methods. In L. Spector et al. (Eds.), Proceedings of the genetic and evolutionary computation conference, (pp. 11–18). San Francisco, CA: Morgan Kaufmann.
Deb, K., & Goldberg, D. E. (1989). An investigation of niche and species formation in genetic function optimization. In Proceedings of the third international conference on genetic algorithms, (pp. 42–50). San Mateo, CA: Morgan Kaufmann.
Duran, J. E., & Ntafos, S. C. (1984). An evaluation of random testing. IEEE Transactions on Software Engineering, 10(4), 438–444.
Article Google Scholar
Feldt, R., Torkar, R., Gorschek, T., & Afzal, W. (2008). Searching for cognitively diverse tests: Towards universal test diversity metrics. In IEEE international conference on software testing verification and validation workshop, ICSTW ’08, (pp. 178–186).
Ferguson, R., & Korel, B. (1996). The chaining approach for software test data generation. ACM Transactions on Software Engineering and Methodology, 5(1), 63–86.
Article Google Scholar
Goldberg, D., & Richardson, J. (1987). Genetic algorithms with sharing for multimodal function optimization. In Proceedings of the second international conference on genetic algorithms, (pp. 148–154). San Francisco, CA: Morgan Kaufmann.
Goldberg, D. E. (1989). Genetic Algorithms in Search Optimization and Machine Learning. Reading, MA: Addison Wesley.
MATH Google Scholar
Harman, M., Hu, L., Hierons, R., Baresel, A., & Sthamer, H. (2002). Improving evolutionary testing by flag removal. In Proceedings of genetic and evolutionary computation conference GECCO 2002, New York, USA, pp. 1351–1358.
Harman, M., Hu, L., Hierons, R., Wegener, J., Baresel, A., Sthamer, H., & Roper, M. (2004). Testability transformation. IEEE Transactions on Software Engineering, 30(1), 3–16.
Article Google Scholar
Jones, B.F., Sthamer, H., & Eyres, D. (1996). Automatic structural testing using genetic algorithms. Software Engineering Journal, 11(5), 299–306.
Article Google Scholar
Keijzer, M. (1996). Efficiently representing populations in genetic programming. In Angeline, P., Kinnear, K. Jr. (Eds.), Advances in Genetic Programming, (Vol. 2, pp. 259–278), Cambridge: MIT Press.
Google Scholar
Korel, B. (1990). Automated software test data generation. IEEE Transactions on Software Engineering, 16(8), 870–879.
Article Google Scholar
Korel, B., Harman, M., Chung, S., Apirukvorapinit, P., Gupta, R., & Zhang, Q. (2005). Data dependence based testability transformation in automated test generation. In 16th IEEE international symposium on software reliability engineering (pp. 245–254)
Koza, J. (1992). Genetic Programming: On the Programming of Computers by Means of Natural Selection. Cambridge, MA: MIT Press.
MATH Google Scholar
Langdon, W. B. (1998). Genetic programming and data structures: Genetic programming + data structures = automatic programming!, genetic programming, (Vol. 1). Boston: Kluwer.
MacQueen, J. B. (1967). Some methods for classification and analysis of multivariate observations. In Proceedings of fifth berkeley symposium on mathematical statistics and probability, (Vol. 1, pp. 281–297). Berkeley: University of California Press.
Malaiya, Y. (1995). Antirandom testing: Getting the most out of black-box testing. In Proceeding of international symposium on software reliability engineering, (pp. 86–95). citeseer.ist.psu.edu/article/malaiya95antirandom.html.
Mattiussi, C., Waibel, M., & Floreano, D. (2004). Measures of diversity for populations and distances between individuals with highly reorganizable genomes. Evolutionary Computation, 12(4), 495–515.
Article Google Scholar
McMinn, P. (2004). Search-based software test data generation: A survey. Software Testing, Verification and Reliability, 14(2), 105–156.
Article Google Scholar
McMinn, P., Binkley, D., Harman, M., & Tonella, P. (2006). The species per path approach to searchbased test data generation. In proceedings of the international symposium on software testing and analysis (ISSTA 2006), (pp. 13–24). Portland, ME.
O’Reilly, U. M. (1997). Using a distance metric on genetic programs to understand genetic operators. In IEEE international conference on systems, man, and cybernetics, computational cybernetics and simulation, 5, 4092–4097.
Roper, M. (1997). Computer aided software testing using genetic algorithms. In Proceeding of the 10th international software quality week conference, San Francisco, USA, pp. 27–30.
Rosca, J. (1995). Entropy-driven adaptive representation. In Rosca, J. (Ed.), Proceedings of the workshop on genetic programming: From theory to real-world applications, Tahoe City, CA, pp. 23–32.
Voas, J., & Miller, K. (1995). Software testability: The new verification. IEEE Software, 12(3), 17–28.
Article Google Scholar
Wappler, S., Baresel, A., & Wegener, J. (2007). Improving evolutionary testing in the presence of function-assigned flags. In Proceedings of the testing: Academic and industrial conference practice and research techniques, (pp. 23–34). IEEE Digital Library.
Wegener, J., Baresel, A., & Sthamer, H. (2001). Evolutionary test environment for automatic structural testing. Information and Software Technology Special Issue on Software Engineering using Metaheuristic Innovative Algorithms, 43(14), 841–854.
Google Scholar
Wu, S. H., Jandhyala, S., Malaiya, Y. K., & Jayasumana, A. P. (2008). Antirandom testing: A distance-based approach. VLSI Design, 2008(2), 1–9.
Article MathSciNet Google Scholar

Download references

Acknowledgments

The authors would like to express their gratitude to the anonymous referees for their valuable comments and suggestions for improving the paper.

Author information

Authors and Affiliations

Department of Computer Science, University of Jordan, Amman, 11942, Jordan
Mohammad Alshraideh & Basel A. Mahafzah
Department of Computer Science, The University of Hull, Hull, HU6 7RX, UK
Leonardo Bottaci

Authors

Mohammad Alshraideh
View author publications
You can also search for this author in PubMed Google Scholar
Leonardo Bottaci
View author publications
You can also search for this author in PubMed Google Scholar
Basel A. Mahafzah
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Mohammad Alshraideh.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Alshraideh, M., Bottaci, L. & Mahafzah, B.A. Using program data-state scarcity to guide automatic test data generation. Software Qual J 18, 109–144 (2010). https://doi.org/10.1007/s11219-009-9083-x

Download citation

Published: 16 September 2009
Issue Date: March 2010
DOI: https://doi.org/10.1007/s11219-009-9083-x

Using program data-state scarcity to guide automatic test data generation

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

A New Approach for Search Space Reduction and Seeding by Analysis of the Clauses

A Framework for Guided Test Case Generation in Constraint Logic Programming

Legion: Best-First Concolic Testing (Competition Contribution)

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Keywords

Subscribe and save

Buy Now

Navigation

Using program data-state scarcity to guide automatic test data generation

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

A New Approach for Search Space Reduction and Seeding by Analysis of the Clauses

A Framework for Guided Test Case Generation in Constraint Logic Programming

Legion: Best-First Concolic Testing (Competition Contribution)

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now

Search

Navigation