DOI: 10.1145/3691620.3695541
Open access

Reducing Test Runtime by Transforming Test Fixtures

Published: 27 October 2024

Abstract

Software testing is a fundamental part of software development, but the cost of running tests can be high. Existing approaches to speeding up testing, such as test-suite reduction or regression test selection, run only a subset of tests from the full test suite, but they risk skipping key tests that are needed to detect faults in the code.

We propose a new technique that transforms test code to speed up test runtime while still running all the tests. The insight is that testing frameworks such as JUnit for Java allow developers to define test fixtures, i.e., methods that run before or after every test to set up or tear down test state, but these fixtures need not run before and after each individual test. It may be sufficient to do the setup and teardown once, at the beginning and end, respectively, of all tests. Our technique, TestBoost, transforms the test fixtures within a test class to instead run once before/after all tests in the test class, thereby running the fixtures less frequently while still running all tests and ensuring that they all still pass, as they did before. Our evaluation on 697 test classes from 34 projects shows that, for the cases with a statistically significant positive improvement, we reduce the runtime per test class by 28.39% on average. Using these transformed test classes results in an average 18.24% reduction in test-suite runtime. We find that the coverage of the transformed test classes changes by <1%, and of the 15 pull requests we submitted, 9 have already been merged.
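To make the fixture transformation concrete, below is a minimal before/after sketch for a JUnit 4 test class. The class name, fixture contents, and tests are hypothetical illustrations, not code from the paper or the TestBoost tool.

import static org.junit.Assert.assertTrue;

import java.util.ArrayList;
import java.util.List;

import org.junit.BeforeClass;
import org.junit.Test;

public class ExampleTest {
    // In JUnit 4, a @BeforeClass fixture must be a public static method operating
    // on static state, so the fixture field here is static.
    private static List<String> fixture;

    // Original (per-test) fixture:
    //   @Before
    //   public void setUp() { fixture = new ArrayList<>(); fixture.add("x"); }
    // re-runs before every @Test method. The transformed fixture below runs the
    // same setup once for the whole class, so the (possibly expensive) setup cost
    // is paid only once while every test still executes.
    @BeforeClass
    public static void setUp() {
        fixture = new ArrayList<>();
        fixture.add("x");
    }

    @Test
    public void testContains() {
        assertTrue(fixture.contains("x"));
    }

    @Test
    public void testNotEmpty() {
        assertTrue(!fixture.isEmpty());
    }
}

A transformation like this is only behavior-preserving when no test mutates the shared fixture state in a way that later tests depend on, which is why, as described above, a transformed test class is only useful if all of its tests still pass as they did before.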




Published In

ASE '24: Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering
October 2024
2587 pages
ISBN: 9798400712487
DOI: 10.1145/3691620
This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 October 2024

Author Tags

  1. regression testing
  2. test fixtures
  3. testing speedup

Qualifiers

  • Research-article

Conference

ASE '24

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%

