SMT2Test: From SMT Formulas to Effective Test Cases

Published: 08 October 2024

Abstract

One of the primary challenges in software testing is generating high-quality test inputs and obtaining corresponding test oracles. This paper introduces a novel methodology that mitigates this challenge for testing program verifiers by employing SMT (Satisfiability Modulo Theories) formulas as a universal test case generator. The key idea is to transform SMT formulas into programs and link the satisfiability of each formula with the safety property of the corresponding program, so that the formula's satisfiability acts as a test oracle for the program verifier. We implemented this method as a framework named SMT2Test, which transforms SMT formulas into Dafny and C programs. To increase the framework's flexibility, we designed an intermediate representation that streamlines the transformation to other programming languages and fosters modular transformation strategies. We evaluated the effectiveness of SMT2Test by finding defects in two program verifiers: the Dafny verifier and CPAchecker. Applying SMT2Test to SMT formulas from the SMT competition and from SMT solver fuzzers, we discovered and reported 14 previously unknown defects in these program verifiers that were not found by previous methods. All of them have since been confirmed, and 6 have been fixed. These findings demonstrate the effectiveness of our method and suggest its potential for testing other programming language infrastructures.



Published In

Proceedings of the ACM on Programming Languages, Volume 8, Issue OOPSLA2
October 2024
2691 pages
EISSN: 2475-1421
DOI: 10.1145/3554319

This work is licensed under a Creative Commons Attribution 4.0 International License.

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

  1. Program Verification
  2. SMT Solving
  3. Software Testing
