Booleguru, the Propositional Polyglot (Short Paper)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 14739))

Included in the following conference series:

International Joint Conference on Automated Reasoning

1466 Accesses

Abstract

Recent approaches on verification and reasoning solve SAT and QBF encodings using state-of-the-art SMT solvers, as it “makes implementation much easier”. The ease-of-use of these solvers make SAT and QBF solvers less visible to users of solvers—who are maybe from different research communities—potentially not exploiting the power of state-of-the-art tools. In this work, we motivate the need to build bridges over the widening solver-gap and introduce Booleguru, a tool to convert between formats for logic formulas. It makes SAT and QBF solvers more accessible by using techniques known from SMT solvers, such as advanced Python interfaces like Z3Py and easily generatable languages like SMT-LIB, integrating them to our conversion tool. We then introduce a language to manipulate and combine multiple formulas, optionally applying transformations for quickly prototyping encodings. Booleguru’s advanced scripting capabilities form a programming environment specialized for Boolean logic, offering a more efficient way to develop novel problem encodings.

This work was supported by the LIT AI Lab funded by the State of Upper Austria.

You have full access to this open access chapter, Download conference paper PDF

Generating Extended Resolution Proofs with a BDD-Based SAT Solver

PyQBF: A Python Framework for Solving Quantified Boolean Formulas

Bit-Precise Reasoning via Int-Blasting

Keywords

1 Introduction

Numerous recent publications with encodings of problems into SAT and QBF do not use SAT or QBF solvers directly [6, 16, 18]. SMT solvers, often the feature-rich and popular state-of-the-art solver Z3 [8], are used instead, as it “makes implementation much simpler” [17], although no theory reasoning is involved. Z3’s programming API or the Common Lisp compatible SMT-LIB standard [3] are well documented and regarded by many as easy to use. While this ease of use leads to wide adoption and fast results, adapting encodings to use less general solving backends that are potentially more efficient for the problem at hand remains hard, e.g. switching from SMT solving to using a less general SAT solver. Researchers focus on optimizing their encodings against an SMT solver’s performance characteristics, instead of testing them against many different (also non-SMT) solvers. We consider the transformation of formulas into conjunctive normal form (CNF) required for SAT and QBF solvers to be non-trivial, especially for beginners. Seemingly bad intermediary results are discarded, as the effort required to re-encode the problem to be solvable with SAT or QBF solvers is too large to be spent during prototyping, effectively forming a solver gap. We want to bridge over this gap and reduce the friction involved with testing other solvers outside the SMT world, without extensive modifications to encoding generators. In this work, we analyze what features are required to build this bridge and develop Booleguru, a polyglot for (quantified) Boolean logic. Our tool is available under the permissive MIT license at:

https://github.com/maximaximal/booleguru

1.1 Bridging the Solver Gap with Propositional Logic

Table 1. Formula formats that are optimized (✓), usable (\(\approx \)), or unusable (✗) for encoding the respective problem

Full size table

Table 2. File formats and their capabilities.

Full size table

In order to build a bridge over the solver gap described above, we first have to identify it. When encoding problems into logic, one also has to decide which format to encode into. The encoding itself is then typically accomplished with some encoder program, which is closely tied to the problem to be encoded. Changing the output format of an encoder involves considerable effort, as encoders are typically tailored to the output formats they were designed to support. As encoders grow in complexity, communities form around them, while still relying on the original output format. The decision at the beginning of an encoder’s development then influences the newly formed community, as changing their encoder to generate a different output format involves considerable effort. We therefore identify gaps in the solving landscape opening between the different input formats of different solvers for different problems. Table 1 lists a selection of different problems with their associated dominant file formats. Table 2 lists features offered by each format. All of them offer ways to encode propositional logic in CNF, with some of them extending it with quantifiers over variables (\(\forall \), \(\exists \)). More advanced formats also allow encoding formulas in Non-CNF, i.e., formulas built from expressions and more complex logical operators. If a format supports quantifiers, they may always be added as a prefix to the formula. While every formula with embedded quantifiers can be prenexed to be in such a prenex form, some formats also allow encoding formulas in non-prenex form, extending a format’s expressiveness. Some formats also allow structure-sharing, which lets problems reference sub-expressions multiple times, without repeating them. The overlapping capabilities of different formats suggest that a conversion tool has to be able to process all of them, and to serialize complex features into less complex formats, where supported.

1.2 Related Work

Other communities already went through this bridge-building effort in order to reduce duplicated work and advance their fields. One of the biggest examples are the researchers of the machine learning community, who commonly use libraries like PyTorch [15], SciPy [21], and Pandas [20]. This allows others to use new innovations in these libraries, such as newly added learning algorithms or properties in PyTorch, or better storage formats in Pandas.

Multiple conversion tools already exist for QBF [9, 13, 19]. While all of these tools convert between specific formats, no tool tries to encompass multiple conversion or combination capabilities. Some SMT solvers are able to read multiple input formats [2, 7, 8], with all supporting SMT-LIB2, the format favored by the SMT-Competition [22]. SMT solvers do not offer to combine multiple formulas seamlessly. Booleguru fills this niche and provides such a convert & combine capability, while also enabling a unique development environment to create new formulas. It enables previously tedious comparisons between different solvers solving similar problems, like SMT and QBF solvers, as shown in Fig. 2.

2 Booleguru, the Propositional Multitool

After introducing the overlaps between file formats and solving communities, we now introduce our conversion tool: Booleguru. As shown in the architecture diagram in Fig. 1, it consists of readers, transformers, and serializers for propositional logic and extensions. Inputs in arbitrary formats may be read, modified, and then serialized in the same or a different file format. This section first describes how Booleguru stores propositional formulas in memory, so that all capabilities described in Table 2 can be provided. We then introduce features intended for working with formulas.

2.1 Representing Propositional Formulas in Memory

In order to be accepted as an efficient tool to work with propositional formulas, they have to be both fast to create and to traverse. While this is true for a tool in any problem domain, the lower expressiveness of propositional logic compared to bit-vectors or more complex theories leads to large formulas with many nodes, relying on tools with especially high throughput. For this, they are stored in a directed-acyclic-graph (DAG), enabling structure-sharing. Each node in the DAG is either a variable, a unary (negation), or a binary operation (and, or, etc.). A node is stored in a struct of 16B, which (on most architectures) exactly fits into a single cache-line. The remaining bits to fill the cache line are occupied by structural information of the expression and user-writable extra data to be stored within the DAG, which speeds up transformers that need to store temporary data on nodes. Beside the user writable data, nodes stay constant over the whole execution.

References between nodes are stored as 32bit unsigned integers, which are evaluated relatively to the nodes of a whole formula. When creating a formula, a hash table is used to check if a given node already exists, and if it does not, a new node is appended at the back of the node array. References to previous nodes are immutable, which enables cheaply appending new expressions that are composed of others. Expressions may only reference other expressions with IDs smaller than themselves, cycles imply a malstructured formula. Traversing this DAG does not involve hash lookups or pointer indirections, as every reference can be resolved directly through the child’s index in the array. Information about a sub-expression is collected during insertion of a new node, removing the requirement to scan the DAG in order to check for commonly required structural information. The 32bit references make traversal very efficient, but limits formula sizes to \(2^{32}-1\) nodes. We will provide a compile-time switch to increase reference sizes in a future version.

2.2 Parsing Formulas

We already implemented several parsers:

The readers are mostly implemented using the ANTLR parser generator [14]. While slower than hand-written parsers, the library allowed us to iterate faster during development and add more input languages with a shared base. Some parsers are hand-written, and performance critical ones are incrementally optimized to a specialized implementation. Each parser produces a reference to an expression inside a shared expression manager. Multiple expressions can then be combined into new ones, disregarding their source format. The command-line parsing is also fully done using an ANTLR parser producing an expression, in order to provide a language for composing formulas from multiple files or scripts.

2.3 Transforming Formulas

Transformers are the umbrella term for functions that work on one or more expressions passed to them. They may return new expressions built from their inputs and may be chained together. They are either implemented in native C++, Lua [12], or Fennel^{Footnote 1}, with Lua and Fennel possibly supplied at runtime by a user without re-compiling Booleguru. Several transformations are already implemented, with more being added in future releases. The list below uses the Colon Operator (:op) notation, which is transformed into Fennel function calls during CLI parsing. Each such transformation can be supplied to the Booleguru CLI, where they strongly bind (stronger than binary operators) to the expression preceding them. Generating transformers without expressions as inputs have to be written without an expression preceding them, akin to variables.

2.4 Serializing Formulas

After reading, combining, and transforming input formulas, they can be printed in different output formats. For this, several serializers have been developed, which are listed below. (Q)DIMACS relies on the provided Tseitin encoding by default, but one may use other methods that arrive on an expression which is tagged to be in CNF. This made Booleguru a helpful tool in the QBFGallery 2023^{Footnote 2}.

3 Booleguru, the Programming Environment

During development of new problem encodings or crafted formulas, there is usually a step where an encoding tool or a formula generator is written [1, 4, 11]. In our experience, these tools often rely on similar primitives:

Create a new variable.
Compose an expression based on sub-expressions.
Read from an external source or draw random data.
Write the formula in the desired output format.

The Lua, Fennel, and Python APIs offered by Booleguru abstract over multiple output formats and the concept of writing formulas into files. When writing a generator for a new formula, the Booleguru primitives offer the user to work directly with the formula’s AST, instead of having to generate syntax describing the AST in the target language. This makes generators more suited to change, as they are always composed of (nested) functions, each generating sub-expressions.

In addition to using Booleguru as a first-class execution environment for formula generators, it may be used to reduce SMT encodings developed using Z3Py to a SAT or QBF solving problem. Booleguru optionally generates a Python module that emulates the popular interface of Z3.

Lua and Fennel Interface. The Lua and Fennel interfaces are both accessible through an embedded interpreter. Lua and Fennel scripts that are already distributed as a part of Booleguru are compiled ahead of time by LuaJIT^{Footnote 3}. This makes both initialization and execution of scripts as efficient as possible in the Release build of Booleguru. User-supplied scripts are compiled at runtime using LuaJIT.

Python Interface. Additionally to the specialized Lua and Fennel interfaces, Booleguru provides the pybooleguru interface which is directly modelled after the widely used Z3Py Python library. We observed that when using Z3Py during development of a new encoding, the jump from the SMT solver Z3 to more fundamental SAT or QBF solving becomes harder. This interface is intended to be a drop-in replacement to Z3Py, enabling the conversion of a complex Z3-specific encoder written in Python into an encoder capable of producing additional output of a different format. Python scripts may be read as inputs or a script may import pybooleguru instead of z3py.

C++ Interface.

Additionally to the scripting interfaces, the C++ interface itself may also be used. Functions are provided to easily build expressions using C++, which can be useful when developing new tools in systems languages.

3.1 Command-Line Interactive Interface

The command-line interface of Booleguru is also considered a programming environment, as it seamlessly merges a grammar for propositional logic in infix notation with the Scheme-based Fennel, a programming language written in prefix-notation. Each Fennel expression has to return a logical expression, which it builds using the provided primitives. By combining transformations and working with expressions, new functionality can be implemented using the CLI alone. For example, for the formula \((a\wedge {}b)\vee {}(a\wedge {}\lnot {}b)\), both solutions can be extracted using the CLI:

3.2 Developing Booleguru

Our tool is implemented in C++, with some helpers provided to ease development. The build system is realized using CMake, which makes Booleguru easily embeddable into other projects. All modules have test suites to be run during development. They test basic features like the expression tree, but also transformers and serializers. Defining new tests is easy and running all tests is quick and parallelizable. The build system offers a fast Release build, a slower Debug build without optimization options, and a Sanitized build with enabled address- and undefined behavior sanitizers. If available, LuaJIT is used for executing Lua and Fennel scripts, otherwise the regular Lua distribution is provided as a fallback. Booleguru also natively supports to be built as a WebAssembly executable, making it runnable in browsers, including Lua and Fennel scripts.

Embedded Fuzzer. Transformers that process operations may also be fuzzed using the LLVM libFuzzer integration. Booleguru has to be built in the fuzzing mode, which creates the specialized booleguru-fuzzer binary. The command-line has to be provided via the environment variable BOOLEGURU_ARGS, with a fuzz file that serves as the injection point for fuzzed inputs. Arbitrary transformations may then be performed on the expressions, which are called with every iteration of the fuzzer. Booleguru’s embedded mutator (inspired by the mutator of Google Protocol Buffers) randomly creates arbitrary many input structures until unexpected stops are encountered. The fuzzing capability was used extensively during development of the transformers. The resulting inputs can be displayed in Limboole format using the booleguru-print-corpus tool.

4 Conclusion

We developed the propositional polyglot Booleguru, which can be used to convert between several widely used logic formats, to transform or combine formulas, and to develop new encodings more efficiently. We discussed the requirements our tool has to fulfill and introduced the implementation based on these requirements. Finally, we explained how Booleguru is used to generate new encodings, using the embedded Lua, Fennel, and Python scripting support. Booleguru already proved itself as a valuable tool during the QBFGallery 2023, for revisiting quantifier shifting in QBF [10], and other projects.

Notes

References

Amendola, G., Ricca, F., Truszczynski, M.: A generator of hard 2QBF formulas and ASP programs. In: Sixteenth International Conference on Principles of Knowledge Representation and Reasoning (2018)
Google Scholar
Barbosa, H., et al.: cvc5: a versatile and industrial-strength SMT solver. In: TACAS 2022. LNCS, vol. 13243, pp. 415–442. Springer, Cham (2022). https://doi.org/10.1007/978-3-030-99524-9_24
Chapter Google Scholar
Barret, C., Fontaine, P., Tinelli, C.: The SMT-LIB Standard: Version 2.6, May 2021. www.SMT-LIB.org
Beyersdorff, O., Pulina, L., Seidl, M., Shukla, A.: QBFFam: a tool for generating QBF families from proof complexity. In: Li, C.-M., Manyà, F. (eds.) SAT 2021. LNCS, vol. 12831, pp. 21–29. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-80223-3_3
Chapter Google Scholar
Biere, A., Heljanko, K., Wieringa, S.: AIGER 1.9 and beyond. Technical report 11/2, Institute for Formal Models and Verification, Johannes Kepler University, Altenbergerstr. 69, 4040 Linz, Austria (2011)
Google Scholar
Bonnah, E., Nguyen, L., Hoque, K.A.: Motion planning using hyperproperties for time window temporal logic. IEEE Robot. Autom. Lett. 8, 1–8 (2023)
Article Google Scholar
Brummayer, R., Biere, A.: Boolector: an efficient SMT solver for bit-vectors and arrays. In: Kowalewski, S., Philippou, A. (eds.) TACAS 2009. LNCS, vol. 5505, pp. 174–177. Springer, Heidelberg (2009). https://doi.org/10.1007/978-3-642-00768-2_16
Chapter Google Scholar
de Moura, L., Bjørner, N.: Z3: an efficient SMT solver. In: Ramakrishnan, C.R., Rehof, J. (eds.) TACAS 2008. LNCS, vol. 4963, pp. 337–340. Springer, Heidelberg (2008). https://doi.org/10.1007/978-3-540-78800-3_24
Chapter Google Scholar
Hecking-Harbusch, J., Tentrup, L.: Solving QBF by abstraction. In: Electronic Proceedings in Theoretical Computer Science (2018). https://doi.org/10.4204/eptcs.277.7
Heisinger, S., Heisinger, M., Rebola-Pardo, A., Seidl, M.: Quantifier shifting for quantified boolean formulas revisited. In: Benzmüller, C., Heule, M., Schmidt, R. (eds.) Automated Reasoning - 12th International Joint Conference, IJCAR 2024, Nancy, France, July 3-6, 2024, Proceedings. LNCS, vol. 14739, pp. 325–343, Springer, Cham (2024). https://doi.org/10.1007/978-3-031-63498-7_20
Heisinger, S., Seidl, M.: True crafted formula families for benchmarking quantified satisfiability solvers. In: Dubois, C., Kerber, M. (eds.) CICM 2023. LNCS, vol. 14101p, pp. 291–296. Springer-Verlag, Berlin, Heidelberg (2023). https://doi.org/10.1007/978-3-031-42753-4_20
Chapter Google Scholar
Ierusalimschy, R.: Programming in Lua. Roberto Ierusalimschy (2006)
Google Scholar
Jordan, C., Klieber, W., Seidl, M.: Non-CNF QBF solving with QCIR. In: AAAI Workshop: Beyond NP. AAAI Technical report, vol. WS-16-05. AAAI Press (2016)
Google Scholar
Parr, T.: The definitive ANTLR 4 reference. The Definitive ANTLR 4 Reference, pp. 1–326 (2013)
Google Scholar
Paszke, A., et al.: Pytorch: an imperative style, high-performance deep learning library. In: Advances in Neural Information Processing Systems, vol. 32, pp. 8024–8035. Curran Associates, Inc. (2019)
Google Scholar
Peham, T., Brandl, N., Kueng, R., Wille, R., Burgholzer, L.: Depth-optimal synthesis of Clifford circuits with SAT solvers (2023)
Google Scholar
Saaltink, C., Nicoletti, S.M., Volk, M., Hahn, E.M., Stoelinga, M.: Solving queries for Boolean fault tree logic via quantified sat. In: Proceedings of the 9th ACM SIGPLAN International Workshop on Formal Techniques for Safety-Critical Systems, pp. 48–59. FTSCS 2023, Association for Computing Machinery (2023). https://doi.org/10.1145/3623503.3623535
Schwarzová, T., Strejcek, J., Major, J.: Reducing acceptance marks in Emerson-lei automata by QBF solving. In: 26th International Conference on Theory and Applications of Satisfiability Testing, SAT 2023, July 4-8, Alghero, Italy. LIPIcs, vol. 271, pp. 23:1–23:20 (2023). https://doi.org/10.4230/LIPICS.SAT.2023.23
Seidl, M., Lonsing, F., Biere, A.: qbf2epr: a tool for generating EPR formulas from QBF. In: Third Workshop on Practical Aspects of Automated Reasoning, PAAR-2012, Manchester, UK, June 30–July 1, 2012. EPiC Series in Computing, vol. 21, pp. 139–148. EasyChair (2012). https://doi.org/10.29007/2B5D
The pandas development team: pandas-dev/pandas: Pandas, February 2020. https://doi.org/10.5281/zenodo.3509134
Virtanen, P., et al.: SciPy 1.0 Contributors: SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python. Nature Methods, pp. 261–272 (2020). https://doi.org/10.1038/s41592-019-0686-2
Weber, T., Conchon, S., Déharbe, D., Heizmann, M., Niemetz, A., Reger, G.: The SMT competition 2015–2018. J. Satisf. Boolean Model. Comput. 11(1), 221–259 (2019). https://doi.org/10.3233/SAT190123
Article MathSciNet Google Scholar

Download references

Author information

Authors and Affiliations

Johannes Kepler University Linz, Linz, 4040, Austria
Maximilian Heisinger, Simone Heisinger & Martina Seidl

Authors

Maximilian Heisinger
View author publications
You can also search for this author in PubMed Google Scholar
Simone Heisinger
View author publications
You can also search for this author in PubMed Google Scholar
Martina Seidl
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Maximilian Heisinger .

Editor information

Editors and Affiliations

Otto-Friedrich-Universität Bamberg, Bamberg, Germany
Christoph Benzmüller
Carnegie Mellon University, Pittsburgh, PA, USA
Marijn J.H. Heule
The University of Manchester, Manchester, UK
Renate A. Schmidt

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

Reprints and permissions

Copyright information

About this paper

Cite this paper

Heisinger, M., Heisinger, S., Seidl, M. (2024). Booleguru, the Propositional Polyglot (Short Paper). In: Benzmüller, C., Heule, M.J., Schmidt, R.A. (eds) Automated Reasoning. IJCAR 2024. Lecture Notes in Computer Science(), vol 14739. Springer, Cham. https://doi.org/10.1007/978-3-031-63498-7_19

Download citation

DOI: https://doi.org/10.1007/978-3-031-63498-7_19
Published: 01 July 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-63497-0
Online ISBN: 978-3-031-63498-7
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

Booleguru, the Propositional Polyglot (Short Paper)

Abstract

Similar content being viewed by others

Generating Extended Resolution Proofs with a BDD-Based SAT Solver