Finding Cross-Rule Optimization Bugs in Datalog Engines

Authors: Chi Zhang, Linzhang Wang, Manuel Rigger

Proceedings of the ACM on Programming Languages, Volume 8, Issue OOPSLA1

Article No.: 98, Pages 110 - 136

https://doi.org/10.1145/3649815

Published: 29 April 2024

Abstract

Datalog is a widely used declarative logic programming language. Datalog engines apply many cross-rule optimizations, and bugs in these optimizations can cause incorrect results. To detect such optimization bugs, we propose an automated testing approach called Incremental Rule Evaluation (IRE), which jointly tackles the test oracle and test case generation problems. The core idea behind the test oracle is to compare the results of an optimized program and a program without cross-rule optimization; any difference indicates a bug in the Datalog engine. Our core insight is that, for an optimized, incrementally generated Datalog program, we can evaluate all rules individually by constructing a reference program that disables the optimizations performed across multiple rules. Incrementally generating test cases not only allows us to apply the test oracle for every newly generated rule; it also lets us ensure that every newly added rule produces a non-empty result with a given probability and avoid recomputing already-known facts. We implemented IRE as a tool named Deopt and evaluated it on four mature Datalog engines, namely Soufflé, CozoDB, μZ, and DDlog, discovering a total of 30 bugs. Of these, 13 were logic bugs, while the remainder were crash and error bugs. Deopt can detect all bugs found by queryFuzz, a state-of-the-art approach. Of the bugs identified by Deopt, queryFuzz might be unable to detect 5. Our incremental test case generation approach is efficient; for example, for test cases containing 60 rules, it produces 1.17× (for DDlog) to 31.02× (for Soufflé) as many valid test cases with non-empty results as the naive random method. We believe that the simplicity and generality of the approach will lead to its wide adoption in practice.
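The test oracle described in the abstract can be illustrated with a minimal sketch. This is not Deopt's implementation: the rule encoding, the toy `naive_engine`, the deliberately faulty `buggy_engine`, and the function `check_incrementally` are all hypothetical names introduced here. The sketch assumes positive, safe rules and that no earlier rule depends on the head of a later rule. For each newly added rule, it compares the engine's result on the whole program against a reference program that contains only that rule, with all previously derived facts supplied as input.

```python
# A rule is ((head_name, head_vars), [(body_name, body_vars), ...]).
# Facts map a relation name to a set of tuples. Illustrative encoding only.

def eval_rule(rule, facts):
    """Join the body atoms of one rule and project onto the head variables."""
    (_, head_vars), body = rule
    bindings = [{}]
    for name, args in body:
        extended = []
        for binding in bindings:
            for row in facts.get(name, set()):
                candidate = dict(binding)
                # Bind each variable, or check it agrees with an earlier binding.
                if len(row) == len(args) and all(
                    candidate.setdefault(var, val) == val
                    for var, val in zip(args, row)
                ):
                    extended.append(candidate)
        bindings = extended
    return {tuple(b[v] for v in head_vars) for b in bindings}

def naive_engine(rules, edb):
    """Toy stand-in for a Datalog engine: naive evaluation to a fixpoint."""
    facts = {name: set(rows) for name, rows in edb.items()}
    changed = True
    while changed:
        changed = False
        for rule in rules:
            head = rule[0][0]
            new = eval_rule(rule, facts) - facts.get(head, set())
            if new:
                facts.setdefault(head, set()).update(new)
                changed = True
    return facts

def buggy_engine(rules, edb):
    """Hypothetical faulty engine: drops any rule whose body reads a relation
    defined by another rule, mimicking a broken cross-rule optimization."""
    derived = {rule[0][0] for rule in rules}
    kept = [
        rule for rule in rules
        if not any(n in derived and n != rule[0][0] for n, _ in rule[1])
    ]
    return naive_engine(kept, edb)

def check_incrementally(engine, edb, rules):
    """Return the index of the first rule whose result differs between the
    full program and the single-rule reference program, or None."""
    known = {name: set(rows) for name, rows in edb.items()}
    program = []
    for index, rule in enumerate(rules):
        program.append(rule)
        head = rule[0][0]
        optimized = engine(program, edb).get(head, set())
        # Reference program: only the new rule, over already-known facts,
        # so no optimization spanning several rules can apply.
        reference = engine([rule], known).get(head, set())
        if optimized != reference:
            return index
        known[head] = set(reference)
    return None

EDB = {"edge": {(1, 2), (2, 3)}}
RULES = [
    (("hop", ("x", "z")), [("edge", ("x", "y")), ("edge", ("y", "z"))]),
    (("reach", ("x", "y")), [("edge", ("x", "y"))]),
    (("reach", ("x", "z")), [("hop", ("x", "z"))]),
]

print(check_incrementally(naive_engine, EDB, RULES))  # None: no discrepancy
print(check_incrementally(buggy_engine, EDB, RULES))  # 2: third rule exposes the bug
```

In this sketch the faulty behavior appears only when the third rule is evaluated together with the rule defining `hop`; in the reference program `hop` is plain input, so the same engine computes the expected result and the mismatch is visible.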

Supplementary Material

Auxiliary Archive (oopslaa24main-p15-p-archive.zip)

The supplementary material of Deopt, including the description of the bugs found by Deopt.

