Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article
Open access

Verifying equivalence of database-driven applications

Published: 27 December 2017 Publication History

Abstract

This paper addresses the problem of verifying equivalence between a pair of programs that operate over databases with different schemas. This problem is particularly important in the context of web applications, which typically undergo database refactoring either for performance or maintainability reasons. While web applications should have the same externally observable behavior before and after schema migration, there are no existing tools for proving equivalence of such programs. This paper takes a first step towards solving this problem by formalizing the equivalence and refinement checking problems for database-driven applications. We also propose a proof methodology based on the notion of bisimulation invariants over relational algebra with updates and describe a technique for synthesizing such bisimulation invariants. We have implemented the proposed technique in a tool called Mediator for verifying equivalence between database-driven applications written in our intermediate language and evaluate our tool on 21 benchmarks extracted from textbooks and real-world web applications. Our results show that the proposed methodology can successfully verify 20 of these benchmarks.

Supplementary Material

WEBM File (verifyingequivalence.webm)

References

[1]
Alfred V. Aho, Yehoshua Sagiv, and Jeffrey D. Ullman. 1979. Equivalences Among Relational Expressions. SIAM J. Comput. 8, 2 (1979), 218–246.
[2]
Joseph Albert, Yannis E. Ioannidis, and Raghu Ramakrishnan. 1999. Equivalence of Keyed Relational Schemas by Conjunctive Queries. J. Comput. Syst. Sci. 58, 3 (1999), 512–534.
[3]
Scott W Ambler. 2007. Test-driven development of relational databases. Ieee Software 24, 3 (2007).
[4]
Scott W Ambler and Pramod J Sadalage. 2006. Refactoring databases: Evolutionary database design. Pearson Education.
[5]
Shay Artzi, Adam Kiezun, Julian Dolby, Frank Tip, Danny Dig, Amit Paradkar, and Michael D Ernst. 2008. Finding bugs in dynamic web applications. In Proc. of ISSTA. 261–272.
[6]
Paolo Atzeni, Giorgio Ausiello, Carlo Batini, and Marina Moscarini. 1982. Inclusion and Equivalence between Relational Database Schemata. Theor. Comput. Sci. 19 (1982), 267–285.
[7]
Thomas Ball, Todd Millstein, and Sriram K. Rajamani. 2005. Polymorphic Predicate Abstraction. ACM Trans. Program. Lang. Syst. 27, 2 (2005), 314–343.
[8]
Clark Barrett, Aaron Stump, and Cesare Tinelli. 2010. The satisfiability modulo theories library (SMT-LIB). www. SMT-LIB. org 15 (2010), 18–52.
[9]
Gilles Barthe, Juan Manuel Crespo, and César Kunz. 2011. Relational verification using product programs. In Proc. of FM. 200–214.
[10]
Gilles Barthe, Juan Manuel Crespo, and César Kunz. 2013. Beyond 2-Safety: Asymmetric Product Programs for Relational Program Verification. In Proc. of LFCS. 29–43.
[11]
Catriel Beeri, Alberto O. Mendelzon, Yehoshua Sagiv, and Jeffrey D. Ullman. 1979. Equivalence of Relational Database Schemes. In Proc. of STOC. 319–329.
[12]
Michael Benedikt, Timothy Griffin, and Leonid Libkin. 1998. Verifiable Properties of Database Transactions. Inf. Comput. 147, 1 (1998), 57–88.
[13]
Nick Benton. 2004. Simple Relational Correctness Proofs for Static Analyses and Program Transformations. In Proc. of POPL. 14–25.
[14]
Loredana Caruccio, Giuseppe Polese, and Genoveffa Tortora. 2016. Synchronization of Queries and Views Upon Schema Evolutions: A Survey. Proc. of TODS 41, 2 (2016), 9:1–9:41.
[15]
Ashok K. Chandra and Philip M. Merlin. 1977. Optimal Implementation of Conjunctive Queries in Relational Data Bases. In Proc. of STOC. 77–90.
[16]
David Chays, Saikat Dan, Phyllis G. Frankl, Filippos I. Vokolos, and Elaine J. Weyuker. 2000. A Framework for Testing Database Applications. In Proc. of ISSTA. 147–157.
[17]
Alvin Cheung, Armando Solar-Lezama, and Samuel Madden. 2013. Optimizing database-backed applications with query synthesis. In Proc. of PLDI. 3–14.
[18]
Shumo Chu, Chenglong Wang, Konstantin Weitz, and Alvin Cheung. 2017a. Cosette: An Automated Prover for SQL. In Proc. of CIDR.
[19]
Shumo Chu, Konstantin Weitz, Alvin Cheung, and Dan Suciu. 2017b. HoTTSQL: Proving query rewrites with univalent SQL semantics. In Proc. of PLDI. 510–524.
[20]
Rance Cleaveland and Matthew Hennessy. 1993. Testing equivalence as a bisimulation equivalence. Formal Aspects of Computing 5, 1 (1993), 1–20.
[21]
Sara Cohen, Werner Nutt, and Alexander Serebrenik. 1999. Rewriting Aggregate Queries Using Views. In Proc. of PODS. 155–166.
[22]
Carlo Curino, Hyun Jin Moon, Alin Deutsch, and Carlo Zaniolo. 2013. Automating the database schema evolution process. VLDB J. 22, 1 (2013), 73–98.
[23]
Satyaki Das, David Dill, and Seungjoon Park. 1999. Experience with predicate abstraction. In Proc. of CAV. 687–687.
[24]
Leonardo Mendonça de Moura and Nikolaj Bjørner. 2008. Z3: An Efficient SMT Solver. In Proc. of TACAS. 337–340.
[25]
Benjamin Delaware, Clément Pit-Claudel, Jason Gross, and Adam Chlipala. 2015. Fiat: Deductive Synthesis of Abstract Data Types in a Proof Assistant. In Proc. of POPL. 689–700.
[26]
Yuetang Deng, Phyllis Frankl, and David Chays. 2005. Testing Database Transactions with AGENDA. In Proc. of ICSE. 78–87.
[27]
Alin Deutsch, Monica Marcus, Liying Sui, Victor Vianu, and Dayou Zhou. 2005. A Verifier for Interactive, Data-Driven Web Applications. In Proc. of SIGMOD. 539–550.
[28]
Alin Deutsch, Liying Sui, and Victor Vianu. 2007. Specification and verification of data-driven Web applications. J. Comput. Syst. Sci. 73, 3 (2007), 442–474.
[29]
Michael Emmi, Rupak Majumdar, and Koushik Sen. 2007. Dynamic Test Input Generation for Database Applications. In Proc. of ISSTA. 151–162.
[30]
Ronald Fagin, Phokion G Kolaitis, Lucian Popa, and Wang-Chiew Tan. 2011. Schema mapping evolution through composition and inversion. In Schema matching and mapping. Springer, 191–222.
[31]
Stéphane Faroult and Pascal L’Hermite. 2008. Refactoring SQL applications. O’Reilly Media.
[32]
Dennis Felsing, Sarah Grebing, Vladimir Klebanov, Philipp Rümmer, and Mattias Ulbrich. 2014. Automating regression verification. In Proc. of ASE. 349–360.
[33]
Milos Gligoric and Rupak Majumdar. 2013. Model Checking Database Applications. In Proc. of TACAS. 549–564.
[34]
Benny Godlin and Ofer Strichman. 2009. Regression verification. In Proc. of DAC. 466–471.
[35]
Benny Godlin and Ofer Strichman. 2013. Regression verification: proving the equivalence of similar programs. Software Testing, Verification and Reliability 23, 3 (2013), 241–258.
[36]
Todd J. Green. 2009. Containment of conjunctive queries on annotated relations. In Proc. of ICDT. 296–309.
[37]
Chris Hawblitzel, Ming Kawaguchi, Shuvendu K. Lahiri, and Henrique Rebêlo. 2013. Towards Modularly Comparing Programs Using Automated Theorem Provers. In Proc. of CADE. 282–299.
[38]
Richard Hull. 1986. Relative Information Capacity of Simple Relational Database Schemata. SIAM J. Comput. 15, 3 (1986), 856–886.
[39]
Vasileios Koutavas and Mitchell Wand. 2006a. Bisimulations for Untyped Imperative Objects. In Proc. of ESOP. 146–161.
[40]
Vasileios Koutavas and Mitchell Wand. 2006b. Small bisimulations for reasoning about higher-order imperative programs. In Proc. of POPL. 141–152.
[41]
Sudipta Kundu, Zachary Tatlock, and Sorin Lerner. 2009. Proving optimizations correct using parameterized program equivalence. In Proc. of PLDI. 327–337.
[42]
Shuvendu K Lahiri, Chris Hawblitzel, Ming Kawaguchi, and Henrique Rebêlo. 2012. Symdiff: A language-agnostic semantic diff tool for imperative programs. In Proc. of CAV. 712–717.
[43]
Shuvendu K Lahiri, Kenneth L McMillan, Rahul Sharma, and Chris Hawblitzel. 2013. Differential assertion checking. In Proc. of ESEC/FSE. 345–355.
[44]
Shuvendu K Lahiri and Shaz Qadeer. 2009. Complexity and algorithms for monomial and clausal predicate abstraction. In Proc. of CADE. 214–229.
[45]
Renée J. Miller, Yannis E. Ioannidis, and Raghu Ramakrishnan. 1993. The Use of Information Capacity in Schema Integration and Translation. In Proc. of VLDB. 120–133.
[46]
Joseph P. Near and Daniel Jackson. 2012. Rubicon: bounded verification of web applications. In Proc. of FSE. 60.
[47]
George C. Necula. 2000. Translation validation for an optimizing compiler. In Proc. of PLDI. 83–94.
[48]
Oracle. 2005. Oracle Schema Optimization Guide. https://docs.oracle.com/cd/B14099_19/web.1012/b15901/tuning007.htm . (2005).
[49]
Suzette Person, Matthew B. Dwyer, Sebastian G. Elbaum, and Corina S. Pasareanu. 2008. Differential symbolic execution. In Proc. of FSE. 226–237.
[50]
Amir Pnueli, Michael Siegel, and Eli Singerman. 1998. Translation Validation. In Proc. of TACAS. 151–166.
[51]
Erhard Rahm and Philip A. Bernstein. 2006. An online bibliography on schema evolution. SIGMOD Record 35, 4 (2006), 30–31.
[52]
Martin Rinard. 1999. Credible Compilation. In MIT TechReport. MIT–LCS–TR–776.
[53]
Arnon Rosenthal and David S. Reiner. 1994. Tools and Transformations - Rigorous and Otherwise - for Practical Database Design. ACM Trans. Database Syst. 19, 2 (1994), 167–211.
[54]
Yehoshua Sagiv and Mihalis Yannakakis. 1980. Equivalences Among Relational Expressions with the Union and Difference Operators. J. ACM 27, 4 (1980), 633–655.
[55]
Davide Sangiorgi, Naoki Kobayashi, and Eijiro Sumii. 2007. Environmental Bisimulations for Higher-Order Languages. In Proc. of LICS. 293–302.
[56]
Davide Sangiorgi, Naoki Kobayashi, and Eijiro Sumii. 2011. Environmental bisimulations for higher-order languages. ACM Trans. Program. Lang. Syst. 33, 1 (2011), 5:1–5:69.
[57]
Marcelo Sousa and Isil Dillig. 2016. Cartesian Hoare logic for verifying k-safety properties. In Proc. of PLDI. 57–69.
[58]
Michael Stepp, Ross Tate, and Sorin Lerner. 2011. Equality-based translation validator for LLVM. In Proc. of CAV. 737–742.
[59]
Eijiro Sumii and Benjamin C. Pierce. 2004. A bisimulation for dynamic sealing. In Proc. of POPL. 161–172.
[60]
Eijiro Sumii and Benjamin C. Pierce. 2005. A bisimulation for type abstraction and recursion. In Proc. of POPL. 63–74.
[61]
Boris A Trakhtenbrot. 1950. Impossibility of an algorithm for the decision problem in finite classes. Doklady Akademii Nauk SSSR 70 (1950), 569–572.
[62]
Victor Vianu. 2009. Automatic verification of database-driven systems: a new frontier. In Proc. of ICDT. 1–13.
[63]
Joost Visser. 2008. Coupled Transformation of Schemas, Documents, Queries, and Constraints. Electr. Notes Theor. Comput. Sci. 200, 3 (2008), 3–23.
[64]
Yuepeng Wang, Isil Dillig, Shuvendu K. Lahiri, and William R. Cook. 2017. Verifying Equivalence of Database-Driven Applications. http://arxiv.org/abs/1710.07660 . (2017). arXiv: 1710.07660
[65]
Gary Wassermann, Dachuan Yu, Ajay Chander, Dinakar Dhurjati, Hiroshi Inamura, and Zhendong Su. 2008. Dynamic test input generation for web applications. In Proc. of ISSTA. 249–260.
[66]
Wikimedia. 2017. Schema changes. https://wikitech.wikimedia.org/wiki/Schema_changes . (2017).
[67]
Tim Wood, Sophia Drossopoulou, Shuvendu K. Lahiri, and Susan Eisenbach. 2017. Modular Verification of Procedure Equivalence in the Presence of Memory Allocation. In Proc. of ESOP. 937–963.
[68]
Hongseok Yang. 2007. Relational separation logic. Theoretical Computer Science 375, 1-3 (2007), 308–334.
[69]
Anna Zaks and Amir Pnueli. 2008. Covac: Compiler validation by program analysis of the cross-product. In Proc. of FM. 35–51.
[70]
Lenore D. Zuck, Amir Pnueli, and Benjamin Goldberg. 2003. VOC: A Methodology for the Translation Validation of Optimizing Compilers. J. UCS 9, 3 (2003), 223–247.

Cited By

View all
  • (2024)VeriEQL: Bounded Equivalence Verification for Complex SQL Queries with Integrity ConstraintsProceedings of the ACM on Programming Languages10.1145/36498498:OOPSLA1(1071-1099)Online publication date: 29-Apr-2024
  • (2024)Knowledge Equivalence in Digital Twins of Intelligent SystemsACM Transactions on Modeling and Computer Simulation10.1145/363530634:1(1-37)Online publication date: 14-Jan-2024
  • (2023)Predicate Pushdown for Data Science PipelinesProceedings of the ACM on Management of Data10.1145/35892811:2(1-28)Online publication date: 20-Jun-2023
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the ACM on Programming Languages
Proceedings of the ACM on Programming Languages  Volume 2, Issue POPL
January 2018
1961 pages
EISSN:2475-1421
DOI:10.1145/3177123
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 December 2017
Published in PACMPL Volume 2, Issue POPL

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Equivalence Checking
  2. Program Verification
  3. Relational Databases

Qualifiers

  • Research-article

Funding Sources

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)120
  • Downloads (Last 6 weeks)10
Reflects downloads up to 17 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)VeriEQL: Bounded Equivalence Verification for Complex SQL Queries with Integrity ConstraintsProceedings of the ACM on Programming Languages10.1145/36498498:OOPSLA1(1071-1099)Online publication date: 29-Apr-2024
  • (2024)Knowledge Equivalence in Digital Twins of Intelligent SystemsACM Transactions on Modeling and Computer Simulation10.1145/363530634:1(1-37)Online publication date: 14-Jan-2024
  • (2023)Predicate Pushdown for Data Science PipelinesProceedings of the ACM on Management of Data10.1145/35892811:2(1-28)Online publication date: 20-Jun-2023
  • (2023)On Proving the Correctness of Refactoring Class Diagrams of MDE MetamodelsACM Transactions on Software Engineering and Methodology10.1145/354954132:2(1-42)Online publication date: 5-Apr-2023
  • (2023)External Behavior of a Logic Program and Verification of RefactoringTheory and Practice of Logic Programming10.1017/S1471068423000200(1-15)Online publication date: 18-Jul-2023
  • (2022)Synthesizing axiomatizations using logic learningProceedings of the ACM on Programming Languages10.1145/35633486:OOPSLA2(1697-1725)Online publication date: 31-Oct-2022
  • (2022)Optimizing Recursive Queries with Progam SynthesisProceedings of the 2022 International Conference on Management of Data10.1145/3514221.3517827(79-93)Online publication date: 10-Jun-2022
  • (2021)Online Dynamic B-MatchingACM SIGMETRICS Performance Evaluation Review10.1145/3453953.345397648:3(99-108)Online publication date: 5-Mar-2021
  • (2021)Scooter & Sidecar: a domain-specific approach to writing secure database migrationsProceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation10.1145/3453483.3454072(710-724)Online publication date: 19-Jun-2021
  • (2021)Repairing serializability bugs in distributed database programs via automated schema refactoringProceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation10.1145/3453483.3454028(32-47)Online publication date: 19-Jun-2021
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Full Access

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media