Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3510003.3510093acmconferencesArticle/Chapter ViewAbstractPublication PagesicseConference Proceedingsconference-collections
research-article
Open access

Automatic detection of performance bugs in database systems using equivalent queries

Published: 05 July 2022 Publication History

Abstract

Because modern data-intensive applications rely heavily on database systems (DBMSs), developers extensively test these systems to eliminate bugs that negatively affect functionality. Besides functional bugs, however, there is another important class of faults that negatively affect the response time of a DBMS, known as performance bugs. Despite their potential impact on end-user experience, performance bugs have received considerably less attention than functional bugs. To fill this gap, we present Amoeba, a technique and tool for automatically detecting performance bugs in DBMSs. The core idea behind Amoeba is to construct semantically equivalent query pairs, run both queries on the DBMS under test, and compare their response time. If the queries exhibit significantly different response times, that indicates the possible presence of a performance bug in the DBMS. To construct equivalent queries, we propose to use a set of structure and expression mutation rules especially targeted at uncovering performance bugs. We also introduce feedback mechanisms for improving the effectiveness and efficiency of the approach. We evaluate Amoeba on two widely-used DBMSs, namely PostgreSQL and CockroachDB, with promising results: Amoeba has so far discovered 39 potential performance bugs, among which developers have already confirmed 6 bugs and fixed 5 bugs.

References

[1]
1992. Database Language SQL. http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt.
[2]
2015. AFL: American Fuzzy Lop. http://lcamtuf.coredump.cx/afl/.
[3]
2016. OSS-Fuzz: Continuous Fuzzing for Open Source Software. https://github.com/google/oss-fuzz.
[4]
2016. SQLSmith. https://github.com/anse1/sqlsmith.
[5]
2020. Apache Calcite. https://calcite.apache.org/.
[6]
2020. CockroachDB. https://www.cockroachlabs.com/docs/releases/v20.2.0.
[7]
2020. SQLAlchemy. https://www.sqlalchemy.org/.
[8]
2021. CockroachDB Performance Bug Reports. https://github.com/cockroachdb/cockroach/issues?q=performance.
[9]
2021. PostgreSQL Performance Bug Reports. https://www.postgresql.org/search/?m=1&q=performance&1=8&d=-1&s=r.
[10]
2021. PostgreSQL Wiki. https://wiki.postgresql.org/wiki/Mainpage.
[11]
2021. SCOTT schema. https://www.orafaq.com/wiki/SCOTT.
[12]
2022. Supplementary material. https://bit.ly/3I995jL
[13]
Hardik Bati, Leo Giakoumakis, Steve Herbert, and Aleksandras Surna. 2007. A genetic approach for random testing of database systems. In VLDB. 1243--1251.
[14]
Nicolas Bruno, Surajit Chaudhuri, and Dilys Thomas. 2006. Generating Queries with Cardinality Constraints for DBMS Testing. In TKDE. 1721--1725.
[15]
Stefano Ceri and Georg Gottlob. 1985. Translating SQL into relational algebra: Optimization, semantics, and equivalence of SQL queries. In TSE. 324--345.
[16]
Bikash Chandra, Bhupesh Chawda, Biplab Kar, KV Maheshwara Reddy, Shetal Shah, and S Sudarshan. 2015. Data generation for testing and grading SQL queries. In VLDB Journal. 731--755.
[17]
CL Philip Chen and Chun-Yang Zhang. 2014. Data-intensive applications, challenges, techniques and technologies: A survey on Big Data. In Information sciences. 314--347.
[18]
Tsong Y Chen, Shing C Cheung, and Shiu Ming Yiu. 2020. Metamorphic testing: a new approach for generating next test cases. In arXiv preprint arXiv:2002.12543.
[19]
Shumo Chu, Chenglong Wang, Konstantin Weitz, and Alvin Cheung. 2017. Cosette: An Automated Prover for SQL. In CIDR.
[20]
Hicham G Elmongui, Vivek Narasayya, and Ravishankar Ramamurthy. 2009. A framework for testing query transformation rules. In SIGMOD. 257--268.
[21]
Leo Giakoumakis and César A Galindo-Legaria. 2008. Testing SQL Server's Query Optimizer: Challenges, Techniques and Experiences. In Data Engineering Bulletin. 36--43.
[22]
Goetz Graefe. 1993. Query Evaluation Techniques for Large Databases. In CSUR. 73--169.
[23]
Goetz Graefe. 1995. The cascades framework for query optimization. In Data Engineering Bulletin. 19--29.
[24]
Zhongxian Gu, Mohamed A. Soliman, and Florian M. Waas. 2015. Testing the accuracy of query optimizers. In DBTest. 1--6.
[25]
Toshihide Ibaraki and Tiko Kameda. 1984. On the optimal nesting order for computing n-relational joins. In TODS. 482--502.
[26]
Jinho Jung, Hong Hu, Joy Arulraj, Taesoo Kim, and Woonhak Kang. 2019. Apollo: Automatic detection and diagnosis of performance regressions in database systems. In VLDB. 57--70.
[27]
R Kearns, Stephen Shead, and Alan Fekete. 1997. A teaching system for SQL. In ACSE. 224--231.
[28]
D. Lee, S. K. Cha, and A. H. Lee. 2012. A Performance Anomaly Detection and Analysis Framework for DBMS Development. In TKDE. 1345--1360.
[29]
Viktor Leis, Andrey Gubichev, Atanas Mirchev, Peter Boncz, Alfons Kemper, and Thomas Neumann. 2015. How good are query optimizers, really?. In VLDB. 204--215.
[30]
Zhan Li, Olga Papaemmanouil, and Mitch Cherniack. 2016. OptMark: A Toolkit for Benchmarking Query Optimizers. In CIKM. 2155--2160.
[31]
William M McKeeman. 1998. Differential testing for software. In Digital Technical Journal. 100--107.
[32]
Rasha Osman and William J. Knottenbelt. 2012. Database System Performance Evaluation Models: A Survey. In Performance Evaluation. 471--493.
[33]
Jaroslav Pokornỳ. 2015. Database technologies in the world of big data. In CompSysTech. 1--12.
[34]
Raghu Ramakrishnan and Johannes Gehrke. 2003. Database Management Systems.
[35]
Kim-Thomas Rehmann, Changyun Seo, Dongwon Hwang, Binh Than Truong, Alexander Boehm, and Dong Hun Lee. 2016. Performance Monitoring in SAP HANA's Continuous Integration Process. In PER. 43--52.
[36]
Raymond Reiter. 1986. A Sound and Sometimes Complete Query Evaluation Algorithm for Relational Databases with Null Values. In J. ACM. 349--370.
[37]
Manuel Rigger and Zhendong Su. 2020. Detecting Optimization Bugs in Database Engines via Non-Optimizing Reference Engine Construction. In FSE.
[38]
Manuel Rigger and Zhendong Su. 2020. Finding Bugs in Database Systems via Query Partitioning. In OOPSLA.
[39]
Manuel Rigger and Zhendong Su. 2020. Testing Database Engines via Pivoted Query Synthesis. In OSDI.
[40]
Shetal Shah, S Sudarshan, Suhas Kajbaje, Sandeep Patidar, Bhanu Pratap Gupta, and Devang Vira. 2011. Generating test data for killing SQL mutants: A constraint-based approach. In ICDE. 1175--1186.
[41]
Hitesh Kumar Sharma, Mr SC Nelson, et al. 2017. Explain Plan and SQL Trace the Two Approaches for RDBMS Tuning. In Database Systems Journal. 31--39.
[42]
Donald R Slutz. 1998. Massive Stochastic Testing of SQL. In VLDB. 618--622.
[43]
Florian Waas and César Galindo-Legaria. 2000. Counting, Enumerating, and Sampling of Execution Plans in a Cost-Based Query Optimizer. In SIGMOD. 499--509.
[44]
Gerhard Weikum, Axel Moenkeberg, Christof Hasse, and Peter Zabback. 2002. Self-Tuning Database Technology and Information Services: From Wishful Thinking to Viable Engineering. In VLDB. 20--31.
[45]
Khaled Yagoub, Peter Belknap, Benoit Dageville, Karl Dias, Shantanu Joshi, and Hailing Yu. 2008. Oracle's SQL Performance Analyzer. In Data Engineering Bulletin. 51--58.
[46]
Jiaqi Yan, Qiuye Jin, Shrainik Jain, Stratis D Viglas, and Allison Lee. 2018. Snowtrail: Testing with Production Queries on a Cloud Database. In DBTEST.
[47]
Rui Zhong, Yongheng Chen, Hong Hu, Hangfan Zhang, Wenke Lee, and Dinghao Wu. 2020. SQUIRREL: Testing Database Management Systems with Language Validity and Coverage Feedback. In CCS. 955--970.
[48]
Qi Zhou, Joy Arulraj, Shamkant Navathe, William Harris, and Dong Xu. 2019. Automated verification of query equivalence using satisfiability modulo theories. In VLDB. 1276--1288.

Cited By

View all
  • (2024)Detecting Metadata-Related Logic Bugs in Database Systems via Raw Database ConstructionProceedings of the VLDB Endowment10.14778/3659437.365944517:8(1884-1897)Online publication date: 31-May-2024
  • (2024)Keep It Simple: Testing Databases via Differential Query PlansProceedings of the ACM on Management of Data10.1145/36549912:3(1-26)Online publication date: 30-May-2024
  • (2024)Testing Gremlin-Based Graph Database Systems via Query DisassemblingProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3680392(1695-1707)Online publication date: 11-Sep-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
ICSE '22: Proceedings of the 44th International Conference on Software Engineering
May 2022
2508 pages
ISBN:9781450392211
DOI:10.1145/3510003
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

In-Cooperation

  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 July 2022

Permissions

Request permissions for this article.

Check for updates

Badges

Author Tags

  1. database testing
  2. differential testing
  3. query optimization

Qualifiers

  • Research-article

Funding Sources

Conference

ICSE '22
Sponsor:

Acceptance Rates

Overall Acceptance Rate 276 of 1,856 submissions, 15%

Upcoming Conference

ICSE 2025

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)371
  • Downloads (Last 6 weeks)48
Reflects downloads up to 14 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Detecting Metadata-Related Logic Bugs in Database Systems via Raw Database ConstructionProceedings of the VLDB Endowment10.14778/3659437.365944517:8(1884-1897)Online publication date: 31-May-2024
  • (2024)Keep It Simple: Testing Databases via Differential Query PlansProceedings of the ACM on Management of Data10.1145/36549912:3(1-26)Online publication date: 30-May-2024
  • (2024)Testing Gremlin-Based Graph Database Systems via Query DisassemblingProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3680392(1695-1707)Online publication date: 11-Sep-2024
  • (2024)Sedar: Obtaining High-Quality Seeds for DBMS Fuzzing via Cross-DBMS SQL TransferProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3639210(1-12)Online publication date: 20-May-2024
  • (2024)Understanding Transaction Bugs in Database SystemsProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3639207(1-13)Online publication date: 20-May-2024
  • (2024)Testing Graph Database Systems via Equivalent Query RewritingProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3639200(1-12)Online publication date: 20-May-2024
  • (2024)Mozi: Discovering DBMS Bugs via Configuration-Based Equivalent TransformationProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3639112(1-12)Online publication date: 20-May-2024
  • (2024)CERT: Finding Performance Issues in Database Systems Through the Lens of Cardinality EstimationProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3639076(1-13)Online publication date: 20-May-2024
  • (2024)Detecting Logic Bugs in Graph Database Management Systems via Injective and Surjective Graph Query TransformationProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3623307(1-12)Online publication date: 20-May-2024
  • (2024)Differential Optimization Testing of Gremlin-Based Graph Database Systems2024 IEEE Conference on Software Testing, Verification and Validation (ICST)10.1109/ICST60714.2024.00012(25-36)Online publication date: 27-May-2024
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media