Nothing Special   »   [go: up one dir, main page]

skip to main content
article
Free access

Query reverse engineering

Published: 01 October 2014 Publication History

Abstract

In this paper, we introduce a new problem termed query reverse engineering (QRE). Given a database D and a result table T-the output of some known or unknown query Q on D-the goal of QRE is to reverse-engineer a query Q' such that the output of query Q' on database D (denoted by Q'(D)) is equal to T (i.e., Q(D)). The QRE problem has useful applications in database usability, data analysis, and data security. In this work, we propose a data- driven approach, TALOS for Tree-based classifier with At Least One Semantics, that is based on a novel dynamic data classification formulation and extend the approach to efficiently support the three key dimensions of the QRE problem: whether the input query is known/unknown, supporting different query fragments, and supporting multiple database versions.

References

[1]
Andritsos, P., Miller, R.J., Tsaparas, P.: Information-theoretic tools for mining database structure from large data sets. In: SIGMOD (2004)
[2]
Arasu, A., Kaushik, R., Li, J.: Data generation using declarative constraints. In: SIGMOD Conference, pp. 685---696 (2011)
[3]
Binnig, C., Kossmann, D., Lo, E.: Reverse query processing. In: ICDE, pp. 506---515 (2007)
[4]
Binnig, C., Kossmann, D., Lo, E., Özsu, M.T.: QAGen: Generating query-aware test databases. In: SIGMOD Conference, pp. 341---352 (2007)
[5]
Blakeley, J.A., Larson, P.A., Tompa, F.W.: Efficiently updating materialized views. SIGMOD Rec. 15(2), 61---71 (1986)
[6]
Brown, P.G., Haas, P.J.: BHUNT: Automatic discovery of fuzzy algebraic constraints in relational data. In: VLDB (2003)
[7]
Bruno, N., Chaudhuri, S., Thomas, D.: Generating queries with cardinality constraints for dbms testing. IEEE TKDE 18(12), 1721---1725 (2006)
[8]
Cormen, T.H., Stein, C., Rivest, R.L., Leiserson, C.E.: Introduction to Algorithms. McGraw-Hill Science, Boston (2001)
[9]
Gaasterland, T., Godfrey, P., Minker, J.: An overview of cooperative answering. J. Intell. Inf. Syst. 1(2), 123---157 (1992)
[10]
Getoor, L., Taskar, B., Koller, D.: Selectivity estimation using probabilistic models. In: SIGMOD Conference, pp. 461---472 (2001)
[11]
Godfrey, P., Gryz, J., Zuzarte, C.: Exploiting constraint-like data characterizations in query optimization. In: SIGMOD Conference, pp. 582---592 (2001)
[12]
Johnson, T., Marathe, A., Dasu, T.: Database exploration and bellman. IEEE Data Eng. Bull. 26(3), 34---39 (2003)
[13]
Lenzerini, M.: Data integration: A theoretical perspective. In: PODS, pp. 233---246 (2002)
[14]
Lo, E., Cheng, N., Hon, W.K.: Generating databases for query workloads. Proc. VLDB Endow. 3(1), 848---859 (2010)
[15]
Malpani, A., Bernstein, P., Melnik, S., Terwilliger, J.: Reverse engineering models from databases to bootstrap application development. In: ICDE (2010)
[16]
Mehta, M., Agrawal, R., Rissanen, J.: SLIQ: A fast scalable classifier for data mining. In: EDBT, pp. 18---32 (1996)
[17]
Mishra, C., Koudas, N., Zuzarte, C.: Generating targeted queries for database testing. In: SIGMOD Conference, pp. 499---510 (2008)
[18]
Motro, A.: Intensional answers to database queries. IEEE TKDE 6(3), 444---454 (1994)
[19]
Mumick, I.S., Quass, D., Mumick, B.S.: Maintenance of data cubes and summary tables in a warehouse. SIGMOD Rec. 26(2), 100---111 (1997)
[20]
Petit, J.M., Toumani, F., Boulicaut, J.F., Kouloumdjian, J.: Towards the reverse engineering of denormalized relational databases. In: ICDE (1996)
[21]
Tan, P.N., Kumar, M.V.: Introduction to Data Mining. Addison-Wesley, Reading, MA (2006)
[22]
Ramakrishnan, N., Kumar, D., Mishra, B., Potts, M., Helm, R.F.: Turning cartwheels: An alternating algorithm for mining redescriptions. In: SIGKDD Conference, pp. 266---275 (2004)
[23]
van Rijsbergen, C.J.: Information Retrieval. Butterworth, London (1979)
[24]
Rissanen, J.: Modeling by shortest data description. Automatica 14, 465---471 (1978)
[25]
Sarma, A.D., Parameswaran, A., Garcia-Molina, H., Widom, J.: Synthesizing view definitions from data. In: ICDT, pp. 89---103 (2010)
[26]
Sismanis, Y., Brown, P., Haas, P.J., Reinwald, B.: GORDIAN: Efficient and scalable discovery of composite keys. In: VLDB (2006)
[27]
Stonebraker, M.: The design of the postgres storage system. In: VLDB, pp. 289---300 (1987)
[28]
Tran, Q.T., Chan, C.Y.: How to ConQueR why-not questions. In: SIGMOD Conference, pp. 15---26 (2010)
[29]
Tran, Q.T., Chan, C.Y., Parthasarathy, S.: Query by output. In: SIGMOD Conference, pp. 535---548 (2009)
[30]
Valduriez, P.: Join indices. ACM Trans. Database Syst. 12(2), 218---246 (1987)
[31]
Wu, W., Reinwald, B., Sismanis, Y., Manjrekar, R.: Discovering topical structures of databases. In: SIGMOD (2008)
[32]
Xiao, X., Tao, Y.: Output perturbation with query relaxation. Proc. VLDB Endow. 1(1), 857---869 (2008)
[33]
Yin, X., Han, J., Yang, J., Yu, P.S.: Efficient classification across multiple database relations: A crossmine approach. TKDE 18, 770---783 (2006)

Cited By

View all

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image The VLDB Journal — The International Journal on Very Large Data Bases
The VLDB Journal — The International Journal on Very Large Data Bases  Volume 23, Issue 5
October 2014
162 pages

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 01 October 2014

Author Tags

  1. At-least-one semantics
  2. Instance-equivalent queries
  3. Query reverse engineering

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)55
  • Downloads (Last 6 weeks)9
Reflects downloads up to 27 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Fitting Algorithms for Conjunctive QueriesACM SIGMOD Record10.1145/3641832.364183452:4(6-18)Online publication date: 19-Jan-2024
  • (2024)Training neural networks end-to-end for hyperbox-based classificationNeurocomputing10.1016/j.neucom.2024.127961599:COnline publication date: 28-Sep-2024
  • (2024)A logic-based framework for characterizing nexus of similarity within knowledge basesInformation Sciences: an International Journal10.1016/j.ins.2024.120331664:COnline publication date: 1-Apr-2024
  • (2024)Towards Reliable SQL Synthesis: Fuzzing-Based Evaluation and DisambiguationFundamental Approaches to Software Engineering10.1007/978-3-031-57259-3_11(232-254)Online publication date: 6-Apr-2024
  • (2023)Explaining Dataset Changes for Semantic Data Versioning with Explain-Da-VProceedings of the VLDB Endowment10.14778/3583140.358316916:6(1587-1600)Online publication date: 20-Apr-2023
  • (2023)A SQL Synthesis System with Operator HandlerProceedings of the 2023 7th International Conference on Computer Science and Artificial Intelligence10.1145/3638584.3638654(132-136)Online publication date: 8-Dec-2023
  • (2023)Extremal Fitting Problems for Conjunctive QueriesProceedings of the 42nd ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems10.1145/3584372.3588655(89-98)Online publication date: 18-Jun-2023
  • (2022)SQL query extensions for imprecise questionsData & Knowledge Engineering10.1016/j.datak.2021.101944137:COnline publication date: 1-Jan-2022
  • (2022)Interactively discovering and ranking desired tuples by data explorationThe VLDB Journal — The International Journal on Very Large Data Bases10.1007/s00778-021-00714-031:4(753-777)Online publication date: 18-Jan-2022
  • (2021)On discovering semantics of user-defined functions in data processing workflowsProceedings of the International Workshop on Big Data in Emergent Distributed Environments10.1145/3460866.3461771(1-6)Online publication date: 20-Jun-2021
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Full Access

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media