Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/375663.375749acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
Article

Exploiting constraint-like data characterizations in query optimization

Published: 01 May 2001 Publication History

Abstract

Query optimizers nowadays draw upon many sources of information about the database to optimize queries. They employ runtime statistics in cost-based estimation of query plans. They employ integrity constraints in the query rewrite process. Primary and foreign key constraints have long played a role in the optimizer, both for rewrite opportunities and for providing more accurate cost predictions. More recently, other types of integrity constraints are being exploited by optimizers in commercial systems, for which certain semantic query optimization techniques have now been implemented.
These new optimization strategies that exploit constraints hold the promise for good improvement. Their weakness, however, is that often the “constraints” that would be useful for optimization for a given database and workload are not explicitly available for the optimizer. Data mining tools can find such “constraints” that are true of the database, but then there is the question of how this information can be kept by the database system, and how to make this information available to, and effectively usable by, the optimizer.
We present our work on soft constraints in DB2. A soft constraint is a syntactic statement equivalent to an integrity constraint declaration. A soft constraint is not really a constraint, per se, since future updates may undermine it. While a soft constraint is valid, however, it can be used by the optimizer in the same way integrity constraints are. We present two forms of soft constraint: absolute and statistical. An absolute soft constraint is consistent with respect to the current state of the database, just in the same way an integrity constraint must be. They can be used in rewrite, as well as in cost estimation. A statistical soft constraint differs in that it may have some degree of violation with respect to the state of the database. Thus, statistical soft constraints cannot be used in rewrite, but they can still be used in cost estimation.
We are working long-term on absolute soft constraints. We discuss the issues involved in implementing a facility for absolute soft constraints in a database system (and in DB2), and the strategies that we are researching. The current DB2 optimizer is more amenable to adding facilities for statistical soft constraints. In the short-term, we have been implementing pathways in the optimizer for statistical soft constraints. We discuss this implementation.

References

[1]
D. Bitton, J. Millman, and S. Torgersen. A feasibility and performance study of dependency inference. In Proceedings of the Fifth ICDE, pages 635-641, Los Angeles, California, USA, 1989. IEEE Computer Society.]]
[2]
S. Ceri, P. Fraternali, S. Paraboschi, and L. Tanca. Automatic generation of production rules for integrity maintenance. TODS, 19(3):367-422, 1994.]]
[3]
S. Ceri and J. Widom. Deriving production rules for constraint maintanance. In Proceedings of the 16th VLDB, pages 577-589, Brisbane, Australia, 1990.]]
[4]
U. Chakravarthy, J. Grant, and J. Minker. Logic-based approach to semantic query optimization. ACM TODS, 15(2):162-207, June 1990.]]
[5]
I.-M. A. Chen and R. C. Lee. An approach to deriving object hierarchies from database schema and contents. In Proceedings of the 6th ISMIS, pages 112-121, Charlotte, NC, 1991.]]
[6]
Q. Cheng J. Gryz, F. Koo, C. Lenng, L. Liu, X. Qian, and B. Schiefer. Implementation of two semantic query optimization techniques in DB2 UDB. In Proc. of the 25th VLDB, pages 687-698, Edinburgh, Scotland, 1999.]]
[7]
W. Chu, R. C. Lee, and Q. Chen. Using type inference and induced rules to provide intensional answers. In Proceedings of the 7th ICDE, pages 396-403, Kobe, Japan, 1991.]]
[8]
J. Edmonds, J. Gryz, D. Liang, and R. J. Miller. Mining for empty rectangles in large data sets. In Proceedings of the 8th ICDT, pages 174-188, London, UK, 2001.]]
[9]
P. Godffey, J. Grant, J. Gryz, and J. Minker. Integrity constraints: Semantics and applications. In G. Saake and J. Chomicki, editors, Logics for Databases and Information Systems, pages 265-307. Kluwer Academic, 1998.]]
[10]
J. Gryz, B. Schiefer, J. Zheng, and C. Zuzarte. Discovery and application of check constraints in DB2. In Proceedings of ICDE, Heidelberg, Germany, 2001.]]
[11]
P. Haas, J. Nanghton, S. Seshadri, and L. Stokes. Sampling-based estimation of the number of distinct values of an attribute. In Proceedings of VLDB, pages 311-322, Zurich, Switzerland, 1995.]]
[12]
J. Han, Y. Cai, and N. Cercone. Knowledge discovery in databases: An attribute-oriented approach. In Proceedings of the 18th VLDB, pages 547-559, Vancouver, Canada, 1992.]]
[13]
C. N. Hsu and C. A. Knoblock. Using inductive learning to generate rules for semantic query optimization. In Advances in Knowledge Discovery and Data Mining, pages 425-445. AAAI/MIT Press, 1996.]]
[14]
Y. Huhtala, J. Karkkainen, P. Porkka, and H. Toivonen. Efficient discovery of functional and approximate dependencies using partitions. In Proceedings of 14th ICDE, pages 392-401, Orlando, FL, Feb. 1998.]]
[15]
Y. E. Ioannidis. Universality of serial histograms. In Proceedings of 19th VLDB, pages 256-267, Dublin, Ireland, 1993.]]
[16]
Y. E. Ioannidis and S. Christodoulakis. Optimal histograms for limiting worst-case error propagation in the size of the join results. TODS, 18(4):709-748, 1993.]]
[17]
J. King. Quist: A system for semantic query optimization in relational databases. In Proc. 7th VLDB, pages 510-517, Cannes, France, Sept. 1981.]]
[18]
R. Lipton, J. Nanghton, and D. Schneider. Practical selectivity estimation through adaptive sampling. In Proc. of Sigmod, pages 40-46, 1990.]]
[19]
S. Lopes, J. Petit, and L. Lakhal. Efficient discovery of functional dependencies and armstrong relations. In Proceedings of the 6th International Conference on Extending Database Technology, pages 350-364, Konstanz, Germany, 2000. Springer.]]
[20]
H. Mannila and K.-J. Raiha. Algorithms for inferring functional dependencies from relations. Data and Knowledge Engineering, 12(1):83-89, 1994.]]
[21]
M. Mannino, P. Chu, and T. Sager. Statistical profile estimation in database. A CM Computing Surveys, 20(3):191-221, 1988.]]
[22]
N. Novelli and R. Cicchetti. Fun: An efficient algorithm for mining functional and embedded dependencies. In Proceedings of the 8th ICDT, pages 189-203, London, UK, 2001.]]
[23]
SQL Reference Manual, Oracle 8i, Realease 8.1.5. 500 Oracle Parkway, Redwood City, CA 94065, 1999.]]
[24]
G. N. Panlley and P.-,. Larson. Exploiting uniqueness in query optimization. In Proceedings of the 1Oth ICDE, pages 68-79, 1994.]]
[25]
V. Poosala, Y. Ioannidis, P. Haas, and E. Shekita. Improved histograms for selectivity estimation of range predicates. In Proceedings of SIGMOD, pages 294-305, Montreal, Canada, 1996.]]
[26]
I. Savnik and P. Flach. Bottom-up induction of functional dependencies from relations. In G. Piatetsky-Shapiro, editor, Knowledge Discovery in Databases, pages 284-290. Morgan Kanfman Pub, 1993.]]
[27]
S. Shekar, B. Hamidzadeh, A. Kohli, and M. Coyle. Learning transformation rules for semantic query optimization. TKDE, 5(6):950-964, Dec. 1993.]]
[28]
M. Siegel. Automatic rule derivation for semantic query optimization. In Proceedings of the 2nd International Conference on Expert Database Systems, pages 371-386, Vienna, Virginia, 1988.]]
[29]
D. Simmen, E. Shekita, and T. Malkems. Fundamental techniques for order optimization. In Proceedings of SIGMOD, pages 57-67, 1996.]]
[30]
C. T. Yu and W. Sun. Automatic knowledge acquisition and maintenance for semantic query optimization. IEEE Transactions on Knowledge and Data Engineering, 1(3):362-375, Sept. 1989]]

Cited By

View all
  • (2019)Designing Succinct Secondary Indexing Mechanism by Exploiting Column CorrelationsProceedings of the 2019 International Conference on Management of Data10.1145/3299869.3319861(1223-1240)Online publication date: 25-Jun-2019
  • (2016)Skycube Materialization Using the Topmost Skyline or Functional DependenciesACM Transactions on Database Systems10.1145/295509241:4(1-40)Online publication date: 2-Nov-2016
  • (2013)An Architecture for Query Optimization Using Association Rule MiningIntelligence Methods and Systems Advancements for Knowledge-Based Business10.4018/978-1-4666-1873-2.ch016(281-304)Online publication date: 2013
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
SIGMOD '01: Proceedings of the 2001 ACM SIGMOD international conference on Management of data
May 2001
630 pages
ISBN:1581133324
DOI:10.1145/375663
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 May 2001

Permissions

Request permissions for this article.

Check for updates

Qualifiers

  • Article

Conference

SIGMOD/PODS01
Sponsor:

Acceptance Rates

SIGMOD '01 Paper Acceptance Rate 44 of 293 submissions, 15%;
Overall Acceptance Rate 785 of 4,003 submissions, 20%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)7
  • Downloads (Last 6 weeks)0
Reflects downloads up to 23 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2019)Designing Succinct Secondary Indexing Mechanism by Exploiting Column CorrelationsProceedings of the 2019 International Conference on Management of Data10.1145/3299869.3319861(1223-1240)Online publication date: 25-Jun-2019
  • (2016)Skycube Materialization Using the Topmost Skyline or Functional DependenciesACM Transactions on Database Systems10.1145/295509241:4(1-40)Online publication date: 2-Nov-2016
  • (2013)An Architecture for Query Optimization Using Association Rule MiningIntelligence Methods and Systems Advancements for Knowledge-Based Business10.4018/978-1-4666-1873-2.ch016(281-304)Online publication date: 2013
  • (2011)An Architecture for Query Optimization Using Association Rule MiningInternational Journal of Knowledge-Based Organizations10.4018/ijkbo.20111001031:4(32-55)Online publication date: 1-Oct-2011
  • (2010)Query optimization in large databases using association rule miningProceedings of the 48th annual ACM Southeast Conference10.1145/1900008.1900123(1-2)Online publication date: 15-Apr-2010
  • (2010)Toward a verified relational database management systemACM SIGPLAN Notices10.1145/1707801.170632945:1(237-248)Online publication date: 17-Jan-2010
  • (2010)Toward a verified relational database management systemProceedings of the 37th annual ACM SIGPLAN-SIGACT symposium on Principles of programming languages10.1145/1706299.1706329(237-248)Online publication date: 17-Jan-2010
  • (2009)Correlation mapsProceedings of the VLDB Endowment10.14778/1687627.16877652:1(1222-1233)Online publication date: 1-Aug-2009
  • (2009)Query by outputProceedings of the 2009 ACM SIGMOD International Conference on Management of data10.1145/1559845.1559902(535-548)Online publication date: 29-Jun-2009
  • (2006)On Simplification of Database Integrity ConstraintsFundamenta Informaticae10.5555/1227517.122751971:4(371-417)Online publication date: 1-Mar-2006
  • Show More Cited By

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media