DOI: 10.1145/375663.375748
Article

Orthogonal optimization of subqueries and aggregation

Published: 01 May 2001

Abstract

There is considerable overlap between the strategies proposed for subquery evaluation and those for grouping and aggregation. In this paper we show how a number of small, independent primitives generate a rich set of efficient execution strategies, covering standard proposals for subquery evaluation suggested in earlier literature. These small primitives fall into two main, orthogonal areas: correlation removal, and efficient processing of outerjoins and GroupBy. An optimization approach based on these pieces provides syntax-independence of query processing with respect to subqueries, i.e., equivalent queries written with or without subqueries produce the same efficient plan.
We describe techniques implemented in Microsoft SQL Server (releases 7.0 and 8.0) for queries containing subqueries and/or aggregations, based on a number of orthogonal optimizations. We concentrate separately on removing correlated subqueries, also called “query flattening,” and on efficient execution of queries with aggregations. The end result is a modular, flexible implementation that produces very efficient execution plans. To demonstrate the validity of our approach, we present results for some queries from the TPC-H benchmark. Of all published TPC-H results at the 300GB scale at the time of writing (November 2000), SQL Server has the fastest results on those queries, even on a fraction of the processors used by other systems.
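
As a rough illustration of what "query flattening" means, the sketch below writes one query twice: once with a correlated subquery, and once as an outerjoin followed by GroupBy. It is not taken from the paper. The table and column names (customer, orders, c_custkey, o_custkey, o_totalprice), the sample rows, and the threshold of 1000 are assumptions chosen for the example, and SQLite (through Python's sqlite3 module) stands in for SQL Server. The sketch only shows that the two formulations return the same rows; it does not show that any particular optimizer picks the same plan for both.

```python
import sqlite3


def build_db():
    """Create an in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customer (c_custkey INTEGER PRIMARY KEY);
        CREATE TABLE orders (
            o_orderkey   INTEGER PRIMARY KEY,
            o_custkey    INTEGER,
            o_totalprice REAL
        );
        -- Customer 3 has no orders at all.
        INSERT INTO customer VALUES (1), (2), (3);
        INSERT INTO orders VALUES (10, 1, 600.0), (11, 1, 500.0), (12, 2, 300.0);
        """
    )
    return conn


# Correlated form: the subquery is conceptually re-evaluated per customer row.
CORRELATED = """
SELECT c_custkey
FROM customer
WHERE 1000 < (SELECT SUM(o_totalprice)
              FROM orders
              WHERE o_custkey = c_custkey)
ORDER BY c_custkey
"""

# Flattened form: an outerjoin followed by GroupBy, with no correlation.
# The outerjoin keeps customers without orders; their SUM is NULL,
# so the HAVING predicate rejects them, as the correlated form does.
FLATTENED = """
SELECT c_custkey
FROM customer LEFT OUTER JOIN orders ON o_custkey = c_custkey
GROUP BY c_custkey
HAVING 1000 < SUM(o_totalprice)
ORDER BY c_custkey
"""


def run(conn, sql):
    """Return the customer keys selected by the given query."""
    return [row[0] for row in conn.execute(sql)]


if __name__ == "__main__":
    conn = build_db()
    print(run(conn, CORRELATED))
    print(run(conn, FLATTENED))
```

In this sample data only customer 1 has orders totalling more than 1000, so both queries select the same single key. A syntax-independent optimizer of the kind the abstract describes would aim to treat the two formulations as the same query.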

Published In

SIGMOD '01: Proceedings of the 2001 ACM SIGMOD international conference on Management of data
May 2001
630 pages
ISBN:1581133324
DOI:10.1145/375663
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SIGMOD/PODS01
Acceptance Rates

SIGMOD '01 Paper Acceptance Rate 44 of 293 submissions, 15%;
Overall Acceptance Rate 785 of 4,003 submissions, 20%

Cited By

  • (2023) EVA: An End-to-End Exploratory Video Analytics System. Proceedings of the Seventh Workshop on Data Management for End-to-End Machine Learning, pp. 1-5. DOI: 10.1145/3595360.3595858. Online publication date: 18-Jun-2023
  • (2022) Computation Reuse via Fusion in Amazon Athena. 2022 IEEE 38th International Conference on Data Engineering (ICDE), pp. 1610-1620. DOI: 10.1109/ICDE53745.2022.00166. Online publication date: May-2022
  • (2021) One WITH RECURSIVE is Worth Many GOTOs. Proceedings of the 2021 International Conference on Management of Data, pp. 723-735. DOI: 10.1145/3448016.3457272. Online publication date: 9-Jun-2021
  • (2019) Incorporating super-operators in big-data query optimizers. Proceedings of the VLDB Endowment 13:3, pp. 348-361. DOI: 10.14778/3368289.3368299. Online publication date: 1-Nov-2019
  • (2018) Efficient generation of query plans containing group-by, join, and groupjoin. The VLDB Journal 27:5, pp. 617-641. DOI: 10.1007/s00778-017-0476-3. Online publication date: 1-Oct-2018
  • (2017) Froid. Proceedings of the VLDB Endowment 11:4, pp. 432-444. DOI: 10.1145/3186728.3164140. Online publication date: 1-Dec-2017
  • (2017) Optimizing Big-Data Queries Using Program Synthesis. Proceedings of the 26th Symposium on Operating Systems Principles, pp. 631-646. DOI: 10.1145/3132747.3132773. Online publication date: 14-Oct-2017
  • (2017) Category- and selection-enabled nearest neighbor joins. Information Systems 68, pp. 3-16. DOI: 10.1016/j.is.2017.01.006. Online publication date: Aug-2017
  • (2016) Optimization of Nested Queries using the NF2 Algebra. Proceedings of the 2016 International Conference on Management of Data, pp. 1765-1780. DOI: 10.1145/2882903.2915241. Online publication date: 26-Jun-2016
  • (2016) Extracting Equivalent SQL from Imperative Code in Database Applications. Proceedings of the 2016 International Conference on Management of Data, pp. 1781-1796. DOI: 10.1145/2882903.2882926. Online publication date: 26-Jun-2016
