Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3035918.3064039acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

Solving the Join Ordering Problem via Mixed Integer Linear Programming

Published: 09 May 2017 Publication History

Abstract

We transform join ordering into a mixed integer linear program (MILP). This allows to address query optimization by mature MILP solver implementations that have evolved over decades and steadily improved their performance. They offer features such as anytime optimization and parallel search that are highly relevant for query optimization.
We present a MILP formulation for searching left-deep query plans. We use sets of binary variables to represent join operands and intermediate results, operator implementation choices or the presence of interesting orders. Linear constraints restrict value assignments to the ones representing valid query plans. We approximate the cost of scan and join operations via linear functions, allowing to increase approximation precision up to arbitrary degrees. We integrated a prototypical implementation of our approach into the Postgres optimizer and compare against the original optimizer and several variants. Our experimental results are encouraging: we are able to optimize queries joining 40 tables within less than one minute of optimization time. Such query sizes are far beyond the capabilities of traditional query optimization algorithms with worst case guarantees on plan quality. Furthermore, as we use an existing solver, our optimizer implementation is small and can be integrated with low overhead.

References

[1]
S. Agarwal, B. Mozafari, and A. Panda. BlinkDB: queries with bounded errors and bounded response times on very large data. In European Conf. on Computer Systems, pages 29--42, 2013.
[2]
P. Beame, P. Koutris, and D. Suciu. Skew in parallel query processing. In PODS, pages 212--223, 2014.
[3]
K. Bennett, M. Ferris, and Y. Ioannidis. A genetic algorithm for database query optimization. 1991.
[4]
J. Bisschop. Integer Linear Programming Tricks. In AIMMS: Optimization Modeling, page 75ff. 215.
[5]
R. E. Bixby. A Brief History of Linear and Mixed-Integer Programming Computation. Documenta Mathematica, pages 107--121, 2012.
[6]
N. Bruno. Polynomial heuristics for query optimization. In ICDE, pages 589--600, 2010.
[7]
S. Chatterji and S. Evani. On the complexity of approximate query optimization. In PODS, pages 282--292, 2002.
[8]
S. Chaudhuri. Query optimizers: time to rethink the contract? In SIGMOD, pages 961--968, 2009.
[9]
S. Chaudhuri and K. Shim. Optimization of queries with user-defined predicates. ACM Transactions on Database Systems, 24(2):177--228, 1999.
[10]
S. Cluet and G. Moerkotte. On the complexity of generating optimal left-deep processing trees with cross products. In ICDT, pages 54--67, 1995.
[11]
T. Dokeroglu, M. A. Bayir, and A. Cosar. Integer linear programming solution for the multiple query optimization problem. In Information Sciences and Systems, pages 51--60. 2014.
[12]
S. Ganguly. Design and analysis of parametric query optimization algorithms. In VLDB, pages 228--238, 1998.
[13]
W.-S. Han, W. Kwak, J. Lee, G. M. Lohman, and V. Markl. Parallelizing query optimization. In VLDB, pages 188--200, 2008.
[14]
W.-S. Han and J. Lee. Dependency-aware reordering for parallelizing query optimization in multi-core CPUs. In SIGMOD, pages 45--58, 2009.
[15]
J. M. Hellerstein and M. Stonebraker. Predicate migration: optimizing queries with expensive predicates. SIGMOD, 22(2):267--276, 1993.
[16]
A. Hulgeri and S. Sudarshan. Parametric query optimization for linear and piecewise linear cost functions. In VLDB, pages 167--178, 2002.
[17]
A. Hulgeri and S. Sudarshan. AniPQO: almost non-intrusive parametric query optimization for nonlinear cost functions. In VLDB, pages 766--777, 2003.
[18]
Y. E. Ioannidis and Y. Kang. Randomized algorithms for optimizing large join queries. SIGMOD Record, 19(2):312--321, 1990.
[19]
R. Kaushik, C. Ré, and D. Suciu. General database statistics using entropy maximization. In Database Programming Languages, pages 84--99. 2009.
[20]
A. Kemper, G. Moerkotte, K. Peithner, and M. Steinbrunn. Optimizing disjunctive queries with expensive predicates. SIGMOD Record, 23(2):336--347, 1994.
[21]
J. A. Lawrence and B. A. Pasternack. Applied Management Science. 1997.
[22]
G. Moerkotte and T. Neumann. Analysis of two existing and one new dynamic programming algorithm for the generation of optimal bushy join trees without cross products. In VLDB, pages 930--941, 2006.
[23]
G. Moerkotte and T. Neumann. Dynamic programming strikes back. In SIGMOD, pages 9--12, 2008.
[24]
M. Muralikrishna. Improved unnesting algorithms for join aggregate SQL queries. VLDB, pages 91--102, 1992.
[25]
K. Ono and G. Lohman. Measuring the complexity of join enumeration in query optimization. In VLDB, pages 314--325, 1990.
[26]
S. Papadomanolakis and A. Ailamaki. An integer linear programming approach to database design. In ICDEW, pages 442--449, 2007.
[27]
P. G. Selinger, M. M. Astrahan, D. D. Chamberlin, R. A. Lorie, and T. G. Price. Access path selection in a relational database management system. In SIGMOD, pages 23--34, 1979.
[28]
M. a. Soliman, M. Petropoulos, F. Waas, S. Narayanan, K. Krikellas, R. Baldwin, L. Antova, V. Raghavan, A. El-Helw, Z. Gu, E. Shen, G. C. Caragea, C. Garcia-Alvarado, and F. Rahman. Orca: A modular query optimizer architectur for big data. In SIGMOD, pages 337--348, 2014.
[29]
M. Steinbrunn, G. Moerkotte, and A. Kemper. Heuristic and randomized optimization for the join ordering problem. VLDBJ, 6(3):191--208, 1997.
[30]
A. Swami. Optimization of large join queries: combining heuristics and combinatorial techniques. SIGMOD, pages 367--376, 1989.
[31]
A. Swami and A. Gupta. Optimization of large join queries. In SIGMOD, pages 8--17, 1988.
[32]
I. Trummer and C. Koch. An incremental anytime algorithm for multi-objective query optimization. In SIGMOD, pages 1941--1953, 2015.
[33]
I. Trummer and C. Koch. Multi-objective parametric query optimization. VLDB, 8(3):221--232, 2015.
[34]
I. Trummer and C. Koch. Parallelizing query optimization on shared-nothing architectures. In VLDB, pages 660--671, 2016.
[35]
B. Vance and D. Maier. Rapid bushy join-order optimization with Cartesian products. SIGMOD, 25(2):35--46, 1996.
[36]
F. M. Waas and J. M. Hellerstein. Parallelizing extensible query optimizers. In SIGMOD, pages 871--882, 2009.
[37]
J. Yang, K. Karlapalem, and Q. Li. Algorithms for materialized view design in data warehousing environment. In VLDB, pages 136--145, 1997.

Cited By

View all
  • (2024)Constrained Quadratic Model for Optimizing Join OrdersProceedings of the 1st Workshop on Quantum Computing and Quantum-Inspired Technology for Data-Intensive Systems and Applications10.1145/3665225.3665447(38-44)Online publication date: 9-Jun-2024
  • (2024)TESSM: Tree-based Selective State Space Models for Efficient Join Order Selection LearningProceedings of the 33rd ACM International Conference on Information and Knowledge Management10.1145/3627673.3679742(374-383)Online publication date: 21-Oct-2024
  • (2024)Quantum Data Management: From Theory to Opportunities2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00410(5376-5381)Online publication date: 13-May-2024
  • Show More Cited By

Index Terms

  1. Solving the Join Ordering Problem via Mixed Integer Linear Programming

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    SIGMOD '17: Proceedings of the 2017 ACM International Conference on Management of Data
    May 2017
    1810 pages
    ISBN:9781450341974
    DOI:10.1145/3035918
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 09 May 2017

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. integer linear programming
    2. query optimization

    Qualifiers

    • Research-article

    Funding Sources

    Conference

    SIGMOD/PODS'17
    Sponsor:

    Acceptance Rates

    Overall Acceptance Rate 785 of 4,003 submissions, 20%

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)48
    • Downloads (Last 6 weeks)17
    Reflects downloads up to 14 Nov 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)Constrained Quadratic Model for Optimizing Join OrdersProceedings of the 1st Workshop on Quantum Computing and Quantum-Inspired Technology for Data-Intensive Systems and Applications10.1145/3665225.3665447(38-44)Online publication date: 9-Jun-2024
    • (2024)TESSM: Tree-based Selective State Space Models for Efficient Join Order Selection LearningProceedings of the 33rd ACM International Conference on Information and Knowledge Management10.1145/3627673.3679742(374-383)Online publication date: 21-Oct-2024
    • (2024)Quantum Data Management: From Theory to Opportunities2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00410(5376-5381)Online publication date: 13-May-2024
    • (2024)GLO: Towards Generalized Learned Query Optimization2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00368(4843-4855)Online publication date: 13-May-2024
    • (2023)Quantum-Inspired Digital Annealing for Join OrderingProceedings of the VLDB Endowment10.14778/3632093.363211217:3(511-524)Online publication date: 1-Nov-2023
    • (2023)Opportunities for Quantum Acceleration of Databases: Optimization of Queries and Transaction SchedulesProceedings of the VLDB Endowment10.14778/3598581.359860316:9(2344-2353)Online publication date: 10-Jul-2023
    • (2023)Ready to Leap (by Co-Design)? Join Order Optimisation on Quantum HardwareProceedings of the ACM on Management of Data10.1145/35889461:1(1-27)Online publication date: 30-May-2023
    • (2023)Fine-Tuning Data Structures for Query ProcessingProceedings of the 21st ACM/IEEE International Symposium on Code Generation and Optimization10.1145/3579990.3580016(149-161)Online publication date: 17-Feb-2023
    • (2023)Quantum Data Management and Quantum Machine Learning for Data Management: State-of-the-Art and Open ChallengesIntelligent Systems and Machine Learning10.1007/978-3-031-35081-8_20(252-261)Online publication date: 10-Jul-2023
    • (2022)Applicability of Quantum Computing on Database Query OptimizationProceedings of the 2022 International Conference on Management of Data10.1145/3514221.3520257(2512-2514)Online publication date: 10-Jun-2022
    • Show More Cited By

    View Options

    Get Access

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media