LOGER: A Learned Optimizer Towards Generating Efficient and Robust Query Execution Plans

Authors:

Yaofeng Tu

Proceedings of the VLDB Endowment, Volume 16, Issue 7

Pages 1777 - 1789

https://doi.org/10.14778/3587136.3587150

Published: 01 March 2023

Abstract

Query optimization based on deep reinforcement learning (DRL) has recently become an active research topic. Despite promising progress, DRL optimizers still struggle to produce efficient plans robustly, owing to the vast search space for both join order and operator selection and to the highly variable execution latency used as the feedback signal. In this paper, we propose LOGER, a learned optimizer towards generating efficient and robust plans, which aims to produce both efficient join orders and efficient operators. LOGER first uses a Graph Transformer to capture relationships between tables and predicates. It then reorganizes the search space: rather than directly selecting an operator for each join, LOGER learns to restrict specific operators and lets the DBMS's built-in optimizer select physical operators under those restrictions. This strategy exploits expert knowledge to improve the robustness of plan generation while retaining sufficient flexibility in plan search. Furthermore, LOGER introduces ε-beam search, which keeps multiple search paths that preserve promising plans while performing guided exploration. Finally, LOGER introduces a loss function with reward weighting, which further enhances performance robustness by reducing the fluctuation caused by poor operators, together with a log transformation that compresses the range of rewards. We conduct experiments on the Join Order Benchmark (JOB), TPC-DS, and Stack Overflow, and demonstrate that LOGER outperforms existing learned query optimizers, achieving a 2.07x speedup on JOB compared with PostgreSQL.
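
The ε-beam search mentioned in the abstract can be illustrated with a minimal Python sketch. It is not LOGER's implementation: the abstract does not say how exploration is mixed into the beam, so the sketch assumes that each beam slot takes the best remaining candidate with probability 1 - ε and a uniformly random remaining candidate otherwise. The names `epsilon_beam_search`, `expand`, `score`, and `is_complete`, the tuple encoding of plans, and the toy cost function are all hypothetical and chosen only for illustration.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

Plan = Tuple[int, ...]  # hypothetical encoding: a (partial) join order as table indices


def epsilon_beam_search(
    initial: Plan,
    expand: Callable[[Plan], Sequence[Plan]],
    score: Callable[[Plan], float],
    is_complete: Callable[[Plan], bool],
    beam_width: int = 4,
    epsilon: float = 0.2,
    rng: Optional[random.Random] = None,
) -> List[Plan]:
    """Return up to `beam_width` complete plans, best-scoring first.

    Assumption (not taken from the paper): each beam slot takes the best
    remaining candidate with probability 1 - epsilon, and a uniformly random
    remaining candidate with probability epsilon.
    """
    rng = rng or random.Random(0)
    beam: List[Plan] = [initial]
    while beam and not all(is_complete(p) for p in beam):
        # Complete plans are carried over; incomplete ones are expanded one step.
        candidates: List[Plan] = []
        for plan in beam:
            candidates.extend([plan] if is_complete(plan) else expand(plan))
        # Drop duplicates (keeping first occurrences), then rank by score.
        pool = sorted(dict.fromkeys(candidates), key=score, reverse=True)
        beam = []
        while pool and len(beam) < beam_width:
            explore = rng.random() < epsilon
            index = rng.randrange(len(pool)) if explore else 0
            beam.append(pool.pop(index))
    return sorted(beam, key=score, reverse=True)


# Toy usage: order three tables; the made-up cost penalizes large indices late.
def toy_expand(plan: Plan) -> List[Plan]:
    return [plan + (t,) for t in range(3) if t not in plan]


def toy_score(plan: Plan) -> float:
    return -float(sum(pos * table for pos, table in enumerate(plan)))


def toy_is_complete(plan: Plan) -> bool:
    return len(plan) == 3


# With epsilon=0 and a beam wide enough to hold all 6 orders, the search is
# exhaustive here, so the cheapest order comes first.
plans = epsilon_beam_search((), toy_expand, toy_score, toy_is_complete,
                            beam_width=6, epsilon=0.0)
print(plans[0])  # (2, 1, 0)
```

Setting `epsilon=0` reduces the sketch to ordinary beam search, while larger values trade ranking fidelity for exploration. In a learned optimizer the `score` callable would be a model's value estimate rather than the fixed toy cost used here.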

Information

Published In

Proceedings of the VLDB Endowment, Volume 16, Issue 7

March 2023

203 pages

ISSN: 2150-8097

Editors: Georgia Koutrika (Athena Research Center), Jun Yang (Duke University)

Publisher

VLDB Endowment

Badges

Artifacts Available / v1.1

Qualifiers

Research-article
