Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

LOGER: A Learned Optimizer Towards Generating Efficient and Robust Query Execution Plans

Published: 01 March 2023 Publication History

Abstract

Query optimization based on deep reinforcement learning (DRL) has become a hot research topic recently. Despite the achieved promising progress, DRL optimizers still face great challenges of robustly producing efficient plans, due to the vast search space for both join order and operator selection and the highly varying execution latency taken as the feedback signal. In this paper, we propose LOGER, a learned optimizer towards generating efficient and robust plans, aiming at producing both efficient join orders and operators. LOGER first utilizes Graph Transformer to capture relationships between tables and predicates. Then, the search space is reorganized, in which LOGER learns to restrict specific operators instead of directly selecting one for each join, while utilizing DBMS built-in optimizer to select physical operators under the restrictions. Such a strategy exploits expert knowledge to improve the robustness of plan generation while offering sufficient plan search flexibility. Furthermore, LOGER introduces ε-beam search, which keeps multiple search paths that preserve promising plans while performing guided exploration. Finally, LOGER introduces a loss function with reward weighting to further enhance performance robustness by reducing the fluctuation caused by poor operators, and log transformation to compress the range of rewards. We conduct experiments on Join Order Benchmark (JOB), TPC-DS and Stack Overflow, and demonstrate that LOGER can achieve a performance better than existing learned query optimizers, with a 2.07x speedup on JOB compared with PostgreSQL.

References

[1]
Emmanuel Abbe. 2017. Community detection and stochastic block models: recent developments. The Journal of Machine Learning Research 18, 1 (2017), 6446--6531.
[2]
Peter Auer. 2002. Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research 3, Nov (2002), 397--422.
[3]
Riccardo Cappuzzo, Paolo Papotti, and Saravanan Thirumuruganathan. 2020. Creating Embeddings of Heterogeneous Relational Datasets for Data Integration Tasks. In Proceedings of the 2020 International Conference on Management of Data, SIGMOD Conference 2020, online conference [Portland, OR, USA], June 14--19, 2020, David Maier, Rachel Pottinger, AnHai Doan, Wang-Chiew Tan, Abdussalam Alawini, and Hung Q. Ngo (Eds.). ACM, 1335--1349.
[4]
Nan Ding and Radu Soricut. 2017. Cold-start reinforcement learning with softmax policy gradient. Advances in Neural Information Processing Systems 30 (2017).
[5]
Vijay Prakash Dwivedi and Xavier Bresson. 2020. A generalization of transformer networks to graphs. arXiv preprint arXiv:2012.09699 (2020).
[6]
Sepp Hochreiter and Jürgen Schmidhuber. 1997. Long short-term memory. Neural computation 9, 8 (1997), 1735--1780.
[7]
Toshihide Ibaraki and Tiko Kameda. 1984. On the Optimal Nesting Order for Computing N-Relational Joins. ACM Trans. Database Syst. 9, 3 (1984), 482--502.
[8]
John J Irwin, Teague Sterling, Michael M Mysinger, Erin S Bolstad, and Ryan G Coleman. 2012. ZINC: a free tool to discover chemistry for biology. Journal of chemical information and modeling 52, 7 (2012), 1757--1768.
[9]
Leslie Pack Kaelbling, Michael L Littman, and Andrew W Moore. 1996. Reinforcement learning: A survey. Journal of artificial intelligence research 4 (1996), 237--285.
[10]
Thomas N. Kipf and Max Welling. 2017. Semi-Supervised Classification with Graph Convolutional Networks. In ICLR 2017.
[11]
Tim Kraska, Alex Beutel, Ed H. Chi, Jeffrey Dean, and Neoklis Polyzotis. 2018. The Case for Learned Index Structures. In SIGMOD 2018. 489--504.
[12]
Sanjay Krishnan, Zongheng Yang, Ken Goldberg, Joseph Hellerstein, and Ion Stoica. 2018. Learning to optimize join queries with deep reinforcement learning. arXiv preprint arXiv:1808.03196 (2018).
[13]
Viktor Leis, Andrey Gubichev, Atanas Mirchev, Peter Boncz, Alfons Kemper, and Thomas Neumann. 2015. How good are query optimizers, really? Proceedings of the VLDB Endowment 9, 3 (2015), 204--215.
[14]
Guoliang Li, Xuanhe Zhou, Shifu Li, and Bo Gao. 2019. QTune: A Query-Aware Database Tuning System with Deep Reinforcement Learning. Proc. VLDB Endow. 12, 12 (2019), 2118--2130.
[15]
Ryan Marcus, Parimarjan Negi, Hongzi Mao, Nesime Tatbul, Mohammad Alizadeh, and Tim Kraska. 2022. Bao: Making learned query optimization practical. ACM SIGMOD Record 51, 1 (2022), 6--13.
[16]
Ryan Marcus, Parimarjan Negi, Hongzi Mao, Chi Zhang, Mohammad Alizadeh, Tim Kraska, Olga Papaemmanouil, and Nesime Tatbul. 2019. Neo: A learned query optimizer. arXiv preprint arXiv:1904.03711 (2019).
[17]
Ryan Marcus and Olga Papaemmanouil. 2018. Deep reinforcement learning for join order enumeration. In Proceedings of the First International Workshop on Exploiting Artificial Intelligence Techniques for Data Management. 1--4.
[18]
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. 2013. Playing atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602 (2013).
[19]
Meikel Poess, Bryan Smith, Lubor Kollar, and Paul Larson. 2002. Tpc-ds, taking decision support benchmarking to the next level. In Proceedings of the 2002 ACM SIGMOD international conference on Management of data. 582--587.
[20]
Daniel J Russo, Benjamin Van Roy, Abbas Kazerouni, Ian Osband, Zheng Wen, et al. 2018. A tutorial on thompson sampling. Foundations and Trends® in Machine Learning 11, 1 (2018), 1--96.
[21]
Ibrahim Sabek, Tenzin Samten Ukyab, and Tim Kraska. 2022. LSched: A Workload-Aware Learned Query Scheduler for Analytical Database Systems. In SIGMOD 2022. ACM, 1228--1242.
[22]
P Griffiths Selinger, Morton M Astrahan, Donald D Chamberlin, Raymond A Lorie, and Thomas G Price. 1989. Access path selection in a relational database management system. In Readings in Artificial Intelligence and Databases. Elsevier, 511--522.
[23]
Ji Sun and Guoliang Li. 2019. An End-to-End Learning-based Cost Estimator. Proc. VLDB Endow. 13, 3 (2019), 307--319.
[24]
Kai Sheng Tai, Richard Socher, and Christopher D Manning. 2015. Improved semantic representations from tree-structured long short-term memory networks. arXiv preprint arXiv:1503.00075 (2015).
[25]
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention is All you Need. In NIPS 2017. 5998--6008.
[26]
Christopher John Cornish Hellaby Watkins. 1989. Learning from delayed rewards. (1989).
[27]
Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, and Philip S. Yu. 2021. A Comprehensive Survey on Graph Neural Networks. IEEE Trans. Neural Networks Learn. Syst. 32, 1 (2021), 4--24.
[28]
Zongheng Yang, Wei-Lin Chiang, Sifei Luan, Gautam Mittal, Michael Luo, and Ion Stoica. 2022. Balsa: Learning a Query Optimizer Without Expert Demonstrations. arXiv preprint arXiv:2201.01441 (2022).
[29]
Xiang Yu, Guoliang Li, Chengliang Chai, and Nan Tang. 2020. Reinforcement learning with tree-lstm for join order selection. In 2020 IEEE 36th International Conference on Data Engineering (ICDE). IEEE, 1297--1308.

Cited By

View all
  • (2024)Is Your Learned Query Optimizer Behaving As You Expect? A Machine Learning PerspectiveProceedings of the VLDB Endowment10.14778/3654621.365462517:7(1565-1577)Online publication date: 1-Mar-2024
  • (2024)Constrained Quadratic Model for Optimizing Join OrdersProceedings of the 1st Workshop on Quantum Computing and Quantum-Inspired Technology for Data-Intensive Systems and Applications10.1145/3665225.3665447(38-44)Online publication date: 9-Jun-2024
  • (2024)Low Rank Approximation for Learned Query OptimizationProceedings of the Seventh International Workshop on Exploiting Artificial Intelligence Techniques for Data Management10.1145/3663742.3663974(1-5)Online publication date: 14-Jun-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the VLDB Endowment
Proceedings of the VLDB Endowment  Volume 16, Issue 7
March 2023
203 pages
ISSN:2150-8097
Issue’s Table of Contents

Publisher

VLDB Endowment

Publication History

Published: 01 March 2023
Published in PVLDB Volume 16, Issue 7

Check for updates

Badges

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)180
  • Downloads (Last 6 weeks)18
Reflects downloads up to 13 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Is Your Learned Query Optimizer Behaving As You Expect? A Machine Learning PerspectiveProceedings of the VLDB Endowment10.14778/3654621.365462517:7(1565-1577)Online publication date: 1-Mar-2024
  • (2024)Constrained Quadratic Model for Optimizing Join OrdersProceedings of the 1st Workshop on Quantum Computing and Quantum-Inspired Technology for Data-Intensive Systems and Applications10.1145/3665225.3665447(38-44)Online publication date: 9-Jun-2024
  • (2024)Low Rank Approximation for Learned Query OptimizationProceedings of the Seventh International Workshop on Exploiting Artificial Intelligence Techniques for Data Management10.1145/3663742.3663974(1-5)Online publication date: 14-Jun-2024
  • (2024)CAFE: Towards Compact, Adaptive, and Fast Embedding for Large-scale Recommendation ModelsProceedings of the ACM on Management of Data10.1145/36393062:1(1-28)Online publication date: 26-Mar-2024
  • (2024)TESSM: Tree-based Selective State Space Models for Efficient Join Order Selection LearningProceedings of the 33rd ACM International Conference on Information and Knowledge Management10.1145/3627673.3679742(374-383)Online publication date: 21-Oct-2024
  • (2024)Learned Query Optimizer: What is New and What is NextCompanion of the 2024 International Conference on Management of Data10.1145/3626246.3654692(561-569)Online publication date: 9-Jun-2024
  • (2024)SPQO: Learning to Safely Reuse Cached Plans for Dynamic WorkloadsDatabase Systems for Advanced Applications10.1007/978-981-97-5552-3_21(315-330)Online publication date: 2-Jul-2024
  • (2024)A Novel Technique for Query Plan Representation Based on Graph Neural NetsBig Data Analytics and Knowledge Discovery10.1007/978-3-031-68323-7_25(299-314)Online publication date: 26-Aug-2024
  • (2023)Experimental Analysis of Large-Scale Learnable Vector Storage CompressionProceedings of the VLDB Endowment10.14778/3636218.363623417:4(808-822)Online publication date: 1-Dec-2023
  • (2023)Sample-Efficient Cardinality Estimation Using Geometric Deep LearningProceedings of the VLDB Endowment10.14778/3636218.363622917:4(740-752)Online publication date: 1-Dec-2023
  • Show More Cited By

View Options

Get Access

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media