Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article

An inquiry into machine learning-based automatic configuration tuning services on real-world database management systems

Published: 01 March 2021 Publication History

Abstract

Modern database management systems (DBMS) expose dozens of configurable knobs that control their runtime behavior. Setting these knobs correctly for an application's workload can improve the performance and efficiency of the DBMS. But because of their complexity, tuning a DBMS often requires considerable effort from experienced database administrators (DBAs). Recent work on automated tuning methods using machine learning (ML) have shown to achieve better performance compared with expert DBAs. These ML-based methods, however, were evaluated on synthetic workloads with limited tuning opportunities, and thus it is unknown whether they provide the same benefit in a production environment.
To better understand ML-based tuning, we conducted a thorough evaluation of ML-based DBMS knob tuning methods on an enterprise database application. We use the OtterTune tuning service to compare three state-of-the-art ML algorithms on an Oracle installation with a real workload trace. Our results with OtterTune show that these algorithms generate knob configurations that improve performance by 45% over enterprise-grade configurations. We also identify deployment and measurement issues that were overlooked by previous research in automated DBMS tuning services.

References

[1]
2011. MySQL Tuning Primer Script. https://launchpad.net/mysql-tuning-primer.
[2]
2021. FIO: Flexible I/O Tester. https://fio.readthedocs.io/en/latest/fio_doc.html.
[3]
2021. OtterTune. https://ottertune.cs.cmu.edu.
[4]
2021. OtterTune - Automated Database Tuning Service. https://ottertune.com.
[5]
2021. PostgreSQL Configuration Wizard. https://pgtune.leopard.in.ua.
[6]
2021. Société Générale. https://www.societegenerale.com.
[7]
Sanjay Agrawal, Surajit Chaudhuri, and Vivek R. Narasayya. 2000. Automated Selection of Materialized Views and Indexes in SQL Databases. In Proceedings of the 26th International Conference on Very Large Data Bases. 496--505.
[8]
Sanjay Agrawal, Vivek Narasayya, and Beverly Yang. 2004. Integrating Vertical and Horizontal Partitioning into Automated Physical Database Design. In Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data. 359--370.
[9]
James Bergstra and Yoshua Bengio. 2012. Random Search for Hyper-Parameter Optimization. Journal of Machine Learning Research 13, 10 (2012), 281--305.
[10]
Leo Breiman, Jerome Friedman, Charles J. Stone, and Richard A. Olshen. 1984. Classification and Regression Trees. CRC press.
[11]
Surajit Chaudhuri and Vivek Narasayya. 1998. AutoAdmin "What-If" Index Analysis Utility. In Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data. 367--378.
[12]
Surajit Chaudhuri and Vivek Narasayya. 2007. Self-Tuning Database Systems: A Decade of Progress. In Proceedings of the 33rd International Conference on Very Large Data Bases. 3--14.
[13]
Carlo Curino, Evan Jones, Yang Zhang, and Sam Madden. 2010. Schism: A Workload-Driven Approach to Database Replication and Partitioning. Proceedings of the VLDB Endowment 3, 1-2 (2010), 48--57.
[14]
Benoît Dageville and Mohamed Zait. 2002. SQL Memory Management in Oracle9I. In Proceedings of the 28th International Conference on Very Large Data Bases. 962--973.
[15]
Karl Dias, Mark Ramacher, Uri Shaft, Venkateshwaran Venkataramani, and Graham Wood. 2005. Automatic Performance Diagnosis and Tuning in Oracle. In CIDR. 84--94.
[16]
Djellel Eddine Difallah, Andrew Pavlo, Carlo Curino, and Philippe Cudré-Mauroux. 2013. OLTP-Bench: An Extensible Testbed for Benchmarking Relational Databases. Proceedings of the VLDB Endowment 7, 4 (2013), 277--288.
[17]
Songyun Duan,Vamsidhar Thummala,and Shivnath Babu. 2009. Tuning Database Configuration Parameters with iTuned. Proceedings of the VLDB Endowment 2, 1 (2009), 1246--1257.
[18]
Kurt Engeleiter, John Beresniewicz, and Cecilia Gervasio. 2010. Maximizing Database Performance: Performance Tuning with DB Time. Retrieved December 29, 2020 from https://www.oracle.com/technetwork/oem/db-mgmt/s317294-db-perf-tuning-with-db-time-181631.pdf
[19]
Leonidas Galanis, Supiti Buranawatanachoke, Romain Colle, Benoît Dageville, Karl Dias, Jonathan Klein, Stratos Papadomanolakis, Leng Leng Tan, Venkateshwaran Venkataramani, Yujun Wang, and Graham Wood. 2008. Oracle Database Replay. In Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data. 1159--1170.
[20]
Michael Hammer and Arvola Chan. 1976. Index Selection in a Self-Adaptive Data Base Management System. In Proceedings of the 1976 ACM SIGMOD International Conference on Management of Data. 1--8.
[21]
Michael Hammer and Bahram Niamir. 1979. A Heuristic Approach to Attribute Partitioning. In Proceedings of the 1979 ACM SIGMOD International Conference on Management of Data. 93--101.
[22]
Charles R. Hicks and Kenneth V. Turner. 1997. Fundamental Concepts in the Design of Experiments (5 ed.). Oxford University Press.
[23]
Windsor W. Hsu, Alan Jay Smith, and Honesty C. Young. 2001. Characteristics of Production Database Workloads and the TPC Benchmarks. IBM Systems Journal 40, 3 (2001), 781--802.
[24]
Konstantinos Kanellis, Ramnatthan Alagappan, and Shivaram Venkataraman. 2020. Too Many Knobs to Tune? Towards Faster Database Tuning by Pre-selecting Important Knobs. In 12th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage 20).
[25]
Jens Krueger, Changkyu Kim, Martin Grund, Nadathur Satish, David Schwalb, Jatin Chhugani, Hasso Plattner, Pradeep Dubey, and Alexander Zeier. 2011. Fast Updates on Read-Optimized Databases Using Multi-Core CPUs. Proceedings of the VLDB Endowment 5, 1 (2011), 61--72.
[26]
Mayuresh Kunjir and Shivnath Babu. 2020. Black or White? How to Develop an AutoTuner for Memory-based Analytics. In Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. 1667--1683.
[27]
Eva Kwan, Sam Lightstone, Adam Storm, and Leanne Wu. 2002. Automatic Configuration for IBM DB2 Universal Database. Technical Report. IBM.
[28]
Guoliang Li, Xuanhe Zhou, Shifu Li, and Bo Gao. 2019. QTune: A Query-Aware Database Tuning System with Deep Reinforcement Learning. Proceedings of the VLDB Endowment 12, 12 (2019), 2118--2130.
[29]
Andrew Pavlo, Carlo Curino, and Stanley Zdonik. 2012. Skew-Aware Automatic Database Partitioning in Shared-Nothing, Parallel OLTP Systems. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data. 61--72.
[30]
Matthias Plappert, Rein Houthooft, Prafulla Dhariwal, Szymon Sidor, Richard Y. Chen, Xi Chen, Tamim Asfour, Pieter Abbeel, and Marcin Andrychowicz. 2018. Parameter Space Noise for Exploration. In 6th International Conference on Learning Representations (ICLR 2018).
[31]
Carl Edward Rasmussen. 2003. Gaussian Processes in Machine Learning. In Summer School on Machine Learning. Springer, 63--71.
[32]
Jörg Schad, Jens Dittrich, and Jorge-Arnulfo Quiané-Ruiz. 2010. Runtime Measurements in the Cloud: Observing, Analyzing, and Reducing Variance. Proceedings of the VLDB Endowment 3, 1--2 (2010), 460--471.
[33]
Bobak Shahriari, Kevin Swersky, Ziyu Wang, Ryan P. Adams, and Nando De Freitas. 2016. Taking the Human Out of the Loop: A Review of Bayesian Optimization. Proc. IEEE 104, 1 (2016), 148--175.
[34]
Jasper Snoek, Hugo Larochelle, and Ryan P. Adams. 2012. Practical Bayesian Optimization of Machine Learning Algorithms. In Advances in Neural Information Processing Systems. 2951--2959.
[35]
Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: A Simple Way to Prevent Neural Networks from Overfitting. Journal of Machine Learning Research 15, 56 (2014), 1929--1958.
[36]
Jian Tan, Tieying Zhang, Feifei Li, Jie Chen, Qixing Zheng, Ping Zhang, Honglin Qiao, Yue Shi, Wei Cao, and Rui Zhang. 2019. iBTune: Individualized Buffer Tuning for Large-Scale Cloud Databases. Proceedings ofthe VLDB Endowment 12, 10 (2019), 1221--1234.
[37]
Robert Tibshirani. 1996. Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society: Series B (Methodological) 58, 1 (1996), 267--288.
[38]
Dana Van Aken, Andrew Pavlo, Geoffrey J. Gordon, and Bohan Zhang. 2017. Automatic Database Management System Tuning Through Large-scale Machine Learning. In Proceedings of the 2017 ACM International Conference on Management of Data. 1009--1024.
[39]
Bohan Zhang. 2021. https://github.com/bohanjason/ottertune.
[40]
Ji Zhang, Yu Liu, Ke Zhou, Guoliang Li, Zhili Xiao, Bin Cheng, Jiashu Xing, Yangtao Wang, Tianheng Cheng, Li Liu, Minwei Ran, and Zekang Li. 2019. An End-to-End Automatic Cloud Database Tuning System Using Deep Reinforcement Learning. In Proceedings of the 2019 International Conference on Management of Data. 415--432.
[41]
Xuanhe Zhou, Chengliang Chai, Guoliang Li, and Ji Sun. 2020. Database Meets Artificial Intelligence: A Survey. IEEE Transactions on Knowledge and Data Engineering (2020).
[42]
Yuqing Zhu, Jianxun Liu, Mengying Guo, Yungang Bao, Wenlong Ma, Zhuoyue Liu, Kunpeng Song, and Yingchun Yang. 2017. BestConfig: Tapping the Performance Potential of Systems via Automatic Configuration Tuning. In Proceedings of the 2017 Symposium on Cloud Computing. 338--350.

Cited By

View all
  • (2024)The Holon Approach for Simultaneously Tuning Multiple Components in a Self-Driving Database Management System with Machine Learning via Synthesized Proto-ActionsProceedings of the VLDB Endowment10.14778/3681954.368200717:11(3373-3387)Online publication date: 30-Aug-2024
  • (2024)GPTuner: A Manual-Reading Database Tuning System via GPT-Guided Bayesian OptimizationProceedings of the VLDB Endowment10.14778/3659437.365944917:8(1939-1952)Online publication date: 1-Apr-2024
  • (2024)CAMAL: Optimizing LSM-trees via Active LearningProceedings of the ACM on Management of Data10.1145/36771382:4(1-26)Online publication date: 30-Sep-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the VLDB Endowment
Proceedings of the VLDB Endowment  Volume 14, Issue 7
March 2021
130 pages
ISSN:2150-8097
Issue’s Table of Contents

Publisher

VLDB Endowment

Publication History

Published: 01 March 2021
Published in PVLDB Volume 14, Issue 7

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)163
  • Downloads (Last 6 weeks)26
Reflects downloads up to 01 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)The Holon Approach for Simultaneously Tuning Multiple Components in a Self-Driving Database Management System with Machine Learning via Synthesized Proto-ActionsProceedings of the VLDB Endowment10.14778/3681954.368200717:11(3373-3387)Online publication date: 30-Aug-2024
  • (2024)GPTuner: A Manual-Reading Database Tuning System via GPT-Guided Bayesian OptimizationProceedings of the VLDB Endowment10.14778/3659437.365944917:8(1939-1952)Online publication date: 1-Apr-2024
  • (2024)CAMAL: Optimizing LSM-trees via Active LearningProceedings of the ACM on Management of Data10.1145/36771382:4(1-26)Online publication date: 30-Sep-2024
  • (2024)ML-Powered Index Tuning: An Overview of Recent Progress and Open ChallengesACM SIGMOD Record10.1145/3641832.364183652:4(19-30)Online publication date: 19-Jan-2024
  • (2024)Tuning Database Parameters Using Query Perception and Evolutionary Reinforcement LearningProceedings of the 2024 8th International Conference on Control Engineering and Artificial Intelligence10.1145/3640824.3640869(52-59)Online publication date: 26-Jan-2024
  • (2024)Mozi: Discovering DBMS Bugs via Configuration-Based Equivalent TransformationProceedings of the IEEE/ACM 46th International Conference on Software Engineering10.1145/3597503.3639112(1-12)Online publication date: 20-May-2024
  • (2024)LATuner: An LLM-Enhanced Database Tuning System Based on Adaptive Surrogate ModelMachine Learning and Knowledge Discovery in Databases. Research Track10.1007/978-3-031-70362-1_22(372-388)Online publication date: 8-Sep-2024
  • (2023)A Sample-Aware Database Tuning System With Deep Reinforcement LearningJournal of Database Management10.4018/JDM.33351935:1(1-25)Online publication date: 9-Nov-2023
  • (2023)An Efficient Transfer Learning Based Configuration Adviser for Database TuningProceedings of the VLDB Endowment10.14778/3632093.363211417:3(539-552)Online publication date: 1-Nov-2023
  • (2023)Demonstrating Waffle: A Self-Driving Grid IndexProceedings of the VLDB Endowment10.14778/3611540.361159416:12(3954-3957)Online publication date: 1-Aug-2023
  • Show More Cited By

View Options

Get Access

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media