Abstract
Configuring databases for efficient querying is a complex task, often carried out by a database administrator. Solving the problem of building indexes that truly optimize database access requires a substantial amount of database and domain knowledge, the lack of which often results in wasted space and memory for irrelevant indexes, possibly jeopardizing database performance for querying and certainly degrading performance for updating. In this paper, we develop the SmartIX architecture to solve the problem of automatically indexing a database by using reinforcement learning to optimize queries by indexing data throughout the lifetime of a database. We train and evaluate SmartIX performance using TPC-H, a standard, and scalable database benchmark. Our empirical evaluation shows that SmartIX converges to indexing configurations with superior performance compared to standard baselines we define and other reinforcement learning methods used in related work.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Notes
TPC: http://www.tpc.org/
References
Amado L, Meneguzzi F (2017) Reinforcement learning applied to RTS games. In: Proceedings of the 2017 workshop on adaptive learning agents (ALA), pp 1–8. ALA 2017, São Paulo. Brazil
Basu D, Lin Q, Chen W, Vo H T, Yuan Z, Senellart P, Bressan S (2016) Regularized cost-model oblivious database tuning with reinforcement learning. In: Transactions on large-scale data-and knowledge-centered systems XXVIII. Springer, Berlin, pp 96–132
Bellman R (1978) An introduction to artificial intelligence: Can computers think? Boyd & Fraser Pub. Co The University of Michigan
DB Group at UCSC: DBTune (2019). https://github.com/dbgroup-at-ucsc/dbtune
Duan S, Thummala V, Babu S (2009) Tuning database configuration parameters with ituned. Proceedings of the VLDB Endowment 2(1):1246–1257
Elfayoumy S, Patel J (2012) Database performance monitoring and tuning using intelligent agent assistants. In: Proceedings of the International Conference on Information and Knowledge Engineering (IKE), pp 1. The steering committee of the world congress in computer science, Computer …
EnterpriseDB: Enterprise Database (2019). https://www.enterprisedb.com
Giardino C, Paternoster N, Unterkalmsteiner M, Gorschek T, Abrahamsson P (2016) Software development in startup companies: The greenfield startup model. IEEE Trans Softw Eng 42(6):585–604
Kraska T, Beutel A, Chi E H, Dean J, Polyzotis N (2018) The case for learned index structures. In: Proceedings of the 2018 international conference on management of data, pp 489–504. ACM
Licks GP, Couto J, Meneguzzi F, Ruiz D, de Fátima Miehe P, de Paris R (2019) mir-pucrs/smartix-rl v1.0. https://doi.org/10.5281/zenodo.3254967
Marcus R, Papaemmanouil O (2018) Deep reinforcement learning for join order enumeration. arXiv:1803.00055, 1–7
Mnih V, Kavukcuoglu K, Silver D, Graves A, Antonoglou I, Wierstra D, Riedmiller MA (2013) Playing atari with deep reinforcement learning. arXiv:1312.5602, 9
Neuhaus P, Couto J, Wehrmann J, Ruiz D, Meneguzzi F (2019) GADIS: A genetic algorithm for database index selection. In: The 31st international conference on software engineering and knowledge engineering (SEKE), pp 39–42. https://doi.org/10.18293/SEKE2019-135
Olofson C W (2018) Ensuring a fast, reliable, and secure database through automation: Oracle autonomous database. White paper, IDC Corporate USA, Sponsored by: Oracle Corp
Ortiz J, Balazinska M, Gehrke J, Keerthi SS (2018) Learning state representations for query optimization with deep reinforcement learning. arXiv:1803.08604, 1–5
Pavlo A, Angulo G, Arulraj J, Lin H, Lin J, Ma L, Menon P, Mowry T C, Perron M, Quah I et al (2017) Self-driving database management systems. In: Conference on innovative data systems research (CIDR). CIDR, Chaminade, California, pp 1–6
Pedrozo W G, Nievola J C, Ribeiro D C (2018) An adaptive approach for index tuning with learning classifier systems on hybrid storage environments. In: International conference on hybrid artificial intelligence systems, pp 716–729. Springer
Petraki E, Idreos S, Manegold S (2015) Holistic indexing in main-memory column-stores. In: Proceedings of the 2015 ACM SIGMOD international conference on management of data, SIGMOD ’15. https://doi.org/10.1145/2723372.2723719. ACM, New York, pp 1153–1166
Popovic J (2017) Automatic tuning - SQL Server. https://docs.microsoft.com/en-us/sql/relational-databases/automatic-tuning/automatic-tuning. Accessed: 2019-06-17
POWA (2019) PostreSQL workload analyzer. http://powa.readthedocs.io
Ramakrishnan R, Gehrke J (2003) Database management systems, 3rd edn. McGraw-Hill, Inc., New York
Sharma A, Schuhknecht FM, Dittrich J (2018) The case for automatic database administration using deep reinforcement learning. arXiv:1801.05643, 1–9
Sutton R S, Barto A G (2018) Reinforcement learning: An introduction. MIT Press, Cambridge
Sutton R S, Barto A G (2018) Reinforcement learning: An introduction, chap. Introduction. MIT Press, Cambridge, pp 1–13
Sutton R S, Barto A G (2018) Reinforcement learning: An introduction, chap. Applications and case studies. MIT Press, Cambridge, pp 421–453
Sutton R S, Barto A G (2018) Reinforcement learning: An introduction, chap. Temporal-difference learning. MIT Press, Cambridge, pp 119–138
Thanopoulou A, Carreira P, Galhardas H (2012) Benchmarking with TPC-H on off-the-shelf hardware. In: 14th international conference on enterprise information systems. Springer, Wroclaw, pp 205–208
TPC: Transaction performance council website (TPC) (1998) http://www.tpc.org/
Tsitsiklis J N, Roy B V (1996) An analysis of temporal-difference learning with function approximation. Tech. rep., Report LIDS-P-2322). Laboratory for information and decision systems, Massachusetts Institute of Technology
Van Aken D, Pavlo A, Gordon G J, Zhang B (2017) Automatic database management system tuning through large-scale machine learning. In: Proceedings of the 2017 ACM international conference on management of data, pp 1009–1024. ACM
Wang J, Liu W, Kumar S, Chang S F (2016) Learning to hash for indexing big data—a survey. Proc IEEE 104(1):34–57
Acknowledgements
This work was supported by SAP SE. We thank our colleagues from SAP Labs Latin America who provided insights and expertise that greatly assisted the research.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interests
The authors declare that they have no conflict of interest.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
Appendix
1.1 A Index configurations
The corresponding indexed columns in each of the configurations we use in our experiments are shown in Table 3.
Rights and permissions
About this article
Cite this article
Paludo Licks, G., Colleoni Couto, J., de Fátima Miehe, P. et al. SmartIX: A database indexing agent based on reinforcement learning. Appl Intell 50, 2575–2588 (2020). https://doi.org/10.1007/s10489-020-01674-8
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10489-020-01674-8