CloudJump: optimizing cloud databases for cloud storages

Published: 01 August 2022

Abstract

There has been increasing interest in building cloud-native databases that decouple computation and storage for elasticity. A cloud-native database often adopts cloud storage underneath its storage engine, leveraging another layer of virtualization and providing a high-performance, elastic storage service without exposing complex storage details. This helps reduce maintenance costs and expedite development cycles for database kernels. We have observed significant differences between local and cloud storage that invalidate many designs inside existing databases when they are ported to cloud storage. In this paper, we analyze the challenges and opportunities of both B-tree-based and LSM-tree-based storage engines when they are deployed on cloud storage. We propose an optimization framework that guides database developers in transforming on-premise databases into their cloud-native counterparts. We use the B+-tree-based InnoDB as a demonstration vehicle, implementing a suite of optimizations with the proposed framework, and extend these efforts to the LSM-tree-based RocksDB. On both engines, our evaluations show significant performance improvements on cloud storage.

Cited By

  • (2024) SplitFT: Fault Tolerance for Disaggregated Datacenters via Remote Memory Logging. Proceedings of the Nineteenth European Conference on Computer Systems, 590-607. DOI: 10.1145/3627703.3629561. Online publication date: 22-Apr-2024
  • (2023) Breathing New Life into an Old Tree: Resolving Logging Dilemma of B+-tree on Modern Computational Storage Drives. Proceedings of the VLDB Endowment 17:2, 134-147. DOI: 10.14778/3626292.3626297. Online publication date: 1-Oct-2023
  • (2023) PolarDB-IMCI: A Cloud-Native HTAP Database System at Alibaba. Proceedings of the ACM on Management of Data 1:2, 1-25. DOI: 10.1145/3589785. Online publication date: 20-Jun-2023

Published In

Proceedings of the VLDB Endowment, Volume 15, Issue 12
August 2022
551 pages
ISSN: 2150-8097

Publisher

VLDB Endowment

Qualifiers

  • Research-article
