Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3465998.3466004acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article
Open access

KallaxDB: A Table-less Hash-based Key-Value Store on Storage Hardware with Built-in Transparent Compression

Published: 20 June 2021 Publication History

Abstract

This paper studies the design of a key-value (KV) store that can take full advantage of modern storage hardware with built-in transparent compression capability. Many modern storage appliances/drives implement hardware-based data compression, transparent to OS and applications. Moreover, the growing deployment of hardware-based compression in Cloud infrastructure leads to the imminent arrival of Cloud-based storage hardware with built-in transparent compression. By decoupling the logical storage space utilization efficiency from the true physical storage usage, transparent compression allows data management software to purposely waste logical storage space in return for simpler data structures and algorithms, leading to lower implementation complexity and higher performance. This work proposes a table-less hash-based KV store, where the basic idea is to hash the key space directly onto the logical storage space without using a hash table at all. With a substantially simplified data structure, this approach is subject to significant logical storage space under-utilization, which can be seamlessly mitigated by storage hardware with transparent compression. This paper presents the basic KV store architecture, and develops mathematical formulations to assist its configuration and analysis. We implemented such a KV store KallaxDB and carried out experiments on a commercial SSD with built-in transparent compression. The results show that, while consuming very little memory resource, it compares favorably with the other modern KV stores in terms of throughput, latency, and CPU usage.

References

[1]
Apache Cassandra. [n.d.]. http://cassandra.apache.org/.
[2]
AWS Graviton Processor. [n.d.]. https://aws.amazon.com/ec2/graviton/.
[3]
Oana Balmau, Diego Didona, Rachid Guerraoui, Willy Zwaenepoel, Huapeng Yuan, Aashray Arora, Karan Gupta, and Pavan Konka. 2017. TRIAD: Creating Synergies Between Memory, Disk and Log in Log Structured Key-Value Stores. In Proceedings of USENIX Annual Technical Conference (ATC). 363--375.
[4]
Jeff Bonwick, Matt Ahrens, Val Henson, Mark Maybee, and Mark Shellenbaum. 2003. The zettabyte file system. In Proceedings of the Usenix Conference on File and Storage Technologies (FAST), Vol. 215.
[5]
Zhichao Cao, Siying Dong, Sagar Vemuri, and David Du. 2020. Characterizing, modeling, and benchmarking RocksDB key-value workloads at Facebook. In USENIX Conference on File and Storage Technologies (FAST). 209--223.
[6]
Helen HW Chan, Chieh-Jan Mike Liang, Yongkun Li, Wenjia He, Patrick PC Lee, Lianjie Zhu, Yaozu Dong, Yinlong Xu, Yu Xu, Jin Jiang, et al. 2018. HashKV: Enabling Efficient Updates in KV Storage via Hashing. In Proceedings of USENIX Annual Technical Conference (ATC). 1007--1019.
[7]
Derek Chiou, Eric Chung, and Susan Carrie. 2019. (Cloud) Acceleration at Microsoft. Tutorial at Hot Chips (2019).
[8]
Brian Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, and Russell Sears. 2010. Benchmarking cloud serving systems with YCSB. In Proceedings of the ACM Symposium on Cloud Computing (SoCC). ACM, 143--154.
[9]
Niv Dayan and Stratos Idreos. 2018. Dostoevsky: Better space-time trade-offs for LSM-tree based key-value stores via adaptive removal of superfluous merging. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, 505--520.
[10]
Biplob Debnath, Sudipta Sengupta, and Jin Li. 2010. FlashStore: high throughput persistent key-value store. Proceedings of the VLDB Endowment 3, 1-2 (2010), 1414--1425.
[11]
Biplob Debnath, Sudipta Sengupta, and Jin Li. 2011. SkimpyStash: RAM space skimpy key-value store on flash-based storage. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of data. 25--36.
[12]
Dell EMC PowerMax. [n.d.]. https://delltechnologies.com/.
[13]
E. F. Haratsch. 2019. SSD with Compression: Implementation, Interface and Use Case. In Flash Memory Summit.
[14]
HPE Nimble Storage. [n.d.]. https://www.hpe.com/.
[15]
Gui Huang, Xuntao Cheng, Jianying Wang, Yujie Wang, Dengcheng He, Tieying Zhang, Feifei Li, Sheng Wang, Wei Cao, and Qiang Li. 2019. X-Engine: An optimized storage engine for large-scale E-commerce transaction processing. In Proceedings of the ACM SIGMOD International Conference on Management of Data. ACM, 651--665.
[16]
Kornilios Kourtis, Nikolas Ioannou, and Ioannis Koltsidas. 2019. Reaping the performance of fast NVM storage with uDepot. In USENIX Conference on File and Storage Technologies (FAST). 1--15.
[17]
Harold J Larson. 1995. Introduction to Probability. Addison-Wesley, Reading, MA.
[18]
Baptiste Lepers, Oana Balmau, Karan Gupta, and Willy Zwaenepoel. 2019. KVell: the design and implementation of a fast persistent key-value store. In Proceedings of the ACM Symposium on Operating Systems Principles (SOSP). 447--461.
[19]
Hyeontaek Lim, Bin Fan, David G Andersen, and Michael Kaminsky. 2011. SILT: A memory-efficient, high-performance key-value store. In Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles. 1--13.
[20]
Lanyue Lu, Thanumalayan Sankaranarayana Pillai, Hariharan Gopalakrishnan, Andrea C Arpaci-Dusseau, and Remzi H Arpaci-Dusseau. 2017. WiscKey: Separating keys from values in SSD-conscious storage. ACM Transactions on Storage (TOS) 13, 1 (2017), 5.
[21]
C. Luo and M.J. Carey. 2020. LSM-based storage techniques: a survey. The VLDB Journal 29 (2020), 393--418.
[22]
Leonardo Marmol, Swaminathan Sundararaman, Nisha Talagala, and Raju Rangaswami. 2015. NVMKV: A Scalable, Lightweight, FTL-aware Key-Value Store. In USENIX Annual Technical Conference (ATC). 207--219.
[23]
Patrick O'Neil, Edward Cheng, Dieter Gawlick, and Elizabeth O'Neil. 1996. The log-structured merge-tree (LSM-tree). Acta Informatica 33, 4 (1996), 351--385.
[24]
Anastasios Papagiannis, Giorgos Saloustros, Pilar González-Férez, and Angelos Bilas. 2016. Tucana: Design and implementation of a fast and efficient scale-up key-value store. In Proceedings of USENIX Annual Technical Conference (ATC). 537--550.
[25]
Pure Storage FlashBlade. [n.d.]. https://purestorage.com/.
[26]
Pandian Raju, Rohan Kadekodi, Vijay Chidambaram, and Ittai Abraham. 2017. PebblesDB: Building Key-Value Stores Using Fragmented Log-Structured Merge Trees. In Proceedings of the Symposium on Operating Systems Principles (SOSP). 497--514.
[27]
Kai Ren, Qing Zheng, Joy Arulraj, and Garth Gibson. 2017. SlimDB: A space-efficient key-value storage engine for semi-sorted data. Proceedings of the VLDB Endowment 10, 13 (2017), 2037--2048.
[28]
RocksDB. [n.d.]. https://github.com/facebook/rocksdb.
[29]
Ohad Rodeh, Josef Bacik, and Chris Mason. 2013. BTRFS: The Linux B-tree filesystem. ACM Transactions on Storage (TOS) 9, 3 (2013), 1--32.
[30]
ScaleFlux Computational Storage. [n.d.]. http://scaleflux.com.
[31]
Pradeep J Shetty, Richard P Spillane, Ravikant R Malpani, Binesh Andrews, Justin Seyster, and Erez Zadok. 2013. Building workload-independent storage with VT-trees. In Proceedings of USENIX Conference on File and Storage Technologies (FAST). 17--30.
[32]
WiredTiger. [n.d.]. https://github.com/wiredtiger/.
[33]
Xingbo Wu, Yuehai Xu, Zili Shao, and Song Jiang. 2015. LSM-trie: An LSM-tree-based ultra-large key-value store for small data items. In Proceedings of USENIX Annual Technical Conference (ATC). 71--82.
[34]
Yinliang Yue, Bingsheng He, Yuzhe Li, and Weiping Wang. 2016. Building an efficient put-intensive key-value store with skip-tree. IEEE Transactions on Parallel and Distributed Systems 28, 4 (2016), 961--973.

Cited By

View all
  • (2024)Experimental Analysis of Large-Scale Learnable Vector Storage CompressionProceedings of the VLDB Endowment10.14778/3636218.363623417:4(808-822)Online publication date: 5-Mar-2024
  • (2024)A DNN partitioning framework with controlled lossy mechanisms for edge-cloud collaborative intelligenceFuture Generation Computer Systems10.1016/j.future.2024.01.006154:C(426-439)Online publication date: 25-Jun-2024
  • (2023)Breathing New Life into an Old Tree: Resolving Logging Dilemma of B+-tree on Modern Computational Storage DrivesProceedings of the VLDB Endowment10.14778/3626292.362629717:2(134-147)Online publication date: 1-Oct-2023
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
DAMON '21: Proceedings of the 17th International Workshop on Data Management on New Hardware
June 2021
104 pages
ISBN:9781450385565
DOI:10.1145/3465998
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 June 2021

Permissions

Request permissions for this article.

Check for updates

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • NSF

Conference

SIGMOD/PODS '21
Sponsor:

Acceptance Rates

DAMON '21 Paper Acceptance Rate 15 of 17 submissions, 88%;
Overall Acceptance Rate 94 of 127 submissions, 74%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)221
  • Downloads (Last 6 weeks)25
Reflects downloads up to 10 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Experimental Analysis of Large-Scale Learnable Vector Storage CompressionProceedings of the VLDB Endowment10.14778/3636218.363623417:4(808-822)Online publication date: 5-Mar-2024
  • (2024)A DNN partitioning framework with controlled lossy mechanisms for edge-cloud collaborative intelligenceFuture Generation Computer Systems10.1016/j.future.2024.01.006154:C(426-439)Online publication date: 25-Jun-2024
  • (2023)Breathing New Life into an Old Tree: Resolving Logging Dilemma of B+-tree on Modern Computational Storage DrivesProceedings of the VLDB Endowment10.14778/3626292.362629717:2(134-147)Online publication date: 1-Oct-2023
  • (2023)Elastic RAID: Implementing RAID over SSDs with Built-in Transparent CompressionProceedings of the 16th ACM International Conference on Systems and Storage10.1145/3579370.3594773(83-93)Online publication date: 5-Jun-2023
  • (2023)Relational Fabric: Transparent Data Transformation2023 IEEE 39th International Conference on Data Engineering (ICDE)10.1109/ICDE55515.2023.00297(3688-3698)Online publication date: Apr-2023

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Get Access

Login options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media