Abstract
Traditional LSM-tree-based key-value storage systems face significant read amplification issues due to the multi-level structure of LSM-tree, the unordered SSTable files in Level 0, and the lack of an in-memory index structure for key-value pairs. We observed that, under the influence of workloads with locality features, key-value pairs exhibit a range-specific access intensity. Addressing the three reasons for LSM-tree read amplification, we have utilized the range-specific access intensity of workload to propose a multi-tree structure consisting of a B+ tree, a single-level hot tree, and an LSM-tree with partition-based Level 0. This aims to enhance the read performance of LSM-tree-based key-value storage systems. We designed the prototype, MTDB, based on LevelDB. The experimental results show that MTDB’s read performance is 1.62× to 2.02× that of LevelDB, and it approaches or exceeds the read performance of KVell and Bourbon while reducing memory overhead by 58.85%–86%.
Similar content being viewed by others
Data availability
No datasets were generated or analysed during the current study
References
O’Neil Patrick, Cheng Edward, Gawlick Dieter, O’Neil Elizabeth (1996) The log-structured merge-tree (lsm-tree). Acta Inf 33:351–385
Chang Fay, Dean Jeffrey, Ghemawat Sanjay, Hsieh Wilson C, Wallach Deborah A, Burrows Mike, Chandra Tushar, Fikes Andrew, Gruber Robert E (2008) Bigtable: A distributed storage system for structured data. ACM Trans. Comput. Syst. (TOCS) 26(2):1–26
LevelDB (2021) https://github.com/google/leveldb
RocksDB (2022) http://rocksdb.org/
Cassandra (2020) http://cassandra.apache.org/
HBase (2020) http://hbase.apache.org/
Lu Lanyue, Pillai Thanumalayan Sankaranarayana, Gopalakrishnan Hariharan, Arpaci-Dusseau Andrea C, Arpaci-Dusseau Remzi H (2017) Wisckey: separating keys from values in ssd-conscious storage. ACM Trans Storage 13(1):1–28
Agrawal Nitin, Prabhakaran Vijayan, Wobber Ted, Davis John D, Manasse Mark, Panigrahy Rina (2008) Design tradeoffs for ssd performance. In: 2008 USENIX Annual Technical Conference
Kai Lu, Zhao Nannan, Wan Jiguang, Fei Changhong, Zhao Wei, Deng Tongliang (2021) Tridentkv: a read-optimized lsm-tree based kv store via adaptive indexing and space-efficient partitioning. IEEE Trans Parallel Distrib Syst 33(8):1953–1966
Kaiyrakhmet Olzhas, Lee Songyi, Nam Beomseok, Noh Sam H, Choi Young-ri (2019) Slm-db:single-levelkey-value store with persistent memory. In: 17th USENIX Conference on File and Storage Technologies, pp 191–205
Lepers Baptiste, Balmau Oana, Gupta Karan, Zwaenepoel Willy (2019) Kvell: the design and implementation of a fast persistent key-value store. In: Proceedings of the 27th ACM Symposium on Operating Systems Principles, pp 447–461
Ma Chenlin, Yang Hao, Shangyu Wu, Wang Yi, Mao Rui (2022) Tidal-tree-mem: Toward read-intensive key-value stores with tidal structure based on lsm-tree. IEEE Trans Comput Aided Des Integr Circuits Syst 42(2):423–436
Li Yongkun, Tian Chengjin, Guo Fan, Li Cheng, Xu Yinlong (2019) Elasticbf: Elastic bloom filter with hotness awareness for boosting read performance in large key-value stores. In: 2019 USENIX Annual Technical Conference, pp 739–752
Atikoglu Berk, Xu Yuehai, Frachtenberg Eitan, Jiang Song, Paleczny Mike (2012) Workload analysis of a large-scale key-value store. In: Proceedings of the 12th ACM SIGMETRICS/PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems, pp 53–64
Cao Zhichao, Dong Siying, Vemuri Sagar, Du David HC (2020) Characterizing, modeling, and benchmarking rocksdb key-value workloads at facebook. In: 18th USENIX Conference on File and Storage Technologies, pp 209–223
Yang Juncheng, Yue Yao, Rashmi KV (2021) A large-scale analysis of hundreds of in-memory key-value cache clusters at twitter. ACM Trans Storage (TOS) 17(3):1–35
Cooper Brian F, Silberstein Adam, Tam Erwin, Ramakrishnan Raghu, Sears Russell (2010) Benchmarking cloud serving systems with ycsb. In: Proceedings of the 1st ACM symposium on Cloud computing, pp 143–154
Wikipedia. Bloom filter. https://en.wikipedia.org/wiki/Bloom_filter
Chen Jiqiang, Chen Liang, Wang Sheng, Zhu Guoyun, Sun Yuanyuan, Liu Huan, Li Feifei (2020) Hotring: A hotspot-aware in-memory key-value store. In: 18th USENIX Conference on File and Storage Technologies (FAST 20), pp 239–252
Appsflyer (2021) https://appsflyer.com
Flurry analytics (2021) https://flurry.com
Google firebase (2020) https://firebase.google.com
Pugh William (1990) Skip lists: a probabilistic alternative to balanced trees. Commun ACM 33(6):668–676
Pugh William (1998) A skip list cookbook. Technical report
Wei Qian, Chen Zehao, Chen Xiaowei, Zhang Yuhao, Cai Xiaojun, Jia Zhiping, Shen Zhaoyan, Wang Y, Shao Zili, Li Bingzhe (2023) A semantic-integrated lsm-tree based key-value storage engine for blockchain systems. In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
WiredTiger storage engine (2023) https://docs.mongodb.com/manual/core/wiredtiger/
Dai Yifan, Xu Yien, Ganesan Aishwarya, Alagappan Ramnatthan, Kroth Brian, Arpaci-Dusseau Andrea, Arpaci-Dusseau Remzi (2020) From wisckey to bourbon: A learned index for log-structured merge trees. In: 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20), pp 155–171
Li Fei, Lu Youyou, Yang Zhe, Shu Jiwu (2020) Sinekv: Decoupled secondary indexing for lsm-based key-value stores. In: 2020 IEEE 40th International Conference on Distributed Computing Systems, pp 1112–1122
Luo Yuhan, Lin Mingwei, Pan Yubiao, Zeshui Xu (2022) Dual locality-based flash translation layer for nand flash-based consumer electronics. IEEE Trans Consum Electron 68(3):281–290
Zhang Jianpeng, Lin Mingwei, Pan Yubiao, Zeshui Xu (2023) Crftl: Cache reallocation-based page-level flash translation layer for smartphones. IEEE Trans Consum Electron 69(3):671–679
Yu JiaPing, Chen HuaHui, Qian JiangBo, Dong YiHong (2020) Ltg-lsm: The optimal structure in lsm-tree combined with reading hotness. In: 2020 IEEE 26th International Conference on Parallel and Distributed Systems, pp 1–8
Zhong Wenshao, Chen Chen, Wu Xingbo, Jiang Song (2021) Remix: Efficient range query for lsm-trees. In: 19th USENIX Conference on File and Storage Technologies, pp 51–64
Zhang Qiang, Li Yongkun, Lee Patrick PC, Xu Yinlong, Cui Qiu, Tang Liu (2020) Unikv: Toward high-performance and scalable kv storage in mixed workloads via unified indexing. In: 2020 IEEE 36th International Conference on Data Engineering, pp 313–324
Wu Xingbo, Xu Yuehai, Shao Zili, Jiang Song (2015) Lsm-trie: An lsm-tree-basedultra-largekey-value store for small data items. In: 2015 USENIX Annual Technical Conference, pp 71–82
Ahn Jung-Sang, Seo Chiyoung, Mayuram Ravi, Yaseen Rahim, Kim Jin-Soo, Maeng Seungryoul (2015) Forestdb: A fast key-value storage system for variable-length string keys. IEEE Trans Comput 65(3):902–915
Balmau Oana, Didona Diego, Guerraoui Rachid, Zwaenepoel Willy, Yuan Huapeng, Arora Aashray, Gupta Karan, Konka Pavan (2017) Triad: Creating synergies between memory, disk and log in log structured key-value stores. In: 2017 USENIX Annual Technical Conference, pp 363–375
Levandoski Justin J, Larson Per-Åke, Stoica Radu (2013) Identifying hot and cold data in main-memory databases. In: 2013 IEEE 29th International Conference on Data Engineering, pp 26–37
Yao Ting, Wan Jiguang, Huang Ping, He Xubin, Fei Wu, Xie Changsheng (2017) Building efficient key-value stores via a lightweight compaction tree. ACM Trans Storage 13(4):1–28
Raju Pandian, Kadekodi Rohan, Chidambaram Vijay, Abraham Ittai (2017) Pebblesdb: Building key-value stores using fragmented log-structured merge trees. In: Proceedings of the 26th Symposium on Operating Systems Principles, pp 497–514
Wu Fenggang, Yang Ming-Hong, Zhang Baoquan, Du David HC (2020) Ac-key: Adaptive caching for lsm-based key-value stores. In: 2020 USENIX Annual Technical Conference (USENIX ATC 20), pp 603–615
Teng Dejun, Guo Lei, Lee Rubao, Chen Feng, Ma Siyuan, Zhang Yanfeng, Zhang Xiaodong (2017)Lsbm-tree: Re-enabling buffer caching in data management for mixed reads and writes. In: 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS), pp 68–79. IEEE
Yang Lei, Hong Wu, Zhang Tieying, Cheng Xuntao, Li Feifei, Zou Lei, Wang Yujie, Chen Rongyao, Wang Jianying, Huang Gui (2020) Leaper: a learned prefetcher for cache invalidation in lsm-tree based storage engines. Proc VLDB Endowment 13(12):1976–1989
Dayan Niv, Athanassoulis Manos, Idreos Stratos (2017) Monkey: Optimal navigable key-value store. In: Proceedings of the 2017 ACM International Conference on Management of Data, pp 79–94
Funding
This work was supported by Nature Science Foundation of Fujian Province under Grant No. 2021J01319, Fundamental Research Founds for the Central Universities of Huaqiao University under Grant No. ZQN-910, National Natural Science Foundation of China under Grant No. 61872086, Natural Science Foundation of Fujian Province of China under Grant No. 2022J06020, and the Young Top Talent of Young Eagle Program of Fujian Province, China under Grant No. F21E0011202B01.
Author information
Authors and Affiliations
Contributions
XL, YP, WF, HZ contributed to conceptualization, methodology, and investigation. XL done formal analysis, software, validation, and writing—original draft. YP and ML done supervision and resources. YP, WF, HZ, ML helped in writing—review and editing.
Corresponding authors
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Lin, X., Pan, Y., Feng, W. et al. MTDB: an LSM-tree-based key-value store using a multi-tree structure to improve read performance. J Supercomput 80, 23995–24025 (2024). https://doi.org/10.1007/s11227-024-06382-5
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11227-024-06382-5