Nothing Special   »   [go: up one dir, main page]

Skip to main content
Log in

MTDB: an LSM-tree-based key-value store using a multi-tree structure to improve read performance

  • Published:
The Journal of Supercomputing Aims and scope Submit manuscript

Abstract

Traditional LSM-tree-based key-value storage systems face significant read amplification issues due to the multi-level structure of LSM-tree, the unordered SSTable files in Level 0, and the lack of an in-memory index structure for key-value pairs. We observed that, under the influence of workloads with locality features, key-value pairs exhibit a range-specific access intensity. Addressing the three reasons for LSM-tree read amplification, we have utilized the range-specific access intensity of workload to propose a multi-tree structure consisting of a B+ tree, a single-level hot tree, and an LSM-tree with partition-based Level 0. This aims to enhance the read performance of LSM-tree-based key-value storage systems. We designed the prototype, MTDB, based on LevelDB. The experimental results show that MTDB’s read performance is 1.62× to 2.02× that of LevelDB, and it approaches or exceeds the read performance of KVell and Bourbon while reducing memory overhead by 58.85%–86%.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Algorithm 1
Algorithm 2
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15
Fig. 16

Similar content being viewed by others

Data availability

No datasets were generated or analysed during the current study

References

  1. O’Neil Patrick, Cheng Edward, Gawlick Dieter, O’Neil Elizabeth (1996) The log-structured merge-tree (lsm-tree). Acta Inf 33:351–385

    Article  Google Scholar 

  2. Chang Fay, Dean Jeffrey, Ghemawat Sanjay, Hsieh Wilson C, Wallach Deborah A, Burrows Mike, Chandra Tushar, Fikes Andrew, Gruber Robert E (2008) Bigtable: A distributed storage system for structured data. ACM Trans. Comput. Syst. (TOCS) 26(2):1–26

    Article  Google Scholar 

  3. LevelDB (2021) https://github.com/google/leveldb

  4. RocksDB (2022) http://rocksdb.org/

  5. Cassandra (2020) http://cassandra.apache.org/

  6. HBase (2020) http://hbase.apache.org/

  7. Lu Lanyue, Pillai Thanumalayan Sankaranarayana, Gopalakrishnan Hariharan, Arpaci-Dusseau Andrea C, Arpaci-Dusseau Remzi H (2017) Wisckey: separating keys from values in ssd-conscious storage. ACM Trans Storage 13(1):1–28

    Article  Google Scholar 

  8. Agrawal Nitin, Prabhakaran Vijayan, Wobber Ted, Davis John D, Manasse Mark, Panigrahy Rina (2008) Design tradeoffs for ssd performance. In: 2008 USENIX Annual Technical Conference

  9. Kai Lu, Zhao Nannan, Wan Jiguang, Fei Changhong, Zhao Wei, Deng Tongliang (2021) Tridentkv: a read-optimized lsm-tree based kv store via adaptive indexing and space-efficient partitioning. IEEE Trans Parallel Distrib Syst 33(8):1953–1966

    Google Scholar 

  10. Kaiyrakhmet Olzhas, Lee Songyi, Nam Beomseok, Noh Sam H, Choi Young-ri (2019) Slm-db:single-levelkey-value store with persistent memory. In: 17th USENIX Conference on File and Storage Technologies, pp 191–205

  11. Lepers Baptiste, Balmau Oana, Gupta Karan, Zwaenepoel Willy (2019) Kvell: the design and implementation of a fast persistent key-value store. In: Proceedings of the 27th ACM Symposium on Operating Systems Principles, pp 447–461

  12. Ma Chenlin, Yang Hao, Shangyu Wu, Wang Yi, Mao Rui (2022) Tidal-tree-mem: Toward read-intensive key-value stores with tidal structure based on lsm-tree. IEEE Trans Comput Aided Des Integr Circuits Syst 42(2):423–436

    Article  Google Scholar 

  13. Li Yongkun, Tian Chengjin, Guo Fan, Li Cheng, Xu Yinlong (2019) Elasticbf: Elastic bloom filter with hotness awareness for boosting read performance in large key-value stores. In: 2019 USENIX Annual Technical Conference, pp 739–752

  14. Atikoglu Berk, Xu Yuehai, Frachtenberg Eitan, Jiang Song, Paleczny Mike (2012) Workload analysis of a large-scale key-value store. In: Proceedings of the 12th ACM SIGMETRICS/PERFORMANCE Joint International Conference on Measurement and Modeling of Computer Systems, pp 53–64

  15. Cao Zhichao, Dong Siying, Vemuri Sagar, Du David HC (2020) Characterizing, modeling, and benchmarking rocksdb key-value workloads at facebook. In: 18th USENIX Conference on File and Storage Technologies, pp 209–223

  16. Yang Juncheng, Yue Yao, Rashmi KV (2021) A large-scale analysis of hundreds of in-memory key-value cache clusters at twitter. ACM Trans Storage (TOS) 17(3):1–35

    Article  Google Scholar 

  17. Cooper Brian F, Silberstein Adam, Tam Erwin, Ramakrishnan Raghu, Sears Russell (2010) Benchmarking cloud serving systems with ycsb. In: Proceedings of the 1st ACM symposium on Cloud computing, pp 143–154

  18. Wikipedia. Bloom filter. https://en.wikipedia.org/wiki/Bloom_filter

  19. Chen Jiqiang, Chen Liang, Wang Sheng, Zhu Guoyun, Sun Yuanyuan, Liu Huan, Li Feifei (2020) Hotring: A hotspot-aware in-memory key-value store. In: 18th USENIX Conference on File and Storage Technologies (FAST 20), pp 239–252

  20. Appsflyer (2021) https://appsflyer.com

  21. Flurry analytics (2021) https://flurry.com

  22. Google firebase (2020) https://firebase.google.com

  23. Pugh William (1990) Skip lists: a probabilistic alternative to balanced trees. Commun ACM 33(6):668–676

    Article  Google Scholar 

  24. Pugh William (1998) A skip list cookbook. Technical report

  25. Wei Qian, Chen Zehao, Chen Xiaowei, Zhang Yuhao, Cai Xiaojun, Jia Zhiping, Shen Zhaoyan, Wang Y, Shao Zili, Li Bingzhe (2023) A semantic-integrated lsm-tree based key-value storage engine for blockchain systems. In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

  26. WiredTiger storage engine (2023) https://docs.mongodb.com/manual/core/wiredtiger/

  27. Dai Yifan, Xu Yien, Ganesan Aishwarya, Alagappan Ramnatthan, Kroth Brian, Arpaci-Dusseau Andrea, Arpaci-Dusseau Remzi (2020) From wisckey to bourbon: A learned index for log-structured merge trees. In: 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20), pp 155–171

  28. Li Fei, Lu Youyou, Yang Zhe, Shu Jiwu (2020) Sinekv: Decoupled secondary indexing for lsm-based key-value stores. In: 2020 IEEE 40th International Conference on Distributed Computing Systems, pp 1112–1122

  29. Luo Yuhan, Lin Mingwei, Pan Yubiao, Zeshui Xu (2022) Dual locality-based flash translation layer for nand flash-based consumer electronics. IEEE Trans Consum Electron 68(3):281–290

    Article  Google Scholar 

  30. Zhang Jianpeng, Lin Mingwei, Pan Yubiao, Zeshui Xu (2023) Crftl: Cache reallocation-based page-level flash translation layer for smartphones. IEEE Trans Consum Electron 69(3):671–679

    Article  Google Scholar 

  31. Yu JiaPing, Chen HuaHui, Qian JiangBo, Dong YiHong (2020) Ltg-lsm: The optimal structure in lsm-tree combined with reading hotness. In: 2020 IEEE 26th International Conference on Parallel and Distributed Systems, pp 1–8

  32. Zhong Wenshao, Chen Chen, Wu Xingbo, Jiang Song (2021) Remix: Efficient range query for lsm-trees. In: 19th USENIX Conference on File and Storage Technologies, pp 51–64

  33. Zhang Qiang, Li Yongkun, Lee Patrick PC, Xu Yinlong, Cui Qiu, Tang Liu (2020) Unikv: Toward high-performance and scalable kv storage in mixed workloads via unified indexing. In: 2020 IEEE 36th International Conference on Data Engineering, pp 313–324

  34. Wu Xingbo, Xu Yuehai, Shao Zili, Jiang Song (2015) Lsm-trie: An lsm-tree-basedultra-largekey-value store for small data items. In: 2015 USENIX Annual Technical Conference, pp 71–82

  35. Ahn Jung-Sang, Seo Chiyoung, Mayuram Ravi, Yaseen Rahim, Kim Jin-Soo, Maeng Seungryoul (2015) Forestdb: A fast key-value storage system for variable-length string keys. IEEE Trans Comput 65(3):902–915

    Article  MathSciNet  Google Scholar 

  36. Balmau Oana, Didona Diego, Guerraoui Rachid, Zwaenepoel Willy, Yuan Huapeng, Arora Aashray, Gupta Karan, Konka Pavan (2017) Triad: Creating synergies between memory, disk and log in log structured key-value stores. In: 2017 USENIX Annual Technical Conference, pp 363–375

  37. Levandoski Justin J, Larson Per-Åke, Stoica Radu (2013) Identifying hot and cold data in main-memory databases. In: 2013 IEEE 29th International Conference on Data Engineering, pp 26–37

  38. Yao Ting, Wan Jiguang, Huang Ping, He Xubin, Fei Wu, Xie Changsheng (2017) Building efficient key-value stores via a lightweight compaction tree. ACM Trans Storage 13(4):1–28

    Article  Google Scholar 

  39. Raju Pandian, Kadekodi Rohan, Chidambaram Vijay, Abraham Ittai (2017) Pebblesdb: Building key-value stores using fragmented log-structured merge trees. In: Proceedings of the 26th Symposium on Operating Systems Principles, pp 497–514

  40. Wu Fenggang, Yang Ming-Hong, Zhang Baoquan, Du David HC (2020) Ac-key: Adaptive caching for lsm-based key-value stores. In: 2020 USENIX Annual Technical Conference (USENIX ATC 20), pp 603–615

  41. Teng Dejun, Guo Lei, Lee Rubao, Chen Feng, Ma Siyuan, Zhang Yanfeng, Zhang Xiaodong (2017)Lsbm-tree: Re-enabling buffer caching in data management for mixed reads and writes. In: 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS), pp 68–79. IEEE

  42. Yang Lei, Hong Wu, Zhang Tieying, Cheng Xuntao, Li Feifei, Zou Lei, Wang Yujie, Chen Rongyao, Wang Jianying, Huang Gui (2020) Leaper: a learned prefetcher for cache invalidation in lsm-tree based storage engines. Proc VLDB Endowment 13(12):1976–1989

    Article  Google Scholar 

  43. Dayan Niv, Athanassoulis Manos, Idreos Stratos (2017) Monkey: Optimal navigable key-value store. In: Proceedings of the 2017 ACM International Conference on Management of Data, pp 79–94

Download references

Funding

This work was supported by Nature Science Foundation of Fujian Province under Grant No. 2021J01319, Fundamental Research Founds for the Central Universities of Huaqiao University under Grant No. ZQN-910, National Natural Science Foundation of China under Grant No. 61872086, Natural Science Foundation of Fujian Province of China under Grant No. 2022J06020, and the Young Top Talent of Young Eagle Program of Fujian Province, China under Grant No. F21E0011202B01.

Author information

Authors and Affiliations

Authors

Contributions

XL, YP, WF, HZ contributed to conceptualization, methodology, and investigation. XL done formal analysis, software, validation, and writing—original draft. YP and ML done supervision and resources. YP, WF, HZ, ML helped in writing—review and editing.

Corresponding authors

Correspondence to Yubiao Pan or Mingwei Lin.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Lin, X., Pan, Y., Feng, W. et al. MTDB: an LSM-tree-based key-value store using a multi-tree structure to improve read performance. J Supercomput 80, 23995–24025 (2024). https://doi.org/10.1007/s11227-024-06382-5

Download citation

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s11227-024-06382-5

Keywords

Navigation