LSM-based storage techniques: a survey

  • Special Issue Paper
  • Published in: The VLDB Journal

Abstract

Recently, the log-structured merge-tree (LSM-tree) has been widely adopted for use in the storage layer of modern NoSQL systems. Because of this, there have been a large number of research efforts, from both the database community and the operating systems community, that try to improve various aspects of LSM-trees. In this paper, we provide a survey of recent research efforts on LSM-trees so that readers can learn the state of the art in LSM-based storage techniques. We provide a general taxonomy to classify the literature of LSM-trees, survey the efforts in detail, and discuss their strengths and trade-offs. We further survey several representative LSM-based open-source NoSQL systems and discuss some potential future research directions resulting from the survey.


Notes

  1. Also referred to as runs in the literature.

  2. Also referred to as compaction in the literature.

  3. Even for an append-mostly workload, the total number of levels will grow extremely slowly, since the maximum number of entries that an LSM-tree can store grows exponentially, by a factor of T, with each added level (see the worked calculation following these notes).

  4. Due to its internal restrictions, RocksDB supports only a limited form of the partitioned tiering merge policy, which it uses to bound the maximum size of each SSTable. However, disk space may still temporarily double during large merges.

  5. The original analysis presented in [20, 21] defines the space amplification as the overall number of obsolete entries divided by the number of unique entries. We slightly modified the definition to ensure that the space amplification is no less than 1 (a toy example follows these notes).

  6. RocksDB [57] supports a limited form of fractional cascading by maintaining, for each SSTable, the set of overlapping SSTables at the adjacent next level. These pointers are used to narrow down the search range when locating specific SSTables during point lookups (a simplified sketch follows these notes).

  7. It is called the compaction filter in RocksDB since RocksDB prefers the term compaction to merge. We use the term merge here to minimize the potential for terminology confusion.
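
To make note 3 concrete, here is a short worked calculation ($T$ is the size ratio from note 3; $C_0$, the capacity of the first level, and $N$, the number of stored entries, are notation of our own choosing, not fixed by the paper):

$$ C_{\text{total}}(L) \;=\; \sum_{i=0}^{L-1} C_0\,T^{i} \;=\; C_0\,\frac{T^{L}-1}{T-1}, \qquad\text{so}\qquad L \;=\; \left\lceil \log_T\!\left(1+\frac{N\,(T-1)}{C_0}\right) \right\rceil. $$

With $C_0 = 10^6$ entries and $T = 10$, for example, four levels already hold over $10^9$ entries, and growing $N$ tenfold adds only one more level.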
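
As a minimal illustration of note 5 (our own toy code, not taken from the paper), the following contrasts the modified definition with the obsolete-over-unique ratio of [20, 21]:

    # Two definitions of space amplification for an LSM-tree.
    # Modified (note 5): total stored entries (live + obsolete) divided by
    # unique entries; always >= 1. Original [20, 21]: obsolete divided by
    # unique entries; can fall below 1.

    def space_amp_modified(total_entries: int, unique_entries: int) -> float:
        return total_entries / unique_entries

    def space_amp_original(total_entries: int, unique_entries: int) -> float:
        obsolete = total_entries - unique_entries
        return obsolete / unique_entries

    # A tree storing 15 million entries, 10 million of them unique:
    print(space_amp_modified(15_000_000, 10_000_000))  # 1.5
    print(space_amp_original(15_000_000, 10_000_000))  # 0.5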
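
The pointer scheme of note 6 can be sketched as follows (our own simplified illustration in Python; the class and function names are invented and are not RocksDB's actual data structures). Each SSTable caches the next-level SSTables whose key ranges overlap its own, so a point lookup walks these pointers instead of binary-searching every level's full SSTable list:

    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class SSTable:
        min_key: str
        max_key: str
        # Pointers to the overlapping SSTables at the adjacent next level,
        # refreshed whenever a merge installs new SSTables.
        next_level_overlaps: List["SSTable"] = field(default_factory=list)

        def covers(self, key: str) -> bool:
            return self.min_key <= key <= self.max_key

    def tables_to_search(top_level: List[SSTable], key: str) -> List[SSTable]:
        """Collect one candidate SSTable per level for a point lookup,
        following overlap pointers downward rather than re-searching
        each level's complete SSTable list."""
        result: List[SSTable] = []
        candidates = top_level  # only the topmost level is scanned in full
        while candidates:
            table: Optional[SSTable] = next(
                (t for t in candidates if t.covers(key)), None)
            if table is None:
                break  # a real store would fall back to searching the level
            result.append(table)
            candidates = table.next_level_overlaps  # the narrowing step
        return result

Each returned SSTable would then be probed in order (Bloom filter first, then the table itself); the pointers only bound which SSTables must be considered at each level.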

References

  1. Absalyamov, I., et al.: Lightweight cardinality estimation in LSM-based systems. In: ACM SIGMOD, pp. 841–855 (2018)

  2. Ahmad, M.Y., Kemme, B.: Compaction management in distributed key-value datastores. PVLDB 8(8), 850–861 (2015)

  3. Alsubaiee, S., et al.: AsterixDB: a scalable, open source BDMS. PVLDB 7(14), 1905–1916 (2014)

  4. Alsubaiee, S., et al.: Storage management in AsterixDB. PVLDB 7(10), 841–852 (2014)

  5. Alsubaiee, S., et al.: LSM-based storage and indexing: an old idea with timely benefits. In: International ACM SIGMOD Workshop on Managing and Mining Enriched Geo-spatial Data (GeoRich), pp. 1–6 (2015)

  6. Amur, H., et al.: Design of a write-optimized data store. Tech. rep., Georgia Institute of Technology (2013)

  7. Athanassoulis, M., et al.: MaSM: efficient online updates in data warehouses. In: ACM SIGMOD, pp. 865–876. ACM (2011)

  8. Athanassoulis, M., et al.: Designing access methods: the RUM conjecture. In: EDBT, pp. 461–466 (2016)

  9. Balmau, O., et al.: FloDB: unlocking memory in persistent key-value stores. In: European Conference on Computer Systems (EuroSys), pp. 80–94 (2017)

  10. Balmau, O., et al.: TRIAD: creating synergies between memory, disk and log in log structured key-value stores. In: USENIX Annual Technical Conference (ATC), pp. 363–375 (2017)

  11. Bender, M.A., et al.: Don’t thrash: how to cache your hash on flash. PVLDB 5(11), 1627–1637 (2012)

  12. Bloom, B.H.: Space/time trade-offs in hash coding with allowable errors. CACM 13(7), 422–426 (1970)

  13. Bortnikov, E., et al.: Accordion: better memory organization for LSM key-value stores. PVLDB 11(12), 1863–1875 (2018)

  14. Cassandra. http://cassandra.apache.org/

  15. Chan, H.H.W., et al.: HashKV: enabling efficient updates in KV storage via hashing. In: USENIX Annual Technical Conference (ATC), pp. 1007–1019 (2018)

  16. Chang, F., et al.: Bigtable: a distributed storage system for structured data. ACM TOCS 26(2), 4:1–4:26 (2008)

  17. Chazelle, B., Guibas, L.J.: Fractional cascading: I. A data structuring technique. Algorithmica 1(1), 133–162 (1986)

  18. Chen, G.J., et al.: Realtime data processing at Facebook. In: ACM SIGMOD, pp. 1087–1098 (2016)

  19. Dragon: A distributed graph query engine. https://code.fb.com/data-infrastructure/dragon-a-distributed-graph-query-engine/

  20. Dayan, N., Idreos, S.: Dostoevsky: better space-time trade-offs for LSM-tree-based key-value stores via adaptive removal of superfluous merging. In: ACM SIGMOD, pp. 505–520 (2018)

  21. Dayan, N., et al.: Monkey: optimal navigable key-value store. In: ACM SIGMOD, pp. 79–94 (2017)

  22. Dayan, N., et al.: Optimal Bloom filters and adaptive merging for LSM-trees. ACM TODS 43(4), 16:1–16:48 (2018)

  23. DeCandia, G., et al.: Dynamo: Amazon's highly available key-value store. In: ACM SOSP, pp. 205–220 (2007)

  24. Dong, S., et al.: Optimizing space amplification in RocksDB. In: CIDR (2017)

  25. D’silva, J.V., et al.: Secondary indexing techniques for key-value stores: two rings to rule them all. In: International Workshop On Design, Optimization, Languages and Analytical Processing of Big Data (DOLAP) (2017)

  26. Duan, H., et al.: Incremental materialized view maintenance on distributed log-structured merge-tree. In: DASFAA, pp. 682–700 (2018)

  27. Fagin, R., et al.: Optimal aggregation algorithms for middleware. In: ACM PODS, pp. 102–113 (2001)

  28. Fan, B., et al.: Cuckoo filter: practically better than bloom. In: International Conference on emerging Networking EXperiments and Technologies (CoNEXT), pp. 75–88 (2014)

  29. Fang, Y., et al.: Spatial indexing in Microsoft SQL Server 2008. In: ACM SIGMOD, pp. 1207–1216 (2008)

  30. Golan-Gueta, G., et al.: Scaling concurrent log-structured data stores. In: European Conference on Computer Systems (EuroSys), pp. 32:1–32:14 (2015)

  31. Guttman, A.: R-trees: a dynamic index structure for spatial searching. In: ACM SIGMOD, pp. 47–57 (1984)

  32. HBase. https://hbase.apache.org/

  33. Haerder, T., Reuter, A.: Principles of transaction-oriented database recovery. ACM CSUR 15(4), 287–317 (1983)

  34. Jagadish, H.V., et al.: Incremental organization for data recording and warehousing. In: VLDB, pp. 16–25 (1997)

  35. Jermaine, C., et al.: The partitioned exponential file for database storage management. VLDBJ 16(4), 417–437 (2007)

  36. Kannan, S., et al.: Redesigning LSMs for nonvolatile memory with NoveLSM. In: USENIX Annual Technical Conference (ATC), pp. 993–1005 (2018)

  37. Khodaei, A., et al.: Hybrid indexing and seamless ranking of spatial and textual features of web documents. In: Database and Expert Systems Applications (DEXA), pp. 450–466 (2010)

  38. Kim, T., et al.: Supporting similarity queries in Apache AsterixDB. In: EDBT, pp. 528–539 (2018)

  39. Kim, Y., et al.: A comparative study of log-structured merge-tree-based spatial indexes for big data. In: ICDE, pp. 147–150 (2017)

  40. LevelDB. http://leveldb.org/

  41. Lawder, J.: The application of space-filling curves to the storage and retrieval of multi-dimensional data. Ph.D. thesis, University of London, UK (2000)

  42. Li, Y., et al.: Tree indexing on solid state drives. PVLDB 3(1–2), 1195–1206 (2010)

  43. Lim, H., et al.: Towards accurate and fast evaluation of multi-stage log-structured designs. In: USENIX Conference on File and Storage Technologies (FAST), pp. 149–166 (2016)

  44. Lu, L., et al.: WiscKey: separating keys from values in SSD-conscious storage. In: USENIX Conference on File and Storage Technologies (FAST), pp. 133–148 (2016)

  45. Luo, C., Carey, M.J.: Efficient data ingestion and query processing for LSM-based storage systems. PVLDB 12(5), 531–543 (2019)

  46. MyRocks. http://myrocks.io/

  47. Mathieu, C., et al.: Bigtable merge compaction. CoRR arXiv:1407.3008 (2014)

  48. Mei, F., et al.: LSM-tree managed storage for large-scale key-value store. In: ACM SoCC, pp. 142–156 (2017)

  49. Mei, F., et al.: SifrDB: a unified solution for write-optimized key-value stores in large datacenter. In: ACM SoCC, pp. 477–489 (2018)

  50. Muth, P., et al.: The LHAM log-structured history data access method. VLDBJ 8(3), 199–221 (2000)

  51. O’Neil, P., et al.: The log-structured merge-tree (LSM-tree). Acta Inf. 33(4), 351–385 (1996)

  52. Pan, F.F., et al.: dCompaction: speeding up compaction of the LSM-tree via delayed compaction. J. Comput. Sci. Technol. 32(1), 41–54 (2017)

  53. Papagiannis, A., et al.: An efficient memory-mapped key-value store for flash storage. In: ACM SoCC, pp. 490–502 (2018)

  54. Pugh, W.: Skip lists: a probabilistic alternative to balanced trees. CACM 33(6), 668–676 (1990)

  55. Putze, F., et al.: Cache-, hash-, and space-efficient bloom filters. J. Exp. Algorithmics 14, 4:4.4–4:4.18 (2010)

  56. Qader, M.A., et al.: A comparative study of secondary indexing techniques in LSM-based NoSQL databases. In: ACM SIGMOD, pp. 551–566 (2018)

  57. RocksDB. http://rocksdb.org/

  58. Raju, P., et al.: PebblesDB: building key-value stores using fragmented log-structured merge trees. In: ACM SOSP, pp. 497–514 (2017)

  59. Ren, K., et al.: SlimDB: a space-efficient key-value storage engine for semi-sorted data. PVLDB 10(13), 2037–2048 (2017)

  60. Rosenblum, M., Ousterhout, J.K.: The design and implementation of a log-structured file system. ACM TOCS 10(1), 26–52 (1992)

  61. Sears, R., Ramakrishnan, R.: bLSM: a general purpose log structured merge tree. In: ACM SIGMOD, pp. 217–228 (2012)

  62. Seltzer, M.I.: File system performance and transaction support. Ph.D. thesis, Department of Electrical Engineering and Computer Sciences, University of California, Berkeley (1992)

  63. Severance, D.G., Lohman, G.M.: Differential files: their application to the maintenance of large databases. ACM TODS 1(3), 256–267 (1976)

  64. Shetty, P.J., et al.: Building workload-independent storage with VT-trees. In: USENIX Conference on File and Storage Technologies (FAST), pp. 17–30 (2013)

  65. Stonebraker, M.: The design of the Postgres storage system. In: VLDB, pp. 289–300 (1987)

  66. Tan, W., et al.: Diff-index: differentiated index in distributed log-structured data stores. In: EDBT, pp. 700–711 (2014)

  67. Tang, Y., et al.: Deferred lightweight indexing for log-structured key-value stores. In: International Symposium in Cluster, Cloud, and Grid Computing (CCGrid), pp. 11–20 (2015)

  68. Teng, D., et al.: LSbM-tree: re-enabling buffer caching in data management for mixed reads and writes. In: IEEE International Conference on Distributed Computing Systems (ICDCS), pp. 68–79 (2017)

  69. Teng, D., et al.: A low-cost disk solution enabling LSM-tree to achieve high performance for mixed read/write workloads. ACM TOS 14(2), 15:1–15:26 (2018)

  70. Thonangi, R., Yang, J.: On log-structured merge for solid-state drives. In: ICDE, pp. 683–694 (2017)

  71. Thonangi, R., et al.: A practical concurrent index for solid-state drives. In: ACM CIKM, pp. 1332–1341 (2012)

  72. Turner, J.: New directions in communications (or which way to the information age?). IEEE Commun. Mag. 24(10), 8–15 (1986)

  73. Vinçon, T., et al.: NoFTL-KV: tackling write-amplification on KV-stores with native storage management. In: EDBT, pp. 457–460 (2018)

  74. Wang, P., et al.: An efficient design and implementation of LSM-tree based key-value store on open-channel SSD. In: European Conference on Computer Systems (EuroSys), pp. 16:1–16:14 (2014)

  75. Wu, L., et al.: LSII: an indexing structure for exact real-time search on microblogs. In: ICDE, pp. 482–493 (2013)

  76. Wu, X., et al.: LSM-trie: an LSM-tree-based ultra-large key-value store for small data. In: USENIX Annual Technical Conference (ATC), pp. 71–82 (2015)

  77. Yao, A.C.C.: On random 2–3 trees. Acta Inf. 9(2), 159–170 (1978)

  78. Yao, T., et al.: Building efficient key-value stores via a lightweight compaction tree. ACM TOS 13(4), 29:1–29:28 (2017)

  79. Yao, T., et al.: A light-weight compaction tree to reduce I/O amplification toward efficient key-value stores. In: International Conference on Massive Storage Systems and Technology (MSST) (2017)

  80. Yoon, H., et al.: Mutant: balancing storage cost and latency in LSM-tree data stores. In: ACM SoCC, pp. 162–173 (2018)

  81. Yue, Y., et al.: Building an efficient put-intensive key-value store with skip-tree. IEEE Trans. Parallel Distrib. Syst. 28(4), 961–973 (2017)

  82. Zhang, W., et al.: Improving write performance of LSMT-based key-value store. In: International Conference on Parallel and Distributed Systems (ICPADS), pp. 553–560 (2016)

  83. Zhang, Y., et al.: ElasticBF: fine-grained and elastic bloom filter towards efficient read for LSM-tree-based KV stores. In: USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage) (2018)

  84. Zhang, Z., et al.: Pipelined compaction for the LSM-tree. In: IEEE International Parallel & Distributed Processing Symposium (IPDPS), pp. 777–786 (2014)

  85. Zhu, Y., et al.: An efficient bulk loading approach of secondary index in distributed log-structured data stores. In: DASFAA, pp. 87–102 (2017)

Acknowledgements

We would like to thank Mark Callaghan, Manos Athanassoulis, and the anonymous reviewers for their valuable comments and feedback. This work was supported by NSF awards CNS-1305430, IIS-1447720, and IIS-1838248 along with industrial support from Amazon, Google, and Microsoft and support from The Donald Bren Foundation (via a Bren Chair) of UC Irvine.

Author information

Corresponding author

Correspondence to Chen Luo.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Luo, C., Carey, M.J. LSM-based storage techniques: a survey. The VLDB Journal 29, 393–418 (2020). https://doi.org/10.1007/s00778-019-00555-y
