LSM-based storage techniques: a survey

7192 Accesses
122 Citations
60 Altmetric
Explore all metrics

Abstract

Recently, the log-structured merge-tree (LSM-tree) has been widely adopted for use in the storage layer of modern NoSQL systems. Because of this, there have been a large number of research efforts, from both the database community and the operating systems community, that try to improve various aspects of LSM-trees. In this paper, we provide a survey of recent research efforts on LSM-trees so that readers can learn the state of the art in LSM-based storage techniques. We provide a general taxonomy to classify the literature of LSM-trees, survey the efforts in detail, and discuss their strengths and trade-offs. We further survey several representative LSM-based open-source NoSQL systems and discuss some potential future research directions resulting from the survey.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fig. 7

Fig. 16

Multi-core Adaptive Merging of the Secondary Index for LSM-Based Stores

Fault-tolerant precise data access on distributed log-structured merge-tree

Article 18 June 2019

Comparison and evaluation of state-of-the-art LSM merge policies

Article Open access 13 February 2021

Notes

Also referred to as runs in the literature.
Also referred to as compaction in the literature.
Even for an append-mostly workload, the total number of levels will grow extremely slowly since the maximum number of entries that an LSM-tree can store increases exponentially with a factor of T as the number of levels increases.
RocksDB supports a limited form of a partitioned tiering merge policy to bound the maximum size of each SSTable due to its internal restrictions. However, the disk space may still be doubled temporarily during large merges.
The original analysis presented in [20, 21] defines the space amplification to be the overall number of obsolete entries divided by the number of unique entries. We slightly modified the definition to ensure that the space amplification is no less than 1.
RocksDB [57] supports a limited form of fractional cascading by maintaining the set of overlapping SSTables at the adjacent next level for each SSTable. These pointers are used to narrow down the search range when locating specific SSTables during point lookups.
It is called the compaction filter in RocksDB since RocksDB prefers the term compaction to merge. We use the term merge here to minimize the potential for terminology confusion.

References

Absalyamov, I., et al.: Lightweight cardinality estimation in LSM-based systems. In: ACM SIGMOD, pp. 841–855 (2018)
Ahmad, M.Y., Kemme, B.: Compaction management in distributed key-value datastores. PVLDB 8(8), 850–861 (2015)
Google Scholar
Alsubaiee, S., et al.: AsterixDB: a scalable, open source BDMS. PVLDB 7(14), 1905–1916 (2014)
Google Scholar
Alsubaiee, S., et al.: Storage management in AsterixDB. PVLDB 7(10), 841–852 (2014)
Google Scholar
Alsubaiee, S., et al.: LSM-based storage and indexing: an old idea with timely benefits. In: International ACM SIGMOD Workshop on Managing and Mining Enriched Geo-spatial Data (GeoRich), pp 1–6 (2015)
Amur, H., et al.: Design of a write-optimized data store. Tech. rep, Georgia Institute of Technology (2013)
Athanassoulis, M., et al.: MaSM: efficient online updates in data warehouses. In: ACM SIGMOD, pp. 865–876. ACM (2011)
Athanassoulis, M., et al.: Designing access methods: the RUM conjecture. In: EDBT, vol. 2016, pp. 461–466 (2016)
Balmau, O., et al.: FloDB: unlocking memory in persistent key-value stores. In: European Conference on Computer Systems (EuroSys), pp. 80–94 (2017)
Balmau, O., et al.: TRIAD: creating synergies between memory, disk and log in log structured key-value stores. In: USENIX Annual Technical Conference (ATC), pp. 363–375 (2017)
Bender, M.A., et al.: Don’t thrash: how to cache your hash on flash. PVLDB 5(11), 1627–1637 (2012)
Google Scholar
Bloom, B.H.: Space/time trade-offs in hash coding with allowable errors. CACM 13(7), 422–426 (1970)
Article Google Scholar
Bortnikov, E., et al.: Accordion: better memory organization for LSM key-value stores. PVLDB 11(12), 1863–1875 (2018)
Google Scholar
Cassandra. http://cassandra.apache.org/
Chan, H.H.W., et al.: HashKV: enabling efficient updates in KV storage via hashing. In: USENIX Annual Technical Conference (ATC), pp. 1007–1019 (2018)
Chang, F., et al.: Bigtable: a distributed storage system for structured data. ACM TOCS 26(2), 4:1–4:26 (2008)
Article Google Scholar
Chazelle, B., Guibas, L.J.: Fractional cascading: I. A data structuring technique. Algorithmica 1(1), 133–162 (1986)
Article MathSciNet Google Scholar
Chen, G.J., et al.: Realtime data processing at Facebook. In: ACM SIGMOD, pp. 1087–1098 (2016)
Dragon: A distributed graph query engine. https://code.fb.com/data-infrastructure/dragon-a-distributed-graph-query-engine/
Dayan, N., Idreos, S.: Dostoevsky: Better space-time trade-offs for LSM-tree based key-value stores via adaptive removal of superfluous merging. In: ACM SIGMOD, pp. 505–520 (2018)
Dayan, N., et al.: Monkey: optimal navigable key-value store. In: ACM SIGMOD, pp. 79–94 (2017)
Dayan, N., et al.: Optimal Bloom filters and adaptive merging for LSM-trees. ACM TODS 43(4), 16:1–16:48 (2018)
Article MathSciNet Google Scholar
DeCandia, G., et al.: Dynamo: Amazon’y highly available key-value store. In: ACM SOSP, pp. 205–220 (2007)
Dong, S., et al.: Optimizing space amplification in RocksDB. In: CIDR, vol. 3, p. 3 (2017)
D’silva, J.V., et al.: Secondary indexing techniques for key-value stores: two rings to rule them all. In: International Workshop On Design, Optimization, Languages and Analytical Processing of Big Data (DOLAP) (2017)
Duan, H., et al.: Incremental materialized view maintenance on distributed log-structured merge-tree. In: DASFAA, pp. 682–700 (2018)
Fagin, R., et al.: Optimal aggregation algorithms for middleware. In: ACM PODS, pp. 102–113 (2001)
Fan, B., et al.: Cuckoo filter: practically better than bloom. In: International Conference on emerging Networking EXperiments and Technologies (CoNEXT), pp. 75–88 (2014)
Fang, Y., et al.: Spatial indexing in Microsoft SQL Server 2008. In: ACM SIGMOD, pp. 1207–1216 (2008)
Golan-Gueta, G., et al.: Scaling concurrent log-structured data stores. In: European Conference on Computer Systems (EuroSys), pp. 32:1–32:14 (2015)
Guttman, A.: R-trees: a dynamic index structure for spatial searching. In: ACM SIGMOD, pp. 47–57 (1984)
Article Google Scholar
HBase. https://hbase.apache.org/
Haerder, T., Reuter, A.: Principles of transaction-oriented database recovery. ACM CSUR 15(4), 287–317 (1983)
Article MathSciNet Google Scholar
Jagadish, H.V., et al.: Incremental organization for data recording and warehousing. In: VLDB, pp. 16–25 (1997)
Jermaine, C., et al.: The partitioned exponential file for database storage management. VLDBJ 16(4), 417–437 (2007)
Article Google Scholar
Kannan, S., et al.: Redesigning LSMs for nonvolatile memory with NoveLSM. In: USENIX Annual Technical Conference (ATC), pp. 993–1005 (2018)
Khodaei, A., et al.: Hybrid indexing and seamless ranking of spatial and textual features of web documents. In: Database and Expert Systems Applications (DEXA), pp. 450–466 (2010)
Google Scholar
Kim, T., et al.: Supporting similarity queries in Apache AsterixDB. In: EDBT, pp. 528–539 (2018)
Kim, Y., et al.: A comparative study of log-structured merge-tree-based spatial indexes for big data. In: ICDE, pp. 147–150 (2017)
LevelDB. http://leveldb.org/
Lawder, J.: The application of space-filling curves to the storage and retrieval of multi-dimensional data. Ph.D. thesis, PhD Thesis, University of London, UK (2000)
Li, Y., et al.: Tree indexing on solid state drives. PVLDB 3(1–2), 1195–1206 (2010)
Google Scholar
Lim, H., et al.: Towards accurate and fast evaluation of multi-stage log-structured designs. In: USENIX Conference on File and Storage Technologies (FAST), pp. 149–166 (2016)
Lu, L., et al.: WiscKey: separating keys from values in SSD-conscious storage. In: USENIX Conference on File and Storage Technologies (FAST), pp. 133–148 (2016)
Luo, C., Carey, M.J.: Efficient data ingestion and query processing for LSM-based storage systems. PVLDB 12(5), 531–543 (2019)
Google Scholar
MyRocks. http://myrocks.io/
Mathieu, C., et al.: Bigtable merge compaction. CoRR arXiv:1407.3008 (2014)
Mei, F., et al.: LSM-tree managed storage for large-scale key-value store. In: ACM SoCC, pp. 142–156 (2017)
Mei, F., et al.: SifrDB: a unified solution for write-optimized key-value stores in large datacenter. In: ACM SoCC, pp. 477–489 (2018)
Muth, P., et al.: The LHAM log-structured history data access method. VLDBJ 8(3), 199–221 (2000)
Article Google Scholar
O’Neil, P., et al.: The log-structured merge-tree (LSM-tree). Acta Inf. 33(4), 351–385 (1996)
Article Google Scholar
Pan, F.F., et al.: dCompaction: speeding up compaction of the LSM-tree via delayed compaction. J. Comput. Sci. Technol. 32(1), 41–54 (2017)
Article Google Scholar
Papagiannis, A., et al.: An efficient memory-mapped key-value store for flash storage. In: ACM SoCC, pp. 490–502 (2018)
Pugh, W.: Skip lists: a probabilistic alternative to balanced trees. CACM 33(6), 668–676 (1990)
Article Google Scholar
Putze, F., et al.: Cache-, hash-, and space-efficient bloom filters. J. Exp. Algorithmics 14, 4:4.4–4:4.18 (2010)
MathSciNet MATH Google Scholar
Qader, M.A., et al.: A comparative study of secondary indexing techniques in LSM-based NoSQL databases. In: ACM SIGMOD, pp. 551–566 (2018)
RocksDB. http://rocksdb.org/
Raju, P., et al.: PebblesDB: building key-value stores using fragmented log-structured merge trees. In: ACM SOSP, pp. 497–514 (2017)
Ren, K., et al.: SlimDB: a space-efficient key-value storage engine for semi-sorted data. PVLDB 10(13), 2037–2048 (2017)
Google Scholar
Rosenblum, M., Ousterhout, J.K.: The design and implementation of a log-structured file system. ACM TOCS 10(1), 26–52 (1992)
Article Google Scholar
Sears, R., Ramakrishnan, R.: bLSM: a general purpose log structured merge tree. In: ACM SIGMOD, pp. 217–228 (2012)
Seltzer, M.I.: File system performance and transaction support. Tech. rep., PhD Thesis, Department of Electrical Engineering and Computer Sciences, University of California Berkeley (1992)
Severance, D.G., Lohman, G.M.: Differential files: their application to the maintenance of large databases. ACM TODS 1(3), 256–267 (1976)
Article Google Scholar
Shetty, P.J., et al.: Building workload-independent storage with VT-trees. In: USENIX Conference on File and Storage Technologies (FAST), pp. 17–30 (2013)
Stonebraker, M.: The design of the Postgres storage system. In: VLDB, pp. 289–300 (1987)
Tan, W., et al.: Diff-index: differentiated index in distributed log-structured data stores. In: EDBT, pp. 700–711 (2014)
Tang, Y., et al.: Deferred lightweight indexing for log-structured key-value stores. In: International Symposium in Cluster, Cloud, and Grid Computing (CCGrid), pp. 11–20 (2015)
Teng, D., et al.: LSbM-tree: Re-enabling buffer caching in data management for mixed reads and writes. In: IEEE International Conference on Distributed Computing Systems (ICDCS), pp. 68–79 (2017)
Teng, D., et al.: A low-cost disk solution enabling LSM-tree to achieve high performance for mixed read/write workloads. ACM TOS 14(2), 15:1–15:26 (2018)
Google Scholar
Thonangi, R., Yang, J.: On log-structured merge for solid-state drives. In: ICDE, pp. 683–694 (2017)
Thonangi, R., et al.: A practical concurrent index for solid-state drives. In: ACM CIKM, pp. 1332–1341 (2012)
Turner, J.: New directions in communications (or which way to the information age?). IEEE Commun. Mag. 24(10), 8–15 (1986)
Article Google Scholar
Vinçon, T., et al.: NoFTL-KV: Tackling write-amplification on KV-stores with native storage management. In: EDBT, pp. 457–460 (2018)
Wang, P., et al.: An efficient design and implementation of LSM-tree based key-value store on open-channel SSD. In: European Conference on Computer Systems (EuroSys), pp. 16:1–16:14 (2014)
Wu, L., et al.: LSII: An indexing structure for exact real-time search on microblogs. In: ICDE, pp. 482–493 (2013)
Wu, X., et al.: LSM-trie: an LSM-tree-based ultra-large key-value store for small data. In: USENIX Annual Technical Conference (ATC), pp. 71–82 (2015)
Yao, A.C.C.: On random 2–3 trees. Acta Inf. 9(2), 159–170 (1978)
Article Google Scholar
Yao, T., et al.: Building efficient key-value stores via a lightweight compaction tree. ACM TOS 13(4), 29:1–29:28 (2017)
Google Scholar
Yao, T., et al.: A light-weight compaction tree to reduce I/O amplification toward efficient key-value stores. In: International Conference on Massive Storage Systems and Technology (MSST) (2017)
Yoon, H., et al.: Mutant: Balancing storage cost and latency in LSM-tree data stores. In: ACM SoCC, pp. 162–173 (2018)
Yue, Y., et al.: Building an efficient put-intensive key-value store with skip-tree. IEEE Trans. Parallel Distrib. Syst. 28(4), 961–973 (2017)
Article MathSciNet Google Scholar
Zhang, W., et al.: Improving write performance of LSMT-based key-value store. In: International Conference on Parallel and Distributed Systems (ICPADS), pp. 553–560 (2016)
Zhang, Y., et al.: ElasticBF: Fine-grained and elastic bloom filter towards efficient read for LSM-tree-based KV stores. In: USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage) (2018)
Zhang, Z., et al.: Pipelined compaction for the LSM-tree. In: IEEE International Parallel & Distributed Processing Symposium (IPDPS), pp. 777–786 (2014)
Zhu, Y., et al.: An efficient bulk loading approach of secondary index in distributed log-structured data stores. In: DASFAA, pp. 87–102 (2017)
Chapter Google Scholar

Download references

Acknowledgements

We would like to thank Mark Callaghan, Manos Athanassoulis, and the anonymous reviewers for their valuable comments and feedback. This work was supported by NSF awards CNS-1305430, IIS-1447720, and IIS-1838248 along with industrial support from Amazon, Google, and Microsoft and support from The Donald Bren Foundation (via a Bren Chair) of UC Irvine.

Author information

Authors and Affiliations

University of California, Irvine, Irvine, USA
Chen Luo & Michael J. Carey

Authors

Chen Luo
View author publications
You can also search for this author in PubMed Google Scholar
Michael J. Carey
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Chen Luo.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Luo, C., Carey, M.J. LSM-based storage techniques: a survey. The VLDB Journal 29, 393–418 (2020). https://doi.org/10.1007/s00778-019-00555-y

Download citation

Received: 20 December 2018
Revised: 17 April 2019
Accepted: 05 July 2019
Published: 19 July 2019
Issue Date: January 2020
DOI: https://doi.org/10.1007/s00778-019-00555-y

LSM-based storage techniques: a survey

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Multi-core Adaptive Merging of the Secondary Index for LSM-Based Stores

Fault-tolerant precise data access on distributed log-structured merge-tree

Comparison and evaluation of state-of-the-art LSM merge policies

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Keywords

Subscribe and save

Buy Now

Navigation

LSM-based storage techniques: a survey

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Multi-core Adaptive Merging of the Secondary Index for LSM-Based Stores

Fault-tolerant precise data access on distributed log-structured merge-tree

Comparison and evaluation of state-of-the-art LSM merge policies

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Publisher's Note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now

Search

Navigation