Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3423211.3425676acmconferencesArticle/Chapter ViewAbstractPublication PagesmiddlewareConference Proceedingsconference-collections
research-article

BoLT: Barrier-optimized LSM-Tree

Published: 11 December 2020 Publication History

Abstract

Key-value stores such as LevelDB and RocksDB are widely used in various systems due to their high write performance. However, the background compaction operations inherent to the key-value stores are often to blame for write amplification and write stall. In particular, the SSTable size in the existing key-value stores introduces, upon compactions, a tradeoff between the fsync() call frequency and the amount of amplified writes. Small SSTables require a larger number of fsync()/fdatasync() than large SSTables to maintain file consistency. On the contrary, large SSTables result in large overlaps and frequent rewrites of SSTables. In this paper, to reduce file consistency overhead without increasing key ranges of SSTables, we present a variant of LSM-tree, namely, BoLT (Barrier-optimized LSM-Tree), that minimizes the number of calls to fsync()/fdatasync() barriers while taking advantage of fine-grained SSTables. BoLT consists of four key elements: (i) compaction file, (ii) logical SSTables, (iii) group compaction, and (iv) settled compaction. We implement BoLT in LevelDB and HyperLevelDB and compare the performances against LevelDB, HyperLevelDB, RocksDB, and the state-of-the-art PebblesDB. Our experimental study shows that BoLT achieves significantly higher write throughputs than LevelDB and HyperLevelDB.

References

[1]
HBase. https://hbase.apache.org/.
[2]
HyperLevelDB Performance Benchmarks. http://hyperdex.org/performance/leveldb/.
[3]
Kyoto Cabinet: a straightforward implementation of DBM. http://fallabs.com/kyotocabinet/.
[4]
Leveldb. https://github.com/google/leveldb.
[5]
Oracle Berkeley DB. https://www.oracle.com/technetwork/database/database-technologies/berkeleydb/overview/index.html.
[6]
RocksDB. https://rocksdb.org/.
[7]
Muhammad Yousuf Ahmad and Bettina Kemme. Compaction management in distributed key-value datastores. Proc. VLDB Endow., 8(8):850--861, April 2015.
[8]
Anirudh Badam, KyoungSoo Park, Vivek S. Pai, and Larry L. Peterson. HashCache: Cache storage for the next billion. In Proceedings of the 6th USENIX Symposium on Networked Systems Design and Implementation, NSDI'09, pages 123--136, 2009.
[9]
Oana Balmau, Diego Didona, Rachid Guerraoui, Willy Zwaenepoel, Huapeng Yuan, Aashray Arora, Karan Gupta, and Pavan Konka. TRIAD: Creating Synergies Between Memory, Disk and Log in Log Structured Key-Value Stores. In Proceedings of the 2017 USENIX Conference on Usenix Annual Technical Conference (USENIX ATC), 2017.
[10]
Oana Balmau, Florin Dinu, Willy Zwaenepoel, Karan Gupta, Ravishankar Chandhiramoorthi, and Diego Didona. SILK: Preventing latency spikes in log-structured merge key-value stores. In 2019 USENIX Annual Technical Conference (USENIX ATC), pages 753--766, July 2019.
[11]
Oana Balmau, Rachid Guerraoui, Vasileios Trigonakis, and Igor Zablotchi. FloDB: Unlocking Memory in Persistent Key-Value Stores. In Proceedings of the 12th European Conference on Computer Systems (EuroSys), 2017.
[12]
Helen H. W. Chan, Yongkun Li, Patrick P. C. Lee, and Yinlong Xu. HashKV: Enabling Efficient Updates in KV Storage via Hashing. In Proceedings of the 2018 USENIX Conference on Usenix Annual Technical Conference (USENIX ATC), 2018.
[13]
Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber. Bigtable: A Distributed Storage System for Structured Data. In Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2006.
[14]
Brian F. Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, and Russell Sears. Benchmarking Cloud Serving Systems with YCSB. In Proceedings of the 1st ACM Symposium on Cloud Computing (SoCC), 2010.
[15]
Niv Dayan, Manos Athanassoulis, and Stratos Idreos. Optimal bloom filters and adaptive merging for lsm-trees. ACM Trans. Database Syst., 43(4), December 2018.
[16]
Biplob Debnath, Sudipta Sengupta, and Jin Li. Skimpystash: Ram space skimpy key-value store on flash-based storage. In Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data, SIGMOD '11, pages 25--36. ACM, 2011.
[17]
Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels. Dynamo: Amazon's Highly Available Key-value Store. In Proceedings of the 21th ACM SIGOPS Symposium on Operating Systems Principles (SOSP).
[18]
Paul Dix. Benchmarking LevelDB vs. RocksDB vs. HyperLevelDB vs. LMDB Performance for InfluxDB. https://www.influxdata.com/blog/benchmarking-leveldb-vs-rocksdb-vs-hyperleveldb-vs-lmdb-performance-for-influxdb/.
[19]
Guy Golan-Gueta, Edward Bortnikov, Eshcar Hillel, and Idit Keidar. Scaling concurrent log-structured data stores. In Proceedings of the Tenth European Conference on Computer Systems (EuroSys), page 32. ACM, 2015.
[20]
Jun He, Sudarsun Kannan, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. The unwritten contract of solid state drives. In Proceedings of the Twelfth European Conference on Computer Systems (EuroSys), pages 127--144. ACM, 2017.
[21]
Olzhas Kaiyrakhmet, Songyi Lee, Beomseok Nam, Sam H. Noh, and Young ri Choi. SLM-DB: Single-level key-value store with persistent memory. In Proceedings of the 17th Usenix Conference on File and Storage Technologies (FAST). USENIX Association, 2019.
[22]
Sudarsun Kannan, Nitish Bhat, Ada Gavrilovska, Andrea Arpaci-Dusseau, and Remzi Arpaci-Dusseau. Redesigning LSMs for Nonvolatile Memory with NoveLSM. In Proceedings of the 2018 USENIX Conference on Usenix Annual Technical Conference (USENIX ATC), 2018.
[23]
Avinash Lakshman and Prashant Malik. Cassandra: A Decentralized Structured Storage System. ACM SIGOPS Operating Systems Review, 44(2):35--40, April 2010.
[24]
Hyeontaek Lim, Bin Fan, David G. Andersen, and Michael Kaminsky. SILT: A memory-efficient, high-performance key-value store. In Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles (SOSP), pages 1--13. ACM, 2011.
[25]
Lanyue Lu, Thanumalayan Sankaranarayana Pillai, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. WiscKey: Separating Keys from Values in SSD-conscious Storage. In Proceedings of the 14th Usenix Conference on File and Storage Technologies (FAST), 2016.
[26]
Leonardo Marmol, Swaminathan Sundararaman, Nisha Talagala, and Raju Rangaswami. NVMKV: A Scalable, Lightweight, FTL-aware Key-Value Store. In Proceedings of the 2015 USENIX Conference on Usenix Annual Technical Conference (USENIX ATC), 2015.
[27]
Fei Mei, Qiang Cao, Hong Jiang, and Lei Tian Tintri. LSM-tree Managed Storage for Large-scale Key-value Store. In Proceedings of the 2017 Symposium on Cloud Computing (SoCC), 2017.
[28]
Suman Nath and Aman Kansal. FlashDB: Dynamic self-tuning database for nand flash. In Proceedings of the 6th International Conference on Information Processing in Sensor Networks, IPSN '07, pages 410--419. ACM, 2007.
[29]
Patrick O'Neil, Edward Cheng, Dieter Gawlick, and Elizabeth O'Neil. The Log-structured Merge-tree (LSM-tree). Acta Informatica, 33(4):351--385, June 1996.
[30]
Pandian Raju, Rohan Kadekodi, Vijay Chidambaram, and Ittai Abraham. PebblesDB: Building Key-Value Stores Using Fragmented Log-Structured Merge Trees. In Proceedings of the 26th Symposium on Operating Systems Principles (SOSP), 2017.
[31]
Kai Ren, Qing Zheng, Joy Arulraj, and Garth Gibson. Slimdb: A space-efficient key-value storage engine for semi-sorted data. Proc. VLDB Endow., 10(13):2037--2048, September 2017.
[32]
Russell Sears and Raghu Ramakrishnan. bLSM: A General Purpose Log Structured Merge Tree. In Proceedings of the 2012 ACM SIGMOD International Conference on Management of Data (SIGMOD), 2012.
[33]
Pradeep J. Shetty, Richard P. Spillane, Ravikant R. Malpani, Binesh Andrews, Justin Seyster, and Erez Zadok. Building Workload-Independent Storage with VT-Trees. In Proceedings of the 11th USENIX Conference on File and Storage Technologies (FAST), 2013.
[34]
X. Sun, J. Yu, Z. Zhou, and C. J. Xue. Fpga-based compaction engine for accelerating lsm-tree key-value stores. In 2020 IEEE 36th International Conference on Data Engineering (ICDE), pages 1261--1272, 2020.
[35]
Youjip Won, Jaemin Jung, Gyeongyeol Choi, Joontaek Oh, Seongbae Son, Jooyoung Hwang, and Sangyeun Cho. Barrier-enabled IO stack for flash storage. In 16th USENIX Conference on File and Storage Technologies (FAST 18), pages 211--226, Oakland, CA, February 2018. USENIX Association.
[36]
Xingbo Wu, Yuehai Xu, Zili Shao, and Song Jiang. LSM-trie: An LSM-tree-based Ultra-large Key-value Store for Small Data. In Proceedings of the 2015 USENIX Conference on Usenix Annual Technical Conference (USENIX ATC), 2015.
[37]
Demetrios Zeinalipour-Yazti, Song Lin, Vana Kalogeraki, Dimitrios Gunopulos, and Walid A. Najjar. Microhash: An efficient index structure for flash-based sensor devices. In Proceedings of the 4th Conference on USENIX Conference on File and Storage Technologies (FAST), 2005.

Cited By

View all
  • (2024)Why Files If You Have a DBMS?2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00297(3878-3892)Online publication date: 13-May-2024
  • (2023)FLIXR: Embedding Index Into Flash Translation Layer in SSDsIEEE Transactions on Computers10.1109/TC.2022.315460272:1(250-263)Online publication date: 1-Jan-2023

Index Terms

  1. BoLT: Barrier-optimized LSM-Tree

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    Middleware '20: Proceedings of the 21st International Middleware Conference
    December 2020
    455 pages
    ISBN:9781450381536
    DOI:10.1145/3423211
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 11 December 2020

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. Key-Value Stores
    2. Log-Structured Merge Tree

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    Middleware '20
    Sponsor:
    Middleware '20: 21st International Middleware Conference
    December 7 - 11, 2020
    Delft, Netherlands

    Acceptance Rates

    Overall Acceptance Rate 203 of 948 submissions, 21%

    Upcoming Conference

    MIDDLEWARE '24
    25th International Middleware Conference
    December 2 - 6, 2024
    Hong Kong , Hong Kong

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)62
    • Downloads (Last 6 weeks)5
    Reflects downloads up to 18 Nov 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)Why Files If You Have a DBMS?2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00297(3878-3892)Online publication date: 13-May-2024
    • (2023)FLIXR: Embedding Index Into Flash Translation Layer in SSDsIEEE Transactions on Computers10.1109/TC.2022.315460272:1(250-263)Online publication date: 1-Jan-2023

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media