Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/3459637.3481913acmconferencesArticle/Chapter ViewAbstractPublication PagescikmConference Proceedingsconference-collections
research-article

Elastic and Stable Compaction for LSM-tree: A FaaS-Based Approach on TerarkDB

Published: 30 October 2021 Publication History

Abstract

LSM-tree is widely used as a write-optimized storage engine in many NoSQL systems. However, the periodical compaction operations in LSM-tree cost many I/O bandwidths and CPU resources of the local server, resulting in throughput drops of the system. To address this issue, this paper proposes a new compaction scheme based on the FaaS (Functions as a Service) architecture, which is called FaaS Compaction. It utilizes the elastic computing capability of FaaS and always pushes compactions to a FaaS cluster. The FaaS cluster will perform actual compaction operations, which will not affect the processing of the local server. Therefore, we can maintain stable performance even when periodical compactions are triggered. We also present a Parallel Slight Compaction method to solve the timeout problem caused by heavy compactions. We implement the FaaS Compaction based on TerarkDB and a real FaaS cluster and experimentally compare the FaaS Compaction with the RocksDB's local compaction scheme and the state-of-the-art offloading compaction policy. The results suggest the efficiency, stability, and elasticity of our proposal.

References

[1]
Muhammad Yousuf Ahmad and Bettina Kemme. 2015. Compaction management in distributed key-value datastores. Proceedings of the VLDB Endowment (PVLDB), Vol. 8, 8 (2015), 850--861.
[2]
Alibaba. 2021. Oceanbase. https://oceanbase.alipay.com Retrieved January 21, 2021 from
[3]
Amazon. 2021 a. AWS Lambda. https://aws.amazon.com Retrieved January 21, 2021 from
[4]
Amazon. 2021 b. What Is a Cold Start? https://mikhail.io/serverless/coldstarts/define Retrieved January 21, 2021 from
[5]
Apache. 2021 a. Cassandra. https://cassandra.apache.org/ Retrieved January 21, 2021 from
[6]
Apache. 2021 b. HBase. https://hbase.apache.org/ Retrieved January 21, 2021 from https://hbase.apache.org/
[7]
Oana Balmau, Diego Didona, Rachid Guerraoui, Willy Zwaenepoel, Huapeng Yuan, Aashray Arora, Karan Gupta, and Pavan Konka. 2017. TRIAD: Creating Synergies Between Memory, Disk and Log in Log Structured Key-Value Stores. In Proceedings of the 2017 USENIX Annual Technical Conference (ATC). 363--375.
[8]
Oana Balmau, Florin Dinu, Willy Zwaenepoel, Karan Gupta, Ravishankar Chandhiramoorthi, and Diego Didona. 2019. SILK: Preventing Latency Spikes in Log-Structured Merge Key-Value Stores. In Proceedings of the 2019 USENIX Annual Technical Conference (ATC). 753--766.
[9]
Oana Balmau, Florin Dinu, Willy Zwaenepoel, Karan Gupta, and Diego Didona. 2020. SILK Preventing Latency Spikes in Log-Structured Merge Key-Value Stores Running Heterogeneous Workloads. ACM Transactions on Computer Systems (ToCS), Vol. 36, 4 (2020), 1--27.
[10]
Laurent Bindschaedler, Ashvin Goel, and Willy Zwaenepoel. 2020. Hailstorm: Disaggregated Compute and Storage for Distributed LSM-based Databases. In Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS). 301--316.
[11]
ByteDance. 2021. TerarkDB. https://github.com/bytedance/terarkdb Retrieved January 21, 2021 from
[12]
Yunpeng Chai, Yanfeng Chai, Xin Wang, Haocheng Wei, Ning Bao, and Yushi Liang. 2019. LDC: a lower-level driven compaction method to optimize SSD-oriented key-value stores. In Proceedings of the 2019 IEEE 35th International Conference on Data Engineering (ICDE). IEEE, 722--733.
[13]
Helen H. W. Chan, Yongkun Li, Patrick P. C. Lee, and Yinlong Xu. 2018. HashKV: Enabling Efficient Updates in KV Storage via Hashing. In Proceedings of the 2018 USENIX Annual Technical Conference (ATC). 1007--1019.
[14]
Lidong Chen, Yinliang Yue, Haobo Wang, and Jianhua Wu. 2018. A Priority and Fairness Mixed Compaction Scheduling Mechanism for LSM-tree Based KV-Stores. In Proceedings of the International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP). Springer, 89--105.
[15]
Brian F. Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, and Russell Sears. 2010. Benchmarking Cloud Serving Systems with YCSB. In Proceedings of the 1st ACM Symposium on Cloud Computing (SoCC). ACM, 143--54.
[16]
Niv Dayan, Manos Athanassoulis, and Stratos Idreos. 2017. Monkey: Optimal navigable key-value store. In Proceedings of the 2017 ACM International Conference on Management of Data (SIGMOD). ACM, 79--94.
[17]
Niv Dayan and Stratos Idreos. 2018. Dostoevsky: Better space-time trade-offs for LSM-tree based key-value stores via adaptive removal of superfluous merging. In Proceedings of the 2018 International Conference on Management of Data (SIGMOD). ACM, 505--520.
[18]
Niv Dayan and Stratos Idreos. 2019. The log-structured merge-bush & the wacky continuum. In Proceedings of the 2019 International Conference on Management of Data (SIGMOD). 449--466.
[19]
FACEBOOK. 2021 a. RocksuppercaseDB: A Persistent Key-value Store for Fast Storage Environments. https://rocksdb.org/ Retrieved January 21, 2021 from
[20]
FACEBOOK. 2021 b. RocksuppercaseDB: Leveled Compaction. https://github.com/facebook/rocksdb/wiki/Leveled-Compaction Retrieved January 21, 2021 from
[21]
FACEBOOK. 2021 c. RocksuppercaseDB: Rate Limiter. https://github.com/facebook/rocksdb/wiki/Rate-Limiter Retrieved January 21, 2021 from
[22]
FACEBOOK. 2021 d. Write Stalls. https://github.com/facebook/rocksdb/wiki/Write-Stalls Retrieved January 21, 2021 from
[23]
Google. 2021 a. Google Cloud. https://cloud.google.com Retrieved January 21, 2021 from
[24]
Google. 2021 b. LevelDB. https://github.com/google/leveldb Retrieved January 21, 2021 from
[25]
IBM. 2021. IBM Cloud. https://cloud.ibm.com/ Retrieved January 21, 2021 from
[26]
Peiquan Jin, Jianchuan Li, and Hai Long. 2021. DLC: A New Compaction Scheme for LSM-tree with High Stability and Low Latency. In Proceedings of the 24th International Conference on Extending Database Technology, EDBT. 547--557.
[27]
Jianchuang Li, Peiquan Jin, and Shouhong Wan. 2020. Adaptive Lazy Compaction with High Stability and Low Latency for Data-Intensive Systems. In Proceedings of the IEEE International Conference on Big Data (Big Data). 5753--5755.
[28]
Lanyue Lu, Thanumalayan Sankaranarayana Pillai, Hariharan Gopalakrishnan, Andrea C Arpaci-Dusseau, and Remzi H Arpaci-Dusseau. 2017. Wisckey: Separating keys from values in ssd-conscious storage. ACM Transactions on Storage (TOS), Vol. 13, 1 (2017), 1--28.
[29]
Chen Luo and Michael J. Carey. 2019. On Performance Stability in LSM-based Storage Systems. Proceedings of the VLDB Endowment (PVLDB), Vol. 13, 4 (2019), 449--462.
[30]
Microsoft. 2021. Microsoft Azure. https://azure.microsoft.com Retrieved January 21, 2021 from
[31]
Patrick O'Neil, Edward Cheng, Dieter Gawlick, and Elizabeth O'Neil. 1996. The log-structured merge-tree (LSM-tree). Acta Informatica, Vol. 33, 4 (1996), 351--385.
[32]
Fengfeng Pan, Yinliang Yue, and Jin Xiong. 2017. dCompaction: Delayed compaction for the LSM-tree. International Journal of Parallel Programming (IJPP), Vol. 45, 6 (2017), 1310--1325.
[33]
Hieu Pham. 2021. Remote Compactions in RocksDB-Cloud Share. https://rockset.com/blog/remote-compactions-in-rocksdb-cloud/ Retrieved January 21, 2021 from
[34]
Mike Roberts. 2021. Serverless Architectures. https://martinfowler.com/articles/serverless.html Retrieved January 21, 2021 from
[35]
Dejun Teng, Lei Guo, Rubao Lee, Feng Chen, Siyuan Ma, Yanfeng Zhang, and Xiaodong Zhang. 2017. M-tree: Re-enabling buffer caching in data management for mixed reads and writes. In Proceedings of the IEEE 37th International Conference on Distributed Computing Systems (ICDCS). IEEE, 68--79.
[36]
Terark. 2021 a. Dynamic Patricia Trie. https://github.com/Terark/terarkdb/wiki/Dynamic-Patricia-Trie Retrieved January 21, 2021 from
[37]
Terark. 2021 b. TerarkDB Documentation. https://github.com/Terark/terarkdb/wiki Retrieved January 21, 2021 from
[38]
Terark. 2021 c. TerarkZipTable. https://github.com/Terark/terark-zip-rocksdb Retrieved January 21, 2021 from
[39]
Yi Wang, Peiquan Jin, and Shouhong Wan. 2020. HotKey-LSM: A Hotness-Aware LSM-Tree for Big Data Storage. In Proceedings of the IEEE International Conference on Big Data (Big Data). 5849--5851.
[40]
Wikipedia. 2021. Function_as_a_service. https://en.wikipedia.org/wiki/Function_as_a_service Retrieved January 21, 2021 from
[41]
Teng Zhang, Jianying Wang, Xuntao Cheng, Hao Xu, Nanlong Yu, Gui Huang, Tieying Zhang, Dengcheng He, Feifei Li, Wei Cao, et al. 2020. FPGA-Accelerated Compactions for LSM-based Key-Value Store. In Proceedings of the 18th USENIX Conference on File and Storage Technologies (FAST). 225--237.

Cited By

View all
  • (2024)SuccinctKV: a CPU-efficient LSM-tree Based KV Store with Scan-based CompactionACM Transactions on Architecture and Code Optimization10.1145/369587321:4(1-26)Online publication date: 20-Nov-2024
  • (2024)Can Modern LLMs Tune and Configure LSM-based Key-Value Stores?Proceedings of the 16th ACM Workshop on Hot Topics in Storage and File Systems10.1145/3655038.3665954(116-123)Online publication date: 8-Jul-2024
  • (2024)CaaS-LSM: Compaction-as-a-Service for LSM-based Key-Value Stores in Storage Disaggregated InfrastructureProceedings of the ACM on Management of Data10.1145/36549272:3(1-28)Online publication date: 30-May-2024
  • Show More Cited By

Index Terms

  1. Elastic and Stable Compaction for LSM-tree: A FaaS-Based Approach on TerarkDB

    Recommendations

    Comments

    Please enable JavaScript to view thecomments powered by Disqus.

    Information & Contributors

    Information

    Published In

    cover image ACM Conferences
    CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
    October 2021
    4966 pages
    ISBN:9781450384469
    DOI:10.1145/3459637
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Sponsors

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 30 October 2021

    Permissions

    Request permissions for this article.

    Check for updates

    Author Tags

    1. compaction
    2. elastic computing
    3. faas
    4. key-value store
    5. lsm-tree

    Qualifiers

    • Research-article

    Funding Sources

    • National Science Foundation of China

    Conference

    CIKM '21
    Sponsor:

    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%

    Upcoming Conference

    CIKM '25

    Contributors

    Other Metrics

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)104
    • Downloads (Last 6 weeks)13
    Reflects downloads up to 23 Nov 2024

    Other Metrics

    Citations

    Cited By

    View all
    • (2024)SuccinctKV: a CPU-efficient LSM-tree Based KV Store with Scan-based CompactionACM Transactions on Architecture and Code Optimization10.1145/369587321:4(1-26)Online publication date: 20-Nov-2024
    • (2024)Can Modern LLMs Tune and Configure LSM-based Key-Value Stores?Proceedings of the 16th ACM Workshop on Hot Topics in Storage and File Systems10.1145/3655038.3665954(116-123)Online publication date: 8-Jul-2024
    • (2024)CaaS-LSM: Compaction-as-a-Service for LSM-based Key-Value Stores in Storage Disaggregated InfrastructureProceedings of the ACM on Management of Data10.1145/36549272:3(1-28)Online publication date: 30-May-2024
    • (2024)Scavenger: Better Space-Time Trade-Offs for Key-Value Separated LSM-trees2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00312(4072-4085)Online publication date: 13-May-2024
    • (2023)WA-Zone: Wear-Aware Zone Management Optimization for LSM-Tree on ZNS SSDsACM Transactions on Architecture and Code Optimization10.1145/363748821:1(1-23)Online publication date: 13-Dec-2023
    • (2023)Increase Merge Efficiency in LSM Trees Through Coordinated Partitioning of Sorted Runs2023 IEEE International Conference on Big Data (BigData)10.1109/BigData59044.2023.10386222(530-535)Online publication date: 15-Dec-2023
    • (2022)ByteGraphProceedings of the VLDB Endowment10.14778/3554821.355482415:12(3306-3318)Online publication date: 1-Aug-2022
    • (2022)ZonedStore: A Concurrent ZNS-Aware Cache System for Cloud Data Storage2022 IEEE 42nd International Conference on Distributed Computing Systems (ICDCS)10.1109/ICDCS54860.2022.00148(1322-1325)Online publication date: Jul-2022
    • (2022)Revisiting LSM-Tree-Based Key-Value Stores for ZNS SSDs2022 IEEE International Conference on Big Data (Big Data)10.1109/BigData55660.2022.10020961(6772-6774)Online publication date: 17-Dec-2022
    • (2021)Exploring Index Structures for Zoned Namespaces SSDs2021 IEEE International Conference on Big Data (Big Data)10.1109/BigData52589.2021.9671606(5919-5922)Online publication date: 15-Dec-2021

    View Options

    Login options

    View options

    PDF

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader

    Media

    Figures

    Other

    Tables

    Share

    Share

    Share this Publication link

    Share on social media