DOI: 10.1145/3627703.3629566
research-article

CSAL: the Next-Gen Local Disks for the Cloud

Published: 22 April 2024

Abstract

Cloud local disks are attractive for their affordable price and high performance. Recent advances in CPUs motivate cloud vendors to multiplex computing resources further and serve more users per server. Unfortunately, such proposals are constrained by the limited number of cloud local disks each server can offer, because the underlying storage devices are either large but slow (e.g., HDDs) or fast but small (e.g., NVMe SSDs).
In this paper, we explore the possibility of using high-capacity QLC-based SSDs for cloud local disks. However, three unsuccessful preliminary attempts indicate that QLC SSDs cannot simply serve as a drop-in replacement. The root cause is two levels of write amplification: one caused by device-level address mapping with an Indirection Unit, and one caused by NAND-level garbage collection.
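To make the first level of write amplification concrete, the following is a minimal sketch of the arithmetic, not code from the paper. It assumes a device that maps addresses at Indirection Unit (IU) granularity and must rewrite every IU a host write touches; the 64KB IU size and the function name are illustrative assumptions.

```python
BLOCK = 4 * 1024   # host write size used in the example (4KB)
IU = 64 * 1024     # ASSUMED indirection unit size; not stated in the abstract


def iu_write_amplification(offset: int, length: int, iu: int = IU) -> float:
    """Bytes the device rewrites divided by bytes the host asked to write.

    Simplified model: every IU that the write touches, even partially,
    is rewritten in full (read-modify-write).
    """
    if length <= 0:
        raise ValueError("length must be positive")
    first_iu = offset // iu
    last_iu = (offset + length - 1) // iu
    touched = last_iu - first_iu + 1
    return (touched * iu) / length


# A single aligned 4KB write dirties one whole 64KB IU.
small = iu_write_amplification(offset=0, length=BLOCK)
# An IU-sized, IU-aligned write incurs no extra device writes.
aligned = iu_write_amplification(offset=0, length=IU)
# The same size shifted by 4KB straddles two IUs.
misaligned = iu_write_amplification(offset=BLOCK, length=IU)
```

Under this model, small or misaligned writes are the expensive case, which is why buffering and batching writes before they reach the QLC device can help. NAND-level garbage collection adds further amplification that this sketch does not model.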
With these lessons learned, we propose CSAL, the next-generation local disk in Alibaba Cloud. CSAL combines a high-performance SSD serving as a write buffer with a large-capacity QLC SSD for persistence. With a two-level Logical-to-Physical (L2P) address mapping table, CSAL achieves fine-grained (4KB) data access and significantly alleviates both levels of write amplification. In our evaluation, CSAL consistently outperforms its peers, achieving up to 2.22×, 1.82×, and 2.03× speedups over the second-best alternative in micro, application, and deployment benchmarks, respectively. To date, we have deployed CSAL on thousands of servers and released it as open source.
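The general idea of a fast write buffer in front of a capacity device, tied together by a 4KB-granularity logical-to-physical map, can be illustrated with a toy model. This is a sketch under stated assumptions and does not reproduce CSAL's design: it uses one flat map rather than the paper's two-level table, omits garbage collection and crash recovery, and every name in it (`ToyTieredDisk`, `flush`, the 64KB IU) is hypothetical.

```python
BLOCK = 4096                  # mapping granularity from the abstract (4KB)
IU = 64 * 1024                # ASSUMED indirection unit of the capacity device
BLOCKS_PER_IU = IU // BLOCK


class ToyTieredDisk:
    """Toy model: small writes land in a buffer log, then move to the
    capacity tier only in whole-IU batches."""

    def __init__(self):
        self.l2p = {}              # lba -> ("buf" | "qlc", index in that tier)
        self.buffer = []           # append-only log on the fast device
        self.qlc = []              # capacity device contents, one entry per 4KB
        self.qlc_bytes_written = 0

    def write(self, lba: int, data: bytes) -> None:
        assert len(data) == BLOCK
        # Overwrites simply repoint the map; the old copy becomes stale.
        self.l2p[lba] = ("buf", len(self.buffer))
        self.buffer.append((lba, data))

    def flush(self) -> None:
        # Only the newest buffered copy of each block is still live.
        live = [(lba, data) for idx, (lba, data) in enumerate(self.buffer)
                if self.l2p.get(lba) == ("buf", idx)]
        for lba, data in live:
            self.l2p[lba] = ("qlc", len(self.qlc))
            self.qlc.append(data)
        # Pad the tail so the capacity device only sees whole-IU writes.
        pad = -len(self.qlc) % BLOCKS_PER_IU
        self.qlc.extend([bytes(BLOCK)] * pad)
        self.qlc_bytes_written += (len(live) + pad) * BLOCK
        self.buffer.clear()

    def read(self, lba: int) -> bytes:
        tier, idx = self.l2p[lba]
        return self.buffer[idx][1] if tier == "buf" else self.qlc[idx]


disk = ToyTieredDisk()
for i in range(BLOCKS_PER_IU):
    disk.write(i, bytes([i]) * BLOCK)
disk.write(3, b"\xff" * BLOCK)     # overwrite absorbed by the buffer
disk.flush()
```

In this run, seventeen 4KB host writes result in a single IU-sized write to the capacity tier, because the overwrite of block 3 is absorbed in the buffer before the flush.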

Published In

EuroSys '24: Proceedings of the Nineteenth European Conference on Computer Systems
April 2024
1245 pages
ISBN:9798400704376
DOI:10.1145/3627703
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Caching
  2. Cloud Storage
  3. NAND Flash

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

EuroSys '24
Acceptance Rates

Overall Acceptance Rate 241 of 1,308 submissions, 18%

Bibliometrics

Article Metrics

  • Total Citations: 0
  • Total Downloads: 929
  • Downloads (Last 12 months): 929
  • Downloads (Last 6 weeks): 97
Reflects downloads up to 17 Nov 2024
