Research article. DOI: 10.1145/3337821.3337879

CostPI: Cost-Effective Performance Isolation for Shared NVMe SSDs

Published: 05 August 2019

Abstract

NVMe SSDs have been widely adopted to provide storage services in cloud platforms where diverse workloads (latency-sensitive, throughput-oriented, and capacity-oriented) are colocated. To achieve performance isolation, existing solutions partition the shared SSD into multiple isolated regions and assign each workload a separate region. However, these solutions can lead to inefficient resource utilization and imbalanced wear. More importantly, they cannot reduce the interference caused by contention for the embedded cache. In this paper, we present CostPI, which improves isolation and resource utilization by giving latency-sensitive workloads dedicated resources (data cache, mapping table cache, and NAND flash) and giving throughput-oriented and capacity-oriented workloads shared resources. Specifically, at the NVMe queue level, we present an SLO-aware arbitration mechanism that fetches requests from NVMe queues at different granularities according to workload SLOs. At the embedded cache level, we use an asymmetric allocation scheme to partition the cache (both the data cache and the mapping table cache). For different data cache partitions, we adopt different cache policies to meet diverse workload requirements while reducing imbalanced wear. At the NAND flash level, we partition the hardware resources at channel granularity to enable the strongest isolation. Our experiments show that, for latency-sensitive workloads, CostPI can reduce the average response time by up to 44.2%, the 99th-percentile response time by up to 89.5%, and the 99.9th-percentile response time by up to 88.5%. Meanwhile, CostPI can increase resource utilization and reduce wear imbalance for the shared NVMe SSD.
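As a rough illustration of what fetching from NVMe queues "at different granularities" could look like, the sketch below uses a plain weighted round-robin loop in Python. It is not the arbitration mechanism from the paper: the function name `arbitrate`, the queue names, and the per-queue burst sizes are all hypothetical, and the abstract does not say which workload class receives the larger or smaller granularity.

```python
from collections import deque


def arbitrate(queues, burst, rounds):
    """Weighted round-robin fetch (illustrative assumption, not CostPI itself).

    On every pass, take at most burst[name] requests from each queue,
    so queues with a larger burst are drained at a coarser granularity.
    """
    order = []
    for _ in range(rounds):
        for name, q in queues.items():
            for _ in range(burst[name]):
                if not q:
                    break  # queue is empty, move on to the next one
                order.append(q.popleft())
    return order


# Hypothetical submission queues and burst sizes.
queues = {
    "latency": deque(["L0", "L1"]),
    "throughput": deque(["T0", "T1", "T2", "T3"]),
}
burst = {"latency": 1, "throughput": 2}

print(arbitrate(queues, burst, rounds=2))
# → ['L0', 'T0', 'T1', 'L1', 'T2', 'T3']
```

In this toy setup the burst size is the only knob; an SLO-aware scheme would presumably derive it from each workload's latency or throughput target rather than hard-coding it.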


Published In

ICPP '19: Proceedings of the 48th International Conference on Parallel Processing
August 2019
1107 pages
ISBN: 9781450362955
DOI: 10.1145/3337821

In-Cooperation

  • University of Tsukuba

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. NVMe SSDs
  2. performance isolation
  3. resource utilization

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • National Defense Preliminary Research Project
  • Hubei Province Technical Innovation Special Project
  • Wuhan Application Basic Research Project
  • National Natural Science Foundation of China
  • Fundamental Research Funds for the Central Universities

Conference

ICPP 2019

Acceptance Rates

Overall Acceptance Rate 91 of 313 submissions, 29%

Cited By

  • (2024) CoFS: A Collaboration-Aware Fairness Scheme for NVMe SSD in Cloud Storage System. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 43:12 (4490-4504). DOI: 10.1109/TCAD.2024.3412970. Online publication date: Dec-2024
  • (2024) Minato: A Read-Disturb-Aware Dynamic Buffer Management Scheme for NAND Flash Memory. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 43:7 (1930-1943). DOI: 10.1109/TCAD.2024.3364109. Online publication date: Jul-2024
  • (2024) Highly VM-Scalable SSD in Cloud Storage Systems. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 43:1 (113-126). DOI: 10.1109/TCAD.2023.3305573. Online publication date: Jan-2024
  • (2024) Fair-ZNS: Enhancing Fairness in ZNS SSDs Through Self-Balancing I/O Scheduling. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 43:7 (2012-2022). DOI: 10.1109/TCAD.2022.3232997. Online publication date: Jul-2024
  • (2023) QoS-pro: A QoS-enhanced Transaction Processing Framework for Shared SSDs. ACM Transactions on Architecture and Code Optimization. DOI: 10.1145/3632955. Online publication date: 14-Nov-2023
  • (2023) A Space-Efficient Fair Cache Scheme Based on Machine Learning for NVMe SSDs. IEEE Transactions on Parallel and Distributed Systems 34:1 (383-399). DOI: 10.1109/TPDS.2022.3221410. Online publication date: 1-Jan-2023
  • (2023) A State-Aware Method for Flows With Fairness on NVMe SSDs With Load Balance. IEEE Transactions on Cloud Computing (1-16). DOI: 10.1109/TCC.2023.3253864. Online publication date: 2023
  • (2023) EBIO: An Efficient Block I/O Stack for NVMe SSDs With Mixed Workloads. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 42:12 (5048-5060). DOI: 10.1109/TCAD.2023.3296369. Online publication date: Dec-2023
  • (2022) Improving Fairness for SSD Devices through DRAM Over-Provisioning Cache Management. IEEE Transactions on Parallel and Distributed Systems 33:10 (2444-2454). DOI: 10.1109/TPDS.2022.3143295. Online publication date: 1-Oct-2022
  • (2022) WA-OPShare: Workload-Adaptive Over-Provisioning Space Allocation for Multi-Tenant SSDs. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 41:11 (4527-4538). DOI: 10.1109/TCAD.2022.3199966. Online publication date: Nov-2022
