DOI: 10.1145/3337821.3337884
research-article

A Read-leveling Data Distribution Scheme for Promoting Read Performance in SSDs with Deduplication

Published: 05 August 2019

Abstract

Deduplication, a space-saving technology, is widely deployed in flash-based storage systems to address the capacity and endurance limitations of flash devices. In this paper, we find that deduplication changes the physical data layout, which makes an uneven read distribution more likely. An uneven read distribution not only increases access contention but also reduces read parallelism, and therefore degrades read performance. To solve this issue, we propose an efficient read-leveling data distribution scheme (RLDDS), which scatters highly duplicated data across different parallel units to improve the read performance of SSDs with deduplication under access-intensive workloads. RLDDS writes data into a parallel unit with lower potential read-hotness to balance the read distribution among all the parallel units. Extensive experimental results show that RLDDS improves read performance by up to 21.61% compared to deduplication with the conventional dynamic data allocation scheme. Additional benefits of RLDDS include improved write performance (up to 23.69%) in access-intensive workloads and an overall system performance improvement (up to 18.22%) with the same write traffic reduction.
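The placement idea in the abstract (send each new write to the parallel unit with the lowest potential read-hotness) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how read-hotness is estimated, so the per-chunk `hotness` value, the `ParallelUnit` class, and the tie-breaking rule below are all hypothetical assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ParallelUnit:
    """Hypothetical model of one parallel unit (e.g. a channel or chip)."""
    unit_id: int
    read_hotness: int = 0  # assumed running estimate of expected reads


def pick_unit(units):
    # Choose the unit with the lowest estimated read-hotness.
    # Ties go to the lowest unit_id (an arbitrary, assumed rule).
    return min(units, key=lambda u: (u.read_hotness, u.unit_id))


def place_writes(chunks, num_units):
    """Place (fingerprint, hotness) chunks across num_units parallel units.

    `hotness` is an assumed per-write estimate of future reads; how the
    paper derives it is not stated in the abstract.
    """
    units = [ParallelUnit(i) for i in range(num_units)]
    location = {}  # fingerprint -> unit_id of the single stored copy
    for fingerprint, hotness in chunks:
        if fingerprint in location:
            # Deduplicated write: no new physical page is written, but the
            # unit holding the shared copy is expected to serve more reads.
            units[location[fingerprint]].read_hotness += hotness
            continue
        target = pick_unit(units)
        target.read_hotness += hotness
        location[fingerprint] = target.unit_id
    return location, [u.read_hotness for u in units]


if __name__ == "__main__":
    demo = [("a", 5), ("b", 1), ("c", 1), ("a", 5), ("d", 3)]
    print(place_writes(demo, num_units=2))
```

In this toy run the duplicated chunk `"a"` makes its unit hot, so the later unique chunks are steered to the other unit. A real flash translation layer would also have to account for garbage collection and write balance, which this sketch ignores.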

    Published In

    ACM Other conferences
    ICPP '19: Proceedings of the 48th International Conference on Parallel Processing
    August 2019
    1107 pages
    ISBN: 9781450362955
    DOI: 10.1145/3337821

    In-Cooperation

    • University of Tsukuba

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. SSD
    2. access contention
    3. data allocation
    4. deduplication
    5. read parallelism

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICPP 2019

    Acceptance Rates

    Overall Acceptance Rate 91 of 313 submissions, 29%

    Cited By

    • (2024) H2C-Dedup: Reducing I/O and GC Amplification for QLC SSDs from the Deduplication Metadata Perspective. Proceedings of the 2024 ACM Symposium on Cloud Computing, 704-719. DOI: 10.1145/3698038.3698507. Online publication date: 20-Nov-2024
    • (2023) RadarSSD: A Computational Storage for Radar Signal Processing. Proceedings of the 52nd International Conference on Parallel Processing, 244-253. DOI: 10.1145/3605573.3605628. Online publication date: 7-Aug-2023
    • (2023) ERP: An Efficient Rewrite Scheme to Improve the Inline Deduplication Restore Performance in Backup Systems. 2022 IEEE 28th International Conference on Parallel and Distributed Systems (ICPADS), 371-378. DOI: 10.1109/ICPADS56603.2022.00055. Online publication date: Jan-2023
    • (2022) Dedup-for-speed. Proceedings of the 15th ACM International Conference on Systems and Storage, 128-139. DOI: 10.1145/3534056.3534937. Online publication date: 6-Jun-2022
    • (2022) EDC: An Elastic Data Cache to Optimizing the I/O Performance in Deduplicated SSDs. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 41:7, 2250-2262. DOI: 10.1109/TCAD.2021.3101404. Online publication date: Jul-2022
    • (2022) Research on Data Routing Strategy of Deduplication in Cloud Environment. IEEE Access 10, 9529-9542. DOI: 10.1109/ACCESS.2021.3139757. Online publication date: 2022
    • (2021) Coupling Right-Provisioned Cold Storage Data Centers with Deduplication. Proceedings of the 50th International Conference on Parallel Processing, 1-11. DOI: 10.1145/3472456.3472485. Online publication date: 9-Aug-2021
    • (2021) Smart-DNN: Efficiently Reducing the Memory Requirements of Running Deep Neural Networks on Resource-constrained Platforms. 2021 IEEE 39th International Conference on Computer Design (ICCD), 533-541. DOI: 10.1109/ICCD53106.2021.00087. Online publication date: Oct-2021
    • (2021) A Cost-Efficient Metadata Scheme for High-Performance Deduplication Systems. 2021 IEEE 23rd Int Conf on High Performance Computing & Communications; 7th Int Conf on Data Science & Systems; 19th Int Conf on Smart City; 7th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys), 49-56. DOI: 10.1109/HPCC-DSS-SmartCity-DependSys53884.2021.00034. Online publication date: Dec-2021
    • (2020) Delta-DNN: Efficiently Compressing Deep Neural Networks via Exploiting Floats Similarity. Proceedings of the 49th International Conference on Parallel Processing, 1-12. DOI: 10.1145/3404397.3404408. Online publication date: 17-Aug-2020
