DOI: 10.1145/3655038.3665951
research-article
Open access

Revisiting Erasure Codes: A Configuration Perspective

Published: 08 July 2024

Abstract

Erasure coding (EC) plays a crucial role in the fault tolerance of modern distributed storage systems (DSS). Inspired by recent research on storage configuration, this paper studies the configuration sensitivity of EC in real DSS. We systematically inject faults to trigger EC recovery under various configurations and quantitatively measure the impact on recovery time and storage overhead. Our results show that configurations may significantly affect EC recovery time (e.g., by up to 426%). More interestingly, theoretically superior codes may perform worse in DSS under certain configurations. In addition, a system checking period precedes EC recovery and accounts for 41% to 58% of the overall system recovery time, a cost that previous studies have largely ignored. Finally, in terms of storage overhead, EC may introduce 32.3% to 72.0% more write amplification (WA) than the theoretical expectation, and we derive a formula to help estimate WA more precisely. Our work suggests the importance of considering the context of real DSS in EC research, and we hope the methodology and findings can contribute to a firmer footing for EC optimization in practice.
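To make the write amplification figures concrete, the sketch below works through the arithmetic under two stated assumptions. First, it takes the "theoretical expectation" to be the textbook storage ratio (k + m) / k for a code with k data chunks and m parity chunks; the abstract does not define it, and the formula the paper derives is not reproduced on this page. Second, it reads "32.3% to 72.0% more" as a multiplicative increase over that baseline. The function names and the (k=4, m=2) profile are hypothetical and chosen only for illustration.

```python
def theoretical_wa(k: int, m: int) -> float:
    """Idealized write amplification of a (k, m) erasure code.

    Assumption: bytes stored per byte of user data equals (k + m) / k.
    This is the textbook ratio, not the formula derived in the paper.
    """
    if k <= 0 or m < 0:
        raise ValueError("k must be positive and m must be non-negative")
    return (k + m) / k


def observed_wa_range(k: int, m: int,
                      extra_low: float = 0.323,
                      extra_high: float = 0.720) -> tuple[float, float]:
    """Scale the idealized WA by the extra overhead range quoted in the abstract.

    Assumption: "32.3% to 72.0% more WA" means multiplying the idealized
    value by 1.323 and 1.720 respectively.
    """
    base = theoretical_wa(k, m)
    return base * (1.0 + extra_low), base * (1.0 + extra_high)


if __name__ == "__main__":
    # Hypothetical profile: 4 data chunks, 2 parity chunks.
    k, m = 4, 2
    low, high = observed_wa_range(k, m)
    print(f"idealized WA for ({k},{m}): {theoretical_wa(k, m):.3f}")
    print(f"WA range implied by the abstract: {low:.3f} to {high:.3f}")
```

Under these assumptions a (4, 2) profile has an idealized WA of 1.5, so the reported overhead would put the measured WA roughly between 1.98 and 2.58. The actual values depend on the system and configuration studied in the paper.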



Published In

HotStorage '24: Proceedings of the 16th ACM Workshop on Hot Topics in Storage and File Systems
July 2024
141 pages
ISBN: 9798400706301
DOI: 10.1145/3655038
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

HOTSTORAGE '24
Acceptance Rates

Overall Acceptance Rate 34 of 87 submissions, 39%

