Countering Fragmentation in an Enterprise Storage System

Published: 16 January 2020

Abstract

As a file system ages, it can experience multiple forms of fragmentation. Fragmentation of the free space in the file system can lower write performance and subsequent read performance. Client operations as well as internal operations, such as deduplication, can fragment the layout of an individual file, which also impacts file read performance. File systems that allow sub-block granular addressing can accumulate intra-block fragmentation, which leads to wasted free space. Similarly, wasted space can also occur when a file system writes a collection of blocks out to object storage as a single large object, because the constituent blocks can become free at different times. The impact of fragmentation also depends on the underlying storage media. This article studies each form of fragmentation in the NetApp® WAFL® file system and explains how the file system leverages a storage virtualization layer for defragmentation techniques that physically relocate blocks efficiently, including those in read-only snapshots. The article analyzes the effectiveness of these techniques at reducing fragmentation and improving overall performance across various storage media.
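To make the object-storage case concrete, the following is a minimal sketch, not the WAFL implementation. It assumes a hypothetical model in which an object holds a fixed number of blocks and can only be deleted as a whole; the class `StoredObject` and its methods are invented for illustration. It shows how blocks freed at different times leave dead space inside an object that cannot yet be reclaimed.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    """Hypothetical large object that packs a fixed number of blocks."""

    total_blocks: int
    freed: set = field(default_factory=set)  # indices of blocks no longer in use

    def free_block(self, index: int) -> None:
        """Mark one constituent block as free; the object itself stays in place."""
        if not 0 <= index < self.total_blocks:
            raise IndexError(index)
        self.freed.add(index)

    def wasted_fraction(self) -> float:
        """Fraction of the object occupied by blocks that are already free."""
        return len(self.freed) / self.total_blocks

    def reclaimable(self) -> bool:
        """Assumption: the object can be deleted only when every block is free."""
        return len(self.freed) == self.total_blocks


obj = StoredObject(total_blocks=8)
for i in (0, 3, 5):  # three blocks become free at different times
    obj.free_block(i)

print(obj.wasted_fraction())  # 0.375
print(obj.reclaimable())      # False
```

Under this assumed model, three of the eight blocks are dead space, yet the object must be retained until the remaining five are freed or relocated.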

Cited By

  • (2022) [Retracted] Digital Transformation and Firm Performance in the Context of Sustainability: Mediating Effects Based on Behavioral Integration. Journal of Environmental and Public Health 2022:1. DOI: 10.1155/2022/8220940. Online publication date: 30-Sep-2022
  • (2022) File fragmentation from the perspective of I/O control. Proceedings of the 14th ACM Workshop on Hot Topics in Storage and File Systems (126-132). DOI: 10.1145/3538643.3539746. Online publication date: 27-Jun-2022
  • (2021) FragPicker. Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles (280-294). DOI: 10.1145/3477132.3483593. Online publication date: 26-Oct-2021

Published In

ACM Transactions on Storage, Volume 15, Issue 4
USENIX FAST 2019 Special Section and Regular Papers
November 2019
228 pages
ISSN: 1553-3077
EISSN: 1553-3093
DOI: 10.1145/3373756
Editor: Sam H. Noh
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 16 January 2020
Accepted: 01 October 2019
Received: 01 June 2019
Published in TOS Volume 15, Issue 4

Author Tags

  1. storage system
  2. deduplication
  3. file system
  4. file system performance
  5. fragmentation
  6. snapshot

Qualifiers

  • Research-article
  • Research
  • Refereed
