Article, DOI: 10.5555/2813767.2813786

SpanFS: a scalable file system on fast storage devices

Published: 08 July 2015

Abstract

Most recent storage devices, such as NAND flash-based solid state drives (SSDs), provide low access latency and a high degree of parallelism. However, conventional file systems, which were designed for slow hard disk drives, often encounter severe scalability bottlenecks when exploiting these fast storage devices on many-core architectures. To scale file systems to many cores, we propose SpanFS, a novel file system that consists of a collection of micro file system services called domains. SpanFS distributes files and directories among the domains, provides a global file system view on top of the domains, and maintains consistency in case of system crashes.
SpanFS is implemented based on the Ext4 file system. Experimental results evaluating SpanFS against Ext4 on a modern PCI-E SSD show that SpanFS scales much better than Ext4 on a 32-core machine. In micro-benchmarks SpanFS outperforms Ext4 by up to 1226%. In application-level benchmarks SpanFS improves the performance by up to 73% relative to Ext4.
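
The domain idea described in the abstract can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions, not the SpanFS implementation: the abstract does not say how files are placed in domains, so the hash-based placement below is hypothetical, and the names `Domain`, `GlobalView`, and `domain_for` are invented for illustration. Crash consistency is not modeled at all. The sketch shows only that each domain holds its own lock and namespace, so operations on files in different domains do not contend, while a thin layer on top presents one global view.

```python
import threading
import zlib


class Domain:
    """Hypothetical micro file system service with a private lock and namespace."""

    def __init__(self, domain_id):
        self.domain_id = domain_id
        self.lock = threading.Lock()  # contention is limited to this domain
        self.files = {}               # path -> bytes

    def write(self, path, data):
        with self.lock:
            self.files[path] = data

    def read(self, path):
        with self.lock:
            return self.files[path]


class GlobalView:
    """Presents a single namespace on top of several independent domains."""

    def __init__(self, num_domains):
        self.domains = [Domain(i) for i in range(num_domains)]

    def domain_for(self, path):
        # Assumption: stable hash placement. The real placement policy is
        # not stated in the abstract.
        index = zlib.crc32(path.encode("utf-8")) % len(self.domains)
        return self.domains[index]

    def write(self, path, data):
        self.domain_for(path).write(path, data)

    def read(self, path):
        return self.domain_for(path).read(path)

    def listing(self):
        # The global view is the union of all per-domain namespaces.
        names = []
        for domain in self.domains:
            with domain.lock:
                names.extend(domain.files)
        return sorted(names)


if __name__ == "__main__":
    fs = GlobalView(num_domains=4)
    for i in range(8):
        fs.write(f"/data/file{i}", b"x" * i)
    print(fs.listing())
    print({d.domain_id: len(d.files) for d in fs.domains})
```

In this toy model two threads writing to files in different domains never take the same lock, which is the kind of contention reduction the abstract attributes to splitting one file system into domains.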

Published In

USENIX ATC '15: Proceedings of the 2015 USENIX Conference on Usenix Annual Technical Conference
July 2015
625 pages
ISBN:9781931971225

Sponsors

  • VMware
  • NetApp
  • Google Inc.
  • Facebook
  • HP

Publisher

USENIX Association

United States

Qualifiers

  • Article

Cited By

  • (2024) I/O in a flash. Proceedings of the 22nd USENIX Conference on File and Storage Technologies, pp. 177-192. DOI: 10.5555/3650697.3650708. Online publication date: 27-Feb-2024
  • (2024) ScaleCache: A Scalable Page Cache for Multiple Solid-State Drives. Proceedings of the Nineteenth European Conference on Computer Systems, pp. 641-656. DOI: 10.1145/3627703.3629588. Online publication date: 22-Apr-2024
  • (2022) Container-aware I/O stack: bridging the gap between container storage drivers and solid state devices. Proceedings of the 18th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, pp. 18-30. DOI: 10.1145/3516807.3516818. Online publication date: 25-Feb-2022
  • (2021) SPMFS: A Scalable Persistent Memory File System on Optane Persistent Memory. Proceedings of the 50th International Conference on Parallel Processing, pp. 1-10. DOI: 10.1145/3472456.3472503. Online publication date: 9-Aug-2021
  • (2019) Flexgroup volumes. Proceedings of the 2019 USENIX Conference on Usenix Annual Technical Conference, pp. 135-148. DOI: 10.5555/3358807.3358820. Online publication date: 10-Jul-2019
  • (2019) Finding and Fixing Performance Pathologies in Persistent Memory Software Stacks. Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pp. 427-439. DOI: 10.1145/3297858.3304077. Online publication date: 4-Apr-2019
  • (2018) High-performance transaction processing in journaling file systems. Proceedings of the 16th USENIX Conference on File and Storage Technologies, pp. 227-240. DOI: 10.5555/3189759.3189781. Online publication date: 12-Feb-2018
  • (2018) Barrier-enabled IO stack for flash storage. Proceedings of the 16th USENIX Conference on File and Storage Technologies, pp. 211-226. DOI: 10.5555/3189759.3189779. Online publication date: 12-Feb-2018
  • (2018) Bringing Order to Chaos. ACM Transactions on Storage 14:3, pp. 1-29. DOI: 10.1145/3242091. Online publication date: 3-Oct-2018
  • (2017) Scaling a file system to many cores using an operation log. Proceedings of the 26th Symposium on Operating Systems Principles, pp. 69-86. DOI: 10.1145/3132747.3132779. Online publication date: 14-Oct-2017
