Article
DOI: 10.5555/1267308.1267318

EXPLODE: a lightweight, general system for finding serious storage system errors

Published: 06 November 2006

Abstract

Storage systems such as file systems, databases, and RAID systems have a simple, basic contract: you give them data, they do not lose or corrupt it. Often they store the only copy, making its irrevocable loss almost arbitrarily bad. Unfortunately, their code is exceptionally hard to get right, since it must correctly recover from any crash at any program point, no matter how their state was smeared across volatile and persistent memory.
This paper describes EXPLODE, a system that makes it easy to systematically check real storage systems for errors. It takes user-written, potentially system-specific checkers and uses them to drive a storage system into tricky corner cases, including crash recovery errors. EXPLODE uses a novel adaptation of ideas from model checking, a comprehensive, heavy-weight formal verification technique, that makes its checking more systematic (and hopefully more effective) than a pure testing approach while being just as lightweight.
EXPLODE is effective. It found serious bugs in a broad range of real storage systems (without requiring source code): three version control systems, Berkeley DB, an NFS implementation, ten file systems, a RAID system, and the popular VMware GSX virtual machine. We found bugs in every system we checked, 36 bugs in total, typically with little effort.
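
The crash-recovery checking that the abstract describes can be illustrated with a toy sketch. This is not EXPLODE's interface or algorithm: the dictionary-based "disk", the `dir:`/`data:` key scheme, and the function names `crash_images`, `check`, and `find_violations` are all hypothetical, chosen only to show the general idea of enumerating the states a crash could leave behind and running a user-written checker on each one.

```python
from itertools import combinations

def crash_images(disk, pending, ordered):
    """Yield each disk image a crash could leave behind (toy model).

    ordered=True: writes reach disk in issue order, so only prefixes persist.
    ordered=False: any subset of the pending writes may persist.
    """
    if ordered:
        subsets = [pending[:k] for k in range(len(pending) + 1)]
    else:
        subsets = [list(c) for k in range(len(pending) + 1)
                   for c in combinations(pending, k)]
    for subset in subsets:
        image = dict(disk)      # start from what is already persistent
        image.update(subset)    # apply the writes that survived the crash
        yield image

def check(image):
    """Hypothetical checker: every directory entry must point at an existing block."""
    return all(target in image
               for key, target in image.items() if key.startswith("dir:"))

def find_violations(disk, pending, ordered):
    """Return every reachable crash image that fails the checker."""
    return [img for img in crash_images(disk, pending, ordered) if not check(img)]

# Creating file "f": write the data block, then the directory entry.
pending = [("data:f", "hello"), ("dir:f", "data:f")]
unordered = find_violations({}, pending, ordered=False)
in_order = find_violations({}, pending, ordered=True)
print(len(unordered), len(in_order))  # prints "1 0"
```

In this toy model the only failing image is the one where the directory entry reached disk but its data block did not, which is possible only when writes may persist out of order. A real checker would exercise an actual storage system rather than a dictionary.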

Published In

OSDI '06: Proceedings of the 7th USENIX Symposium on Operating Systems Design and Implementation - Volume 7
November 2006
53 pages

Sponsors

  • USENIX Assoc

Publisher

USENIX Association

United States
