Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article
Free access

IronFleet: proving safety and liveness of practical distributed systems

Published: 26 June 2017 Publication History

Abstract

Distributed systems are notorious for harboring subtle bugs. Verification can, in principle, eliminate these bugs, but it has historically been difficult to apply at full-program scale, much less distributed system scale. We describe a methodology for building practical and provably correct distributed systems based on a unique blend of temporal logic of actions-style state-machine refinement and Hoare-logic verification. We demonstrate the methodology on a complex implementation of a Paxos-based replicated state machine library and a lease-based sharded key-value store. We prove that each obeys a concise safety specification as well as desirable liveness requirements. Each implementation achieves performance competitive with a reference system. With our methodology and lessons learned, we aim to raise the standard for distributed systems from "tested" to "correct."

References

[1]
Bolosky, W.J., Douceur, J.R., Howell, J. The Farsite project: a retrospective. ACM SIGOPS Oper. Syst. Rev. 41, 2 (Apr. 2007), 17--26.
[2]
de Moura, L.M., Bjørner, N. Z3: An efficient SMT solver. In Proceedings of the Conference on Tools and Algorithms for the Construction and Analysis of Systems (2008).
[3]
Detlefs, D., Nelson, G., Saxe, J.B. Simplify: A theorem prover for program checking. J. ACM 52 (2003), 365--473.
[4]
Fischer, M.J., Lynch, N.A., Paterson, M.S. Impossibility of distributed consensus with one faulty process. J.ACM 32, 2 (Apr. 1985), 374--382.
[5]
Floyd, R. Assigning meanings to programs. In Proceedings of Symposia in Applied Mathematics (1967). American Mathematical Society, 19--32.
[6]
Hawblitzel, C., Howell, J., Lorch, J.R., Narayan, A., Parno, B., Zhang, D., Zill, B. Ironclad apps: End-to-end security via automated full-system verification. In Proceedings of USENIX OSDI (Oct. 2014).
[7]
Hoare, T. An axiomatic basis for computer programming. Commun. ACM 12 (1969), 576--580.
[8]
Joshi, R., Lamport, L., Matthews, J., Tasiran, S., Tuttle, M., Yu, Y. Checking cache coherence protocols with TLA+. J. Formal Methods Syst. Des. 22, 2 (2003), 125--131.
[9]
Killian, C.E., Anderson, J.W., Braud, R., Jhala, R., Vahdat, A.M. Mace: Language support for building distributed systems. In Proceedings of ACM PLDI (2007).
[10]
Klein, G., Andronick, J., Elphinstone, K., Murray, T., Sewell, T., Kolanski, R., Heiser, G. Comprehensive formal verification of an OS microkernel. ACM Trans. Comput. Syst. 32, 1 (2014), 1--70.
[11]
Lamport, L. The temporal logic of actions. ACM Trans. Program. Lang. Syst. 16, 3 (May 1994), 872--923.
[12]
Lamport, L. The part-time parliament. ACM Trans. Comput. Syst. 16, 2 (May 1998), 133--169.
[13]
Lamport, L. Specifying Systems: The TLA+ Language and Tools for Hardware and Software Engineers. Addison-Wesley, Boston, MA, 2002.
[14]
Leino, K.R.M. Dafny: An automatic program verifier for functional correctness. In Proceedings of the LPAR Conference (2010).
[15]
Lu, T., Merz, S., Weidenbach, C., Bendisposto, J., Leuschel, M., Roggenbach, M., Margaria, T., Padberg, J., Taentzer, G., Lu, T., Merz, S., Weidenbach, C. Model checking the Pastry routing protocol. In 10th International Workshop Automated Verification of Critical Systems (Düsseldorf, Germany, Sep. 2010).
[16]
Moraru, I., Andersen, D.G., Kaminsky, M. There is more consensus in egalitarian parliaments. In Proceedings of the ACM SOSP (2013).
[17]
Musuvathi, M., Park, D., Chou, A., Engler, D., Dill, D.L. CMC: A pragmatic approach to model checking real code. In Proceedings of the Fifth Symposium on Operating Systems Design and Implementation (2002).
[18]
Newcombe, C., Rath, T., Zhang, F., Munteanu, B., Brooker, M., Deardeuff, M. How Amazon Web Services uses formal methods. Commun. ACM 58, 4 (Apr. 2015), 66--73.
[19]
Parkinson, M. The next 700 separation logics. In Proceedings of IFIP VSTTE (Aug. 2010).
[20]
Pek, E., Bogunovic, N. Formal verification of communication protocols in distributed systems. In Proceedings of the Joint Conferences on Computers in Technical Systems and Intelligent Systems, MIPRO (2003).
[21]
Ridge, T. Verifying distributed systems: The operational approach. In Proceedings of the ACM POPL (Jan. 2009).
[22]
Schiper, N., Rahli, V., van Renesse, R., Bickford, M., Constable, R. Developing correctly replicated databases using formal tools. In Proceedings of IEEE/IFIP DSN (June 2014).
[23]
Wilcox, J.R., Woos, D., Panchekha, P., Tatlock, Z., Wang, X., Ernst, M.D., Anderson, T. Verdi: A framework for implementing and formally verifying distributed systems. In Proceedings of ACM PLDI (June 2015).
[24]
Woos, D., Wilcox, J.R., Anton, S., Tatlock, Z., Ernst, M.D., Anderson, T. Planning for change in a formal verification of the Raft consensus protocol. In ACM Conference on Certified Programs and Proofs (CPP) (Jan. 2016).
[25]
Zave, P. Using lightweight modeling to understand Chord. ACM SIGCOMM Comput. Comm. Rev. 42, 2 (Apr. 2012), 49--57.

Cited By

View all
  • (2025)Formal verification of timely knowledge propagation in airborne networksScience of Computer Programming10.1016/j.scico.2024.103184239:COnline publication date: 1-Jan-2025
  • (2024)Foundational Integration Verification of a Cryptographic ServerProceedings of the ACM on Programming Languages10.1145/36564468:PLDI(1704-1729)Online publication date: 20-Jun-2024
  • (2024)Models for Storage in Database BackendsProceedings of the 11th Workshop on Principles and Practice of Consistency for Distributed Data10.1145/3642976.3653036(58-66)Online publication date: 22-Apr-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Communications of the ACM
Communications of the ACM  Volume 60, Issue 7
July 2017
89 pages
ISSN:0001-0782
EISSN:1557-7317
DOI:10.1145/3116227
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 26 June 2017
Published in CACM Volume 60, Issue 7

Permissions

Request permissions for this article.

Check for updates

Qualifiers

  • Research-article
  • Research
  • Refereed

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)438
  • Downloads (Last 6 weeks)57
Reflects downloads up to 24 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2025)Formal verification of timely knowledge propagation in airborne networksScience of Computer Programming10.1016/j.scico.2024.103184239:COnline publication date: 1-Jan-2025
  • (2024)Foundational Integration Verification of a Cryptographic ServerProceedings of the ACM on Programming Languages10.1145/36564468:PLDI(1704-1729)Online publication date: 20-Jun-2024
  • (2024)Models for Storage in Database BackendsProceedings of the 11th Workshop on Principles and Practice of Consistency for Distributed Data10.1145/3642976.3653036(58-66)Online publication date: 22-Apr-2024
  • (2024)Trillium: Higher-Order Concurrent and Distributed Separation Logic for Intensional RefinementProceedings of the ACM on Programming Languages10.1145/36328518:POPL(241-272)Online publication date: 5-Jan-2024
  • (2024)Randomised Testing of the Compiler for a Verification-Aware Programming Language2024 IEEE Conference on Software Testing, Verification and Validation (ICST)10.1109/ICST60714.2024.00044(407-418)Online publication date: 27-May-2024
  • (2024)Parameterized Verification of Round-Based Distributed Algorithms via Extended Threshold AutomataFormal Methods10.1007/978-3-031-71162-6_33(638-657)Online publication date: 9-Sep-2024
  • (2024)Toward Liveness Proofs at ScaleComputer Aided Verification10.1007/978-3-031-65627-9_13(255-276)Online publication date: 26-Jul-2024
  • (2024)Verifying a C Implementation of Derecho’s Coordination Mechanism Using VST and CoqNASA Formal Methods10.1007/978-3-031-60698-4_6(99-117)Online publication date: 4-Jun-2024
  • (2023)Counterexample Driven Quantifier Instantiations with Applications to Distributed ProtocolsProceedings of the ACM on Programming Languages10.1145/36228647:OOPSLA2(1878-1904)Online publication date: 16-Oct-2023
  • (2023)Verifying Reliable Network Components in a Distributed Separation Logic with Dependent Separation ProtocolsProceedings of the ACM on Programming Languages10.1145/36078597:ICFP(847-877)Online publication date: 31-Aug-2023
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Digital Edition

View this article in digital edition.

Digital Edition

Magazine Site

View this article on the magazine site (external)

Magazine Site

Login options

Full Access

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media