Nothing Special   »   [go: up one dir, main page]

skip to main content
research-article
Open access

Verifying strong eventual consistency in distributed systems

Published: 12 October 2017 Publication History

Abstract

Data replication is used in distributed systems to maintain up-to-date copies of shared data across multiple computers in a network. However, despite decades of research, algorithms for achieving consistency in replicated systems are still poorly understood. Indeed, many published algorithms have later been shown to be incorrect, even some that were accompanied by supposed mechanised proofs of correctness. In this work, we focus on the correctness of Conflict-free Replicated Data Types (CRDTs), a class of algorithm that provides strong eventual consistency guarantees for replicated data. We develop a modular and reusable framework in the Isabelle/HOL interactive proof assistant for verifying the correctness of CRDT algorithms. We avoid correctness issues that have dogged previous mechanised proofs in this area by including a network model in our formalisation, and proving that our theorems hold in all possible network behaviours. Our axiomatic network model is a standard abstraction that accurately reflects the behaviour of real-world computer networks. Moreover, we identify an abstract convergence theorem, a property of order relations, which provides a formal definition of strong eventual consistency. We then obtain the first machine-checked correctness theorems for three concrete CRDTs: the Replicated Growable Array, the Observed-Remove Set, and an Increment-Decrement Counter. We find that our framework is highly reusable, developing proofs of correctness for the latter two CRDTs in a few hours and with relatively little CRDT-specific code.

Supplementary Material

Auxiliary Archive (oopsla17-oopsla94-aux.zip)

References

[1]
Akka 2017. The Akka actor framework for the Java Virtual Machine. (2017). http://www.akka.io/ Accessed April 2017.
[2]
Paulo Sérgio Almeida, Ali Shoker, and Carlos Baquero. 2015. Efficient State-Based CRDTs by Delta-Mutation. In International Conference on Networked Systems (NETYS) .
[3]
Manamiary Bruno Andriamiarina, Dominique Méry, and Neeraj Kumar Singh. 2014. Analysis of Self-⋆ and P2P Systems Using Refinement. In Abstract State Machines, Alloy, B, TLA, VDM, and Z - 4th International Conference, ABZ 2014, Toulouse, France, June 2-6, 2014. Proceedings . 117–123.
[4]
AppJet, Inc. 2011. Etherpad and EasySync Technical Manual. (March 2011). https://github.com/ether/etherpad-lite/blob/ e2ce9dc/doc/easysync/easysync-full-description.pdf
[5]
Hagit Attiya, Sebastian Burckhardt, Alexey Gotsman, Adam Morrison, Hongseok Yang, and Marek Zawirski. 2016. Specification and Complexity of Collaborative Text Editing. In ACM Symposium on Principles of Distributed Computing (PODC). 259–268.
[6]
Hagit Attiya, Faith Ellen, and Adam Morrison. 2015. Limitations of Highly-Available Eventually-Consistent Data Stores. In ACM Symposium on Principles of Distributed Computing (PODC) .
[7]
Noran Azmy, Stephan Merz, and Christoph Weidenbach. 2016. A Rigorous Correctness Proof for Pastry. In Abstract State Machines, Alloy, B, TLA, VDM, and Z - 5th International Conference, ABZ 2016, Linz, Austria, May 23-27, 2016, Proceedings . 86–101.
[8]
Peter Bailis and Ali Ghodsi. 2013. Eventual Consistency Today: Limitations, Extensions, and Beyond. ACM Queue 11, 3 (March 2013).
[9]
Peter Bailis and Kyle Kingsbury. 2014. The Network is Reliable. ACM Queue 12, 7 (July 2014).
[10]
Carlos Baquero, Paulo Sérgio Almeida, and Carl Lerche. 2016. The problem with embedded CRDT counters and a solution. In 2nd Workshop on the Principles and Practice of Consistency for Distributed Data (PaPoC).
[11]
Carlos Baquero, Paulo Sérgio Almeida, and Ali Shoker. 2014. Making Operation-based CRDTs Operation-based. In 14th IFIP International Conference on Distributed Applications and Interoperable Systems (DAIS) . 126–140.
[12]
Annette Bieniusa, Marek Zawirski, Nuno Preguiça, Marc Shapiro, Carlos Baquero, Valter Balegas, and Sérgio Duarte. 2012a. Brief Announcement: Semantics of Eventually Consistent Replicated Sets. In 26th International Symposium on Distributed Computing (DISC) .
[13]
Annette Bieniusa, Marek Zawirski, Nuno Preguiça, Marc Shapiro, Carlos Baquero, Valter Balegas, and Sérgio Duarte. 2012b. An Optimized Conflict-free Replicated Set . Technical Report RR-8083. http://arxiv.org/abs/1210.3368
[14]
Russell Brown, Sean Cribbs, Christopher Meiklejohn, and Sam Elliott. 2014. Riak DT map: a composable, convergent replicated dictionary. In 1st Workshop on Principles and Practice of Eventual Consistency (PaPEC).
[15]
Sebastian Burckhardt. 2014. Principles of Eventual Consistency. Foundations and Trends in Programming Languages 1, 1-2 (Oct. 2014), 1–150.
[16]
Sebastian Burckhardt, Manuel Fähndrich, Daan Leijen, and Benjamin P Wood. 2012. Cloud Types for Eventual Consistency. In 26th European Conference on Object-Oriented Programming (ECOOP).
[17]
Sebastian Burckhardt, Alexey Gotsman, Hongseok Yang, and Marek Zawirski. 2014. Replicated Data Types: Specification, Verification, Optimality. In 41st ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL). 271–284.
[18]
Christian Cachin, Rachid Guerraoui, and Luís Rodrigues. 2011. Introduction to Reliable and Secure Distributed Programming (second ed.). Springer.
[19]
Bernadette Charron-Bost, Henri Debrat, and Stephan Merz. 2011. Formal Verification of Consensus Algorithms Tolerating Malicious Faults. In Stabilization, Safety, and Security of Distributed Systems - 13th International Symposium, SSS 2011, Grenoble, France, October 10-12, 2011. Proceedings . 120–134.
[20]
Susan B Davidson, Hector Garcia-Molina, and Dale Skeen. 1985. Consistency in Partitioned Networks. Comput. Surveys 17, 3 (Sept. 1985), 341–370.
[21]
Aguido Horatio Davis, Chengzheng Sun, and Junwei Lu. 2002. Generalizing Operational Transformation to the Standard General Markup Language. In ACM Conference on Computer Supported Cooperative Work (CSCW). 58–67.
[22]
John Day-Richter. 2010. What’s different about the new Google Docs: Making collaboration fast. (Sept. 2010). https: //drive.googleblog.com/2010/09/whats-different-about-new-google-docs.html
[23]
Henri Debrat and Stephan Merz. 2012. Verifying Fault-Tolerant Distributed Algorithms in the Heard-Of Model. Archive of Formal Proofs 2012 (2012). https://www.isa-afp.org/entries/Heard_Of.shtml
[24]
Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels. 2007. Dynamo: Amazon’s Highly Available Key-Value Store. In 21st ACM Symposium on Operating Systems Principles (SOSP). 205–220.
[25]
Clarence Ellis and S J Gibbs. 1989. Concurrency Control in Groupware Systems. In ACM International Conference on Management of Data (SIGMOD) . 399–407.
[26]
Colin J Fidge. 1988. Timestamps in message-passing systems that preserve the partial ordering. In 11th Australian Computer Science Conference . 56–66.
[27]
Pedro Fonseca, Kaiyuan Zhang, Xi Wang, and Arvind Krishnamurthy. 2017. An Empirical Study on the Correctness of Formally Verified Distributed Systems. In Proceedings of the Twelfth European Conference on Computer Systems, EuroSys 2017, Belgrade, Serbia, April 23-26, 2017 . 328–343.
[28]
Victor B. F. Gomes, Martin Kleppmann, Dominic P. Mulligan, and Alastair R. Beresford. 2017. A framework for establishing Strong Eventual Consistency for Conflict-free Replicated Datatypes. Archive of Formal Proofs (July 2017). http: //isa-afp.org/entries/CRDT.shtml, Formal proof development.
[29]
Florian Haftmann and Tobias Nipkow. 2010. Code Generation via Higher-Order Rewrite Systems. In Functional and Logic Programming, 10th International Symposium, FLOPS 2010, Sendai, Japan, April 19-21, 2010. Proceedings . 103–117.
[30]
Florian Haftmann and Makarius Wenzel. 2008. Local Theory Specifications in Isabelle/Isar. In Types for Proofs and Programs, International Conference, TYPES 2008, Torino, Italy, March 26-29, 2008, Revised Selected Papers . 153–168.
[31]
Claudia-Lavinia Ignat and Moira C Norrie. 2003. Customizable Collaborative Editor Relying on treeOPT Algorithm. In 8th European Conference on Computer-Supported Cooperative Work (ECSCW). 315–334.
[32]
Abdessamad Imine, Pascal Molli, Gérald Oster, and Michaël Rusinowitch. 2003. Proving Correctness of Transformation Functions in Real-Time Groupware. In 8th European Conference on Computer-Supported Cooperative Work (ECSCW). 277–293.
[33]
Abdessamad Imine, Michaël Rusinowitch, Gérald Oster, and Pascal Molli. 2006. Formal design and verification of operational transformation algorithms for copies convergence. Theoretical Computer Science 351, 2 (Feb. 2006), 167–183.
[34]
James E. Johnson, David E. Langworthy, Leslie Lamport, and Friedrich H. Vogt. 2004. Formal Specification of a Web Services Protocol. Electr. Notes Theor. Comput. Sci. 105 (2004), 147–158.
[35]
Tim Jungnickel and Tobias Herb. 2015. TP1-valid Transformation Functions for Operations on ordered n-ary Trees. arXiv:1512.05949. (Dec. 2015). https://arxiv.org/abs/1512.05949
[36]
Florian Kammüller, Markus Wenzel, and Lawrence C. Paulson. 1999. Locales - A Sectioning Concept for Isabelle. In Theorem Proving in Higher Order Logics, 12th International Conference, TPHOLs’99, Nice, France, September, 1999, Proceedings . 149–166.
[37]
Kyle Kingsbury. 2013. Jepsen: Cassandra. (Sept. 2013). https://aphyr.com/posts/294-jepsen-cassandra Accessed April 2017.
[38]
Martin Kleppmann and Alastair R Beresford. 2017. A Conflict-Free Replicated JSON Datatype. IEEE Transactions on Parallel and Distributed Systems (April 2017).
[39]
Leslie Lamport. 1978. Time, Clocks, and the Ordering of Events in a Distributed System. Commun. ACM 21, 7 (July 1978), 558–565.
[40]
João Leitão, José Pereira, and Luís Rodrigues. 2007. HyParView: A Membership Protocol for Reliable Gossip-Based Broadcast. In 37th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). 419–429.
[41]
Raph Levien. 2016. Towards a unified theory of Operational Transformation and CRDT. (July 2016). https://medium.com/ @raphlinus/towards-a-unified-theory-of-operational-transformation-and-crdt-70485876f72f
[42]
Du Li and Rui Li. 2004. Preserving operation effects relation in group editors. In ACM Conference on Computer Supported Cooperative Work (CSCW) . 457–466.
[43]
Du Li and Rui Li. 2008. An Approach to Ensuring Consistency in Peer-to-Peer Real-Time Group Editors. Computer Supported Cooperative Work (CSCW) 17, 5 (Dec. 2008), 553–611.
[44]
Rui Li and Du Li. 2005. A landmark-based transformation approach to concurrency control in group editors. In International ACM SIGGROUP Conference on Supporting Group Work . 284–293.
[45]
Stéphane Martin, Pascal Urso, and Stéphane Weiss. 2010. Scalable XML Collaborative Editing with Undo. In On the Move to Meaningful Internet Systems (OTM) . 507–514.
[46]
Christopher Meiklejohn and Peter Van Roy. 2015. Lasp: a language for distributed, coordination-free programming. In Proceedings of the 17th International Symposium on Principles and Practice of Declarative Programming, Siena, Italy, July 14-16, 2015 . 184–195.
[47]
Brice Nédelec, Pascal Molli, and Achour Mostefaoui. 2016. CRATE: Writing Stories Together with our Browsers. In 25th International World Wide Web Conference (WWW) . 231–234.
[48]
Brice Nédelec, Pascal Molli, Achour Mostefaoui, and Emmanuel Desmontils. 2013. LSEQ: an Adaptive Structure for Sequences in Distributed Collaborative Editing. In 13th ACM Symposium on Document Engineering (DocEng). 37–46.
[49]
David A Nichols, Pavel Curtis, Michael Dixon, and John Lamping. 1995. High-Latency, Low-Bandwidth Windowing in the Jupiter Collaboration System. In 8th Annual ACM Symposium on User Interface Software and Technology (UIST). 111–120.
[50]
Tobias Nipkow and Gerwin Klein. 2014. Concrete Semantics - With Isabelle/HOL. Springer.
[51]
Gérald Oster, Pascal Molli, Pascal Urso, and Abdessamad Imine. 2006a. Tombstone Transformation Functions for Ensuring Consistency in Collaborative Editing Systems. In 2nd International Conference on Collaborative Computing (CollaborateCom) .
[52]
Gérald Oster, Pascal Urso, Pascal Molli, and Abdessamad Imine. 2005. Proving correctness of transformation functions in collaborative editing systems . Technical Report RR-5795. https://hal.inria.fr/inria-00071213/
[53]
Gérald Oster, Pascal Urso, Pascal Molli, and Abdessamad Imine. 2006b. Data Consistency for P2P Collaborative Editing. In ACM Conference on Computer Supported Cooperative Work (CSCW) .
[54]
Nuno Preguiça, Joan Manuel Marquès, Marc Shapiro, and Mihai Letia. 2009. A commutative replicated data type for cooperative editing. In 29th IEEE International Conference on Distributed Computing Systems (ICDCS).
[55]
Aurel Randolph, Hanifa Boucheneb, Abdessamad Imine, and Alejandro Quintero. 2015. On Synthesizing a Consistent Operational Transformation Approach. IEEE Trans. Comput. 64, 4 (April 2015), 1074–1089.
[56]
Michel Raynal and Mukesh Singhal. 1996. Logical time: capturing causality in distributed systems. IEEE Computer 29, 2 (Feb. 1996), 49–56.
[57]
Matthias Ressel, Doris Nitsche-Ruhland, and Rul Gunzenhäuer. 1996. An Integrating, Transformation-Oriented Approach to Concurrency Control and Undo in Group Editors. In ACM Conference on Computer Supported Cooperative Work (CSCW). 288–297.
[58]
Hyun-Gul Roh, Myeongjae Jeon, Jin-Soo Kim, and Joonwon Lee. 2011. Replicated abstract data types: Building blocks for collaborative applications. J. Parallel and Distrib. Comput. 71, 3 (2011), 354–368.
[59]
Hyun-Gul Roh, Jin-Soo Kim, Joonwon Lee, and Seungryoul Maeng. 2009. Optimistic Operations for Replicated Abstract Data Types . Technical Report CS/TR-2009-318. KAIST.
[60]
Reinhard Schwarz and Friedemann Mattern. 1994. Detecting Causal Relationships in Distributed Computations: In Search of the Holy Grail. Distributed Computing 7, 3 (March 1994), 149–174.
[61]
Marc Shapiro, Nuno Preguiça, Carlos Baquero, and Marek Zawirski. 2011a. A comprehensive study of Convergent and Commutative Replicated Data Types . Technical Report 7506. INRIA.
[62]
Marc Shapiro, Nuno Preguiça, Carlos Baquero, and Marek Zawirski. 2011b. Conflict-free Replicated Data Types. In 13th International Symposium on Stabilization, Safety, and Security of Distributed Systems (SSS) . 386–400.
[63]
Justin Sheehy. 2015. There is No Now: Problems with simultaneity in distributed systems. ACM Queue 13, 3 (March 2015).
[64]
Sergey Sinchuk, Pavel Chuprikov, and Konstantin Solomatov. 2016. Verified Operational Transformation for Trees. In 7th International Conference on Interactive Theorem Proving (ITP) .
[65]
Daniel Spiewak. 2010. Understanding and Applying Operational Transformation. (May 2010). http://www.codecommit. com/blog/java/understanding-and-applying-operational-transformation
[66]
Maher Suleiman, Michèle Cart, and Jean Ferrié. 1997. Serialization of concurrent operations in a distributed collaborative environment. In International Conference on Supporting Group Work (GROUP). 435–445.
[67]
Maher Suleiman, Michèle Cart, and Jean Ferrié. 1998. Concurrent operations in a distributed and mobile collaborative environment. In 14th International Conference on Data Engineering (ICDE). 36–45.
[68]
Chengzheng Sun and David Chen. 2002. Consistency Maintenance in Real-Time Collaborative Graphics Editing Systems. ACM Transactions on Computer-Human Interaction (TOCHI) 9, 1 (March 2002), 1–41.
[69]
Chengzheng Sun and Clarence Ellis. 1998. Operational Transformation in Real-Time Group Editors: Issues, Algorithms, and Achievements. In ACM Conference on Computer Supported Cooperative Work (CSCW). 59–68.
[70]
Chengzheng Sun, Xiaohua Jia, Yanchun Zhang, Yun Yang, and David Chen. 1998. Achieving Convergence, Causality Preservation, and Intention Preservation in Real-Time Cooperative Editing Systems. ACM Transactions on ComputerHuman Interaction (TOCHI) 5, 1 (1998), 63–108.
[71]
Douglas B Terry, Alan J Demers, Karin Petersen, Mike J Spreitzer, Marvin M Theimer, and Brent B Welch. 1994. Session Guarantees for Weakly Consistent Replicated Data. In 3rd International Conference on Parallel and Distributed Information Systems (PDIS) . 140–149.
[72]
Mohamed Tounsi, Mohamed Mosbah, and Dominique Méry. 2013. From Event-B Specifications to Programs for Distributed Algorithms. In 2013 Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises, Hammamet, Tunisia, June 17-20, 2013 . 104–109.
[73]
Mohamed Tounsi, Mohamed Mosbah, and Dominique Méry. 2016. From Event-B specifications to programs for distributed algorithms. IJAACS 9, 3/4 (2016), 223–242.
[74]
Nicolas Vidot, Michelle Cart, Jean Ferrié, and Maher Suleiman. 2000. Copies convergence in a distributed real-time collaborative environment. In ACM Conference on Computer Supported Cooperative Work (CSCW). 171–180.
[75]
Werner Vogels. 2009. Eventually consistent. Commun. ACM 52, 1 (Jan. 2009), 40–44.
[76]
David Wang, Alex Mah, Soren Lassen, and Sam Thorogood. 2015. Apache Wave (incubating) Protocol Documentation, Release 0.4. Apache Software Foundation. (Aug. 2015). https://people.apache.org/~al/wave_docs/ApacheWaveProtocol-0.4.pdf
[77]
Stéphane Weiss, Pascal Urso, and Pascal Molli. 2010. Logoot-Undo: Distributed Collaborative Editing System on P2P networks. IEEE Transactions on Parallel and Distributed Systems 21, 8 (Jan. 2010), 1162–1174.
[78]
Makarius Wenzel, Lawrence C. Paulson, and Tobias Nipkow. 2008. The Isabelle Framework. In Theorem Proving in Higher Order Logics, 21st International Conference, TPHOLs 2008, Montreal, Canada, August 18-21, 2008. Proceedings . 33–38.
[79]
James R. Wilcox, Doug Woos, Pavel Panchekha, Zachary Tatlock, Xi Wang, Michael D. Ernst, and Thomas E. Anderson. 2015. Verdi: a framework for implementing and formally verifying distributed systems. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation, Portland, OR, USA, June 15-17, 2015 . 357–368.
[80]
Peter Zeller, Annette Bieniusa, and Arnd Poetzsch-Heffter. 2014. Formal Specification and Verification of CRDTs. In 34th IFIP International Conference on Formal Techniques for Distributed Objects, Components and Systems (FORTE) .

Cited By

View all
  • (2024)Decentralized Near-Synchronous Local-First Programming CollaborationProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3685555(1909-1911)Online publication date: 11-Sep-2024
  • (2024)LoRe: A Programming Model for Verifiably Safe Local-first SoftwareACM Transactions on Programming Languages and Systems10.1145/363376946:1(1-26)Online publication date: 15-Jan-2024
  • (2024)Performability evaluation of NoSQL-based storage systemsJournal of Systems and Software10.1016/j.jss.2023.111885208(111885)Online publication date: Feb-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Proceedings of the ACM on Programming Languages
Proceedings of the ACM on Programming Languages  Volume 1, Issue OOPSLA
October 2017
1786 pages
EISSN:2475-1421
DOI:10.1145/3152284
Issue’s Table of Contents
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 October 2017
Published in PACMPL Volume 1, Issue OOPSLA

Permissions

Request permissions for this article.

Check for updates

Badges

Author Tags

  1. CRDTs
  2. automated theorem proving
  3. convergence
  4. distributed systems
  5. replication
  6. strong eventual consistency
  7. verification

Qualifiers

  • Research-article

Funding Sources

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)649
  • Downloads (Last 6 weeks)120
Reflects downloads up to 24 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Decentralized Near-Synchronous Local-First Programming CollaborationProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis10.1145/3650212.3685555(1909-1911)Online publication date: 11-Sep-2024
  • (2024)LoRe: A Programming Model for Verifiably Safe Local-first SoftwareACM Transactions on Programming Languages and Systems10.1145/363376946:1(1-26)Online publication date: 15-Jan-2024
  • (2024)Performability evaluation of NoSQL-based storage systemsJournal of Systems and Software10.1016/j.jss.2023.111885208(111885)Online publication date: Feb-2024
  • (2023)General-Purpose Secure Conflict-free Replicated Data Types2023 IEEE 36th Computer Security Foundations Symposium (CSF)10.1109/CSF57540.2023.00030(521-536)Online publication date: Jul-2023
  • (2023)On the correctness of highly available systems in the presence of failuresJournal of Parallel and Distributed Computing10.1016/j.jpdc.2023.04.008180(104707)Online publication date: Oct-2023
  • (2023)NoSQL-based storage systems: influence of consistency on performance, availability and energy consumptionThe Journal of Supercomputing10.1007/s11227-023-05488-679:18(21424-21448)Online publication date: 20-Jun-2023
  • (2022)Verified Causal Broadcast with Liquid HaskellProceedings of the 34th Symposium on Implementation and Application of Functional Languages10.1145/3587216.3587222(1-13)Online publication date: 31-Aug-2022
  • (2022)Modular verification of op-based CRDTs in separation logicProceedings of the ACM on Programming Languages10.1145/35633516:OOPSLA2(1788-1816)Online publication date: 31-Oct-2022
  • (2022)Katara: synthesizing CRDTs with verified liftingProceedings of the ACM on Programming Languages10.1145/35633366:OOPSLA2(1349-1377)Online publication date: 31-Oct-2022
  • (2022)Peritext: A CRDT for Collaborative Rich Text EditingProceedings of the ACM on Human-Computer Interaction10.1145/35556446:CSCW2(1-36)Online publication date: 11-Nov-2022
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Full Access

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media