When Consensus Meets Self-stabilization

Shlomi Dolev¹⁷,
Ronen I. Kat¹⁷ &
Elad M. Schiller¹⁸

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 4305))

Included in the following conference series:

International Conference On Principles Of Distributed Systems

630 Accesses
5 Citations

Abstract

This paper presents a self-stabilizing failure detector, asynchronous consensus and replicated state-machine algorithm suite, the components of which can be started in an arbitrary state and converge to act as a virtual state-machine.

Self-stabilizing algorithms can cope with transient faults. Transient faults can alter the system state to an arbitrary state and hence, cause a temporary violation of the safety property of the consensus. New requirements for consensus that fit the on-going nature of self-stabilizing algorithms are presented. The wait-free consensus (and the replicated state-machine) algorithm presented is a classic combination of a failure detector and a (memory bounded) rotating coordinator consensus that satisfy both eventual safety and eventual liveness.

Several new techniques and paradigms are introduced. The bounded memory failure detector abstracts away synchronization assumptions using bounded heartbeat counters combined with a balance-unbalance mechanism. The practically infinite paradigm is introduced in the scope of self-stabilization, where an execution of, say, 2⁶⁴ sequential steps is regarded as (practically) infinite. Finally, we present the first self-stabilizing wait-free reset mechanism that ensures eventual safety and can be used in other scopes.

Partially supported by the Israeli Ministry of Science, the Lynn and William Frankel Center for Computer Sciences, and the Rita Altura Trust Chair in Computer Sciences.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

A Closer Look at Fault Tolerance

Article 15 May 2017

The weakest failure detector for eventual consistency

Article 05 January 2017

Brief Announcement: Byzantine-Tolerant Detection of Causality in Synchronous Systems

References

Abraham, U.: Self-stabilizing timestamps. Theoretical Computer Science 308(1-3), 449–515 (2003)
Article MATH MathSciNet Google Scholar
Aguilera, M.K., Delporte-Gallet, C., Fauconnier, H., Toueg, S.: On implementing omega with weak reliability and synchrony assumptions. In: Proc. of the 22nd ACM symposium on Principles of distributed computing, pp. 306–314 (July 2003)
Google Scholar
Aguilera, M.K., Delporte-Gallet, C., Fauconnier, H., Toueg, S.: Communication-efficient leader election and consensus with limited link synchrony. In: Proc. of the 23rd ACM symposium on Principles of distributed computing, pp. 328–337 (July 2004)
Google Scholar
Arora, A., Kulkarni, S.S., Demirbas, M.: Resettable vector clocks. In: Proc. of the 19th ACM Symposium on Principles of Distributed Computing, pp. 269–278 (August 2000)
Google Scholar
Attiya, H., Welch, J.L.: Distributed Computing: Fundamentals, Simulations, and Advanced Topics, 2nd edn. John Wiley and Sons, Inc., Chichester (2004)
Google Scholar
Beauquier, J., Kekkonen-Moneta, S.: Fault-tolerance and self-stabilizing: Impossibility results and solutions using self-stabilizing failure detectors. International Journal of Systems Science 28(11), 1177–1187 (1997)
Article MATH Google Scholar
Birman, K.P.: Replication and Fault-Tolerance in the ISIS System. In: ACM Symposium on Operating Systems Principles (SOSP), pp. 79–86 (1985)
Google Scholar
Chandra, T.D., Toueg, S.: Unreliable failure detectors for reliable distributed systems. Journal of the ACM 43(2), 225–267 (1996)
Article MATH MathSciNet Google Scholar
Chandra, T.D., Hadzilacos, V., Toueg, S.: The weakest failure detector for solving consensus. In: Proc. of the 11th ACM Symposium on Principles of Distributed Computing, pp. 147–158 (August 1992)
Google Scholar
Chor, B., Israeli, A., Li, M.: On processor coordination using asynchronous hardware. In: Proc. of the 6th ACM Symposium on Principles of Distributed Computing, pp. 86–97 (August 1987)
Google Scholar
Davidovitch, L., Dolev, S., Rajsbaum, S.: Consensus continue? Stability of multi-valued continuous consensus! In: Proc. of the Workshop on Geometry and Topology in Concurrency and Distributed Computing, pp. 21–24 (October 2004)
Google Scholar
Delporte-Gallet, C., Fauconnier, H., Guerraoui, R.: Failure Detection Lower Bounds on Registers and Consensus. In: Proc. of the 16th International Conference on Distributed Computing, pp. 237–251 (October 2002)
Google Scholar
Dijkstra, E.W.: Self-Stabilizing Systems in spite of Distributed Control. Communications of the ACM 1(11), 643–644 (1974)
Article Google Scholar
Dolev, S.: Self-stabilization. MIT Press, Cambridge (2000)
MATH Google Scholar
Dolev, S., Israeli, A., Moran, S.: Self-Stabilization of Dynamic Systems Assuming only Read/Write Atomicity. In: Proc. of the 9th ACM Symposium on Principles of Distributed Computing, pp. 103–117 (August 1990)
Google Scholar
Dolev, S., Gilbert, S., Lahiani, L., Lynch, N., Nolte, T.: Virtual Stationary Automata for Mobile Networks. In: Proc. of the 2005 International Conference On Principles Of Distributed Systems (OPODIS) (2005)
Google Scholar
Dolev, S., Gilbert, S., Lynch, N.A., Schiller, E.M., Shvartsman, A., Welch, J.L.: Virtual Mobile Nodes for Mobile Ad Hoc Networks. In: Guerraoui, R. (ed.) DISC 2004. LNCS, vol. 3274, pp. 230–244. Springer, Heidelberg (2004)
Chapter Google Scholar
Dolev, S., Kat, R.I., Schiller, E.M.: When Consensus Meets Self-Stabilization: Self-Stabilizing Failure-Detector, Consensus and Replicated State-Machine. Computer Science, Ben-Gurion University of the Negev, TR #06-05 (May 2006)
Google Scholar
Dolev, S., Rajsbaum, S.: Stability of long-lived consensus. Journal of Computer and System Sciences, 26–45 (August 2003)
Google Scholar
Dolev, S., Welch, J.L.: Wait-free clock synchronization. Algorithmica 18, 486–511 (1997)
Article MATH MathSciNet Google Scholar
Fischer, M.J., Lynch, N.A., Paterson, M.S.: Impossibility of distributed consensus with one faulty process. Journal of the ACM 32(2), 374–382 (1985)
Article MATH MathSciNet Google Scholar
Freiling, F.C., Guerraoui, R., Kouznetsov, P.: The Failure Detector Abstraction. Department for Mathematics and Computer Science, University of Mannheim, TR-2006-003 (2006)
Google Scholar
Freiling, F.C., Völzer, H.: Illustrating the impossibility of crash-tolerant consensus in asynchronous systems. ACM SIGOPS Operating Systems Review 40(2), 105–109 (2006)
Article Google Scholar
Fox, A., Patterson, D.: Self-Repairing Computers. Scientific American (June 2003)
Google Scholar
Gafni, E., Lamport, L.: Disk Paxos. Distributed Computing 16(1), 1–20 (2003)
Article Google Scholar
Herlihy, M.: Wait-Free Synchronization. ACM Transactions on Programming Languages and Systems 13(1), 124–149 (1991)
Article Google Scholar
Hutle, M., Widder, J.: Self-Stabilizing Failure Detector Algorithms. In: Parallel and Distributed Computing and Networks Conference, pp. 485–490 (February 2005)
Google Scholar
Kulkarni, S.S., Arora, A.: Multitolerance in Distributed Reset. Chicago Journal of Theoretical Computer, CJTCS-1998-4 (1998)
Google Scholar
Lamport, L.: The part-time parliament. ACM Transactions on Computer Systems 16(2), 133–169 (1998)
Article Google Scholar
Lamport, L.: Time, Clocks, and Ordering of Events in a Distributed System. Communication of the ACM 21(7), 558–565 (1978)
Article MATH Google Scholar
Lo, W., Hadzilacos, V.: Using Failure Detectors to Solve Consensus in Asynchronous Shared-Memory Systems. In: Tel, G., Vitányi, P.M.B. (eds.) WDAG 1994. LNCS, vol. 857, pp. 280–295. Springer, Heidelberg (1994)
Chapter Google Scholar
Loui, M.C., Abu-Amara, H.H.: Memory requirements for agreement among unreliable asynchronous processes. Advances in Computing Research 4, 163–183 (1987)
MathSciNet Google Scholar
Lynch, N.: Distributed Algorithms. Morgan Kaufmann Publishers, San Francisco (1996)
MATH Google Scholar
Malkhi, D., Oprea, F., Zhou, L.: Omega Meets Paxos: Leader Election and Stability Without Eventual Timely Links. In: Fraigniaud, P. (ed.) DISC 2005. LNCS, vol. 3724, pp. 199–213. Springer, Heidelberg (2005)
Chapter Google Scholar
Raynal, M.: A Short Introduction to Failure Detectors for Asynchronous Distributed Systems. SIGACT News column 36(1) (March 2005)
Google Scholar

Download references

Author information

Authors and Affiliations

Department of Computer Science, Ben-Gurion University of the Negev, Beer-Sheva, 84105, Israel
Shlomi Dolev & Ronen I. Kat
Division of Computer Science and Engineering, Chalmers University of Technology, Göteborg, Sweden
Elad M. Schiller

Authors

Shlomi Dolev
View author publications
You can also search for this author in PubMed Google Scholar
Ronen I. Kat
View author publications
You can also search for this author in PubMed Google Scholar
Elad M. Schiller
View author publications
You can also search for this author in PubMed Google Scholar

Editor information

Editors and Affiliations

Computer Science and Artificial Intelligence Laboratory, MIT, USA
Mariam Momenzadeh Alexander A. Shvartsman

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Dolev, S., Kat, R.I., Schiller, E.M. (2006). When Consensus Meets Self-stabilization. In: Shvartsman, M.M.A.A. (eds) Principles of Distributed Systems. OPODIS 2006. Lecture Notes in Computer Science, vol 4305. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11945529_5

Download citation

DOI: https://doi.org/10.1007/11945529_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49990-9
Online ISBN: 978-3-540-49991-6
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

When Consensus Meets Self-stabilization

Abstract

Access this chapter

Subscribe and save

Buy Now

Preview

Similar content being viewed by others

A Closer Look at Fault Tolerance

The weakest failure detector for eventual consistency

Brief Announcement: Byzantine-Tolerant Detection of Causality in Synchronous Systems

References

Author information

Authors and Affiliations

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Subscribe and save

Buy Now

Navigation

When Consensus Meets Self-stabilization

Abstract

Access this chapter

Subscribe and save

Buy Now

Preview

Similar content being viewed by others

A Closer Look at Fault Tolerance

The weakest failure detector for eventual consistency

Brief Announcement: Byzantine-Tolerant Detection of Causality in Synchronous Systems

References

Author information

Authors and Affiliations

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation