Nothing Special   »   [go: up one dir, main page]

Skip to main content

When Consensus Meets Self-stabilization

Self-stabilizing Failure-Detector, Consensus and Replicated State-Machine (Extended Abstract)

  • Conference paper
Principles of Distributed Systems (OPODIS 2006)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 4305))

Included in the following conference series:

Abstract

This paper presents a self-stabilizing failure detector, asynchronous consensus and replicated state-machine algorithm suite, the components of which can be started in an arbitrary state and converge to act as a virtual state-machine.

Self-stabilizing algorithms can cope with transient faults. Transient faults can alter the system state to an arbitrary state and hence, cause a temporary violation of the safety property of the consensus. New requirements for consensus that fit the on-going nature of self-stabilizing algorithms are presented. The wait-free consensus (and the replicated state-machine) algorithm presented is a classic combination of a failure detector and a (memory bounded) rotating coordinator consensus that satisfy both eventual safety and eventual liveness.

Several new techniques and paradigms are introduced. The bounded memory failure detector abstracts away synchronization assumptions using bounded heartbeat counters combined with a balance-unbalance mechanism. The practically infinite paradigm is introduced in the scope of self-stabilization, where an execution of, say, 264 sequential steps is regarded as (practically) infinite. Finally, we present the first self-stabilizing wait-free reset mechanism that ensures eventual safety and can be used in other scopes.

Partially supported by the Israeli Ministry of Science, the Lynn and William Frankel Center for Computer Sciences, and the Rita Altura Trust Chair in Computer Sciences.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

Similar content being viewed by others

References

  1. Abraham, U.: Self-stabilizing timestamps. Theoretical Computer Science 308(1-3), 449–515 (2003)

    Article  MATH  MathSciNet  Google Scholar 

  2. Aguilera, M.K., Delporte-Gallet, C., Fauconnier, H., Toueg, S.: On implementing omega with weak reliability and synchrony assumptions. In: Proc. of the 22nd ACM symposium on Principles of distributed computing, pp. 306–314 (July 2003)

    Google Scholar 

  3. Aguilera, M.K., Delporte-Gallet, C., Fauconnier, H., Toueg, S.: Communication-efficient leader election and consensus with limited link synchrony. In: Proc. of the 23rd ACM symposium on Principles of distributed computing, pp. 328–337 (July 2004)

    Google Scholar 

  4. Arora, A., Kulkarni, S.S., Demirbas, M.: Resettable vector clocks. In: Proc. of the 19th ACM Symposium on Principles of Distributed Computing, pp. 269–278 (August 2000)

    Google Scholar 

  5. Attiya, H., Welch, J.L.: Distributed Computing: Fundamentals, Simulations, and Advanced Topics, 2nd edn. John Wiley and Sons, Inc., Chichester (2004)

    Google Scholar 

  6. Beauquier, J., Kekkonen-Moneta, S.: Fault-tolerance and self-stabilizing: Impossibility results and solutions using self-stabilizing failure detectors. International Journal of Systems Science 28(11), 1177–1187 (1997)

    Article  MATH  Google Scholar 

  7. Birman, K.P.: Replication and Fault-Tolerance in the ISIS System. In: ACM Symposium on Operating Systems Principles (SOSP), pp. 79–86 (1985)

    Google Scholar 

  8. Chandra, T.D., Toueg, S.: Unreliable failure detectors for reliable distributed systems. Journal of the ACM 43(2), 225–267 (1996)

    Article  MATH  MathSciNet  Google Scholar 

  9. Chandra, T.D., Hadzilacos, V., Toueg, S.: The weakest failure detector for solving consensus. In: Proc. of the 11th ACM Symposium on Principles of Distributed Computing, pp. 147–158 (August 1992)

    Google Scholar 

  10. Chor, B., Israeli, A., Li, M.: On processor coordination using asynchronous hardware. In: Proc. of the 6th ACM Symposium on Principles of Distributed Computing, pp. 86–97 (August 1987)

    Google Scholar 

  11. Davidovitch, L., Dolev, S., Rajsbaum, S.: Consensus continue? Stability of multi-valued continuous consensus! In: Proc. of the Workshop on Geometry and Topology in Concurrency and Distributed Computing, pp. 21–24 (October 2004)

    Google Scholar 

  12. Delporte-Gallet, C., Fauconnier, H., Guerraoui, R.: Failure Detection Lower Bounds on Registers and Consensus. In: Proc. of the 16th International Conference on Distributed Computing, pp. 237–251 (October 2002)

    Google Scholar 

  13. Dijkstra, E.W.: Self-Stabilizing Systems in spite of Distributed Control. Communications of the ACM 1(11), 643–644 (1974)

    Article  Google Scholar 

  14. Dolev, S.: Self-stabilization. MIT Press, Cambridge (2000)

    MATH  Google Scholar 

  15. Dolev, S., Israeli, A., Moran, S.: Self-Stabilization of Dynamic Systems Assuming only Read/Write Atomicity. In: Proc. of the 9th ACM Symposium on Principles of Distributed Computing, pp. 103–117 (August 1990)

    Google Scholar 

  16. Dolev, S., Gilbert, S., Lahiani, L., Lynch, N., Nolte, T.: Virtual Stationary Automata for Mobile Networks. In: Proc. of the 2005 International Conference On Principles Of Distributed Systems (OPODIS) (2005)

    Google Scholar 

  17. Dolev, S., Gilbert, S., Lynch, N.A., Schiller, E.M., Shvartsman, A., Welch, J.L.: Virtual Mobile Nodes for Mobile Ad Hoc Networks. In: Guerraoui, R. (ed.) DISC 2004. LNCS, vol. 3274, pp. 230–244. Springer, Heidelberg (2004)

    Chapter  Google Scholar 

  18. Dolev, S., Kat, R.I., Schiller, E.M.: When Consensus Meets Self-Stabilization: Self-Stabilizing Failure-Detector, Consensus and Replicated State-Machine. Computer Science, Ben-Gurion University of the Negev, TR #06-05 (May 2006)

    Google Scholar 

  19. Dolev, S., Rajsbaum, S.: Stability of long-lived consensus. Journal of Computer and System Sciences, 26–45 (August 2003)

    Google Scholar 

  20. Dolev, S., Welch, J.L.: Wait-free clock synchronization. Algorithmica 18, 486–511 (1997)

    Article  MATH  MathSciNet  Google Scholar 

  21. Fischer, M.J., Lynch, N.A., Paterson, M.S.: Impossibility of distributed consensus with one faulty process. Journal of the ACM 32(2), 374–382 (1985)

    Article  MATH  MathSciNet  Google Scholar 

  22. Freiling, F.C., Guerraoui, R., Kouznetsov, P.: The Failure Detector Abstraction. Department for Mathematics and Computer Science, University of Mannheim, TR-2006-003 (2006)

    Google Scholar 

  23. Freiling, F.C., Völzer, H.: Illustrating the impossibility of crash-tolerant consensus in asynchronous systems. ACM SIGOPS Operating Systems Review 40(2), 105–109 (2006)

    Article  Google Scholar 

  24. Fox, A., Patterson, D.: Self-Repairing Computers. Scientific American (June 2003)

    Google Scholar 

  25. Gafni, E., Lamport, L.: Disk Paxos. Distributed Computing 16(1), 1–20 (2003)

    Article  Google Scholar 

  26. Herlihy, M.: Wait-Free Synchronization. ACM Transactions on Programming Languages and Systems 13(1), 124–149 (1991)

    Article  Google Scholar 

  27. Hutle, M., Widder, J.: Self-Stabilizing Failure Detector Algorithms. In: Parallel and Distributed Computing and Networks Conference, pp. 485–490 (February 2005)

    Google Scholar 

  28. Kulkarni, S.S., Arora, A.: Multitolerance in Distributed Reset. Chicago Journal of Theoretical Computer, CJTCS-1998-4 (1998)

    Google Scholar 

  29. Lamport, L.: The part-time parliament. ACM Transactions on Computer Systems 16(2), 133–169 (1998)

    Article  Google Scholar 

  30. Lamport, L.: Time, Clocks, and Ordering of Events in a Distributed System. Communication of the ACM 21(7), 558–565 (1978)

    Article  MATH  Google Scholar 

  31. Lo, W., Hadzilacos, V.: Using Failure Detectors to Solve Consensus in Asynchronous Shared-Memory Systems. In: Tel, G., Vitányi, P.M.B. (eds.) WDAG 1994. LNCS, vol. 857, pp. 280–295. Springer, Heidelberg (1994)

    Chapter  Google Scholar 

  32. Loui, M.C., Abu-Amara, H.H.: Memory requirements for agreement among unreliable asynchronous processes. Advances in Computing Research 4, 163–183 (1987)

    MathSciNet  Google Scholar 

  33. Lynch, N.: Distributed Algorithms. Morgan Kaufmann Publishers, San Francisco (1996)

    MATH  Google Scholar 

  34. Malkhi, D., Oprea, F., Zhou, L.: Omega Meets Paxos: Leader Election and Stability Without Eventual Timely Links. In: Fraigniaud, P. (ed.) DISC 2005. LNCS, vol. 3724, pp. 199–213. Springer, Heidelberg (2005)

    Chapter  Google Scholar 

  35. Raynal, M.: A Short Introduction to Failure Detectors for Asynchronous Distributed Systems. SIGACT News column 36(1) (March 2005)

    Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dolev, S., Kat, R.I., Schiller, E.M. (2006). When Consensus Meets Self-stabilization. In: Shvartsman, M.M.A.A. (eds) Principles of Distributed Systems. OPODIS 2006. Lecture Notes in Computer Science, vol 4305. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11945529_5

Download citation

  • DOI: https://doi.org/10.1007/11945529_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-49990-9

  • Online ISBN: 978-3-540-49991-6

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics