Nothing Special   »   [go: up one dir, main page]

Skip to main content
Log in

Failure detectors encapsulate fairness

  • Published:
Distributed Computing Aims and scope Submit manuscript

Abstract

Failure detectors have long been viewed as abstractions for the synchronism present in distributed system models. However, investigations into the exact amount of synchronism encapsulated by a given failure detector have met with limited success. The reason for this is that traditionally, models of partial synchrony are specified with respect to real time, but failure detectors do not encapsulate real time. Instead, we argue that failure detectors encapsulate the fairness in computation and communication. Fairness is a measure of the number of steps executed by one process relative either to the number of steps taken by another process or relative to the duration for which a message is in transit. We argue that failure detectors are substitutable for the fairness properties (rather than real-time properties) of partially synchronous systems. We propose four fairness-based models of partial synchrony and demonstrate that they are, in fact, the ‘weakest system models’ to implement the canonical failure detectors from the Chandra-Toueg hierarchy. We also propose a set of fairness-based models which encapsulate the \({\mathcal{G}_c}\) parametric failure detectors which eventually and permanently suspect crashed processes, and eventually and permanently trust some fixed set of c correct processes.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  1. Aguilera M.K., Chen W., Toueg S.: On quiescent reliable communication. SIAM J. Comput. 29(6), 2040–2073 (2000)

    Article  MathSciNet  MATH  Google Scholar 

  2. Aguilera, M. K., Delporte-Gallet, C., Fauconnier, H., Toueg, S.: Stable leader election. In: Proceedings of the 15th International Conference on Distributed Computing, pp. 108–122. Springer, London, UK (2001)

  3. Aguilera, M. K., Delporte-Gallet, C., Fauconnier, H., Toueg, S.: Communication-efficient leader election and consensus with limited link synchrony. In: Proceedings of the 23rd ACM Symposium on Principles of Distributed Computing, pp. 328–337 (2004)

  4. Aguilera M.K., Delporte-Gallet C., Fauconnier H., Toueg S.: On implementing omega in systems with weak reliability and synchrony assumptions. Distrib. Comput. 21(4), 285–314 (2008)

    Article  Google Scholar 

  5. Anta A.F., Raynal M.: From an asynchronous intermittent rotating star to an eventual leader. IEEE Trans. Parallel Distrib. Syst. 21(9), 1290–1303 (2010)

    Article  Google Scholar 

  6. Biely, M., Hutle, M., Penso, L. D., Widder, J.: Relating stabilizing timing assumptions to stabilizing failure detectors regarding solvability and efficiency. In: Proceedings of the 9th International Symposium on Stabilization, Safety, and Security of Distributed Systems, pp. 4–20 (2007)

  7. Biely, M., Hutle, M., Penso, L.D., Widder, J.: Relating stabilizing timing assumptions to stabilizing failure detectors regarding solvability and efficiency. Technical Report 54/2007, Technische Universität Wien, Institut für Technische Informatik (2007)

  8. Biely, M., Robinson, P., Schmid, U.: Weak synchrony models and failure detectors for message passing k-set agreement. In: Proceedings of the 13th International Conference on Principles of Distributed Systems, pp. 285–299 (2009)

  9. Bonnet, F., Raynal, M.: Looking for the weakest failure detector for k-set agreement in message-passing systems: is π k the end of the road? In: Procceding of 11th International Symposium on Stabilization, Safety, and Security of Distributed Systems, pp. 149–164 (2009)

  10. Chandra T.D., Toueg S.: Unreliable failure detectors for reliable distributed systems. J. ACM 43(2), 225–267 (1996)

    Article  MathSciNet  MATH  Google Scholar 

  11. Chandra T.D., Hadzilacos V., Toueg S.: The weakest failure detector for solving consensus. J. ACM 43(4), 685–722 (1996)

    Article  MathSciNet  MATH  Google Scholar 

  12. Chandy K.M., Misra J.: The drinking philosophers problem. ACM Trans. Program. Lang. Syst. 6(4), 632–646 (1984)

    Article  Google Scholar 

  13. Charron-Bost, B., Guerraoui, R., Schiper, A.: Synchronous system and perfect failure detector: solvability and efficiency issues. In: International Conference on Dependable Systems and Networks, pp. 523–532 (2000)

  14. Chen W., Toueg S., Aguilera M.K.: On the quality of service of failure detectors. IEEE Trans. Comput. 51(5), 561–580 (2002)

    Article  MathSciNet  Google Scholar 

  15. Delporte-Gallet, C., Fauconnier, H., Guerraoui, R., Hadzilacos, V., Kouznetsov, P., Toueg, S.: The weakest failure detectors to solve certain fundamental problems in distributed computing. In: Proceedings of the 23rd ACM Symposium on Principles of Distributed Computing, pp. 338–346 (2004)

  16. Delporte-Gallet C., Fauconnier H., Guerraoui R., Kouznetsov P.: Mutual exclusion in asynchronous systems with failure detectors. J. Parallel Distrib. Comput. 65(4), 492–505 (2005)

    Article  MATH  Google Scholar 

  17. Dolev D., Dwork C., Stockmeyer L.: On the minimal synchronism needed for distributed consensus. J. ACM 34(1), 77–97 (1987)

    Article  MathSciNet  MATH  Google Scholar 

  18. Dwork C., Lynch N.A., Stockmeyer L.: Consensus in the presence of partial synchrony. J. ACM 35(2), 288–323 (1988)

    Article  MathSciNet  Google Scholar 

  19. Fetzer, C.: The message classification model. In: Proceedings of the 17th ACM Symposium on Principles of Distributed Computing, pp. 153–162 (1998)

  20. Fetzer, C., Schmid, U., Susskraut, M.: On the possibility of consensus in asynchronous systems with finite average response times. In: Proceedings of the 25th IEEE International Conference on Distributed Computing Systems, pp. 271–280 (2005)

  21. Fich F., Ruppert E.: Hundreds of impossibility results for distributed computing. Distrib. Comput. 16(2–3), 121–163 (2003)

    Article  Google Scholar 

  22. Fischer M.J., Lynch N.A., Paterson M.S.: Impossibility of distributed consensus with one faulty process. J. ACM 32(2), 374–382 (1985)

    Article  MathSciNet  MATH  Google Scholar 

  23. Gafni, E., Kouznetsov, P.: The weakest failure detector for solving k-set agreement. In: Proceedings of the 28th ACM Symposium on Principles of Distributed Computing, pp. 83–91 (2009)

  24. Guerraoui, R., Oliveira, R., Schiper, A.: Stubborn communication channels. Technical Report LSR-REPORT-1998-009, Ecole Polytechnique Federale de Lausanne (1998)

  25. Guerraoui R., Kapalka M., Kouznetsov P.: The weakest failure detectors to boost obstruction-freedom. Distrib. Comput. 20(6), 415–433 (2008)

    Article  Google Scholar 

  26. Hermant, J. F., Widder, J.: Implementing reliable distributed real-time systems with the Θ-model. In: Proceedings of the 9th International Conference on the Principles of Distributed Systems, pp. 334–350 (2005)

  27. Hutle M., Malkhi D., Schmid U., Zhou L.: Chasing the weakest system model for implementing Ω and consensus. IEEE Trans. Dependable Secur. Comput. 6(4), 269–281 (2009)

    Article  Google Scholar 

  28. Jayanti, P., Toueg, S.: Every problem has a weakest failure detector. In: Proceedings of the 27th ACM Symposium on Principles of Distributed Computing, pp. 75–84 (2008)

  29. Lo, W. K., Hadzilacos, V.: Using failure detectors to solve consensus in asynchronous shared-memory systems (extended abstract). In: Proceedings of the 8th International Workshop on Distributed Algorithms, pp. 280–295 (1994)

  30. Malkhi, D., Oprea, F., Zhou, L.:Ω meets Paxos: Leader election and stability without eventual timely links. In: Proceedings of the 19th International Symposium on Distributed Computing, pp. 199–213 (2005)

  31. Mostefaoui A., Mourgaya E., Raynal M.: An introduction to oracles for asynchronous distributed systems. Futur. Gener. Comput. Syst. 18(6), 757–767 (2002)

    Article  MATH  Google Scholar 

  32. Mostéfaoui, A., Mourgaya, E., Raynal, M.: Asynchronous implementation of failure detectors. In: Proceedings of the 33rd International Conference on Dependable Systems and Networks, pp. 351–360 (2003)

  33. Mostéfaoui A., Mourgaya E., Raynal M., Travers C.: A time-free assmption to implement eventual leadership. Parallel Process. Lett. 16(2), 189–207 (2006)

    Article  MathSciNet  Google Scholar 

  34. Mostefaoui A., Rajsbaum S., Raynal M., Travers C.: On the computability power and the robustness of set agreement-oriented failure detector classes. Distrib. Comput. 21(3), 201–222 (2008)

    Article  Google Scholar 

  35. Pike, S. M., Sivilotti, P. A.: Dining philosophers with crash locality 1. In: Proceedings of the 24th IEEE International Conference on Distributed Computing Systems, pp. 22–29 (2004)

  36. Pike, S. M., Song, Y., Sastry, S.: Wait-free dining under eventual weak exclusion. In: Proceedings of the 9th International Conference on Distributed Computing and Networking, pp. 135–146 (2008)

  37. Rajsbaum, S., Raynal, M., Travers, C.: Failure detectors as schedulers (an algorithmically-reasoned characterization). Technical Report 1838, IRISA, Université de Rennes, France (2007)

  38. Rajsbaum, S., Raynal, M., Travers, C.:The iterated restricted immediate snapshot model. In: Proceedings of 14th International Conference on Computing and Combinatorics, pp. 487–497 (2008)

  39. Robinson, P., Schmid, U.: The Asynchronous Bounded-Cycle Model. In: Proceedings of the 10th International Symposium on Stabilization, Safety, and Security of Distributed Systems, pp. 246–262 (2008)

  40. Sastry, S., Pike, S. M.: Eventually perfect failure detection using ADD channels. In: Proceedings of the 5th international Symposium on Parallel and Distributed Processing and Applications, pp. 483–496 (2007)

  41. Sastry, S., Pike, S. M., Welch, J. L.: Crash fault detection in celerating environments. In: Proceedings of the 23rd IEEE International Parallel and Distributed Processing Symposium, pp. 1–12 (2009)

  42. Yang, J., Neiger, G., Gafni, E.:Structured derivations of consensus algorithms for failure detectors. In: Proceedings of the 17th ACM Symposium on Principles of Distributed Computing, pp. 297–306 (1998)

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Srikanth Sastry.

Additional information

A preliminary version of this paper was presented at the 14th International Conference On Principles Of Distributed Systems (OPODIS) in 2010. This work is supported in part by NSF grants CCF-0964696 and CCF-0937274, and Texas Higher Education Coordinating Board grant NHARP 000512-0130-2007. This works is also partially supported by Center for Science of Information (CSoI), an NSF Science and Technology Center, under grant agreement CCF-0939370.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Pike, S.M., Sastry, S. & Welch, J.L. Failure detectors encapsulate fairness. Distrib. Comput. 25, 313–333 (2012). https://doi.org/10.1007/s00446-012-0164-x

Download citation

  • Received:

  • Accepted:

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00446-012-0164-x

Keywords

Navigation