Abstract
Failure detectors have long been viewed as abstractions for the synchronism present in distributed system models. However, investigations into the exact amount of synchronism encapsulated by a given failure detector have met with limited success. The reason for this is that traditionally, models of partial synchrony are specified with respect to real time, but failure detectors do not encapsulate real time. Instead, we argue that failure detectors encapsulate the fairness in computation and communication. Fairness is a measure of the number of steps executed by one process relative either to the number of steps taken by another process or relative to the duration for which a message is in transit. We argue that failure detectors are substitutable for the fairness properties (rather than real-time properties) of partially synchronous systems. We propose four fairness-based models of partial synchrony and demonstrate that they are, in fact, the ‘weakest system models’ to implement the canonical failure detectors from the Chandra-Toueg hierarchy. We also propose a set of fairness-based models which encapsulate the \({\mathcal{G}_c}\) parametric failure detectors which eventually and permanently suspect crashed processes, and eventually and permanently trust some fixed set of c correct processes.
Similar content being viewed by others
References
Aguilera M.K., Chen W., Toueg S.: On quiescent reliable communication. SIAM J. Comput. 29(6), 2040–2073 (2000)
Aguilera, M. K., Delporte-Gallet, C., Fauconnier, H., Toueg, S.: Stable leader election. In: Proceedings of the 15th International Conference on Distributed Computing, pp. 108–122. Springer, London, UK (2001)
Aguilera, M. K., Delporte-Gallet, C., Fauconnier, H., Toueg, S.: Communication-efficient leader election and consensus with limited link synchrony. In: Proceedings of the 23rd ACM Symposium on Principles of Distributed Computing, pp. 328–337 (2004)
Aguilera M.K., Delporte-Gallet C., Fauconnier H., Toueg S.: On implementing omega in systems with weak reliability and synchrony assumptions. Distrib. Comput. 21(4), 285–314 (2008)
Anta A.F., Raynal M.: From an asynchronous intermittent rotating star to an eventual leader. IEEE Trans. Parallel Distrib. Syst. 21(9), 1290–1303 (2010)
Biely, M., Hutle, M., Penso, L. D., Widder, J.: Relating stabilizing timing assumptions to stabilizing failure detectors regarding solvability and efficiency. In: Proceedings of the 9th International Symposium on Stabilization, Safety, and Security of Distributed Systems, pp. 4–20 (2007)
Biely, M., Hutle, M., Penso, L.D., Widder, J.: Relating stabilizing timing assumptions to stabilizing failure detectors regarding solvability and efficiency. Technical Report 54/2007, Technische Universität Wien, Institut für Technische Informatik (2007)
Biely, M., Robinson, P., Schmid, U.: Weak synchrony models and failure detectors for message passing k-set agreement. In: Proceedings of the 13th International Conference on Principles of Distributed Systems, pp. 285–299 (2009)
Bonnet, F., Raynal, M.: Looking for the weakest failure detector for k-set agreement in message-passing systems: is π k the end of the road? In: Procceding of 11th International Symposium on Stabilization, Safety, and Security of Distributed Systems, pp. 149–164 (2009)
Chandra T.D., Toueg S.: Unreliable failure detectors for reliable distributed systems. J. ACM 43(2), 225–267 (1996)
Chandra T.D., Hadzilacos V., Toueg S.: The weakest failure detector for solving consensus. J. ACM 43(4), 685–722 (1996)
Chandy K.M., Misra J.: The drinking philosophers problem. ACM Trans. Program. Lang. Syst. 6(4), 632–646 (1984)
Charron-Bost, B., Guerraoui, R., Schiper, A.: Synchronous system and perfect failure detector: solvability and efficiency issues. In: International Conference on Dependable Systems and Networks, pp. 523–532 (2000)
Chen W., Toueg S., Aguilera M.K.: On the quality of service of failure detectors. IEEE Trans. Comput. 51(5), 561–580 (2002)
Delporte-Gallet, C., Fauconnier, H., Guerraoui, R., Hadzilacos, V., Kouznetsov, P., Toueg, S.: The weakest failure detectors to solve certain fundamental problems in distributed computing. In: Proceedings of the 23rd ACM Symposium on Principles of Distributed Computing, pp. 338–346 (2004)
Delporte-Gallet C., Fauconnier H., Guerraoui R., Kouznetsov P.: Mutual exclusion in asynchronous systems with failure detectors. J. Parallel Distrib. Comput. 65(4), 492–505 (2005)
Dolev D., Dwork C., Stockmeyer L.: On the minimal synchronism needed for distributed consensus. J. ACM 34(1), 77–97 (1987)
Dwork C., Lynch N.A., Stockmeyer L.: Consensus in the presence of partial synchrony. J. ACM 35(2), 288–323 (1988)
Fetzer, C.: The message classification model. In: Proceedings of the 17th ACM Symposium on Principles of Distributed Computing, pp. 153–162 (1998)
Fetzer, C., Schmid, U., Susskraut, M.: On the possibility of consensus in asynchronous systems with finite average response times. In: Proceedings of the 25th IEEE International Conference on Distributed Computing Systems, pp. 271–280 (2005)
Fich F., Ruppert E.: Hundreds of impossibility results for distributed computing. Distrib. Comput. 16(2–3), 121–163 (2003)
Fischer M.J., Lynch N.A., Paterson M.S.: Impossibility of distributed consensus with one faulty process. J. ACM 32(2), 374–382 (1985)
Gafni, E., Kouznetsov, P.: The weakest failure detector for solving k-set agreement. In: Proceedings of the 28th ACM Symposium on Principles of Distributed Computing, pp. 83–91 (2009)
Guerraoui, R., Oliveira, R., Schiper, A.: Stubborn communication channels. Technical Report LSR-REPORT-1998-009, Ecole Polytechnique Federale de Lausanne (1998)
Guerraoui R., Kapalka M., Kouznetsov P.: The weakest failure detectors to boost obstruction-freedom. Distrib. Comput. 20(6), 415–433 (2008)
Hermant, J. F., Widder, J.: Implementing reliable distributed real-time systems with the Θ-model. In: Proceedings of the 9th International Conference on the Principles of Distributed Systems, pp. 334–350 (2005)
Hutle M., Malkhi D., Schmid U., Zhou L.: Chasing the weakest system model for implementing Ω and consensus. IEEE Trans. Dependable Secur. Comput. 6(4), 269–281 (2009)
Jayanti, P., Toueg, S.: Every problem has a weakest failure detector. In: Proceedings of the 27th ACM Symposium on Principles of Distributed Computing, pp. 75–84 (2008)
Lo, W. K., Hadzilacos, V.: Using failure detectors to solve consensus in asynchronous shared-memory systems (extended abstract). In: Proceedings of the 8th International Workshop on Distributed Algorithms, pp. 280–295 (1994)
Malkhi, D., Oprea, F., Zhou, L.:Ω meets Paxos: Leader election and stability without eventual timely links. In: Proceedings of the 19th International Symposium on Distributed Computing, pp. 199–213 (2005)
Mostefaoui A., Mourgaya E., Raynal M.: An introduction to oracles for asynchronous distributed systems. Futur. Gener. Comput. Syst. 18(6), 757–767 (2002)
Mostéfaoui, A., Mourgaya, E., Raynal, M.: Asynchronous implementation of failure detectors. In: Proceedings of the 33rd International Conference on Dependable Systems and Networks, pp. 351–360 (2003)
Mostéfaoui A., Mourgaya E., Raynal M., Travers C.: A time-free assmption to implement eventual leadership. Parallel Process. Lett. 16(2), 189–207 (2006)
Mostefaoui A., Rajsbaum S., Raynal M., Travers C.: On the computability power and the robustness of set agreement-oriented failure detector classes. Distrib. Comput. 21(3), 201–222 (2008)
Pike, S. M., Sivilotti, P. A.: Dining philosophers with crash locality 1. In: Proceedings of the 24th IEEE International Conference on Distributed Computing Systems, pp. 22–29 (2004)
Pike, S. M., Song, Y., Sastry, S.: Wait-free dining under eventual weak exclusion. In: Proceedings of the 9th International Conference on Distributed Computing and Networking, pp. 135–146 (2008)
Rajsbaum, S., Raynal, M., Travers, C.: Failure detectors as schedulers (an algorithmically-reasoned characterization). Technical Report 1838, IRISA, Université de Rennes, France (2007)
Rajsbaum, S., Raynal, M., Travers, C.:The iterated restricted immediate snapshot model. In: Proceedings of 14th International Conference on Computing and Combinatorics, pp. 487–497 (2008)
Robinson, P., Schmid, U.: The Asynchronous Bounded-Cycle Model. In: Proceedings of the 10th International Symposium on Stabilization, Safety, and Security of Distributed Systems, pp. 246–262 (2008)
Sastry, S., Pike, S. M.: Eventually perfect failure detection using ADD channels. In: Proceedings of the 5th international Symposium on Parallel and Distributed Processing and Applications, pp. 483–496 (2007)
Sastry, S., Pike, S. M., Welch, J. L.: Crash fault detection in celerating environments. In: Proceedings of the 23rd IEEE International Parallel and Distributed Processing Symposium, pp. 1–12 (2009)
Yang, J., Neiger, G., Gafni, E.:Structured derivations of consensus algorithms for failure detectors. In: Proceedings of the 17th ACM Symposium on Principles of Distributed Computing, pp. 297–306 (1998)
Author information
Authors and Affiliations
Corresponding author
Additional information
A preliminary version of this paper was presented at the 14th International Conference On Principles Of Distributed Systems (OPODIS) in 2010. This work is supported in part by NSF grants CCF-0964696 and CCF-0937274, and Texas Higher Education Coordinating Board grant NHARP 000512-0130-2007. This works is also partially supported by Center for Science of Information (CSoI), an NSF Science and Technology Center, under grant agreement CCF-0939370.
Rights and permissions
About this article
Cite this article
Pike, S.M., Sastry, S. & Welch, J.L. Failure detectors encapsulate fairness. Distrib. Comput. 25, 313–333 (2012). https://doi.org/10.1007/s00446-012-0164-x
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00446-012-0164-x