Nothing Special   »   [go: up one dir, main page]

skip to main content
article
Free access

The consensus problem in fault-tolerant computing

Published: 01 June 1993 Publication History
First page of PDF

References

[1]
ADAMS, J., AND RAMARAO, K. 1989 Distributed dagnosm of Byzantine processors and links. In the 9th Internatmnal IEEE Conference on Dzstr~buted Corr~putzng Systems. IEEE, New York, 562 569.
[2]
AGRAWAI, P 1985. RAFT: A recursive algorithm for fault tolerance. In the International Con/erence on Parallel Programming, 814-821.
[3]
AMMANN, E., AND DAL CIN, M. 1981. Efficient algorithms for comparison-based self-diagnos~s In Self-Dzagnoszs and Fault-Tolerance. Werkhefte der Universitat Tfibingen, 4 Attempto-Verlag, Tubingen, 1-18.
[4]
ATTIYA, C, DOLEV, D., AND GIL, J. 1984. Asynchronous Byzantme consensus. In the 3rd Annual ACM Symposzum on Prmctples of Distrzbuted Computing. ACM, New York, 119 133
[5]
BABAO~LU, (~. 1987a. Stopping times of distributed consensus protocols: A probabilistic approach. Inf. Process. Lett. 25, 3 (May), 163-169.
[6]
BARAOOLU, 0. 1987b. On the reliability of consensus-based fault-tolerant distributed computing systems. ACM Trans. Comput. Syst. 5, 3 (Nov.), 394-416.
[7]
BABAO~LU, 0., AND DRUMMOND, R. 1985. Streets of Byzantium: Network architectures for fast reliable broadcasts. IEEE Trans. Softw. Eng. SE-11 (June), 546 554.
[8]
BAGCHI, A. 1992. A distributed algorithm for system-level diagnosis in hypercubes. In the IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems (July). IEEE, New York, 106-113.
[9]
BAGCHI, A., AND HAKIMI, S. 1991. An optimal algorithm for distributed system level diagnosis. In the 21st International IEEE Symposium on Fault-Tolerant Computing. IEEE, New York, 214-221.
[10]
BARBORAK, M., AND MALEK, M. 1993. Partitioning for efficient consensus. In Proceedings of the 26th Hawaii International Conference on System Sctences (Maul, Jan. 5-8), 438-446.
[11]
BAR-NOY, A., AND DOLEV, 1991. Consensus algorithms with one-bit messages. Dtstrib. Comput. 4, 105-110.
[12]
BAR-NOY, A., DOLEV, D., DWORK, C., AND STRONG, R. 1992. Shifting gears: Changing algorithms on the fly to expedite Byzantine agreement. Inf. Comput. 97, 205-233.
[13]
BARRETT, P., HILBORNE, A., BOND, P., SEATON, D., VERISSIMO, P., RODRIGUES, L., AND SPEIRS, N. 1990. The Delta-4 Extra Performance Architecture (XPA). In the 20th Internatwnal IEEE Symposium on Fault-Tolerant Computing'. IEEE, New York, 481-488.
[14]
BARSI, F., GRANDONI, F., AND MAESTRINI, P. 1976. A theory of diagnosability of digital systems. IEEE Trans. Comput. C-25, 6 (June), 585 593.
[15]
BARTLETT, J. 1978. A 'non-stop' operating system. In Proceedings of the Hawaii International Conference on System Sciences, 103 119,
[16]
BEN-O~, M. 1983. Another advantage of free choice: Completely asynchronous agreement protocols. In the 2nd Annual ACM Symposium on Principles of Dzstributed Computing. ACM, New York, 27 30.
[17]
BERMAN, P., AND PELC, A. 1990. Distributed probabilistic fault diagnosis for multiprocessor systems In the 20th Internattonal IEEE Symposium on Fault-Tolerant Computing. IEEE, New York. 340-346.
[18]
BERMAN, P., GARY, J, AND PERRY, K. 1989. Towards optimal distributed consensus. In the 30th IEEE Symposium on Foundattons of Computer Science. IEEE, New York, 410 415.
[19]
BERTSEKAS, D., AND GALLAGER, R. 1987. Data Networks. Prentice-Hall, Englewood Cliffs, N.J.
[20]
B~CH~NI, R., AND BUS~NS, R. 1991. An adaptive distributed system-level diagnosis algorithm and its implementation. In the 21st International IEEE Symposium on Fault-Tolerant Computing. IEEE, New York, 222-229.
[21]
BIANCHINI, R., GOODWIN, K., AND NYDICK, D. 1990. Practical application and implementation of distributed system-level diagnosis theory. In the 20th International IEEE Symposium on Fault-Tolerant Computing. IEEE, New York, 332-339.
[22]
BIRMAN, K. 1985. Replication and fault-tolerance in the ISIS system. In the lOth ACM SIGOPS Symposium on Operating Systems Principles. (Dec.). ACM, New York, 79-86.
[23]
BIRMAN K., AND JOSEPH, T. 1987. Reliable communication in the presence of faults. ACM Trans. Comput. Syst. 5, i (Feb.), 47-76.
[24]
BLECHER, P. 1983. On a logical problem. Discr. Math. 43, 107-110.
[25]
BLOUGH, D., AND PELC, A. 1992. Complexity of fault diagnosis in comparison models. IEEE Trans. Comput. 41, 3 (Mar.), 318-323.
[26]
BLOUGH, D., AND PELC, A. 1990. Reliable diagnosis and repair in constant-degree multiprocessor systems. In the 20th International IEEE Symposium on Fault-Tolerant Computing. IEEE, New York, 316-323.
[27]
BLOUGH, D., SULLIVAN, G., AND MASSON, G. 1992a. Intermittent fault diagnosis in multiprocessor systems. IEEE Trans. Comput. To be published.
[28]
BLOUGH, D., SULLIVAN, G., AND MASSON, G. 1992b. Efficient diagnosis of multiprocessor systems under probabilistic models. IEEE Trans. Cornput. To be published.
[29]
BLOUGH, D., SULLIVAN, G., AND MASSON, G. 1989 Fault diagnosis for sparsely interconnected multiprocessor systems. In the 19th International IEEE Symposium of Fault-Tolerant Computing. IEEE, New York, 62-69.
[30]
BLOUGH, D., SULLIVAN, G., AND MASSON, G. 1988. Almost certain diagnosis for intermittently faulty systems. In the 18th International IEEE Symposium of Fault-Tolerant Computing. IEEE, New York, 260 265.
[31]
BLOUNT, M. 1978. Modeling of diagnosis in failsoftly computer systems. In the 8th International IEEE symposium on Fault-Tolerant Computing. IEEE, New York, 53 58.
[32]
BLOUNT, M. 1977. Probabilistic treatment of diagnosis in digital systems. In the 7th Internatzonal IEEE Symposium of Fault-Tolerant Computing. IEEE, New York, 72-77.
[33]
BRACHA, G. 1987a. Asynchronous Byzantine agreenxenJe larotocola. Inf. COmFit. 7~ (Nov.), 130-143.
[34]
BP~CHA, G. 1987b. An O(log n) expected rounds randomized Byzantine generals protocol. J. ACM 34, 4 (Oct.), 910-920.
[35]
BRACHA, G., AND TOUEG, S. 1985. Asynchronous consensus and broadcast protocols. J. ACM 32, 4 (Oct.), 824 840.
[36]
BR^C~A, G., ANO TOUEG, S. 1983. Resilient consensus protocols. In the 2nd ACM Symposium on Princtples of Distributed Computing. ACM, New York, 12-26.
[37]
BURNS, J., AND LYNCH, N. 1987. The Byzantine firing squad problem. Adv. Comput. Res.: Para,Tl. Dtstrib. Comput. 4, 147 161.
[38]
CHAR. B., AND CoAN, B. 1985. A simple and efficmnt randomized Byzantine agreement algorithm. IEEE Trans. Softw. Eng. SE-11, 6 (June), 531-539.
[39]
CHAR, B., MERRITT, M., AND SHMOYS, D. 1989. Simple constant-time consensus protocols in realistic failure models. J. ACM 36, 3 (July), 591 614.
[40]
CHW^, K., AND HAKmI, S. 1981a. Schemes for fault-tolerant computing: A comparison of modularly redundant and t-diagnosable systems. Inf. Contr. 49, 212-238.
[41]
CHWA, K., AND HAt~IMI, S. 1981b. On fault identification in diagnosable systems. IEEE Trans. Comput. C~30, 6 (June), 414 422.
[42]
CIOMgI, P., GRANDONI, F., ANb SmONCINI, L. 1981. Distributed diagnosis in multiprocessor systems: The MuTeam approach. In the llth International IEEE Symposium on Fault-Tolerant Computing. IEEE, New York, 25-29.
[43]
CoAN, B. 1988. Efficient agreement using fault diagnosis. In Proceedings of the 26th Allerton Conference on Communication, Control and Computing. Univ. of Illinois, Urbana, Ill., 663-672.
[44]
COAN, g., AND WELCH, J. 1992. Modular construction of a Byzantine agreement protocol with optimal message bit complexity. In/} Camput. 97, 61 85.
[45]
COAN, B., DOLEV, D., DWORK, C., AND STOCKMEYER, L. 1985. The distributed firing squad problem. In the 17th ACM Symposium on the TheoCY of Computing. ACM, New York, 335-345.
[46]
Cms'~IAN, F. 1991a. Understanding fault-tolerant distributed systems. Cornmun. ACM 34, 2 (Feb.).
[47]
CRISTIAN, F. 1991b. Reaching agreement on processor-group membership in synchronous distributed systems. Dzstrib. Comput. 4, 175 187.
[48]
CRISTIAN, F. 1990. Fault-tolerance in the advanced automation system. IBM Res. Rep, RJ '7424 (69595).
[49]
CRIS'r~AN, F. 1989. Synchronous atomic broadcast for redundant broadcast channels. IBM Res. Rep. RJ 7203.
[50]
CRISTIAN, F., AGHILI, H., STRONG, R., AND DOLEV, D. 1986. Atomic broadcast: From simple message diffusion to Byzantine agreement. IBM Tech. Rep. RJ 5244 (54244).
[51]
DAHBURA, A. 1988. System-level diagnosis: A perspective for the third decade. AT& T Bell Laboratories Report. Concurrent Computations: Algorithms, Architecture, and Technology. Plenum Press, New York.
[52]
DAHBURA, A. 1986. An efficient algorithm for identifying the most likely fault set in a probabilistically diagnosable system. IEEE Trans. Comput. C-35, 4 (Apr.), 354-356.
[53]
DAHBURA, A., AND MASSON, G. 1984a. An O(n25) fault identification algorithm for diagnosable systems. IEEE Trans. Comput. C-33, 6 (June), 486 492.
[54]
DAHBURA, A., AND MASSON, G. 1984b. A practical variation of the O(n2 s) fault diagnosis algorithm. In the 14th International IEEE Sympostum on Fault-Tolerant Computing. IEEE, New York, 428-433.
[55]
DAHBURA, A., ANn MASSON, G. 1983a. Greedy diagnosis of hybrid fault situations. IEEE Trans. Comput. C-32, 8 (Aug.), 777 782.
[56]
DAaBURA, A., AND MASSON, G. 1983b. Greedy diagnosis of an intermittent-fault/tranment-upset tolerant system design, IEEE Trans. Camput. C-32,10 (Oct.), 953-957.
[57]
DAHBURA, A., LAFERRERA, J., AND KING, L. 1985a. A performance study of system-level fault diagnosis algorithms (I). In the 4th Internattonal Conference on Computers and Communications, 469-473.
[58]
DAHBURA, A., MASSON, G., AND YAN6, C. 1985b. Self-implicating structures for diagnosable systems. IEEE Trans. Comput. C-34, 8 (Aug.), 718-723.
[59]
DAHUBURA, A., SABNANI, K., AND I-IERY, W. 1989. Spare capacity as a means of fault detection and diagnosis in multiproeessor systems. IEEE Trans. Comput. C-38, 6 (June), 881-891.
[60]
DAHBURA, A., SABNANI, K., AND KING, L. 1987. The comparison approach to multiprocessor fault diagnosis. IEEE Trans. Comput. C-36, 3 (Mar.), 373-378.
[61]
DAL CIN, M. 1984. Distributed diagnosis for computing networks. Microproeess, Microprogram. 14, 139 144.
[62]
DAL CIN, M. 1982. A diagnostic device for large multiprocessor systems. In the 12th Internatzonal IEEE Symposium on Fault-Tolerant Computing. IEEE, New York, 357-360.
[63]
DAL C~N, M. 1980. Self-testlng and self-diagnosing multicomponent systems. In the lOth International IEEE Symposium on Fault-Tolerant Computmg. IEEE, New York.
[64]
DAL CIN, M. 1978. Performance evaluation of self-diagnosing multiprocessing systems, In the 8th International IEEE Symposium on Fault- Tolerant Computing. IEEE, New York, 59-64.
[65]
DAL C~N, M., AND DmGER, E. 1981. On the diagnosability of self-testing multi-microprocessor systems. Microprocess. Microprogram. 7, 177-184.
[66]
DAL CIN, M., AND FLORIAN, F. 1985. Analysis of a fault-tolerant distributed diagnosis algorithm. In the 15th Internatzonal IEEE Symposium on Fault-Tolerant Computing-. IEEE, New York, 159-164.
[67]
DEGONIA, P., WITT, R., LAMPE, D., AND COLE, E. 1978. Micronet--A self-healing network for signal processing. In the Government Microcircuit Application Conference. (Nov.), 370-375.
[68]
DEC), N. 1974. Graph Theory with Applications to Engineering and Computer Sctence. Prentice-Hall, Englewood Cliffs, N.J.
[69]
DOLEV, D. 1982. The Byzantine generals strike again. J. Alg. 3, 14-30.
[70]
DOLEV, D. 1981. Unanimity in an unknown and unreliable environment. In the 22nd Symposium on the Foundations of Computer Science, 159-168.
[71]
DOLEV, D., AND REISCHUK, R. 1985. Bounds on information exchange for Byzantine agreement. J. ACM 32, i (Jan.), 191-204.
[72]
DOLEV, D., ANO STRONG, H. 1982. Authenticated algorithms for Byzantine agreement. J. Alg. 3, 14-30.
[73]
DOLEV, D., DWORK, C., AND STOCKMEYER, L. 1987. On the minimal synchronism needed for distributed consensus. J. ACM 34, 1 (Jan.), 77-97.
[74]
DOLEV, D., LYNCH, N., PINTER, S., STARK, E., AND WEIHL, W. 1986. Reaching approximate agreement in the presence of faults. J. ACM 33, 3 (July), 499 516.
[75]
DWORK, C., LYNCH, N., AND STOCKMEYER, L. 1988. Consensus in the presence of partial synchrony. J. ACM 35, 2 (Apr.), 288-323.
[76]
FEKETE, A. 1991. Asymptotically optimal algorithms for approximate agreement. Dtstrtb. Comput. 4, 9-29.
[77]
FELDMAN, P., AND MICALI, S. 1988. Optimal algorithms for Byzantine agreement. In the 20th ACM Sympostum on Theory of Computing ACM, New York, 148 161.
[78]
FISCHER, M., AND LYNCH, N. 1982. A lower bound for the time to assure interactive consistency. Inf Process. Lett. 14, 4 (June), 183 186.
[79]
FXSCHER, M., LYNCH, N., AND MERR~TT, M. 1986. Easy impossibility proofs for distributed consensus problems. Dtstrib. Comput. 1, 1, 26 39.
[80]
FISCHER, M., LYNCH, N., AND PATERSON, M. 1985. Impossibility of distributed consensus with one faulty process. J. ACM 32, 2 (Apr.), 374-382.
[81]
FRIEDMAN, A. 1975. A new measure of digital system diagnosis. In the 5th International IEEE Symposium on Fault-Tolerant Computing. IEEE, New York, 167 169.
[82]
FRmDMAN, A., AND SmONC~m, L. 1980. Systemlevel fault diagnosis. Computer (Mar.), 47-53.
[83]
FUJIWARA. H., AND KINOSHITA, K. 1978. On the computational complexity of system diagnosis. IEEE Trans. Comput. C-27, 10 (Oct.), 881-885.
[84]
FUJIWARA, $., AND OZAKI, $. 1979. On the diagnosability of systems with self testing units. In the 9th International IEEE Sympostum on Fault-Tolerant Computing'. IEEE, New York, 157-160
[85]
FUSSELL, E, AND RANGARAJAN, S. 1989. Probabilistic diagnosis of multiprocessor systems with arbitrary connectivity. In the 19th Internattonal IEEE Symposium on Fault-Tolerant Computing. IEEE, New York, 560 565.
[86]
GUPTA, R., AND RAMAKRISHNAN, I. 1987. Systemlevel fault diagnosis in malicious environments. In the 17th International IEEE Symposium on Fault-Tolerant Computing. IEEE, New York, 184-189.
[87]
HAKIMI, S., AND AMIN, A. 1974. Characterization of connection assignment of diagnosabte systems. IEEE Trans. Comput. C-23, i (Jan.), 86-88.
[88]
HAKIMI, S., AND NAKAJIMA, K. 1984. On adaptive system diagnosis. IEEE Trans. Comput. C~33, 3 (Mar.), 234 240.
[89]
HARARY, F. 1972. Graph Theory. Addison-Wesley, Reading, Mass.
[90]
HARPER, R., LALA, J., AND DEYST, J. 1988. Fault tolerant parallel processor architecture overview. In the 18th Internattonal IEEE Symposium on Fault-Tolerant Computtng. IEEE, New York, 252-257.
[91]
HOLT, C., AND SMITH, J. 1981. Diagnosis of systems with asymmetric invalidation. IEEE Trans. Comput. C-30, 9 (Sept.), 679 690.
[92]
HOSSEINr, S., KUHL, J., AND REDDY, S. 1985. On self fault-diagnosis of the distributed systems. In the 15th Internattonal IEEE Symposium on Fault-Tolerant Computing. IEEE, New York, 30-35.
[93]
HOSSEINI, S., KUHL, J., AND REDDY, S. 1984. A diagnosis algorithm for distributed computing systems with dynamic failure and repair. IEEE Trans. Comput. C-33, 3 (Mar.), 223-233.
[94]
HUANG, S., Xu, J., AND CHEN, T. 1989. Characterization and design of sequentially t-diagnosable systems. In the 19th Internattonal IEEE Symposium on Fault-Tolerant Computing. IEEE, New York, 554-559.
[95]
KAMEDA, T., TO1DA, S., AND ALLAN, F. 1975. A diagnosing algorithm for networks. Inf. Contr. 29, 141-148.
[96]
KARUNANITHI, S., AND FRIEDMAN, A. 1977. System diagnosis with t/s diagnosability. In the 7th International IEEE Symposium on Fault- Tolerant Computing. IEEE, New York, 65-71.
[97]
KAWANPOUR, A., AND FRmDMAN, A. 1978. Efficient design of easily diagnosable systems. In Proceedings of the 3rd USA-Japan Computer Conference, 251-257.
[98]
t~IM, K., AND YANC, S 1986. Fault tolerance mechanisms in real-time distributed operating systems: An overview. In the Paciftc Computer Communications '85. Elsevier Science Publishers, 239-248.
[99]
I~ME, C. 1970. An analysis model for digital system diagnosis IEEE Trans. Comput. C-19, 11 (Nov.), 1063 1073.
[100]
KIME, C. 1986. System diagnosis. In Fault Tolerant Computzng: Theory and Techniques. Prentice-Hall, Englewood Cliffs, N.J.
[101]
KOHDA, T., AND ABIRU, K. 1988. A recursive proeedure for optimally designing a hybrid fault diagnosable system. In the 18th Internattonal IEEE Sympostum on Fault-Tolerant Computing IEEE, New York, 272 277.
[102]
KOZLOWSKL W., AND KRAWCZYK, $. 1991. A comparison-based approach to multicomputer system diagnosis in hybrid fault situations. IEEE Trans. Comput. 40, 11 (Nov.), 1283-1287
[103]
KREUTZER, S., AND HAKIMI, S. 1988. Distributed diagnosis and the system user. IEEE Trans. Comput. 37, I (Jan.), 71-78.
[104]
KREUTZER, S., AND HAKIM), S. 1987. System-level fault diagnosis: A survey. Mlcroprocess. Mtcroprogram. 20,323-330.
[105]
KREUTZ~'R, S., AND HAKIMI, S. 1983. Adaptive fault identification in two new diagnostic models. In Proceedings of the 21st Allerton Conference on Communwatlon, Control and Computing Umv. of Illinois, Urbana, Ill., 353 362.
[106]
Ki~OL, T. 1991 A generahzation of fault-tolerance based on masking. Ph.D. dissertation, Eindhoven Univ. of Technology, Eindhoven, The Netherlands.
[107]
KROL, T. 1982. The '(4,2)-Concept' fault tolerant computer In the 12th Internatzonal IEEE Symposium on Fault-Tolerant Computzng. IEEE, New York, 49-54.
[108]
KUHL, J., AND REDDY, S. 1981. Fault-dmgnosis in fully distributed systems. In the 11th Internatwnal IEEE Symposzum on Fault-Tolerant Computing. IEEE, New York, 100-105
[109]
KUHL, J., ANn REDDY, S. 1980a Distributed fault-tolerance for large multiprocessor systems. In Proceedings of the 7th Annual Symposium on Computer Archttecture, 23 30.
[110]
KUHL, J., AND REDDY, S. 1980b. Some extensions to the theory of system level fault dmgnosis. In the lOth Internattonal IEEE Symposzum on Fault Tolerant Computing. IEEE, New York, 291 296.
[111]
LALA, J. 1986 A Byzantine resilient fault tolerant computer fbr nuclear power plant applicatmns. In the lO'th Internattonat IEEE Sympostum on Faull-Tolerant Computing. IEEE, New York, 338-343.
[112]
LALA, J., ALGER, L., GAUTHIER, R., AND DZWONCZYK, M. 1986. A fault tolerant processor to meet rigorous fmlure reqmrements. In Proceedings of the 7th AIAA-IEEE Digital Avtonics Systems Conference. AIAA-IEEE, New York, 555 562
[113]
LAMPORT, L. 1989. The part-time parliament Digital Tech. Rep. 49 (Sept. 1).
[114]
LA~IPO~T, L. 1983. Weak Byzantine generals problem J. ACM 30, 3 (July), 668-676
[115]
LAMPORT, L., AND MELLIAR-SMITH, P. 1984. Byzantine clock synchromzation. In the 3rd ACM Sympostum on Princtples of Dtstrzbuted Computtng. ACM, New York, 68-74.
[116]
LAMPORT, L., SHOSTAK, R., AND PEASE, M. 1982. The Byzantine generals problem. ACM Trans. Program. Lang. Syst. 4, 3 (July), 382-401.
[117]
LARANJEIRA, L., MALEK, M., AND JENEVEIN, R. 1991. On tolerating faults in naturally redundant algorithms. In the lOth Symposzum on Reliable Distrtbuted Systems (Pisa, Italy, Sept.). IEEE Computer Society, Los Alamitos, Calif., 18-127.
[118]
LEE, S., AND SHIN, K 1990. Optimal multiple syndrome probabilistic diagnosis. In the 20th International IEEE Sympostum on Fault- Tolerant Computing. IEEE, New York, 324 331.
[119]
LIAW, C., Su, S, AND MALAIY^, Y. 1982. Self-diagnosis of non-homogeneous distributed systems. In the 12th Internatzonal IEEE Sympostum on Fault-Tolerant Computing. IEEE, New York, 349-352.
[120]
LOMBARDI, F 1985. Diagnosable systems for fault tolerant computing In the 15th Internatzonal IEEE Symposium on Fault-Tolerant Computmg. IEEE, New York, 42 47.
[121]
MAEHLE, E., MORITZEN, K. AND WIRL, K. 1986. A graph model for diagnoms and reconfiguration and ~ts application to a fault4olerant multiprocessor system. In the 16th International IEEE Symposium on Fault-Tolerant Computing. (Vienna, Austria). IEEE, New York, 292-297.
[122]
MAENO, J., AND MALF~K, M 1981. A comparison connection assignment for self-diagnoms of multiprocessor systems. In the 11th Internatwnal IEEE Sympostum on Fault-Tolerant Cornputzng. IEEE, New York, 173-175.
[123]
MAHANEY, S., AND SCHNEIDER, F 1985. Inexact agreement' Accuracy, precision and graceful degradation In the 4th ACM Symposium on the Prznciples of Dzstrtbuted Computing. ACM, New York, 237-249.
[124]
MAHESHWARI, S., AND HAKIML S 1976. On models for diagnosable systems and probabilistic fault diagnosis. IEEE Trans. Comput. C-25, 3 (Mar.), 228 236.
[125]
MALEK, M. 1991. Responsive systems: A marriage between real-time and fault tolerance. In tho 5th GI / NTG / GMR Conference o~ Fault- Tolerant Computtng Systems. Informatik- Fachberichte, vol. 283. Springer-Verlag, Berlin, 1-17.
[126]
MALEK, M. 1990. Responsive systems: A challenge for the nineties. In Proceedzngs of Euromzcro '90, 16th Symposzum on Mzcroprocesszng and Mtcroprogramming. Microprocessmg and MicroprogTamming 30, North-Holland, Amsterdam, 9 16.
[127]
MALEK, M. 1980. A comparison connectmn assignment for &agnosis of multlprocessor systems. In Proceedzngs of the 7th Annual Symposium on Computer Archttecture, 31-36.
[128]
MALEK, M., AND LID, K. 1980. Graph theory models in fault diagnosis and fault tolerance. In Design Automatto, and Fault-Tolerant Computing, vol. 3, 3, 4. Computer Science Press, 155-169.
[129]
MALLELA, S. 1980. On diagnosable systems with simple algorithms. In Proceedings of the 1980 Conference on Informatton Science and Systems. Princeton Univ., Princeton, N.J., 545 549.
[130]
MALLELA, S., AND MASSON, G. 1980. Diagnosis without repair for hybrid fault situations. IEEE Trans. Comput. C-29, 6 (June), 461-470.
[131]
MALLEL^, S., AND MASSON, G. 1978. Diagnosable systems for intermittent faults. IEEE Trans. Comput. C-27, 6 (June), 560 566.
[132]
MAXEMCHUK, N., AND DAHBURA, A. 1986. Optimal diagnosable system design using full-difference triangles. IEEE Trans. Comput. C-35, 9 (Sept), 837-839.
[133]
MENEZES, B., JOHNSON, A., MALEK, M., JENEVEIN, R., AND YAU, K. 1992. Fault impact and fault tolerance in multiprocessor interconnection networks. In Quality and Reliability Engineering International, vol. 8, 485 500.
[134]
MEYER, G. 1983. A diagnosis algorithm for the BGM system-level fault model. In Proceedings of the 21st Allerton Conference on Commumcatwn, Control and Computing. Univ. of Illinois, Urbana, Ill., 345-351.
[135]
MEYER, G., AND MASSON, G. 1978. An efficient fault diagnosis algorithm for symmetric multiple processor architectures. IEEE Trans. Comput. C-27, 11 (Nov.), 1059-1063.
[136]
MORn'ZEN, K. 1984. System level fault-diagnosis in distributed systems. In the 2nd GI / NTG / GMR Conference on Fault-Tolerant Computing Systems. Informatik-Fachberichte, vol. 84. Springer-Verlag, Berlin, 301-312.
[137]
MOSES, Y., AND WAZmTS, O. 1988. (t + D-round Byzantine agreement in polynomial time. In the 29th Symposium on Foundations of Computer Science, 246-255.
[138]
N^m, R. 1978. Diagnosis, self-diagnosis and roving diagnosis. Dept. of Computer Science Rep. R-823, University of Illinois, Urbana, Ill.
[139]
NAKAJIMA, K. 1981. A new approach to system diagnosis. In Proceedings of the 19th Allerton Conference on Commumcatwm Control and Computing. Univ. of Illinois, Urbana, Ill., 697-706.
[140]
PEASE, M., SHOSTAK, R., AND LAMPORT, L., 1980, Reaching agreement in the presence of faults. J. ACM, 27, 2 (Apr.), 228-234.
[141]
PELC, A. 1992. Optimal fault diagnosis in comparison models. IEEE Trans. Comput. 41, 6 (June), 779 786.
[142]
PELC, A. 1991. Undirected graph models for system-level fault diagnosis. IEEE Trans. Comput. 40 11 (Nov.), 1271-1276.
[143]
POWELL, D. 1992. Fault mode assumptions and assumption coverage. In the 22nd IEEE International Symposium on Fault-Tolerant Computing. IEEE, New York, 386-395.
[144]
PRADHAN, D., AND REDD~, S. 1982. A faulttolerant communication architecture for distributed systems. IEEE Trans. Comput. C-31, 9 (Sept.), 863 870.
[145]
PREPAr~TA, F., METZE, G., AN~ CmEN, R. 1967. On the connection assignment problem of diagnosable systems. IEEE Tra,s. Elect. Corrzput. EC-16, 6 (Dec.), 848 854.
[146]
RABIN, M. 1983. Randomized Byzantine generals. In Proceedings of the 24th Symposzum on Foundations of Computer Science, 403 409.
[147]
RAGHAVAN, V., AND TmPATHI, A. 1991a. Improved diagnosability algorithms. IEEE Trans Cornput. 49, 2 (Feb.), 143-153.
[148]
RAGHAVAN, V., AND TRIPATHI, A. 1991b. Sequential diagnosability is co-NP complete. IEEE Trans. Comput. 40, 5 (May), 584-595.
[149]
RANGARAJAN, S., AND FUSSELL, D 1991. Probabilistic diagnosis algorithms tailored to system topology. In the 21st International IEEE Symposium on Fault-Tolerant Computing. IEEE, New York, 230-237.
[150]
RANGARAJAN, S. AND FUSSELL, D. 1988. A probabilistic method for fault diagnosm of nmltiprocessor systems. In the 18th International IEEE Symposium on Fault-Tolerant Computtng. IEEE, New York, 278-283.
[151]
RANaARAJAN, S., FUSSELL, D., AND MALEK, M. 1990. Bufit-in testing of integTated circuit wafers. IEEE Trans. Comput. 39, 2 (Feb.), 195 204.
[152]
RAYNAL, M. 1988. Distributed Algorithms and Protocols. John Wiley, New York, 137-163.
[153]
RUSSELL, J., AND KIME, C. 1975a. System fault diagnosis: Masking, exposure, and diagnosability without repair. IEEE Trans. Comput. C-24, 12 (Dec.), 1115-1161.
[154]
RUSSELL, J., AND KiME, C. 1975b. System fault diagnosis: Closure and diagnosability with repair. IEEE Trans. Comput. C-24, 11 (Nov.), 1078-1088.
[155]
SAHEBAN, F., SIMONCINI, L., AND FRIEDMAN, A. 1979. Concurrent computation and diagnosis in multiprocessor systems. In the 9th International IEEE Symposium on Fault-Tolerant Computing. IEEE, New York, 149 156.
[156]
SCHEINERMAN, E. 1987. Almost sure fault tolerance in random graphs. SIAM d. Comput. 16, 6 (Dec.), 1124-1134.
[157]
SCHLICHTING, R., AND SCHNEIDER, F. 1983. Failstop processors: An approach to designing fault-tolerant computing systems. ACM Trans. Comput. Sys. 1, 3 (Aug.), 222-238.
[158]
BCHMEICHEL, E., HAKIMI, S., OTSUKA, M., AND SULLI- VAN, G. 1988. On minimizing testing rounds for fault identification. In the 18th Internatwnal IEEE Symposium on Fault-Tolerant Computing. IEEE, New York, 266 271.
[159]
SCHNEIDER, F. 1990. Implementing fault-tolerant services using the state machine approach: A tutorial. ACM Cornput. Surv. 22, 4 (Dec.), 299 319.
[160]
SCHNEIDER, F. 1984. Byzantine generals in action: Implementing fail-stop processors. ACM Trans. Comput. Syst. 2, 2 (May), 145 154.
[161]
SENGUPTA, A., AND DAHBURA, A. 1989 On self-dlagnosable mukiprocessor systems. Diagnosis by the comparison approach. In the 19th International IEEE Symposium on Fault-Tolerant Computing. IEEE, New York, 54-61.
[162]
SHAMm, A. 1979. How to share a secret. Commun. ACM 22, 11 (Nov.), 612-613.
[163]
SHIN, K., AND RAM~'4^THAN, P 1987. Diagnosis of processors with Byzantine faults. In the 17th International IEEE Syrnpostum on Fault- Tolerant Computing. IEEE, New York, 55 60
[164]
SIMONS, B., AND SPECTOR, h., EDS. 1990. Lecture Notes tn Computer Sczence: Fault-Tolerant Dzstrzbuted Computmg, vol. 448. Sprmger-Verlag, Berlin.
[165]
SMITH, J. 1979. Universal system diagnosis algorithms. IEEE Trans. Comput. C-28, 5 (May), 374-378.
[166]
SOMANI, A., AND AGARWAL, V 1992. Distributed diagnosis algorithms for regular interconnected structures. IEEE Trans. Comput. 41, 7 (July), 899-906.
[167]
SOMANI, A., AG^RWAL, V., AND Avrs, D. 1987. A generalized theory for system level diagnosis IEEE Trans. Cornput. C-36, 5 (May), 538-546.
[168]
STAHL, M., BUSKENS, R., AND BIANCHINI, R. 1992. On-line dmgnosis in general topology/networks. In the IEEE Workshop on Fault-Tolerant Parallel and Distributed Systems. IEEE, New York, 114-121.
[169]
SULHVAN, G. 1988. An O(t3 + }El) fault identification algorithm for diagnosable systems. IEEE Trans. Comput. 37, 4 (Apr), 388-397.
[170]
SULLIVAN, G 1987. System-level fault dlagnosability m probabfiistic and weighted models. In the 17th International IEEE Symposium of Fault-Tolerant Computing. IEEE, New York, 190-195.
[171]
SULLIV~, G. 1984. A polynomial time algorithm for fault diagnosability. In the 25th Symposium on the Foundatio~ts of Computer Science, 148-156.
[172]
TOUEG, 8., PERRY, K., AND SR{Ig2ANTH, T. 1987. Fast distributed agreement. SIAA4 J Cornput. 16, 445-458.
[173]
TUREK, J., AND SP~SHA, D. 1992 The many faces of consensus in distributed systems. Computer 18 6 (June), 8 17.
[174]
TURPIN, R., AND COAN, B. 1984. Extending binary Byzantine agreement to multivalued Byzantine agreement. Inf. Process. Lett. 18 (Feb.), 73-76.
[175]
VAIDYA, N., AND PRADHAN, D. 1991. System level diagnosis: Combining detection and location In the 21st International IEEE Symposzum on Fault-Tolerant Computing, IEEE, New York, 488-495.
[176]
WALTER, C. 1990. Identifying the cause of detected errors. In the 20th International IEEE Symposium on Fault-Tolerant Computing. IEEE, New York, 48-55.
[177]
WENSLEY, J, LAMPORT, L., GOLDBERG, J., GREEN, M, LEVITT, K., MELLIAR-SMITH, P., SHOSTAK, R, AND WE{NSTOCK, C. 1978. SIFT: Design and analysis of a faull-tolerant computer for mrcraft control. Proe IEEE, 66, 10 (Oct.), 1240-1255.
[178]
Xu, J. 1991. The t/(n - 1)-dmgnosablhty and its applications to fault tolerance. In the 21st In~ ternattonal IEEE Symposzum on Fault-Tolerant Computing. IEEE, New York, 496-503.
[179]
YANG, C., AND MASSON, G. 1988a. Hybrid fault diagnosability with unreliable commumcation hnks. {EEE Trans. Comput. 37, 2 (Feb), 175-181.
[180]
YANG, C., AND MASSON, G 1988b. A distributed algorithm for fault diagnosis in systems with soft fmlures IEEE Trans. Cornput. 37, 11 (Nov.), 1476 1480.
[181]
fANG, C., AND MASSON, G. 1987. A new measure for hybrid fault dlagnosabihty. IEEE Trans Comput. C-36, 3 (Mar.), 378 383.
[182]
YANG, C., AND MASSOX;. G 1986. An ef~ment algorithm for multiprocessor fault dmgnosls usmg the comparison approach. In the 16th International IEEE Symposzum on Fault-Tolerant Computing (Vmnna, Austria). IEEE, New York, 238-243.
[183]
YANG, C., AND MASSON, G. 1985a. A fault identification algorithm for t,-diagnosable systems. In the 15th Internatwnal IEEE Symposium on Fault-Tolerant Computing. IEEE, New York, 78 83.
[184]
YAN6, C., AND MASSON, G. 1985b. A generahzation of hybrid faulty dmgnosabfiy. In the 15th International IEEE Symposium on Fault Tolerant Computing. IEEE, New York, 36 41.
[185]
YANG, C., MASSON, G., AND LEONETTI, R. 1986. On fault isolation and identification in t~/t~-diagnosable systems. IEEE Trans. Cornput. C-35, 7 (July), 639 643.

Cited By

View all
  • (2024)A survey of fault tolerant consensus in wireless networksHigh-Confidence Computing10.1016/j.hcc.2024.1002024:2(100202)Online publication date: Jun-2024
  • (2024)An in-depth and insightful exploration of failure detection in distributed systemsComputer Networks10.1016/j.comnet.2024.110432247(110432)Online publication date: Jun-2024
  • (2024)Dynamic-EC: an efficient dynamic erasure coding method for permissioned blockchain systemsFrontiers of Computer Science10.1007/s11704-023-3209-319:1Online publication date: 13-Nov-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Computing Surveys
ACM Computing Surveys  Volume 25, Issue 2
June 1993
148 pages
ISSN:0360-0300
EISSN:1557-7341
DOI:10.1145/152610
Issue’s Table of Contents

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 June 1993
Published in CSUR Volume 25, Issue 2

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Byzantine agreement
  2. consensus problem
  3. decision theory
  4. processor membership
  5. system diagnosis

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)222
  • Downloads (Last 6 weeks)24
Reflects downloads up to 28 Nov 2024

Other Metrics

Citations

Cited By

View all
  • (2024)A survey of fault tolerant consensus in wireless networksHigh-Confidence Computing10.1016/j.hcc.2024.1002024:2(100202)Online publication date: Jun-2024
  • (2024)An in-depth and insightful exploration of failure detection in distributed systemsComputer Networks10.1016/j.comnet.2024.110432247(110432)Online publication date: Jun-2024
  • (2024)Dynamic-EC: an efficient dynamic erasure coding method for permissioned blockchain systemsFrontiers of Computer Science10.1007/s11704-023-3209-319:1Online publication date: 13-Nov-2024
  • (2024)Fault-Tolerant Wireless ConsensusWireless Consensus10.1007/978-3-031-70859-6_2(53-94)Online publication date: 22-Oct-2024
  • (2023)FPGAs-based Edge Computing with Distributed Leader Election on Internet of Things2023 South American Conference On Visible Light Communications (SACVLC)10.1109/SACVLC59022.2023.10347586(118-123)Online publication date: 8-Nov-2023
  • (2023)EBFT: Efficient BFT Consensus through Shortcut Replies2023 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom)10.1109/ISPA-BDCloud-SocialCom-SustainCom59178.2023.00090(439-446)Online publication date: 21-Dec-2023
  • (2023)DW-LRC: A Dynamic Wide-stripe LRC Codes for Blockchain Data Under Malicious Node Scenarios2023 IEEE 29th International Conference on Parallel and Distributed Systems (ICPADS)10.1109/ICPADS60453.2023.00248(1793-1800)Online publication date: 17-Dec-2023
  • (2023)Verifiable Computing in Avionics for Assuring Computer-Integrity without Replication2023 IEEE/AIAA 42nd Digital Avionics Systems Conference (DASC)10.1109/DASC58513.2023.10311290(1-10)Online publication date: 1-Oct-2023
  • (2023)Resilient Vector Consensus Over Random Dynamic Networks Under Mobile Malicious AttacksThe Computer Journal10.1093/comjnl/bxad04367:3(1076-1086)Online publication date: 28-Apr-2023
  • (2023)Distributed computations in fully-defective networksDistributed Computing10.1007/s00446-023-00452-236:4(501-528)Online publication date: 19-Jun-2023
  • Show More Cited By

View Options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Login options

Full Access

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media