Abstract
Rollback recovery schemes are used in fault-tolerant distributed systems to minimize the computation lost when failures occur. One-level recovery schemes do not distinguish between different types of failures or their relative frequency of occurrence, and therefore tolerate all failures with the same overhead. Two-level recovery schemes aim to provide low-overhead protection against the more probable failures, while protecting against other failures with possibly higher overhead. In this paper, we analyze a two-level recovery scheme due to Vaidya, taking the probability of task completion on a system with limited repairs as the performance metric.
References
K.M. Chandy, J.C. Browne, C.W. Dissly, and W.R. Uhrig, Analytic Models for Rollback and Recovery Strategies in Data Base Systems, IEEE Trans. Software Eng. 1 (1975) 100–110.
S. Garg and K.F. Wong, Analysis of an Improved Distributed Checkpointing Algorithm, Technical Report WUCS-93-37, Dept. of Computer Science, Washington Univ., June 1993.
E. Gelenbe, A Model for Roll-Back Recovery with Multiple Checkpoints, Proc. Second Int’l Conf. Software Eng. (1976) 251–255.
E. Gelenbe, Model of Information Recovery Using the Method of Multiple Checkpointing, Automation and Control 4 (1976) 251–255.
V.F. Nicola, Checkpointing and the Modeling of Program Execution Time, in: M.R. Lyu (Ed.), Software Fault Tolerance, John Wiley & Sons (1995) 167–188.
V.F. Nicola and J.M. van Spanje, Comparative Analysis of Different Models of Checkpointing and Recovery, IEEE Trans. Software Eng. 16 (1990) 807–821.
A.N. Tantawi and M. Ruschitzka, Performance Analysis of Checkpointing Strategies, ACM Trans. Computer Systems 2 (1984) 123–144.
N.H. Vaidya, A Case for Two-Level Recovery Schemes, IEEE Trans. Computers 47 (6) (1998) 656–666.
J.W. Young, A First Order Approximation to the Optimum Checkpoint Interval, Comm. ACM 17 (1974) 530–531.
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Panda, B.S., Das, S.K. (2002). Performance Evaluation of a Two Level Error Recovery Scheme for Distributed Systems. In: Das, S.K., Bhattacharya, S. (eds) Distributed Computing. IWDC 2002. Lecture Notes in Computer Science, vol 2571. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36385-8_9
DOI: https://doi.org/10.1007/3-540-36385-8_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00355-7
Online ISBN: 978-3-540-36385-9
eBook Packages: Springer Book Archive