Abstract
As systems scale up, their mean time to failure decreases drastically. We consider parallel servers that are subject to permanent failures, where only one server needs to survive in order to execute a given task. This failure model is appropriate for at least two types of system: those in which repair cannot take place (e.g. spacecraft) and those with strict deadlines (e.g. navigation systems). Multiple replicas perform the same task in order to improve the reliability of the system. Each server is subject to failure while it is on, and its time to failure is memoryless, i.e. exponentially distributed. We derive expressions for the Laplace transform of the sojourn time distribution of a tagged task, jointly with the probability that the tagged task completes service, for a network of one or more parallel servers with exponential service times and times to failure.
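As a rough illustration of why replication helps under this failure model, the sketch below treats a deliberately simplified case that is not the chapter's full model: there is no queueing, the task starts on all n replicas at once, and the replicas are independent, each with an assumed service rate `mu` and failure rate `gamma`. Under those assumptions a single replica finishes before it fails with probability mu / (mu + gamma), so at least one of n replicas finishes with probability 1 - (gamma / (mu + gamma))^n. The function names and parameter values are hypothetical.

```python
import random


def completion_probability(mu, gamma, n):
    """Closed form for the simplified case (assumption: no queueing,
    n independent replicas started together, exponential service rate mu
    and exponential failure rate gamma on every replica)."""
    return 1.0 - (gamma / (mu + gamma)) ** n


def simulate_completion(mu, gamma, n, trials=100_000, seed=0):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    completed = 0
    for _ in range(trials):
        for _ in range(n):
            service_time = rng.expovariate(mu)      # time the replica needs
            time_to_failure = rng.expovariate(gamma)  # time until it fails
            if service_time < time_to_failure:
                # This replica finishes before failing, so the task completes.
                completed += 1
                break
    return completed / trials


if __name__ == "__main__":
    mu, gamma = 1.0, 0.5  # illustrative rates, not taken from the chapter
    for n in (1, 2, 3):
        exact = completion_probability(mu, gamma, n)
        estimate = simulate_completion(mu, gamma, n)
        print(f"n={n}: closed form {exact:.4f}, simulated {estimate:.4f}")
```

The simulation is only a sanity check on the closed form. The chapter itself goes further and derives the Laplace transform of the sojourn time jointly with the completion probability, which this sketch does not attempt.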
© 2013 Springer International Publishing Switzerland
About this paper
Cite this paper
Qiu, Z., Harrison, P.G. (2013). A Model of Speculative Parallel Scheduling in Networks of Unreliable Sensors. In: Gelenbe, E., Lent, R. (eds) Information Sciences and Systems 2013. Lecture Notes in Electrical Engineering, vol 264. Springer, Cham. https://doi.org/10.1007/978-3-319-01604-7_11
DOI: https://doi.org/10.1007/978-3-319-01604-7_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-01603-0
Online ISBN: 978-3-319-01604-7
eBook Packages: Computer Science, Computer Science (R0)