Part of the book series: Informatik-Fachberichte (INFORMATIK, volume 147)

Abstract

This paper describes a technique for distributed recovery in multiprocessor ring configurations. The technique was developed and implemented for DIRMU 25, a 25-processor system in operation at the University of Erlangen-Nuremberg. The paper first gives a short overview of the DIRMU hardware architecture and the distributed operating system DIRMOS, then describes the steps of distributed recovery using distributed system checkpoints. The efficiency of the technique is assessed by measuring the runtime overhead of a realistic application (2D Poisson multigrid) and comparing it with recovery techniques that use central system checkpoints.
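The abstract does not spell out the recovery steps, so the following is only a minimal sketch of one way rollback with distributed checkpoints in a ring could look. It is not the DIRMU/DIRMOS implementation. The assumptions are that each node keeps its own checkpoint and also deposits a copy with its ring successor, and that after a node failure every node rolls back to the last checkpoint while the failed node fetches its copy from the successor. The names `Node`, `Ring`, `checkpoint`, and `recover` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    ident: int
    state: int = 0                        # stand-in for application data
    own_checkpoint: Optional[int] = None  # locally stored checkpoint
    neighbor_copy: Optional[int] = None   # checkpoint held for the predecessor


class Ring:
    """Toy ring of nodes (assumed layout, not the DIRMU 25 topology)."""

    def __init__(self, size: int) -> None:
        self.nodes = [Node(i) for i in range(size)]

    def successor(self, ident: int) -> Node:
        return self.nodes[(ident + 1) % len(self.nodes)]

    def step(self) -> None:
        # One unit of application progress on every node.
        for node in self.nodes:
            node.state += 1

    def checkpoint(self) -> None:
        # Assumption: every node saves locally and deposits a copy
        # with its ring successor, so no central store is needed.
        for node in self.nodes:
            node.own_checkpoint = node.state
            self.successor(node.ident).neighbor_copy = node.state

    def recover(self, failed: int) -> None:
        # The failed node has lost its memory; its successor still
        # holds the copy deposited at the last checkpoint.
        saved = self.successor(failed).neighbor_copy
        if saved is None:
            raise RuntimeError("no checkpoint available for failed node")
        for node in self.nodes:
            if node.ident == failed:
                node.state = saved
                node.own_checkpoint = saved
            else:
                # Surviving nodes roll back to stay consistent.
                node.state = node.own_checkpoint
```

In this sketch the cost of a checkpoint is one transfer per node to a direct neighbor. A central scheme would instead route every checkpoint to a single node, which is the alternative the paper compares against.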

Copyright information

© 1987 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lehmann, L., Brehm, J. (1987). Rollback Recovery in Multiprocessor Ring Configurations. In: Belli, F., Görke, W. (eds) Fehlertolerierende Rechensysteme / Fault-Tolerant Computing Systems. Informatik-Fachberichte, vol 147. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45628-2_19

  • DOI: https://doi.org/10.1007/978-3-642-45628-2_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-18294-8

  • Online ISBN: 978-3-642-45628-2

  • eBook Packages: Springer Book Archive
