DOI: 10.5555/648136.746482

Efficient Replay of PVM Programs

Published: 26 September 1999

Abstract

The paper defines replay of a distributed application as a function of three parameters: depth, width, and length. It addresses the problem of nondeterminism in distributed systems and proposes an efficient approach to tracing the behaviour of a PVM application in order to eliminate races in repeated executions. Detecting races in distributed computations requires a strongly consistent system of vector clocks; therefore, a system of vector clocks was adapted for a dynamic application model. Finally, the paper presents the architecture of a tool supporting replay of PVM applications.
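
To make the role of vector clocks in race detection concrete, the following is a minimal sketch and not the scheme from the paper. It assumes that clocks are stored as dictionaries keyed by a process identifier, so that processes created at run time can appear without resizing a fixed vector, and that two messages race when their send events are causally unordered. All names (`tick`, `merge`, `leq`, `concurrent`) and the process identifiers are hypothetical.

```python
from typing import Dict

Clock = Dict[str, int]  # process id -> logical time; missing entries count as 0


def tick(clock: Clock, pid: str) -> Clock:
    """Return a copy of `clock` with the component of `pid` advanced by one."""
    new = dict(clock)
    new[pid] = new.get(pid, 0) + 1
    return new


def merge(local: Clock, received: Clock, pid: str) -> Clock:
    """Receive rule: component-wise maximum, then advance the receiver."""
    keys = set(local) | set(received)
    merged = {k: max(local.get(k, 0), received.get(k, 0)) for k in keys}
    return tick(merged, pid)


def leq(a: Clock, b: Clock) -> bool:
    """True if every component of `a` is <= the matching component of `b`."""
    return all(a.get(k, 0) <= b.get(k, 0) for k in set(a) | set(b))


def happened_before(a: Clock, b: Clock) -> bool:
    """Strict causal order: a <= b component-wise and a differs from b."""
    return leq(a, b) and not leq(b, a)


def concurrent(a: Clock, b: Clock) -> bool:
    """Neither event causally precedes the other."""
    return not leq(a, b) and not leq(b, a)


# Two senders (hypothetical ids p1, p2) each send one message to p0.
send1 = tick({}, "p1")
send2 = tick({}, "p2")

# p0 receives the message from p1, then performs one more local event.
recv1 = merge({}, send1, "p0")
later = tick(recv1, "p0")

print(concurrent(send1, send2))       # the two sends are causally unordered
print(happened_before(send1, later))  # send1 precedes p0's later event
```

In this sketch the two sends are concurrent, so a receive at `p0` that accepts a message from any sender could match either one. That ordering is the kind of information a replay tool would need to record in order to reproduce the same execution.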


Published In

Proceedings of the 6th European PVM/MPI Users' Group Meeting on Recent Advances in Parallel Virtual Machine and Message Passing Interface
September 1999
530 pages

Publisher

Springer-Verlag, Berlin, Heidelberg
