The implementation of Dynamite: an environment for migrating PVM tasks

Published: 01 July 2000

Abstract

Parallel programming on clusters of workstations is increasingly attractive, but dynamic load balancing is needed to make efficient use of the available resources. Dynamite provides dynamic load balancing for PVM applications running under Linux and Solaris. It supports migration of individual tasks between nodes in a manner transparent both to the application programmer and to the user, implemented entirely in user space. Dynamically linked executables are supported, as are tasks with open files and with direct PVM connections. In this paper, we describe the technical aspects of migrating message-passing tasks.



Published In

ACM SIGOPS Operating Systems Review, Volume 34, Issue 3
July 2000
76 pages
ISSN: 0163-5980
DOI: 10.1145/506117

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. PVM
  2. cluster computing
  3. message-passing
  4. task migration

Qualifiers

  • Article
