DOI: 10.1145/335231.335241

A simulation-based study of scheduling mechanisms for a dynamic cluster environment

Published: 08 May 2000

Abstract

Scheduling processes onto the processors of a parallel machine has always been an important and challenging area of research. The problem becomes even more crucial and difficult as off-the-shelf workstations, operating systems, and high-bandwidth networks are increasingly used to build cost-effective clusters for demanding applications. Clusters are gaining acceptance not just in scientific applications that need supercomputing power, but also in domains such as databases, web services, and multimedia, which place diverse Quality-of-Service (QoS) demands on the underlying system. Further, these applications differ widely in their computation, communication, and I/O requirements, which makes conventional parallel scheduling solutions, such as space sharing or coscheduling, unattractive. At the same time, leaving each node's native operating system to make scheduling decisions independently can lead to ineffective use of system resources whenever processes communicate. Instead, an emerging class of dynamic coscheduling mechanisms, which take remedial actions to guide the system toward coscheduled execution without requiring explicit synchronization, offers considerable promise for cluster scheduling. Using a detailed simulator, this paper evaluates the pros and cons of different dynamic coscheduling alternatives and compares them with traditional coscheduling and with no coordinated scheduling at all. The impact of dynamic job arrivals, job characteristics, and different system parameters on these alternatives is evaluated in terms of several performance criteria.
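As a rough illustration of what a "remedial action" without explicit synchronization can look like, the sketch below models a two-phase spin-then-block receive, the waiting rule commonly associated with implicit coscheduling. It is not taken from the paper's simulator: the function name, its parameters, and the cost model are all hypothetical and chosen only to make the idea concrete.

```python
def spin_block_wait(arrival_time, spin_limit, context_switch_cost):
    """Model one receive under a spin-then-block rule (hypothetical sketch).

    All times share one arbitrary unit and are measured from the moment
    the receiver starts waiting. Returns a tuple:
    (CPU time spent spinning, time at which the receiver resumes).
    """
    if arrival_time <= spin_limit:
        # The message arrives while the receiver is still spinning, which
        # suggests the sender is currently scheduled; stay on the CPU.
        return arrival_time, arrival_time
    # The spin budget is exhausted: yield the CPU to another process and
    # pay one context switch to resume after the message arrives.
    return spin_limit, arrival_time + context_switch_cost


# A prompt reply keeps the receiver running; a late one makes it block.
print(spin_block_wait(5, 10, 3))
print(spin_block_wait(50, 10, 3))
```

Under these assumptions, a short spin limit frees the processor quickly when the partner is not scheduled, while a long one avoids context switches when it is; trade-offs of this kind are what a simulation study can quantify.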



Information & Contributors

Information

Published In

ICS '00: Proceedings of the 14th international conference on Supercomputing
May 2000
347 pages
ISBN:1581132700
DOI:10.1145/335231

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. clusters
  2. coscheduling
  3. dynamic coscheduling
  4. parallel scheduling
  5. simulation

Qualifiers

  • Article

Conference

ICS00: International Conference on Supercomputing
May 8-11, 2000
Santa Fe, New Mexico, USA

Acceptance Rates

ICS '00 Paper Acceptance Rate 33 of 122 submissions, 27%;
Overall Acceptance Rate 629 of 2,180 submissions, 29%
