DOI: 10.1145/378580.378646

Scheduling best-effort and real-time pipelined applications on time-shared clusters

Published: 03 July 2001

Abstract

Two important emerging trends are influencing the design, implementation and deployment of high performance parallel systems. The first is on the architectural end, where both economic and technological factors are compelling the use of off-the-shelf computing elements (workstations/PCs and networks) to put together high performance systems called clusters. The second is from the user community, which is finding an increasing number of applications that benefit from such high performance systems. Apart from the scientific applications that have traditionally needed supercomputing power, a large number of graphics, visualization, database, web service and e-commerce applications have started using clusters because of their high processing and storage requirements. These applications have diverse characteristics and can place different Quality-of-Service (QoS) requirements on the underlying system (low response time, high throughput, high I/O demands, guaranteed response/throughput, etc.). Further, clusters running such applications need to cater to a potentially large number of users (or other applications) in a time-shared manner. The underlying system needs to accommodate the requirements of each application, while ensuring that they do not interfere with each other.
This paper focuses on the CPU resources of a cluster and investigates scheduling mechanisms to meet the responsiveness, throughput and guaranteed service requirements of different applications. Specifically, we propose and evaluate three different scheduling mechanisms. These mechanisms have been drawn from traditional solutions on parallel systems (gang scheduling and dynamic coscheduling), and have been extended to accommodate the new criteria under consideration. These mechanisms have been investigated using detailed simulation and workload models to show their pros and cons for different performance metrics.

Published In

SPAA '01: Proceedings of the thirteenth annual ACM symposium on Parallel algorithms and architectures
July 2001
340 pages
ISBN:1581134096
DOI:10.1145/378580

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. clusters
  2. coscheduling
  3. parallel scheduling
  4. simulation

Qualifiers

  • Article

Conference

SPAA01

Acceptance Rates

SPAA '01 Paper Acceptance Rate 34 of 93 submissions, 37%;
Overall Acceptance Rate 447 of 1,461 submissions, 31%

Cited By

  • (2012) Business-driven short-term management of a hybrid IT infrastructure. Journal of Parallel and Distributed Computing, 72:2 (106-119). DOI: 10.1016/j.jpdc.2011.11.001. Online publication date: 1-Feb-2012
  • (2009) A new technique of switch & feedback job scheduling mechanism in a distributed system. Proceedings of the 2009 Spring Simulation Multiconference (1-4). DOI: 10.5555/1639809.1639937. Online publication date: 22-Mar-2009
  • (2009) Achieving efficiency, quality of service and robustness in multi-organizational Grids. Journal of Systems and Software, 82:1 (23-38). DOI: 10.1016/j.jss.2008.03.064. Online publication date: 1-Jan-2009
  • (2007) Improving security for periodic tasks in embedded systems through scheduling. ACM Transactions on Embedded Computing Systems, 6:3 (20-es). DOI: 10.1145/1275986.1275992. Online publication date: 1-Jul-2007
  • (2007) Engineering grid applications and middleware for high performance. Proceedings of the 6th international workshop on Software and performance (141-152). DOI: 10.1145/1216993.1217019. Online publication date: 5-Feb-2007
  • (2006) Process prioritization using output production. ACM Transactions on Multimedia Computing, Communications, and Applications, 2:4 (318-342). DOI: 10.1145/1201730.1201734. Online publication date: 1-Nov-2006
  • (2005) A dynamic and reliability-driven scheduling algorithm for parallel real-time jobs executing on heterogeneous clusters. Journal of Parallel and Distributed Computing, 65:8 (885-900). DOI: 10.1016/j.jpdc.2005.02.003. Online publication date: 1-Aug-2005
  • (2004) Simulation study of multitasking in distributed server systems with variable workload. Simulation Modelling Practice and Theory, 12:7-8 (591-608). DOI: 10.1016/S1569-190X(03)00092-3. Online publication date: Nov-2004
  • (2003) Performance Analysis of Parallel Job Scheduling in Distributed Systems. Proceedings of the 36th annual symposium on Simulation. DOI: 10.5555/786111.786236. Online publication date: 30-Mar-2003
  • (2003) Parallel Job Scheduling in Homogeneous Distributed Systems. SIMULATION, 79:5-6 (287-298). DOI: 10.1177/0037549703037148. Online publication date: 1-May-2003
