DOI: 10.1145/2749246.2749266
short-paper

Planning and Optimization in TORQUE Resource Manager

Published: 15 June 2015

Abstract

We present a unique advanced job scheduler for the widely used TORQUE Resource Manager. Unlike common schedulers that rely on queuing and heuristics, our solution uses planning (job schedule construction) and schedule optimization by a local search-inspired metaheuristic, achieving better predictability, performance and fairness than common queue-based approaches. The suitability and good performance of our solution are demonstrated both by "synthetic" experiments and by real-life performance results from the deployment of our scheduler in the production infrastructure of the Czech Centre for Education, Research and Innovation in ICT (CERIT Scientific Cloud).

Published In

HPDC '15: Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing
June 2015
296 pages
ISBN:9781450335508
DOI:10.1145/2749246
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. metaheuristic
  2. optimization
  3. planning
  4. scheduler

Qualifiers

  • Short-paper

Funding Sources

  • Ministry of Education Youth and Sports of the Czech Republic
  • Grant Agency of the Czech Republic

Conference

HPDC'15

Acceptance Rates

HPDC '15 Paper Acceptance Rate 19 of 116 submissions, 16%;
Overall Acceptance Rate 166 of 966 submissions, 17%

Article Metrics

  • Downloads (Last 12 months): 15
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 13 Nov 2024

Cited By

  • (2023) ZeroSum: User Space Monitoring of Resource Utilization and Contention on Heterogeneous HPC Systems. Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, pages 685-695. DOI: 10.1145/3624062.3624145. Online publication date: 12-Nov-2023
  • (2023) Containerization for High Performance Computing Systems: Survey and Prospects. IEEE Transactions on Software Engineering 49:4, pages 2722-2740. DOI: 10.1109/TSE.2022.3229221. Online publication date: 1-Apr-2023
  • (2021) Container orchestration on HPC systems through Kubernetes. Journal of Cloud Computing 10:1. DOI: 10.1186/s13677-021-00231-z. Online publication date: 22-Feb-2021
  • (2021) Containerization and Orchestration on HPC Systems. Sustained Simulation Performance 2019 and 2020, pages 133-147. DOI: 10.1007/978-3-030-68049-7_10. Online publication date: 2-Mar-2021
  • (2020) SwarmForm: A Distributed Workflow Management System with Task Clustering. 2020 20th International Conference on Advances in ICT for Emerging Regions (ICTer), pages 35-40. DOI: 10.1109/ICTer51097.2020.9325496. Online publication date: 4-Nov-2020
  • (2020) Optimising AI Training Deployments using Graph Compilers and Containers. 2020 IEEE High Performance Extreme Computing Conference (HPEC), pages 1-8. DOI: 10.1109/HPEC43674.2020.9286153. Online publication date: 22-Sep-2020
  • (2020) Container Orchestration on HPC Systems. 2020 IEEE 13th International Conference on Cloud Computing (CLOUD), pages 34-36. DOI: 10.1109/CLOUD49709.2020.00017. Online publication date: Oct-2020
  • (2020) Multiverse: Dynamic VM Provisioning for Virtualized High Performance Computing Clusters. 2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID), pages 131-141. DOI: 10.1109/CCGrid49817.2020.00-80. Online publication date: May-2020
  • (2020) Static job scheduling for environments with vertical elasticity. Concurrency and Computation: Practice and Experience 32:19. DOI: 10.1002/cpe.5761. Online publication date: 13-Apr-2020
  • (2019) A hybrid scheduling platform: a runtime prediction reliability aware scheduling platform to improve HPC scheduling performance. The Journal of Supercomputing. DOI: 10.1007/s11227-019-03004-3. Online publication date: 28-Sep-2019
