DOI:10.1145/2503210.2503244
research-article

Exploring portfolio scheduling for long-term execution of scientific workloads in IaaS clouds

Published: 17 November 2013

Abstract

Long-term execution of scientific applications often leads to dynamic workloads and varying application requirements. When the execution uses resources provisioned from IaaS clouds, and thus incurs consumption-related payment, efficient online scheduling algorithms must be found. Portfolio scheduling, which dynamically selects a suitable policy from a broad portfolio, may provide a solution to this problem. However, selecting the right policy online from possibly tens of alternatives remains challenging. In this work, we introduce an abstract model to explore this selection problem. Based on the model, we present a comprehensive portfolio scheduler that includes tens of provisioning and allocation policies. We propose an algorithm that can increase the chance of selecting the best policy in limited time, possibly online. Through trace-based simulation, we evaluate various aspects of our portfolio scheduler, and find performance improvements from 7% to 100% in comparison with the best constituent policies, with high improvement for bursty workloads.
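
To make the selection problem concrete, the following is a minimal sketch of budget-limited policy selection, not the algorithm proposed in the paper. It assumes a single machine, three toy queue-ordering policies (`fcfs`, `sjf`, `ljf`), mean wait time as the cost to minimize, and a budget counted in simulations rather than wall-clock time. All names (`Job`, `mean_wait`, `select_policy`, `last_best`) are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class Job:
    runtime: float  # assumed known runtime, in seconds


# A policy (hypothetical simplification) maps a queue to an execution order.
Policy = Callable[[Sequence[Job]], List[Job]]


def fcfs(queue: Sequence[Job]) -> List[Job]:
    return list(queue)


def sjf(queue: Sequence[Job]) -> List[Job]:
    return sorted(queue, key=lambda j: j.runtime)


def ljf(queue: Sequence[Job]) -> List[Job]:
    return sorted(queue, key=lambda j: j.runtime, reverse=True)


def mean_wait(order: Sequence[Job]) -> float:
    """Simulate back-to-back execution on one machine; return mean wait time."""
    clock, total_wait = 0.0, 0.0
    for job in order:
        total_wait += clock
        clock += job.runtime
    return total_wait / len(order) if order else 0.0


def select_policy(
    portfolio: Dict[str, Policy],
    queue: Sequence[Job],
    max_simulations: int,
    last_best: Optional[str] = None,
) -> Tuple[Optional[str], float]:
    """Simulate at most max_simulations policies and return the cheapest one.

    The previously best policy (if any) is simulated first, so that a tight
    budget still evaluates a plausible candidate. This ordering heuristic is
    an assumption of this sketch.
    """
    # Stable sort: last_best gets key False and moves to the front.
    names = sorted(portfolio, key=lambda name: name != last_best)
    best_name, best_cost = None, float("inf")
    for name in names[:max_simulations]:
        cost = mean_wait(portfolio[name](queue))
        if cost < best_cost:
            best_name, best_cost = name, cost
    return best_name, best_cost


portfolio: Dict[str, Policy] = {"fcfs": fcfs, "ljf": ljf, "sjf": sjf}
queue = [Job(30.0), Job(10.0), Job(20.0)]
chosen, cost = select_policy(portfolio, queue, max_simulations=3)
```

With a budget large enough to simulate every policy, the sketch picks the policy with the lowest simulated cost. With a smaller budget it can only choose among the policies it had time to simulate, which is why the order in which candidates are evaluated matters.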

Cited By

  • (2024) An exploration of online-simulation-driven portfolio scheduling in Workflow Management Systems. Future Generation Computer Systems 161, pp. 345-360. DOI: 10.1016/j.future.2024.07.005. Online publication date: Dec-2024
  • (2023) On the Feasibility of Simulation-Driven Portfolio Scheduling for Cyberinfrastructure Runtime Systems. Job Scheduling Strategies for Parallel Processing, pp. 3-24. DOI: 10.1007/978-3-031-22698-4_1. Online publication date: 12-Jan-2023
  • (2019) SpotWeb. Proceedings of the 28th International Symposium on High-Performance Parallel and Distributed Computing, pp. 1-12. DOI: 10.1145/3307681.3325397. Online publication date: 17-Jun-2019

Published In

SC '13: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
November 2013
1123 pages
ISBN:9781450323789
DOI:10.1145/2503210
  • General Chair: William Gropp
  • Program Chair: Satoshi Matsuoka
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. IaaS cloud
  2. portfolio scheduling
  3. resource provisioning
  4. scientific workloads

Qualifiers

  • Research-article

Conference

SC13

Acceptance Rates

SC '13 Paper Acceptance Rate 91 of 449 submissions, 20%;
Overall Acceptance Rate 1,516 of 6,373 submissions, 24%

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 4
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 21 Sep 2024

Cited By

  • (2019) Portfolio Scheduling for Managing Operational and Disaster-Recovery Risks in Virtualized Datacenters Hosting Business-Critical Workloads. 2019 18th International Symposium on Parallel and Distributed Computing (ISPDC), pp. 94-102. DOI: 10.1109/ISPDC.2019.00022. Online publication date: Jun-2019
  • (2019) A Deadline-Constrained Scheduling Algorithm for Scientific Workflows in Clouds. 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), pp. 98-105. DOI: 10.1109/HPCC/SmartCity/DSS.2019.00029. Online publication date: Aug-2019
  • (2019) Real-Time Scheduling Policy Selection from Queue and Machine States. 2019 19th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), pp. 381-390. DOI: 10.1109/CCGRID.2019.00052. Online publication date: May-2019
  • (2018) Analysis of Potential Online Scheduling Improvements by Real-Time Strategy Selection. 2018 Symposium on High Performance Computing Systems (WSCAD), pp. 1-7. DOI: 10.1109/WSCAD.2018.00011. Online publication date: Oct-2018
  • (2018) Online Tuning of EASY-Backfilling using Queue Reordering Policies. IEEE Transactions on Parallel and Distributed Systems 29:10, pp. 2304-2316. DOI: 10.1109/TPDS.2018.2820699. Online publication date: 1-Oct-2018
  • (2018) Towards Efficient Resource Allocation for Heterogeneous Workloads in IaaS Clouds. IEEE Transactions on Cloud Computing 6:1, pp. 264-275. DOI: 10.1109/TCC.2015.2481400. Online publication date: 1-Jan-2018
  • (2018) Enabling Demand Response for HPC Systems through Power Capping and Node Scaling. 2018 IEEE 20th International Conference on High Performance Computing and Communications; IEEE 16th International Conference on Smart City; IEEE 4th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), pp. 789-796. DOI: 10.1109/HPCC/SmartCity/DSS.2018.00133. Online publication date: Jun-2018
