research-article

Open access

Optimal Resource Allocation for Elastic and Inelastic Jobs

Authors:

Mor Harchol-Balter,

Benjamin Moseley,

Justin WhitehouseAuthors Info & Claims

SPAA '20: Proceedings of the 32nd ACM Symposium on Parallelism in Algorithms and Architectures

Pages 75 - 87

https://doi.org/10.1145/3350755.3400265

Published: 09 July 2020 Publication History

Abstract

Modern data centers are tasked with processing heterogeneous workloads consisting of various classes of jobs. These classes differ in their arrival rates, size distributions, and job parallelizability. With respect to parallelizability, some jobs are elastic, meaning they can parallelize linearly across any number of servers. Other jobs are inelastic, meaning they can only run on a single server. Although job classes can differ drastically, they are typically forced to share a single cluster. When sharing a cluster among heterogeneous jobs, one must decide how to allocate servers to each job at every moment in time. In this paper, we design and analyze allocation policies which aim to minimize the mean response time across jobs, where a job's response time is the time from when it arrives until it completes.

We model this problem in a stochastic setting where each job may be elastic or inelastic. Job sizes are drawn from exponential distributions, but are unknown to the system. We show that, in the common case where elastic jobs are larger on average than inelastic jobs, the optimal allocation policy is Inelastic-First, giving inelastic jobs preemptive priority over elastic jobs. We obtain this result by introducing a novel sample path argument. We also show that there exist cases where Elastic-First (giving priority to elastic jobs) performs better than Inelastic-First. We provide the first analysis of mean response time under both Elastic-First and Inelastic-First by leveraging techniques for solving high-dimensional Markov chains.

References

[1]

I.J.B.F. Adan, G.J. van Houtum, and J. van der Wal. Upper and lower bounds for the waiting time in the symmetric shortest queue system. Annals of Operations Research, 48:197--217, 1994.

[2]

Kunal Agrawal, I-Ting Angelina Lee, Jing Li, Kefu Lu, and Benjamin Moseley. Practically efficient scheduler for minimizing average flow time of parallel jobs. In 2019 IEEE International Parallel and Distributed Processing Symposium, IPDPS 2019, Rio de Janeiro, Brazil, May 20--24, 2019, pages 134--144. IEEE, 2019.

[3]

Kunal Agrawal, Jing Li, Kefu Lu, and Benjamin Moseley. Scheduling parallel DAG jobs online to minimize average flow time. In Robert Krauthgamer, editor, Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, USA, January 10--12, 2016, pages 176--189. SIAM, 2016.

Digital Library

[4]

S. Anand, Naveen Garg, and Amit Kumar. Resource augmentation for weighted flow-time explained by dual fitting. In Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2012, Kyoto, Japan, January 17--19, 2012, pages 1228--1241, 2012.

[5]

Spyros Angelopoulos, Giorgio Lucarelli, and Nguyen Kim Thang. Primal-dual and dual-fitting analysis of online scheduling algorithms for generalized flow-time problems. Algorithmica, 81(9):3391--3421, 2019.

Digital Library

[6]

Eitan Bachmat and Hagit Sarfati. Analysis of size interval task assignment policies. Performance Evaluation Review, 36(2):107--109, 2008.

Digital Library

[7]

Benjamin Berg, Jan-Pieter Dorsman, and Mor Harchol-Balter. Towards optimalityin parallel scheduling. Proceedings of the ACM on Measurement and Analysis of Computing Systems, 1(2):1--30, 2018.

Digital Library

[8]

Benjamin Berg, Mor Harchol-Balter, Benjamin Moseley, Weina Wang, and Justin Whitehouse. Optimal resource allocation for elastic and inelastic jobs. https://arxiv.org/abs/2005.09745.

[9]

Carl Bussema and Eric Torng. Greedy multiprocessor server scheduling. Oper.Res. Lett., 34(4):451--458, 2006.

Digital Library

[10]

Jivitej S. Chadha, Naveen Garg, Amit Kumar, and V. N. Muralidhara. A competitive algorithm for minimizing weighted flow time on unrelated machines with speed augmentation. In Michael Mitzenmacher, editor,Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC 2009, Bethesda, MD, USA, May31 - June 2, 2009, pages 679--684. ACM, 2009.

[11]

Richard W Conway, Louis W Miller, and William L Maxwell. Theory of scheduling.Dover, 2003.

[12]

Jeffrey Dean and Sanjay Ghemawat. Mapreduce: simplified data processing on large clusters. Communications of the ACM, 51(1):107--113, 2008.

Digital Library

[13]

Christina Delimitrou and Christos Kozyrakis. Quasar: resource-efficient and qos-aware cluster management. ACM SIGPLAN Notices, 49(4):127--144, 2014.

Digital Library

[14]

Jeff Edmonds. Scheduling in the dark. Theor. Comput. Sci., 235(1):109--141, 2000.

Digital Library

[15]

Jeff Edmonds, Sungjin Im, and Benjamin Moseley. Online scalable scheduling for the k-norms of flow time without conservation of work. In Dana Randall, editor, Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2011, San Francisco, California, USA, January 23--25, 2011, pages 109--119. SIAM, 2011.

Digital Library

[16]

Jeff Edmonds and Kirk Pruhs. Scalably scheduling processes with arbitrary speedup curves. In Proceedings of the twentieth annual ACM-SIAM symposium on Discrete algorithms, pages 685--692. SIAM, 2009.

Digital Library

[17]

Kyle Fox and Benjamin Moseley. Online scheduling on identical machines using SRPT. In Dana Randall, editor, Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2011, San Francisco, California, USA, January 23--25, 2011, pages 120--128. SIAM, 2011.

Digital Library

[18]

Anshul Gandhi and Mor Harchol-Balter. How data center size impacts the effectiveness of dynamic power management. In 2011 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pages 1164--1169. IEEE, 2011.

[19]

Isaac Grosof, Ziv Scully, and Mor Harchol-Balter. Script for multiserver systems. Performance Evaluation, 127:154--175, 2018.

[20]

Abhishek Gupta, Bilge Acun, Osman Sarood, and Laxmikant V Kalé. Towards realizing the potential of malleable jobs. In 2014 21st International Conference on High Performance Computing (HiPC), pages 1--10. IEEE, 2014.

[21]

Anupam Gupta, Sungjin Im, Ravishankar Krishnaswamy, Benjamin Moseley,and Kirk Pruhs. Scheduling heterogeneous processors isn't as easy as you think. In Yuval Rabani, editor, Proceedings of the Twenty-Third Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2012, Kyoto, Japan, January 17--19, 2012, pages 1242--1253. SIAM, 2012.

[22]

Varun Gupta, Benjamin Moseley, Marc Uetz, and Qiaomin Xie. Stochastic online scheduling on unrelated machines. In Integer Programming and Combinatorial Optimization - 19th International Conference, IPCO 2017, Waterloo, ON, Canada, June 26--28, 2017, Proceedings, pages 228--240, 2017.

[23]

Varun Gupta, Karl Sigman, Mor Harchol-Balter, and Ward Whitt. In sensitivity for ps server farms with jsq routing. ACM SIGMETRICS Performance Evaluation Review, 35(2):24--26, 2007.

Digital Library

[24]

Mor Harchol-Balter. Task assignment with unknown duration. Journal of the ACM, 49(2):260--288, March 2002.

Digital Library

[25]

Mor Harchol-Balter. Performance modeling and design of computer systems: queueing theory in action. Cambridge University Press, 2013.

[26]

Mor Harchol-Balter, Cuihong Li, Takayuki Osogami, Alan Scheller-Wolf, and Mark Squillante. Cycle stealing under immediate dispatch task assignment. In Proceedings of the 15th ACM Symposium on Parallel Algorithms and Architectures, pages 274--285, San Diego, CA, June 2003.

Digital Library

[27]

Mor Harchol-Balter, Cuihong Li, Takayuki Osogami, Alan Scheller-Wolf, and Mark Squillante. Task assignment with cycle stealing under central queue. In Proceedings of the 23rd International Conference on Distributed Computing Systems, pages 628--637, Providence, RI, May 2003.

[28]

Mor Harchol-Balter, Takayuki Osogami, Alan Scheller-Wolf, and Adam Wierman. Multi-server queueing systems with multiple priority classes. Queueing Systems: Theory and Applications, 51(3--4):331--360, 2005.

[29]

Mor Harchol-Balter, Alan Scheller-Wolf, and Andrew Young. Surprising results on task assignment in server farms with high-variability workloads. In ACM Sigmetrics 2009 Conference on Measurement and Modeling of Computer Systems, pages 287--298, 2009.

Digital Library

[30]

Benjamin Hindman, Andy Konwinski, Matei Zaharia, Ali Ghodsi, Anthony D Joseph, Randy H Katz, Scott Shenker, and Ion Stoica. Mesos: A platform for fine-grained resource sharing in the data center. In NSDI, volume 11, pages 22--22,2011.

Digital Library

[31]

Sungjin Im, Benjamin Moseley, Kirk Pruhs, and Eric Torng. Competitively scheduling tasks with intermediate parallelizability. ACM Transactions on Parallel Computing (TOPC), 3(1):1--19, 2016.

[32]

Cheeha Kim and Ashok K Agrawala. Analysis of the fork-join queue. IEEE Transactions on computers, 38(2):250--255, 1989.

Digital Library

[33]

Leonard Kleinrock. Queueing systems, volume 2: Computer applications, volume 66. Wiley New York, 1976.

[34]

Guy Latouche and V. Ramaswami. Introduction to Matrix Analytic Methods in Stochastic Modeling. ASA-SIAM, Philadelphia, 1999.

[35]

Stefano Leonardi and Danny Raz. Approximating total flow time on parallel machines. Journal of Computer and System Sciences, 73(6):875--891, 2007.

Digital Library

[36]

Xiangru Lian, Ce Zhang, Huan Zhang, Cho-Jui Hsieh, Wei Zhang, and Ji Liu. Can decentralized algorithms outperform centralized algorithms? a case study for decentralized parallel stochastic gradient descent. In Advances in Neural Information Processing Systems, pages 5330--5340, 2017.

Digital Library

[37]

Richard Liaw, Romil Bhardwaj, Lisa Dunlap, Yitian Zou, Joseph E Gonzalez, Ion Stoica, and Alexey Tumanov. Hypersched: Dynamic resource reallocation for model development on a deadline. In Proceedings of the ACM Symposium on Cloud Computing, pages 61--73, 2019.

Digital Library

[38]

David Lo, Liqun Cheng, Rama Govindaraju, Parthasarathy Ranganathan, and Christos Kozyrakis. Heracles: Improving resource efficiency at scale. In Proceedings of the 42nd Annual International Symposium on Computer Architecture, pages 450--462, 2015.

Digital Library

[39]

Jason Mars, Lingjia Tang, Robert Hundt, Kevin Skadron, and Mary Lou Soffa. Bubble-up: Increasing utilization in modern warehouse scale computers via sensible co-locations. In Proceedings of the 44th annual IEEE/ACM International Symposium on Microarchitecture, pages 248--259, 2011.

Digital Library

[40]

Robert McNaughton. Scheduling with deadlines and loss functions. Management Science, 6(1):1--12, 1959.

Digital Library

[41]

Philipp Moritz, Robert Nishihara, Stephanie Wang, Alexey Tumanov, Richard Liaw, Eric Liang, Melih Elibol, Zongheng Yang, William Paul, Michael I Jordan, et al. Ray: A distributed framework for emerging AI applications. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI18), pages 561--577, 2018.

[42]

Randolph Nelson and Asser Tantawi. Approximate analysis of fork/join synchronization in parallel queues. Transactions on Computers, 37(6):739--743, 1988.

Digital Library

[43]

Marcel F. Neuts. Matrix-Geometric Solutions in Stochastic Models. Johns Hopkins University Press, 1981.

[44]

Marcel F. Neuts. Structured Stochastic Matrices of M/G/1 Type and Their Applications. Marcel Dekker, 1989.

[45]

Takayuki Osogami and Mor Harchol-Balter. Closed form solutions for mapping general distributions to quasi-minimal PH distributions. Performance Evaluation, 63(6):524--552, 2006.

Digital Library

[46]

Takayuki Osogami, Mor Harchol-Balter, and Alan Scheller-Wolf. Analysis of cycle stealing with switching times and thresholds. In Proceedings of ACM Sigmetrics, pages 184--195, San Diego, CA, June 2003.

Digital Library

[47]

Yanghua Peng, Yixin Bao, Yangrui Chen, Chuan Wu, and Chuanxiong Guo. Optimus: an efficient dynamic resource scheduler for deep learning clusters. In Proceedings of the Thirteenth EuroSys Conference, pages 1--14, 2018.

Digital Library

[48]

Donald R Smith. A new proof of the optimality of the shortest remaining processing time discipline.Operations Research, 26(1):197--199, 1978.

Digital Library

[49]

Abhishek Verma, Luis Pedrosa, Madhukar Korupolu, David Oppenheimer, EricTune, and John Wilkes. Large-scale cluster management at google with borg. In Proceedings of the European Conference on Computer Systems, pages 1--17, 2015.

Digital Library

[50]

Weina Wang, Mor Harchol-Balter, Haotian Jiang, Alan Scheller-Wolf, and Rayadurgam Srikant. Delay asymptotics and bounds for multitask parallel jobs. Queueing Systems, 91(3--4):207--239, 2019.

Cited By

Ghanbarian SMukhopadhyay AMazumdar RGuillemin FPapavassiliou SSchmid S(2024)On Optimal Server Allocation for Moldable Jobs with Concave Speed-UpProceedings of the Twenty-fifth International Symposium on Theory, Algorithmic Foundations, and Protocol Design for Mobile Networks and Mobile Computing10.1145/3641512.3686370(191-200)Online publication date: 14-Oct-2024
https://dl.acm.org/doi/10.1145/3641512.3686370
Huang JGolubchik LHuang L(2024)When Lyapunov Drift Based Queue Scheduling Meets Adversarial Bandit LearningIEEE/ACM Transactions on Networking10.1109/TNET.2024.337475532:4(3034-3044)Online publication date: Aug-2024
https://doi.org/10.1109/TNET.2024.3374755
Marin AAjmone Marsan MMeo MSereno M(2024)Queuing models of links carrying streaming and elastic servicesComputer Networks10.1016/j.comnet.2024.110306244(110306)Online publication date: May-2024
https://doi.org/10.1016/j.comnet.2024.110306
Show More Cited By

Index Terms

Optimal Resource Allocation for Elastic and Inelastic Jobs
1. Mathematics of computing
  1. Probability and statistics
    1. Statistical paradigms
      1. Queueing theory
    2. Stochastic processes
      1. Markov processes
2. Theory of computation
  1. Design and analysis of algorithms
    1. Online algorithms
      1. Online learning algorithms
        Scheduling algorithms
  2. Models of computation
    1. Concurrency
      1. Parallel computing models

Recommendations

Scheduling of deteriorating jobs with release dates to minimize the maximum lateness

In this paper, we consider the problem of scheduling n deteriorating jobs with release dates on a single (batching) machine. Each job's processing time is a simple linear function of its starting time. The objective is to minimize the maximum lateness. ...
Scheduling jobs under decreasing linear deterioration

This paper considers the scheduling problems under decreasing linear deterioration. Deterioration of a job means that its processing time is a function of its execution start time. Optimal algorithms are presented respectively for single machine ...
Single machine parallel-batch scheduling with deteriorating jobs

We consider several single machine parallel-batch scheduling problems in which the processing time of a job is a linear function of its starting time. We give a polynomial-time algorithm for minimizing the maximum cost, an O(n⁵) time algorithm for ...

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences

SPAA '20: Proceedings of the 32nd ACM Symposium on Parallelism in Algorithms and Architectures

July 2020

601 pages

ISBN:9781450369350

DOI:10.1145/3350755

General Chair:
Christian Scheideler
Institut fuer Informatik Universitaet Paderborn Fuerstenallee 11
,
Program Chair:
Michael Spear

Copyright © 2020 Owner/Author.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Sponsors

SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
SIGARCH: ACM Special Interest Group on Computer Architecture
EATCS: European Association for Theoretical Computer Science

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 July 2020

Check for updates

Author Tags

Qualifiers

Research-article

Funding Sources

Conference

SPAA '20

Sponsor:

SPAA '20: 32nd ACM Symposium on Parallelism in Algorithms and Architectures

July 15 - 17, 2020

Virtual Event, USA

Acceptance Rates

Overall Acceptance Rate 447 of 1,461 submissions, 31%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

9
Total Citations
View Citations
566
Total Downloads

Downloads (Last 12 months)192
Downloads (Last 6 weeks)27

Reflects downloads up to 14 Nov 2024

Other Metrics

View Author Metrics

Citations

Cited By

Ghanbarian SMukhopadhyay AMazumdar RGuillemin FPapavassiliou SSchmid S(2024)On Optimal Server Allocation for Moldable Jobs with Concave Speed-UpProceedings of the Twenty-fifth International Symposium on Theory, Algorithmic Foundations, and Protocol Design for Mobile Networks and Mobile Computing10.1145/3641512.3686370(191-200)Online publication date: 14-Oct-2024
https://dl.acm.org/doi/10.1145/3641512.3686370
Huang JGolubchik LHuang L(2024)When Lyapunov Drift Based Queue Scheduling Meets Adversarial Bandit LearningIEEE/ACM Transactions on Networking10.1109/TNET.2024.337475532:4(3034-3044)Online publication date: Aug-2024
https://doi.org/10.1109/TNET.2024.3374755
Marin AAjmone Marsan MMeo MSereno M(2024)Queuing models of links carrying streaming and elastic servicesComputer Networks10.1016/j.comnet.2024.110306244(110306)Online publication date: May-2024
https://doi.org/10.1016/j.comnet.2024.110306
Vesilo R(2024)Core allocation to minimize total flow time in a multicore system in the presence of a processing time constraintQueueing Systems10.1007/s11134-024-09923-0108:3-4(475-577)Online publication date: 26-Aug-2024
https://doi.org/10.1007/s11134-024-09923-0
Dudin ADudin SDudina O(2023)Analysis of a Queueing System with Mixed Service DisciplineMethodology and Computing in Applied Probability10.1007/s11009-023-10042-125:2Online publication date: 31-May-2023
https://dl.acm.org/doi/10.1007/s11009-023-10042-1
Berg BWhitehouse JMoseley BWang WHarchol-Balter M(2022)The Case for Phase-Aware Scheduling of Parallelizable JobsACM SIGMETRICS Performance Evaluation Review10.1145/3529113.352913649:3(65-66)Online publication date: 25-Mar-2022
https://dl.acm.org/doi/10.1145/3529113.3529136
Grosof IHarchol-Balter MScheller-Wolf A(2022)WCFS: a new framework for analyzing multiserver systemsQueueing Systems: Theory and Applications10.1007/s11134-022-09848-6102:1-2(143-174)Online publication date: 1-Oct-2022
https://dl.acm.org/doi/10.1007/s11134-022-09848-6
Berg BWhitehouse JMoseley BWang WHarchol-Balter M(2021)The case for phase-aware scheduling of parallelizable jobsPerformance Evaluation10.1016/j.peva.2021.102246153:COnline publication date: 30-Dec-2021
https://dl.acm.org/doi/10.1016/j.peva.2021.102246
Harchol-Balter M(2021)Open problems in queueing theory inspired by datacenter computingQueueing Systems: Theory and Applications10.1007/s11134-020-09684-697:1-2(3-37)Online publication date: 27-Jan-2021
https://dl.acm.org/doi/10.1007/s11134-020-09684-6

View Options

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Get Access

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

Media

Figures

Other

Tables

View Table of Contents