Stochastic tail-phase optimization for bag-of-tasks execution in clouds

AM Oprescu, T Kielmann… - 2012 IEEE Fifth …, 2012 - ieeexplore.ieee.org
2012 IEEE Fifth International Conference on Utility and Cloud …, 2012
Elastic applications like bags of tasks benefit greatly from Infrastructure as a Service (IaaS) clouds that let users allocate compute resources on demand, charging based on reserved time intervals. Users, however, still need guidance for mapping their applications onto multiple IaaS offerings, both minimizing execution time and respecting budget limitations. For budget-controlled execution of bags of tasks, we built BaTS, a scheduler that estimates possible budget and makespan combinations using a tiny task sample, and then executes a bag within the user's budget constraints. Previous work has shown the efficacy of this approach. There remains, however, the risk of outlier tasks causing the execution to exceed the predicted makespan. In this work, we present a stochastic optimization of the tail phase of BaTS's execution. The main idea is to use the otherwise idle machines up until the end of their (already paid-for) allocation time. Using the task completion time information acquired during the execution, BaTS decides which tasks to replicate onto idle machines in the tail phase, reducing the makespan and improving the tolerance to outlier tasks. Our evaluation results show that this effect is robust with respect to the quality of runtime predictions and is strongest with more expensive schedules in which many fast machines are available.
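
The tail-phase idea described in the abstract can be illustrated with a minimal Python sketch. It is not the BaTS algorithm; the abstract does not give the decision rule. It assumes a simple one: estimate a running task's expected remaining time from the runtimes of already completed tasks, and replicate it onto an idle machine only if the replica is expected to finish both within that machine's paid-for time and before the original. All names (`IdleMachine`, `expected_remaining`, `plan_replicas`), the relative `speed` factor, and the use of minutes are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class IdleMachine:
    name: str
    speed: float           # assumed relative speed; 1.0 = reference machine
    paid_time_left: float  # minutes left in the already paid-for interval


def expected_remaining(elapsed, observed_runtimes):
    """Empirical E[T - elapsed | T > elapsed] from completed-task runtimes."""
    longer = [t for t in observed_runtimes if t > elapsed]
    if not longer:
        # The task has outlived every observed runtime: treat it as an
        # outlier whose remaining time is unknown.
        return float("inf")
    return mean(longer) - elapsed


def plan_replicas(running, idle_machines, observed_runtimes):
    """Greedily assign replicas of running tasks to idle machines.

    running: dict mapping task id -> minutes the task has run so far.
    Returns a list of (task_id, machine_name) pairs.
    """
    avg_runtime = mean(observed_runtimes)
    # Consider tasks with the largest expected remaining time first.
    tasks = sorted(
        running,
        key=lambda t: expected_remaining(running[t], observed_runtimes),
        reverse=True,
    )
    # Prefer the fastest idle machines.
    free = sorted(idle_machines, key=lambda m: m.speed, reverse=True)
    plan = []
    for task_id in tasks:
        remaining = expected_remaining(running[task_id], observed_runtimes)
        for machine in free:
            replica_time = avg_runtime / machine.speed
            fits_paid_time = replica_time <= machine.paid_time_left
            beats_original = replica_time < remaining
            if fits_paid_time and beats_original:
                plan.append((task_id, machine.name))
                free.remove(machine)
                break
    return plan


if __name__ == "__main__":
    observed = [10, 12, 11, 9, 30]            # runtimes of completed tasks
    running = {"a": 25.0, "b": 2.0, "c": 40.0}
    idle = [IdleMachine("slow", 1.5, 15.0), IdleMachine("fast", 2.0, 20.0)]
    print(plan_replicas(running, idle, observed))
```

In this sketch a replica never costs extra money, because it only uses time that has already been paid for. A fuller treatment would also cancel the slower copy once either copy finishes.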