Prediction-Based Power Oversubscription in Cloud Platforms
2021 USENIX Annual Technical Conference (USENIX ATC 21), 2021•usenix.org
Abstract
Prior work has used power capping to shave rare power peaks and add more servers to a datacenter, thereby oversubscribing its resources and lowering capital costs. This works well when the workloads and their server placements are known. Unfortunately, these factors are unknown in public clouds, forcing providers to limit the oversubscription and thus the potential performance loss from power capping. In this paper, we argue that providers can use predictions of workload performance criticality and virtual machine (VM) resource utilization to increase oversubscription. This poses many challenges, such as identifying the performance-critical workloads from opaque VMs, creating support for criticality-aware power management, and increasing oversubscription while limiting the impact of capping. We address these challenges for the hardware and software of Microsoft Azure. The results show that we enable a 2x increase in oversubscription with minimum impact to critical workloads. We describe lessons from deploying our work in production.