Flexible queueing architectures

JN Tsitsiklis, K Xu - Operations Research, 2017 - pubsonline.informs.org
We study a multiserver model with n flexible servers and n queues, connected through a bipartite graph, where the level of flexibility is captured by an upper bound on the graph’s average degree, d_n. Applications in content replication in data centers, skill-based routing in call centers, and flexible supply chains are among our main motivations. We focus on the scaling regime where the system size n tends to infinity, while the overall traffic intensity stays fixed. We show that a large capacity region and an asymptotically vanishing queueing delay are simultaneously achievable even under limited flexibility (d_n ≪ n). Our main results demonstrate that, when d_n ≫ ln n, a family of expander-graph-based flexibility architectures has a capacity region that is within a constant factor of the maximum possible, while simultaneously ensuring a diminishing queueing delay for all arrival rate vectors in the capacity region. Our analysis is centered around a new class of virtual-queue-based scheduling policies that rely on dynamically constructed job-to-server assignments on the connectivity graph. For comparison, we also analyze a natural family of modular architectures, which is simpler but has provably weaker performance.
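To make the two kinds of architecture concrete, the following is a minimal illustrative sketch, not the paper's construction. It assumes a modular architecture means splitting queues and servers into disjoint, fully connected groups of size d, and it uses a random graph in which each queue picks d distinct servers as a rough stand-in for an expander. All function names and the parameters n = 64, d = 8 are hypothetical choices made for illustration. Both graphs have the same average degree, but they differ in how many servers a group of queues can reach.

```python
import random

def modular_architecture(n, d):
    """Assumed modular design: n // d disjoint modules, each fully connected."""
    assert n % d == 0, "n must be a multiple of d"
    return {q: set(range((q // d) * d, (q // d) * d + d)) for q in range(n)}

def random_architecture(n, d, seed=0):
    """Stand-in for an expander: each queue connects to d distinct random servers."""
    rng = random.Random(seed)
    return {q: set(rng.sample(range(n), d)) for q in range(n)}

def average_degree(graph):
    """Number of edges divided by the number of queues."""
    return sum(len(servers) for servers in graph.values()) / len(graph)

def neighborhood(graph, queues):
    """Set of servers reachable from at least one queue in `queues`."""
    reachable = set()
    for q in queues:
        reachable |= graph[q]
    return reachable

n, d = 64, 8
modular = modular_architecture(n, d)
rand = random_architecture(n, d)
first_module = range(d)  # the d queues that form module 0 in the modular design

print("average degree:", average_degree(modular), average_degree(rand))
print("servers reachable from queues 0..d-1:",
      len(neighborhood(modular, first_module)),
      len(neighborhood(rand, first_module)))
```

In the modular graph the d queues of one module can only ever reach the same d servers, so a surge of traffic concentrated on that module cannot be absorbed elsewhere. In the random graph the same queues typically reach many more servers, which gives some intuition for why better-connected designs can support a larger capacity region at the same average degree.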
The online appendix is available at https://doi.org/10.1287/opre.2017.1620.
INFORMS