Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing | 2021

Load balancing guardrails: keeping your heavy traffic on the road to low response times (invited paper)


Abstract


This talk is about scheduling and load balancing in a multi-server system, with the goal of minimizing mean response time in a general stochastic setting. We concentrate on the common case of a load balancing system, where a front-end load balancer (a.k.a. dispatcher) dispatches requests to multiple back-end servers, each with its own queue. Much is known about load balancing when the servers schedule requests First-Come-First-Served (FCFS). However, to minimize mean response time, the servers need to use Shortest-Remaining-Processing-Time (SRPT) scheduling. Unfortunately, almost nothing is known about optimal dispatching when the servers use SRPT. To make things worse, the traditional dispatching policies used in practice with FCFS servers often perform poorly in systems with SRPT servers. In this talk, we devise a simple fix that can be applied to any dispatching policy. This fix, called guardrails, ensures that the dispatching policy yields optimal mean response time under heavy traffic when used in a system with SRPT servers. Any dispatching policy, when augmented with guardrails, becomes heavy-traffic optimal. Our results also yield the first analytical bounds on mean response time for load balancing systems with SRPT scheduling at the servers. Load balancing and scheduling are heavily studied in both the stochastic and the worst-case scheduling communities. One aim of this talk is to contrast the approaches the two communities take when tackling multi-server scheduling problems.
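As a small illustration of why the abstract singles out SRPT over FCFS for mean response time, the following sketch simulates a single server under each policy on a fixed job list. It is not the guardrails mechanism and is not taken from the paper; the function names and the example jobs are assumptions chosen for illustration only, and the job list is assumed to be non-empty.

```python
import heapq

def mean_response_time_srpt(jobs):
    """Preemptive SRPT on one server. jobs: list of (arrival_time, size)."""
    jobs = sorted(jobs)
    t, i, total = 0.0, 0, 0.0
    heap = []  # entries are (remaining_work, arrival_time)
    n = len(jobs)
    while i < n or heap:
        if not heap:
            # Server is idle: jump ahead to the next arrival.
            t = max(t, jobs[i][0])
        # Admit every job that has arrived by time t.
        while i < n and jobs[i][0] <= t:
            heapq.heappush(heap, (jobs[i][1], jobs[i][0]))
            i += 1
        # Serve the job with the least remaining work until it
        # finishes or the next arrival occurs, whichever is first.
        remaining, arrival = heapq.heappop(heap)
        next_arrival = jobs[i][0] if i < n else float("inf")
        run = min(remaining, next_arrival - t)
        t += run
        if run < remaining:
            heapq.heappush(heap, (remaining - run, arrival))
        else:
            total += t - arrival  # response time = departure - arrival
    return total / n

def mean_response_time_fcfs(jobs):
    """Non-preemptive FCFS on one server, same job format."""
    t, total = 0.0, 0.0
    for arrival, size in sorted(jobs):
        t = max(t, arrival) + size
        total += t - arrival
    return total / len(jobs)

# Hypothetical workload: one large job followed by two small ones.
example_jobs = [(0, 10), (1, 1), (2, 1)]
srpt = mean_response_time_srpt(example_jobs)
fcfs = mean_response_time_fcfs(example_jobs)
```

On this workload the small jobs preempt the large one under SRPT, so they do not wait behind it as they do under FCFS. The multi-server dispatching question the abstract addresses sits on top of this per-server behavior.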

DOI 10.1145/3406325.3465359
Language English
Journal Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing
