Autothrottle: A Practical Bi-Level Approach to Resource Management for SLO-Targeted Microservices

USENIX Symposium on Networked Systems Design and Implementation (NSDI ’24)

Achieving resource efficiency while preserving end-user experience is non-trivial for cloud application operators. As cloud applications progressively adopt microservices, resource managers face two distinct levels of system behavior: end-to-end application latency and per-service resource usage. Translating between these two levels, however, is challenging because user requests traverse heterogeneous services that collectively (but unevenly) contribute to the end-to-end latency. This paper presents Autothrottle, a bi-level, learning-assisted resource management framework for SLO-targeted microservices. It architecturally decouples the mechanisms of application SLO feedback and service resource control, and bridges them with the notion of performance targets. This decoupling enables a targeted control policy for each mechanism, combining lightweight heuristics with learning techniques. We evaluate Autothrottle on three microservice applications, with workload traces from production scenarios. Results show superior CPU resource savings: up to 26.21% over the best-performing baseline, and up to 93.84% over all baselines.
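To make the bi-level decoupling concrete, the following is a minimal sketch of two cooperating control loops bridged by a performance target. It is an illustration only, not Autothrottle's implementation: the class names, the normalized "pressure" metric in [0, 1], the step sizes, and the fixed-step application-level rule (standing in for the learning component) are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class ServiceController:
    """Per-service loop (hypothetical): adjusts a CPU limit so that a locally
    measured pressure metric tracks the target handed down from above."""
    cpu_limit: float         # CPU cores currently allocated (assumed unit)
    target: float            # performance target set by the application level
    step: float = 0.1        # multiplicative step size (assumed)
    min_limit: float = 0.1   # floor on the allocation (assumed)

    def update(self, measured: float) -> float:
        # Measured pressure above the target means the service is
        # under-provisioned, so scale up; otherwise reclaim CPU.
        if measured > self.target:
            self.cpu_limit *= 1 + self.step
        else:
            self.cpu_limit = max(self.min_limit, self.cpu_limit * (1 - self.step))
        return self.cpu_limit


@dataclass
class ApplicationController:
    """Application-level loop (hypothetical): moves the shared target based on
    end-to-end SLO feedback. A fixed-step rule replaces learning here."""
    slo_ms: float            # end-to-end latency SLO in milliseconds
    target: float            # current performance target in [0, 1]
    delta: float = 0.05      # how far the target moves per update (assumed)

    def update(self, p99_latency_ms: float) -> float:
        if p99_latency_ms > self.slo_ms:
            # SLO violated: tighten the target so services acquire more CPU.
            self.target = max(0.0, self.target - self.delta)
        else:
            # SLO met: relax the target so services can release CPU.
            self.target = min(1.0, self.target + self.delta)
        return self.target


if __name__ == "__main__":
    app = ApplicationController(slo_ms=200.0, target=0.30)
    svc = ServiceController(cpu_limit=1.0, target=app.target)
    # One round: the application observes latency, then the service reacts.
    svc.target = app.update(p99_latency_ms=250.0)
    print(svc.target, svc.update(measured=0.4))
```

The point of the sketch is the interface: the application-level loop never touches CPU limits and the per-service loop never sees end-to-end latency; the target is the only value passed between them.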