Solver-In-The-Loop Cluster Resource Management for Database-as-a-Service

In Database-as-a-Service (DBaaS) clusters, resource management is a complex optimization problem that assigns tenants to nodes, subject to various constraints and objectives. Tenants share resources within a node, but their resource demands can change over time and exhibit high variance. Because tenants may accumulate large state, moving them to a different node is disruptive, so intelligent placement decisions are crucial to avoiding service disruption. Placement decisions need to account for dynamic changes in tenant resource demands, different causes of service disruption, and various placement constraints, giving rise to a complex search space. In this paper, we show how to bring combinatorial solvers to bear on this problem, formulating the objective of minimizing service disruption as an optimization problem amenable to fast solutions. We implemented our approach in the Service Fabric cluster manager codebase. Experiments show significant reductions in constraint violations and tenant moves compared to the previous state of the art, including the unmodified Service Fabric cluster manager as well as recent research on DBaaS tenant placement.