LlamaTune: Sample-Efficient DBMS Configuration Tuning

VLDB 2022

Tuning a database system to achieve optimal performance on a given workload is a long-standing problem in the database community. A number of recent papers have leveraged ML-based approaches to guide the sampling of large parameter spaces (hundreds of tuning knobs) in search of high-performance configurations. In Microsoft production services operating millions of databases, sample efficiency emerged as a crucial requirement for using tuners on diverse workloads. This motivates our investigation into LlamaTune, a system that builds on two key insights: 1) an automated dimensionality reduction technique based on randomized embeddings, and 2) a biased sampling approach to handle special values for certain tuning knobs. LlamaTune compares favorably with state-of-the-art optimizers across a diverse set of workloads, reaching the best-performing configurations with up to 11× fewer workload runs and achieving up to 21% higher throughput. We also show that the benefits of LlamaTune generalize across random-forest and Gaussian Process-based Bayesian optimizers. While the journey to database tuning at cloud scale remains long, LlamaTune goes a long way toward making automatic DB tuning practical at scale.
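The two ideas named above can be illustrated with a minimal sketch. It is not LlamaTune's implementation: the hashing-style embedding, the function names (`make_embedding`, `project`, `to_knob_value`), the `bias` fraction of 0.2, and the knob ranges are all assumptions chosen for illustration. The optimizer would search a small space `y` in [-1, 1]^d, each of the D knobs copies one coordinate of `y` with a random sign, and a fixed slice of each knob's normalized range is reserved for its special value.

```python
import random

def make_embedding(num_knobs, low_dim, seed=0):
    """Assign each knob a random low-dim coordinate and a random sign (assumed scheme)."""
    rng = random.Random(seed)
    bucket = [rng.randrange(low_dim) for _ in range(num_knobs)]
    sign = [rng.choice((-1.0, 1.0)) for _ in range(num_knobs)]
    return bucket, sign

def project(y, bucket, sign):
    """Map a low-dim point y in [-1, 1]^d to normalized knob values in [0, 1]^D."""
    return [(sign[i] * y[bucket[i]] + 1.0) / 2.0 for i in range(len(bucket))]

def to_knob_value(u, lo, hi, special=None, bias=0.2):
    """Turn a normalized value u in [0, 1] into a concrete knob setting.

    If the knob has a special value (e.g. 0 meaning "disabled"), the first
    `bias` fraction of the range maps to it, so it is sampled far more often
    than under uniform sampling. `bias` is a hypothetical parameter.
    """
    if special is not None:
        if u < bias:
            return special
        u = (u - bias) / (1.0 - bias)  # rescale the rest onto the regular range
    return lo + u * (hi - lo)

# Example: 100 knobs searched through an 8-dimensional space.
bucket, sign = make_embedding(num_knobs=100, low_dim=8, seed=1)
rng = random.Random(42)
y = [rng.uniform(-1.0, 1.0) for _ in range(8)]
normalized = project(y, bucket, sign)
config = [to_knob_value(u, lo=1, hi=100, special=0) for u in normalized]
```

Under these assumptions the optimizer only ever proposes 8-dimensional points, while every evaluated configuration still sets all 100 knobs, and each knob lands on its special value whenever its normalized value falls below `bias`.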