The CacheLib Caching Engine: Design and Experiences at Scale
- Benjamin Berg,
- Daniel S. Berger,
- Sara McAllister,
- Isaac Grosof,
- Sathya Gunasekar,
- Jimmy Lu,
- Michael Uhlar,
- Jim Carrig,
- Nathan Beckmann,
- Mor Harchol-Balter,
- Gregory R. Ganger
2020 Operating Systems Design and Implementation
Web services rely on caching at nearly every layer of the system architecture. Commonly, each cache is implemented and maintained independently by a distinct team and is highly specialized to its function. For example, an application-data cache would be independent from a CDN cache. However, this approach ignores the difficult challenges that different caching systems have in common, greatly increasing the overall effort required to deploy, maintain, and scale each cache.
This paper presents a different approach to cache development, successfully employed at Facebook, which extracts a core set of common requirements and functionality from otherwise disjoint caching systems. CacheLib is a general-purpose caching engine, designed based on experiences with a range of caching use cases at Facebook, that facilitates the easy development and maintenance of caches. CacheLib was first deployed at Facebook in 2017 and today powers over 70 services including CDN, storage, and application-data caches.
This paper describes our experiences during the transition from independent, specialized caches to the widespread adoption of CacheLib. We explain how the characteristics of production workloads and use cases at Facebook drove important design decisions. We describe how caches at Facebook have evolved over time, including the significant benefits seen from deploying CacheLib. We also discuss the implications our experiences have for future caching design and research.