At SIGMOD 2018, a team from Microsoft Research will be presenting a new embedded key-value store called FASTER, described in their paper “FASTER: A Concurrent Key-Value Store with In-Place Updates”. As its name suggests, FASTER makes a major leap forward in terms of supporting fast and frequent lookups and updates of large amounts of state information – a particularly challenging problem for applications in the cloud today. For example, in scenarios such as Internet-of-Things, billions of devices report and update state such as per-device performance counters. In advertising platforms, user activity such as ad and search result clicks drive the creation and frequent update of per-user behavior models and per-ad statistics.
Applications that maintain such state typically scale out on multiple machines for memory, severely underutilizing other resources such as storage and networking on the machine. FASTER takes a different approach; it leverages the temporal locality inherent in all these applications to control the in-memory footprint of the system and cache the frequently accessed values without maintaining any fine-grained statistics per record. FASTER is a single-node shared memory key-value store library that makes two important technical innovations:
- It proposes a cache-friendly and concurrent latch-free hash index designed to grow and shrink dynamically while maintaining logical pointers to records in a log.
- It proposes a new concurrent hybrid log record allocator abstraction backing the index that spans fast storage (such as cloud storage and SSD) and main memory.
The FASTER hash index is an array of cache-line-sized hash buckets, each with 8-byte entries that hold hash tags and logical pointers to records that are stored separately in the hybrid log. All operations on the hash table are latch-free, using atomic compare-and-swap instructions for very high performance. Keys are not stored as part of the index structure in order to keep its memory footprint small.
While traditional key-value stores have used log-structured record organizations, the hybrid log of FASTER seamlessly combines log-structuring with read-copy-updates (that are good for external storage) and in-place updates (that are good for in-memory performance). Specifically, the head of the hybrid log on storage uses a read-copy-update strategy for updating records, whereas the tail of the hybrid log in main memory uses in-place updates. In between these two regions lies a read-only region in memory that provides hot records a “second chance” to be quickly copied back to the tail. This record organization captures temporal locality of updates, allows records to spill to sequential storage efficiently and enables a natural clustering of hot records in memory for fast in-place updates. Maintaining this elegant design in a concurrent latch-free setting required solving new technical challenges and proposing an extended epoch-protection-based concurrent system design framework that is detailed in the paper.
The result? FASTER can outperform even pure in-memory data structures such as the Intel TBB hash map when the working set fits in memory. Further, it outperforms today’s key-value stores and caching systems such as RocksDB and Redis by several orders-of-magnitude.
To support failure recovery, FASTER incorporates a recovery strategy that can bring the system back to a recent consistent state at low cost, without blocking or having to create a separate “write ahead log”, a recovery mechanism used in traditional database systems. Researchers are currently working on writing a follow up paper that describes this innovation in more detail. They are also working on using the key-value store in systems at Microsoft, including within their previous research project, Trill (site | blog post), a highly successful and widely deployed incremental stream analytics library.