Mimalloc: Free List Sharding in Action

MSR-TR-2019-18 |

Published by Microsoft

Modern memory allocators have to balance many simultaneous demands,
including performance, security, the presence of concurrency, and
application-specific demands depending on the context of their use. One
increasing use-case for allocators is as back-end implementations of
languages, such as Swift and Python, that use reference counting to
automatically deallocate objects. We present mimalloc, a memory allocator
that effectively balances these demands, shows significant performance
advantages over existing allocators, and is tailored to support languages
that rely on the memory allocator as a backend for reference counting.
Mimalloc combines several innovations to achieve this result. First, it
uses three page-local sharded free lists to increase locality, avoid
contention, and support a highly-tuned allocate and free fast path. These
free lists also support *temporal cadence*, which allows the allocator to
predictably leave the fast path for regular maintenance tasks such as
supporting deferred freeing, handling frees from non-local threads, etc.
While influenced by the allocation workload of the reference-counted Lean
and Koka programming language, we show that mimalloc has superior
performance to modern commercial memory allocators, including tcmalloc
and jemalloc, with speed improvements of 7% and 14%, respectively, on
redis, and consistently out performs over a wide range of sequential and
concurrent benchmarks. Allocators tailored to provide an efficient
runtime for reference-counting languages reduce the implementation burden
on developers and encourage the creation of innovative new language
designs.