We are building a scale-out, predictable resource management substrate for big-data workloads. To this end, we first provided predictable allocation SLOs for jobs that have completion-time requirements, and then focused on improving cluster efficiency. Using Apache Hadoop YARN as the base, we have built a scale-out fabric by composing the following projects:
1. Preemption (YARN-45): We added work-conserving preemption to YARN to improve cluster utilization.
2. Rayon (YARN-1051): We added a resource reservation layer to the YARN scheduler to support predictable resource allocation. Rayon ships with YARN 2.6.
3. Mercury (YARN-2877): We enhanced the YARN scheduler to improve cluster utilization by minimizing scheduling latency. Mercury will ship with YARN 3.0.
4. Federation (YARN-2915): Building upon Mercury, we developed a scale-out resource management substrate. The idea is to use a new “federation layer” to combine multiple YARN clusters into a single datacenter-scale YARN cluster. This has allowed us to benefit from the stabilization work and other improvements made to YARN by the community. Federation will ship with YARN 3.x.
5. Morpheus (YARN-5326): Many of the production jobs that have completion time deadlines are periodic. That is, the same job is periodically run on newly arriving data. Morpheus builds upon Rayon to provide predictable allocation for such jobs.
6. Medea (YARN-6592): We are working on enhancing the YARN scheduler to better support long-running services. This requires adding placement constraints, such as affinity and anti-affinity, to the scheduler.
The stack we have built is being deployed at scale.
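To make the affinity and anti-affinity constraints mentioned for Medea more concrete, here is a minimal illustrative sketch in Python. It is not the YARN or Medea implementation or API: the `Node` class, the `satisfies` and `place` functions, and the idea of labelling containers with a single string tag are all hypothetical names and simplifications introduced only for this example. It assumes an affinity constraint means "place on a node that already runs a container with the given tag" and an anti-affinity constraint means "avoid any node that already runs one".

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical cluster node; `tags` holds the tags of containers placed on it."""
    name: str
    tags: set = field(default_factory=set)


def satisfies(node, affinity=None, anti_affinity=None):
    """Return True if a new container may be placed on `node`.

    affinity:      tag that must already be present on the node (or None).
    anti_affinity: tag that must NOT be present on the node (or None).
    """
    if affinity is not None and affinity not in node.tags:
        return False
    if anti_affinity is not None and anti_affinity in node.tags:
        return False
    return True


def place(nodes, tag, affinity=None, anti_affinity=None):
    """Place a container labelled `tag` on the first node that meets the constraints.

    Returns the chosen node's name, or None if no node qualifies.
    This is a first-fit toy policy, chosen only for brevity.
    """
    for node in nodes:
        if satisfies(node, affinity, anti_affinity):
            node.tags.add(tag)
            return node.name
    return None
```

For example, with two nodes where only the first already runs a container tagged `"hbase"`, asking for another `"hbase"` container with anti-affinity to `"hbase"` selects the second node, which spreads replicas of a long-running service across machines. A real scheduler would also have to weigh resource availability, constraint cardinality, and global placement quality, which this sketch ignores.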