Fast BGP Simulation of Large Datacenters
- Nuno Lopes,
- Andrey Rybalchenko
VMCAI: Verification, Model Checking, and Abstract Interpretation
Frequent configuration churn caused by maintenance, upgrades, and hardware and firmware failures regularly leads to costly outages. Preventing network outages caused by misconfigurations is important for ensuring high network availability. Doing so for production datacenters with thousands of routers is a major challenge.
Network verification inspects the forwarding tables of routers. These tables are determined by the so-called control plane, which is given by the steady state of the routing protocols. The ability to simulate routing protocols given router configuration files and thus obtain the control plane is a key enabling technology.
In this paper, we present FastPlane, an efficient BGP simulator. Modern datacenter designs choose BGP as their routing protocol, which makes BGP support mandatory. The key to FastPlane’s performance is our insight into the routing policy of cloud datacenters, which permits the use of a generalized Dijkstra’s algorithm. The insight is that these networks are monotonic, i.e., route advertisements decrease in preference as they propagate through the network.
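To illustrate why monotonicity matters, the following is a minimal sketch of a generalized Dijkstra computation over routes. It is not FastPlane's implementation: the names `simulate` and `hop`, the route encoding as a `(cost, path)` tuple, and the toy leaf/spine topology are all assumptions made for this example. Each edge carries a policy that transforms a route before it is advertised; as long as every policy yields a route no more preferred than its input, the first route popped for a router is its best route, so each router is settled exactly once.

```python
import heapq
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical route encoding: (cost, path). Tuples compare
# lexicographically, so a lower cost means a more preferred route.
Route = Tuple[int, Tuple[str, ...]]
Policy = Callable[[Route, str], Optional[Route]]


def hop(extra: int = 0) -> Policy:
    """Edge policy: extend the path by one hop plus `extra` prepends.

    The cost strictly increases, so the policy is monotonic: the
    advertised route is never more preferred than the route it came from.
    """
    def apply(route: Route, neighbor: str) -> Optional[Route]:
        cost, path = route
        if neighbor in path:  # loop prevention: drop the advertisement
            return None
        return (cost + 1 + extra, path + (neighbor,))
    return apply


def simulate(edges: Dict[str, List[Tuple[str, Policy]]],
             origin: str) -> Dict[str, Route]:
    """Compute each router's best route to a prefix originated at `origin`."""
    best: Dict[str, Route] = {}
    heap: List[Tuple[Route, str]] = [((0, (origin,)), origin)]
    while heap:
        route, node = heapq.heappop(heap)
        if node in best:
            continue  # already settled with a more preferred route
        best[node] = route
        for neighbor, policy in edges.get(node, []):
            if neighbor in best:
                continue
            advertised = policy(route, neighbor)
            if advertised is not None:
                heapq.heappush(heap, (advertised, neighbor))
    return best


# Toy topology (illustrative only): two spines between two leaves;
# the path through spine2 is penalized with two extra prepends.
edges = {
    "leaf1": [("spine1", hop()), ("spine2", hop())],
    "spine1": [("leaf2", hop())],
    "spine2": [("leaf2", hop(extra=2))],
}
best = simulate(edges, "leaf1")
print(best["leaf2"])
```

If a policy could make a route more preferred as it propagates, a router settled early might later receive a better route, and the single-pass argument would no longer hold; a general simulator would then have to iterate to a fixed point instead.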
Our evaluation on real-world production datacenters of a major cloud provider shows that FastPlane (1) is two orders of magnitude faster than the state-of-the-art on small and medium datacenters, and (2) goes beyond the state-of-the-art by scaling to large datacenters. FastPlane was instrumental in finding several production bugs in router firmware, routing policy, and network architecture.