Canopy: An End-to-End Performance Tracing And Analysis System
- Jonathan Kaldor
- Jonathan Mace
- Michał Bejda
- Edison Gao
- Wiktor Kuropatwa
- Joe O'Neill
- Kian Win Ong
- Bill Schaller
- Pingjia Shan
- Brendan Viscomi
- Vinod Venkataraman
- Kaushik Veeraraghavan
- Yee Jiun Song
2017 Symposium on Operating Systems Principles
Published by ACM
This paper presents Canopy, Facebook’s end-to-end performance tracing infrastructure. Canopy records causally related performance data across the end-to-end execution path of requests, including from browsers, mobile applications, and backend services. Canopy processes traces in near real-time, derives user-specified features, and outputs to performance datasets that aggregate across billions of requests. Using Canopy, Facebook engineers can query and analyze performance data in real-time. Canopy addresses three challenges we have encountered in scaling performance analysis: supporting the range of execution and performance models used by different components of the Facebook stack; supporting interactive ad-hoc analysis of performance data; and enabling deep customization by users, from sampling traces to extracting and visualizing features. Canopy currently records and processes over 1 billion traces per day. We discuss how Canopy has evolved to apply to a wide range of scenarios, and present case studies of its use in solving various performance challenges.
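The pipeline the abstract describes (record causally related events per request, derive user-specified features from each trace, and emit rows into a dataset that can be aggregated) can be illustrated with a toy sketch. This is not Canopy's API or data model: every name here (`Event`, `Trace`, `group_into_traces`, `end_to_end_latency`, `build_dataset`) and the event schema are hypothetical, chosen only to make the three stages concrete.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Event:
    """One timestamped record emitted by a component (hypothetical schema)."""
    trace_id: str
    component: str      # e.g. "browser", "mobile", "backend"
    name: str
    timestamp_ms: float


@dataclass
class Trace:
    """All events sharing a trace id, i.e. belonging to one request."""
    trace_id: str
    events: list = field(default_factory=list)


def group_into_traces(events):
    """Collect events from every component into per-request traces."""
    traces = {}
    for ev in events:
        traces.setdefault(ev.trace_id, Trace(ev.trace_id)).events.append(ev)
    return list(traces.values())


def end_to_end_latency(trace):
    """Example user-specified feature: time between first and last event."""
    times = [ev.timestamp_ms for ev in trace.events]
    return max(times) - min(times)


def build_dataset(traces, features):
    """Apply each feature function to each trace, yielding one row per trace."""
    return [
        {"trace_id": t.trace_id, **{name: fn(t) for name, fn in features.items()}}
        for t in traces
    ]


# Illustrative events only; not real measurements.
events = [
    Event("t1", "browser", "request_start", 0.0),
    Event("t1", "backend", "handler_done", 120.0),
    Event("t1", "browser", "page_rendered", 300.0),
    Event("t2", "mobile", "request_start", 10.0),
    Event("t2", "backend", "handler_done", 110.0),
]

rows = build_dataset(group_into_traces(events), {"latency_ms": end_to_end_latency})
mean_latency = mean(row["latency_ms"] for row in rows)  # aggregate across traces
```

Passing features as a dictionary of plain functions mirrors the abstract's emphasis on user customization: a new feature is one more function over a trace, and the dataset gains one more column without changes to the recording stage.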