Canopy: An End-to-End Performance Tracing And Analysis System

  • Jonathan Kaldor,
  • ,
  • Michał Bejda,
  • Edison Gao,
  • Wiktor Kuropatwa,
  • Joe O'Neill,
  • Kian Win Ong,
  • Bill Schaller,
  • Pingjia Shan,
  • Brendan Viscomi,
  • Vinod Venkataraman,
  • Kaushik Veeraraghavan,
  • Yee Jiun Song

2017 Symposium on Operating Systems Principles

Published by ACM


This paper presents Canopy, Facebook’s end-to-end performance tracing infrastructure. Canopy records causally related performance data across the end-to-end execution path of requests, including from browsers, mobile applications, and backend services. Canopy processes traces in near real-time, derives user-specified features, and outputs to performance datasets that aggregate across billions of requests. Using Canopy, Facebook engineers can query and analyze performance data in real-time. Canopy addresses three challenges we have encountered in scaling performance analysis: supporting the range of execution and performance models used by different components of the Facebook stack; supporting interactive ad-hoc analysis of performance data; and enabling deep customization by users, from sampling traces to extracting and visualizing features. Canopy currently records and processes over 1 billion traces per day. We discuss how Canopy has evolved to apply to a wide range of scenarios, and present case studies of its use in solving various performance challenges.
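The pipeline the abstract describes (record a trace, derive user-specified features, write rows to an aggregatable dataset) can be illustrated with a minimal sketch. This is not Canopy's API or data model: the trace layout, the event names, and the helpers `timestamp`, `extract_row`, and `aggregate` are all hypothetical and chosen only to make the idea concrete.

```python
from statistics import mean

# Hypothetical traces: each is a set of timestamped events from different
# components (browser, backend) that belong to the same request.
trace_a = {
    "trace_id": "a",
    "events": [
        {"component": "browser", "name": "request_start", "ts_ms": 0},
        {"component": "backend", "name": "render_start", "ts_ms": 40},
        {"component": "backend", "name": "render_end", "ts_ms": 100},
        {"component": "browser", "name": "page_displayed", "ts_ms": 180},
    ],
}
trace_b = {
    "trace_id": "b",
    "events": [
        {"component": "browser", "name": "request_start", "ts_ms": 0},
        {"component": "backend", "name": "render_start", "ts_ms": 30},
        {"component": "backend", "name": "render_end", "ts_ms": 70},
        {"component": "browser", "name": "page_displayed", "ts_ms": 120},
    ],
}


def timestamp(trace, name):
    """Return the timestamp of the first event with the given name."""
    return next(e["ts_ms"] for e in trace["events"] if e["name"] == name)


# User-specified features: each maps a whole trace to a single value.
FEATURES = {
    "end_to_end_ms": lambda t: timestamp(t, "page_displayed")
    - timestamp(t, "request_start"),
    "backend_render_ms": lambda t: timestamp(t, "render_end")
    - timestamp(t, "render_start"),
}


def extract_row(trace):
    """Turn one trace into one dataset row of derived features."""
    row = {"trace_id": trace["trace_id"]}
    for feature_name, feature_fn in FEATURES.items():
        row[feature_name] = feature_fn(trace)
    return row


def aggregate(rows, feature_name):
    """Aggregate one feature across all rows (here, a simple mean)."""
    return mean(row[feature_name] for row in rows)


rows = [extract_row(trace_a), extract_row(trace_b)]
print(rows)
print(aggregate(rows, "end_to_end_ms"))
```

In this toy version, adding a new feature means adding one entry to `FEATURES`, which is the kind of user-level customization the abstract alludes to; a production system would additionally handle sampling, incomplete traces, and much larger volumes.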