Online Experimentation at Microsoft

  • Ronny Kohavi
  • Thomas Crook
  • Roger Longbotham
  • Brian Frasca
  • Randy Henne
  • Juan Lavista Ferres
  • Tamir Melamed

Controlled experiments, also called randomized experiments and A/B tests, have had a profound influence on multiple fields, including
medicine, agriculture, manufacturing, and advertising. Through randomization and proper design, experiments make it possible to establish
causality scientifically, which is why they are the gold standard in drug tests. In software development, multiple techniques are used to define
product requirements; controlled experiments provide a valuable way to assess the impact of new features on customer behavior. At
Microsoft, we have built the capability for running controlled experiments on web sites and services, thus enabling a more scientific
approach to evaluating ideas at different stages of the planning process. In our previous papers, we did not have good examples of
controlled experiments at Microsoft; now we do! The humbling results we share call into question whether a priori prioritization is as good
as most people believe it is. The Experimentation Platform (ExP) was built to accelerate innovation through trustworthy experimentation.
Along the way, we had to tackle both technical and cultural challenges, and we gave software developers, program managers, and
designers the benefit of an unbiased ear to listen to their customers and make data-driven decisions. We recently published a technical
survey of the literature on controlled experiments (Kohavi, Longbotham, Sommerfield, & Henne, 2009). The goal of this
paper is to share lessons and challenges focused more on the cultural aspects and the value of controlled experiments.