This is an umbrella project for machine learning with the explore-exploit tradeoff: the tradeoff between acquiring information and using it. This is a mature, yet very active, research area studied in Machine Learning, Theoretical Computer Science, Operations Research, and Economics. Much of our activity focuses on “multi-armed bandits” and “contextual bandits”, relatively simple yet very powerful models for the explore-exploit tradeoff.
We are located at, or collaborate closely with, Microsoft Research New York City. Most of us are involved in Multi-World Testing: an approach and system for contextual bandit learning.
People
Robert Schapire
Partner Researcher
Paul Mineiro
Principal Data and Applied Scientist
Siddhartha Sen
Principal Researcher
Sarah Bird
Chief Product Officer of Responsible AI @ Microsoft
Alex Slivkins
Senior Principal Researcher
Miro Dudík
Sr Principal Researcher Manager
John Langford
Partner Researcher Manager
Markus Cozowicz
Senior Research Engineer