
Microsoft Research Lab – Montréal

Projects

Theoretical foundations for Offline Reinforcement Learning

MSR contributions to the theoretical foundations of Offline RL. Globally, MSR has made recent advances in the statistical foundations of Offline RL, where a central question is to understand what…

Offline Reinforcement Learning Algorithms

On this page, we describe the algorithmic landscape of Offline RL and list some of the algorithmic development efforts made by MSR in this space. In a tutorial lecture on Offline RL, we analyze its…

Towards a generalized policy iteration theorem

We intend to advance the theoretical understanding of actor-critic algorithms through the lens of policy iteration. Policy Iteration consists of a loop over two steps: policy evaluation and policy improvement. Policy Iteration has strong convergence properties when the policy…
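For readers unfamiliar with the loop described above, here is a minimal sketch of classical tabular policy iteration, assuming a small, fully known MDP. It is a textbook illustration, not the generalized theorem this project targets. The function name `policy_iteration`, the arrays `P` and `R`, and the two-state toy MDP are all hypothetical and chosen only for illustration.

```python
import numpy as np

def policy_iteration(P, R, gamma=0.9, max_iters=100):
    """Tabular policy iteration (illustrative sketch).

    P: array of shape (S, A, S), P[s, a, s2] = transition probability.
    R: array of shape (S, A), expected immediate reward.
    """
    n_states = P.shape[0]
    states = np.arange(n_states)
    policy = np.zeros(n_states, dtype=int)  # arbitrary initial policy
    for _ in range(max_iters):
        # Policy evaluation: solve (I - gamma * P_pi) V = R_pi exactly.
        P_pi = P[states, policy]  # shape (S, S)
        R_pi = R[states, policy]  # shape (S,)
        V = np.linalg.solve(np.eye(n_states) - gamma * P_pi, R_pi)
        # Policy improvement: act greedily on the one-step lookahead values.
        Q = R + gamma * (P @ V)  # shape (S, A)
        new_policy = Q.argmax(axis=1)
        if np.array_equal(new_policy, policy):
            break  # policy is stable, so the loop has converged
        policy = new_policy
    return policy, V

# Hypothetical toy MDP: action 0 stays in place, action 1 switches state.
# Any action taken in state 1 yields reward 1; state 0 yields reward 0.
P = np.zeros((2, 2, 2))
P[0, 0, 0] = 1.0
P[0, 1, 1] = 1.0
P[1, 0, 1] = 1.0
P[1, 1, 0] = 1.0
R = np.array([[0.0, 0.0], [1.0, 1.0]])

policy, V = policy_iteration(P, R)
```

In this toy problem the loop settles on switching out of state 0 and staying in state 1. Exact evaluation by a linear solve is only practical for small state spaces; actor-critic methods replace both steps with learned approximations.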

Project Galena

Galena uses imitation learning to provide predictions of controller inputs to compensate for poor network conditions. If client-to-server lag occurs, predictions are used instead of user input.

Human-Centered Transparency & Intelligibility in AI

We take a human-centered approach to transparency, empirically studying how to provide stakeholders of ML systems with the right information to achieve their goals.

Offline Reinforcement Learning

This page introduces the research area of Offline Reinforcement Learning (sometimes also called Batch Reinforcement Learning). It consists of training a target policy from a fixed dataset of trajectories collected with a behavioral policy. In comparison to classic Reinforcement Learning…