Research talk: Reinforcement learning with preference feedback

Speaker: Aadirupa Saha, Postdoctoral Researcher, Microsoft Research NYC

In Preference-based Reinforcement Learning (PbRL), an agent receives feedback only as rank-ordered preferences over a set of selected actions, rather than the absolute reward feedback of traditional reinforcement learning. This is relevant in settings where it is difficult for the system designer to explicitly specify a reward function that achieves a desired behavior, but where it is possible to elicit coarser feedback, say from an expert, about which actions are preferred over others at a given state. The success of the traditional reinforcement learning framework hinges on the underlying agent-reward model, which in turn depends on how accurately a system designer can express an appropriate reward function, often a non-trivial task. The main novelty of the PbRL framework is its ability to learn from non-numeric, preference-based feedback, which eliminates the need to handcraft numeric reward models. We will set up a formal framework for PbRL and discuss different real-world applications. We will also discuss a gap: although PbRL was introduced almost a decade ago, most work on it has been applied or experimental in nature, barring a handful of very recent ventures on the theory side. Finally, we will discuss the limitations of existing techniques and the scope for future developments.
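
To make the idea of preference feedback concrete, here is a minimal sketch in Python. It is not taken from the talk. It assumes a single-state (bandit-style) simplification of PbRL, a hypothetical expert whose choices follow a Bradley-Terry model over hidden utilities, and a naive learner that ranks actions by empirical win rate. The names `preference_oracle`, `estimate_best_action`, and `hidden_utility` are illustrative only.

```python
import math
import random


def preference_oracle(hidden_utility, a, b, rng):
    """Return whichever of actions a and b the simulated expert prefers.

    Assumption: preferences follow a Bradley-Terry model, so `a` wins with
    probability sigmoid(u[a] - u[b]). The learner never sees the utilities.
    """
    p_a_wins = 1.0 / (1.0 + math.exp(-(hidden_utility[a] - hidden_utility[b])))
    return a if rng.random() < p_a_wins else b


def estimate_best_action(hidden_utility, num_queries, seed=0):
    """Query random action pairs and rank actions by empirical win rate."""
    rng = random.Random(seed)
    n = len(hidden_utility)
    wins = [0] * n   # comparisons each action won
    duels = [0] * n  # comparisons each action took part in
    for _ in range(num_queries):
        a, b = rng.sample(range(n), 2)  # two distinct actions
        winner = preference_oracle(hidden_utility, a, b, rng)
        wins[winner] += 1
        duels[a] += 1
        duels[b] += 1
    # The only signal used is "which action won"; no numeric reward is observed.
    win_rate = [w / d if d else 0.0 for w, d in zip(wins, duels)]
    best = max(range(n), key=lambda i: win_rate[i])
    return best, win_rate


if __name__ == "__main__":
    best, win_rate = estimate_best_action([0.0, 1.0, 3.0], num_queries=3000)
    print("estimated best action:", best)
    print("empirical win rates:", [round(r, 3) for r in win_rate])
```

The learner here only counts comparison outcomes, which is the sense in which no numeric reward model needs to be handcrafted. A full PbRL method would also have to handle states, trajectories, and exploration, which this sketch leaves out.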

Learn more about the 2021 Microsoft Research Summit: https://Aka.ms/researchsummit

Event: Microsoft Research Summit 2021
Track: Reinforcement Learning
Date: October 20, 2021
Speakers: Aadirupa Saha
Affiliation: Microsoft Research NYC

- Aadirupa Saha
  Postdoctoral Researcher
Research area
- Artificial intelligence
Event
- Microsoft Research Summit 2021

Reinforcement Learning

Opening remarks: Reinforcement Learning
October 20, 2021
Speakers:

Katja Hofmann
Keynote: Key research challenges for real world reinforcement learning
October 20, 2021
Speakers:

John Langford
Research talk: Reinforcement learning with preference feedback
October 20, 2021
Speakers:

Aadirupa Saha
Research talk: Safe reinforcement learning using advantage-based intervention
October 20, 2021
Speakers:

Nolan Wagener
Research talk: Evaluating human-like navigation in 3D video games
October 20, 2021
Speakers:

Raluca Georgescu,

Ida Momennejad
Research talk: Maia Chess: A human-like neural network chess engine
October 20, 2021
Speakers:

Reid McIlroy-Young
Fireside chat: Opportunities and challenges in human-oriented AI
October 20, 2021
Speakers:

Ashley Llorens,

Katja Hofmann,

Siddhartha Sen
Research talk: Making deep reinforcement learning industrially applicable
October 20, 2021
Speakers:

Jiang Bian,

Tie-Yan Liu
Panel: Generalization in reinforcement learning
October 20, 2021
Speakers:

Mingfei Sun,

Roberta Raileanu,

Harm van Seijen

, et al.
Research talk: Project Dexter: Machine learning and automatic decision-making for robotic manipulation
October 20, 2021
Speakers:

Andrey Kolobov,

Ching-An Cheng
Research talk: Successor feature sets: Generalizing successor representations across policies
October 20, 2021
Speakers:

Kiante Brantley
Research talk: Towards efficient generalization in continual RL using episodic memory
October 20, 2021
Speakers:

Mandana Samiei
Research talk: Breaking the deadly triad with a target network
October 20, 2021
Speakers:

Shangtong Zhang
Panel: The future of reinforcement learning
October 20, 2021
Speakers:

Geoff Gordon,

Emma Brunskill,

Craig Boutilier

, et al.
Closing remarks: Reinforcement Learning
October 20, 2021
Speakers:

John Langford