Deep Interactive Bayesian Reinforcement Learning via Meta-Learning

Luisa Zintgraf; Sam Devlin; Kamil Ciosek; Shimon Whiteson; Katja Hofmann

20th International Conference on Autonomous Agents and Multiagent Systems | May 2021

Agents that interact with other agents often do not know a priori what the other agents’ strategies are, but have to maximise their own online return while interacting with and learning about others. The optimal adaptive behaviour under uncertainty over the other agents’ strategies w.r.t. some prior can in principle be computed using the Interactive Bayesian Reinforcement Learning framework. Unfortunately, doing so is intractable in most settings, and existing approximation methods are restricted to small tasks. To overcome this, we propose to meta-learn approximate belief inference and Bayes-optimal behaviour for a given prior. To model beliefs over other agents, we combine sequential and hierarchical Variational Auto-Encoders, and meta-train this inference model alongside the policy. We show empirically that our approach outperforms existing methods that use a model-free approach, sample from the approximate posterior, maintain memory-free models of others, or do not fully utilise the known structure of the environment.
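
To make the idea of a belief over other agents' strategies concrete, the following is a minimal sketch of an exact Bayesian update in a toy setting. It is not the paper's method: the paper meta-learns approximate inference with Variational Auto-Encoders, whereas this sketch assumes a small, finite set of fixed, memory-free strategies so that the posterior can be computed in closed form. The `strategies` table, the `update_belief` helper, and the observed action sequence are all hypothetical and chosen only for illustration.

```python
import numpy as np

def update_belief(belief, strategies, observed_action):
    """One Bayes step: posterior is proportional to prior times likelihood."""
    # Likelihood of the observed action under each candidate strategy.
    likelihood = strategies[:, observed_action]
    posterior = belief * likelihood
    return posterior / posterior.sum()

# Hypothetical candidate strategies for the other agent: each row is a fixed
# distribution over two actions (an assumption made for this toy example).
strategies = np.array([
    [0.9, 0.1],  # mostly plays action 0
    [0.5, 0.5],  # plays uniformly at random
    [0.1, 0.9],  # mostly plays action 1
])

# Uniform prior over the three candidate strategies.
belief = np.full(len(strategies), 1.0 / len(strategies))

# Hypothetical sequence of actions observed from the other agent.
for action in [0, 0, 1, 0]:
    belief = update_belief(belief, strategies, action)

print(belief)
```

After these four observations, most of the posterior mass sits on the first two strategies, and the third is nearly ruled out. A Bayes-optimal agent would condition its behaviour on this belief rather than on a single sampled strategy. Exact updates of this kind stop being feasible once the space of strategies is large or the other agents' behaviour depends on interaction history, which is the setting the abstract addresses.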

Reinforcement learning in Minecraft: Challenges and opportunities in multiplayer games

Games have a long history as test beds for advancing AI research. From early work on chess and Go to more recent advances in modern video games, researchers have used games as complex decision-making benchmarks. Learning in multi-agent settings is one of the fundamental problems in AI research, posing unique challenges for agents that learn independently, such as coordinating with other learning agents or adapting rapidly online to agents they have not previously learned with.

In this webinar, join Microsoft researcher Sam Devlin and Queen Mary University of London researchers Martin Balla, Raluca D. Gaina, and Diego Perez-Liebana to learn how the latest AI techniques can be applied to multiplayer games in the challenging and diverse 3D environment of Minecraft. The researchers will demonstrate how Project Malmo, a platform for AI experimentation built on Minecraft, provides an ideal environment for designing diverse and rich training tasks, and how reinforcement learning agents can be trained in these scenarios. They’ll provide examples of tasks, agent implementations, and the latest research in this area.

Together, you’ll explore:

The Malmo platform and multi-agent tasks
Using the reinforcement learning library RLlib to implement and train agents to complete Minecraft tasks
Coordinated policies for collaborative multi-agent tasks
Open challenges in learning robust policies for ad-hoc teamwork

Resource list:

Project Malmo – Microsoft Research (project page)
Project Malmo key repository (GitHub)
Difference Rewards Policy Gradients (paper)
Deep Interactive Bayesian Reinforcement Learning via Meta-Learning (paper)

*This on-demand webinar features a previously recorded Q&A session and open captioning.