{"id":794090,"date":"2021-11-16T08:00:40","date_gmt":"2021-11-16T16:00:40","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&p=794090"},"modified":"2021-11-09T15:27:11","modified_gmt":"2021-11-09T23:27:11","slug":"research-talk-successor-feature-sets-generalizing-successor-representations-across-policies","status":"publish","type":"msr-video","link":"https:\/\/www.microsoft.com\/en-us\/research\/video\/research-talk-successor-feature-sets-generalizing-successor-representations-across-policies\/","title":{"rendered":"Research talk: Successor feature sets: Generalizing successor representations across policies"},"content":{"rendered":"

Successor-style representations have many advantages for reinforcement learning. For example, they can help an agent generalize from experience to new goals. However, successor-style representations are not optimized to generalize across policies: typically, a limited-length list of policies is maintained, and information is shared among them through representation learning or generalized policy iteration. Join University of Maryland PhD candidate Kianté Brantley to address these limitations of successor-style representations. With collaborators from Microsoft Research Montréal, he developed a new, general successor-style representation that brings together ideas from predictive state representations, belief-space value iteration, and convex analysis. The new representation is highly expressive. For example, it allows an optimal policy for a new reward function, or a policy that imitates a demonstration, to be read off efficiently. Together, you'll explore the basics of successor-style representations, the challenges of current approaches, and the results of the proposed approach on small, known environments.
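As brief background, the "read off a value for a new reward" idea can be sketched with the standard successor-features identity; this is common background, not necessarily the new representation introduced in the talk. If the reward factors as $r(s,a) = \phi(s,a)^{\top} w$, then for any policy $\pi$ the successor features

\[
\psi^{\pi}(s,a) \;=\; \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, \phi(s_t, a_t) \;\middle|\; s_0 = s,\ a_0 = a\right]
\]

summarize expected discounted feature occupancies, so the action-value of $\pi$ under any new reward weights $w$ is a single dot product:

\[
Q^{\pi}_{w}(s,a) \;=\; \psi^{\pi}(s,a)^{\top} w .
\]

With a stored set of policies $\{\pi_1, \dots, \pi_k\}$, generalized policy improvement acts by $\arg\max_{a} \max_{i} \psi^{\pi_i}(s,a)^{\top} w$; the dependence on that fixed, limited set of policies is the cross-policy generalization gap the talk addresses.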

Learn more about the 2021 Microsoft Research Summit: https://Aka.ms/researchsummit

Watch the talk: https://youtu.be/Gfv26bD3Mlo