{"id":708421,"date":"2020-11-27T18:15:11","date_gmt":"2020-11-28T02:15:11","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-project&#038;p=708421"},"modified":"2021-12-12T01:42:59","modified_gmt":"2021-12-12T09:42:59","slug":"reinforcement-learning-algorithms-and-applications","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/reinforcement-learning-algorithms-and-applications\/","title":{"rendered":"Reinforcement Learning: Algorithms and Applications"},"content":{"rendered":"<p>In this project, we focus on developing RL algorithms, especially deep RL algorithms for real-world applications. We are interesting in the following topics.<\/p>\n<p><strong>Distributional Reinforcement Learning.<\/strong> Distributional Reinforcement Learning focuses on developing RL algorithms which model the return distribution, rather than the expectation as in conventional RL. Such algorithms have been demonstrated to be effective when combined with deep neural network for function approximation. The goal here is to explore the potential of distributional RL in every aspect, including but not limited to parameterization, distribution metric based temporal difference loss, and the interaction between distributional formulation and DNN.  <\/p>\n<p><strong>Representation Learning and Interpretability for RL<\/strong>, where we focus on the discovering and leveraging rich structures in representation for Deep Reinforcement Learning, including but not limited to 1) low-dimensional representation structure for high-dimensional\/redundant input, 2) decomposable\/factored structure in terms of reward and transition, 3\uff09 casual relations. <\/p>\n<p><strong>RL for Logistics<\/strong>, where we focus on developing efficient Deep Reinforcement Learning algorithms for logistics.  <\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this project, we focus on developing RL algorithms, especially deep RL algorithms for real-world applications. We are interesting in the following topics. Distributional Reinforcement Learning. Distributional Reinforcement Learning focuses on developing RL algorithms which model the return distribution, rather than the expectation as in conventional RL. Such algorithms have been demonstrated to be effective [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"research-area":[13556],"msr-locale":[268875],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-708421","msr-project","type-msr-project","status-publish","hentry","msr-research-area-artificial-intelligence","msr-locale-en_us","msr-archive-status-active"],"msr_project_start":"","related-publications":[548745,621192,621201,705331,708370,717736,796777,796885,797137],"related-downloads":[],"related-videos":[],"related-groups":[705946],"related-events":[],"related-opportunities":[],"related-posts":[627537],"related-articles":[],"tab-content":[],"slides":[],"related-researchers":[{"type":"user_nicename","display_name":"Jiang Bian","user_id":38481,"people_section":"Section name 0","alias":"jiabia"},{"type":"user_nicename","display_name":"Chang Liu","user_id":39889,"people_section":"Section name 0","alias":"changliu"},{"type":"user_nicename","display_name":"Tao Qin","user_id":33871,"people_section":"Section name 0","alias":"taoqin"},{"type":"user_nicename","display_name":"Li Zhao","user_id":36152,"people_section":"Section name 0","alias":"lizo"}],"msr_research_lab":[],"msr_impact_theme":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/708421","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-project"}],"version-history":[{"count":2,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/708421\/revisions"}],"predecessor-version":[{"id":708427,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/708421\/revisions\/708427"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=708421"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=708421"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=708421"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=708421"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=708421"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}