{"id":852783,"date":"2022-06-16T08:05:33","date_gmt":"2022-06-16T15:05:33","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-project&p=852783"},"modified":"2022-06-16T13:36:16","modified_gmt":"2022-06-16T20:36:16","slug":"offlinerltheory","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/offlinerltheory\/","title":{"rendered":"Theoretical foundations for Offline Reinforcement Learning"},"content":{"rendered":"
\n\t
\n\t\t
\n\t\t\t\t\t<\/div>\n\t\t\n\t\t
\n\t\t\t\n\t\t\t
\n\t\t\t\t\n\t\t\t\t
\n\t\t\t\t\t\n\t\t\t\t\t
\n\t\t\t\t\t\t
\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\n

Theoretical foundations for Offline Reinforcement Learning<\/h1>\n\n\n\n

MSR contributions to the theoretical foundations of Offline RL<\/p>\n\n\t\t\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/div>\n<\/section>\n\n\n\n\n\n

Overall, MSR has made recent advances in the statistical foundations of Offline RL<\/span><\/a>, where a central question is which representational conditions (on the function approximator) and coverage conditions (on the data distribution) enable sample-efficient offline RL in large state spaces. Other theoretical questions about specific algorithms have also been addressed:<\/p>\n\n\n\n