{"id":681471,"date":"2020-09-28T08:00:43","date_gmt":"2020-09-28T15:00:43","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-project&p=681471"},"modified":"2021-03-10T19:47:37","modified_gmt":"2021-03-11T03:47:37","slug":"coax-rl","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/coax-rl\/","title":{"rendered":"coax: A Modular RL Package"},"content":{"rendered":"

coax<\/h2>\n

coax is a modular Reinforcement Learning (RL) Python package for solving OpenAI Gym<\/span><\/a> environments with JAX<\/span><\/a>-based function approximators (using Haiku<\/span><\/a>).<\/p>\n

RL concepts, not agents<\/h3>\n

The primary thing that sets coax<\/strong> apart from other packages is that it is designed to align with the core RL concepts, not with the high-level concept of an agent<\/em>. This makes\u00a0coax<\/strong> more modular and user-friendly for RL researchers and practitioners.<\/p>\n

You’re in control<\/h3>\n

Other RL frameworks often hide structure that you (the RL practitioner) are interested in. Most notably, the neural network architecture of the function approximators is often hidden from you. In coax<\/strong>, the network architecture takes center stage. You are in charge of defining your own forward-pass function.<\/p>\n

Another bit of structure that other RL frameworks hide from you is the main training loop. This makes it hard to take an algorithm from paper to code. The design of\u00a0coax<\/strong> is agnostic of the details of your training loop. You are in charge of how and when you update your function approximators.<\/p>\n
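
The two ideas above (a forward-pass function you write yourself, and a training loop you own) can be illustrated with a minimal sketch. It deliberately uses only the Python standard library and does not use the coax, JAX, Haiku, or Gym APIs. The toy corridor environment and the names env_step, q_forward, q_learning_update, and train are hypothetical and exist only for this illustration.

```python
import random

# Hypothetical toy environment (not part of coax or Gym): a corridor with
# states 0..4. Action 1 moves right, action 0 moves left (clipped at 0).
# Reaching state 4 ends the episode with reward 1.
N_STATES, N_ACTIONS = 5, 2


def env_step(state, action):
    next_state = max(0, min(N_STATES - 1, state + (1 if action == 1 else -1)))
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def q_forward(params, state):
    # The forward pass is an ordinary function written by the user.
    # Here it is a table lookup; a neural network would play the same role.
    return params[state]


def q_learning_update(params, s, a, r, s_next, done, lr=0.5, gamma=0.9):
    # One tabular Q-learning step: move Q(s, a) toward the bootstrapped target.
    target = r if done else r + gamma * max(q_forward(params, s_next))
    params[s][a] += lr * (target - q_forward(params, s)[a])


def train(num_episodes=200, max_steps=100, seed=0):
    # The training loop is plain user code: the user decides how to act,
    # and how and when to update the function approximator.
    rng = random.Random(seed)
    params = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(num_episodes):
        s = 0
        for _ in range(max_steps):
            # Uniformly random behaviour policy; Q-learning is off-policy,
            # so this is enough exploration for the toy problem.
            a = rng.randrange(N_ACTIONS)
            s_next, r, done = env_step(s, a)
            q_learning_update(params, s, a, r, s_next, done)
            s = s_next
            if done:
                break
    return params


params = train()
# Greedy action for each non-terminal state after training.
greedy = [max(range(N_ACTIONS), key=lambda a: q_forward(params, s)[a])
          for s in range(N_STATES - 1)]
print(greedy)
```

Because the update is an explicit call inside the loop, changing when it happens (for example every few steps, or from a replay buffer) only requires editing the loop itself.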

Learn More<\/h3>\n

Documentation ><\/span><\/a><\/p>\n

GitHub ><\/span><\/a><\/p>\n

Webinar ><\/span><\/a><\/p>\n
","protected":false},"excerpt":{"rendered":"

coax is a modular Reinforcement Learning (RL) Python package for solving OpenAI Gym environments with JAX-based function approximators (using Haiku).<\/p>\n","protected":false},"featured_media":686862,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"footnotes":""},"research-area":[13556],"msr-locale":[268875],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-681471","msr-project","type-msr-project","status-publish","has-post-thumbnail","hentry","msr-research-area-artificial-intelligence","msr-locale-en_us","msr-archive-status-active"],"msr_project_start":"","related-publications":[],"related-downloads":[690306],"related-videos":[],"related-groups":[],"related-events":[],"related-opportunities":[],"related-posts":[],"related-articles":[],"tab-content":[],"slides":[],"related-researchers":[],"msr_research_lab":[],"msr_impact_theme":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/681471"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-project"}],"version-history":[{"count":26,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/681471\/revisions"}],"predecessor-version":[{"id":732436,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/681471\/revisions\/732436"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/686862"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=681471"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=681471"},{"taxonomy":"msr-locale","embeddable":true,"h
ref":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=681471"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=681471"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=681471"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}