{"id":870966,"date":"2022-08-17T10:48:26","date_gmt":"2022-08-17T17:48:26","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/"},"modified":"2022-08-17T10:48:26","modified_gmt":"2022-08-17T17:48:26","slug":"graph-based-skill-acquisition-for-reinforcement-learning","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/graph-based-skill-acquisition-for-reinforcement-learning\/","title":{"rendered":"Graph-Based Skill Acquisition For Reinforcement Learning"},"content":{"rendered":"\n\n\n<p class=\"wp-block-paragraph\">In machine learning, Reinforcement Learning (RL) is an important tool for creating intelligent agents that learn solely through experience. One particular subarea within the RL domain that has received great attention is how to define macro-actions, which are temporal abstractions composed of a sequence of primitive actions. This subarea, loosely called skill acquisition, has been under development for several years and has led to better results in a diversity of RL problems. Among the many skill acquisition approaches, graph-based methods have received considerable attention. This survey presents an overview of graph-based skill acquisition methods for RL. We cover a diversity of these approaches and discuss how they evolved throughout the years. Finally, we also discuss the current challenges and open issues in the area of graph-based skill acquisition for RL.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In machine learning, Reinforcement Learning (RL) is an important tool for creating intelligent agents that learn solely through experience. One particular subarea within the RL domain that has received great attention is how to define macro-actions, which are temporal abstractions composed of a sequence of primitive actions. This subarea, loosely called skill acquisition, has been [&hellip;]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","msr-author-ordering":[{"type":"user_nicename","value":"Matheus Mendon\u00e7a","user_id":"41940"},{"type":"guest","value":"artur-ziviani","user_id":"870333"},{"type":"guest","value":"andre-da-motta-salles-barreto","user_id":"870330"}],"msr_publishername":"","msr_publisher_other":"","msr_booktitle":"","msr_chapter":"","msr_edition":"","msr_editors":"","msr_how_published":"","msr_isbn":"","msr_issue":"1","msr_journal":"ACM Computing Surveys","msr_number":"","msr_organization":"","msr_pages_string":"","msr_page_range_start":"6","msr_page_range_end":"","msr_series":"","msr_volume":"52","msr_copyright":"","msr_conference_name":"","msr_doi":"10.1145\/3291045","msr_arxiv_id":"","msr_mag_id":"2914949695","msr_other_authors":"","msr_other_contributors":"","msr_speaker":"","msr_award":"","msr_affiliation":"","msr_institution":"","msr_host":"","msr_version":"","msr_duration":"","msr_release_tracker_id":"","msr_highlight_type":"","msr_date_display_format":"","msr_main_download_label":"","msr_external_link_label":"","msr_doi_label":"","msr_published_date":"2019-02-12","msr_startdate":"","msr_presentation_date":"","msr_highlight_text":"","msr_notes":"","msr_longbiography":"","msr_publicationurl":"","msr_external_url":"","msr_secondary_video_url":"","msr_conference_url":"","msr_journal_url":"","msr_year":2019,"msr_month":2,"msr_day":12,"msr_microsoftintellectualproperty":false,"msr_pub_id":"","msr_publication_uploader":[{"type":"doi","viewUrl":"false","id":false,"title":"10.1145\/3291045","label_id":243106,"label":0},{"type":"url","viewUrl":"false","id":false,"title":"https:\/\/dl.acm.org\/doi\/10.1145\/3291045","label_id":243109,"label":0}],"msr_related_uploader":[],"msr_original_fields_of_study":["Computer science","Artificial intelligence","Reinforcement learning","Cluster analysis","Dreyfus model of skill acquisition","Graph (abstract data type)","Intelligent agent","Graph based","Centrality"],"msr_s2_paper_id":"","msr_s2_pdf_url":"","msr_citation_count_updated":"","msr_citation_count":0,"msr_influential_citations":0,"msr_reference_count":0,"msr_s2_open_access":false,"msr_s2_author_ids":[],"msr_pub_ids":[],"msr_hide_image_in_river":0,"footnotes":""},"msr-research-highlight":[],"research-area":[13561,13556],"msr-publication-type":[193715],"msr-publisher":[],"msr-publication-cta":[],"msr-focus-area":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[246694,258700,248083,246691,251620,256612,252943,246820,265830],"msr-conference":[],"msr-journal":[263416],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-870966","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-algorithms","msr-research-area-artificial-intelligence","msr-locale-en_us","msr-field-of-study-artificial-intelligence","msr-field-of-study-centrality","msr-field-of-study-cluster-analysis","msr-field-of-study-computer-science","msr-field-of-study-graph-abstract-data-type","msr-field-of-study-graph-based","msr-field-of-study-intelligent-agent","msr-field-of-study-reinforcement-learning","msr-field-of-study-skill-acquisition"],"msr_publishername":"","msr_edition":"","msr_affiliation":"","msr_published_date":"2019-02-12","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"ACM Computing Surveys","msr_volume":"52","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"1","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":0,"msr_main_download":"","msr_publicationurl":"","msr_doi":"10.1145\/3291045","msr_publication_uploader":[{"type":"doi","viewUrl":"false","id":"false","title":"10.1145\/3291045","label_id":"243106","label":0},{"type":"url","viewUrl":"false","id":"false","title":"https:\/\/dl.acm.org\/doi\/10.1145\/3291045","label_id":"243109","label":0}],"msr_related_uploader":[],"msr_citation_count":0,"msr_citation_count_updated":"","msr_s2_paper_id":"","msr_influential_citations":0,"msr_reference_count":0,"msr_arxiv_id":"","msr_s2_author_ids":[],"msr_s2_open_access":false,"msr_s2_pdf_url":null,"msr_attachments":[],"msr-author-ordering":[{"type":"user_nicename","value":"Matheus Mendon\u00e7a","user_id":41940,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Matheus Mendon\u00e7a"},{"type":"guest","value":"artur-ziviani","user_id":870333,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=artur-ziviani"},{"type":"guest","value":"andre-da-motta-salles-barreto","user_id":870330,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=andre-da-motta-salles-barreto"}],"msr_impact_theme":[],"msr_research_lab":[],"msr_event":[],"msr_group":[],"msr_project":[],"publication":[],"video":[],"msr-tool":[],"msr_publication_type":"article","related_content":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/870966","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":1,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/870966\/revisions"}],"predecessor-version":[{"id":870969,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/870966\/revisions\/870969"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=870966"}],"wp:term":[{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=870966"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=870966"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=870966"},{"taxonomy":"msr-publisher","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publisher?post=870966"},{"taxonomy":"msr-publication-cta","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-cta?post=870966"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=870966"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=870966"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=870966"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=870966"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=870966"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=870966"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=870966"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=870966"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}