{"id":246407,"date":"2014-06-21T12:23:54","date_gmt":"2014-06-21T19:23:54","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&p=246407"},"modified":"2018-10-16T22:23:59","modified_gmt":"2018-10-17T05:23:59","slug":"combinatorial-partial-monitoring-game-linear-feedback-applications","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/combinatorial-partial-monitoring-game-linear-feedback-applications\/","title":{"rendered":"Combinatorial Partial Monitoring Game with Linear Feedback and its Applications"},"content":{"rendered":"
In online learning, a player chooses actions to play and receives reward and feedback from the environment with the goal of maximizing her reward over time. In this paper, we propose the model of combinatorial partial monitoring games with linear feedback, a model which simultaneously addresses limited feedback, in\ufb01nite outcome space of the environment and exponentially large action space of the player. We present the Global Con\ufb01dence Bound (GCB) algorithm, which integrates ideas from both combinatorial multi-armed bandits and \ufb01nite partial monitoring games to handle all the above issues. GCB only requires feedback on a small set of actions and achieves O(T^{2\/3} logT) distribution-independent regret and O(logT) distribution-dependent regret (the latter assuming unique optimal action), where T is the total time steps played. Moreover, the regret bounds only depend linearly on log|X| rather than |X|, where X is the action space. GCB isolates of\ufb02ine optimization tasks from online learning and avoids explicit enumeration of all actions in the online learning part. We demonstrate that our model and algorithm can be applied to a crowd sourcing application leading to both an ef\ufb01cient learning algorithm and low regret, and argue that they can be applied to a wide range of combinatorial applications constrained with limited feedback.<\/p>\n","protected":false},"excerpt":{"rendered":"
In online learning, a player chooses actions to play and receives reward and feedback from the environment with the goal of maximizing her reward over time. In this paper, we propose the model of combinatorial partial monitoring games with linear feedback, a model which simultaneously addresses limited feedback, in\ufb01nite outcome space of the environment and […]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"msr-content-type":[3],"msr-research-highlight":[],"research-area":[13561,13556,13548,13559],"msr-publication-type":[193716],"msr-product-type":[],"msr-focus-area":[],"msr-platform":[],"msr-download-source":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-246407","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-algorithms","msr-research-area-artificial-intelligence","msr-research-area-economics","msr-research-area-social-sciences","msr-locale-en_us"],"msr_publishername":"","msr_edition":"Proceedings of the 31st International Conference on Machine Learning (ICML'2014), Beijing, China, June 2014.","msr_affiliation":"","msr_published_date":"2014-06-21","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"JMLR: W&CP volume 32.","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"457461","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"file","title":"CombinatorialPartialMonitoringGamewith LinearFeedbackandItsApplications","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2014\/06\/icml14_cpm_main.pdf","id":457461,"label_id":0},{"type":"file","title":"CombinatorialPartialMonitoringGamewith LinearFeedbackandItsApplication","viewUrl":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2014\/06\/icml14_cpm_supplementary.pdf","id":457464,"label_id":0}],"msr_related_uploader":"","msr_attachments":[{"id":457464,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/01icml14_cpm_supplementary.pdf"},{"id":457461,"url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/01icml14_cpm_main.pdf"}],"msr-author-ordering":[{"type":"text","value":"Tian Lin","user_id":0,"rest_url":false},{"type":"text","value":"Bruno Abrahao","user_id":0,"rest_url":false},{"type":"text","value":"Robert Kleinberg","user_id":0,"rest_url":false},{"type":"text","value":"John C. S. Lui","user_id":0,"rest_url":false},{"type":"user_nicename","value":"Wei Chen","user_id":34795,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Wei Chen"}],"msr_impact_theme":[],"msr_research_lab":[],"msr_event":[],"msr_group":[802999],"msr_project":[],"publication":[],"video":[],"download":[],"msr_publication_type":"inproceedings","related_content":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/246407","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":2,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/246407\/revisions"}],"predecessor-version":[{"id":428514,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/246407\/revisions\/428514"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=246407"}],"wp:term":[{"taxonomy":"msr-content-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-content-type?post=246407"},{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=246407"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=246407"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=246407"},{"taxonomy":"msr-product-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-product-type?post=246407"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=246407"},{"taxonomy":"msr-platform","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-platform?post=246407"},{"taxonomy":"msr-download-source","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-download-source?post=246407"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=246407"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=246407"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=246407"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=246407"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=246407"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=246407"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=246407"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}