{"id":756592,"date":"2021-06-23T23:18:30","date_gmt":"2021-06-24T06:18:30","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-research-item&p=756592"},"modified":"2022-10-19T23:35:54","modified_gmt":"2022-10-20T06:35:54","slug":"optimal-regret-algorithm-for-pseudo-1d-bandit-convex-optimization-2","status":"publish","type":"msr-research-item","link":"https:\/\/www.microsoft.com\/en-us\/research\/publication\/optimal-regret-algorithm-for-pseudo-1d-bandit-convex-optimization-2\/","title":{"rendered":"Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization"},"content":{"rendered":"

We study online learning with bandit feedback (i.e. learner has access to only zeroth-order oracle) where cost\/reward functions $\\f_t$ admit a “pseudo-1d” structure, i.e. $\\f_t(\\w) = \\loss_t(\\pred_t(\\w))$ where the output of $\\pred_t$ is one-dimensional. At each round, the learner observes context $\\x_t$, plays prediction $\\pred_t(\\w_t; \\x_t)$ (e.g. $\\pred_t(\\cdot)=\\langle \\x_t, \\cdot\\rangle$) for some $\\w_t \\in \\mathbb{R}^d$ and observes loss $\\loss_t(\\pred_t(\\w_t))$ where $\\loss_t$ is a convex Lipschitz-continuous function. The goal is to minimize the standard regret metric. This pseudo-1d bandit convex optimization problem (\\SBCO) arises frequently in domains such as online decision-making or parameter-tuning in large systems. For this problem, we first show a lower bound of $\\min(\\sqrt{dT}, T^{3\/4})$ for the regret of any algorithm, where $T$ is the number of rounds. We propose a new algorithm \\sbcalg that combines randomized online gradient descent with a kernelized exponential weights method to exploit the pseudo-1d structure effectively, guaranteeing the {\\em optimal} regret bound mentioned above, up to additional logarithmic factors. In contrast, applying state-of-the-art online convex optimization methods leads to $\\tilde{O}\\left(\\min\\left(d^{9.5}\\sqrt{T},\\sqrt{d}T^{3\/4}\\right)\\right)$ regret, that is significantly suboptimal in $d$.<\/p>\n","protected":false},"excerpt":{"rendered":"

We study online learning with bandit feedback (i.e. learner has access to only zeroth-order oracle) where cost\/reward functions $\\f_t$ admit a “pseudo-1d” structure, i.e. $\\f_t(\\w) = \\loss_t(\\pred_t(\\w))$ where the output of $\\pred_t$ is one-dimensional. At each round, the learner observes context $\\x_t$, plays prediction $\\pred_t(\\w_t; \\x_t)$ (e.g. $\\pred_t(\\cdot)=\\langle \\x_t, \\cdot\\rangle$) for some $\\w_t \\in \\mathbb{R}^d$ […]<\/p>\n","protected":false},"featured_media":0,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"msr-content-type":[3],"msr-research-highlight":[],"research-area":[13561,13556],"msr-publication-type":[193716],"msr-product-type":[],"msr-focus-area":[],"msr-platform":[],"msr-download-source":[],"msr-locale":[268875],"msr-post-option":[],"msr-field-of-study":[246904,246691,256144,250336,250495,257092,246913,256786,246823,249124],"msr-conference":[],"msr-journal":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-756592","msr-research-item","type-msr-research-item","status-publish","hentry","msr-research-area-algorithms","msr-research-area-artificial-intelligence","msr-locale-en_us","msr-field-of-study-algorithm","msr-field-of-study-computer-science","msr-field-of-study-context-language-use","msr-field-of-study-convex-optimization","msr-field-of-study-exponential-function","msr-field-of-study-function-mathematics","msr-field-of-study-gradient-descent","msr-field-of-study-logarithm","msr-field-of-study-regret","msr-field-of-study-upper-and-lower-bounds"],"msr_publishername":"","msr_edition":"","msr_affiliation":"","msr_published_date":"2021-7-14","msr_host":"","msr_duration":"","msr_version":"","msr_speaker":"","msr_other_contributors":"","msr_booktitle":"","msr_pages_string":"","msr_chapter":"","msr_isbn":"","msr_journal":"","msr_volume":"","msr_number":"","msr_editors":"","msr_series":"","msr_issue":"","msr_organization":"","msr_how_published":"","msr_notes":"","msr_highlight_text":"","msr_release_tracker_id":"","msr_original_fields_of_study":"","msr_download_urls":"","msr_external_url":"","msr_secondary_video_url":"","msr_longbiography":"","msr_microsoftintellectualproperty":1,"msr_main_download":"","msr_publicationurl":"","msr_doi":"","msr_publication_uploader":[{"type":"url","viewUrl":"false","id":"false","title":"https:\/\/aps.arxiv.org\/pdf\/2102.07387","label_id":"243132","label":0},{"type":"url","viewUrl":"false","id":"false","title":"https:\/\/arxiv.org\/abs\/2102.07387v1","label_id":"243109","label":0}],"msr_related_uploader":"","msr_attachments":[],"msr-author-ordering":[{"type":"user_nicename","value":"Aadirupa Saha","user_id":39835,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Aadirupa Saha"},{"type":"user_nicename","value":"Nagarajan Natarajan","user_id":37311,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Nagarajan Natarajan"},{"type":"user_nicename","value":"Praneeth Netrapalli","user_id":33279,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Praneeth Netrapalli"},{"type":"user_nicename","value":"Prateek Jain","user_id":33278,"rest_url":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/microsoft-research\/v1\/researchers?person=Prateek Jain"}],"msr_impact_theme":[],"msr_research_lab":[],"msr_event":[740803],"msr_group":[144940],"msr_project":[887322,813607],"publication":[],"video":[],"download":[],"msr_publication_type":"inproceedings","related_content":{"projects":[{"ID":887322,"post_title":"Network Brain","post_name":"netbrain","post_type":"msr-project","post_date":"2022-10-17 05:39:45","post_modified":"2023-11-02 01:11:16","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/netbrain\/","post_excerpt":"Holistic optimization of large-scale networked services The growth of large-scale networked services has brought to the fore myriad challenges: performance, reliability, efficiency, cost, and more. Traditionally, work on addressing and balancing these has been done in silos. For instance, an application could make choices to optimize its own performance or cost, while treating the rest of the workload and infrastructure as outside its purview. However, such an approach breaks down when an application or service…","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/887322"}]}},{"ID":813607,"post_title":"AI meets PL","post_name":"program-learning","post_type":"msr-project","post_date":"2022-01-19 05:15:20","post_modified":"2022-01-19 10:38:01","post_status":"publish","permalink":"https:\/\/www.microsoft.com\/en-us\/research\/project\/program-learning\/","post_excerpt":"In this area of research, we broadly explore combining machine learning and program synthesis in various ways. This is an umbrella project that has spawned several projects exploring applications of such a combination in different areas. Heterogeneous data extraction framework (HDEF): This project explores the benefits of combining program synthesis with machine learning for structured information extraction.\u00a0We use machine learning models (\u201cML models\u201d) such as conditional random fields to get an initial labeling of potential…","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/813607"}]}}]},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/756592"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-research-item"}],"version-history":[{"count":4,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/756592\/revisions"}],"predecessor-version":[{"id":804121,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-item\/756592\/revisions\/804121"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=756592"}],"wp:term":[{"taxonomy":"msr-content-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-content-type?post=756592"},{"taxonomy":"msr-research-highlight","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-research-highlight?post=756592"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=756592"},{"taxonomy":"msr-publication-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-publication-type?post=756592"},{"taxonomy":"msr-product-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-product-type?post=756592"},{"taxonomy":"msr-focus-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-focus-area?post=756592"},{"taxonomy":"msr-platform","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-platform?post=756592"},{"taxonomy":"msr-download-source","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-download-source?post=756592"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=756592"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=756592"},{"taxonomy":"msr-field-of-study","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-field-of-study?post=756592"},{"taxonomy":"msr-conference","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-conference?post=756592"},{"taxonomy":"msr-journal","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-journal?post=756592"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=756592"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=756592"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}