{"id":493160,"date":"2018-07-09T07:54:28","date_gmt":"2018-07-09T14:54:28","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=493160"},"modified":"2018-07-09T07:54:28","modified_gmt":"2018-07-09T14:54:28","slug":"adversarial-and-reinforcement-learning-based-approaches-to-information-retrieval","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/adversarial-and-reinforcement-learning-based-approaches-to-information-retrieval\/","title":{"rendered":"Adversarial and reinforcement learning-based approaches to information retrieval"},"content":{"rendered":"<p>Traditionally, machine learning based approaches to information retrieval have taken the form of supervised learning-to-rank models. Recent advances in other machine learning approaches\u2014such as adversarial learning and reinforcement learning\u2014should find interesting new applications in future retrieval systems. At Microsoft AI & Research, we have been exploring some of these methods in the context of web search. We will share some of our recent work in this area at <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"http:\/\/sigir.org\/sigir2018\/\">SIGIR 2018<\/a>. This post briefly describes what is in couple of the papers we are presenting at the conference.<\/p>\n<h3>Adversarial learning for cross domain regularization<\/h3>\n<p>While traditional learning-to-rank methods depend on hand-engineered features, recently proposed deep learning based ranking models such as <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/project\/dssm\/\">DSSM<\/a> and <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/learning-match-using-local-distributed-representations-text-web-search\/\">Duet<\/a> focus more on learning good representations of query and document text for matching by training on large datasets. For example, during training if the ranking model observes that the pair of phrases, \u201cTheresa May\u201d and \u201cUK Prime Minister\u201d co-occur frequently together in documents then it may infer that they are somehow related. The model might learn to promote documents containing \u201cTheresa May\u201d in the ranking when the query involves \u201cUK Prime Minister\u201d. If this same model is employed to retrieve from a different, say, older test collection in which the connection to \u201cJohn Major\u201d may be more appropriate, then the model performance may suffer.<\/p>\n<p>The ability to learn from data that certain entities, concepts, and phrases are related to each other gives representation learning models a distinct advantage. But this can come at the cost of poor cross-domain generalization. For example, if the training and the test data are sampled from different document collections then these models may demonstrate poorer retrieval performance. In web search, the query and document distributions also naturally evolve over time. Therefore, a representation learning model may need to be re-trained periodically to avoid performance degradations. Traditional IR models\u2014such as <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/en.wikipedia.org\/wiki\/Okapi_BM25\">BM25<\/a> \u2014make minimal assumptions about the data distributions and therefore are more robust to such differences. Learning deep models that can exhibit similar robustness in cross domain performance is an important challenge in search. When Daniel Cohen, a PhD student from University of Massachusetts, Amherst, joined the <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/lab\/microsoft-research-cambridge\/\">Microsoft Research lab<\/a> in Cambridge last summer, he chose to study the effectiveness of adversarial learning as a cross-domain regularizer.<\/p>\n<blockquote><p><em><strong>\u201cMy motivation for collection\/domain regularization came about with the rise of these massively complex models that are achieving state of the art results. While this performance is desirable, these same models drastically underperform when the domain substantially shifts away from what was seen during training. As there are a lot of fields without enough training data (legal, medical), this provides a stepping stone towards a general relevance model that isn&#8217;t burdened with traditional handcrafted features.\u201d \u2013 Daniel Cohen<\/strong><\/em><\/p><\/blockquote>\n<p>The findings from our project are summarized in, \u201c<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/arxiv.org\/abs\/1805.03403\">Cross Domain Regularization for Neural Ranking Models<\/a>\u201d co-authored by Daniel Cohen, myself, Katja Hofmann, and W. Bruce Croft. In this work, we train the neural ranking models on a small set of domains. Simultaneously, an adversarial discriminator is trained that provides a negative feedback signal to the model to discourage it from learning domain specific representations.<\/p>\n<div id=\"attachment_493169\" style=\"width: 2510px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-493169\" class=\"wp-image-493169 size-full\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2018\/06\/adversarial.png\" alt=\"\" width=\"2500\" height=\"733\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2018\/06\/adversarial.png 2500w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2018\/06\/adversarial-300x88.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2018\/06\/adversarial-768x225.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2018\/06\/adversarial-1024x300.png 1024w\" sizes=\"(max-width: 2500px) 100vw, 2500px\" \/><p id=\"caption-attachment-493169\" class=\"wp-caption-text\">Figure 1 \u2013 Cross domain regularization using an adversarial discriminator. The discriminator inspects the learned representations of the ranking model and provides a negative feedback signal for any representation that aids domain discrimination.<\/p><\/div>\n<p>The proposed adversarial approach shows consistent performance improvements over different deep neural baselines. However, we found that a model trained on large in-domain data is still likely to have a significant advantage over these models. Machine learning approaches to retrieval might need significantly more breakthroughs before achieving the level of robustness of some of the traditional retrieval models. In the meantime, improving robustness of these deep models will continue to be an important research challenge.<\/p>\n<h3>Reinforcement learning for effective and efficient query evaluations<\/h3>\n<p>Web search engines attempt to retrieve the top few relevant results by searching through collections containing billions of documents, often in under a second. To achieve such short response times, these systems typically employ large scale distributed indexes and specialized data structures. Typically, an initial set of candidates is identified that are progressively pruned and ranked by a cascade of retrieval models of increasing complexity. The index organization and query evaluation strategies explicitly trade off retrieval effectiveness and efficiency during the candidate generation stage. Unlike in late stage re-ranking where machine learning models are commonplace, the candidate generation frequently employs traditional retrieval models with few learnable parameters.<\/p>\n<p>In Bing, the candidate generation involves scanning the index using statically designed match plans that prescribe sequences of different match criteria and stopping conditions. During query evaluation, the query is classified into one of a few pre-defined categories and based on this categorization a match plan is selected. Documents are scanned based on the chosen match plan which consists of a sequence of match rules and corresponding stopping criteria. A match rule defines the condition that a document should satisfy to be selected as a candidate for ranking and the stopping criteria decides when the index scan using a match rule should terminate \u2014 and whether the matching process should continue with the next match rule, or conclude, or reset to the beginning of the index. These match plans influence the trade-off between how quickly Bing responds to a search query and the quality of its results. For example, long queries with rare intents might require more expensive match plans that consider the body text of the documents and search deeper into the index to find more candidates. In contrast, for a popular navigational query a fast and shallow scan against a subset of the document fields\u2014for example, URL and title\u2014may be sufficient.<\/p>\n<div id=\"attachment_493166\" style=\"width: 1653px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-493166\" class=\"wp-image-493166 size-full\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2018\/06\/figure2_cascading-architecture.png\" alt=\"\" width=\"1643\" height=\"763\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2018\/06\/figure2_cascading-architecture.png 1643w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2018\/06\/figure2_cascading-architecture-300x139.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2018\/06\/figure2_cascading-architecture-768x357.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2018\/06\/figure2_cascading-architecture-1024x476.png 1024w\" sizes=\"(max-width: 1643px) 100vw, 1643px\" \/><p id=\"caption-attachment-493166\" class=\"wp-caption-text\">Figure 2 \u2013 A cascading architecture employed in Bing\u2019s retrieval system. Documents are scanned using a pre-defined match plan. Matched documents are passed through additional rank-and-prune stages.<\/p><\/div>\n<p>Sometime last year, we started thinking about casting the match planning process as a reinforcement learning task. After some experimentation, we managed to successfully learn a policy using table-based Q-learning that sequentially decides which match rules to employ during the candidate generation. We trained the model to maximize a cumulative reward computed based on the estimated relevance of the additional documents discovered, discounted by their cost of retrieval. By learning\u2014instead of hand-crafting\u2014these match plans, we observed significant reduction in the number of index blocks accessed with small or no degradations in the candidate set quality. You can find more information about this project in the paper titled \u201c<a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/arxiv.org\/abs\/1804.04410\">Optimizing Query Evaluations using Reinforcement Learning for Web Search<\/a>\u201d, by Corby Rosset, Damien Jose, Gargi Ghosh, myself, and Saurabh Tiwary.<\/p>\n<p>In web search, machine learning models are typically employed to achieve better retrieval effectiveness by learning from large datasets, often in exchange for few additional milliseconds of latency. In this work, we argue that machine learning can also be useful for improving the speed of retrieval. Not only do these translate into material cost savings in query serving infrastructure, but milliseconds of saved run-time can be re-purposed by upstream ranking systems to provide better end-user experience.<\/p>\n<div id=\"attachment_493163\" style=\"width: 1852px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-493163\" class=\"wp-image-493163 size-full\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2018\/06\/team-pictures_figure-3.png\" alt=\"\" width=\"1842\" height=\"375\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2018\/06\/team-pictures_figure-3.png 1842w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2018\/06\/team-pictures_figure-3-300x61.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2018\/06\/team-pictures_figure-3-768x156.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2018\/06\/team-pictures_figure-3-1024x208.png 1024w\" sizes=\"(max-width: 1842px) 100vw, 1842px\" \/><p id=\"caption-attachment-493163\" class=\"wp-caption-text\">Figure 3 &#8211; The co-authors of the two papers discussed in this post. (From left) Katja Hofmann, Bhaskar Mitra, W. Bruce Croft, Daniel Cohen, Damien Jose, Gargi Ghosh, Corby Rosset, and Saurabh Tiwary.<\/p><\/div>\n<h3>What\u2019s next?<\/h3>\n<p>At Microsoft AI & Research, we are always exploring new ways to apply state of the art machine learning approaches to information retrieval and search. If you\u2019re interested in learning more about some of our recent work in search and information retrieval, I encourage you to check out our <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/research-area\/search-information-retrieval\/?q&content-type=publications\">recent publications<\/a>. You can also find more information about the research that my team is focused on <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/group\/web-ai-sciences\/\">here<\/a>. There are a lot of hard problems and exciting recent work in machine learning for search. If you\u2019re interested in this area, please be sure to check out our <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/introduction-neural-information-retrieval\/\">overview paper on neural information retrieval<\/a>.<\/p>\n<h4><\/h4>\n<h4>Related Papers<\/h4>\n<ul>\n<li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/optimizing-query-evaluations-using-reinforcement-learning-web-search\/\">Optimizing Query Evaluations Using Reinforcement Learning for Web Search<\/a><\/li>\n<li><a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/cross-domain-regularization-neural-ranking-models-using-adversarial-learning\/\">Cross Domain Regularization for Neural Ranking Models Using Adversarial Learning<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Traditionally, machine learning based approaches to information retrieval have taken the form of supervised learning-to-rank models. Recent advances in other machine learning approaches\u2014such as adversarial learning and reinforcement learning\u2014should find interesting new applications in future retrieval systems. At Microsoft AI & Research, we have been exploring some of these methods in the context of web [&hellip;]<\/p>\n","protected":false},"author":37074,"featured_media":493187,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"footnotes":""},"categories":[194460],"tags":[],"research-area":[13556,13555],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-493160","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-search-and-information-retrieval","msr-research-area-artificial-intelligence","msr-research-area-search-information-retrieval","msr-locale-en_us"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[267093],"related-projects":[],"related-events":[489929],"related-researchers":[{"type":"user_nicename","value":"Bhaskar Mitra","user_id":31257,"display_name":"Bhaskar Mitra","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/bmitra\/\" aria-label=\"Visit the profile page for Bhaskar Mitra\">Bhaskar Mitra<\/a>","is_active":false,"last_first":"Mitra, Bhaskar","people_section":0,"alias":"bmitra"}],"msr_type":"Post","featured_image_thumbnail":"<img width=\"926\" height=\"540\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2018\/06\/InformationRetrieval_Carousel_06_2018_480x280.jpg\" class=\"img-object-cover\" alt=\"\" decoding=\"async\" loading=\"lazy\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2018\/06\/InformationRetrieval_Carousel_06_2018_480x280.jpg 960w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2018\/06\/InformationRetrieval_Carousel_06_2018_480x280-300x175.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2018\/06\/InformationRetrieval_Carousel_06_2018_480x280-768x448.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2018\/06\/InformationRetrieval_Carousel_06_2018_480x280-480x280.jpg 480w\" sizes=\"(max-width: 926px) 100vw, 926px\" \/>","byline":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/bmitra\/\" title=\"Go to researcher profile for Bhaskar Mitra\" aria-label=\"Go to researcher profile for Bhaskar Mitra\" data-bi-type=\"byline author\" data-bi-cN=\"Bhaskar Mitra\">Bhaskar Mitra<\/a>","formattedDate":"July 9, 2018","formattedExcerpt":"Traditionally, machine learning based approaches to information retrieval have taken the form of supervised learning-to-rank models. Recent advances in other machine learning approaches\u2014such as adversarial learning and reinforcement learning\u2014should find interesting new applications in future retrieval systems. At Microsoft AI &amp; Research, we have been&hellip;","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/493160"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/37074"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=493160"}],"version-history":[{"count":7,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/493160\/revisions"}],"predecessor-version":[{"id":493469,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/493160\/revisions\/493469"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/493187"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=493160"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=493160"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=493160"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=493160"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=493160"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=493160"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=493160"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=493160"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=493160"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=493160"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=493160"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}