Adversarial and reinforcement learning-based approaches to information retrieval

Published July 9, 2018 | Microsoft Research Blog
https://www.microsoft.com/en-us/research/blog/adversarial-and-reinforcement-learning-based-approaches-to-information-retrieval/
Traditionally, machine learning-based approaches to information retrieval have taken the form of supervised learning-to-rank models. Recent advances in other areas of machine learning, such as adversarial learning and reinforcement learning, should find interesting new applications in future retrieval systems. At Microsoft AI & Research, we have been exploring some of these methods in the context of web search. We will share some of our recent work in this area at SIGIR 2018. This post briefly describes what is in a couple of the papers we are presenting at the conference.

While traditional learning-to-rank methods depend on hand-engineered features, recently proposed deep learning-based ranking models such as DSSM and Duet focus more on learning good representations of query and document text for matching by training on large datasets. For example, if during training the ranking model observes that the phrases "Theresa May" and "UK Prime Minister" frequently co-occur in documents, it may infer that they are somehow related. The model might then learn to promote documents containing "Theresa May" in the ranking when the query involves "UK Prime Minister". If this same model is employed to retrieve from a different test collection, say an older one in which a connection to "John Major" would be more appropriate, then the model's performance may suffer.

The ability to learn from data that certain entities, concepts, and phrases are related to each other gives representation learning models a distinct advantage. But this can come at the cost of poor cross-domain generalization: if the training and the test data are sampled from different document collections, these models may demonstrate poorer retrieval performance. In web search, the query and document distributions also naturally evolve over time.
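The flavor of representation-based matching described above can be illustrated with a toy sketch (this is not DSSM or Duet, whose architectures are far richer; the corpus, function names, and scoring here are invented for illustration). Each word is represented by its document co-occurrence counts, and a query is matched against a document by the cosine similarity of their averaged word vectors:

```python
# Toy representation-based matcher (illustrative only, not an actual
# neural ranking model). A word's vector is its co-occurrence count with
# every vocabulary term; texts are compared by cosine similarity of
# their mean word vectors.
import math
from collections import Counter

corpus = [
    "theresa may is the uk prime minister",
    "uk prime minister theresa may spoke today",
    "john major was the uk prime minister",
]

vocab = sorted({w for doc in corpus for w in doc.split()})

def word_vector(word):
    # How often `word` appears in the same document as each vocab term.
    counts = Counter()
    for doc in corpus:
        tokens = doc.split()
        if word in tokens:
            counts.update(t for t in tokens if t != word)
    return [counts[v] for v in vocab]

def text_vector(text):
    # Represent a text as the element-wise mean of its word vectors.
    vecs = [word_vector(w) for w in text.split()]
    return [sum(dims) / len(vecs) for dims in zip(*vecs)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# "theresa may" never shares a word with the query, yet it still matches,
# because the two phrases co-occur in the training corpus.
print(cosine(text_vector("uk prime minister"), text_vector("theresa may")))
```

The strength and the weakness are the same thing: the match relies entirely on associations present in the training collection, so if the collection shifts (for example, to an era when a different person held the office), those learned associations go stale.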
Therefore, a representation learning model may need to be re-trained periodically to avoid performance degradation. Traditional IR models, such as BM25, make minimal assumptions about the data distribution and are therefore more robust to such differences. Learning deep models that exhibit similar robustness in cross-domain performance is an important challenge in search. When Daniel Cohen, a PhD student from the University of Massachusetts Amherst, joined the Microsoft Research lab in Cambridge last summer, he chose to study the effectiveness of adversarial learning as a cross-domain regularizer.