Unbiased Learning-to-Rank with Biased Feedback

Thorsten Joachims; Adith Swaminathan; Tobias Schnabel

Web Search and Data Mining | February 2017

Published by ACM

Best Paper Award

Implicit feedback (e.g., clicks, dwell times, etc.) is an abundant source of data in human-interactive systems. While implicit feedback has many advantages (e.g., it is inexpensive to collect, user centric, and timely), its inherent biases are a key obstacle to its effective use. For example, position bias in search rankings strongly influences how many clicks a result receives, so that directly using click data as a training signal in Learning-to-Rank (LTR) methods yields sub-optimal results. To overcome this bias problem, we present a counterfactual inference framework that provides the theoretical basis for unbiased LTR via Empirical Risk Minimization despite biased data. Using this framework, we derive a Propensity-Weighted Ranking SVM for discriminative learning from implicit feedback, where click models take the role of the propensity estimator. In contrast to most conventional approaches to de-bias the data using click models, this allows training of ranking functions even in settings where queries do not repeat. Beyond the theoretical support, we show empirically that the proposed learning method is highly effective in dealing with biases, that it is robust to noise and propensity model misspecification, and that it scales efficiently. We also demonstrate the real-world applicability of our approach on an operational search engine, where it substantially improves retrieval performance.
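The propensity-weighting idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's Propensity-Weighted Ranking SVM. It only shows the general inverse-propensity-scoring step: each clicked result contributes its rank under a candidate ranking, divided by the estimated probability that the user examined it in the logged ranking. The function names, the rank-sum loss, and the position-based propensity `(1 / rank) ** eta` are all assumptions made for this illustration; in the framing above, a click model would supply the propensities.

```python
from typing import Sequence


def position_propensity(rank: int, eta: float = 1.0) -> float:
    """Assumed examination propensity for a 1-based rank: (1 / rank) ** eta.

    This simple position-based form is an illustrative stand-in for
    whatever click model is used as the propensity estimator.
    """
    if rank < 1:
        raise ValueError("rank must be >= 1")
    return (1.0 / rank) ** eta


def ips_rank_loss(
    new_ranks: Sequence[int],
    logged_ranks: Sequence[int],
    clicks: Sequence[int],
    eta: float = 1.0,
) -> float:
    """Inverse-propensity-weighted sum of ranks over clicked results.

    new_ranks[i]    : 1-based rank of result i under the candidate ranking
    logged_ranks[i] : 1-based rank of result i when the click was logged
    clicks[i]       : 1 if result i was clicked, else 0
    """
    if not (len(new_ranks) == len(logged_ranks) == len(clicks)):
        raise ValueError("all inputs must have the same length")

    total = 0.0
    for new_rank, logged_rank, clicked in zip(new_ranks, logged_ranks, clicks):
        if clicked:
            # Clicks on rarely examined positions are up-weighted.
            total += new_rank / position_propensity(logged_rank, eta)
    return total


if __name__ == "__main__":
    # Three results for one query; results 0 and 2 were clicked.
    loss = ips_rank_loss(
        new_ranks=[1, 2, 3],
        logged_ranks=[3, 1, 2],
        clicks=[1, 0, 1],
        eta=1.0,
    )
    print(f"IPS-weighted rank loss: {loss:.3f}")
```

Because only clicked results enter the sum, an estimate of this kind can be computed from a single logged impression per query, which is consistent with the abstract's point that queries need not repeat.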