Overview of the TREC 2022 deep learning track

Nick Craswell; Bhaskar Mitra; Emine Yilmaz; Daniel Campos; Jimmy Lin; Ellen M. Voorhees; Ian Soboroff

Text REtrieval Conference (TREC) | March 2023

Published by TREC | Organized by NIST

This is the fourth year of the TREC Deep Learning track. As in previous years, we leverage the MS MARCO datasets, which make hundreds of thousands of human-annotated training labels available for both passage and document ranking tasks. This year we also leverage the refreshed passage and document collections released last year, which brought a nearly 16-fold increase in the size of the passage collection and a nearly four-fold increase in the size of the document collection. Unlike previous years, in 2022 we focused mainly on constructing a more complete test collection for the passage retrieval task, which has been the primary focus of the track. The document ranking task was kept as a secondary task, with document-level labels inferred from the passage-level labels. Our analysis shows that, as in previous years, deep neural ranking models that employ large-scale pretraining continued to outperform traditional retrieval methods. Because we focused our judging resources on passage judging, we are more confident in the quality of this year’s queries and judgments, with respect to our ability to distinguish between runs and to reuse the dataset in the future. We also see some surprises in overall outcomes: some top-performing runs did not do dense retrieval, and runs that did single-stage dense retrieval were not as competitive this year as they were last year.
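The abstract does not say how document-level labels were inferred from passage-level labels. As a minimal sketch of one plausible scheme, the Python below assumes that a document inherits the maximum label among its judged passages for a given query. The function name `infer_document_labels`, the `passage_to_doc` mapping, and all identifiers in the usage lines are hypothetical and are not taken from the track's tooling.

```python
from collections import defaultdict


def infer_document_labels(passage_labels, passage_to_doc):
    """Aggregate passage-level labels into document-level labels per query.

    Assumption (not stated in the abstract): a document's label is the
    maximum label among its judged passages for that query.

    passage_labels: dict mapping (query_id, passage_id) -> integer label
    passage_to_doc: dict mapping passage_id -> doc_id (hypothetical mapping)
    """
    doc_labels = defaultdict(dict)
    for (query_id, passage_id), label in passage_labels.items():
        doc_id = passage_to_doc[passage_id]
        # Keep the highest label seen so far for this (query, document) pair.
        previous = doc_labels[query_id].get(doc_id, label)
        doc_labels[query_id][doc_id] = max(previous, label)
    return {query_id: dict(docs) for query_id, docs in doc_labels.items()}


# Hypothetical toy judgments: two passages from document dA, one from dB.
example_passage_labels = {
    ("q1", "p1"): 0,
    ("q1", "p2"): 3,
    ("q1", "p3"): 1,
    ("q2", "p1"): 2,
}
example_passage_to_doc = {"p1": "dA", "p2": "dA", "p3": "dB"}
example_doc_labels = infer_document_labels(
    example_passage_labels, example_passage_to_doc
)
```

Under this assumption, a document counts as relevant as its best passage. Other aggregation rules, such as a sum or a thresholded count, would fit the same interface.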

Related Tools

TREC Deep Learning Track

April 24, 2024

The TREC Deep Learning Track studies information retrieval in a large training data regime. This is the case where the number of training queries with at least one positive label is at least in the tens of thousands, if not hundreds of thousands or more. This corresponds to real-world scenarios such as training based on click logs and training based on labels from shallow pools (such as the pooling in the TREC Million Query Track or the evaluation of search engines based on early precision).
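To make the regime definition concrete, the sketch below counts training queries that have at least one positive label in a standard four-column TREC qrels listing (query ID, iteration, document ID, relevance). The function names, the `min_positive_label` cutoff, and the 10,000 threshold used for "tens of thousands" are illustrative assumptions, not values specified by the track.

```python
def count_queries_with_positive(qrels_lines, min_positive_label=1):
    """Count distinct queries with at least one positively labeled item.

    qrels_lines: iterable of strings in the four-column TREC qrels format,
                 "query_id iteration doc_id relevance".
    min_positive_label: lowest relevance grade treated as positive
                        (an assumption; adjust to the label scale in use).
    """
    positive_queries = set()
    for line in qrels_lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # Skip blank or malformed lines.
        query_id, _iteration, _doc_id, relevance = parts
        if int(relevance) >= min_positive_label:
            positive_queries.add(query_id)
    return len(positive_queries)


def is_large_data_regime(num_positive_queries, threshold=10_000):
    """Return True when the count reaches the assumed 'tens of thousands' mark."""
    return num_positive_queries >= threshold


# Hypothetical toy qrels: q1 and q3 each have a positive label, q2 has none.
example_qrels = [
    "q1 0 d10 1",
    "q1 0 d11 0",
    "q2 0 d12 0",
    "q3 0 d13 1",
    "",
]
example_count = count_queries_with_positive(example_qrels)
```

On real training data the same count would be computed over the full qrels file, for example by passing an open file handle as `qrels_lines`.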
