Downloads
ORCAS: Open Resource for Click Analysis in Search
April 2024
ORCAS is a click-based dataset associated with the TREC Deep Learning Track. It covers 1.4 million of the TREC DL documents, providing 18 million connections to 10 million distinct queries.
TREC Deep Learning Track
April 2024
The TREC Deep Learning Track studies information retrieval in a large training data regime. This is the case where the number of training queries with at least one positive label is at least in the tens of thousands, if not…
Tip of the Tongue Known Item Retrieval Dataset for Movie Identification
August 2021
The Tip of the Tongue (ToT) dataset is from the paper "Tip of the Tongue Known-Item Retrieval: A Case Study in Movie Identification". It comprises 758 question/answer pairs scraped from the website iRememberThisMovie.com between 2013 and 2018. These…
Conformer-Kernel Model with Query Term Independence (TREC Deep Learning Quick Start)
March 2021
This is a quick start guide for the document ranking task in the TREC Deep Learning (TREC-DL) benchmark. If you are new to TREC-DL, then this repository may make it more convenient for you to download all the required datasets…
MS MARCO
May 2019
MS MARCO is a collection of datasets focused on deep learning in search. The first dataset was a question answering dataset featuring 100,000 real Bing questions and a human-generated answer. Since then we released a 1,000,000-question dataset, a…
IR metrics for R
April 2018
This is a small library for implementing several standard "test collection" or "offline" evaluation measures for search systems. See: https://github.com/Microsoft/irmetrics-r
Dual Word Embeddings Trained on Bing Queries
February 2016
This data is being released for research purposes only. The DESM Word Embeddings dataset may include terms that some may consider offensive, indecent or otherwise objectionable. Microsoft has not reviewed or modified the content of the dataset. Microsoft is providing…
People
Nick Craswell
Principal Architect
Bhaskar Mitra
Principal Researcher
Paul Thomas
Senior Applied Scientist
Milad Shokouhi
Principal Applied Scientist