Microsoft Research Lab – Montréal

Downloads

Generative Neural Visual Artist (GeNeVA) – Datasets – Generation Code

May 2019

Scripts to generate the CoDraw and i-CLEVR datasets used for the Generative Neural Visual Artist (GeNeVA) task proposed in "Tell, Draw, and Repeat: Generating and modifying images based on continual linguistic instruction".

Github

MS MARCO

May 2019

MS MARCO is a collection of datasets focused on deep learning in search. The first dataset was a question answering dataset featuring 100,000 real Bing questions and a human-generated answer. Since then we released a 1,000,000 question dataset, a…

Download

FigureQA Dataset

March 2018

Answering questions about a given image is a difficult task, requiring both an understanding of the image and the accompanying query. Microsoft Montreal’s FigureQA dataset introduces a new visual reasoning task for research, specific to graphical plots and figures. The…

Download

Frames Dataset

March 2018

Frames is a dataset designed to encourage research towards conversational agents which can support decision-making in complex settings, in this case – booking a vacation including flights and a hotel. More than just searching a database, we believe the next…

Download

NewsQA Dataset

March 2018

The purpose of Microsoft Montreal’s NewsQA dataset is to help the research community build algorithms that are capable of answering questions requiring human-level comprehension and reasoning skills. Leveraging CNN articles from the DeepMind Q&A Dataset, we prepared a crowd-sourced machine…

Download

nlg-eval

January 2018

Evaluation code for various unsupervised automated metrics for NLG (Natural Language Generation). It takes as input a hypothesis file and one or more reference files, and outputs the metric values. Rows across these files should correspond to the same…

Github
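
The input layout described for nlg-eval (one hypothesis file, one or more reference files, aligned row by row) can be illustrated with a minimal sketch. This is not nlg-eval's code or API: the helper names `read_rows`, `align_rows`, and `unigram_precision` are hypothetical, and the toy unigram-precision score only stands in for the real metrics the tool computes.

```python
from pathlib import Path


def read_rows(path):
    """Read a text file and return its lines, one row per example."""
    return Path(path).read_text(encoding="utf-8").splitlines()


def align_rows(hypothesis_path, reference_paths):
    """Pair row i of the hypothesis file with row i of every reference file.

    Hypothetical helper: it only demonstrates the row-aligned layout,
    and raises if any reference file has a different number of rows.
    """
    hypotheses = read_rows(hypothesis_path)
    references = [read_rows(p) for p in reference_paths]
    for path, rows in zip(reference_paths, references):
        if len(rows) != len(hypotheses):
            raise ValueError(
                f"{path} has {len(rows)} rows, expected {len(hypotheses)}"
            )
    return [
        (hyp, [rows[i] for rows in references])
        for i, hyp in enumerate(hypotheses)
    ]


def unigram_precision(hypothesis, references):
    """Toy stand-in metric: fraction of hypothesis tokens that appear
    in at least one reference. Not one of nlg-eval's actual metrics."""
    hyp_tokens = hypothesis.split()
    if not hyp_tokens:
        return 0.0
    ref_vocab = {tok for ref in references for tok in ref.split()}
    matched = sum(1 for tok in hyp_tokens if tok in ref_vocab)
    return matched / len(hyp_tokens)


def score_files(hypothesis_path, reference_paths):
    """Return one toy score per row of the hypothesis file."""
    return [
        unigram_precision(hyp, refs)
        for hyp, refs in align_rows(hypothesis_path, reference_paths)
    ]
```

Calling `score_files("hyp.txt", ["ref1.txt", "ref2.txt"])` (file names assumed for illustration) returns one score per row. A row count mismatch raises an error instead of silently misaligning examples.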