Downloads
Hippocorpus
February 2022
To examine the cognitive processes of remembering and imagining and their traces in language, we introduce Hippocorpus, a dataset of 6,854 English diary-like short stories about recalled and imagined events. Using a crowdsourcing framework, we first collect recalled stories and…
GLUECoS
July 2020
This is the repo for the ACL 2020 paper "GLUECoS: An Evaluation Benchmark for Code-Switched NLP". GLUECoS is a benchmark comprising multiple code-mixed tasks across two language pairs (En-Es and En-Hi).
MIcrosoft News Dataset (MIND)
July 2020
MIcrosoft News Dataset (MIND) is a large-scale dataset for news recommendation research. It was collected from anonymized behavior logs of the Microsoft News website. The mission of MIND is to serve as a benchmark dataset for news recommendation and facilitate the…