{"id":486827,"date":"2019-01-14T09:35:06","date_gmt":"2019-01-14T17:35:06","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-project&#038;p=486827"},"modified":"2023-03-29T19:11:44","modified_gmt":"2023-03-30T02:11:44","slug":"newsqa-dataset","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/newsqa-dataset\/","title":{"rendered":"NewsQA Dataset"},"content":{"rendered":"<p>With massive volumes of written text being produced every second, how do we make sure that we have the most recent and relevant information available to us? Microsoft Research Montreal is tackling this problem by building AI systems that can read and comprehend large volumes of complex text in real time.<\/p>\n<p>The purpose of the NewsQA dataset is to help the research community build algorithms that are capable of answering questions requiring human-level comprehension and reasoning skills.<\/p>\n<p>Leveraging CNN articles from the DeepMind Q&amp;A Dataset, we prepared a crowd-sourced machine reading comprehension dataset of 120K Q&amp;A pairs.<\/p>\n<ul>\n<li>Documents are CNN news articles.<\/li>\n<li>Questions are written by human users in natural language.<\/li>\n<li>Answers may be multiword passages of the source text.<\/li>\n<li>Questions may be unanswerable.<\/li>\n<li>NewsQA is collected using a 3-stage, siloed process.<\/li>\n<li>Questioners see only an article&#8217;s headline and highlights.<\/li>\n<li>Answerers see the question and the full article, then select an answer passage.<\/li>\n<li>Validators see the article, the question, and a set of answers that they rank.<\/li>\n<li>NewsQA is more natural and more challenging than previous datasets.<\/li>\n<\/ul>\n<h3 style=\"margin-top: 25px;\">Challenges<\/h3>\n<p>A significant proportion of questions in NewsQA cannot be solved without reasoning. 
The reasoning types we have identified in our analysis are as follows:<\/p>\n<ul>\n<li>Synthesis: Some answers can only be inferred by synthesizing information distributed across multiple sentences.<\/li>\n<li>Paraphrasing: A single sentence in the article might entail or paraphrase the question. Paraphrase recognition may require synonymy and world knowledge.<\/li>\n<li>Inference: Some answers must be inferred from incomplete information in the article or by recognizing conceptual overlap. This typically draws on general knowledge.<\/li>\n<li>Additionally, some questions have no answer or no unique answer in the corresponding story, so a system must learn to recognize when given information is not sufficient.<\/li>\n<\/ul>\n<p>See other datasets from Microsoft Montreal:<br \/>\n<b><a href=\"\/en-us\/research\/project\/Frames-dataset\/\">Frames<\/a> | <a href=\"\/en-us\/research\/project\/figureqa-dataset\/\">FigureQA<\/a><\/b><\/p>\n","protected":false},"excerpt":{"rendered":"<p>With massive volumes of written text being produced every second, how do we make sure that we have the most recent and relevant information available to us? Microsoft Research Montreal is tackling this problem by building AI systems that can read and comprehend large volumes of complex text in real time. 
The purpose of the NewsQA [&hellip;]<\/p>\n","protected":false},"featured_media":489527,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"research-area":[13556],"msr-locale":[268875],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-486827","msr-project","type-msr-project","status-publish","has-post-thumbnail","hentry","msr-research-area-artificial-intelligence","msr-locale-en_us","msr-archive-status-active"],"msr_project_start":"","related-publications":[],"related-downloads":[],"related-videos":[],"related-groups":[],"related-events":[],"related-opportunities":[],"related-posts":[599313],"related-articles":[],"tab-content":[{"id":0,"name":"Stats","content":"<h3>Summary<\/h3>\r\n[row]\r\n[column class=\"l-col-4-24\"]\r\n<div style=\"text-align: center;vertical-align: middle;background-color: #104586;min-width: 150px;min-height: 200px;padding: 15px\">\r\n<div style=\"color: white;font-size: 30px;padding-bottom: 10px\">12,744<\/div>\r\n&nbsp;\r\n<div style=\"color: white;font-size: 15px\">Stories<\/div>\r\n<\/div>\r\n[\/column]\r\n\r\n[column class=\"l-col-4-24\"]\r\n<div style=\"text-align: center;vertical-align: middle;background-color: #104586;min-width: 150px;min-height: 200px;padding: 15px\">\r\n<div style=\"color: white;font-size: 30px;padding-bottom: 10px\">119,633<\/div>\r\n&nbsp;\r\n<div style=\"color: white;font-size: 15px\">Question-Answer Pairs<\/div>\r\n<\/div>\r\n[\/column]\r\n\r\n[column class=\"l-col-4-24\"]\r\n<div style=\"text-align: center;vertical-align: middle;background-color: #104586;min-width: 150px;min-height: 200px;padding: 15px\">\r\n<div style=\"color: white;font-size: 30px;padding-bottom: 10px\">616<\/div>\r\n&nbsp;\r\n<div style=\"color: white;font-size: 15px\">Average Words per Article<\/div>\r\n<\/div>\r\n[\/column]\r\n\r\n[column class=\"l-col-4-24\"]\r\n<div style=\"text-align: 
center;vertical-align: middle;background-color: #104586;min-width: 150px;min-height: 200px;padding: 15px\">\r\n<div style=\"color: white;font-size: 30px;padding-bottom: 10px\">4.13<\/div>\r\n&nbsp;\r\n<div style=\"color: white;font-size: 15px\">Average Words per Answer<\/div>\r\n<\/div>\r\n[\/column]\r\n\r\n[column class=\"l-col-4-24\"]\r\n<div style=\"text-align: center;vertical-align: middle;background-color: #104586;min-width: 150px;min-height: 200px;padding: 15px\">\r\n<div style=\"color: white;font-size: 30px;padding-bottom: 10px\">74.9%<\/div>\r\n&nbsp;\r\n<div style=\"color: white;font-size: 15px\">Human Performance (F1)<\/div>\r\n<\/div>\r\n[\/column]\r\n\r\n[\/row]\r\n<h3 style=\"margin-top: 25px\">Reasoning Statistics<\/h3>\r\nReasoning mechanisms needed to answer questions in NewsQA based on 500 examples. For each type, we show an example question with the text snippet that contains the answer span, with words relevant to the reasoning type in bold.\r\n<table style=\"border-spacing: 0px;border-collapse: separate;border: 1px solid #d0d0d0;height: 225px\" width=\"100%\" cellspacing=\"0\" cellpadding=\"15\">\r\n<thead>\r\n<tr style=\"background-color: #104586\">\r\n<td style=\"padding: 15px;border: inherit\" valign=\"middle\" width=\"15%\"><span style=\"color: #ffffff\"><strong>Reasoning<\/strong><\/span><\/td>\r\n<td style=\"padding: 15px;border: inherit\" valign=\"middle\" width=\"15%\"><span style=\"color: #ffffff\"><strong>Proportion<\/strong><\/span><\/td>\r\n<td style=\"padding: 15px;border: inherit\" valign=\"middle\" width=\"70%\"><span style=\"color: #ffffff\"><strong>Example<\/strong><\/span><\/td>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr>\r\n<td style=\"vertical-align: middle;padding: 15px;border: inherit\">Word Matching<\/td>\r\n<td style=\"vertical-align: middle;padding: 15px;border: inherit\">31.6%<\/td>\r\n<td style=\"vertical-align: middle;padding: 15px;border: inherit\">Q: <strong>When<\/strong> were the <strong>findings 
published<\/strong>?\r\n\r\nT: Both sets of research <strong>findings were published Thursday<\/strong>...<\/td>\r\n<\/tr>\r\n<tr style=\"background-color: #f2f2f2\">\r\n<td style=\"vertical-align: middle;padding: 15px;border: inherit\">Paraphrasing<\/td>\r\n<td style=\"vertical-align: middle;padding: 15px;border: inherit\">26.8%<\/td>\r\n<td style=\"vertical-align: middle;padding: 15px;border: inherit\">Q: <strong>Who<\/strong> is the struggle between in Rwanda?\r\n\r\nT: The struggle <strong>pits ethnic Tutsis<\/strong>, supported by Rwanda, <strong>against ethnic Hutu<\/strong><\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"vertical-align: middle;padding: 15px;border: inherit\">Synthesis<\/td>\r\n<td style=\"vertical-align: middle;padding: 15px;border: inherit\">17.8%<\/td>\r\n<td style=\"vertical-align: middle;padding: 15px;border: inherit\">Q: <strong>Where<\/strong> is <strong>Brittanee Drexel<\/strong> from?\r\n\r\nT: The mother of a 17-year-old <strong>Rochester, New York<\/strong> high school student ... says she did not give her daughter permission to go on the trip. 
<strong>Brittanee<\/strong> Marie <strong>Drexel's<\/strong> mom says...<\/td>\r\n<\/tr>\r\n<tr style=\"background-color: #f2f2f2\">\r\n<td style=\"vertical-align: middle;padding: 15px;border: inherit\">Inference<\/td>\r\n<td style=\"vertical-align: middle;padding: 15px;border: inherit\">14.0%<\/td>\r\n<td style=\"vertical-align: middle;padding: 15px;border: inherit\">Q: <strong>Who<\/strong> drew <strong>inspiration<\/strong> from <strong>presidents<\/strong>?\r\n\r\nT: <strong>Rudy Ruiz<\/strong> says the lives of US <strong>presidents<\/strong> can make them <strong>positive role models<\/strong> for students.<\/td>\r\n<\/tr>\r\n<tr>\r\n<td style=\"vertical-align: middle;padding: 15px;border: inherit\">Ambiguous\/Insufficient<\/td>\r\n<td style=\"vertical-align: middle;padding: 15px;border: inherit\">9.8%<\/td>\r\n<td style=\"vertical-align: middle;padding: 15px;border: inherit\">Q: <strong>Whose mother<\/strong> is <strong>moving<\/strong> to the White House?\r\n\r\nT: ... <strong>Barack Obama's mother-in-law<\/strong>, Marian Robinson will join the Obamas at the <strong>family's private quarters<\/strong> at 1600 Pennsylvania Avenue. 
[Michelle is never mentioned]<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<h3 style=\"margin-top: 25px\">Consensus Statistics<\/h3>\r\n[row]\r\n[column class=\"l-col-8-24\"]\r\n<div style=\"text-align: center;vertical-align: middle;background-color: #104586;min-width: 150px;min-height: 200px;padding: 15px\">\r\n<div style=\"color: white;font-size: 30px;padding-bottom: 10px\">102,841<\/div>\r\n&nbsp;\r\n<div style=\"color: white;font-size: 15px\">With Consensus (Including Validated)<\/div>\r\n<div style=\"color: white;font-size: 15px\">(85.96%)<\/div>\r\n<\/div>\r\n[\/column]\r\n\r\n[column class=\"l-col-8-24\"]\r\n<div style=\"text-align: center;vertical-align: middle;background-color: #104586;min-width: 150px;min-height: 200px;padding: 15px\">\r\n<div style=\"color: white;font-size: 30px;padding-bottom: 10px\">51,630<\/div>\r\n&nbsp;\r\n<div style=\"color: white;font-size: 15px\">Validated Answers<\/div>\r\n<div style=\"color: white;font-size: 15px\">Only 51,630 answers needed to be validated because the rest already had agreement during the second step of the collection.<\/div>\r\n<\/div>\r\n[\/column]\r\n\r\n[column class=\"l-col-8-24\"]\r\n<div style=\"text-align: center;vertical-align: middle;background-color: #104586;min-width: 150px;min-height: 200px;padding: 15px\">\r\n<div style=\"color: white;font-size: 30px;padding-bottom: 10px\">45,381<\/div>\r\n<div style=\"color: white;font-size: 15px\">Validated Answers with Consensus<\/div>\r\n<div style=\"color: white;font-size: 15px\">(87.90%)<\/div>\r\n<\/div>\r\n[\/column]\r\n[\/row]\r\n<h3 style=\"margin-top: 25px\">Story Length Distribution<\/h3>\r\n[gallery columns=\"2\" size=\"medium\" ids=\"487460,487457\"]\r\n<h3 style=\"margin-top: 25px\">Question Length Distribution<\/h3>\r\n[gallery columns=\"2\" size=\"medium\" link=\"file\" ids=\"487451,487448\"]\r\n<h3 style=\"margin-top: 25px\">Question Type Distribution<\/h3>\r\n<img class=\"aligncenter wp-image-487454 size-full\" 
src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2018\/05\/plot_question_types.svg\" alt=\"Question Type Distribution\" width=\"768\" height=\"528\" \/>\r\n<h3 style=\"margin-top: 25px\">Overall Answer Length Distribution<\/h3>\r\n[gallery columns=\"2\" size=\"medium\" link=\"file\" ids=\"487439,487436\"]\r\n<h3 style=\"margin-top: 25px\">Answer Length Distribution per Question<\/h3>\r\n[gallery columns=\"2\" size=\"medium\" link=\"file\" ids=\"487445,487442\"]"},{"id":1,"name":"Download","content":"<h3>CNN Stories<\/h3>\r\nNotice: CNN articles are used here by permission from The Cable News Network (CNN). CNN does not waive any rights of ownership in its articles and materials. CNN is not a partner of, nor does it endorse, Microsoft Research Montreal or its activities.\r\n\r\nThe stories are not owned by Microsoft and can be retrieved from the <a href=\"http:\/\/cs.nyu.edu\/~kcho\/DMQA\/\" target=\"_blank\" rel=\"noopener\">DeepMind Q&amp;A Dataset<\/a>.\r\n<h3 style=\"margin-top: 25px\">Questions and Answers<\/h3>\r\nThe following package includes only the questions and answers. 
Instructions on how to combine the stories and answers into one file can be found in the <a href=\"https:\/\/github.com\/Maluuba\/newsqa\" target=\"_blank\" rel=\"noopener\">GitHub repo<\/a>.\r\n<ul class=\"stripped no-margin-bottom no-margin-last ms-row\">\r\n \t<li class=\"s-col-2-4 m-col-1-4 l-col-4-4 margin-bottom-sp1\"><a class=\"button-solid x-hidden-focus\" href=\"https:\/\/www.microsoft.com\/en-us\/download\/details.aspx?id=57162\" target=\"_blank\" rel=\"noopener\">Agree &amp; Download<\/a><\/li>\r\n<\/ul>\r\n&nbsp;\r\n<h3 style=\"margin-top: 25px\">GitHub Repo<\/h3>\r\n<a href=\"https:\/\/github.com\/Maluuba\/newsqa\" target=\"_blank\" rel=\"noopener\">https:\/\/github.com\/Maluuba\/newsqa<\/a>\r\n<h3 style=\"margin-top: 25px\">Paper<\/h3>\r\n<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/newsqa-machine-comprehension-dataset\/\">NewsQA: A Machine Comprehension Dataset<\/a>"}],"related-researchers":[{"type":"user_nicename","display_name":"Alessandro Sordoni","user_id":37230,"people_section":"Section name 1","alias":"alsordon"},{"type":"user_nicename","display_name":"Eric Yuan","user_id":37167,"people_section":"Section name 
1","alias":"eryua"}],"msr_research_lab":[437514,1148609],"msr_impact_theme":[],"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/486827","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-project"}],"version-history":[{"count":17,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/486827\/revisions"}],"predecessor-version":[{"id":560247,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-project\/486827\/revisions\/560247"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/489527"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=486827"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=486827"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=486827"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=486827"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=486827"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}