Search-based Neural Structured Learning for Sequential Question Answering
- Mohit Iyyer
- Scott Wen-tau Yih
- Ming-Wei Chang
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics
Published by Association for Computational Linguistics
Recent work in semantic parsing for question answering has focused on long and complicated questions, many of which would seem unnatural if asked in a normal conversation between two humans. In an effort to explore a conversational QA setting, we present a more realistic task: answering sequences of simple but inter-related questions. We collect a dataset of 6,066 question sequences that inquire about semi-structured tables from Wikipedia, with 17,553 question-answer pairs in total. To solve this sequential question answering task, we propose a novel dynamic neural semantic parsing framework trained using a weakly supervised reward-guided search. Our model effectively leverages the sequential context to outperform state-of-the-art QA systems that are designed to answer highly complex questions.
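Because the model is trained only from question-answer pairs (no annotated logical forms), the search is guided by a reward computed from the predicted answers. As a minimal sketch, one common choice in this weakly supervised setting is a set-overlap (Jaccard) reward between the predicted and gold answer sets; this is an illustrative assumption, and the paper's exact reward definition may differ:

```python
def answer_reward(predicted, gold):
    """Jaccard overlap between predicted and gold answer sets.

    Assumption: a set-overlap reward, a common choice for weakly
    supervised semantic parsing; not necessarily the paper's exact form.
    """
    p, g = set(predicted), set(gold)
    if not p and not g:
        return 1.0  # both empty: treat as a perfect match
    return len(p & g) / len(p | g)


# Partial credit: one of two predicted answers is correct.
score = answer_reward(["France", "Spain"], ["Spain", "Italy"])
```

Such a graded reward gives the search partial credit for partially correct answer sets, which provides a denser training signal than exact-match alone.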
Publication Downloads
Search-based Neural Structured Learning for Sequential Question Answering
August 7, 2018
This project contains the source code of the Dynamic Neural Semantic Parser (DynSP), based on DyNet, described in the paper "Search-based Neural Structured Learning for Sequential Question Answering".
Microsoft Research Sequential Question Answering (SQA) Dataset
November 11, 2016
We created SQA by asking crowdsourced workers to decompose 2,022 questions from WikiTableQuestions (WTQ), which contains highly-compositional questions about tables from Wikipedia. We had three workers decompose each WTQ question, resulting in a dataset of 6,066 sequences that contain 17,553 questions in total. Each question is also associated with answers in the form of cell locations in the tables.
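The structure described above — a sequence of inter-related questions about one table, each answered by a set of table cells — can be sketched as a small data model. All field names and the example values below are illustrative assumptions, not the dataset's actual schema:

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SQAQuestion:
    """One question in a sequence (hypothetical representation)."""
    position: int                               # index within the sequence
    question: str
    answer_coordinates: List[Tuple[int, int]]   # (row, col) cells in the table


@dataclass
class SQASequence:
    """One crowdsourced decomposition of a WTQ question (hypothetical)."""
    sequence_id: str
    table_file: str                             # the Wikipedia table referenced
    questions: List[SQAQuestion] = field(default_factory=list)


# Illustrative example: a two-question sequence where the second
# question refines the answer set of the first.
seq = SQASequence(
    sequence_id="example-1",
    table_file="tables/example.csv",
    questions=[
        SQAQuestion(0, "what are all of the countries?", [(0, 0), (1, 0)]),
        SQAQuestion(1, "of those, which won a gold medal?", [(1, 0)]),
    ],
)
```

Representing answers as cell coordinates rather than strings lets a system be scored on exactly which table cells it selects, which is how the dataset grounds each answer.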