{"id":487187,"date":"2019-01-14T09:39:09","date_gmt":"2019-01-14T17:39:09","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-project&p=487187"},"modified":"2023-03-30T12:23:16","modified_gmt":"2023-03-30T19:23:16","slug":"frames-dataset","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/frames-dataset\/","title":{"rendered":"Frames Dataset"},"content":{"rendered":"

Motivation<\/h3>\n

Voice assistants such as Siri, Cortana, and Google Now have become popular spoken dialogue systems. More recently, we have seen a rise in text-based conversational agents (also known as chatbots). Many users prefer text to voice for privacy reasons and to avoid poor speech recognition in noisy environments. These agents are also welcome as an alternative to downloading and installing applications, which makes a lot of sense for simple tasks such as ordering a cab or asking for the weather.<\/p>\n

In most cases, much like voice assistants, these chatbots only support very simple and sequential interactions. This works because the user’s goal is well-defined and the dialogue flow can easily be hand-crafted. However, other use cases, such as customer service or travel booking, involve a decision-making process.<\/p>\n

Frames is meant to encourage research towards conversational agents that can support decision-making in complex settings, in this case booking a vacation that includes flights and a hotel. Beyond simply searching a database, we believe the next generation of conversational agents will need to help users explore a database, compare items, and reach a decision.<\/p>\n

The dialogues in Frames were collected in a Wizard-of-Oz fashion. Two humans talked to each other via a chat interface. One was playing the role of the user and the other one was playing the role of the conversational agent. We call the latter a wizard as a reference to the Wizard of Oz, the man behind the curtain. The wizards had access to a database of 250+ packages, each composed of a hotel and round-trip flights. We gave users a few constraints for each dialogue and we asked them to find the best deal. This resulted in complex dialogues where a user would often consider different options, compare packages, and progressively build the description of her ideal trip.<\/p>\n

Frame Tracking<\/h3>\n

With this dataset, we also present a new task: frame tracking. Our main observation is that decision-making is tightly linked to memory. Indeed, to choose a trip, users and wizards talked about different possibilities, compared them, and went back and forth between cities, dates, or vacation packages.<\/p>\n

Current systems are memory-less. They implement slot-filling for search as a sequential process in which the user is asked for constraints one after the other until a database query can be formulated. Only one set of constraints is kept in memory. For instance, in the illustration below, on the left, when the user mentions Montreal, it overwrites Toronto as the destination city. However, behaviours observed in Frames imply that slot values should not be overwritten. One use case is comparison: users often ask to compare different items, and in this case different sets of constraints are involved (for instance, different destinations). Frame tracking consists of keeping in memory all the different sets of constraints mentioned by the user. It generalizes the state tracking task to a setting where more than just the current frame is kept in memory.<\/p>\n
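The contrast between overwriting a slot and keeping every set of constraints can be sketched in a few lines of Python. This is only an illustrative sketch, not the annotation scheme or any model released with Frames: the class names, the slot names (`dst_city`, `budget`), and the rule that a conflicting value opens a new frame are all assumptions made for the example.

```python
class SlotFillingTracker:
    # Memory-less baseline: one set of constraints, values are overwritten.
    def __init__(self):
        self.constraints = {}

    def update(self, slot, value):
        self.constraints[slot] = value


class FrameTracker:
    # Hypothetical frame tracker: keeps every set of constraints seen so far.
    def __init__(self):
        self.frames = [{}]  # each frame is one set of constraints
        self.active = 0     # index of the frame currently under discussion

    def update(self, slot, value):
        current = self.frames[self.active]
        if slot not in current or current[slot] == value:
            current[slot] = value
            return
        # Assumed rule: a conflicting value opens a new frame (or returns to
        # an identical earlier one) instead of overwriting the old value.
        candidate = dict(current)
        candidate[slot] = value
        for index, frame in enumerate(self.frames):
            if frame == candidate:
                self.active = index
                return
        self.frames.append(candidate)
        self.active = len(self.frames) - 1


baseline = SlotFillingTracker()
tracker = FrameTracker()
turns = [('budget', 2000), ('dst_city', 'Toronto'), ('dst_city', 'Montreal')]
for slot, value in turns:
    baseline.update(slot, value)
    tracker.update(slot, value)

print(baseline.constraints)  # Toronto is gone
print(tracker.frames)        # both destinations are still available
```

With the baseline, the Toronto option is lost as soon as Montreal is mentioned. The frame tracker still holds both sets of constraints, so a later request to compare the two destinations, or to go back to Toronto, can be answered from memory.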

Adding this kind of conversational memory is key to building agents which do not simply serve as a natural language interface for searching a database but instead accompany users in their exploration and help them find the best item.<\/p>\n