{"id":851928,"date":"2022-06-23T09:00:00","date_gmt":"2022-06-23T16:00:00","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=851928"},"modified":"2022-08-17T08:54:40","modified_gmt":"2022-08-17T15:54:40","slug":"godel-combining-goal-oriented-dialog-with-real-world-conversations","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/godel-combining-goal-oriented-dialog-with-real-world-conversations\/","title":{"rendered":"GODEL: Combining goal-oriented dialog with real-world conversations"},"content":{"rendered":"\n<figure class=\"wp-block-image alignwide size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-1024x576.jpg\" alt=\"Diagram showing GODEL\u2019s architecture. The environment of the dialog system consists of both structured and unstructured content, which it uses to retrieve information. 
This source content, which we term \u201cgrounding,\u201d is updated and repeatedly used by GODEL to produce a new response after each user input.\" class=\"wp-image-851937\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-1536x865.jpg 1536w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-2048x1153.jpg 2048w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-1066x600.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-343x193.jpg 343w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-240x135.jpg 240w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-scaled-640x360.jpg 640w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-scaled-960x540.jpg 960w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-1280x720.jpg 1280w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-1920x1080.jpg 1920w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>They make restaurant recommendations, help us pay bills, and remind us of appointments. 
Many people have come to rely on virtual assistants and chatbots to perform a wide range of routine tasks. But what if a single dialog agent, the technology behind these language-based apps, could perform all these tasks and then take the conversation further? In addition to providing on-topic expertise, such as recommending a restaurant, it could engage in a conversation about the history of the neighborhood or a recent sports game, and then bring the conversation back on track.\u202fWhat if the agent\u2019s responses continually reflect the latest world events? And what if it could do all of this without the need for any additional work by the designer?\u202f&nbsp;&nbsp;<\/p>\n\n\n\n<p>With <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/microsoft\/GODEL\" target=\"_blank\" rel=\"noreferrer noopener\">GODEL<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>, this may not be far off. GODEL stands for <strong>G<\/strong>rounded <strong>O<\/strong>pen <strong>D<\/strong>ialogu<strong>e<\/strong> <strong>L<\/strong>anguage Model, and it ushers in a new class of pretrained language models that enable both task-oriented and social conversation and are evaluated by the usefulness of their responses.&nbsp;&nbsp;<\/p>\n\n\n\n<p>Pretrained language models are among the engines that power conversational AI, the technology that underlies these dialog agents. They can either be task-oriented (\u201cgive me a job, and I\u2019ll do it\u201d) or engage in a conversation without a specified outcome, known as open-domain or chit-chat. GODEL combines both these capabilities, giving dialog agents the ability to generate responses based not just on the context of the conversation, but also on external information, content that was not part of the dataset when the model was trained. 
This includes both structured content, such as information stored in databases, and unstructured content, such as restaurant reviews, Wikipedia articles, and other publicly available material found on the web. This explains how a simple task-based query about restaurant recommendations can evolve into a dialog about ingredients, food, and even cooking techniques\u2014the kind of winding path that real-world conversations take.&nbsp;&nbsp;<\/p>\n\n\n\n<p>In 2019, the <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/group\/deep-learning-group\/\" target=\"_blank\" rel=\"noreferrer noopener\">Deep Learning<\/a> and <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/group\/natural-language-processing\/\" target=\"_blank\" rel=\"noreferrer noopener\">Natural Language Processing<\/a> groups at Microsoft Research released <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/project\/large-scale-pretraining-for-response-generation\/\" target=\"_blank\" rel=\"noreferrer noopener\">DialoGPT<\/a>, the first large-scale pretrained language model designed specifically for dialog. This helped make conversational AI more accessible and easier to work with, and it enabled the research community to make considerable progress in this area. With GODEL, our goal is to help further this progress by empowering researchers and developers to create dialog agents that are unrestricted in the types of queries they can respond to and the sources of information they can draw from. 
We also worked to ensure those responses are useful to the person making the query.&nbsp;&nbsp;&nbsp;&nbsp;<\/p>\n\n\n\n<p>In our paper, \u201c<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/godel-large-scale-pre-training-for-goal-directed-dialog\/\" target=\"_blank\" rel=\"noreferrer noopener\">GODEL: Large-Scale Pre-training for Goal-Directed Dialog<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>,\u201d we describe the technical details underlying GODEL, and we have made the code available on <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" href=\"https:\/\/github.com\/Microsoft\/GODEL\" target=\"_blank\" rel=\"noreferrer noopener\">GitHub<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>.&nbsp;<\/p>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-center is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-1 wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button\"><a data-bi-type=\"button\" class=\"wp-block-button__link\" href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/godel-large-scale-pre-training-for-goal-directed-dialog\/\" target=\"_blank\" rel=\"noreferrer noopener\">Read the paper<\/a><\/div>\n\n\n\n<div class=\"wp-block-button is-style-fill-github\"><a data-bi-type=\"button\" class=\"wp-block-button__link\" href=\"https:\/\/github.com\/Microsoft\/GODEL\" target=\"_blank\" rel=\"noreferrer noopener\">Download the code<\/a><\/div>\n<\/div>\n\n\n\n<div style=\"height:10px\" aria-hidden=\"true\" class=\"wp-block-spacer\"><\/div>\n\n\n\n<h2 id=\"a-grounded-model\">A grounded model<\/h2>\n\n\n\n<p>One of GODEL\u2019s key features is the flexibility it provides users in defining their model\u2019s <em>grounding<\/em>\u2014the sources from which their dialog agents retrieve information. This flexibility informs GODEL\u2019s versatility in diverse conversational settings. 
If someone were to inquire about a local restaurant, for example, GODEL would be able to provide specific and accurate responses even though that venue may not have been included in the data used to train it. Responses would vary depending on whether the grounding information is empty, a snippet of a document, a search result (unstructured text), or information drawn from a database about the restaurant (structured text). However, each response would be appropriate and useful.&nbsp;<\/p>\n\n\n\n<p>In addition to specificity, grounded generation helps keep models up to date, as the grounded text can incorporate information that may not have been available at the time the model was trained. For example, if a model were developed before the 2022 Winter Olympics, GODEL would be able to provide details on those games and a list of winners even though all the data available to train it predates that event.<\/p>\n\n\n\n<h2 id=\"broad-application-of-godel\">Broad application of GODEL<\/h2>\n\n\n\n<p>Another main feature of GODEL is its wide range of dialog applications. While its predecessor, DialoGPT, and other prior pretrained models for dialog have mostly focused on social bots, GODEL can be applied to a variety of dialogs, including task-oriented dialog, question answering, and grounded chit-chat. In the same conversation, GODEL can produce reasonable responses for a variety of query types, including general questions or requests for specific actions.&nbsp;&nbsp;<\/p>\n\n\n\n<p>In addition, GODEL\u2019s responses have been evaluated for their helpfulness. 
In our <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/godel-large-scale-pre-training-for-goal-directed-dialog\/\" target=\"_blank\" rel=\"noreferrer noopener\">paper<\/a>, we show that evaluation is done more reliably on datasets that are goal-directed, and that people generally agree on which responses are better when asked to judge their utility towards achieving certain goals. Equipped with this robust evaluation setup, we compared our model against several strong baselines and state-of-the-art approaches and show that GODEL is superior in terms of both human and automatic evaluation, as indicated in Figure 1. The <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/godel-large-scale-pre-training-for-goal-directed-dialog\/\" target=\"_blank\" rel=\"noreferrer noopener\">paper<\/a> describes extensive experiments against other state-of-the-art pretrained language models and demonstrates that performance gains are even larger in these cases.&nbsp;<\/p>\n\n\n\n<figure class=\"wp-block-image alignwide size-full\"><a data-bi-bhvr=\"14\"  data-bi-cn=\"Two bar graphs showing that GODEL outperforms the baseline, in terms of both human and automated dialog evaluation. For human evaluation, GODEL received much higher human ratings (47, 41, and 27), while the human ratings for the best baseline were low (30, 22, and 17). For automatic evaluation, differences are smaller yet still statistically significant.  \" href=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Fig1.png\"><img loading=\"lazy\" decoding=\"async\" width=\"1920\" height=\"640\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Fig1.png\" alt=\"Two bar graphs showing that GODEL outperforms the baseline, in terms of both human and automated dialog evaluation. For human evaluation, GODEL received much higher human ratings (47, 41, and 27), while the human ratings for the best baseline were low (30, 22, and 17). 
For automatic evaluation, differences are smaller yet still statistically significant.  \" class=\"wp-image-851958\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Fig1.png 1920w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Fig1-300x100.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Fig1-1024x341.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Fig1-768x256.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Fig1-1536x512.png 1536w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Fig1-240x80.png 240w\" sizes=\"(max-width: 1920px) 100vw, 1920px\" \/><\/a><figcaption>Figure 1: These charts illustrate GODEL\u2019s performance against T5, a pretrained model that performed best in our evaluation. They compare the aggregate performance of models fine-tuned from GODEL against that of models fine-tuned from T5. They show that GODEL performs much better in human evaluations and makes appreciable gains in the automatic evaluation. The test set for these experiments combines a variety of dialog genres, including task-oriented dialog, conversational question-answering, and grounded chit-chat.<\/figcaption><\/figure>\n\n\n\n<p>The following examples illustrate different dialog scenarios where GODEL uses a variety of sources to respond to identical user queries.&nbsp;<\/p>\n\n\n\n\n\n<p>This example illustrates how GODEL responds in an open-ended scenario in which the user asks a question that is completely unrelated to the initial question. 
Despite the lack of relevance, GODEL responds appropriately while trying to bring the conversation back on track.&nbsp;<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><a data-bi-bhvr=\"14\"  data-bi-cn=\"Figure showing how GODEL responds to a user who just changed the topic, demonstrating that it can bring the conversation back on track. While the initial query is about a restaurant, the user suddenly mentions a series of tornadoes that have recently affected the area. GODEL uses grounding from a recent news article to provide information about the tornadoes, as requested by the user. Finally, it asks the user if there is anything else it can help with.\" href=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide1.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"720\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide1.jpg\" alt=\"Figure showing how GODEL responds to a user who just changed the topic, demonstrating that it can bring the conversation back on track. While the initial query is about a restaurant, the user suddenly mentions a series of tornadoes that have recently affected the area. GODEL uses grounding from a recent news article to provide information about the tornadoes, as requested by the user. 
Finally, it asks the user if there is anything else it can help with.\" class=\"wp-image-852318\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide1.jpg 1280w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide1-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide1-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide1-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide1-1066x600.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide1-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide1-343x193.jpg 343w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide1-240x135.jpg 240w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide1-640x360.jpg 640w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide1-960x540.jpg 960w\" sizes=\"(max-width: 1280px) 100vw, 1280px\" \/><\/a><\/figure>\n\n\n\n\n\n<p>This example illustrates how GODEL responds in a task-oriented setting in which the model is connected to the components of a traditional goal-oriented dialog system, such as a database. In this case, the relevant environment contains structured information, a database returning two restaurants relevant to the current conversation. 
\u202f<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><a data-bi-bhvr=\"14\"  data-bi-cn=\"Figure showing how GODEL responds appropriately to a user\" href=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide2.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"720\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide2.jpg\" alt=\"Figure showing how GODEL responds appropriately to a user's request for a restaurant reservation. The user expresses a preference for a restaurant named Lucky Star, and GODEL extracts information from a database about that restaurant and retrieves relevant information, such as a reference number, to generate a response that flows naturally with the rest of the conversation.\" class=\"wp-image-852321\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide2.jpg 1280w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide2-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide2-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide2-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide2-1066x600.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide2-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide2-343x193.jpg 343w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide2-240x135.jpg 240w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide2-640x360.jpg 640w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide2-960x540.jpg 960w\" sizes=\"(max-width: 1280px) 100vw, 1280px\" \/><\/a><\/figure>\n\n\n\n\n\n<p>This example illustrates how GODEL responds in a 
task-oriented setting in which traditional components of task-oriented dialog systems are not available. In this case, GODEL retrieves a restaurant review via a search engine. The response reflects both the context of the conversation and a snippet of the retrieved text, a restaurant review.&nbsp;&nbsp;<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><a data-bi-bhvr=\"14\"  data-bi-cn=\"Figure showing how GODEL responds appropriately to a user\" href=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide3.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"720\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide3.jpg\" alt=\"Figure showing how GODEL responds appropriately to a user's request for information about a specific restaurant. The user asks whether a given restaurant is good for groups, and GODEL uses text originating from restaurant reviews to infer that the restaurant is indeed good for groups. 
Also, GODEL provides additional information to address a concern with larger groups\u2014that food is typically served quickly.\" class=\"wp-image-852324\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide3.jpg 1280w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide3-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide3-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide3-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide3-1066x600.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide3-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide3-343x193.jpg 343w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide3-240x135.jpg 240w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide3-640x360.jpg 640w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide3-960x540.jpg 960w\" sizes=\"(max-width: 1280px) 100vw, 1280px\" \/><\/a><\/figure>\n\n\n\n\n\n<p>\u202fThis example illustrates how GODEL responds in a question-answering scenario, where the user asks a general question and the context provides the dialog agent with the words it needs to search for the relevant information on the web.&nbsp;<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><a data-bi-bhvr=\"14\"  data-bi-cn=\"Figure showing how GODEL responds appropriately when asked to give an example of a popular Chinese dish. GODEL uses grounding originating from search results to respond to the question while focusing on the most relevant information of the retrieved document. 
\" href=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide4.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"1280\" height=\"720\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide4.jpg\" alt=\"Figure showing how GODEL responds appropriately when asked to give an example of a popular Chinese dish. GODEL uses grounding originating from search results to respond to the question while focusing on the most relevant information of the retrieved document. \" class=\"wp-image-852315\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide4.jpg 1280w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide4-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide4-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide4-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide4-1066x600.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide4-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide4-343x193.jpg 343w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide4-240x135.jpg 240w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide4-640x360.jpg 640w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/GODEL_Slide4-960x540.jpg 960w\" sizes=\"(max-width: 1280px) 100vw, 1280px\" \/><\/a><\/figure>\n\n\n\n\n\n<h2 id=\"godel-available-as-open-source\">GODEL available as open source<\/h2>\n\n\n\n<p>To advance research, we believe it is crucial to make code and models publicly available, and we have released GODEL as fully <a class=\"msr-external-link glyph-append glyph-append-open-in-new-tab glyph-append-xsmall\" 
href=\"https:\/\/github.com\/Microsoft\/GODEL\" target=\"_blank\" rel=\"noreferrer noopener\">open source<span class=\"sr-only\"> (opens in new tab)<\/span><\/a>. We have made three versions of GODEL available: base, large, and extra-large. We are also including the code needed to retrain all pretrained models and to fine-tune models on specific datasets: CoQA, intended for conversational question answering; Wizard of Wikipedia and Wizard of the Internet, aimed at information-seeking chats; and MultiWOZ, for task-completion dialogs.<\/p>\n\n\n\n<p>We hope GODEL helps numerous academic research teams advance the field of conversational AI with innovative dialog models while eliminating the need for significant GPU resources. We plan to continuously improve GODEL and make more models available to the research community. Please visit our <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/project\/godel\/\" target=\"_blank\" rel=\"noreferrer noopener\">project page<\/a> to learn more about the GODEL project and new releases.<\/p>\n\n\n\n<h2 id=\"acknowledgements\">Acknowledgements<\/h2>\n\n\n\n<p>We would like to thank our colleagues at Microsoft Research who contributed to this work and blog post: Bill Dolan, Pengcheng He, Elnaz Nouri, and Clarisse Simoes Ribeiro.&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>They make restaurant recommendations, help us pay bills, and remind us of appointments. Many people have come to rely on virtual assistants and chatbots to perform a wide range of routine tasks. 
But what if a single dialog agent, the technology behind these language-based apps, could perform all these tasks and then take the conversation [&hellip;]<\/p>\n","protected":false},"author":37583,"featured_media":851937,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"footnotes":""},"categories":[1],"tags":[],"research-area":[13556,13545],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[243984],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-851928","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-research-blog","msr-research-area-artificial-intelligence","msr-research-area-human-language-technologies","msr-locale-en_us","msr-post-option-blog-homepage-featured"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[199565],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[144736,144931],"related-projects":[599811],"related-events":[],"related-researchers":[{"type":"user_nicename","value":"Michel Galley","user_id":32887,"display_name":"Michel Galley","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/mgalley\/\" aria-label=\"Visit the profile page for Michel Galley\">Michel Galley<\/a>","is_active":false,"last_first":"Galley, Michel","people_section":0,"alias":"mgalley"},{"type":"user_nicename","value":"Lars Liden","user_id":32612,"display_name":"Lars Liden","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/laliden\/\" aria-label=\"Visit the profile page for Lars Liden\">Lars Liden<\/a>","is_active":false,"last_first":"Liden, 
Lars","people_section":0,"alias":"laliden"},{"type":"user_nicename","value":"Chris Brockett","user_id":31423,"display_name":"Chris Brockett","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/chrisbkt\/\" aria-label=\"Visit the profile page for Chris Brockett\">Chris Brockett<\/a>","is_active":false,"last_first":"Brockett, Chris","people_section":0,"alias":"chrisbkt"},{"type":"guest","value":"zhou-yu","user_id":"852018","display_name":"Zhou Yu","author_link":"<a href=\"http:\/\/www.cs.columbia.edu\/~zhouyu\/\" aria-label=\"Visit the profile page for Zhou Yu\">Zhou Yu<\/a>","is_active":true,"last_first":"Yu, Zhou","people_section":0,"alias":"zhou-yu"},{"type":"user_nicename","value":"Jianfeng Gao","user_id":32246,"display_name":"Jianfeng Gao","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/jfgao\/\" aria-label=\"Visit the profile page for Jianfeng Gao\">Jianfeng Gao<\/a>","is_active":false,"last_first":"Gao, Jianfeng","people_section":0,"alias":"jfgao"}],"msr_type":"Post","featured_image_thumbnail":"<img width=\"960\" height=\"540\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-scaled-960x540.jpg\" class=\"img-object-cover\" alt=\"Diagram showing GODEL\u2019s architecture. The environment of the dialog system consists of both structured and unstructured content, which it uses to retrieve information. 
This source content, which we term \u201cgrounding,\u201d is updated and repeatedly used by GODEL to produce a new response after each user input.\" decoding=\"async\" loading=\"lazy\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-scaled-960x540.jpg 960w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-300x169.jpg 300w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-1024x576.jpg 1024w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-768x432.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-1536x865.jpg 1536w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-2048x1153.jpg 2048w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-1066x600.jpg 1066w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-655x368.jpg 655w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-343x193.jpg 343w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-240x135.jpg 240w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-scaled-640x360.jpg 640w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-1280x720.jpg 1280w, https:\/\/www.microsoft.com\/en-us\/research\/uploads\/prod\/2022\/06\/1400x788_Godel_Hero_Image_still-1920x1080.jpg 1920w\" sizes=\"(max-width: 960px) 100vw, 960px\" \/>","byline":"","formattedDate":"June 23, 2022","formattedExcerpt":"They make restaurant recommendations, help us pay bills, and remind us of appointments. 
Many people have come to rely on virtual assistants and chatbots to perform a wide range of routine tasks. But what if a single dialog agent, the technology behind these language-based apps,&hellip;","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/851928"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/37583"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=851928"}],"version-history":[{"count":22,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/851928\/revisions"}],"predecessor-version":[{"id":870555,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/851928\/revisions\/870555"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/851937"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=851928"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=851928"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=851928"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=851928"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=851928"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-jso
n\/wp\/v2\/msr-event-type?post=851928"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=851928"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=851928"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=851928"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=851928"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=851928"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}