{"id":985203,"date":"2023-11-20T18:04:33","date_gmt":"2023-11-21T02:04:33","guid":{"rendered":""},"modified":"2023-11-22T11:19:48","modified_gmt":"2023-11-22T19:19:48","slug":"orca-2-teaching-small-language-models-how-to-reason","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/orca-2-teaching-small-language-models-how-to-reason\/","title":{"rendered":"Orca 2: Teaching Small Language Models How to Reason"},"content":{"rendered":"\n<figure class=\"wp-block-image size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"576\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/ORCA-BlogHeroFeature-1400x788-1-1024x576.png\" alt=\"Orca-2 blog hero | abstract waves of data\" class=\"wp-image-985938\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/ORCA-BlogHeroFeature-1400x788-1-1024x576.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/ORCA-BlogHeroFeature-1400x788-1-300x169.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/ORCA-BlogHeroFeature-1400x788-1-768x432.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/ORCA-BlogHeroFeature-1400x788-1-1066x600.png 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/ORCA-BlogHeroFeature-1400x788-1-655x368.png 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/ORCA-BlogHeroFeature-1400x788-1-343x193.png 343w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/ORCA-BlogHeroFeature-1400x788-1-240x135.png 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/ORCA-BlogHeroFeature-1400x788-1-640x360.png 640w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/ORCA-BlogHeroFeature-1400x788-1-960x540.png 960w, 
https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/ORCA-BlogHeroFeature-1400x788-1-1280x720.png 1280w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/ORCA-BlogHeroFeature-1400x788-1.png 1400w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n\n<p>A few months ago, we introduced <a href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/orca-progressive-learning-from-complex-explanation-traces-of-gpt-4\/\">Orca<\/a>, a 13-billion parameter language model that demonstrated strong reasoning abilities by imitating the step-by-step reasoning traces of more capable LLMs.<\/p>\n\n\n\n<p>Orca 2 is the latest step in our efforts to explore the capabilities of smaller LMs (on the order of 10 billion parameters or less). With Orca 2, we continue to show that improved training signals and methods can empower smaller language models to achieve enhanced reasoning abilities, which are typically found only in much larger language models.<\/p>\n\n\n\n<p>Orca 2 significantly surpasses models of similar size (including the original Orca model) and attains performance levels similar to or better than models 5-10 times larger, as assessed on complex tasks that test advanced reasoning abilities in zero-shot settings.<\/p>\n\n\n\n<p>Orca 2 comes in two sizes (7 billion and 13 billion parameters); both are created by fine-tuning the corresponding LLAMA 2 base models on tailored, high-quality synthetic data. 
We are making the Orca 2 weights&nbsp;publicly available&nbsp;to encourage research on the development, evaluation, and alignment of smaller LMs.<\/p>\n\n\n\n<div class=\"wp-block-buttons is-content-justification-center is-content-justification-center is-layout-flex wp-container-core-buttons-is-layout-1 wp-block-buttons-is-layout-flex\">\n<div class=\"wp-block-button is-style-outline is-style-outline--1\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/orca-progressive-learning-from-complex-explanation-traces-of-gpt-4\/\">Read the Orca paper<\/a><\/div>\n\n\n\n<div class=\"wp-block-button is-style-outline is-style-outline--2\"><a data-bi-type=\"button\" class=\"wp-block-button__link wp-element-button\" href=\"https:\/\/www.microsoft.com\/en-us\/research\/publication\/orca-2-teaching-small-language-models-how-to-reason\/\">Read the Orca 2 paper<\/a><\/div>\n<\/div>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"using-llms-to-train-smaller-language-models\">Using LLMs to train smaller language models<\/h2>\n\n\n\n<p>Frontier language models such as GPT-4 and PaLM have demonstrated a remarkable ability to reason, for example by answering complex questions, generating explanations, and even solving problems that require multi-step reasoning. These capabilities were once considered beyond the reach of AI. Traditionally, such abilities have not been observed in smaller language models, so the challenge is how to use our growing knowledge of large language models to increase the abilities of these smaller models.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"expanding-the-capabilities-of-smaller-language-models\">Expanding the capabilities of smaller language models<\/h2>\n\n\n\n<p>A key insight behind Orca 2 is that different tasks could benefit from different solution strategies (e.g. 
step-by-step processing, recall then generate, recall-reason-generate, extract-generate, and direct answer) and that the solution strategy employed by a large model may not be the best choice for a smaller one. For example, while an extremely capable model like GPT-4 can answer complex tasks directly, a smaller model may benefit from breaking the task into steps.<\/p>\n\n\n\n<p>Orca 2 is trained with an expanded, highly tailored synthetic dataset. The training data was generated such that it teaches Orca 2 various reasoning techniques, such as step-by-step processing, recall then generate, recall-reason-generate, extract-generate, and direct answer methods, while also teaching it to choose different solution strategies for different tasks.<\/p>\n\n\n\n<p>The training data is obtained from a more capable teacher model. Note that we can obtain the teacher\u2019s responses through very detailed instructions and even multiple calls, depending on the task and the desired behavior of the model. 
In the absence of the original instruction, which details how to approach the task, the student model will be encouraged to learn that underlying strategy as well as the reasoning capabilities it elicits.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"orca-2-has-reasoning-capabilities-comparable-to-much-larger-models\">Orca 2 has reasoning capabilities comparable to much larger models<\/h2>\n\n\n\n<p>To evaluate Orca 2, we use a comprehensive set of 15 diverse benchmarks that correspond to approximately 100 tasks and more than 36,000 unique test cases in zero-shot settings. The benchmarks cover a variety of aspects, including language understanding, common-sense reasoning, multi-step reasoning, math problem solving, reading comprehension, summarizing, groundedness, truthfulness, and toxic content generation and identification.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><a data-bi-bhvr=\"14\"  data-bi-cn=\"Results comparing Orca 2 (7B & 13B) to LLaMA-2-Chat (13B & 70B) and WizardLM (13B & 70B) on variety of benchmarks (in 0-shot setting) covering language understanding, common sense reasoning, multi-step reasoning, math problem solving, etc. Orca 2 models significantly surpass other models including models 10x larger. Note that Orca 2 models were trained by continual training of LLaMA-2 base models of the same size. 
\" href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/orca2-final.png\"><img loading=\"lazy\" decoding=\"async\" width=\"2498\" height=\"781\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/orca2-final.png\" alt=\"Results comparing Orca 2 (7B & 13B) to LLaMA-2-Chat (13B & 70B) and WizardLM (13B & 70B) on variety of benchmarks (in 0-shot setting) covering language understanding, common sense reasoning, multi-step reasoning, math problem solving, etc. Orca 2 models significantly surpass other models including models 10x larger. Note that Orca 2 models were trained by continual training of LLaMA-2 base models of the same size. \" class=\"wp-image-986271\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/orca2-final.png 2498w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/orca2-final-300x94.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/orca2-final-1024x320.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/orca2-final-768x240.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/orca2-final-1536x480.png 1536w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/orca2-final-2048x640.png 2048w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/orca2-final-240x75.png 240w\" sizes=\"auto, (max-width: 2498px) 100vw, 2498px\" \/><\/a><figcaption class=\"wp-element-caption\">Figure 1: Results comparing Orca 2 (7B and 13B) to LLaMA-2-Chat (13B and 70B) and WizardLM (13B and 70B) on variety of benchmarks (in zero-shot setting) covering language understanding, common-sense reasoning, multi-step reasoning, math problem solving, etc. Orca 2 models match or surpass other models, including models 5-10 times larger. 
Note that all models in this figure share the same base model (LLaMA-2).<\/figcaption><\/figure>\n\n\n\n<p>Our preliminary results indicate that Orca 2\u2019s performance significantly surpasses models of similar size. It also attains performance levels similar to or better than those of models at least 10 times larger, showcasing the potential of equipping smaller models with better reasoning capabilities.<\/p>\n\n\n\n<p>Orca 2 models exhibit limitations common to other language models and could retain many of the constraints of the base models upon which they were trained. While Orca 2 training could be applied to different base models, we report results based on using LLaMA-2 7B and 13B models. Orca 2 models have not gone through reinforcement learning from human feedback (RLHF) training for safety.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\" id=\"conclusion\">Conclusion<\/h2>\n\n\n\n<p>Our research on the Orca 2 model has yielded significant insights into enhancing the reasoning abilities of smaller language models. By strategically training these models with tailored synthetic data, we have achieved performance levels that rival or surpass those of larger models, particularly in zero-shot reasoning tasks.&nbsp;<\/p>\n\n\n\n<p>Orca 2\u2019s success lies in its application of diverse reasoning techniques and the identification of optimal solutions for various tasks. While it has several limitations, including limitations inherited from its base models and common to other language models, Orca 2\u2019s potential for future advancements is evident, especially in improved reasoning, specialization, control, and safety of smaller models. The use of carefully filtered synthetic data for post-training emerges as a key strategy in these improvements.<\/p>\n\n\n\n<p>Our findings underscore the value of smaller models in scenarios where efficiency and capability need to be balanced. 
As larger models continue to excel, our work with Orca 2 marks a significant step in diversifying the applications and deployment options of language models.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large\"><a data-bi-bhvr=\"14\"  data-bi-cn=\"This image from the paper &quot;Orca-2: Teaching Small Language Models How To Reason&quot; showcases differences in how Orca 2, LLaMA-2, LLaMA-2-Chat, and ChatGPT (GPT-3.5-Turbo) process and answer a logic-based question. The LLaMA-2 and LLaMA-2-Chat outputs were generated via replicate.com\/meta\/llama-2-13b and chat.lmsys.org, employing standard settings (temperature=0, top_p=1). ChatGPT\" href=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/Orca_Figure2.jpg\"><img loading=\"lazy\" decoding=\"async\" width=\"777\" height=\"1024\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/Orca_Figure2-777x1024.jpg\" alt=\"This image from the paper &quot;Orca-2: Teaching Small Language Models How To Reason&quot; showcases differences in how Orca 2, LLaMA-2, LLaMA-2-Chat, and ChatGPT (GPT-3.5-Turbo) process and answer a logic-based question. The LLaMA-2 and LLaMA-2-Chat outputs were generated via replicate.com\/meta\/llama-2-13b and chat.lmsys.org, employing standard settings (temperature=0, top_p=1). 
ChatGPT's response was retrieved from chat.openai.com, providing a clear comparison of how each model approaches problem-solving.\" class=\"wp-image-985878\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/Orca_Figure2-777x1024.jpg 777w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/Orca_Figure2-228x300.jpg 228w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/Orca_Figure2-768x1012.jpg 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/Orca_Figure2-137x180.jpg 137w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/Orca_Figure2.jpg 800w\" sizes=\"auto, (max-width: 777px) 100vw, 777px\" \/><\/a><figcaption class=\"wp-element-caption\">This image from the paper &#8220;Orca-2: Teaching Small Language Models How To Reason&#8221; showcases differences in how Orca 2, LLaMA-2, LLaMA-2-Chat, and ChatGPT (GPT-3.5-Turbo) process and answer a logic-based question. The LLaMA-2 and LLaMA-2-Chat outputs were generated via replicate.com\/meta\/llama-2-13b and chat.lmsys.org, employing standard settings (temperature=0, top_p=1). 
ChatGPT&#8217;s response was retrieved from chat.openai.com, providing a clear comparison of how each model approaches problem-solving.<\/figcaption><\/figure>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>At Microsoft, we\u2019re expanding AI capabilities by training small language models to achieve the kind of enhanced reasoning and comprehension typically found only in much larger models.<\/p>\n","protected":false},"author":42183,"featured_media":985938,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"categories":[1],"tags":[],"research-area":[13556,13545],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[243984],"msr-impact-theme":[264846],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-985203","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-research-blog","msr-research-area-artificial-intelligence","msr-research-area-human-language-technologies","msr-locale-en_us","msr-post-option-blog-homepage-featured"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[199565],"msr_impact_theme":["Computing foundations"],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[392600,702211],"related-projects":[983295],"related-events":[],"related-researchers":[{"type":"user_nicename","value":"Ahmed Awadallah","user_id":31979,"display_name":"Ahmed Awadallah","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/hassanam\/\" aria-label=\"Visit the profile page for Ahmed Awadallah\">Ahmed Awadallah<\/a>","is_active":false,"last_first":"Awadallah, 
Ahmed","people_section":0,"alias":"hassanam"},{"type":"user_nicename","value":"Andres Codas","user_id":42207,"display_name":"Andres Codas","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/andrescodas\/\" aria-label=\"Visit the profile page for Andres Codas\">Andres Codas<\/a>","is_active":false,"last_first":"Codas, Andres","people_section":0,"alias":"andrescodas"},{"type":"user_nicename","value":"Hamed Khanpour","user_id":38055,"display_name":"Hamed Khanpour","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/hakhanpo\/\" aria-label=\"Visit the profile page for Hamed Khanpour\">Hamed Khanpour<\/a>","is_active":false,"last_first":"Khanpour, Hamed","people_section":0,"alias":"hakhanpo"},{"type":"user_nicename","value":"Shweti Mahajan","user_id":42594,"display_name":"Shweti Mahajan","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/shmahaj\/\" aria-label=\"Visit the profile page for Shweti Mahajan\">Shweti Mahajan<\/a>","is_active":false,"last_first":"Mahajan, Shweti","people_section":0,"alias":"shmahaj"},{"type":"user_nicename","value":"Arindam Mitra","user_id":42978,"display_name":"Arindam Mitra","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/armitra\/\" aria-label=\"Visit the profile page for Arindam Mitra\">Arindam Mitra<\/a>","is_active":false,"last_first":"Mitra, Arindam","people_section":0,"alias":"armitra"},{"type":"user_nicename","value":"Corby Rosset","user_id":41997,"display_name":"Corby Rosset","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/corbyrosset\/\" aria-label=\"Visit the profile page for Corby Rosset\">Corby Rosset<\/a>","is_active":false,"last_first":"Rosset, Corby","people_section":0,"alias":"corbyrosset"},{"type":"user_nicename","value":"Guoqing Zheng","user_id":37941,"display_name":"Guoqing Zheng","author_link":"<a href=\"https:\/\/www.microsoft.com\/en-us\/research\/people\/zheng\/\" 
aria-label=\"Visit the profile page for Guoqing Zheng\">Guoqing Zheng<\/a>","is_active":false,"last_first":"Zheng, Guoqing","people_section":0,"alias":"zheng"}],"msr_type":"Post","featured_image_thumbnail":"<img width=\"960\" height=\"540\" src=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/ORCA-BlogHeroFeature-1400x788-1-960x540.png\" class=\"img-object-cover\" alt=\"Orca-2 blog hero | abstract waves of data\" decoding=\"async\" loading=\"lazy\" srcset=\"https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/ORCA-BlogHeroFeature-1400x788-1-960x540.png 960w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/ORCA-BlogHeroFeature-1400x788-1-300x169.png 300w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/ORCA-BlogHeroFeature-1400x788-1-1024x576.png 1024w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/ORCA-BlogHeroFeature-1400x788-1-768x432.png 768w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/ORCA-BlogHeroFeature-1400x788-1-1066x600.png 1066w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/ORCA-BlogHeroFeature-1400x788-1-655x368.png 655w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/ORCA-BlogHeroFeature-1400x788-1-343x193.png 343w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/ORCA-BlogHeroFeature-1400x788-1-240x135.png 240w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/ORCA-BlogHeroFeature-1400x788-1-640x360.png 640w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/ORCA-BlogHeroFeature-1400x788-1-1280x720.png 1280w, https:\/\/www.microsoft.com\/en-us\/research\/wp-content\/uploads\/2023\/11\/ORCA-BlogHeroFeature-1400x788-1.png 1400w\" sizes=\"auto, (max-width: 960px) 100vw, 960px\" \/>","byline":"","formattedDate":"November 20, 
2023","formattedExcerpt":"At Microsoft, we\u2019re expanding AI capabilities by training small language models to achieve the kind of enhanced reasoning and comprehension typically found only in much larger models.","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/985203","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/42183"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=985203"}],"version-history":[{"count":28,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/985203\/revisions"}],"predecessor-version":[{"id":986484,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/985203\/revisions\/986484"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/985938"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=985203"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=985203"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=985203"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=985203"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=985203"},{"taxonomy":"msr-event-type","embeddable":true,"href":"http
s:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=985203"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=985203"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=985203"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=985203"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=985203"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=985203"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}