{"id":1022232,"date":"2024-04-03T13:08:04","date_gmt":"2024-04-03T20:08:04","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-video&p=1022232"},"modified":"2024-04-03T13:09:32","modified_gmt":"2024-04-03T20:09:32","slug":"mega-multi-lingual-evaluation-of-generative-ai","status":"publish","type":"msr-video","link":"https:\/\/www.microsoft.com\/en-us\/research\/video\/mega-multi-lingual-evaluation-of-generative-ai\/","title":{"rendered":"MEGA: Multi-lingual Evaluation of Generative AI"},"content":{"rendered":"\n

Generative AI models achieve impressive performance on many Natural Language Processing (NLP) tasks, such as language understanding, reasoning, and language generation. One of the most important questions the AI community is asking today concerns the capabilities and limits of these models, and evaluating generative AI remains very challenging. Most studies of generative Large Language Models (LLMs) are restricted to English, and it is unclear how capable these models are at understanding and generating other languages. We present MEGA, the first comprehensive benchmark of generative LLMs, which evaluates models on standard NLP benchmarks covering 8 diverse tasks and 33 typologically diverse languages. We also compare the performance of generative LLMs to state-of-the-art (SOTA) non-autoregressive models on these tasks to determine how well generative models perform relative to the previous generation of LLMs. We present a thorough analysis of model performance across languages and discuss some of the reasons why generative LLMs are currently not optimal for all languages. Finally, we create a framework for evaluating generative LLMs in the multilingual setting and provide directions for future progress in the field.<\/p>\n\n\n\n

\n
Presentation slides<\/a><\/div>\n<\/div>\n\n\n\n

Speaker Details<\/h2>\n\n\n\n

Kabir Ahuja (opens in new tab)<\/span><\/a><\/strong><\/p>\n\n\n\n

I am Kabir Ahuja. I am currently a Research Fellow at Microsoft Research India (MSRI), where I am supervised by Dr. Sunayana Sitaram and Dr. Monojit Choudhury. I primarily work on understanding, improving, and effectively evaluating cross-lingual transfer in pretrained multilingual language models. My research vision is to create multilingual models that can effectively serve all of the world’s languages while requiring only minimal explicit supervision for low-resource languages. I also collaborate with Dr. Navin Goyal at MSRI on studying the properties of self-attention in Transformers and the types of functions that can be expressed and learned by these networks.<\/p>\n\n\n\n

Millicent Ochieng<\/a><\/strong><\/p>\n\n\n\n

Millicent Ochieng is a Data & Applied Scientist at the Microsoft Africa Research Institute (MARI), specializing in Natural Language Processing (NLP), Machine Learning (ML), and Computer Vision. With a passion for leveraging these technologies to address pressing real-world challenges, she explores their practical applications in the domains of healthcare, the future of work, society, and sustainability. Currently, Millicent is actively working on multilingual evaluations of Large Language Models (LLMs) and enhancing Automatic Speech Recognition (ASR) systems for the Swahili language.<\/p>\n","protected":false},"excerpt":{"rendered":"

Generative AI models have impressive performance on many Natural Language Processing tasks such as language understanding, reasoning and language generation. One of the most important questions that is being asked by the AI community today is about the capabilities and limits of these models, and it is clear that evaluating generative AI is very challenging. […]<\/p>\n","protected":false},"featured_media":1022235,"template":"","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"research-area":[13556],"msr-video-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-impact-theme":[],"msr-pillar":[],"class_list":["post-1022232","msr-video","type-msr-video","status-publish","has-post-thumbnail","hentry","msr-research-area-artificial-intelligence","msr-locale-en_us"],"msr_download_urls":"","msr_external_url":"https:\/\/www.youtube.com\/watch?v=UICFiLxArhI","msr_secondary_video_url":"","msr_video_file":"","_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/1022232"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/msr-video"}],"version-history":[{"count":3,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/1022232\/revisions"}],"predecessor-version":[{"id":1022256,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video\/1022232\/revisions\/1022256"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media\/1022235"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=1022232"}],"wp:term":[{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/w
p\/v2\/research-area?post=1022232"},{"taxonomy":"msr-video-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-video-type?post=1022232"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-locale?post=1022232"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=1022232"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=1022232"},{"taxonomy":"msr-pillar","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-pillar?post=1022232"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}