{"id":1149202,"date":"2025-09-22T14:26:49","date_gmt":"2025-09-22T21:26:49","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-academic-program&p=1149202"},"modified":"2026-01-26T08:28:28","modified_gmt":"2026-01-26T16:28:28","slug":"lingua-expanding-europes-voices-in-ai","status":"publish","type":"msr-academic-program","link":"https:\/\/www.microsoft.com\/en-us\/research\/academic-program\/lingua-expanding-europes-voices-in-ai\/","title":{"rendered":"LINGUA: Expanding Europe\u2019s Voices in AI"},"content":{"rendered":"\n\n

<\/p>\n\n\n\n\n\n\n

<\/figure>
\n

LINGUA: Announcing the awardees from Microsoft\u2019s AI for Good Lab Open Call<\/h2>\n\n\n\n

On the European Day of Languages, which celebrates Europe\u2019s rich linguistic and cultural diversity, we released the LINGUA Open Call. The call invited proposals that advanced digital inclusion for Europe\u2019s low-resource languages: languages with limited online content and datasets, which leaves them underrepresented in AI technologies compared with high-resource counterparts such as English, Spanish, French, or German. While many vulnerable and endangered languages fall into this category, the call was open to any European language that lacks the digital foundations required for fair representation and participation in the AI era.<\/p>\n\n\n\n

LINGUA aims to address this gap by supporting innovative projects that collect high-quality speech and text datasets for Europe\u2019s underrepresented languages. It is part of Microsoft\u2019s commitment to digital sovereignty and linguistic diversity in Europe, ensuring that every language has the opportunity to be represented in the future of AI. Read more about the initiative.<\/p>\n<\/div><\/div>\n\n\n\n

\n

Our commitment<\/h3>\n\n\n\n

At the Microsoft AI for Good Lab, we are deepening our commitment to Europe\u2019s digital future by supporting linguistic diversity, digital sovereignty, and inclusive innovation. The LINGUA Open Call is part of the EU Digital Unlock initiative, which aims to make Europe\u2019s languages and cultures more open and accessible in the digital era. We are proud to collaborate with nonprofits, universities, research institutes, startups, and cultural organizations to enhance resources for low-resource languages, close digital gaps, and maximize impact through shared knowledge and collective action.<\/p>\n\n\n\n

We are excited to launch this initiative in close coordination with the APERTUS project led by EPFL & ETH Zurich, and in consultation with the Council of Europe. Together, we are building data resources for European languages, expanding the supply of multilingual datasets, and improving the performance of LLMs on low-resource languages. Our goal is to ensure that Europe\u2019s rich linguistic and cultural heritage is fully represented in the next generation of AI models (e.g., Apertus, EuroLLM, SmolLM3) by empowering communities, fostering innovation, and recognizing the people and organizations that make Europe a hub of creativity and inclusion.<\/p>\n<\/div>

<\/figure><\/div>\n\n\n\n
\n\n\n\n

LINGUA Open Call awardees<\/h3>\n\n\n\n

The selected projects span 16 languages and dialects across 10 countries<\/strong>, reflecting a diverse mix of low-resource, vulnerable, and underrepresented linguistic communities.<\/p>\n\n\n\n

Based on applicant estimates, they collectively cover languages spoken by over 65 million people, <\/strong>including Icelandic, Luxembourgish, Basque, Maltese, Ladino, Romansh, Ladin, Ukrainian, Romani (and Greco-Romani), several Balkan languages (Serbian, Turkish, Bosnian), and Italian dialects (Neapolitan, Sicilian, Roman), alongside multi-language work.<\/p>\n\n\n\n

The awardees bring together universities, nonprofits, a government language center, and a public broadcaster, with efforts focused on open dataset creation and digitization, heritage language preservation, and new evaluation resources (including safety benchmarks) to strengthen multilingual AI and help safeguard Europe\u2019s linguistic diversity.<\/p>\n\n\n\n

We\u2019re grateful to MILA Quebec, Mozilla, and EPFL for their close collaboration and support throughout the evaluation and selection process.<\/p>\n\n\n\n

We\u2019re pleased to announce the selected projects for the LINGUA Open Call:<\/p>\n\n\n\n