Research Archives - Microsoft Translator Blog
http://approjects.co.za/?big=en-us/translator/blog/tag/research/

Bing’s gendered translations tackle bias in translation
http://approjects.co.za/?big=en-us/translator/blog/2023/03/08/bings-gendered-translations-tackle-bias-in-translation/
Wed, 08 Mar 2023 08:00:07 +0000

The post Bing’s gendered translations tackle bias in translation appeared first on Microsoft Translator Blog.

Gender de-bias
3D rendering of gender symbols.

We’re excited to announce that, as of today, masculine and feminine alternative translations are available when translating from English to Spanish, French, or Italian. You can try out this new feature in both the Bing Search and Bing Translator verticals.

Over the last few years, the field of Machine Translation (MT) has been revolutionized by the advent of transformer models, leading to tremendous improvements in quality. However, models optimized to capture the statistical properties of data collected from the real world inadvertently learn or even amplify social biases found in that data.

Our latest release is a step towards reducing one of these biases, specifically the gender bias that is prevalent in MT systems. Bing Translator has always produced a single translation for an input sentence, even when other gender variations, including feminine and masculine variants, were possible. In accordance with the Microsoft responsible AI principles, we want to ensure we provide correct alternative translations and are more inclusive to all genders. As part of this journey, our first step is to provide feminine and masculine translation variants.

Gender is expressed differently across different languages. For example, in English, the word lawyer could refer to either a male or female individual, but in Spanish, abogada would refer to a female lawyer, while abogado would refer to a male one. In the absence of information about the gender of a noun like ‘lawyer’ in a source sentence, MT models may resort to selecting an arbitrary gender for the noun in the target language. Often, these arbitrary gender assignments align with stereotypes, perpetuating harmful societal bias (Stanovsky et al., 2019; Ciora et al., 2021) and leading to translations that are not fully accurate.

In the example below, notice that when translating a gender-neutral sentence from English to Spanish, the translated text follows the stereotypical gender role, i.e., lawyer is translated as being male.

Translation with gender bias
Screenshot of translation of English text “Let’s get our lawyer’s opinion on this issue.” into Spanish language having gender bias.

As there is no context in the source sentence that implies the gender of the lawyer, a translation that assumes either a male or a female lawyer would be valid. Now, Bing Translator produces translations with both feminine and masculine forms.

Translation of gender ambiguous English Text into Spanish
Screenshot of translation of English text “Let’s get our lawyer’s opinion on this issue.” into Spanish language having gender specific translations.

System design

We aimed to design our system to meet the following key criteria for providing gendered alternatives:

  1. The feminine and masculine variants should have minimal differences except for those needed to convey gender.
  2. We wanted to cover a wide range of sentences where multiple gendered alternatives are possible.
  3. We wanted to ensure that the translations preserve the meaning of the original source sentence.

Detecting gender ambiguity

In order to accurately detect gender ambiguity in source text, we use a coreference model to analyze inputs containing animate nouns. For instance, if a given input text contains a gender-neutral profession word, we only want to provide gendered alternatives for it when its gender can’t be determined from other information in the sentence. For example, when translating the English sentence “The lawyer met her driver at the hotel lobby.” into French, we can determine that the lawyer is female, while the gender of the driver is unknown.

Translation of gender ambiguous English Text into French
Screenshot of translation of English text “The lawyer met her driver at the hotel lobby.” into French language.
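
The decision described above can be sketched in a few lines of Python. This is only an illustration, not the production system: the coreference clusters are supplied by hand (a real coreference model would produce them), and the pronoun lists and function names are hypothetical.

```python
# Hypothetical pronoun lists, for illustration only.
FEMININE = {"she", "her", "hers", "herself"}
MASCULINE = {"he", "him", "his", "himself"}


def gender_of_cluster(cluster):
    """Label one coreference cluster as 'feminine', 'masculine', or 'ambiguous'."""
    words = {w.lower() for w in cluster}
    has_feminine = bool(words & FEMININE)
    has_masculine = bool(words & MASCULINE)
    if has_feminine and not has_masculine:
        return "feminine"
    if has_masculine and not has_feminine:
        return "masculine"
    return "ambiguous"


def label_animate_nouns(animate_nouns, clusters):
    """Map each animate noun to the gender label of the cluster it belongs to."""
    labels = {}
    for noun in animate_nouns:
        # A noun that appears in no cluster forms a cluster of its own.
        cluster = next((c for c in clusters if noun in c), [noun])
        labels[noun] = gender_of_cluster(cluster)
    return labels


# "The lawyer met her driver at the hotel lobby."
# Assumed coreference output: "her" refers to "lawyer"; "driver" is unlinked.
clusters = [["lawyer", "her"], ["driver"]]
labels = label_animate_nouns(["lawyer", "driver"], clusters)
print(labels)
```

Only nouns labelled ambiguous, here the driver, would be candidates for gendered alternatives.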

Generating alternative translations

When the source sentence is ambiguously gendered, we examine our translation system’s output to decide if an alternative gender interpretation is possible. If so, we proceed to determine the best way to revise the translation. We begin by constructing a set of candidate target translations by rewriting the original translation. We apply linguistic constraints based on dependency relations to ensure consistency in the proposed alternatives and prune the erroneous candidates.

However, in many cases, even after applying our constraints, we are left with multiple candidate rewrites for the gendered alternative translation. To determine the best option, we evaluate each candidate by scoring it with our translation model. By leveraging the fact that a good gender rewrite will also be an accurate translation of the source sentence, we are able to ensure high accuracy in our final output.

System design of gender re-inflection
A diagram showing system design of gender re-inflection.
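
A minimal sketch of the generate, prune, and score steps might look like the following. Everything here is an assumption made for illustration: the two-word lexicon, the hand-written Spanish sentence, the dependency link, and the placeholder `toy_score`, which stands in for scoring a candidate with the translation model against the source sentence.

```python
from itertools import product

# Hypothetical masculine -> feminine lexicon, for illustration only.
FEMININE_FORM = {"nuestro": "nuestra", "abogado": "abogada"}


def candidate_rewrites(tokens):
    """Enumerate every rewrite that swaps at least one token to its feminine form."""
    options = [(t, FEMININE_FORM[t]) if t in FEMININE_FORM else (t,) for t in tokens]
    return [list(c) for c in product(*options) if list(c) != tokens]


def is_consistent(candidate, original, links):
    """A (head, dependent) link is consistent if both ends changed or neither did."""
    return all(
        (candidate[head] != original[head]) == (candidate[dep] != original[dep])
        for head, dep in links
    )


def best_rewrite(tokens, links, score):
    """Prune inconsistent candidates, then keep the highest-scoring survivor."""
    survivors = [c for c in candidate_rewrites(tokens) if is_consistent(c, tokens, links)]
    return max(survivors, key=score) if survivors else None


def toy_score(candidate):
    # Placeholder: a real system would score the candidate with the
    # translation model, conditioned on the source sentence.
    return 0.0


tokens = "pidamos la opinión de nuestro abogado".split()
links = [(5, 4)]  # assumed dependency: "abogado" (head) governs "nuestro"
best = best_rewrite(tokens, links, toy_score)
print(" ".join(best))
```

In this toy case the agreement constraint alone leaves one candidate; with richer sentences several candidates can survive, which is where the model score decides.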

Leveraging managed online endpoints in Azure Machine Learning

The gendered alternative feature in Bing is hosted on managed online endpoints in Azure Machine Learning. Managed online endpoints provide a unified interface to invoke and manage model deployments on Microsoft-managed compute in a turnkey manner. They enable us to take advantage of scalable and reliable endpoints without being concerned about infrastructure management. This inference environment also enables the processing of large numbers of requests with low latency. Our ability to create and deploy the gender debias service with the latest frameworks and technologies has been greatly improved through the use of managed inference features in Azure Machine Learning. By leveraging these features, we have been able to maintain low COGS (Cost of Goods Sold) and ensure straightforward security and privacy compliance.

How can you contribute?

To facilitate progress in gender bias reduction in MT, we are releasing a test corpus containing gender-ambiguous translation examples from English into Spanish, French and Italian. Each English source sentence is accompanied by multiple translations, covering each possible gender variation.

Our test set is constructed to be challenging, morphologically rich and linguistically diverse. This corpus has been instrumental in our development process. It was developed with the help of bilingual linguists with significant translation experience. We are also releasing a technical paper that discusses the test corpus in detail, along with the methodology and tools for evaluation.

GATE: A challenge set for Gender-Ambiguous Translation Examples – Paper

GATE: A challenge set for Gender-Ambiguous Translation Examples – Test set

Path forward

Through this work we aim to improve the quality of MT output in cases of ambiguous source gender, as well as facilitate the development of better and more inclusive natural language processing (NLP) tools in general. Our initial release focuses on translating from English to Spanish, French, and Italian. Going forward, we plan to expand to new language pairs, as well as cover additional scenarios and types of biases.

Credits:

Ranjita Naik, Spencer Rarrick, Sundar Poudel, Varun Mathur, Jeshwanth Kumar Chandrala, Charan Mohan, Lee Schwartz, Steven Nguyen, Amit Bhagwat, Vishal Chowdhary.

From Tweet to Translate: Microsoft’s translation service powers new translation feature in Twitter for Windows Phone
http://approjects.co.za/?big=en-us/translator/blog/2013/06/27/from-tweet-to-translate-microsofts-translation-service-powers-new-translation-feature-in-twitter-for-windows-phone/
Thu, 27 Jun 2013 18:02:00 +0000

The post From Tweet to Translate: Microsoft’s translation service powers new translation feature in Twitter for Windows Phone appeared first on Microsoft Translator Blog.

Over the last few months, we shared with you two innovative translation experiences that we developed for the Windows platform – Bing Translator for Windows Phone and for Windows 8. These apps utilize the best technologies from Microsoft Research, Bing and Windows to deliver great travel, communication and information consumption experiences to consumers.

Thousands of developers are at BUILD 2013 in San Francisco this week where Microsoft is showcasing how they can create great experiences for their consumers on Windows platforms by utilizing these technologies in their own applications.

Today during Steven Guggenheimer’s keynote at BUILD, Microsoft showcased the availability of an exciting new update to Twitter for Windows Phone – bringing instant translation of Tweets that are in a different language than your own. Over the last year, Microsoft has been working with the team at Twitter to explore how its translation technology, based on Microsoft Research’s extensive advancements in machine learning, can help the global Twitter community better communicate across language barriers.

        Twitter Screenshot 2    Twitter Screenshot

With this update, a soccer/football fan can still follow the news about their favorite soccer team even if the breaking news on Twitter is not in their language. Tapping on a Tweet with a globe icon, which indicates translation is available, expands the Tweet and shows translated text right below the original content. The built-in Tweet translation feature is available for the 38 languages supported by the app powered by Microsoft Translator. Download/update your Windows Phone Twitter app to try it out for yourself!

“Breaking down language barriers with world-class research and engineering has been the guiding principle behind the development of Microsoft Translator, and Twitter is an excellent new addition to community of customers and developers leveraging Microsoft’s translation technology for their users,” said Peter Lee, Corporate Vice President of Microsoft Research US. “The integration of machine translation technology from Microsoft Research has the ability to broaden any application’s impact through a substantial increase in accessibility to real time communications and information sharing. No longer is language a barrier to real time instant connections around the world.”

Windows Phone application developers can take advantage of the Microsoft Translator API to bring the power of instant translation to their apps. Windows developers can also download the just-announced Translator control for Windows to reach a global audience and differentiate their Windows applications.

As the next billion users come online, we look forward to delivering and enabling many more global experiences by continuing to harness the innovations coming out of our research work and data platforms with developers, app builders and partners. 

A Window to the World, Bing Translator App for Windows Now Available
http://approjects.co.za/?big=en-us/translator/blog/2013/06/06/a-window-to-the-world-bing-translator-app-for-windows-now-available/
Thu, 06 Jun 2013 16:56:00 +0000

The post A Window to the World, Bing Translator App for Windows Now Available appeared first on Microsoft Translator Blog.

The Bing Translator app for Windows is available for download today. Designed from the ground up for Windows devices, the app places powerful translation technology at your fingertips by instantly translating content in more than 40 languages, at home, at work, or on the go. Whether you are using your PC’s camera for “augmented reality” translation, typing in a quick sentence or two, working offline when you are not connected, or harnessing unique features of Windows to translate content from many other Windows apps, Bing Translator is a must-have application for all your Windows devices.

You can now download the free app from the Windows store here.

The Bing Translator app is based on years of Microsoft Research’s investments in advancing machine learning – a way to find patterns that humans can’t see, helping people interpret the words and worlds around them.

Translating content whether browsed, typed or scanned is nearly instantaneous. Just point your device’s camera at printed text and watch as the translation is automatically overlaid over the video stream – creating subtitles for everyday life. You can also type to translate with your keyboard and hear translations spoken with a native speaker’s accent.

The Translator app is the perfect companion when traveling. The app can help overcome language barriers, even when there’s no internet connection. Save on expensive data plans when traveling with offline language packs for select languages so you can travel with confidence, even in the most remote locations. More language packs are coming soon.

The Share Charm lets you quickly translate highlighted text in any Windows 8 app. With Snap View, you can multitask while browsing, chatting, and more by snapping Bing Translator to the right or left of your screen. With this unique feature, powerful translation technology is just a swipe away in Windows 8 no matter where you are, at your desk or on the go.

For more about the app, check out the Translator for Windows product page, and check back for future developments and updates.

We hope that this app becomes your window to the world, no matter where you are!

– Vikram Dendi,
Director of Product Management,
Microsoft/Bing Translator – Microsoft Research

Ready to Reenergize: Community Unveiling of the Custom Mayan to Spanish Translation System
http://approjects.co.za/?big=en-us/translator/blog/2013/01/04/ready-to-reenergize-community-unveiling-of-the-custom-mayan-to-spanish-translation-system/
Sat, 05 Jan 2013 00:00:00 +0000

The post Ready to Reenergize: Community Unveiling of the Custom Mayan to Spanish Translation System appeared first on Microsoft Translator Blog.

Special guest post from Microsoft Research Connections Director Kristin Tolle, who has been working with the Mayan community to enable them to preserve their language. Microsoft Translator Hub provides a means for communities and businesses to build custom language translation systems.

At X’Caret, the Mayan eco-archaeological park in Playa del Carmen, the Rector of the Universidad Intercultural Maya de Quintana Roo, Professor Francisco Rosado-May, and I, along with the Governor of Quintana Roo, Roberto Borge Angulo, unveiled the custom Mayan to Spanish translation system and demonstrated it to the community on December 21st, 2012, a date that coincided with the end of the 13th b’ak’tun and the beginning of the 14th. A fitting beginning for the Mayan-Spanish translation system.

In a Microsoft Research Connections blog post, I mentioned what an honor it is to work with local communities to create new translation models. What is special about the Microsoft Translator Hub is that it enables this capability “at home” by putting the power of developing a translation system into the hands of the organizations that care about it the most: the communities themselves.

An organization’s small data can be combined with our big data for the major languages to aid in the training of a new system, keeping it in use for coming generations or, as the Mayans say, b’ak’tun. This is incredibly important to culture and language preservation. As Carlos Allende, Public Sector Director, Microsoft México, explains, “The Microsoft Translator Hub is Microsoft’s contribution to worldwide cultures. In Mexico we are proud that this incredible technology is displayed for celebrating the Mayan Katun for keeping this language alive and allowing the next generation to have access to this millenarian knowledge.”

It takes a great deal of effort to build a translation model between two languages. One of the features of the Microsoft Translator Hub is that one can do this directly: create a translation model between two languages without having to go through a “pivot” language (usually English). And this is what the local university, Universidad Intercultural Maya de Quintana Roo, has set out to do: to translate from Mayan to Spanish and vice versa.

The process began in May of this year when the Rector of the University, Professor Francisco Rosado-May, met with us at the LATAM Faculty Summit held in Cancun to discuss how it might be possible for his institution to work on Yucatec, a local Mayan dialect, as well as other related languages.

“The Translator Hub by Microsoft is not only a powerful software that facilitates the proper communication between Maya and Spanish but it is also a very important tool to achieve one of the strategic goals of our university: to preserve and increase the use of Maya,” said Professor Rosado-May who went on to explain the significance of language preservation, “Language is the genetic code of any culture, by understanding and using a lot more Maya, we also understand better the mental processes that trigger the construction of knowledge. In the case of Maya, that means understanding how they created sophisticated knowledge such as the zero, astronomy, mathematics, etc. This is why my University and I appreciate so much what Microsoft is doing with the Translator Hub.”

What is being unveiled today is a result of the hard work of linguistics professor, Martin Equival-Pat, his students, local language experts and the support of the local government agencies and Microsoft Mexico. Through their work the university has been able to build a Spanish to Yucatec and Yucatec to Spanish translation system that is just the beginning. As Rosado-May goes on to elaborate, “I expect that the hub will play an important role for the years to come in positioning the Maya language in the global world. We might be witnessing something special for the Baktuns ahead of us and contributing to one of the most important dreams all over the world: live in peace by understanding each other better, and recognizing that different cultures and different languages are important for peace.”

Microsoft Mexico fully supports this project and is committed to the Mayan society. As Juan Alberto González Esparza, General Director, Microsoft México, explains, “Think for a moment of a situation where a Spanish speaker and a Maya person communicate with one another in their own languages using a computer or a phone. This is the world that Microsoft has imagined and now this is a reality thanks the Microsoft Translator HUB-Maya; that brings to the new age the Mayan language with all its culture, meanings, stories and lifestyle that will be preserved and available to everyone worldwide. This is the way we are generating a real impact in vulnerable communities connecting people with the potential of our technology.”

As we entered the 14th b’ak’tun on December 22nd, energized and engaged, the possibilities for the impact of the Hub and of language preservation throughout the world are limitless.

Kristin Tolle
Director, Natural User Interactions Team
Microsoft Research Connections

 

 

Breakthroughs in Translating Speech from our Research Teams
http://approjects.co.za/?big=en-us/translator/blog/2012/11/12/breakthroughs-in-translating-speech-from-our-research-teams/
Mon, 12 Nov 2012 18:59:00 +0000

The post Breakthroughs in Translating Speech from our Research Teams appeared first on Microsoft Translator Blog.

This is the year of machine learning and big data. Whether it is predicting political results, supercharging your Excel spreadsheets, helping map queries to intent in Search, or even customizing a translation engine to best fit your content – these research areas are playing a starring role in transforming technology and productivity.

A couple of weeks back, at the 14th annual Computing in the 21st Century Conference, attendees saw a glimpse of where else these technologies are taking us – and loved it. Rick Rashid, who heads up Microsoft Research worldwide, went up on stage and in the span of eight sentences, got the 2000+ strong crowd up on their feet and cheering. It was a moment where technology was indistinguishable from magic – and one that would spur science fiction writers to start thinking of bigger challenges for researchers to tackle 🙂

Watch the video to see for yourself:

 

 

A combination of powerful technologies was employed to make this amazing demonstration possible. Deep Neural Network based processing combined with high performance computing allowed a significant jump in the accuracy of speech recognition. The Microsoft Translator technology that you use each day was customized to best fit Rick’s speech content. New speech synthesis technology that allows personalization of acoustic characteristics was able to create “Rick’s voice” in a language he does not speak. You can read Rick’s blog post here.

Some of these technologies are already available today, especially the industry-leading translation (Microsoft Translator) with customization capabilities (Translator Hub). If you are a Windows Phone user, you have been enjoying the most innovative translation app on any phone for over a year now, which includes an early speech translation experience that has been tuned for travel situations. The audio output that you hear on the Bing Translator website uses some of the newer speech synthesis engines coming out of our Speech research. Deep-Neural-Net research is also behind our audio/video indexing service, MAVIS, which is available commercially.

The excitement that has been rippling across the web in response to this demonstration is an indicator of how much everyone wants to experience this ‘magic’. There is much work to do, but you will see the benefits of this amazing research in future releases of our products.

Vikram Dendi
Director
Microsoft/Bing Translator & Microsoft Research

Politically Incorrect Machines
http://approjects.co.za/?big=en-us/translator/blog/2008/10/25/politically-incorrect-machines/
Sun, 26 Oct 2008 05:57:00 +0000

The post Politically Incorrect Machines appeared first on Microsoft Translator Blog.

While we at the Machine Translation team have been seeing increasing traffic to our various offerings over the past few months, we noticed a sudden bump in traffic yesterday. Having grown up on Agatha Christie and Sherlock Holmes, I find such mysteries irresistible, and a number of other folks on the team were just as curious to find out what caused this sudden bump. We figured that the IE8 Activity/Accelerator, the Messenger Bot, Search translations, and Office translations were all showing the same upward trend as in the days before and thus were not the specific reason for this bump.

Eventually, we were able to identify one potential reason for the spike. Our user community found an oddity in how the machine translation engine translated several names from English to German. Given the current political atmosphere in the run-up to the US elections, it was to be expected that when the engine translates the name of one party’s candidate into the name of someone from the other party, it would end up as news. While we certainly welcome all the new users who came by to check this phenomenon out, we wanted to share with our users why such things happen from time to time with statistically trained machine translation systems, ours and others’.

A Statistical Machine Translation engine is trained on lots and lots of parallel data, that is, data that exists in both a source language (e.g., English) and a target language (e.g., German), where the source and target are translations of one another. Our engine is trained on millions of sentences for each language pair we support. In order to train on a particular corpus of data—maybe a large number of newswire articles in English which have been translated into German—we first have to break that corpus down into sentences. After the corpus is sentence broken, we feed the resulting sentences into a sentence aligner, the sole purpose of which is to find what sentences on the source side align with sentences on the target side. This is no trivial task, since a sentence on one side could conceivably align with one or more sentences on the target (or possibly none at all!). The aligner will sometimes make mistakes, and misalign one sentence with another that is in fact not a translation. This can lead to some mistranslations, especially if there are words in the source and target that are infrequently occurring. Since our translation engine is statistical, it is highly reliant on co-occurrence frequencies between words in the source and target data. If certain words are infrequently occurring—people’s names, for instance, may only occur a few times across a corpus of millions of sentences—the lack of frequency can lead to mistranslations resulting from incorrect “guesses” between source and target (i.e., low probabilities assigned to particular source and target words). This can lead to some comical gaffes in our translation system.
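
To make the effect of a misalignment concrete, here is a small sketch that estimates word translation probabilities from raw co-occurrence counts over aligned sentence pairs. It is a deliberately simplified stand-in, not the engine’s actual training procedure, and the sentence pairs and names are made up for the example.

```python
from collections import Counter, defaultdict


def cooccurrence_probs(pairs):
    """Estimate P(target word | source word) from raw co-occurrence counts."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        for s in source.split():
            for t in target.split():
                counts[s][t] += 1
    return {
        s: {t: n / sum(c.values()) for t, n in c.items()}
        for s, c in counts.items()
    }


pairs = (
    [("the house", "das Haus")] * 3   # correctly aligned, frequent
    + [("the house", "der Garten")]   # one misaligned pair
    + [("Alice", "Bob")]              # rare name, seen only in a misaligned pair
)
probs = cooccurrence_probs(pairs)
print(probs["house"])
print(probs["Alice"])
```

For the frequent word, the single bad pair is outvoted by the correct ones. For the rare name, the only evidence available is the misaligned pair, so all of the probability mass lands on the wrong word.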

So, that is how the “machine” decided to translate in a way that ended up with the community attributing it to the sense of humor of our team. While we continue to work hard to ensure proper alignments, it is to be expected from a statistical system that is built on millions to billions of words that such a situation could repeat.

The current issue with alignment should now be resolved but we urge our community of users to keep helping us identify any such situations by contacting us through this blog.

-Vikram

Vikram Dendi leads Business Strategy & Product Planning for the Microsoft Translator team
