{"id":306308,"date":"2010-02-04T14:00:34","date_gmt":"2010-02-04T22:00:34","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?p=306308"},"modified":"2016-10-16T14:34:55","modified_gmt":"2016-10-16T21:34:55","slug":"translator-fast-tracks-haitian-creole","status":"publish","type":"post","link":"https:\/\/www.microsoft.com\/en-us\/research\/blog\/translator-fast-tracks-haitian-creole\/","title":{"rendered":"Translator Fast-Tracks Haitian Creole"},"content":{"rendered":"

By Janie Chang, Writer, Microsoft Research<\/em><\/p>\n

In disaster relief, every hour makes a difference, and communication is essential. When aid efforts began after the recent Haiti earthquake, a request came to the Machine Translation team within Microsoft Research\u2019s Natural Language Processing (NLP) group from Microsoft volunteers involved in the community supporting assistance in Haiti: Was there a quick way to deliver an online English\/Haitian Creole translator?<\/p>\n

The request to the team came on Tuesday, Jan. 19. Vikram Dendi, senior product manager for the Machine Translation team, sent out an SOS.<\/p>\n

Five days later, Dendi sent this e-mail:<\/p>\n

\u201cHi folks \u2013 some news that might be of interest to you. After 5 days of work (including about 20 hours nonstop towards the end) our team shipped a scalable Haitian Creole (<\/em>Krey\u00f2l) system this morning. We would certainly appreciate your help in spreading the word about this\u2013 as well as reaching out to humanitarian agencies that might find it of use. The service and the APIs are all available at no cost.\u201d<\/em><\/p>\n

Figuring Out How to Fast-Track<\/h2>\n

Microsoft Translator is the translation engine behind other applications, so adding Haitian Creole to its list of supported languages makes user applications such as Bing Translator or TBot Messenger Translation immediately useful to crisis-response volunteers and companies that need to bridge the language gap in Haiti.<\/p>\n


Microsoft Translator’s Haitian Creole widget.<\/p><\/div>\n

Normally, adding a new language to the machine-translation engine can take weeks, if not months. Driven by the urgency of the situation, Dendi\u2019s product team and NLP researchers put aside other priorities and brainstormed ways to get an experimental but functional Haitian Creole machine-translation system online quickly.<\/p>\n

Chris Quirk, a researcher with the Machine Translation team, recalls those initial meetings.<\/p>\n

\u201cWhen Vikram first told us the aid community had asked for a Haitian Creole machine-translation system, I was intrigued but skeptical. Statistical machine translation has the incredible ability to turn parallel translated data into translation systems in a matter of hours or days\u2014once you have enough training data.\u201d<\/p>\n

The NLP team knew that its biggest challenge would be identifying parallel data between English and Haitian Creole for training the engine. Haitian Creole, or Krey\u00f2l, is one of two official languages spoken in Haiti; French is the other. Approximately 8 million people in Haiti speak Creole. Compared with more widely spoken languages, the amount of parallel data for Creole is fairly limited.<\/p>\n


The NLP group at work on the Haitian Creole translation system.<\/p><\/div>\n

But team members quickly replaced skepticism with dogged determination and reached out for help. That was when they discovered other groups who had made language resources available.<\/p>\n

\u201cFor instance,\u201d Quirk says, \u201cCarnegie Mellon University had a repository for parallel Haitian Creole and English spoken and text data. Government agencies released parallel documents and glossaries, and Web sites such as CrisisCommons and haitisurf.com were happy to share glossaries and translation resources.\u201d<\/p>\n

Such assistance was invaluable.<\/p>\n

\u201cIf not for the efforts of the community, who made data and dictionaries available with minimal license restrictions,\u201d Dendi says, \u201cthis Haitian Creole machine translator would not be available.\u201d<\/p>\n

Quirk and others immediately turned to the task of integrating these language resources, building training systems, and optimizing translation quality. Within a few days, the researchers had constructed a system that produced reasonable results, and an engineering team worked nights and weekends to make the translation system go live as soon as possible.<\/p>\n

Availability, Then Improvement<\/h2>\n

The team chose to make the system available to the aid community as early as possible and then improve the training data. The statistical machine-translation system behind Microsoft Translator supports this approach, because translation quality improves continuously as more training data is added. A typical new language release involves significantly more training data and quality testing, but the team judged an early release justified because it could keep improving translation quality after launch.<\/p>\n


Bing Translator using Haitian Creole.<\/p><\/div>\n

Delivering Haitian Creole via a proven Web service ensures scale and performance; in combination with Microsoft Translator\u2019s extensive API set, it enables developers who are building solutions for the relief effort to add Haitian Creole support to other software and Web sites.<\/p>\n

\u201cReleasing it now means developers can start now,\u201d Dendi says, \u201cand as we add more training data, the translated results will improve. One of the volunteers who contacted us has already built a mobile application using Microsoft Translator APIs.\u201d<\/p>\n

An Ongoing Effort<\/h2>\n

The goals now for Dendi and his team are twofold: improving the training data and making sure the aid community knows the resource is available. Various groups within Microsoft are using social media and blogs to reach out to individual users, as well as to technology projects that could use a scalable translation system in their relief efforts.<\/p>\n


A Web page translated into Haitian Creole.<\/p><\/div>\n

\u201cWe want everyone who is helping with these relief efforts to know that the services and usage of the Microsoft Translator API are completely free,\u201d Dendi emphasizes. \u201cIt can be built into any application or Web site for immediate use. We hope this will help with many of the applications being developed, such as those at crisiscommons.org, to aid in humanitarian efforts. Developers can choose from SOAP, HTTP, and AJAX APIs.\u201d<\/p>\n

Since Jan. 25, the team has added more training data, including manually translated data specifically relevant to humanitarian-aid scenarios, which the team hopes will provide much better results in the field. They are \u201cnowhere near done yet\u201d and continue to work on the project.<\/p>\n

One of Microsoft\u2019s partners in this effort, the Butler Hill Group, provided human translations and evaluation services at no cost, saying, \u201cWe are proud to be able to help you with this important work.\u201d<\/p>\n

That\u2019s the sort of collaborative spirit the project engendered.<\/p>\n

\u201cIt was truly inspiring,\u201d Quirk says, \u201cto see people across the whole natural-language-technologies community work together. This is something we will encourage by releasing more data back to the community. We hope these technologies can help the people at the center of disaster relief efforts communicate a little better.\u201d<\/p>\n

How You Can Help<\/h2>\n

The best way people can help improve the system, Dendi says, is by contributing more training data\u2014typically sentences or words translated between English and Haitian Creole.<\/p>\n

If you know about dictionaries, translated sentences, or Web sites that have such translations, please contribute them via the TAUS Data Association (TDA) data-sharing initiative. TDA is a non-profit organization providing a neutral, secure platform for sharing language data. Microsoft Research intends to make the Haitian Creole data it collects available to the larger community, via the TDA, for training purposes, as license restrictions permit. Please e-mail your concerns or questions.<\/p>\n

There are many initiatives under way for building applications and Web sites to help with the relief efforts, including several for mobile apps, using the SOAP or HTTP API, and Web sites, using the AJAX API. If you have a project in the works, please provide a link to your application or Web site in the comments of the Microsoft Translator Official Team Blog, and the team will make sure to include it where others can find the information.<\/p>\n

If you encounter problems with translations or using APIs, e-mail your feedback.<\/p>\n

There are products and services powered by Microsoft Translator that could help your organization. Users and developers can discuss issues on the Machine Translation and Language Tools forum.<\/p>\n","protected":false},"excerpt":{"rendered":"

By Janie Chang, Writer, Microsoft Research In disaster relief, every hour makes a difference, and communication is essential. When aid efforts began after the recent Haiti earthquake, a request came to the Machine Translation team within Microsoft Research\u2019s Natural Language Processing (NLP) group from Microsoft volunteers involved in the community supporting assistance in Haiti: Was […]<\/p>\n","protected":false},"author":39507,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"msr-url-field":"","msr-podcast-episode":"","msrModifiedDate":"","msrModifiedDateEnabled":false,"ep_exclude_from_search":false,"_classifai_error":"","footnotes":""},"categories":[194456,194462],"tags":[200699,214646,214649,214652,186515,214655,214658],"research-area":[13545],"msr-region":[],"msr-event-type":[],"msr-locale":[268875],"msr-post-option":[],"msr-impact-theme":[],"msr-promo-type":[],"msr-podcast-series":[],"class_list":["post-306308","post","type-post","status-publish","format-standard","hentry","category-natural-language-processing","category-speech-and-dialog","tag-bing-translator","tag-disaster-relief","tag-haiti-earthquake","tag-humanitarian-agencies","tag-machine-translation","tag-microsoft-translator-api","tag-taus-data-association","msr-research-area-human-language-technologies","msr-locale-en_us"],"msr_event_details":{"start":"","end":"","location":""},"podcast_url":"","podcast_episode":"","msr_research_lab":[],"msr_impact_theme":[],"related-publications":[],"related-downloads":[],"related-videos":[],"related-academic-programs":[],"related-groups":[144736],"related-projects":[],"related-events":[],"related-researchers":[],"msr_type":"Post","byline":"","formattedDate":"February 4, 2010","formattedExcerpt":"By Janie Chang, Writer, Microsoft Research In disaster relief, every hour makes a difference, and communication is essential. 
When aid efforts began after the recent Haiti earthquake, a request came to the Machine Translation team within Microsoft Research\u2019s Natural Language Processing (NLP) group from Microsoft…","locale":{"slug":"en_us","name":"English","native":"","english":"English"},"_links":{"self":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/306308"}],"collection":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/users\/39507"}],"replies":[{"embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/comments?post=306308"}],"version-history":[{"count":2,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/306308\/revisions"}],"predecessor-version":[{"id":306326,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/posts\/306308\/revisions\/306326"}],"wp:attachment":[{"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/media?parent=306308"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/categories?post=306308"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/tags?post=306308"},{"taxonomy":"msr-research-area","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/research-area?post=306308"},{"taxonomy":"msr-region","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-region?post=306308"},{"taxonomy":"msr-event-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-event-type?post=306308"},{"taxonomy":"msr-locale","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/resear
ch\/wp-json\/wp\/v2\/msr-locale?post=306308"},{"taxonomy":"msr-post-option","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-post-option?post=306308"},{"taxonomy":"msr-impact-theme","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-impact-theme?post=306308"},{"taxonomy":"msr-promo-type","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-promo-type?post=306308"},{"taxonomy":"msr-podcast-series","embeddable":true,"href":"https:\/\/www.microsoft.com\/en-us\/research\/wp-json\/wp\/v2\/msr-podcast-series?post=306308"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}