How you can use Azure Translator to batch translate your documents
Saffi Ali, Microsoft Industry Blogs - United Kingdom, Tue, 06 Apr 2021
http://approjects.co.za/?big=en-gb/industry/blog/technetuk/2021/04/06/how-you-can-use-azure-translator-to-batch-translate-your-documents/

The post How you can use Azure Translator to batch translate your documents appeared first on Microsoft Industry Blogs - United Kingdom.


In this article we will go through the requirement, challenges, and solution to automatically batch translate documents (HTML/TXT/Word) from any source language to any output language, while maintaining the structure and formatting of the source documents.

 

Requirements

Recently, we had a requirement to translate documents in 15 different languages to English and vice versa. The expectation was to upload one source document and get back N translated documents, with the following high-level requirements:

  1. Most documents are HTML or TXT based.
  2. Any translation must maintain the document structure, keeping static contents, tables, etc. untouched.
  3. Document size can vary anywhere between 1 MB and 20 MB.
  4. Document volume could reach 12,000 documents per month.
  5. The translation service must not save the documents.
  6. Any customisation to the translation service must enable the customer to view and delete custom data and models at any time.

 

Azure Translator

Azure Cognitive Services offers a variety of AI services and cognitive APIs to help you build intelligent apps. One of those services is Azure Translator. With it, you can translate text in real time across more than 60 languages, powered by the latest innovations in machine translation. It supports a wide range of use cases, such as translation for call centres, multilingual conversational agents, or in-app communication.
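As a rough illustration of what a single call to the service looks like, the sketch below builds (but does not send) a request for the Translator v3 `translate` operation. The class and method names are made up for this example, and it assumes the global endpoint; your resource may use a regional or custom endpoint instead.

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;

public static class TranslatorRequestSketch
{
    // Assumption: the global Translator v3 endpoint. A regional or custom endpoint would differ.
    private const string Endpoint = "https://api.cognitive.microsofttranslator.com";

    // Builds a POST /translate request for one piece of text.
    // No "from" parameter is set, which asks the service to detect the source language.
    public static HttpRequestMessage Build(string text, string toLang, string key, string region)
    {
        var uri = $"{Endpoint}/translate?api-version=3.0&to={Uri.EscapeDataString(toLang)}";

        // The request body is a JSON array of objects, each with a single "Text" property.
        var body = JsonSerializer.Serialize(new[] { new { Text = text } });

        var request = new HttpRequestMessage(HttpMethod.Post, uri)
        {
            Content = new StringContent(body, Encoding.UTF8, "application/json")
        };
        request.Headers.Add("Ocp-Apim-Subscription-Key", key);
        // The region header is needed for regional and multi-service resources.
        request.Headers.Add("Ocp-Apim-Subscription-Region", region);
        return request;
    }
}
```

The returned message can be passed to `HttpClient.SendAsync`. The response is a JSON array with one entry per input text, each holding a `translations` list.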

An illustration of the Azure Translate process

The security and compliance features in Azure Translator meet the security requirements, as below:

  • Customer data isn’t written to persistent storage. This meets requirement number 5 above.
  • View and delete your custom data and models at any time. This meets requirement number 6 above.

 

Limitations

The Azure Translator service has now natively met 2 of the 6 requirements without writing any code. So, let’s talk about some challenges:

  1. API limit: the Azure Translator service has an API limit of 5,000 characters per call. HTML has a high tag-to-text ratio: a good text-to-HTML ratio is anywhere from 25 to 70 percent, so markup can use up much of that budget. This means we may easily hit the 5,000-character limit with a single call to translate the HTML header, if the header has reasonably large content.
  2. Maintain the structure of the HTML document. This means we need:
    • To inspect the overall content and decide what needs to be translated first.
    • To skip certain tags and content.
    • To change LTR/RTL alignment between languages.
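To make the character limit concrete, here is a minimal sketch of slicing plain text into batches of at most 5,000 characters. It is not the Document Translator logic: the class name and the choice to cut at whitespace are assumptions for illustration, and it ignores HTML tags entirely.

```csharp
using System;
using System.Collections.Generic;

public static class BatchSlicerSketch
{
    // Splits text into consecutive chunks of at most maxChars characters.
    // When a chunk would end mid-text, it is cut after the last whitespace
    // in the window (if any) so that words are not split.
    // Concatenating the chunks in order restores the original text.
    public static List<string> Slice(string text, int maxChars = 5000)
    {
        if (maxChars <= 0) throw new ArgumentOutOfRangeException(nameof(maxChars));

        var chunks = new List<string>();
        var start = 0;
        while (start < text.Length)
        {
            var length = Math.Min(maxChars, text.Length - start);
            if (start + length < text.Length)
            {
                // Search backwards through the current window for whitespace.
                var cut = text.LastIndexOfAny(new[] { ' ', '\n', '\t' }, start + length - 1, length);
                if (cut > start) length = cut - start + 1;
            }
            chunks.Add(text.Substring(start, length));
            start += length;
        }
        return chunks;
    }
}
```

For HTML the same idea applies, but the cut points would need to fall between elements rather than at arbitrary whitespace, which is why the structure has to be inspected first.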

 

Solution

There is a great Document Translator WPF application developed by the Microsoft Translator Engineering team that will do the document translation, but this will require users to manually import files. This app cannot scale to the thousands of documents that need to be translated as fast as possible.

My idea was to use the following components:

  • Azure Blob Storage to store both source documents and translated documents.
  • Azure Function to run the code that orchestrates the translation.
  • Reuse the business logic in the Document Translator after porting it to .NET Core to run in Azure Functions.
  • And of course, the Azure Translator API.

A diagram illustrating the proposed solution

The sequence will be as follows:

  1. Ingestion: Users will upload documents to an Azure blob container. This is like a virtual folder.
  2. Initial processing by Azure Function:
    • The Azure Function will be triggered when a new, supported file (HTML/TXT) is uploaded to that container. You can learn more about Azure Function Triggers and Bindings on Microsoft Docs.
    • It will determine the source language and destination language, and runtime configurations like the API key.
    • It will then route the processing depending on the file type as below:
      // Translate: route by file extension
      switch (FileExtension)
      {
          case "html":
          case "htm":
              TranslatedContent = HTMLTranslationManager.DoContentTranslation(ContentToBeTranslated, FromLang, ToLang);
              break;
          case "txt":
              TranslatedContent = DocumentTranslationManager.ProcessTextDocument(ContentToBeTranslated, FromLang, ToLang);
              break;
          default:
              // No translation is attempted for other extensions.
              break;
      }
    • For HTML:
      • It will manipulate the content and decide what to translate and what to skip.
      • It will then send batches of requests to the Translate API of 5,000 characters or less to translate.
    • For TXT files:
      • It will slice the content into batches of 5,000 characters or less and send them to the API.
    • Lastly, it will concatenate the results in the same sequence the batches were sent, then correct the alignment and format depending on the output language.
    • Then it will output the translation document to a different Azure Blob container.
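The "concatenate in the same sequence" and alignment steps above can be sketched as follows. This is an illustration under assumptions, not the project's code: `translateAsync` stands in for whatever calls the Translator API, and the right-to-left list uses ISO language codes, whereas the project's settings use language names such as "Arabic".

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class OrderedTranslationSketch
{
    // Assumption: a small, non-exhaustive set of right-to-left language codes.
    private static readonly HashSet<string> RtlLanguages =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "ar", "he", "fa", "ur" };

    // Translates every batch and joins the results in the order the batches
    // were supplied, even if individual calls finish out of order.
    public static async Task<string> TranslateInOrderAsync(
        IEnumerable<string> batches,
        Func<string, Task<string>> translateAsync)
    {
        // Start all calls, then wait; Task.WhenAll returns results in task order.
        var pending = batches.Select(translateAsync).ToList();
        var translated = await Task.WhenAll(pending);
        return string.Concat(translated);
    }

    // Returns the text direction to apply to the output document.
    public static string DirectionFor(string toLang) =>
        RtlLanguages.Contains(toLang) ? "rtl" : "ltr";
}
```

Starting every call at once keeps the sketch short, but at real volumes you would likely want to cap concurrency to stay within the service's rate limits.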

 

The Code

The source code for the project is available on GitHub.

To run the application, you need to:

  1. Clone the repository: git clone https://github.com/saffiali/AutoTranslateBlobs.git
  2. Open the project in Visual Studio or VS Code.
  3. Create or edit the local.settings.json file to include the following settings:
       "AzureWebJobsStorage": "",
       "FromLang": "Auto-Detect",
       "ToLang": "Arabic",
       "AzureTranslateKey": ""

 

About the Author

Saffi is a Cloud Solution Architect at Microsoft. He is part of the App Innovation team and is an SME for Azure App Development, Azure Blockchain and Azure Integration Services. You can follow him on LinkedIn and Twitter.

 


Using Lobe models as APIs in Azure Functions
Saffi Ali, Microsoft Industry Blogs - United Kingdom, Tue, 10 Nov 2020
http://approjects.co.za/?big=en-gb/industry/blog/technetuk/2020/11/10/using-lobe-models-as-apis-in-azure-functions/

The post Using Lobe models as APIs in Azure Functions appeared first on Microsoft Industry Blogs - United Kingdom.


Microsoft has recently shipped a very interesting Machine Learning app called Lobe. This article will cover:

  • What is Lobe?
  • Lobe vs. Azure Cognitive Services
  • How to use Lobe models in Azure Functions

 

What is Lobe?

Lobe is a free client app that helps any user to bring machine learning ideas into reality. Just show it examples of what you want it to learn, and it automatically trains a custom machine learning model that can be shipped to your app. You can also export those models to be used on any platform.

Lobe currently supports Image Classification, with Object Detection and Data Classification coming soon.

 

Lobe vs. Cognitive Services

While Lobe targets most users, Cognitive Services is aimed at developers with little or no machine-learning expertise. With Cognitive Services, all it takes is an API call to embed the ability to see, hear, speak, search, understand and accelerate decision-making into your apps.

Cognitive Services can be trained in the cloud or locally, while Lobe can currently only train locally, and the data must also be stored locally on the user’s computer.

Collaboration is also an issue for Lobe, as it is single-user. Cognitive Services is a cloud service with robust collaboration and security around data and training.

 

How do you use Lobe?

1) First, watch the Lobe product tour, which covers training, labelling and exporting models.

2) Once you have the model ready, export it as a TensorFlow model.


3) Prepare the Model:

  • Install the tf2onnx tool.
  • Convert the TensorFlow model to ONNX using the following command: python -m tf2onnx.convert --saved-model path/that/contains/saved_model/ --output model.onnx

 

Use Model in Azure Function

1) Create a C# Azure Function in Visual Studio as described here.

2) Install the following packages:

  • lobe.Onnx to import the ONNX-based implementation of the image classifier.
  • lobe.ImageSharp to get image manipulation utilities.
  • Microsoft.ML.OnnxRuntime to get the native ONNX runtimes.

3) Copy the model into the project and set its “Copy to Output Directory” property to “Copy always”:


4) Copy the below method into your function. You can find the entire repo here.

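        // Needs System, System.IO, SixLabors.ImageSharp and SixLabors.ImageSharp.PixelFormats,
        // plus the namespaces of the lobe packages installed in step 2.
        private static LobeResult CallModel(string imageToClassify, ExecutionContext context)
        {
            try
            {
                // Demo only: always classify the bundled test image, ignoring the path passed in.
                imageToClassify = System.IO.Path.Combine(context.FunctionDirectory, "..\\testImages\\1565.jpg");

                // The signature and model files sit one level above the function directory.
                var signatureFilePath = System.IO.Path.Combine(context.FunctionDirectory, "..\\signature.json");
                var modelFile = System.IO.Path.Combine(context.FunctionDirectory, "..\\model.onnx");
                var modelFormat = GetConfig("modelFormat");

                ImageClassifier.Register("onnx", () => new OnnxImageClassifier());
                using var classifier = ImageClassifier.CreateFromSignatureFile(
                    new FileInfo(signatureFilePath),
                    modelFile,
                    modelFormat);

                // Load the image and convert it to 24-bit RGB before classifying.
                var results = classifier.Classify(Image
                    .Load(imageToClassify).CloneAs<Rgb24>());

                return new LobeResult { Confidence = results.Classification.Confidence, Label = results.Classification.Label };
            }
            catch (Exception)
            {
                // Any failure (missing file, unreadable model) is reported as an "unknown" result.
                return new LobeResult
                {
                    Label = "unknown",
                    Confidence = 0
                };
            }
        }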
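The method above relies on a `LobeResult` type and a `GetConfig` helper that are not shown. A minimal sketch of what they might look like is given below; both shapes are assumptions for illustration (including the `double` type for `Confidence`), so check the repo for the actual definitions.

```csharp
using System;

// Hypothetical result type: just the two properties that CallModel sets.
public class LobeResult
{
    public string Label { get; set; }
    public double Confidence { get; set; }
}

public static class ConfigSketch
{
    // Hypothetical helper: reads a setting from the environment, where
    // Azure Functions exposes app settings and local.settings.json values.
    // Returns the fallback when the setting is not defined.
    public static string GetConfig(string name, string fallback = null) =>
        Environment.GetEnvironmentVariable(name) ?? fallback;
}
```

With this shape, a `modelFormat` setting of `onnx` would match the format name registered with `ImageClassifier.Register` in the method above.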

5) Test the function using Postman:


 

Summary

Although Lobe is not a substitute for Cognitive Services, it is still a great tool for starter and local machine learning scenarios where scale is not a big deal. However, you must note the following:

  1. Lobe is in public preview. There is no SLA, and in the future Lobe may change or be cancelled.
  2. At present, Lobe only supports image classification (more details on our website).
  3. Lobe is single-user only.
  4. Lobe data and training occur on the user’s computer.
  5. Lobe does not provide easy access to cloud data, and cannot train in the cloud.

 

