{"id":9514,"date":"2022-05-25T09:09:23","date_gmt":"2022-05-25T16:09:23","guid":{"rendered":"https://www.microsoft.com\/en-us\/translator/blog\/?p=9514"},"modified":"2022-05-27T14:06:47","modified_gmt":"2022-05-27T21:06:47","slug":"translate-scanned-pdf-documents-with-document-translation","status":"publish","type":"post","link":"https://www.microsoft.com\/en-us\/translator/blog\/2022\/05\/25\/translate-scanned-pdf-documents-with-document-translation\/","title":{"rendered":"Translate scanned PDF documents with Document translation"},"content":{"rendered":"<p><img loading=\"lazy\" decoding=\"async\" class=\"alignnone size-full wp-image-9523\" src=\"https://www.microsoft.com\/en-us\/translator/blog\/wp-content\/uploads\/sites\/13\/2022\/05\/ScannedPDF-DT-Hero.jpg\" alt=\"Phone used to capture image of document.\" width=\"1500\" height=\"1000\" \/><\/p>\n<p>Today, the\u202f<a href=\"https:\/\/aka.ms\/DocumentTranslationDocs\" target=\"_blank\" rel=\"noopener\">Document translation<\/a> feature of Translator, a Microsoft Azure Cognitive Service,\u202fadds the ability to translate PDF documents containing scanned image content, eliminating the need for customers to preprocess them through an OCR engine before translation.<\/p>\n<p>Document translation was made generally available last year, May 25, 2021, allowing customers to translate entire documents and batches of documents into more than <a href=\"https:\/\/aka.ms\/translatorlanguages\" target=\"_blank\" rel=\"noopener\">110 languages and dialects<\/a> while preserving the layout and formatting of the original file. Document translation supports a variety of file types, including Word, PowerPoint and PDF, and customers can use either pre-built or custom machine translation models. 
Document translation is enterprise-ready with Azure Active Directory authentication, providing secured access between the service and storage through Managed Identity.<\/p>\n<p>Translating PDFs with scanned image content is a highly requested feature from Document translation customers. Customers find it difficult to automatically separate PDF documents that contain regular text from those that contain scanned image content. This creates workflow issues, because customers must first route PDF documents with scanned image content to an OCR engine before sending them to Document translation.<\/p>\n<p>Document translation now has the intelligence<\/p>\n<ul>\n<li>to identify whether a PDF document contains scanned image content,<\/li>\n<li>to route PDFs containing scanned image content to an OCR engine internally to extract text,<\/li>\n<li>to reconstruct the translated content as a regular text PDF while retaining the original layout and structure.<\/li>\n<\/ul>\n<p>Font formatting such as bold, italics, underline, and highlights is not retained for scanned PDF content, as OCR technology does not currently capture it. However, font formatting is preserved when translating regular text PDF documents.<\/p>\n<p>Document translation currently supports PDF documents containing scanned image content <a href=\"https:\/\/aka.ms\/TranslatorOCRLanguages\" target=\"_blank\" rel=\"noopener\">from 68 source languages into 87 target languages<\/a>. Support for additional source and target languages will be added in due course.<\/p>\n<p>Now it\u2019s easier for customers to send all PDF documents to Document translation directly and let it decide when and how to use the OCR engine efficiently.<\/p>\n<p>For customers already using Document translation, no code change is required to use this new feature. 
PDF documents with scanned content can be submitted for translation like any other supported document format.<\/p>\n<p>We are also pleased to announce that Document translation adds support for scanned PDF content at no additional charge to customers. Two pricing plans are available for Document translation through Azure: the Pay-as-you-go plan and the D3 volume discount plan for higher volumes of document translation. Pricing details can be found at\u202f<a href=\"https:\/\/aka.ms\/TranslatorPricing\" target=\"_blank\" rel=\"noopener\">aka.ms\/TranslatorPricing<\/a>.<\/p>\n<p>Learn how to get started with Document translation at\u202f<a href=\"https:\/\/aka.ms\/DocumentTranslationDocs\" target=\"_blank\" rel=\"noopener\">aka.ms\/DocumentTranslationDocs<\/a>.<br \/>\nSend your feedback to <a href=\"mailto:mtfb@microsoft.com?Subject=Document Translation Feedback\" target=\"_blank\" rel=\"noopener\">mtfb@microsoft.com<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Today, the\u202fDocument translation feature of Translator, a Microsoft Azure Cognitive Service,\u202fadds the ability to translate PDF documents containing scanned image content, eliminating the need for customers to preprocess them through an OCR engine before translation. 
Document translation was made generally available last year, May 25, 2021, allowing customers to translate entire documents and batches of documents into more than 110<span class=\"read-more-ellipsis\">&#8230;.<\/span><\/p>\n <p class=\"c-paragraph-3 read-more-link\"><a class=\"c-call-to-action c-glyph f-lightweight\" href=\"https://www.microsoft.com\/en-us\/translator/blog\/2022\/05\/25\/translate-scanned-pdf-documents-with-document-translation\/\">CONTINUE READING <span class=\"x-screen-reader\">\"Translate scanned PDF documents with Document translation\"<\/span><\/a><\/p>","protected":false},"author":54,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[13,5,6],"tags":[],"class_list":["post-9514","post","type-post","status-publish","format-standard","hentry","category-business","category-developers","category-product-news"],"acf":[],"yoast_head":"<title>Translate scanned PDF documents with Document translation - Microsoft Translator Blog<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https://www.microsoft.com\/en-us\/translator/blog\/2022\/05\/25\/translate-scanned-pdf-documents-with-document-translation\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Translate scanned PDF documents with Document translation - Microsoft Translator Blog\" \/>\n<meta property=\"og:description\" content=\"Today, the\u202fDocument translation feature of Translator, a Microsoft Azure Cognitive Service,\u202fadds the ability to translate PDF documents containing scanned image content, eliminating the need for customers to preprocess them through an OCR engine before translation. 
Document translation was made generally available last year, May 25, 2021, allowing customers to translate entire documents and batches of documents into more than 110....\" \/>\n<meta property=\"og:url\" content=\"https://www.microsoft.com\/en-us\/translator/blog\/2022\/05\/25\/translate-scanned-pdf-documents-with-document-translation\/\" \/>\n<meta property=\"og:site_name\" content=\"Microsoft Translator Blog\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/microsofttranslator\" \/>\n<meta property=\"article:published_time\" content=\"2022-05-25T16:09:23+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2022-05-27T21:06:47+00:00\" \/>\n<meta property=\"og:image\" content=\"https://www.microsoft.com\/en-us\/translator/blog\/wp-content\/uploads\/sites\/13\/2022\/05\/ScannedPDF-DT-Hero.jpg\" \/>\n<meta name=\"author\" content=\"Microsoft Translator\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@mstranslator\" \/>\n<meta name=\"twitter:site\" content=\"@mstranslator\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Microsoft Translator\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"3 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/2022\/05\/25\/translate-scanned-pdf-documents-with-document-translation\/#article\",\"isPartOf\":{\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/2022\/05\/25\/translate-scanned-pdf-documents-with-document-translation\/\"},\"author\":{\"name\":\"Microsoft Translator\",\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/#\/schema\/person\/0a163e1bf796b3bb651085032849cf37\"},\"headline\":\"Translate scanned PDF documents with Document translation\",\"datePublished\":\"2022-05-25T16:09:23+00:00\",\"dateModified\":\"2022-05-27T21:06:47+00:00\",\"mainEntityOfPage\":{\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/2022\/05\/25\/translate-scanned-pdf-documents-with-document-translation\/\"},\"wordCount\":434,\"publisher\":{\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/#organization\"},\"articleSection\":[\"Business\",\"Developers\",\"Product News\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/2022\/05\/25\/translate-scanned-pdf-documents-with-document-translation\/\",\"url\":\"https://www.microsoft.com\/en-us\/translator/blog\/2022\/05\/25\/translate-scanned-pdf-documents-with-document-translation\/\",\"name\":\"Translate scanned PDF documents with Document translation - Microsoft Translator 
Blog\",\"isPartOf\":{\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/#website\"},\"datePublished\":\"2022-05-25T16:09:23+00:00\",\"dateModified\":\"2022-05-27T21:06:47+00:00\",\"breadcrumb\":{\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/2022\/05\/25\/translate-scanned-pdf-documents-with-document-translation\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https://www.microsoft.com\/en-us\/translator/blog\/2022\/05\/25\/translate-scanned-pdf-documents-with-document-translation\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/2022\/05\/25\/translate-scanned-pdf-documents-with-document-translation\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https://www.microsoft.com\/en-us\/translator/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Translate scanned PDF documents with Document translation\"}]},{\"@type\":\"WebSite\",\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/#website\",\"url\":\"https://www.microsoft.com\/en-us\/translator/blog\/\",\"name\":\"Microsoft Translator Blog\",\"description\":\"\",\"publisher\":{\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https://www.microsoft.com\/en-us\/translator/blog\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/#organization\",\"name\":\"Microsoft 
Corporation\",\"url\":\"https://www.microsoft.com\/en-us\/translator/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/#\/schema\/logo\/image\/\",\"url\":\"https://www.microsoft.com\/en-us\/translator/blog\/wp-content\/uploads\/sites\/13\/2021\/05\/microsoft_logo_element-300x300-1.png\",\"contentUrl\":\"https://www.microsoft.com\/en-us\/translator/blog\/wp-content\/uploads\/sites\/13\/2021\/05\/microsoft_logo_element-300x300-1.png\",\"width\":300,\"height\":300,\"caption\":\"Microsoft Corporation\"},\"image\":{\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/www.youtube.com\/playlist?list=PLD7HFcN7LXRd4kd2XgZjIbQ8TwTC32Zc9\",\"https:\/\/www.facebook.com\/microsofttranslator\",\"https:\/\/twitter.com\/mstranslator\"]},{\"@type\":\"Person\",\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/#\/schema\/person\/0a163e1bf796b3bb651085032849cf37\",\"name\":\"Microsoft Translator\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https://www.microsoft.com\/en-us\/translator/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/d22a72f3ca14b9d59f8bcdc837a51c6bf52b4a675c30ef18a9275753db5eda6c?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/d22a72f3ca14b9d59f8bcdc837a51c6bf52b4a675c30ef18a9275753db5eda6c?s=96&d=mm&r=g\",\"caption\":\"Microsoft Translator\"},\"url\":\"https://www.microsoft.com\/en-us\/translator/blog\/author\/mtteam\/\"}]}<\/script>","yoast_head_json":{"title":"Translate scanned PDF documents with Document translation - Microsoft Translator 
Blog","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https://www.microsoft.com\/en-us\/translator/blog\/2022\/05\/25\/translate-scanned-pdf-documents-with-document-translation\/","og_locale":"en_US","og_type":"article","og_title":"Translate scanned PDF documents with Document translation - Microsoft Translator Blog","og_description":"Today, the\u202fDocument translation feature of Translator, a Microsoft Azure Cognitive Service,\u202fadds the ability to translate PDF documents containing scanned image content, eliminating the need for customers to preprocess them through an OCR engine before translation. Document translation was made generally available last year, May 25, 2021, allowing customers to translate entire documents and batches of documents into more than 110....","og_url":"https://www.microsoft.com\/en-us\/translator/blog\/2022\/05\/25\/translate-scanned-pdf-documents-with-document-translation\/","og_site_name":"Microsoft Translator Blog","article_publisher":"https:\/\/www.facebook.com\/microsofttranslator","article_published_time":"2022-05-25T16:09:23+00:00","article_modified_time":"2022-05-27T21:06:47+00:00","og_image":[{"url":"https://www.microsoft.com\/en-us\/translator/blog\/wp-content\/uploads\/sites\/13\/2022\/05\/ScannedPDF-DT-Hero.jpg"}],"author":"Microsoft Translator","twitter_card":"summary_large_image","twitter_creator":"@mstranslator","twitter_site":"@mstranslator","twitter_misc":{"Written by":"Microsoft Translator","Est. 
reading time":"3 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https://www.microsoft.com\/en-us\/translator/blog\/2022\/05\/25\/translate-scanned-pdf-documents-with-document-translation\/#article","isPartOf":{"@id":"https://www.microsoft.com\/en-us\/translator/blog\/2022\/05\/25\/translate-scanned-pdf-documents-with-document-translation\/"},"author":{"name":"Microsoft Translator","@id":"https://www.microsoft.com\/en-us\/translator/blog\/#\/schema\/person\/0a163e1bf796b3bb651085032849cf37"},"headline":"Translate scanned PDF documents with Document translation","datePublished":"2022-05-25T16:09:23+00:00","dateModified":"2022-05-27T21:06:47+00:00","mainEntityOfPage":{"@id":"https://www.microsoft.com\/en-us\/translator/blog\/2022\/05\/25\/translate-scanned-pdf-documents-with-document-translation\/"},"wordCount":434,"publisher":{"@id":"https://www.microsoft.com\/en-us\/translator/blog\/#organization"},"articleSection":["Business","Developers","Product News"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https://www.microsoft.com\/en-us\/translator/blog\/2022\/05\/25\/translate-scanned-pdf-documents-with-document-translation\/","url":"https://www.microsoft.com\/en-us\/translator/blog\/2022\/05\/25\/translate-scanned-pdf-documents-with-document-translation\/","name":"Translate scanned PDF documents with Document translation - Microsoft Translator 
Blog","isPartOf":{"@id":"https://www.microsoft.com\/en-us\/translator/blog\/#website"},"datePublished":"2022-05-25T16:09:23+00:00","dateModified":"2022-05-27T21:06:47+00:00","breadcrumb":{"@id":"https://www.microsoft.com\/en-us\/translator/blog\/2022\/05\/25\/translate-scanned-pdf-documents-with-document-translation\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https://www.microsoft.com\/en-us\/translator/blog\/2022\/05\/25\/translate-scanned-pdf-documents-with-document-translation\/"]}]},{"@type":"BreadcrumbList","@id":"https://www.microsoft.com\/en-us\/translator/blog\/2022\/05\/25\/translate-scanned-pdf-documents-with-document-translation\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https://www.microsoft.com\/en-us\/translator/blog\/"},{"@type":"ListItem","position":2,"name":"Translate scanned PDF documents with Document translation"}]},{"@type":"WebSite","@id":"https://www.microsoft.com\/en-us\/translator/blog\/#website","url":"https://www.microsoft.com\/en-us\/translator/blog\/","name":"Microsoft Translator Blog","description":"","publisher":{"@id":"https://www.microsoft.com\/en-us\/translator/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https://www.microsoft.com\/en-us\/translator/blog\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https://www.microsoft.com\/en-us\/translator/blog\/#organization","name":"Microsoft 
Corporation","url":"https://www.microsoft.com\/en-us\/translator/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https://www.microsoft.com\/en-us\/translator/blog\/#\/schema\/logo\/image\/","url":"https://www.microsoft.com\/en-us\/translator/blog\/wp-content\/uploads\/sites\/13\/2021\/05\/microsoft_logo_element-300x300-1.png","contentUrl":"https://www.microsoft.com\/en-us\/translator/blog\/wp-content\/uploads\/sites\/13\/2021\/05\/microsoft_logo_element-300x300-1.png","width":300,"height":300,"caption":"Microsoft Corporation"},"image":{"@id":"https://www.microsoft.com\/en-us\/translator/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.youtube.com\/playlist?list=PLD7HFcN7LXRd4kd2XgZjIbQ8TwTC32Zc9","https:\/\/www.facebook.com\/microsofttranslator","https:\/\/twitter.com\/mstranslator"]},{"@type":"Person","@id":"https://www.microsoft.com\/en-us\/translator/blog\/#\/schema\/person\/0a163e1bf796b3bb651085032849cf37","name":"Microsoft Translator","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https://www.microsoft.com\/en-us\/translator/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/d22a72f3ca14b9d59f8bcdc837a51c6bf52b4a675c30ef18a9275753db5eda6c?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/d22a72f3ca14b9d59f8bcdc837a51c6bf52b4a675c30ef18a9275753db5eda6c?s=96&d=mm&r=g","caption":"Microsoft 
Translator"},"url":"https://www.microsoft.com\/en-us\/translator/blog\/author\/mtteam\/"}]}},"_links":{"self":[{"href":"https://www.microsoft.com\/en-us\/translator/blog\/wp-json\/wp\/v2\/posts\/9514","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https://www.microsoft.com\/en-us\/translator/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https://www.microsoft.com\/en-us\/translator/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https://www.microsoft.com\/en-us\/translator/blog\/wp-json\/wp\/v2\/users\/54"}],"replies":[{"embeddable":true,"href":"https://www.microsoft.com\/en-us\/translator/blog\/wp-json\/wp\/v2\/comments?post=9514"}],"version-history":[{"count":8,"href":"https://www.microsoft.com\/en-us\/translator/blog\/wp-json\/wp\/v2\/posts\/9514\/revisions"}],"predecessor-version":[{"id":9526,"href":"https://www.microsoft.com\/en-us\/translator/blog\/wp-json\/wp\/v2\/posts\/9514\/revisions\/9526"}],"wp:attachment":[{"href":"https://www.microsoft.com\/en-us\/translator/blog\/wp-json\/wp\/v2\/media?parent=9514"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https://www.microsoft.com\/en-us\/translator/blog\/wp-json\/wp\/v2\/categories?post=9514"},{"taxonomy":"post_tag","embeddable":true,"href":"https://www.microsoft.com\/en-us\/translator/blog\/wp-json\/wp\/v2\/tags?post=9514"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}