{"id":649749,"date":"2020-05-19T08:01:11","date_gmt":"2020-05-19T15:01:11","guid":{"rendered":"https:\/\/www.microsoft.com\/en-us\/research\/?post_type=msr-project&p=649749"},"modified":"2024-09-09T08:40:22","modified_gmt":"2024-09-09T15:40:22","slug":"ai-at-scale","status":"publish","type":"msr-project","link":"https:\/\/www.microsoft.com\/en-us\/research\/project\/ai-at-scale\/","title":{"rendered":"AI at Scale"},"content":{"rendered":"
\n\t
\n\t\t
\n\t\t\t\"Project\t\t<\/div>\n\t\t\n\t\t
\n\t\t\t\n\t\t\t
\n\t\t\t\t\n\t\t\t\t
\n\t\t\t\t\t\n\t\t\t\t\t
\n\t\t\t\t\t\t
\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\n\n

AI at Scale<\/h1>\n\n\n\n

Models, infrastructure and hardware for next-generation AI applications<\/p>\n\n\t\t\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t<\/div>\n\t\t<\/div>\n\t<\/div>\n<\/section>\n\n\n\n\n\n

Why does AI at scale matter?<\/h2>\n\n\n\n

Microsoft\u2019s AI at Scale initiative is pioneering next-generation AI capabilities that are scaled across the company\u2019s products and AI platforms. Building on years of systems work by Microsoft researchers, particularly in the area of parallel computation<\/strong>, AI at Scale makes it possible to quickly train machine learning models at an unprecedented scale<\/strong>. This includes developing a new class of large, centralized AI models<\/strong> that can be scaled and specialized across product domains, as well as creating the state-of-the-art hardware and infrastructure<\/strong> needed to power this new class of models.<\/p>\n\n\n\n

\n
\n

ONNX Integration<\/h4>\n\n\n\n

AI at Scale capabilities, including DeepSpeed, have been integrated into the ONNX (Open Neural Network Exchange) runtime to add framework-agnostic, hardware-agnostic distributed training support for machine learning models.<\/p>\n\n\n\n

Get the ONNX code > (opens in new tab)<\/span><\/a><\/p>\n\n\n\n

Explore training examples > (opens in new tab)<\/span><\/a><\/p>\n<\/div>\n\n\n\n

\n

Project Parasail<\/h4>\n\n\n\n

Pioneering a novel approach to parallelizing a large class of seemingly sequential applications, particularly stochastic gradient descent.<\/p>\n\n\n\n

More on Project Parasail ><\/a><\/p>\n<\/div>\n\n\n\n
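Project Parasail's dependency-breaking techniques go well beyond this, but the basic payoff of parallelizing SGD can be sketched in plain Python: simulated workers each compute a gradient on their own data shard, and the shard gradients are averaged before every update. Everything here (`grad_linear`, `parallel_sgd`, the synthetic dataset) is an illustrative assumption, not Parasail code.

```python
# A minimal sketch of data-parallel SGD on a toy linear-regression task.
# Each "worker" owns a disjoint shard; per-shard gradients are computed
# independently (they could run in parallel) and then averaged.

def grad_linear(w, b, shard):
    """Mean gradient of squared error for y ~ w*x + b over one shard."""
    gw = gb = 0.0
    for x, y in shard:
        err = (w * x + b) - y
        gw += 2 * err * x
        gb += 2 * err
    n = len(shard)
    return gw / n, gb / n

def parallel_sgd(data, n_workers=4, lr=0.01, steps=2000):
    w, b = 0.0, 0.0
    # Round-robin split into equal-sized shards, one per worker.
    shards = [data[i::n_workers] for i in range(n_workers)]
    for _ in range(steps):
        # The shard gradients are independent of one another,
        # so this list comprehension is the parallelizable part.
        grads = [grad_linear(w, b, s) for s in shards]
        gw = sum(g[0] for g in grads) / n_workers
        gb = sum(g[1] for g in grads) / n_workers
        w -= lr * gw
        b -= lr * gb
    return w, b

# Synthetic data on the line y = 3x + 1.
data = [(x / 10, 3 * (x / 10) + 1) for x in range(100)]
w, b = parallel_sgd(data)
```

Because the shards are equal-sized, averaging the per-shard mean gradients recovers the exact full-batch gradient, so this converges to w = 3, b = 1.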

\n

Project Fiddle<\/h4>\n\n\n\n

Pipeline parallelism is a novel approach to model training that overcomes both the higher communication costs of data parallelism and the hardware resource inefficiency of model parallelism.<\/p>\n\n\n\n

More on Project Fiddle ><\/a><\/p>\n\n\n\n

Read the blog ><\/a><\/p>\n<\/div>\n<\/div>\n\n\n\n
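The trade-off pipeline parallelism exploits can be sketched with a simple tick-count model, assuming equal-cost stages and ignoring communication; `sequential_ticks` and `pipelined_ticks` are illustrative names, not Project Fiddle APIs.

```python
# With S stages and M microbatches, a pipelined schedule finishes the
# forward pass in S + M - 1 ticks, while running one stage at a time
# (plain model parallelism) takes S * M ticks.

def sequential_ticks(stages, microbatches):
    # Plain model parallelism: only one stage is ever busy.
    return stages * microbatches

def pipelined_ticks(stages, microbatches):
    # GPipe-style schedule: microbatch m enters stage s at tick s + m,
    # so the last microbatch leaves the last stage at (S-1)+(M-1)+1.
    last_tick = 0
    for m in range(microbatches):
        for s in range(stages):
            last_tick = max(last_tick, s + m + 1)
    return last_tick

# 4 stages, 8 microbatches: 32 ticks sequentially vs 11 pipelined.
```

The gap grows with the number of microbatches: the pipeline "bubble" of S - 1 idle ticks is amortized as M increases.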

\n\t\n\t
\n\t\t
\n\t\t\t
\"DeepSpeed<\/figure>
\n

DeepSpeed for large model training<\/h3>\n\n\n\n

DeepSpeed is an open-source, PyTorch-compatible library that vastly improves large model training in scale, speed, cost, and usability\u2014unlocking the ability to train models with over 100 billion parameters and enabling breakthroughs in areas such as natural language processing (NLP) and multi-modality (combining language with other types of data, such as images, video, and speech).<\/p>\n\n\n\n

Learn more about the latest DeepSpeed updates ><\/a><\/p>\n\n\n\n

\n
Download DeepSpeed<\/a><\/div>\n<\/div>\n<\/div><\/div>\t\t<\/div>\n\t<\/div>\n\n\t<\/div>\n\n\n\n
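One idea behind training at this scale, loosely inspired by DeepSpeed's ZeRO optimizer, is that optimizer state need not be replicated on every data-parallel rank. The sketch below shows only the partitioning arithmetic; the function names are illustrative, not the DeepSpeed API.

```python
# Partition N parameters into contiguous, nearly equal shards across
# data-parallel ranks: each rank keeps optimizer state (e.g. momentum)
# only for its own shard instead of a full replica.

def shard_bounds(n_params, n_ranks, rank):
    # The first `rem` ranks get one extra element so shards cover
    # all parameters exactly once.
    base, rem = divmod(n_params, n_ranks)
    start = rank * base + min(rank, rem)
    return start, start + base + (1 if rank < rem else 0)

def per_rank_state_size(n_params, n_ranks, rank):
    start, end = shard_bounds(n_params, n_ranks, rank)
    return end - start
```

With 100 billion parameters over 64 ranks, each rank would keep momentum entries for roughly 1.56 billion parameters instead of all 100 billion, and the same sharding applies to each additional per-parameter state tensor the optimizer holds.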
\n
\n

Advances in natural language processing<\/h2>\n<\/div>\n\n\n\n
\n

The Turing Natural Language Generation (T-NLG) model is a 17-billion-parameter language model that outperforms the state of the art on many downstream NLP tasks. In particular, it can enhance the Microsoft Office experience through writing assistance and answering reader questions, paving the way for more fluent digital assistants.<\/p>\n\n\n\n

\n
\n
\n\t
\n\t\t
\n\t\t\t\t\t\tBlog<\/span>\n\t\t\tTuring-NLG: A 17-billion-parameter language model by Microsoft<\/span> <\/span><\/a>\t\t\t\t\t\t\t

February 2020<\/p>\n\t\t\t\t\t<\/div>\n\t<\/article>\n<\/div>\n<\/div>\n\n\n\n

\n
\n\t
\n\t\t
\n\t\t\t\t\t\tBlog<\/span>\n\t\t\tIntroducing the next wave of AI at Scale innovations in Bing<\/span> <\/span><\/a>\t\t\t\t\t\t\t

September 2020<\/p>\n\t\t\t\t\t<\/div>\n\t<\/article>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n

On the multi-modality language-image front, we\u2019ve significantly outperformed the state-of-the-art on downstream language-image tasks (e.g. visual search) with Oscar (Object-Semantics Aligned Pre-training<\/a>).<\/p>\n\n\n\n

Recently, pre-trained models such as Unicoder<\/a>, M-BERT (opens in new tab)<\/span><\/a>, and XLM (opens in new tab)<\/span><\/a> have been developed to learn multilingual representations for cross-lingual and multilingual tasks. By performing masked language modeling, translation language modeling, and other bilingual pre-training tasks on multilingual and bilingual corpora with shared vocabularies and weights across languages, these models achieve surprisingly good cross-lingual capability. However, the community still lacks benchmark datasets to evaluate such capability. To help researchers further advance language-agnostic models and make AI systems more inclusive, the XGLUE<\/a> dataset lets researchers test a language model\u2019s zero-shot cross-lingual transfer capability \u2013 its ability to transfer what it learned in English to the same task in other languages.<\/p>\n\n\n\n
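The masked language model objective mentioned above can be sketched in a few lines: a fraction of tokens is replaced by a `[MASK]` symbol, and the model is trained to recover the originals at those positions. This toy masker omits refinements such as BERT's 80/10/10 replacement rule.

```python
# Randomly mask tokens and record the targets the model must predict.
import random

def mask_tokens(tokens, mask_rate=0.15, seed=1):
    rng = random.Random(seed)  # fixed seed for a reproducible demo
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            targets[i] = tok          # label the model must recover
            masked.append("[MASK]")
        else:
            masked.append(tok)
    return masked, targets

sent = "the multilingual model shares one vocabulary across languages".split()
masked, targets = mask_tokens(sent, mask_rate=0.3)
```

The same masking procedure applies unchanged to any language that shares the model's vocabulary, which is what makes the multilingual pre-training described above possible.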

We are incorporating these breakthroughs into the company\u2019s products, including Bing, Office, Dynamics, and Xbox. Read this blog post (opens in new tab)<\/span><\/a> to learn more.<\/p>\n\n\n\n

\n
Download XGLUE dataset<\/a><\/div>\n<\/div>\n\n\n\n
<\/div>\n\n\n\n

New hardware for deep learning<\/h2>\n\n\n\n
\"Azure<\/figure>\n\n\n\n

Project Brainwave\u2019s<\/a> hardware boasts a soft Neural Processing Unit (NPU), based on a high-performance field-programmable gate array (FPGA), which accelerates deep neural network (DNN) inferencing, making it ideal for applications in computer vision and natural language processing. This approach is transforming computing by augmenting CPUs with an interconnected and configurable compute layer composed of programmable silicon.<\/p>\n\n\n\n

With a high-performance, precision-adaptable FPGA soft processor, Microsoft datacenters can serve pre-trained DNN models with high efficiencies at low batch sizes.<\/p>\n\n\n\n
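"Precision-adaptable" refers to the narrow numeric formats a soft processor can implement in configurable logic. As a minimal illustration of why narrow precision is viable for inference, here is plain signed 8-bit fixed-point quantization (not Brainwave's actual narrow floating-point formats):

```python
# Quantize weights to signed int8 with a single scale factor, then
# dequantize and check the round-trip error stays below half a step.

def quantize_int8(values):
    # One scale for the whole tensor, chosen so the largest magnitude
    # maps to 127; fall back to 1.0 for an all-zero input.
    scale = max(abs(v) for v in values) / 127 or 1.0
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.42, -1.3, 0.07, 0.9, -0.55]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
```

Each quantized value is within half a quantization step (scale / 2) of the original, and an FPGA can pack far more 8-bit multipliers into the same silicon than 32-bit ones, which is where the efficiency at low batch sizes comes from.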

By exploiting FPGAs on a datacenter-scale compute fabric, a single DNN model can be deployed as a scalable hardware microservice that leverages multiple FPGAs to create web-scale services capable of processing massive amounts of dynamic data.<\/p>\n\n\n\n

\n
\n
\n\t
\n\t\t
\n\t\t\t\t\t\tArticle<\/span>\n\t\t\tDeploy ML models to field-programmable gate arrays (FPGAs) with Azure Machine Learning<\/span> <\/span><\/a>\t\t\t\t\t<\/div>\n\t<\/article>\n<\/div>\n<\/div>\n\n\n\n
\n
\n\t
\n\t\t
\n\t\t\t\t\t\tArticle<\/span>\n\t\t\tWhat is Azure Stack Edge Pro FPGA?<\/span> <\/span><\/a>\t\t\t\t\t<\/div>\n\t<\/article>\n<\/div>\n<\/div>\n<\/div>\n\n\n\n

<\/p>\n\n\n\n

Spell correction at scale<\/h2>\n\n\n\n

Customers around the world use Microsoft products in over 100 languages, yet most of those languages lack high-quality spell correction. This limits customers\u2019 ability to search for information on the web and in the enterprise\u2014and even to author content. With AI at Scale, we used deep learning together with language families to solve this problem, building what we believe is the spelling correction system with the broadest language coverage and highest accuracy to date.<\/p>\n\n\n\n
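Speller100 itself is a deep, zero-shot model, but the classic baseline it improves on can be sketched Norvig-style: generate every string one edit away and keep the candidate with the highest corpus frequency. The tiny `VOCAB` here is an illustrative stand-in for real frequency data.

```python
# Edit-distance-1 spelling correction over a toy frequency table.
import string

VOCAB = {"language": 9, "luggage": 4, "spelling": 7, "correction": 6}

def edits1(word):
    """All strings one insert, delete, replace, or transpose away."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    return set(deletes + replaces + inserts + transposes)

def correct(word):
    if word in VOCAB:
        return word
    candidates = [w for w in edits1(word) if w in VOCAB]
    return max(candidates, key=VOCAB.get) if candidates else word

# correct("langauge") -> "language" (one transposition away)
```

The limitation this approach runs into, and the one the paragraph above describes, is that it needs a high-quality frequency table per language, which is exactly what most of the 100-plus languages lack; a learned model that transfers across a language family sidesteps that requirement.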

\n
\n
\n
\n\t
\n\t\t
\n\t\t\t\t\t\tBlog<\/span>\n\t\t\tSpeller100: Zero-shot spelling correction at scale for 100-plus languages<\/span> <\/span><\/a>\t\t\t\t\t\t\t

February 2021<\/p>\n\t\t\t\t\t<\/div>\n\t<\/article>\n<\/div>\n<\/div>\n\n\n\n

<\/div>\n<\/div>\n<\/div>\n\n\n\n
\n\t\n\t
\n\t\t
\n\t\t\t
\"a<\/figure>
\n

Learn more<\/h2>\n\n\n\n