Microsoft Research blog
Research Focus: Week of October 28, 2024
New Research | FLASH: Workflow automation agent for diagnosing recurring incidents; METAREFLECTION: Learning instructions for language agents using past reflections; Boosting LLM training efficiency through faster communication between GPUs; and more.
Research at Microsoft 2023: A year of groundbreaking AI advances and discoveries
AI saw unparalleled growth in 2023, reaching millions of users daily. This progress owes much to the extensive work of Microsoft researchers and collaborators. In this review, learn about the advances made in 2023, which set the stage for further progress in 2024.
Announcing the DeepSpeed4Science Initiative: Enabling large-scale scientific discovery through sophisticated AI system technologies
By Shuaiwen Leon Song, Bonnie Kruft, Minjia Zhang, Conglong Li, Martin Cai, and Yuxiong He
Editor’s note, Sept. 28, 2023 – The founding collaborators list was updated to correct omissions and the scientific foundation model graph was updated to correct information. In the next decade, deep learning may revolutionize the natural sciences, enhancing our capacity to…
DeepSpeed ZeRO++: A leap in speed for LLM and chat model training with 4X less communication
By DeepSpeed Team and Andrey Proskurin
Large AI models are transforming the digital world. Generative language models like Turing-NLG, ChatGPT, and GPT-4, powered by large language models (LLMs), are incredibly versatile, capable of performing tasks like summarization, coding, and translation. Similarly, large multimodal generative models like…
Research Focus: Week of November 7, 2022
Welcome to Research Focus, a new series of blog posts that highlights notable publications, events, code/datasets, new hires and other milestones from across the research community at Microsoft. Barun Patra, Saksham Singhal, Shaohan Huang, Zewen Chi, Li Dong, Furu Wei,…
DeepSpeed Compression: A composable library for extreme compression and zero-cost quantization
By DeepSpeed Team and Andrey Proskurin
Large-scale models are revolutionizing deep learning and AI research, driving major improvements in language understanding, creative text generation, multilingual translation, and more. But despite their remarkable capabilities, the models’ large size creates latency and cost constraints that hinder the…
DeepSpeed: Advancing MoE inference and training to power next-generation AI scale
By DeepSpeed Team and Andrey Proskurin
In the last three years, the largest trained dense models have increased in size by over 1,000 times, from a few hundred million parameters to over 500 billion parameters in Megatron-Turing NLG 530B (MT-NLG). Improvements in model quality with size…
Research at Microsoft 2021: Collaborating for real-world change
Over the past 30 years, Microsoft Research has undergone a shift in how it approaches innovation, broadening its mission to include not only advancing the state of computing but also using technology to tackle some of the world’s most pressing…
Tutel: An efficient mixture-of-experts implementation for large DNN model training
By Wei Cui, Yifan Xiong, Peng Cheng, and Rafael Salas
Mixture of experts (MoE) is a deep learning model architecture in which computational cost is sublinear in the number of parameters, making scaling easier. Today, MoE is the only approach demonstrated to scale deep learning models to trillion-plus parameters, paving…
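To make the sublinear-cost idea concrete, here is a minimal top-k mixture-of-experts layer in PyTorch. This is an illustrative sketch only, not Tutel’s API: the class name SimpleMoE and the parameters d_model, d_hidden, num_experts, and top_k are hypothetical. It shows the basic mechanism by which per-token compute stays roughly constant (each token is routed to only k experts) while the total parameter count grows with the number of experts.

```python
# Illustrative top-k MoE layer (not Tutel's API): a router picks k of E
# experts per token, so compute per token is fixed while parameters scale with E.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimpleMoE(nn.Module):
    def __init__(self, d_model: int, d_hidden: int, num_experts: int, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, num_experts)  # router producing expert scores
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.ReLU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(num_experts)
        ])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model)
        scores = F.softmax(self.gate(x), dim=-1)              # (tokens, num_experts)
        topk_scores, topk_idx = scores.topk(self.top_k, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            idx = topk_idx[:, slot]                           # chosen expert per token
            weight = topk_scores[:, slot].unsqueeze(-1)       # gating weight per token
            for e, expert in enumerate(self.experts):
                mask = idx == e
                if mask.any():
                    out[mask] += weight[mask] * expert(x[mask])
        return out

# Usage: 8 experts, but each token activates only 2 of them.
layer = SimpleMoE(d_model=64, d_hidden=256, num_experts=8, top_k=2)
tokens = torch.randn(16, 64)
print(layer(tokens).shape)  # torch.Size([16, 64])
```

In this sketch, doubling num_experts roughly doubles the parameter count but leaves per-token FLOPs essentially unchanged, which is the sense in which MoE compute scales sublinearly with model size.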