Research Focus: Week of August 12, 2024
In this issue: Research Forum Ep. 4 explores multimodal AI (registration is now open); Surveying developers’ AI needs; SuperBench improves cloud AI infrastructure reliability; Virtual Voices: exploring factors influencing participation in virtual meetings.
Awards | USENIX ATC 2024
Best Paper Award at USENIX ATC 2024
Our paper titled "SuperBench: Improving Cloud AI Infrastructure Reliability with Proactive Validation" received the Best Paper Award at the 2024 USENIX Annual Technical Conference (USENIX ATC '24).
Tutel: An efficient mixture-of-experts implementation for large DNN model training
By Wei Cui, Yifan Xiong, Peng Cheng, and Rafael Salas
Mixture of experts (MoE) is a deep learning model architecture in which computational cost grows sublinearly with the number of parameters, making scaling easier. Today, MoE is the only approach demonstrated to scale deep learning models to trillion-plus parameters, paving…
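To illustrate why MoE compute is sublinear in parameter count, here is a minimal sketch of top-k gated routing in NumPy. This is not Tutel's implementation; the gating matrix, expert weights, and dimensions below are illustrative assumptions. The key point is that each token activates only `top_k` of the `num_experts` expert networks, so compute per token scales with `top_k` while total parameters scale with `num_experts`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes for illustration only.
d_model, num_experts, top_k = 8, 4, 2

# Gating matrix and one weight matrix per expert (toy parameters).
gate_w = rng.standard_normal((d_model, num_experts))
expert_w = rng.standard_normal((num_experts, d_model, d_model))

def moe_forward(x):
    """Route each token to its top-k experts and mix their outputs.

    Only top_k of num_experts experts run per token, so FLOPs per token
    grow with top_k even as num_experts (and parameters) grow.
    """
    logits = x @ gate_w                              # (tokens, num_experts)
    top = np.argsort(logits, axis=-1)[:, -top_k:]    # top-k expert indices
    sel = np.take_along_axis(logits, top, axis=-1)   # their gate logits
    # Softmax over only the selected experts' logits.
    probs = np.exp(sel - sel.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    out = np.zeros_like(x)
    for t in range(x.shape[0]):                      # per-token dispatch
        for j in range(top_k):
            e = top[t, j]
            out[t] += probs[t, j] * (x[t] @ expert_w[e])
    return out

tokens = rng.standard_normal((5, d_model))
mixed = moe_forward(tokens)                          # shape (5, 8)
```

Doubling `num_experts` here doubles the parameter count but leaves the per-token work unchanged, which is the sublinear scaling the paragraph above describes; efficient implementations such as Tutel replace the per-token loop with batched all-to-all dispatch across devices.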