Recent innovations in generative large language models (LLMs) have made their applications and use cases ubiquitous. This has led to large-scale deployments of these models on complex, expensive, and power-hungry AI accelerators, most commonly GPUs. These developments make the efficiency of LLM training and inference an important challenge.
In the Azure Research – Systems group, we are working to improve Azure infrastructure across hardware, power, and serving.
Check the publications tab for our work so far!