About
I am the Principal Research SDE Manager at Microsoft Research Asia, leading the Systems and Engineering Group in Shanghai. My work focuses on advancing large-scale model systems, multimodal systems, and intelligent agents. I specialize in efficient computation techniques, long-context inference, and real-world applications of large language models (LLMs). My research bridges cutting-edge AI innovations with practical applications, with publications in top-tier conferences such as OSDI, SOSP, NeurIPS, EuroSys, ATC, CVPR, and ICCV. I hold B.S. and Ph.D. degrees from Fudan University, earned in 2006 and 2011, respectively.
We are currently working on:
- Resource scheduling and compiler optimization to accelerate large-scale, sparse, and dynamic DNN models, e.g., a topology-aware GPU scheduler and a sparsity compilation stack
- Hardware efficiency studies (e.g., latency, energy, and carbon footprint) of diverse DNN models, and prediction-based automatic design of efficient models
- Real-time DNN models and systems for cloud gaming and video streaming
- Wireless sensing for healthcare, environments, and human-computer interaction
We have various open positions (Researcher, Research SDE, and Intern), and you are welcome to join us. Please contact me at yuqyang@microsoft.com.