Microsoft Research Asia StarTrack Scholars

Region: Global

1. Societal AI

With the rise of large-scale AI models, such as Large Language Models (LLMs), we are witnessing a transformation in how these technologies are integrated into various aspects of our society. These models stand out for two key reasons: 1) General-purpose functionality: LLMs can perform a wide range of tasks, from translation and question answering to code completion and more; 2) Human-like competence: They have demonstrated the ability to perform many tasks at a level comparable to that of humans, making them accessible and versatile tools for various domains.

While these powerful models offer significant societal benefits, they also introduce unforeseen challenges. These challenges arise not only from technical complexities but also from the broader social implications of widespread AI adoption. As Brad Smith aptly noted, “The more powerful the tool, the greater the benefit or damage it can cause.”

To ensure that AI’s integration into society is harmonious, synergistic, and resilient—minimizing any potential side effects—it is critical to foster Societal AI research. This emerging field prioritizes a multi-disciplinary approach, bringing together computer science and social science to address the complex dynamics of AI’s role in shaping our world.

We invite researchers from both fields to join us in this exciting endeavor. Together, we can explore innovative solutions that ensure the responsible and equitable advancement of AI technologies.

- AI’s impact on human cognition, learning, and creativity.
- AI’s role in reshaping work and global business models.
- Designing fair and inclusive AI systems.
- Ensuring AI safety, reliability, and control.
- Aligning AI with human values and ethics.
- Optimizing human-AI collaboration.
- Evaluating AI in unforeseen tasks and environments.
- Enhancing AI transparency and interpretability.
- AI’s transformation of social science research.
- Evolving regulatory frameworks for AI governance.

Xing Xie (Engaging Lead)
PARTNER RESEARCH MANAGER

We warmly invite researchers from both computer science and social sciences to join us in exploring the exciting frontier of societal AI. As we stand at the intersection of technological innovation and social impact, this collaboration provides a unique opportunity to bridge diverse fields of expertise. We are confident that this interdisciplinary approach will not only advance our knowledge of intelligence but also help shape AI technologies that are aligned with human values and needs. Ultimately, this partnership holds great promise for generating long-term benefits for human society, ensuring that AI serves as a force for positive social change.

Fangzhao Wu
PRINCIPAL RESEARCHER

We will work together with the visiting scholars on two research projects. The first one is improving the safety, privacy, and reliability of LLMs, and the second one is studying the impact of LLMs on humans and society, as well as how to maximize the positive impact and minimize the negative impact.

Xiaoyuan Yi
SENIOR RESEARCHER

In this collaboration, we aim to integrate insights from social sciences, psychology, and related fields, adopting an interdisciplinary perspective to envision the ideal form of the next-generation AI. Our goal is to develop methods for evaluating, analyzing, and understanding the intrinsic driving factors of large models, such as their values. Additionally, we seek to predict and align the behavior of these models based on traits such as their values, maximizing the societal well-being that AI technology can offer in a future where humans and machines coexist symbiotically.

Jianxun Lian
SENIOR RESEARCHER

Through this collaboration, we aim to explore the synergy between AI and social science. For instance, how can we leverage social science to evaluate, interpret, and shape AI behaviors? Conversely, how can a well-aligned AI transform the field of social science research? To answer these questions, we need to gain a deeper understanding of AI behaviors from an interdisciplinary perspective, particularly of those traits that reflect human-likeness, such as emotions, motives, preferences, beliefs, social intelligence, and theory of mind.

2. Media Foundation

We envision a future where multimedia and advanced AI are seamlessly integrated into our daily life, building a hybrid world, and significantly enhancing humans’ capabilities and experiences. Media Foundation is a cutting-edge research theme that aims to investigate the foundations of multimedia and advanced AI from multiple disciplines while ensuring responsible AI practices. Humans are amazing learners who can acquire a wide range of skills and knowledge from multiple modalities, such as vision, hearing, touch, and language. Advanced AI, like humans, also needs to learn from the real world. But how can we bridge the gap between the complex and noisy environment and the abstract and semantic representations of advanced AI? Media Foundation can transform the physical and virtual worlds into media tokens, which can be manipulated and communicated, thereby enhancing the learning and creativity of advanced AI and enabling novel multimodal applications. This research theme aims to explore the synergy between multimedia and advanced AI, covering topics such as neural codec, immersive telepresence, multimedia understanding, AIGC for user interfaces and HCI, 3D and computer vision, audio and speech, and deep learning fundamentals for multimedia.

For more detailed information, please refer to the article Media Foundation.

- Neural codec
- Immersive telepresence
- Multimedia understanding
- AIGC for user interfaces and HCI
- 3D and computer vision
- Audio and speech

Yan Lu (Engaging Lead)
PARTNER RESEARCH MANAGER

We welcome applicants for media foundation research, which aims to build an intelligent environment through world models to benefit humanity in the future. We seek candidates with a strong background in AI and multimedia, or related fields, including individuals with experience in AI foundations, neural codecs, computer vision, audio and speech, HCI, and other related domains. We also expect participants to be passionate about exploring the paradigm shifts in research and eager to contribute to the development of advanced AI systems that can enhance human learning and creativity.

Xiulian Peng
PRINCIPAL RESEARCH MANAGER

We welcome applications from individuals with expertise in audio/speech, multi-modal learning, AI, and related disciplines, especially audio/speech technologies. We anticipate that candidates will have a profound interest in exploring transformative developments within the realm of audio/speech and multimedia research and be enthusiastic about contributing to the advancement of cutting-edge AI systems to elevate humans’ capabilities and quality of life.

Shujie Liu
PRINCIPAL RESEARCHER

We welcome researchers with expertise in Artificial Intelligence Generated Content (AIGC), Multi-modal Large Language Models, Speech and Video Generation, and Human-Computer Interaction (HCI) to join our cutting-edge research programs. Our team is committed to pushing the boundaries of AI in creative content generation, enhancing how machines understand and produce human-like interactions across various modalities. We are particularly interested in scholars who seek to explore advanced AI techniques for multimodal fusion, adaptive interfaces, and human-centric AI applications. By fostering interdisciplinary collaborations, we aim to create groundbreaking technologies that redefine how humans interact with intelligent systems.

Bin Li
PRINCIPAL RESEARCHER

We welcome candidates with expertise in multimedia, AI, and related fields, such as neural codecs, computer vision, and deep learning. We value individuals who are passionate about investigating paradigm shifts in video research and about contributing to the development of advanced AI systems that enhance human learning and creativity. We expect applicants to have a deep understanding of the field and to be committed to pushing the boundaries of AI and multimedia research.

3. Embodied AI and Large Action Models

With the rapid advancements in AI and robotics, the development of highly intelligent robots capable of seamlessly interacting with the physical environment is becoming increasingly achievable. As the next AI/AGI wave, embodied AI innovations promise to revolutionize various industries and significantly impact human life.

Although promising progress has been made, generalist robots and embodied AGI are still in their infancy. Embodied AI foundation models differ significantly from existing LLMs and VLMs. The ability to produce dense robot actions precisely and efficiently for physical interaction with general objects in the 3D spatial world is far beyond the capability of existing AI foundation models.

Our research aims to build a new generation of foundational embodied AI models with enhanced 3D spatial and physical proficiencies in perception, reasoning, and action. We explore vision-language-action model architectures, Internet-scale action data construction approaches, multimodal-sensory intelligence, and fundamental 3D computer vision techniques, among others. We are dedicated to making technical breakthroughs and creating new business opportunities for the company.

For more detailed information, please refer to the article Redefining robot intelligence.

- VLA model architecture design
- Large-scale action dataset construction from unlabeled videos
- 3D human-object-environment reconstruction and understanding
- Dexterous hand manipulation RL
- Neural robot simulator
- Multimodal-sensory intelligence

Baining Guo (Engaging Lead)
DISTINGUISHED SCIENTIST

We invite researchers with a deep enthusiasm for Embodied AI to become part of our team. Embodied AI represents the next breakthrough in developing intelligent systems capable of perceiving, understanding, and interacting with the world in a more human-like way. If AI can talk and see, why not give it a body? If you have a strong research foundation in areas like foundation models, 3D computer vision, and robotics, we encourage you to join us. Together, let’s advance this new frontier in AI.

Jiaolong Yang (Engaging Lead)
PRINCIPAL RESEARCH MANAGER

I’m eager to welcome professionals with a solid background in 3D computer vision, robotics, language models, reinforcement learning, video generation, and related fields. I expect our visiting researchers to have a strong passion for Embodied AI and to be prepared to collaborate closely with our team during their visit. I also hope to build long-term partnerships with our visiting scholars, working together to produce influential contributions in the forthcoming surge of advancements in Embodied AI.

Lily Sun (Engaging Lead)
DIRECTOR of MSR Accelerator–Region & China

Embodied AI has recently emerged as one of the most promising fields, with immense potential for disruptive breakthroughs that could transform various industries and daily life. While significant progress has been made, many challenges remain in developing general foundation models for embodied AI that can empower diverse robots to solve problems across industrial and everyday scenarios. We warmly invite StarTrack scholars from around the world to join us in advancing the frontiers of this dynamic field and achieving impactful real-world outcomes!

Yaobo Liang
SENIOR RESEARCHER

We welcome researchers with a strong passion for Embodied AI to join our team. Embodied AI is the next frontier in creating intelligent systems that can perceive, understand, and interact with the real world in a more human-like manner. When AI can talk and see, let’s give it a body! If you have a solid research background in foundation models, 3D computer vision, and robotics, we invite you to join us. Let’s work together to push this new AI frontier.

Yu Deng
SENIOR RESEARCHER

I look forward to collaborating closely with visiting researchers on topics related to foundation models in Embodied AI, as well as 3D/4D reconstruction and generation. I hope participants will have a solid background in 3D computer vision and graphics, along with a passion for tackling cutting-edge challenges in these fields. Let’s work together to advance research frontiers and make a big impact.

Fangyun Wei
SENIOR RESEARCHER

I am eager to collaborate closely with visiting researchers on advancing embodied AI, with a particular focus on machine perception, reinforcement learning, and dexterous robotic grasping. Additionally, I look forward to exploring cutting-edge developments in vision-language-action foundation models, teleoperation, neural simulation environments, and the integration of intelligent robotic agents.

4. Next generation systems in the AI era

With the increasingly critical role of AI, the very foundation of computing is being redefined around AI. This transformation demands innovations across multiple dimensions of AI and computing systems. This research theme calls for research proposals addressing the fundamental challenges in computer systems in the era of AI. The topics include but are not limited to: (1) disruptive methods to significantly lower the bar of developing secure and reliable systems; (2) next-generation software/hardware architecture for future AI; (3) next-generation storage/database systems for AI; (4) development and debugging tools for next-generation intelligent systems; and (5) approaches for continual learning that enable systems to retain and adapt knowledge over time without extensive retraining. We particularly welcome unconventional, multidisciplinary approaches to tackling the above challenges.

For more detailed information, please refer to the article Next-generation systems in the AI era.

- System verification and security with formal methods
- Neural-symbolic reasoning for math and code, e.g., LLM for math/code
- DNN compiler
- Agent system
- Algorithm-system codesign with long-context LLMs and reinforcement learning
- Continual learning and lifelong learning
- Software-hardware codesign for AI
- Vector storage system

Fan Yang (Engaging Lead)
SR PRINCIPAL RESEARCH MANAGER

In the past decade, systems have played a critical role in fostering the unforeseen development of AI. We welcome young systems researchers to join us in exploring the fundamental system principles in the era of AI.

Mike Chieh-Jan Liang
PRINCIPAL RESEARCHER

We are interested in using LLM agents to continually evolve system design. System design goes beyond code completion, as it requires LLM agents to (1) generate new ideas and (2) reason about and refine potential ideas. Our current prototype mimics how the right brain and left brain work. In addition to designing traditional systems, I also welcome collaborations to tackle AI infrastructure design problems.

Ting Cao
PRINCIPAL RESEARCH MANAGER

We are at a pivotal moment where the LLM-centered economy is rapidly emerging, calling for a rethinking of hardware, systems, and algorithms to bring this vision into reality. Our team is focused on advancing these technologies to overcome the critical bottlenecks of LLM serving, such as the scaling of computing power, memory capacity and model capability—especially for ubiquitous user devices. We invite researchers to join us in pushing the boundaries of what’s possible.

Xian Zhang
SENIOR RESEARCHER

We are working on neural-symbolic math reasoning. We welcome researchers passionate about advancing AI math reasoning toward expert level through any promising approach. Relevant expertise includes areas of mathematics with ties to AI and machine learning, including computer-assisted mathematics such as the use of proof assistants (e.g., Lean, Isabelle, Coq), as well as specialization in one or more of high-performance computing, ML systems, programming languages, and formal methods.

Li Lyna Zhang
SENIOR RESEARCHER

We are working on self-improving AI. We welcome researchers working on: (1) optimizing LLM inference scaling laws; (2) improving LLM reasoning capabilities, particularly in domains like math and code reasoning; (3) LLM post-training, such as supervised fine-tuning, reinforcement learning from human feedback, or reward model training; (4) optimizing LLM long-context capabilities, including identifying and addressing cutting-edge research challenges in real applications of long-context LLMs.

Lingxiao Ma
SENIOR RESEARCHER

We welcome researchers working on deep learning compilers to join us in exploring compiler techniques for future AI models and hardware.

Shijie Cao
SENIOR RESEARCHER

Our research focuses on addressing the critical efficiency challenges that large language models (LLMs) face as they continue to transform industries and everyday life. To truly make LLMs accessible on every device, innovative approaches in model-system-hardware co-design are essential. We explore cutting-edge techniques, such as low-bit precision and sparsity, and novel system optimizations and custom hardware designs specifically tailored for these techniques. We are excited to invite talented researchers to collaborate with us in pioneering the solutions that will shape the future of AI at scale.

Shuai Lu
RESEARCHER

We are working on LLMs for code verification. We welcome faculty members with interest and experience in: (1) using LLMs for code, such as code generation, automatic debugging, and code verification, and applying LLMs to system and software engineering problems; (2) applying LLMs to reasoning tasks, and designing and training models to solve domain-specific problems; (3) formal verification, including familiarity with formal languages like Verus, Dafny, Isabelle, or Lean.

Baotong Lu
RESEARCHER

We believe in the critical role of the “storage component” in the era of AI. Specifically, our research is dedicated to enhancing the efficiency of large language models (LLMs) in critical scenarios, such as long-context inference, through hardware-algorithm co-design. We strive to understand and leverage the fundamental working principles of large language models to advance model performance. We welcome researchers working on modern storage systems, e.g., database systems on disaggregated hardware, vector stores, and storage for LLMs, to join us in the exploration.

Ran Shu
SENIOR RESEARCHER

We are working on the simulation of large-scale AI systems. Specifically, we are focusing on cutting-edge simulators with extreme efficiency and fidelity. We welcome professors to collaborate on: (1) hardware/software co-design for network simulation acceleration; (2) efficient and accurate GPU, system bus, and NIC simulation and modeling; (3) efficient and accurate traffic generation and monitoring in simulation.

Zhirong Wu
SENIOR RESEARCHER

Training large language models is highly computationally intensive, and with the rapid expansion of global knowledge each year, updating LLMs incrementally without extensive retraining has become a pressing challenge. We invite applicants with interests in continual learning, mitigating catastrophic forgetting, neural plasticity, and efficient model adaptation to join us in tackling this timely issue.

5. Towards a synergy between AI and brain

Our brain is one of the most complicated objects on this planet. Although scientists have spent hundreds of years trying to understand it, many questions remain unresolved. Understanding how the brain works is very important because it can help us better treat neurological and psychiatric disorders and develop more advanced artificial intelligence. In this theme, we aim to explore how to use AI to better understand the brain and apply the insights to improve brain health and design brain-inspired AI.

For more detailed information, please refer to the article Towards a synergy between AI and the brain.

- AI for understanding brain signals
- Brain-computer interface
- Brain-inspired machine learning
- AI for brain health

Dongsheng Li (Engaging Lead)
PRINCIPAL RESEARCH MANAGER

As AI is transforming the world, it is equally crucial to deepen our understanding of biological intelligence. The integration of AI and brain science research holds immense potential to drive innovation, enhance human well-being, and expand our knowledge of both artificial and biological intelligence. I am eager to collaborate with researchers in this field to revolutionize existing paradigms through interdisciplinary studies, advancing both AI and brain science in groundbreaking ways.

Yansen Wang
SENIOR RESEARCHER

Our brain, encapsulated within a structure no larger than our two fists, is a marvel of complexity and power. It governs every facet of our thoughts, emotions, and actions, enabling feats from artistic expression to scientific innovation. Yet, much of its inner workings remain shrouded in mystery, with countless questions about consciousness, memory, and cognition still unanswered. Understanding the brain is not only essential for unraveling these mysteries but also for constructing interfaces between the brain and artificial intelligence. I hope to see a better synergy in the future.

Dongqi Han
SENIOR RESEARCHER

A single neuron, while capable of carrying some information, cannot generate intelligence on its own. However, when many neurons connect and interact, they form complex networks—whether in the biological brain or artificial neural networks—capable of processing information, learning, and making decisions. The secret behind this collective intelligence lies in the emergent properties of interconnected systems, where the intricate patterns of connections and dynamic interactions between neurons give rise to cognition, perception, and learning. Unraveling the origin of intelligence requires research into these interactions, exploring how simple components combine to produce profound capabilities.

6. Theoretical Foundation of Large Language Models

Large language models, led by GPT and followed by numerous other models, have demonstrated strong capabilities in many areas, from language-based tasks to reasoning and planning. LLMs have fueled a new wave of the AI revolution that will have a deep impact on every aspect of society. However, the basic mechanism behind LLMs can be simply explained as next-word prediction: predicting the next word to be generated as accurately as possible after being trained on a large amount of text data. What is the intrinsic connection between the seemingly simple next-word prediction mechanism and the amazingly intelligent capabilities in language processing, theory of mind, reasoning, planning, and so on? Is there any fundamental limitation of LLMs based on next-word prediction? Currently there is still a lack of principled understanding of these crucial questions.

In this proposal, we would like to solicit research collaborations on the principled and theoretical understanding of the capabilities of LLMs. The potential collaborations may cover a wide range of research topics, such as understanding the reasoning and planning capabilities of LLMs, theoretical explanation of the emergent behavior of LLMs, LLMs as compressors, enhancing LLMs with data-efficient training, and new architectures or paradigms for language models. There is no established theory yet for large language models and generative AI systems in general, which means that the area is wide open and there are many opportunities to explore. To be successful, the exploration is likely to require interdisciplinary approaches and integration with empirical approaches. We believe that developing a solid theoretical foundation of LLMs is imperative for the future development of safe, secure, and interpretable AI systems.

For more detailed information, please refer to the article Theoretical foundation of large language models.

- Understanding the reasoning and planning capabilities of LLMs
- Theoretical explanation of the scaling laws and emergent behavior of LLMs
- Theory of in-context learning
- Understanding the intelligence of LLMs from the compression point of view
- Enhancing LLMs with data-efficient training
- Improving generative AI with new model architectures or learning paradigms

Wei Chen (Engaging Lead)
PRINCIPAL RESEARCHER

I am looking forward to collaborations with young researchers worldwide who are working on the fundamental understanding of large language models and AI in general. I believe it is crucial to have a solid theoretical understanding to further advance AI research and keep it safe and controllable.

Siwei Wang
SENIOR RESEARCHER

I am looking forward to collaborations with young researchers worldwide who are working on (1) conducting theoretical analysis of how Transformers work, e.g., on the dynamics during the Transformer training phase or the inference phase; (2) understanding how current LLMs work in different tasks via either theoretical or experimental analysis; and (3) designing new training processes, inference architectures, or model structures to address potential limitations in current LLMs.

Zhong Li
RESEARCHER

I am looking forward to establishing deep cooperation with world-class scholars. It is of vital importance to achieve controllable and efficient modeling, not only for large language models but also for general AI, through fundamental research in theoretical machine learning.

Yifei Shen
SENIOR RESEARCHER

I am looking forward to collaborations with young researchers worldwide who are working on (1) understanding how high-level capabilities such as reasoning and planning emerge from next-token prediction; (2) the efficient training of large LLMs (e.g., 70B models); (3) efficient long-context inference for reasoning tasks. I believe it is crucial to advance the development of high-level abilities in LLMs and enable more people to conduct LLM research.

7. Advancing Healthcare through Innovative Sensing Technologies and AI

Advancements in sensing technologies have revolutionized healthcare by enabling continuous, non-invasive monitoring of vital signs, which are essential indicators of an individual’s health. This research project focuses on the development and deployment of cutting-edge sensing systems that provide accurate, real-time measurement of vital signs, facilitating continuous in-home health monitoring. These technologies enable seamless data collection and connectivity, supporting proactive healthcare management and early disease detection. A key innovation is the co-design of hardware and software systems that integrate wireless signal processing with machine learning algorithms for advanced data analysis, predictive insights, and personalized healthcare. This approach promises to create a more responsive, individualized, and efficient healthcare ecosystem.

- Development of Cutting-Edge Sensing Hardware and/or Software Systems: We encourage proposals focused on designing advanced sensing technologies capable of delivering accurate, real-time measurements of vital signs, enabling continuous, in-home health monitoring. This includes wearable sensors, wireless sensing technologies, or non-invasive sensing systems that seamlessly integrate into daily life, offering round-the-clock monitoring of physiological signals such as heart rate, respiration, and other critical health indicators.
- Medical Imaging Analytics: We seek proposals that apply AI and machine learning algorithms to enhance medical imaging, improving diagnostic accuracy and offering advanced capabilities such as predictive analysis and AI-driven reasoning. Proposals may focus on various imaging modalities, including MRI, CT, X-ray, and ultrasound, to support more effective clinical decision-making and early detection of disease.
- Multi-Modal Analytics of Human Healthcare Data: Health data are inherently multi-dimensional, encompassing vital signs, medical images, laboratory results, and behavioral data. We are particularly interested in proposals that develop innovative methods to integrate and analyze multi-modal healthcare data, creating a holistic view of patient health. The goal is to improve diagnostic precision and predictive accuracy by synthesizing disparate data streams, enabling more informed and personalized care.
- AI-Powered Intervention Techniques: We invite proposals that explore the development of AI-driven intervention strategies. These strategies should utilize data from sensors, medical imaging, and electronic health records to provide personalized, adaptive treatment plans or deliver real-time interventions. This could include anything from lifestyle recommendations to timely medical responses, helping healthcare providers manage patients more effectively.
- AI-Powered Discovery of Biomarkers and Drug Development: We encourage proposals focused on leveraging AI to discover novel biomarkers for disease diagnosis and progression, as well as to accelerate drug discovery. AI-driven analysis of complex biological data has the potential to identify new therapeutic targets, optimize clinical trials, and drive the development of more effective treatments, ultimately advancing the field of precision medicine.

Lili Qiu (Engaging Lead)
ASSISTANT MANAGING DIRECTOR

We are seeking research proposals that aim to transform the healthcare landscape through the development of innovative sensing technologies and machine learning approaches. Our vision is to enable continuous, non-invasive monitoring of vital signs and to harness advanced health data analysis, creating a more proactive, personalized, and efficient healthcare system.

Zilong Wang
SENIOR RESEARCHER

We are seeking passionate researchers dedicated to revolutionizing healthcare through cutting-edge technology. Our vision is to develop innovative sensing systems with various features – portable or wearable, non-invasive, accurate, continuous, and easily accessible for home use, enabling comprehensive health monitoring. Through developing advanced algorithms, we aim to analyze sensing data, medical images, and multi-modal healthcare data to address critical healthcare challenges, such as early disease detection, treatment optimization, and personalized medicine. We look forward to collaborating and exploring new possibilities in next-generation healthcare.

The seven research fields mentioned above are the primary focus for collaboration in the Microsoft Research Asia StarTrack Scholars 2025 program. In addition to these, applicants can also choose a field of their interest from the following options: Heterogeneous Extreme Computing, Intelligent Cloud and Edge, Intelligent Multimedia, Internet Graphics, Machine Learning, Media Computing, Multi-Modal Computing, Natural Language Computing, Networking Infrastructure, Social Computing, Systems, Trustworthy Systems, Visual Computing, Wireless.