Yifei Shen

Researcher

About

I received the B.S. degree in computer science from ShanghaiTech University and the Ph.D. degree in electronic and computer engineering from the Hong Kong University of Science and Technology. My research focuses on interpreting the mechanisms of foundation models, analyzing them mathematically, and applying these mechanisms to push the limits of foundation models. Our work consists of three main components:

Controlled Experiments: We design rigorous controlled experiments to train smaller models from scratch, aiming to discover universal laws that extend beyond the current foundation models.

Mechanistic Interpretations: We perform mechanistic interpretations of pretrained large models, transforming them into “grey boxes” to enhance our understanding of their inner workings.

Applications: We use the discovered laws and interpretations to develop novel training paradigms for foundation models. Currently, our applications include interdisciplinary collaborative projects in

 

In addition to my primary research, I have a strong interest in system-level work that enables LLM training on less powerful GPUs:

This complementary work aims to make LLMs accessible to more researchers and practitioners.