Bio Embedding

Life is governed by biological sequences and molecules, such as DNA, RNA, and proteins, which follow the de facto ‘natural’ language of biology. Understanding how these biomolecules behave and interact with each other could help the millions of people who are still dying of diseases such as cancer. However, understanding a biomolecule such as a protein sequence is not easy, because labeled data (e.g., structural information) is limited and costly to collect. Finding effective ways to understand these sequences is therefore vital and urgent for biology, healthcare, and medicine.

In this project, the goal is to learn meaningful representations of biomolecules (proteins, molecules). Specifically, we aim to design bio-inspired pretraining techniques and to apply them to empower, or even enable, impactful downstream applications.

People

Liang He

Senior Researcher

Fusong Ju

Researcher

Tie-Yan Liu

Distinguished Scientist, Microsoft Research AI for Science

Tao Qin

Partner Research Manager

Yingce Xia

Principal Researcher