RL Theory

Finding the optimal policy for a given control problem is a fundamental task in artificial intelligence. Reinforcement learning faces three challenges: (1) there is no pre-given i.i.d. data; (2) there are no direct labels, as there are in supervised learning; and (3) the optimization itself is hard. To design efficient algorithms for the control problem, we need a theoretical view that clarifies the problem's structure. Specifically, we focus on how to collect and how to use data. For collecting data, we develop exploration algorithms that are (1) sample-efficient in theory, (2) computationally efficient, and (3) compatible with deep reinforcement learning. For using data, we study the theoretical behavior of algorithms in the non-i.i.d. setting, analyze the policy optimization process using optimal control as a tool, and propose new policy optimization algorithms that are more stable and efficient.

People

Yue Wang

Senior Researcher
