Li Zhao

Principal Researcher

Microsoft Research Blog

Finding the best learning targets automatically: Fully Parameterized Quantile Function for distributional RL

December 18, 2019 | Li Zhao

Reinforcement learning has achieved great success in game scenarios, with RL agents beating human competitors in such games as Go and poker. Distributional reinforcement learning, in particular, has proven to be an effective approach for training an agent to maximize…

Microsoft Research Lab – Asia