Deep Policy Gradient Algorithms: A Closer Look

Deep reinforcement learning methods are behind some of the most publicized recent results in machine learning. Despite these successes, deep RL methods face a number of systemic issues: brittleness to small changes in hyperparameters, high reward variance across runs, and sensitivity to seemingly small algorithmic changes.

In this talk, we take a closer look at the potential root of these issues. Specifically, we study how the policy gradient primitives underlying popular deep RL algorithms reflect the principles informing their development.


Logan Engstrom

Series: Microsoft Research Talks