Online Learning and Optimization from Continuous to Discrete Time
Many discrete algorithms for convex optimization and online learning can be interpreted as a discretization of a continuous-time process. Perhaps the simplest and oldest example is gradient descent, which is the discretization of the ODE $\dot X = -\nabla f(X)$. Studying the continuous-time process offers many advantages: the analysis is often simple and elegant, it provides insights into the discrete process, and can help streamline the design of algorithms (by performing the design in the continuous domain then discretizing). In this talk, I will present two such examples: In the first, I will show how some (stochastic) online learning algorithms can be obtained by discretizing an ODE on the simplex, known as the replicator dynamics. I will review properties of the ODE, then give sufficient conditions for convergence of the discrete process, by relating it to the solution trajectories of the ODE. In the second example, I will show how we can design an ODE for accelerated first-order optimization of smooth convex functions. The continuous-time design relies on an “inverse” Lyapunov argument: we start from an energy function which encodes the constraints of the problem and the desired convergence rate, then design dynamics tailored to that energy function. Then, by carefully discretizing the ODE, we obtain a family of accelerated algorithms with optimal rate of convergence.
- Date:
- Speakers:
- Walid Krichene
- Affiliation:
- Berkeley
-
-
Ben Ryon
-
-