Optimization in Deep Learning

  •  Understand the dynamics of optimizers in deep learning, including their convergence rates and implicit regularization.

One important step toward opening the black box of deep neural networks is understanding the dynamics of the optimization process. We investigate the influence of noise in stochastic optimization algorithms, the role of local Hessian properties, and the optimization path of SGD in deep learning.
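As a minimal sketch of the kind of quantity involved (my illustration, not the text's method), the snippet below runs SGD on a toy PyTorch model and periodically estimates the local sharpness, i.e. the top Hessian eigenvalue, by power iteration on Hessian-vector products; the model, data, and helper names are hypothetical stand-ins.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 1))
loss_fn = nn.MSELoss()
x, y = torch.randn(256, 10), torch.randn(256, 1)
params = [p for p in model.parameters() if p.requires_grad]

def hvp(loss, params, vec):
    """Hessian-vector product via double backpropagation."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    flat = torch.cat([g.reshape(-1) for g in grads])
    return torch.autograd.grad(flat @ vec, params, retain_graph=True)

def top_hessian_eigenvalue(loss, params, iters=20):
    """Estimate the largest Hessian eigenvalue (local sharpness) by power iteration."""
    n = sum(p.numel() for p in params)
    v = torch.randn(n)
    v /= v.norm()
    eig = 0.0
    for _ in range(iters):
        hv = torch.cat([h.reshape(-1) for h in hvp(loss, params, v)])
        eig = torch.dot(v, hv).item()   # Rayleigh quotient with the current direction
        v = hv / (hv.norm() + 1e-12)
    return eig

opt = torch.optim.SGD(model.parameters(), lr=0.1)
for step in range(200):
    idx = torch.randint(0, 256, (32,))   # mini-batch sampling is the source of SGD noise
    loss = loss_fn(model(x[idx]), y[idx])
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 50 == 0:
        full_loss = loss_fn(model(x), y)
        print(step, full_loss.item(), top_hessian_eigenvalue(full_loss, params))
```

Tracking such curvature estimates along the SGD trajectory is one concrete way to relate optimizer noise and local Hessian properties to the observed dynamics.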

  • Architecture design motivated by optimization theory.

We analyze the forward/backward stability of deep neural networks and establish conditions that guarantee the stability of residual connections and multi-branch structures. Based on this understanding, we design new neural network architectures that ensure stable signal propagation across layers.
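The following sketch (an assumption for illustration, not the text's exact construction) compares how the forward signal norm evolves through a plain feed-forward stack versus residual blocks whose branches are scaled by a hypothetical depth-dependent factor, one simple way to keep signal propagation stable across many layers.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
depth, width = 50, 128
x = torch.randn(64, width)

plain_layers = [nn.Linear(width, width) for _ in range(depth)]
res_layers   = [nn.Linear(width, width) for _ in range(depth)]

h_plain, h_res = x, x
scale = depth ** -0.5   # hypothetical 1/sqrt(depth) branch scaling
with torch.no_grad():
    for l in range(depth):
        h_plain = torch.relu(plain_layers[l](h_plain))            # plain stack
        h_res   = h_res + scale * torch.relu(res_layers[l](h_res))  # scaled residual branch
        if (l + 1) % 10 == 0:
            print(l + 1, h_plain.norm().item(), h_res.norm().item())
```

Printing the activation norms at increasing depth makes it easy to see which design keeps the forward signal in a controlled range.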

  • Rethinking the invariance properties in deep learning models.

Group invariance/equivariance is ubiquitous in deep learning, e.g., the positive scale-invariance of ReLU networks, the rotation invariance of 3D point clouds, and the translation and scaling equivariance of fluid dynamics. By removing this redundancy, representing neural networks through the corresponding group-invariant variables can benefit both optimization and generalization, as illustrated by the sketch below.
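As a minimal illustration of the positive scale-invariance mentioned above (my example, not the text's), rescaling a hidden ReLU unit's incoming weights and bias by any c > 0 while dividing its outgoing weights by c leaves the network function unchanged, which is exactly the redundancy that group-invariant reparameterizations remove.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(nn.Linear(5, 8), nn.ReLU(), nn.Linear(8, 1))
x = torch.randn(16, 5)
y_before = net(x).detach().clone()

c, unit = 3.7, 2                       # arbitrary positive scale and hidden-unit index
with torch.no_grad():
    net[0].weight[unit, :] *= c        # incoming weights of the chosen ReLU unit
    net[0].bias[unit]      *= c        # its bias scales the same way
    net[2].weight[:, unit] /= c        # outgoing weights compensate

y_after = net(x)
print(torch.allclose(y_before, y_after, atol=1e-6))   # True: the function is unchanged
```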