Time | Talk Title | Speaker |
---|---|---|
08:30-08:40 | Opening | Huishuai Zhang |
Session 1 | | |
08:40-09:05 | Faster Neural Network Training, Algorithmically [video] | Jonathan Frankle |
09:05-09:30 | Bayesian Interpolation with Deep Linear Networks [video] | Boris Hanin |
09:30-09:55 | Variational Principles for Mirror Descent and Mirror Langevin Dynamics [video] | Maxim Raginsky |
09:55-10:20 | How Does Sharpness-Aware Minimization Minimize Sharpness? [video] | Zhiyuan Li |
10:20-10:35 | Coffee Break | |
Session 2 | | |
10:35-11:00 | Analysis of a Toy Case for Emergence [video] | Sebastien Bubeck |
11:00-11:25 | Beyond Neural Scaling Laws: Towards Data Efficient Deep Learning [video] | Surya Ganguli |
11:25-11:50 | Why Can GPT Learn In-Context? Language Models Implicitly Perform Gradient Descent as Meta-Optimizers [video] | Li Dong |
11:50-12:15 | Flow Straight and Fast: A Simple and Unified Approach to Generative Modeling, Domain Transfer, and Optimal Transport [video] | Qiang Liu |
12:15-13:30 | Lunch Break | |
Session 3 | | |
13:30-13:55 | Condensation in Deep Learning [video] | Zhiqin Xu |
13:55-14:20 | Adapting to Distribution Shifts: Recent Advances in Importance Weighting Methods [video] | Masashi Sugiyama |
14:20-14:45 | Which Graph Neural Network Can Provably Solve Practical Problems? [video] | Di He |
14:45-15:10 | Contrastive Learning Is Spectral Clustering on Similarity Graph [video] | Yang Yuan |
15:10-15:25 | Coffee Break | |
Session 4 | | |
15:25-15:50 | On the Theoretical Understanding of Mixup | Kenji Kawaguchi |
15:50-16:15 | Benign Overfitting in Two-layer Convolutional Neural Networks [video] | Yuan Cao |
16:15-16:40 | Environment Invariant Linear Least Squares [video] | Cong Fang |