CS 201 | Reimagining Gradient Descent: Large Stepsize, Oscillation, and Acceleration, JINGFENG WU, UC Berkeley

Speaker: Jingfeng Wu
Affiliation: UC Berkeley

ABSTRACT:

Gradient Descent (GD) and Stochastic Gradient Descent (SGD) are pivotal in machine learning, particularly in neural network optimization. Conventional wisdom suggests smaller stepsizes for stability, yet in practice, larger stepsizes often yield faster convergence and improved generalization, despite initial instability.

This talk delves into the dynamics of GD with a constant stepsize applied to logistic regression with linearly separable data, where the constant stepsize \(\eta\) is so large that the loss initially oscillates. We show that GD exits the initial oscillatory phase rapidly, within \(O(\eta)\) steps, and subsequently achieves an \(\tilde{O}(1/(t\eta))\) convergence rate after \(t\) additional steps. Our results imply that, given a budget of \(T\) steps, GD can achieve an accelerated loss of \(\tilde{O}(1/T^2)\) with an aggressive stepsize of \(\eta = \Theta(T)\), without any use of momentum or variable stepsize schedulers. Our proof technique is versatile and also handles general classification loss functions (where exponential tails are needed for the \(\tilde{O}(1/T^2)\) acceleration), nonlinear predictors in the neural tangent kernel regime, and online SGD with a large stepsize, under suitable separability conditions. Our results are consistent with experiments.
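As a rough illustration of the setting described in the abstract, the following minimal sketch runs constant-stepsize GD on logistic regression over a two-point linearly separable toy dataset. The dataset, the stepsizes, and all function names (`run_gd`, `TOY_DATA`, and so on) are assumptions made for this example and are not taken from the talk or the paper. It only shows the qualitative behavior (the loss first rises with a large stepsize, then falls) and does not verify the stated rates.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def log1pexp(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def sigmoid(x):
    """Numerically stable 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def loss(w, data):
    """Average logistic loss over (x, y) pairs with labels y in {-1, +1}."""
    return sum(log1pexp(-y * dot(w, x)) for x, y in data) / len(data)

def grad(w, data):
    """Gradient of the average logistic loss with respect to w."""
    g = [0.0] * len(w)
    for x, y in data:
        coeff = -y * sigmoid(-y * dot(w, x)) / len(data)
        for j, xj in enumerate(x):
            g[j] += coeff * xj
    return g

def run_gd(data, eta, steps):
    """Constant-stepsize GD from w = 0; returns the loss at every iterate."""
    w = [0.0] * len(data[0][0])
    losses = [loss(w, data)]
    for _ in range(steps):
        g = grad(w, data)
        w = [wj - eta * gj for wj, gj in zip(w, g)]
        losses.append(loss(w, data))
    return losses

# Hypothetical toy data: separable by w = (0, 1), but the two points
# pull the first coordinate in opposite directions, so a large first
# step overshoots and misclassifies one of them.
TOY_DATA = [((1.0, 0.2), 1), ((0.5, -0.2), -1)]

losses_large = run_gd(TOY_DATA, eta=100.0, steps=200)
losses_small = run_gd(TOY_DATA, eta=1.0, steps=200)
print("large stepsize, first losses:", [round(v, 4) for v in losses_large[:6]])
print("final loss, eta=100:", losses_large[-1])
print("final loss, eta=1:  ", losses_small[-1])
```

On this toy problem the large-stepsize run is non-monotone in its first few steps, yet for the same step budget it ends at a lower loss than the small-stepsize run.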

This talk is based on a joint paper with Peter Bartlett, Matus Telgarsky, and Bin Yu.

BIO:

Jingfeng Wu is currently a postdoc at the Simons Institute at UC Berkeley, hosted by Peter Bartlett and Bin Yu. He obtained a PhD in CS from Johns Hopkins University, advised by Vladimir Braverman, and an MS and a BS from Peking University. His research focuses on the theory and algorithms of deep learning, and related topics in algorithms, optimization, and statistical learning theory. He was selected as a 2023 Rising Star in Data Science by UChicago and UCSD.

Hosted by Professor Quanquan Gu

Date/Time:
Mar 05, 2024
4:15 pm - 5:45 pm

Location:
3400 Boelter Hall
420 Westwood Plaza, Los Angeles, California 90095