Speaker: Yasaman Bahri
Affiliation: Google Brain
Recent investigations into infinitely wide deep neural networks have given rise to intriguing connections with kernel methods. Specifically, the training dynamics of infinite-width neural nets were found to be equivalent to learning with a fixed kernel, the “Neural Tangent Kernel” (NTK). We investigate how this connection depends on the learning rate, or step size, used in gradient descent optimization. Surprisingly, even as networks become wider, we find a range of (large) learning rates for which one does not recover kernel (NTK) dynamics, suggesting the existence of an alternative infinite-width limit outside of current theory. The small and large learning rate regimes we investigate are separated by a phase transition. We support these findings both empirically and theoretically, the latter through analysis of a class of solvable models. I will describe the signatures of these two phases and their connection to neural network performance, and discuss the broader implications for deep learning theory.
Yasaman Bahri is a Research Scientist at Google on the Brain Team. Her current research aims to build scientific and theoretical foundations for deep learning that bridge the gap with practice. Topics of her recent work include optimization and generalization in deep learning; correspondences between neural networks and Gaussian processes; and the intersection of statistical mechanics and deep learning. She was trained as a theoretical condensed matter physicist and received her Ph.D. in Physics from UC Berkeley in 2017. She is a recipient of the NSF Graduate Fellowship and was recently named a 2020 Rising Star in EECS.
Hosted by Professor Baharan Mirzasoleiman
Date(s) - Jan 14, 2021
4:00 pm - 5:45 pm
404 Westwood Plaza, Los Angeles