
Towards a Geometric Theory of Deep Learning

Presenter
October 7, 2025
Abstract
The mathematical core of deep learning is function approximation by neural networks trained on data using stochastic gradient descent. I will present a collection of sharp results on training dynamics for the deep linear network (DLN), a phenomenological model introduced by Arora, Cohen and Hazan in 2017. Our analysis reveals unexpected ties with several areas of mathematics (minimal surfaces, geometric invariant theory and random matrix theory) as well as a conceptual picture for "true" deep learning. This is joint work with several co-authors: Nadav Cohen (Tel Aviv), Kathryn Lindsey (Boston College), Alan Chen, Tejas Kotwal, Zsolt Veraszto and Tianmin Yu (Brown).
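For context, the deep linear network replaces a nonlinear network by a product of matrices. The abstract does not spell out the model, so the following is a minimal sketch of the standard DLN setup; the dimensions, the loss $E$, and the use of gradient flow (the continuous-time idealization of the stochastic gradient descent mentioned above) are assumptions based on the usual formulation:

\[
  f(x) = W_N W_{N-1} \cdots W_1 x, \qquad W_i \in \mathbb{R}^{d_i \times d_{i-1}},
\]

with each factor trained on a loss of the end-to-end matrix $W = W_N \cdots W_1$:

\[
  \dot W_i = -\nabla_{W_i}\, E\!\left(W_N \cdots W_1\right), \qquad i = 1, \dots, N.
\]

The overparameterization of a single matrix $W$ by $N$ factors is what gives rise to the nontrivial training dynamics that are the subject of the talk.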