
Towards a Geometric Theory of Deep Learning

Presenter
October 7, 2025
Abstract
The mathematical core of deep learning is function approximation by neural networks trained on data using stochastic gradient descent. I will present a collection of sharp results on training dynamics for the deep linear network (DLN), a phenomenological model introduced by Arora, Cohen and Hazan in 2017. Our analysis reveals unexpected ties with several areas of mathematics (minimal surfaces, geometric invariant theory and random matrix theory) as well as a conceptual picture for "true" deep learning. This is joint work with several co-authors: Nadav Cohen (Tel Aviv), Kathryn Lindsey (Boston College), Alan Chen, Tejas Kotwal, Zsolt Veraszto and Tianmin Yu (Brown).
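For context, the deep linear network replaces a nonlinear network by a product of matrices. The abstract does not spell out the model, so the following is a minimal sketch of the standard DLN setup; the dimensions, the loss $E$, and the use of gradient flow (the continuous-time idealization of the stochastic gradient descent mentioned above) are assumptions based on the usual formulation:

\[
  f(x) = W_N W_{N-1} \cdots W_1 x, \qquad W_i \in \mathbb{R}^{d_i \times d_{i-1}},
\]

with each factor trained on a loss of the end-to-end matrix $W = W_N \cdots W_1$:

\[
  \dot W_i = -\nabla_{W_i}\, E\!\left(W_N \cdots W_1\right), \qquad i = 1, \dots, N.
\]

The overparameterization of a single matrix $W$ by $N$ factors is what gives rise to the nontrivial training dynamics that are the subject of the talk.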