Analytic Theories of Language, Creativity, and Reasoning in Artificial Intelligence

Tuesday, March 24
Location: 370 Jay Street, Room 825
Time: 11:00 AM EST
|
|
Surya Ganguli
Associate Professor of Applied Physics and, by courtesy, of Neurobiology, of Electrical Engineering, and of Computer Science, Stanford University

Surya Ganguli is a professor of Applied Physics at Stanford, a Senior Fellow of Stanford's Human-Centered AI Institute, and a Venture Partner at General Catalyst. Dr. Ganguli triple-majored in physics, mathematics, and EECS at MIT, earned a master's in pure mathematics and a PhD in string theory at Berkeley, and completed a postdoc in theoretical neuroscience at UCSF.
He has also been a visiting researcher at both Google and Meta AI and a Venture Partner at a16z. His research spans neuroscience, machine learning, and physics, focusing on understanding and improving how both biological and artificial neural networks learn striking emergent computations. He has been awarded a Swartz Fellowship in computational neuroscience, a Burroughs Wellcome Career Award, a Terman Award, two NeurIPS Outstanding Paper Awards, a Sloan Fellowship, a James S. McDonnell Foundation Scholar Award in human cognition, a McKnight Scholar Award in neuroscience, a Simons Investigator Award in the mathematical modeling of living systems, an NSF CAREER Award, a Schmidt Science Polymath Award, and an AI2050 Senior Fellowship.

Abstract

Two major advances of the last decade in AI are language models and diffusion models. However, the remarkable capabilities of such complex models often elude explanation through analytic theory. I will discuss several works in which simple analytic theories can quantitatively explain their performance characteristics. First, for language modeling, we can quantitatively predict, for the first time, the power-law exponents governing neural scaling laws that relate loss to the amount of training data. We show that these neural exponents are simply a function of two statistical properties of language itself. Second, for diffusion models, we develop an analytic theory of creativity that explains how they can generate exponentially many novel images from a finite training set by constructing patch mosaics of the training data. Our analytic theory predicts individual image outputs of trained convolution-only diffusion models with high fidelity. Third, we show how co-designing learning at training time and search at test time can lead to improved mathematical reasoning using language models. Time permitting, we will also mention some of our work on constructing, explaining, and controlling digital twins of the brain.

This event is free and open to the public.

The Seminar Series in Modern Artificial Intelligence is held at the NYU Tandon School of Engineering and is hosted by the Department of Electrical and Computer Engineering. Organized by Professor Anna Choromanska, the series aims to bring together faculty and students to discuss the most important research trends in AI. The speakers are world-renowned experts whose research is making an immense impact on the development of new machine learning techniques and technologies, helping to build a better, smarter, more connected world.