Hi everyone,
We're super excited to host Dan Fu, CS PhD student at Stanford, for tomorrow's MLSys Seminar (December 4th) at 1:00 pm PT (please note the time change).
The talk details are as follows:
Title: Monarch Mixer: Making Foundation Models More Efficient
Abstract: Machine learning models are increasingly being scaled in both sequence length and model dimension to reach longer contexts
and better performance. However, existing architectures like Transformers scale quadratically along both of these axes. In this talk, I'll discuss Monarch Mixer (M2), a new architecture that uses the same sub-quadratic primitive along both sequence length and
model dimension. M2 mixes information along the sequence and model dimensions using Monarch matrices, a simple class of expressive structured matrices that captures many linear transforms, achieves high hardware efficiency on GPUs, and scales sub-quadratically.
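For anyone curious about the core primitive before the talk: a Monarch matrix is (roughly) a product of block-diagonal matrices interleaved with a fixed "reshape-transpose" permutation, which is what gives the sub-quadratic matvec cost. Below is a minimal NumPy sketch of a Monarch matrix-vector product under that assumption (square n = b×b case, with `L` and `R` each holding b blocks of size b×b); it's an illustrative simplification, not the exact parameterization from the paper:

```python
import numpy as np

def monarch_matvec(L, R, x):
    """Sketch of a Monarch matrix-vector product y = P L P R x.

    L, R: arrays of shape (b, b, b) -- b block matrices of size b x b,
          interpreted as block-diagonal matrices.
    P:    the fixed permutation that transposes the b x b reshape of a
          length-n vector (a stride permutation; its own inverse here).
    x:    vector of length n = b * b.

    Cost is O(n^1.5) multiply-adds instead of O(n^2) for a dense matvec.
    """
    b = L.shape[0]
    n = b * b
    assert x.shape == (n,)
    # R x: apply block i of R to the i-th contiguous segment of x
    z = np.einsum('bij,bj->bi', R, x.reshape(b, b))
    # P: transpose the b x b reshape (mixes information across segments)
    z = z.T
    # L: apply block i of L to the i-th row of the permuted intermediate
    y = np.einsum('bij,bj->bi', L, z)
    # final P: transpose back and flatten
    return y.T.reshape(n)
```

The two block-diagonal multiplies each touch only b blocks of b² entries, so the whole product runs in O(n√n) time while still mixing every input coordinate into every output coordinate, which is the property M2 exploits along both the sequence and model dimensions.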
Bio: Dan Fu is a PhD student in the Computer Science Department at Stanford University, where he is co-advised by Christopher Ré and Kayvon Fatahalian. His research is at the intersection of systems and machine learning and focuses on developing algorithms and architectures
to make machine learning more efficient.
See everyone there!
Best,
Simran