MLSys Seminar Episodes 76 + 77: Jack Rae, Susan Zhang [Mon, Wed 3:30-4:20 pm PT]


Dan Fu

Feb 27, 2023, 1:09:14 AM
to stanford-ml...@googlegroups.com, cs-se...@lists.stanford.edu, ai-...@cs.stanford.edu, stanf...@googlegroups.com, dawn-i...@lists.stanford.edu
Hi everyone,

This week we'll have two episodes of the MLSys Seminar -- Monday 3:30-4:20pm PT, and Wednesday 3:30-4:20pm PT.

Monday will be Jack Rae from OpenAI, and Wednesday will be Susan Zhang from Meta!

Livestream links:
Talk details below!

Jack Rae
Title: Compression for AGI
Abstract: In this talk we discuss how foundation models are beginning to validate a hypothesis formed over 70 years ago: statistical models that compress their source data better thereby learn more fundamental and general capabilities from it. We start by covering some fundamentals of compression, and then describe how large language models, scaling into the hundreds of billions of parameters, are in fact state-of-the-art lossless compressors. We discuss some of the emergent capabilities and persistent limitations we may expect along the path to optimal compression.
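For anyone new to the compression framing: the key identity is that a model's cumulative log-loss on a sequence is, up to a few bits of arithmetic-coding overhead, the length of the lossless code that model can achieve on that sequence. A minimal Python sketch of the bookkeeping, with made-up probabilities rather than anything from the talk:

import math

def ideal_code_length_bits(token_probs):
    """Ideal lossless code length (in bits) for a sequence, given the model's
    predicted probability of each token that actually occurred. Arithmetic
    coding gets within a few bits of -sum(log2 p)."""
    return sum(-math.log2(p) for p in token_probs)

# Toy, made-up probabilities: a better model assigns higher probability
# to the observed tokens, so its code for the same data is shorter.
weak_model = [0.2, 0.1, 0.3, 0.2]
strong_model = [0.6, 0.5, 0.8, 0.7]
print(ideal_code_length_bits(weak_model))    # ~9.7 bits
print(ideal_code_length_bits(strong_model))  # ~2.6 bits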

Bio: Jack Rae is a team lead at OpenAI with a research focus on large language models and long-range memory. Previously, he worked at DeepMind for 8 years, where he led the large language model (LLM) research group. That group developed Gopher, a 280B-parameter LLM that halved the gap to human-level performance on a suite of exams; RETRO, a retrieval-augmented LLM; and the Chinchilla scaling laws, a finding that contemporary LLMs were considerably under-trained, which won best paper at NeurIPS 2022. Jack has a PhD in Computer Science from UCL and has published in AI venues such as ACL, ICLR, ICML, NeurIPS, and Nature.

Susan Zhang

Susan will be talking about Meta's OPT models!

See you all there!

Best, Dan

Dan Fu

Feb 27, 2023, 6:23:49 PM
to stanford-ml...@googlegroups.com, cs-se...@lists.stanford.edu, ai-...@cs.stanford.edu, stanf...@googlegroups.com, dawn-i...@lists.stanford.edu
We're live with Jack in 10!

Dan

Dan Fu

Mar 1, 2023, 6:20:20 PM
to stanford-ml...@googlegroups.com, cs-se...@lists.stanford.edu, ai-...@cs.stanford.edu, stanf...@googlegroups.com, dawn-i...@lists.stanford.edu
We're live with Susan in 10! She'll be talking about the trials of training OPT-175B:

Talk: Trials of developing OPT-175B

Abstract: LLM development at scale is an extraordinarily resource-intensive process, requiring compute resources that many do not have access to. The experimentation process can also appear rather haphazard, given the limited compute time available to fully ablate all architectural and hyperparameter choices. In this talk, we will walk through the development lifecycle of OPT-175B, covering the infrastructure and training-convergence challenges faced at scale, along with methods for addressing these issues going forward.

Bio: Susan Zhang is a research engineer at Meta focused on the development of large-scale language models.  Previously, she worked on designing photonic chips at Luminous Computing, scaling reinforcement learning systems at OpenAI, and building large-scale data infrastructure systems at Unity Technologies.

Dan

