MLSys Seminar Episode 83: Deepak Narayanan [Mon, 10:30 am PT]


Simran Arora

Oct 26, 2023, 1:49:57 PM
to stanford-ml...@googlegroups.com, cs-se...@lists.stanford.edu, ai-...@cs.stanford.edu, stanf...@googlegroups.com, dawn-i...@lists.stanford.edu
Hi everyone, 

We're super excited to host Deepak Narayanan for this week's MLSys Seminar (October 30th) at 10:30 am PT. 


The talk details are as follows:

Bio: Deepak is a Senior Applied Deep Learning Research Scientist in the ADLR group at NVIDIA, where he builds software systems to more efficiently train and serve LLMs. He graduated from Stanford with a Ph.D. in Computer Science in September 2021, where he was advised by Prof. Matei Zaharia.

Title:  Training Large Language Models at Scale

Abstract: Training LLMs efficiently is challenging for a few reasons: training can require yottaFLOPs of compute, and accelerators have limited memory capacity, making it impossible to fit large models on even a single multi-GPU server. Consequently, new methods of model parallelism, such as tensor and pipeline parallelism, have been proposed. Unfortunately, naïve usage of these methods leads to scaling issues at thousands of GPUs. In this talk, I describe various systems innovations incorporated into Megatron-LM (https://github.com/nvidia/megatron-lm) that allow us to run training iterations for models with up to a trillion parameters on thousands of GPUs.
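To give a feel for the tensor parallelism mentioned in the abstract, here is a toy sketch (not Megatron-LM's actual implementation) that simulates GPUs with array slices: a linear layer's weight matrix is split column-wise across devices, each device computes an independent partial matmul, and the partial outputs are gathered back together.

```python
import numpy as np

def column_parallel_linear(x, W, num_gpus):
    """Toy column-parallel linear layer Y = X @ W.

    W is split column-wise into one shard per (simulated) GPU; each
    shard computes an independent local matmul, and the partial outputs
    are concatenated along the column axis (an all-gather in a real
    multi-GPU implementation).
    """
    shards = np.array_split(W, num_gpus, axis=1)        # one weight shard per GPU
    partial_outputs = [x @ shard for shard in shards]   # independent local matmuls
    return np.concatenate(partial_outputs, axis=1)      # "all-gather" of the shards

# Hypothetical sizes chosen for illustration only.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))    # batch of 4 activations, hidden size 8
W = rng.standard_normal((8, 16))   # weight matrix split across 4 "GPUs"

y_parallel = column_parallel_linear(x, W, num_gpus=4)
y_serial = x @ W
assert np.allclose(y_parallel, y_serial)  # sharded result matches the full matmul
```

The key property is that no shard ever needs the full weight matrix in memory, which is what lets models larger than a single accelerator's memory be trained at all.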

See everyone there!!

Best,
Simran

Simran Arora

Oct 30, 2023, 1:25:35 PM
to stanford-ml...@googlegroups.com, cs-se...@lists.stanford.edu, ai-...@cs.stanford.edu, stanf...@googlegroups.com, dawn-i...@lists.stanford.edu
Reminder that this starts in 5 minutes!!
