Meeting #137: [Offline at ISTC; 13:00!] Training Compute-Optimal Large Language Models

Hrant Khachatrian

May 6, 2022, 2:05:11 PM
to Machine Learning Reading Group Yerevan
Hi everyone,

This Saturday we will discuss a recently published paper from DeepMind: “Training Compute-Optimal Large Language Models”. The authors find that most large pre-trained language models are substantially undertrained: for a fixed compute budget, model size and the number of training tokens should be scaled in roughly equal proportion, so the same budget could have produced better models. Based on this analysis, the paper introduces Chinchilla, a significantly smaller model that nevertheless outperforms most of the currently available large language models, setting new SOTA results.
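
For those who want a quick preview before the meeting, here is a small back-of-the-envelope sketch in Python. The C ≈ 6·N·D FLOP approximation and the roughly 20-tokens-per-parameter ratio come from the paper; the helper function below is just an illustration, not the authors' code.

    import math

    def compute_optimal(flops, tokens_per_param=20.0):
        # Chinchilla rule of thumb: C = 6 * N * D, with D = tokens_per_param * N,
        # so the compute-optimal parameter count is N = sqrt(C / (6 * tokens_per_param)).
        n_params = math.sqrt(flops / (6.0 * tokens_per_param))
        n_tokens = tokens_per_param * n_params
        return n_params, n_tokens

    # Chinchilla's training budget: ~5.9e23 FLOPs (6 * 70B params * 1.4T tokens)
    n, d = compute_optimal(5.88e23)
    print(f"params ~ {n / 1e9:.0f}B, tokens ~ {d / 1e12:.1f}T")  # -> ~70B, ~1.4T

For comparison, Gopher spends a similar budget on 280B parameters but only 300B training tokens, which is exactly the kind of allocation the paper argues is suboptimal.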

See you at 1pm at the ISTC conference hall.

Best,
Hrant