Hi everyone,
This Saturday we will discuss a recently published paper from DeepMind: “Training Compute-Optimal Large Language Models”. The authors find that most current large pre-trained language models are significantly undertrained, and that better performance can be achieved with the same compute budget by training smaller models on more data. Based on this analysis, the paper introduces a new model, Chinchilla, which is several times smaller than comparable large language models yet outperforms them, achieving SOTA results.
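If you want to get a feel for the result before Saturday, here is a minimal Python sketch of the paper's headline scaling rule. The C ≈ 6·N·D compute approximation and the roughly 20-tokens-per-parameter ratio are simplifications of the paper's fitted curves, so treat the constants as approximate rather than exact:

    # The paper approximates training compute as C ~= 6 * N * D, where
    # N is the parameter count and D the number of training tokens, and
    # finds that loss is minimized when N and D grow together -
    # roughly D ~= 20 * N (an approximate ratio, not an exact constant).
    def chinchilla_optimal(compute_flops, tokens_per_param=20.0):
        """Split a compute budget into a roughly loss-minimizing (params, tokens) pair."""
        n_params = (compute_flops / (6.0 * tokens_per_param)) ** 0.5
        n_tokens = tokens_per_param * n_params
        return n_params, n_tokens

    # Gopher's budget of ~5.76e23 FLOPs yields ~70B parameters and ~1.4T tokens,
    # which is roughly the Chinchilla configuration reported in the paper.
    params, tokens = chinchilla_optimal(5.76e23)
    print(f"params ~ {params / 1e9:.0f}B, tokens ~ {tokens / 1e12:.1f}T")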
See you at 1pm at the ISTC conference hall.
Best,
Hrant