Hi everyone,
This Saturday we will discuss a recently published paper from DeepMind: “Training Compute-Optimal Large Language Models”. The authors find that most current large pre-trained language models are significantly undertrained, and that better performance can be achieved with the same compute budget by training smaller models on more data. Based on this analysis, the paper introduces a new model, Chinchilla, which is several times smaller than comparable large language models yet outperforms them, achieving SOTA results.
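If you want to get a feel for the result before Saturday, here is a minimal Python sketch of the paper's headline scaling rule. The C ≈ 6·N·D compute approximation and the roughly 20-tokens-per-parameter ratio are simplifications of the paper's fitted curves, so treat the constants as approximate rather than exact:

    # The paper approximates training compute as C ~= 6 * N * D, where
    # N is the parameter count and D the number of training tokens, and
    # finds that loss is minimized when N and D grow together -
    # roughly D ~= 20 * N (an approximate ratio, not an exact constant).
    def chinchilla_optimal(compute_flops, tokens_per_param=20.0):
        """Split a compute budget into a roughly loss-minimizing (params, tokens) pair."""
        n_params = (compute_flops / (6.0 * tokens_per_param)) ** 0.5
        n_tokens = tokens_per_param * n_params
        return n_params, n_tokens

    # Gopher's budget of ~5.76e23 FLOPs yields ~70B parameters and ~1.4T tokens,
    # which is roughly the Chinchilla configuration reported in the paper.
    params, tokens = chinchilla_optimal(5.76e23)
    print(f"params ~ {params / 1e9:.0f}B, tokens ~ {tokens / 1e12:.1f}T")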
See you at 1pm at the ISTC conference hall.
Best,
Hrant