SWE-Glu SF: Scaling Test Time Compute, Backtracking

Cheikh Fiteni

Sep 14, 2024, 11:36:35 PM
to SWE-Glu SF Papers Reading Group

Glu evening,
 

After a week's hiatus, SWE-Glu has returned.


Our next meeting will be Sunday, September 15th, 2:30 PM @ 848 Divisadero Street. Since then, OpenAI has released a new flagship reasoning model whose inference-focused compute structure is said to offer a “new dimension to scaling.”


Following the format from last session, we will center our discussion around the Interconnects blog post “OpenAI’s Strawberry, LM self-talk, inference scaling laws, and spending more on inference,” and everyone is encouraged to come in with further readings on any part they found most interesting.

For those who might enjoy a paper, a good base could be “Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters” by Snell et al. (2024) https://arxiv.org/abs/2408.03314, with specific shoutouts to backtracking, self-talk, and recursive introspection.
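
For intuition on what “scaling test-time compute” can look like mechanically, here is a minimal best-of-N sketch (a toy illustration of the general idea, not Snell et al.'s method; generate_candidate and verify are hypothetical stubs standing in for a real model and a real verifier/reward model):

import random

def generate_candidate(prompt: str, rng: random.Random) -> str:
    # Stand-in for an LLM sampling call; returns a fake "solution".
    return f"solution-{rng.randint(0, 9)} for: {prompt}"

def verify(candidate: str) -> float:
    # Stand-in for a learned verifier / reward model; here, a toy score in [0, 1).
    return (sum(ord(c) for c in candidate) % 100) / 100.0

def best_of_n(prompt: str, n: int, seed: int = 0) -> str:
    # More samples (larger n) = more inference compute = better expected score.
    rng = random.Random(seed)
    candidates = [generate_candidate(prompt, rng) for _ in range(n)]
    return max(candidates, key=verify)

if __name__ == "__main__":
    for n in (1, 4, 16, 64):
        best = best_of_n("prove the triangle inequality", n)
        print(f"n={n:3d}  score={verify(best):.2f}  {best}")

Backtracking-style methods extend this picture by searching over partial solutions step by step and rewinding to an earlier state when the verifier score drops, rather than only scoring complete answers.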


Practical summaries of some of the above are also covered in the Planning section of Lilian Weng’s excellent “LLM Powered Autonomous Agents.”


Why inference-time compute is cool:

  1. Basis for OpenAI’s o1 model 🍓

  2. Builds on the planning work Noam Brown et al. did on Cicero (paper, code), which was among our earliest motivations for getting into deep learning.

  3. Serves as an inflection point out of deep RL’s “trough of disillusionment” (see the excellent Deep Reinforcement Learning Doesn't Work Yet)


To a new pace of progress,
Cheikh and Sasha