SWE-Glu SF: Super Convergence & One Cycle Learning Rate - Tomorrow (8.10)


sasha.hydrie

Aug 9, 2024, 12:44:10 PM
to SWE-Glu SF Papers Reading Group
Glu Enthusiasts,

Our meeting will be Saturday, August 10th, 2:30 PM @ 619 Oak Street.
Expect to be there about 1.5 hours, longer if discussion warrants.

This week's paper is "Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates" by Leslie N. Smith and Nicholay Topin (2017) https://arxiv.org/abs/1708.07120

Why this is cool:

1. Jeremy Howard of fast.ai likes it; this technique is the basis of fastai's famous fit_one_cycle and now torch's OneCycleLR.
2. I recently used this technique to train 250k neural networks in a second each with only my Mac's GPU
3. Graphs like this:

[Attached image: Screenshot 2024-08-09 at 09.42.31.jpg]
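
For anyone who hasn't seen the schedule before, here is a rough sketch of the one-cycle idea in plain Python: ramp the learning rate up linearly to a large peak, then bring it back down. This is only an illustration, not the paper's exact policy and not what fit_one_cycle or OneCycleLR do internally (those also cycle momentum and use different annealing). The function name `one_cycle_lr` and its parameters `start_frac` and `div` are made up for this example.

```python
def one_cycle_lr(step, total_steps, max_lr, start_frac=0.3, div=25.0):
    """Learning rate for a 0-indexed `step` under a simple linear one-cycle schedule.

    Hypothetical helper for illustration only:
    - starts at max_lr / div,
    - rises linearly to max_lr over the first `start_frac` of training,
    - falls linearly back to max_lr / div by the final step.
    """
    base_lr = max_lr / div
    last_step = total_steps - 1
    peak_step = int(start_frac * last_step)

    if step <= peak_step:
        # Warm-up phase: interpolate from base_lr up to max_lr.
        t = step / max(peak_step, 1)
        return base_lr + t * (max_lr - base_lr)

    # Cool-down phase: interpolate from max_lr back down to base_lr.
    t = (step - peak_step) / max(last_step - peak_step, 1)
    return max_lr + t * (base_lr - max_lr)


# Build the full schedule for a 100-step run with a peak learning rate of 1.0.
schedule = [one_cycle_lr(s, total_steps=100, max_lr=1.0) for s in range(100)]
```

In a training loop you would look up `schedule[step]` (or call the function) before each optimizer update and set the optimizer's learning rate to that value.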

Thanks to the one of you that made it out to last week's inaugural session!

Best,
Cheikh and Sasha

P.S. If you are somehow reading this email but are not on our listserv, join it here.
If you are on our listserv, send it to your friends.