Daily TMLR digest for Jul 24, 2022


TMLR

Jul 23, 2022, 8:00:08 PM
To: tmlr-anno...@googlegroups.com


New submissions
===============


Title: A Simple Convergence Proof of Adam and Adagrad

Abstract: We provide a simple proof of convergence covering both the Adam and Adagrad adaptive optimization algorithms when applied to smooth (possibly non-convex) objective functions with bounded gradients. We show that in expectation, the squared norm of the objective gradient averaged over the trajectory has an upper-bound which is explicit in the constants of the problem, parameters of the optimizer and the total number of iterations $N$. This bound can be made arbitrarily small: Adam with a learning rate $\alpha=1/\sqrt{N}$ and a momentum parameter on squared gradients $\beta_2=1-1/N$ achieves the same rate of convergence $O(\ln(N)/\sqrt{N})$ as Adagrad. Finally, we obtain the tightest dependency on the heavy ball momentum among all previous convergence bounds for non-convex Adam and Adagrad, improving from $O((1-\beta_1)^{-3})$ to $O((1-\beta_1)^{-1})$. Our technique also improves the best known dependency for standard SGD by a factor $1 - \beta_1$.

URL: https://openreview.net/forum?id=ZPQhzTSWA7
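
As a rough illustration (not taken from the submission itself), the hyperparameter scaling described in the abstract could be set up in PyTorch as follows; the model, the iteration budget N, and the heavy-ball momentum value 0.9 are placeholder assumptions, not values prescribed by the paper.

```python
import math
import torch

# Placeholder model and iteration budget N (both assumed for illustration).
model = torch.nn.Linear(10, 1)
N = 10_000  # total number of iterations

# Scaling suggested by the abstract's bound:
# learning rate alpha = 1/sqrt(N), squared-gradient momentum beta_2 = 1 - 1/N.
alpha = 1.0 / math.sqrt(N)
beta2 = 1.0 - 1.0 / N

# beta_1 = 0.9 is an arbitrary heavy-ball momentum choice for this sketch.
optimizer = torch.optim.Adam(model.parameters(), lr=alpha, betas=(0.9, beta2))
```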

---