How is Manopt's gradient descent on manifolds connected to Amari's natural gradient?


Oleg Kachan

Mar 13, 2021, 5:04:03 AM
to Manopt
Dear all,

As far as I understand, Amari's natural gradient learning rule converts the Euclidean gradient \nabla into the Riemannian one \tilde{\nabla} by multiplying it by the inverse of the manifold's metric tensor matrix G:

w^{(k+1)} = w^{(k)} - \alpha \tilde{\nabla} L,
w^{(k+1)} = w^{(k)} - \alpha G^{-1} \nabla L.
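
For concreteness, a minimal numpy sketch of this update (the loss and the position-dependent metric G(w) below are illustrative choices of mine, not taken from Amari's paper):

import numpy as np

# Illustrative loss L(w) = 0.5 * ||w||^2 with Euclidean gradient \nabla L(w) = w.
def loss_grad(w):
    return w

# Illustrative position-dependent metric: G(w) is diagonal and positive definite.
def metric(w):
    return np.diag(1.0 + w**2)

w = np.array([2.0, -1.0])
alpha = 0.1
for _ in range(100):
    egrad = loss_grad(w)
    # Natural-gradient step: \tilde{\nabla} L = G(w)^{-1} \nabla L, via a linear solve.
    ngrad = np.linalg.solve(metric(w), egrad)
    w = w - alpha * ngrad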

How does that connect to the "take a step in the tangent space, then retract to the manifold" approach described by Edelman, Absil, and Boumal? Could someone explain?

Also, reading Nicolas's book I understand the purpose of a retraction, but I am now trying to grasp the "conversion" egrad2rgrad, and it seems to be exactly what Amari's natural gradient does, only without a (perhaps explicit) retraction.

Thanks.

Nicolas Boumal

Mar 14, 2021, 4:14:55 AM
to Manopt
Hello,

In Amari's setting, the manifold (that is, the set over which we optimize) is a linear space. For example, it is just R^n. That manifold is turned into a Riemannian manifold by defining a Riemannian metric (an inner product which can vary as a function of the point x on the manifold). That metric is described by a positive definite matrix G(x).
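
Concretely (standard notation, nothing specific to Manopt), the inner product at x reads

\langle u, v \rangle_x = u^\top G(x) v,

with G(x) symmetric positive definite for every x.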

Since Amari's manifold is linear, we don't really need a retraction: we can move away from x along any chosen direction v, and x + tv will remain on the manifold. Contrast this with the situation where the manifold is nonlinear (e.g., a sphere, with x a point on the sphere and v a tangent vector to the sphere at x): there, x + tv generally leaves the manifold, and the retraction maps it back.
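
To make that concrete, here is a small numpy sketch of one gradient step on the unit sphere with the metric inherited from R^n (a standard example, not code taken from Manopt): egrad2rgrad is the orthogonal projection onto the tangent space at x, and the retraction is normalization.

import numpy as np

def egrad2rgrad(x, egrad):
    # With the metric inherited from R^n, the Riemannian gradient on the sphere is
    # the orthogonal projection of the Euclidean gradient onto the tangent space at x.
    return egrad - np.dot(x, egrad) * x

def retract(x, v):
    # Step in the tangent space, then map back to the sphere by normalizing.
    y = x + v
    return y / np.linalg.norm(y)

# One step for the illustrative cost f(x) = x^T A x restricted to the sphere.
A = np.diag([3.0, 1.0, 0.5])
x = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)
egrad = 2.0 * A @ x
x_next = retract(x, -0.1 * egrad2rgrad(x, egrad))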

This should clarify why there are retractions in one setting and not in the other.

Now, for the G(x)^{-1} in the expression for the gradient: that is exactly the same in Amari's setting and in the more general Riemannian optimization setting. (It takes a bit of effort to see it, but it's the same in the end.)
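
In a bit more detail (this is the standard argument, written in the notation of the question): the Riemannian gradient grad f(x) is defined by

\langle grad f(x), v \rangle_x = Df(x)[v] = \nabla f(x)^\top v for all directions v,

and with \langle u, v \rangle_x = u^\top G(x) v this becomes

(grad f(x))^\top G(x) v = \nabla f(x)^\top v for all v,

hence grad f(x) = G(x)^{-1} \nabla f(x), which is exactly Amari's natural gradient direction.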

Best,
Nicolas