Understanding egrad2rgrad


Ronny Bergmann

Sep 11, 2021, 3:17:49 AM
to Manopt
I am trying to understand egrad2rgrad, maybe document it better, but most of all do something similar in Julia. But I am getting stuck somewhere.

First I thought it performs a change of metric in the following sense

Let g_1 (as matrix G_1) and g_2 (G_2) be two metrics on TpM. Transforming the metric for me would mean that for all tangent vectors X and Y

g_1(X,Y) = g_2(BX,BY)

If we use an ONB in TpM and matrix representations (symmetric pos def) with respect to this basis, this changes X and Y into their basis coefficient vectors x and y, and the metric equation becomes (let's only consider the real case)

x^T G_1 y = x^T B^T G_2 B y

So since G_1 = B^T G_2 B and both are SPD, we can use Cholesky to solve this. Ok.
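
(As a quick sanity check of that Cholesky step, here is a standalone sketch in plain Julia with two made-up SPD metric matrices; nothing here is Manopt or Manifolds.jl code.)

```julia
using LinearAlgebra

# Two made-up SPD matrices representing g_1 and g_2 in an ONB of TpM.
G1 = [2.0 0.5; 0.5 1.0]
G2 = [3.0 1.0; 1.0 2.0]

# Cholesky factors: G1 = U1' * U1, G2 = U2' * U2.
U1 = cholesky(G1).U
U2 = cholesky(G2).U

# B = U2^{-1} * U1 satisfies B' * G2 * B = G1.
B = U2 \ U1
@assert isapprox(B' * G2 * B, G1; atol = 1e-12)
```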

Example: the SPD manifold with the linear affine (affine-invariant) metric as g_2 and the Euclidean metric as g_1 (i.e. a Euclidean gradient comes along and we want to get it into the affine metric). Then the above idea reads

g_E(X,Y) = tr(XY) = g_A(BX,BY) = tr(p^{-1} BX p^{-1} BY)

So the transformation B would just be B = p (the point we are at),
and in this derivation M.egrad2rgrad(p, X) = pX.

But comparing this to what Manopt does makes me wonder what I am getting wrong:
what Manopt actually does is M.egrad2rgrad(p, X) = pXp.

This would mean that we look for a transformation of X (by some f(X)) such that

g_E(X,Y) = tr(XY) = g_A(f(X),Y) = tr(p^{-1} f(X) p^{-1} Y)

holds for all tangent vectors Y? Then here, surely f(X) = pXp as in the code
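
(A quick numerical check of this condition with random matrices, as a standalone plain-Julia sketch:)

```julia
using LinearAlgebra

# Made-up SPD point p and symmetric "tangent vectors" X and Y.
A = randn(3, 3); p = A * A' + 3I
X = Symmetric(randn(3, 3)); Y = Symmetric(randn(3, 3))

fX = p * X * p   # candidate f(X) = pXp

# g_E(X, Y) = tr(XY) should equal g_A(f(X), Y) = tr(p^{-1} f(X) p^{-1} Y).
@assert isapprox(tr(X * Y), tr(inv(p) * fX * inv(p) * Y); atol = 1e-8)
```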

But why? I thought the transformation should be the first one, but I do not see why it should be the second (i.e. the one that is implemented). What am I missing? And is there a reference/literature/documentation for why egrad2rgrad does what it does?

Best
Ronny




Nicolas Boumal

Sep 11, 2021, 4:01:08 AM
to Manopt
Hello Ronny,

For M an embedded submanifold of a Euclidean space E, the logic is as follows.
Note: I'm not requiring that M be a Riemannian submanifold of E: M can have whatever Riemannian metric you want.
  1. Let \bar f : E -> R be a smooth function in the embedding space E. Since E is Euclidean, that function has a gradient grad \bar f.
  2. Restrict \bar f to M so as to get a function f : M -> R. That function is smooth and it has a Riemannian gradient grad f.
  3. At a point x in M, we now have two objects: grad \bar f(x) is a vector in E, and grad f(x) is a vector in T_x M.
  4. The two are related. In particular, for all v in T_x M (which is a subspace of E), we must have <grad f(x), v>_x = D f(x)[v] = D \bar f(x)[v] = <grad \bar f(x), v>_E.
  5. In Manopt, the function M.egrad2rgrad is the map which transforms grad \bar f(x) into grad f(x).
  6. There may be some subtleties related to representation (that is, to choices about how we numerically encode ambient and tangent vectors): you can see this at play in rotationsfactory for example.
Such considerations are discussed in my book, chapters 3 and 7 in particular.
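
To make this concrete in the simplest case, here is a small standalone sketch (plain Julia, not Manopt code) for the unit sphere with the metric inherited from the embedding; in that Riemannian-submanifold situation egrad2rgrad is just the orthogonal projection onto the tangent space:

```julia
using LinearAlgebra

# \bar f(x) = x' * A * x on R^3, restricted to the unit sphere S^2.
A = Symmetric(randn(3, 3))
egrad(x) = 2 * A * x                      # Euclidean gradient of \bar f

# Riemannian submanifold: grad f(x) = Proj_x(grad \bar f(x)).
proj(x, v) = v - (x' * v) * x             # orthogonal projection onto T_x S^2
rgrad(x) = proj(x, egrad(x))

x = normalize(randn(3))
@assert isapprox(x' * rgrad(x), 0.0; atol = 1e-12)   # result is tangent at x
```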

When M is not embedded in E, there is usually still a natural notion of what egrad2rgrad should do (for example, quotient manifolds for which the total space is embedded in a Euclidean space are explained in chapter 9 of my book, and that's how egrad2rgrad in grassmannfactory is designed).

Let me know if this leaves any questions open.

Best,
Nicolas

Ronny Bergmann

Sep 11, 2021, 4:47:03 AM
to Manopt
Ah thanks for the fast response.
We plan to do something _like that_ in Manifolds as well, and we are currently checking what the best model for this is (especially how to design the steps). I was confusing changing the metric with your step 4. If one has already projected the gradient (or speaks about grad \bar f after projection), then your step 4 can be seen as changing the Riesz representer of the differential (representing the differential in another metric), while the first function I wrote applies the change to _both_ arguments of the metric.

Subtle but interesting difference :)

We do not yet have an abstract model of quotient manifolds, so for now this might not be completely covered by the path we pursue, but I hope we are generic enough to easily include it later.

Thanks for the explanation.

Ronny

Ronny Bergmann

Sep 11, 2021, 4:48:09 AM
to Manopt
Oh, addendum: for the change in the rotations manifold – sure, that is basically a parallel transport into the tangent space at the identity, since tangent vectors there are represented in the Lie algebra. Something like that we already have in mind to cover in our generic implementation.
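
(For reference, a minimal standalone sketch of that representation detail, in plain Julia and not the actual rotationsfactory code: with tangent vectors at R stored as skew-symmetric matrices in the Lie algebra, one common convention for the conversion from an ambient gradient looks like this; the check below only verifies tangency.)

```julia
using LinearAlgebra

skew(M) = (M - M') / 2

# Made-up orthogonal point R and ambient gradient G at R.
R = Matrix(qr(randn(3, 3)).Q)
G = randn(3, 3)

Omega = skew(R' * G)    # tangent vector represented in the Lie algebra
X = R * Omega           # the same tangent vector in the embedding R^{3x3}

# X is indeed tangent at R: R' * X is skew-symmetric.
@assert isapprox(R' * X + (R' * X)', zeros(3, 3); atol = 1e-12)
```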

Nicolas Boumal

Sep 11, 2021, 5:13:49 AM
to Manopt
Right, if the Riemannian metric on T_x M is <u, v>_x = <u, Gv>_E and if Proj_x denotes orthogonal projection from E to T_xM with respect to the Euclidean metric <., .>_E, then we have for all v in T_x M:

<G grad f(x), v>_E = <grad f(x), Gv>_E = <grad f(x), v>_x = D f(x)[v] = D \bar f(x)[v] = <grad \bar f(x), v>_E = <Proj_x(grad \bar f(x)), v>_E

(This uses the fact that v = Proj_x(v) and that Proj_x is self-adjoint with respect to <., .>_E).

As this is true for all tangent v, we deduce that G grad f(x) = Proj_x(grad \bar f(x)), hence the general formula:

grad f(x) = G^{-1} Proj_x(grad \bar f(x))

In the case of positive definite matrices, <U, V>_X = <U, X^{-1} V X^{-1}>, so the "G" operator applies inv(X) on the left and on the right; the inverse of that is to multiply by X on the left and on the right. Also, Proj_X is identity because the tangent space is the whole embedding space (symmetric matrices); or equivalently: we can think of the embedding space as all square matrices, and think of Proj_X as the projector to symmetric matrices, i.e., Proj_X(M) = (M+M^T)/2.
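
Putting this together for the SPD example, here is a small standalone numerical check (plain Julia, not Manopt code) that X * sym(egrad) * X indeed satisfies the defining relation <grad f(X), V>_X = <grad \bar f(X), V>_E for symmetric V:

```julia
using LinearAlgebra

sym(M) = (M + M') / 2

# Made-up SPD point X, ambient gradient G, and symmetric test vector V.
A = randn(4, 4); X = A * A' + 4I
G = randn(4, 4)
V = Symmetric(randn(4, 4))

rgrad = X * sym(G) * X      # inverse metric operator applied to Proj_X(G) = sym(G)

lhs = tr(inv(X) * rgrad * inv(X) * V)     # <rgrad, V>_X, affine-invariant metric
rhs = tr(G' * V)                          # <G, V>_E, Frobenius inner product
@assert isapprox(lhs, rhs; atol = 1e-8)
```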

Bamdev Mishra

Sep 11, 2021, 5:51:53 AM
to Nicolas Boumal, Manopt
Hello, Ronny,

Apart from the excellent explanation by Nicolas, please find a different take on computing the rgrad from the egrad. In particular,
rgrad = argmin_{xi \in T_x M}  -<xi, egrad> + 0.5 g_x(xi, xi),
where g_x is the Riemannian metric and <.,.> is the Euclidean inner product.

For quotient manifolds, rgrad = argmin_{xi \in V_x}  -<xi, egrad> + 0.5 g_x(xi, xi).
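
As a standalone illustration (plain Julia, made-up data, not toolbox code): for SPD matrices with the affine-invariant metric, the closed form X * sym(egrad) * X from Nicolas' derivation indeed minimizes this objective over the tangent space, which one can check by perturbing it:

```julia
using LinearAlgebra

sym(M) = (M + M') / 2

n = 3
A = randn(n, n); X = A * A' + n * I       # made-up SPD point
egrad = randn(n, n)                       # made-up Euclidean gradient

gX(U, V) = tr(inv(X) * U * inv(X) * V)    # affine-invariant metric
obj(xi) = -tr(egrad' * xi) + 0.5 * gX(xi, xi)

xistar = X * sym(egrad) * X               # closed-form Riemannian gradient

# The closed form should not be beaten by perturbations within the tangent space.
for _ in 1:100
    V = Symmetric(randn(n, n))
    @assert obj(xistar) <= obj(xistar + 0.01 * V) + 1e-10
end
```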

Regards,
Bamdev


Ronny Bergmann

Sep 11, 2021, 5:55:35 AM
to Manopt
Thanks for both of your answers – actually, projection is a separate function, so I really do just care about “changing the metric” (I assume I always have \bar f anyway, since we have a projection function already).

The minimisation approach looks nice; we will incorporate that once we have abstract models of quotient manifolds, for sure.

Ronny Bergmann

Sep 13, 2021, 12:45:28 AM
to Manopt
As a summary, we now have a PR that implements what I sketched here (it is still missing some polishing, for example to be able to work without memory allocations).

The first one I described is now called `change_metric`,
i.e. both arguments are affected by the change and we compute the new tangent vector with respect to this (I am not sure yet where this is useful),

and we now have (in this PR, not yet finished) `change_gradient`.

This way we can reuse the projection, which is available as `project(M, p, X)`, to project X onto TpM (and to adapt the representation, like converting to the correct type or parallel transporting to the Lie algebra), and egrad2rgrad will become, in Manopt.jl, `change_gradient(M, EuclideanMetric(), p, project(M, p, egrad))`.

The call also illustrates that if the embedding has a different metric (than the Euclidean one), this can also be covered (given someone implements the `change_gradient` function from that metric to the (implicit) metric of M).
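
To spell out the linear algebra behind the two operations as I understand them (the helper names below are hypothetical and not the actual Manifolds.jl signatures; everything is expressed via the metric matrices G1, G2 in a fixed ONB of TpM):

```julia
using LinearAlgebra

# Metrics g_1, g_2 on TpM given as SPD matrices G1, G2 in a fixed ONB;
# x is the coefficient vector of a tangent vector.
# (Hypothetical helper names, not the actual Manifolds.jl API.)

# "change the metric": find B with B' * G2 * B = G1 and return B * x,
# so that g_1(X, Y) = g_2(BX, BY) for all X, Y.
function change_metric_coeffs(G1, G2, x)
    B = cholesky(G2).U \ cholesky(G1).U   # B = U2^{-1} * U1
    return B * x
end

# "change the gradient": keep the differential, change its Riesz representer,
# i.e. return y with g_2(y, .) = g_1(x, .), that is y = G2^{-1} * G1 * x.
change_gradient_coeffs(G1, G2, x) = G2 \ (G1 * x)
```

With G1 the identity (the Euclidean case), the second one reduces to G2 \ x, which matches Nicolas' formula grad f = G^{-1} Proj(grad \bar f) once the projection has been applied.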

Thanks again for the help and comments. Once this PR is done, I might try to find the same generic route for the Hessian, but that might be even more involved, since one has to understand its chain rule when adapting/changing the metric.

Kind regards
Ronny

Nicolas Boumal

Sep 13, 2021, 2:15:47 AM
to Ronny Bergmann, Manopt
Hello Ronny,

That makes sense.

For the Hessian I'm quite interested to hear about your progress: this should necessarily involve not only the metric but also a kind of derivative of the metric, and I've found that the derivations are not so easy to go through in full generality.

The notion of "Hessian manifold" may help, as there people work out general formulas for R^n with a Riemannian metric which is given by the (positive definite) Hessian of a scalar field h on R^n.

Best, 
Nicolas 

Ronny Bergmann

Sep 22, 2021, 2:03:35 PM
to Manopt
I just finished the PR for the gradient conversion, and since we are also polishing our AD interface (coupling manifolds with ForwardDiff/ReverseDiff/FiniteDiff), the gradient (not yet the Hessian) will soon be available from AD (in the embedding)! Yay! Thanks for the help and comments here. It might still take a while until I can post the resulting tutorial, because it is a Pluto notebook and I now have to figure out how to include that in our documentation. A sneak preview (Pluto notebook code) can be found at


Where lines 134 & 137 define a gradient using forward differences in the tangent space (GetGradientFD), employing FiniteDiff.jl (so not implementing FD ourselves) – with an error of the order 10^-10,

And line 194 uses ReverseDiff.jl in the embedding to compute the (projected, here not converted, because we are on the sphere) gradient on S2 – numerically without error.
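
(For the sphere case, a minimal sketch of the idea in plain Julia, assuming ReverseDiff.jl is installed; this is not the notebook code itself:)

```julia
using LinearAlgebra
using ReverseDiff

# Cost in the embedding R^3; A is a made-up symmetric matrix.
A = randn(3, 3); A = (A + A') / 2
f(x) = dot(x, A * x)

p = normalize(randn(3))
egrad = ReverseDiff.gradient(f, p)        # Euclidean gradient from AD
rgrad = egrad - (p' * egrad) * p          # project onto T_p S^2

@assert isapprox(p' * rgrad, 0.0; atol = 1e-12)
```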

I will also try (besides rendering the markdown into docs) to provide an SPD example with actual conversion.

Again – thanks for all the help here. Next step: Hessians.