I have a basic question. Say we have an objective f(L), where L is a d x l matrix constrained to satisfy L^T L = I, i.e. L lies on the Stiefel manifold St(l,d). Now suppose f is invariant under right-multiplication of L by any orthogonal matrix O of size l x l, that is, f(LO) = f(L). Then the correct geometry to use is a quotient manifold for L, namely the Grassmannian G(l,d).
In https://epubs.siam.org/doi/pdf/10.1137/080731359 it is mentioned that if we ignore the invariance, the solutions are not isolated. This is not harmful for simple gradient schemes, but it greatly affects the convergence of second-order methods.
My questions are:
1. Is it conceptually wrong to ignore the invariance?
2. Empirically, does it matter much if we ignore the invariance, at least in some problems? At the end of the day, all we want is an orthogonal matrix L, right? (At least in problems where we already get good convergence using GD or CGD.)
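To spell out what I mean by the invariance and the non-isolated solutions (this is my own summary of the setup, so please correct me if it is off):

```latex
\begin{align*}
f(LO) &= f(L) \quad \text{for all } O \in O(l), \\
[L] &= \{\, LO : O \in O(l) \,\}, \qquad G(l,d) \cong St(l,d)/O(l), \\
L^\star \text{ optimal on } St(l,d) &\;\Longrightarrow\; L^\star O \text{ optimal for every } O \in O(l).
\end{align*}
```

So on St(l,d) each minimizer sits in a set of dimension l(l-1)/2 (the dimension of O(l)) along which f is constant. I assume this is what makes the Hessian singular at the solution and hurts second-order methods.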
Regards
Ujjal
Hi Nicolas,
Thanks a lot for the detailed answer. I have a few more questions:
1. (Apologies if this already exists; I could not find it.) Do you have a separate manifold utility (like the product manifold) for implementing quotient manifolds?
2. In particular, I have the following problem:
f(B,C) = x^T*B*C^{1/2}*C^{1/2}*B^T*x, where x is a d x 1 vector, B is a d x l matrix constrained to the Stiefel manifold St(l,d), and C is an l x l matrix.
I have yet another vector to learn; let's call it v, an m x 1 vector. Hence, my overall search space in this case would be a product manifold defined as:
tuple.v = euclideanfactory(m);
tuple.B = stiefelfactory(d, l);
tuple.C = euclideanfactory(l,l);
M = productmanifold(tuple);
But f(BQ, Q^T*C*Q) = f(B,C) for all Q in the orthogonal group O_l. Hence, we are dealing with a quotient manifold, right?
3. Assuming the answer to (2) is yes, how do we implement this in Manopt?
4. What if I restrict C to be a diagonal matrix? Then I can simply learn a vector c representing the diagonal of C, and I would like to implement it as follows:
tuple.v = euclideanfactory(m);
tuple.B = stiefelfactory(d, l);
tuple.c = euclideanfactory(l);
M = productmanifold(tuple);
Is it correct to do so?
5. Would all my arguments in (2) and (3) above hold in the same way if I constrain C to be on the SPD manifold, with the following implementation:
tuple.v = euclideanfactory(m);
tuple.B = stiefelfactory(d, l);
tuple.C = sympositivedefinitefactory(l);
M = productmanifold(tuple);
Am I right?
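For reference, here is a small numerical check of the invariance I claim in (2). It is only a sketch under my own assumptions: I take C symmetric positive definite so that C^{1/2}*C^{1/2} = C and f(B,C) reduces to x'*B*C*B'*x, and the sizes d and l are arbitrary. It uses plain MATLAB only, no Manopt calls.

```matlab
% Numerical check of the invariance f(B*Q, Q'*C*Q) = f(B, C).
% Assumptions (for illustration only): C is symmetric positive definite,
% so C^{1/2}*C^{1/2} = C and f(B, C) = x'*B*C*B'*x; d and l are arbitrary.
d = 6;
l = 3;
x = randn(d, 1);

[B, ~] = qr(randn(d, l), 0);   % economy QR: B is d-by-l with B'*B = I_l
R = randn(l);
C = R*R' + l*eye(l);           % random symmetric positive definite matrix
[Q, ~] = qr(randn(l));         % random l-by-l orthogonal matrix

f = @(B, C) x'*B*C*B'*x;       % cost with C^{1/2}*C^{1/2} replaced by C

% B*Q is still on the Stiefel manifold, and the cost value is unchanged.
stiefel_err = norm((B*Q)'*(B*Q) - eye(l), 'fro');
gap = abs(f(B*Q, Q'*C*Q) - f(B, C));
fprintf('||(BQ)''(BQ) - I||_F     = %g\n', stiefel_err);
fprintf('|f(BQ, Q''CQ) - f(B, C)| = %g\n', gap);
```

Both printed quantities should be at the level of round-off, which is what I mean by saying that (B, C) and (BQ, Q^T*C*Q) cannot be distinguished by the cost.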
Kindly help.
Regards
Ujjal