The definition of embedded submanifold

93 views
Skip to first unread message

TIAN LIN

unread,
Apr 16, 2021, 5:02:36 AMApr 16
to Manopt
Dear Nicolas,

The Definition 3.6 in your book is very brief and easy to follow,  but I'm a little puzzled about the second condition, i.e., 

rank Dh(x) = k. 

You said in the book that it is added to avoid some h such as h(x) = x_1^2 - x_2^2 = 0. 

Is there some other explanations or geometric meaning itself? I'm trying to understand it.

Thank you very much in advance. 

TIAN LIN

unread,
Apr 16, 2021, 5:17:26 AMApr 16
to Manopt
Oh I find there is a wonderful explanation in Section 7.7, my fault.

Nicolas Boumal

unread,
Apr 19, 2021, 4:11:22 AMApr 19
to Manopt
Glad you made that connection! Indeed, one way to think about the rank condition is that we are requiring the gradients of the constraints to be linearly independent. Then, these gradients form a basis for the normal space, whose orthogonal complement is the tangent space. This is exactly the "linear independence constraint qualification" (LICQ) from the classical theory of constrained optimization (KKT etc.)

Best,
Nicolas

TIAN LIN

unread,
Jun 23, 2021, 11:48:10 PMJun 23
to Manopt
Dear Boumal, 

Long time no see, I wish you all the best! Besides, thank you for your introduction of the LICQ. Recently I'm puzzled about one issue, that is, in the conventionial Euclidean space, the theory of convex optimization, clearly suggests the convergence of the gradient descend algorithm of the convex functions. However, is there some similar theory of the manifold optimizatoin, i.e., the RCG algorithm? In my experience, sometimes, the RCG algorithm can converge to a global optimal but sometimes not. How can I analyze it? 

Or, in other words, if I have the objective function f(x), while x is on a typical manifold X, can I judge its convergence? (i.e., global optimal or local optimal) Is it depends on the formulation of f(x) and X?
Thanks in advance.

Nicolas Boumal

unread,
Jun 24, 2021, 4:49:45 AMJun 24
to Manopt
Dear Tian Lin,

Whether optimization algorithms such as (Riemannian) gradient descent converge or not, and whether they do so to a local or a global minimum or to something else, is an important but difficult question in general. There are indeed examples of optimization problems on manifolds for which we are able to find global optima, and this is an active research topic. See for example this review paper.

There is an analog of convex optimization on Riemannian manifolds called geodesic convexity. You can find an introduction to that topic in my book (last chapter), with some pointers to recent literature on the subject.

Best,
Nicolas

TIAN LIN

unread,
Aug 16, 2021, 12:13:40 AMAug 16
to Manopt
Dear Bounmal, long time no see, hope you everything well!.

I have learned that  {X|diag(XX^T) = I} is a typical manifold called Oblique manifold, where I denotes the identity matrix and diag represents the operation that making all non-diagonal elements to be zero. 

So I slightly mofidy it as:

A = {X|diag(XX^T) = M}, where M is a diagonal matrix, with its positive diagonal elements, which may not equal. It corresponds to the scenario that the norms of each column of matrix X are fixed but may be different.

B = {X|diag(XGX^T=I}, where G is a positive definate matrix.

Can you kindly check that if A and B are both embeded submanifolds?  I have tried to follow the idea presented in your Book, Chapter 7 to prove it, i.e., find a local define function h(X) and then prove it is full-rank, however, 
I'm not sure if I'm correct. 

Thank you for your time and kind help in advance.

Nicolas Boumal

unread,
Aug 16, 2021, 8:57:27 AMAug 16
to Manopt
Hello,

For A, I suggest you try the following: write A as the product of several spheres, where each sphere has its own radius (given by the square roots of M_{11}, M_{22}, ...). Make sure that a sphere of any positive radius is an embedded submanifold of R^n, then look through Chapter 3 to see what we know about products of manifolds. Specifically: a product of embedded submanifolds is an embedded submanifold. Of course, you can also just study a local defining function for A directly: just make sure that h maps into R^k with k which is the right value. Here, it should be equal to the number of spheres, since each sphere is one constraint.

For B: this is actually a Stiefel manifold, only it's defined with respect to an inner product which is different from the standard one, and also X is transposed. So yes, it's also an embedded submanifold. To verify this explicitly, read the section about Stiefel manifolds in Chapter 7, and adapt the reasoning slightly.

Hope this helps,
Nicolas

TIAN LIN

unread,
Aug 16, 2021, 9:28:45 AMAug 16
to Manopt
Dear Bounmal, thank you for your kind response! 

Yeah, I have read the Chapter 7, which helps a lot. However, the generalized Stiefel manifold is defined as {X|X^TBX=I}, but B is defined as {X|diag{X^TBX}=I}, a little different but I think it can follow the proving idea of A!

Thank you !

Besides,  I want to ask you a problem that troubled me for a long time.. if my optimization associated with two constraints, each of which respectively corresponds to a typical submanifold, for example:

min f(X)
s.t. |X_{ij}| = 1,  X^TBX=I

can I still use the manifold optimization to solve it? 

Or, is that the intersection of two submanifold is still a submanifold?

Is there some general approach to solve it? 

Nicolas Boumal

unread,
Aug 16, 2021, 9:57:28 AMAug 16
to Manopt
Ah, my bad, I missed the diag(..) in the definition of B. Then yes, your best option is to verify whether you have a local defining function. Another possibility is to use the fact that the positive definite matrix G has a Cholesky factorization: G = L^T L, with L invertible. This should allow you to verify that x -> Lx is a change of variable that maps between A and B (assuming M = I for A). From this you could also deduce that B is a manifold using the fact that A is a manifold.

As to your other question: no, in general the intersection of two manifolds is not a manifold. It can happen, but it's not always true. One sufficient (but not necessary) condition for the intersection M_1 \cap M_2 to be a manifold is to find a local defining function for M_1, say, h_1(x) = 0, and a local defining function for M_2, say, h_2(x) = 0, and to verify that the rank of [Dh_1(x) ; Dh_2(x)] is maximal -- that is, to check whether [h_1(x); h_2(x)] is a local defining function for M_1 \cap M_2.

Best,
Nicolas



TIAN LIN

unread,
Aug 16, 2021, 9:15:01 PMAug 16
to Manopt
Thank you so much!

I think I get it!

TIAN LIN

unread,
Aug 19, 2021, 3:44:04 AMAug 19
to Manopt
Dear Boumal, 

during using the manopt toolbox, I have a problem:

In your book, I learned that whether the current point is a stationary point can be determined by whether gradnorm is close to 0.

For my optimization problem, if I initialize randomly, then the gradnorm reduce to 0 quickly with iterations, which meets my expectation, and I believe that I obtain a local optimal.

However, if I initialize it by the solution of another conventional algorithm, then, the manopt can not work, i.e., no iteration will be operated, with the warning that "Last stepsize smaller than minimum allowed",

So I think the initialization resulting from the conventioanl algorithm may already be a local optimal. However, its gradnorm is much larger than 0, which puzzles me. 

Is this a usual case? Or there must be something wrong? 

(As the optimization function is quite easy so I think the derivation of Euclidean gradient is correct)

Thanks for your time,

Nicolas Boumal

unread,
Aug 19, 2021, 5:49:06 AMAug 19
to Manopt
Hello,

There could be a number of things happening here, it's hard to tell without more info. For example, is it certain that the point returned by that other algorithm is on the manifold? Does the issue arise with all of Manopt's solvers, or just with one (I'm guessing you used steepestdescent or conjugategradient)?

Best,
Nicolas

TIAN LIN

unread,
Aug 19, 2021, 11:34:51 AMAug 19
to Manopt
Thank you for your response,

I'm sure the point is on the manifold.

Yeah, I adopt  steepestdescent or conjugategradient, because it is a little complexity to compute the Hessian matrix.

So, is that case reasonable?

Nicolas Boumal

unread,
Aug 20, 2021, 2:52:19 AMAug 20
to Manopt
You can still try to run trustregions: Manopt will approximate the Hessian automatically with finite differences of the gradient.

Also, you can try to perturb the point slightly and initialize the manopt solver from there. Explicitly, if M is the manifold obtained from a factory (what you would put in problem.M):

x = ... your other algorithm... ;
v = M.randvec(x); % random tangent vector at x
t = 1e-4; % some small number; try various values
x_perturbed = M.retr(x, v, t);
x_opt = trustregions(problem, x_perturbed);

Maybe this helps? (Setting t = 0 comes down to not perturbing, i.e., initializing from x).

Best,
Nicolas

TIAN LIN

unread,
Aug 20, 2021, 10:51:17 PMAug 20
to Manopt
Thank you  Nicolas, thank you for your kind advise, I will try it.  

TIAN LIN

unread,
Aug 25, 2021, 11:31:58 AMAug 25
to Manopt
Dear Nicolas,

In Chapter 7, you show the general way to compute the Riemannian gradient, or more essentially, the projection to the tangent space. That is,

first compute the normal space and then decompose the vector as the sum of its tangent and normal parts.  Specifically , the equation 7.69~7.72 in your book.

However, can I obtain the projection of a vector y to the tangent space TxM by solving the following problem:

min_z  ||z-y||^2
s.t. z\in TxM. 

Namely, find the closest point on TxM to y.   Is this another way to get the projection? 

Thank you for your time and kind help.

Best,

Tian.

Nicolas Boumal

unread,
Aug 25, 2021, 4:59:47 PMAug 25
to Manopt
Yes, those two approaches are equivalent.

Best,
Nicolas
Reply all
Reply to author
Forward
0 new messages