Question about underlying optimization algorithm


Krittin Pachtrachai

unread,
Feb 19, 2015, 4:15:48 PM2/19/15
to manopt...@googlegroups.com
I am working on a project which involves optimization on manifolds. According to this link http://www.manopt.org/manifold_documentation_rotations.html, an example of solving the Procrustes problem with the Manopt toolbox is provided. The cost function is a scalar, and the gradient is a matrix (4 by 4), which is the same as what happens in my project, except that the size is different.

I am trying to optimize the sum over i = 1:n of trace(Z'*Z), where Z is a function of a real vector alpha of size 5. Z is obtained as Z = logm(A_i*X), where logm is the matrix logarithm. My attempt to compute the gradient yields a 4x4 matrix, and I am not sure how to proceed from there. Can you tell me, or point me to, the optimization algorithm that you are using?

kama...@gmail.com

unread,
Feb 20, 2015, 12:25:34 AM2/20/15
to manopt...@googlegroups.com
FYI: A_i is pre-collected data from my experiment; it is a 4x4 matrix. X(alpha) is also a 4x4 matrix, with a real input vector alpha of size 5. So when I say Z is a function of the vector alpha, the dependence actually goes through X.

BM

unread,
Feb 20, 2015, 6:41:04 AM2/20/15
to manopt...@googlegroups.com
Hello Krittin, 

I am trying to understand the problem that you have. 

Could you write the cost function in terms of X, instead of going through Z? I guess X is the variable for you.
Also, what are the dimensions of the matrices A_i?

Also, it would be appreciated if you could write the optimization problem precisely, e.g., "minimizing the cost function ... subject to the constraint ...". 

Regards,
BM

 

Nicolas Boumal

unread,
Feb 20, 2015, 6:50:45 AM2/20/15
to manopt...@googlegroups.com
I second Bamdev; it would be nice to have a clearer view of your question.

In particular, F is a function of what? The gradient of F will be an object (a matrix, typically) of the same size as the input of F.

Best,
Nicolas

kama...@gmail.com

unread,
Feb 20, 2015, 7:35:34 AM2/20/15
to manopt...@googlegroups.com
Sorry for the ambiguity of my question and my English.

Here is a clearer version.

I am trying to optimize the cost

F(alpha) = sum_{i=1}^{N} trace( logm(A_i*X)' * logm(A_i*X) )

alpha = input vector of size 5x1
logm = matrix logarithm
A_i = my data, in the form of 4x4 matrices (so I have A_1, A_2, ..., A_N)
X = a 4x4 matrix, which is a function of alpha

Thus, I am trying to find the best alpha, the one that minimizes the cost F.
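
For concreteness, here is a minimal MATLAB sketch of this cost. The map from alpha to X is not specified here, so x_of_alpha is a hypothetical placeholder for it:

% Sketch of the cost F(alpha). A is a cell array of the 4x4 data
% matrices A_1, ..., A_N; x_of_alpha is a hypothetical user-supplied
% map from the 5x1 vector alpha to the 4x4 matrix X.
function f = F_cost(alpha, A, x_of_alpha)
    X = x_of_alpha(alpha);
    f = 0;
    for i = 1:numel(A)
        Z = logm(A{i} * X);        % matrix logarithm of A_i * X
        f = f + trace(Z' * Z);     % equals norm(Z, 'fro')^2
    end
end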

Here is my attempt. I rewrite F(alpha) as g'*g, where g is the vector [g_1, g_2, ..., g_N]' and each g_i is the square root of the corresponding trace term. The gradient is then 2*J'*g, where J is the Jacobian matrix of g with respect to alpha.

dg/d(alpha) = dg(epsilon*alpha)/d(epsilon) as epsilon -> 0

Thus, we can obtain the derivative using the chain rule:

dg(epsilon*alpha)/d(epsilon) = dg/d(trace term) * d(trace term)/dX * dX/d(epsilon) as epsilon -> 0

The first term of the derivative will be a scalar.
The second term will be a 4x4 matrix.
The third will also be a 4x4 matrix.

I am not sure whether I am computing the derivative correctly, because in my understanding the Jacobian should be an Nx5 matrix. Do I need another step to get from these terms to that?
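
One way to sanity-check any candidate gradient (a sketch, not part of the original derivation): compare it against central finite differences. Whatever the formula, the result must be a 5x1 vector, matching alpha:

% Finite-difference check of a candidate 5x1 gradient g_candidate at
% alpha0, using the F_cost sketch above. This is only a numerical
% sanity check, not an efficient way to compute the gradient.
h = 1e-6;
g_fd = zeros(5, 1);
for k = 1:5
    e_k = zeros(5, 1);  e_k(k) = 1;
    g_fd(k) = (F_cost(alpha0 + h*e_k, A, x_of_alpha) ...
             - F_cost(alpha0 - h*e_k, A, x_of_alpha)) / (2*h);
end
disp(norm(g_fd - g_candidate))     % should be close to zero

(For problems set up in Manopt's problem structure, the toolbox also provides a checkgradient tool for the same purpose.)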

Thank you very much for your patience.

kama...@gmail.com

unread,
Feb 20, 2015, 7:37:55 AM2/20/15
to manopt...@googlegroups.com
Oh, and there is no constraint on the vector alpha.

BM

unread,
Feb 21, 2015, 8:03:52 AM2/21/15
to manopt...@googlegroups.com
Hmm...

It seems that your variable is alpha and not X. Therefore the gradient should have the dimension of alpha, which is a column vector of size 5.
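
To make that concrete: if G denotes the 4x4 Euclidean gradient of F with respect to X, and E_k = dX/d(alpha_k), then by the chain rule the k-th entry of the gradient with respect to alpha is the Frobenius inner product trace(G'*E_k). A minimal sketch, where dX_dalpha is a hypothetical helper returning each 4x4 partial derivative of X:

% Collapse the 4x4 gradient G = dF/dX into the 5x1 gradient w.r.t. alpha.
g = zeros(5, 1);
for k = 1:5
    E_k = dX_dalpha(alpha, k);     % hypothetical: 4x4 partial dX/d(alpha_k)
    g(k) = trace(G' * E_k);        % Frobenius inner product <G, E_k>
end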

What is the relationship between alpha and X?

kama...@gmail.com

unread,
Feb 22, 2015, 9:38:28 PM2/22/15
to manopt...@googlegroups.com
Sorry for the delayed reply.

Actually, this is an optimization problem on the manifold SE(3), but I have designed the experiment such that the translation along the z-direction is always 0, so I ignore it in the optimization. That is why alpha has only 5 entries instead of 6.

And yes, A_i is also a rigid transformation matrix.

Nicolas Boumal

unread,
Feb 23, 2015, 3:19:16 AM2/23/15
to manopt...@googlegroups.com
Hello again,

Perhaps these posts could be a useful start:

About computing gradients of matrix functions in general: https://groups.google.com/d/msg/manopttoolbox/uBdpvA0vVx4/QiLBwMittyYJ
This will walk you through computing the directional derivatives first, then obtaining the gradient from them.

In going through the main step, you will find that you need to compute directional derivatives of matrix logarithms. This is a frequent difficulty, for which you can find help here:
(You will actually need the adjoint of the derivative of the matrix logarithm; in many cases, it is equal to the derivative itself. That will probably be the case for you too, but it is to be checked.)
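
For reference, one standard way to obtain the directional derivative of logm numerically (a sketch, not taken from the linked post) is the block-matrix identity for Fréchet derivatives of matrix functions:

% Directional derivative of the matrix logarithm at X along H, via the
% 2n x 2n block trick: the top-right block of logm([X H; 0 X]) equals
% Dlogm(X)[H]. Valid when X has no eigenvalues on the closed negative
% real axis, so that the principal logarithm is defined.
n = size(X, 1);
M = logm([X, H; zeros(n), X]);
Dlog = M(1:n, n+1:2*n);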

Hope this can help you get started.

Best,
Nicolas