First of all, many thanks for such an excellent toolbox. It really helps me with what I'm doing.
In the previous post, Florian asked about the gradient of a cost function of this form:
F(X,A,B) = Trace( log(X^0.5 A X^0.5) log(X^0.5 B X^0.5) ), where X, A and B are symmetric positive definite (SPD) matrices, X^0.5 is the matrix square root of X, and log denotes the matrix logarithm.
My question is rather simple: what is the gradient of F(X,A,B) with respect to A? Having looked through some of the material that you provided in the previous post, I found that it is close to the gradient of the squared geodesic distance, but not exactly the same.
Many thanks,
Korsuk
Thank you so much for the material. It's very helpful. I implemented it and it's working fine.
I have two further questions:
1) When the trust-region method is used, it always shows a message saying that the Hessian is not provided and that a numerical approximation is used instead. Could you point me to how to derive the Hessian of
g(A, B, C) = Trace( log(A^-0.5 B A^-0.5) log(A^-0.5 B A^-0.5) ) + Trace( log(A^-0.5 B A^-0.5) log(A^-0.5 C A^-0.5) )
with respect to B.
2) The literature that you pointed to mentions the Log-Euclidean metric. Since I have no background in differential geometry, it is not clear to me how this metric works. How do I find the gradient and Hessian of g(A,B,C) with respect to B in this metric?
Many thanks,
Korsuk
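For anyone who wants to check a candidate gradient or Hessian of g numerically, here is a minimal sketch that only evaluates g(A, B, C) as written above. It does not derive the Hessian. The random test matrices and all variable names are made up for illustration. The last lines check that the first term of g equals the squared affine-invariant distance between A and B, computed from the generalized eigenvalues of the pair (B, A).

```matlab
% Evaluate g(A,B,C) = Trace(LB*LB) + Trace(LB*LC), where
% LB = log(A^-0.5 B A^-0.5) and LC = log(A^-0.5 C A^-0.5).
n = 4;
P = randn(n); A = P*P' + n*eye(n);   % illustrative random SPD matrices
P = randn(n); B = P*P' + n*eye(n);
P = randn(n); C = P*P' + n*eye(n);

Ah = sqrtm(A);                       % A^0.5
SB = Ah\B/Ah; SB = (SB + SB')/2;     % A^-0.5 B A^-0.5, symmetrized against round-off
SC = Ah\C/Ah; SC = (SC + SC')/2;     % A^-0.5 C A^-0.5, symmetrized against round-off
LB = logm(SB);
LC = logm(SC);

term1 = trace(LB*LB);
term2 = trace(LB*LC);
g = term1 + term2;

% The first term is the squared distance: sum of squared logs of eig(A\B).
d2 = sum(log(eig(B, A)).^2);
fprintf('g = %.6f, first term = %.6f, dist^2 = %.6f\n', g, term1, d2);
```

A function handle built from these lines can then be compared against finite differences of any formula under test.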
Hopefully, if I manage to work out the formula for the Hessian, I will post it here.
The Log-Euclidean metric seems promising for solving my original problem efficiently. The number of parameters involved in my cost function is quite high, so I hope that this alternative will help reduce the computational complexity of the optimisation process. It will take me some time to understand the concept of the Log-Euclidean metric, and I'm not quite sure how to perform optimisation on the space of symmetric positive definite matrices. Can you give me some pointers?
Really appreciate your help.
Best wishes,
Korsuk
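As a rough illustration of the idea, and assuming the usual definition of the Log-Euclidean metric, an SPD matrix X can be represented by its symmetric logarithm S = logm(X). Distances and means are then computed in the vector space of symmetric matrices, and expm maps back to SPD matrices. The sketch below uses made-up test matrices and variable names, and it is not taken from the toolbox.

```matlab
% Log-Euclidean sketch: work with S = logm(X) in the space of symmetric matrices.
n = 4;
P = randn(n); X = P*P' + n*eye(n);   % illustrative random SPD matrices
Q = randn(n); Y = Q*Q' + n*eye(n);

SX = logm(X); SX = (SX + SX')/2;     % symmetric logarithm of X
SY = logm(Y); SY = (SY + SY')/2;     % symmetric logarithm of Y

% Log-Euclidean distance: Euclidean distance between the logarithms.
d_le = norm(SX - SY, 'fro');

% Log-Euclidean mean of X and Y: average the logarithms, then map back.
SM = (SX + SY)/2;
M = expm(SM); M = (M + M')/2;

% Any symmetric S gives an SPD matrix expm(S), so an optimiser can work
% on S without constraints and recover X at the end.
Xrec = expm(SX);
LM = logm(M); LM = (LM + LM')/2;
d_half = norm(LM - SX, 'fro');       % distance from the mean to X
fprintf('d_le = %.6f, distance mean-to-X = %.6f\n', d_le, d_half);
```

The mean sits halfway between X and Y in this metric, which is why the second printed value should be half of the first.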
Can you please check if the Riemannian distance in sympositivedefinitefactory.m is correct?
M.dist = @dist;
function d = dist(X, Y)
d = norm(logm(X\Y), 'fro');
end
It does not give the same value as
d = norm(logm(sqrtm(X)\Y/sqrtm(X)), 'fro')
or
d = sqrt(sum(log(eig(X\Y)).^2))
Best wishes,
Korsuk
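The discrepancy can be reproduced with a few lines, offered here as a sketch only. The test matrices and variable names are made up for illustration. Because X\Y is not symmetric in general, the Frobenius norm of logm(X\Y) can exceed the root sum of squared log-eigenvalues, while the symmetric form and the trace form should agree with the eigenvalue form up to round-off.

```matlab
% Compare candidate expressions for the distance between SPD matrices X and Y.
n = 5;
P = randn(n); X = P*P' + n*eye(n);   % illustrative random SPD matrices
Q = randn(n); Y = Q*Q' + n*eye(n);

L = logm(X\Y);                       % logarithm of a nonsymmetric matrix
Xh = sqrtm(X);                       % X^0.5
S = Xh\Y/Xh; S = (S + S')/2;         % X^-0.5 Y X^-0.5, symmetrized against round-off

d_fro = norm(L, 'fro');                        % expression quoted from the factory
d_sym = norm(logm(S), 'fro');                  % symmetric form
d_eig = sqrt(sum(log(real(eig(X\Y))).^2));     % eigenvalue form (real() drops round-off)
d_tr  = sqrt(real(trace(L^2)));                % trace form

fprintf('fro: %.6f  sym: %.6f  eig: %.6f  trace: %.6f\n', d_fro, d_sym, d_eig, d_tr);
```

On generic inputs the first number is larger than the other three, which match each other.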
M.inner = @(X, eta, zeta) trace( (X\eta) * (X\zeta) );
M.norm = @(X, eta) norm(X\eta, 'fro');

M = sympositivedefinitefactory(5)
X = M.rand()
Xdot = M.randvec(X);
sqrt(M.inner(X, Xdot, Xdot))
M.norm(X, Xdot)

norm_X(H) = sqrt(trace((X\H)^2))
dist(X, Y) = sqrt(trace(logm(X\Y)^2))
Exp_X(H) = X*expm(X\H)
Log_X(Y) = X*logm(X\Y)
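The four formulas above can be sanity-checked without the toolbox. The following is a minimal sketch with made-up test matrices, and the function handles simply mirror the formulas with X passed explicitly. It checks that Log_X undoes Exp_X, that dist(X, Y) equals norm_X(Log_X(Y)), and that norm(X\H, 'fro') is not smaller than the trace-based norm.

```matlab
% Sanity checks for the trace-based norm, distance, Exp and Log formulas.
n = 4;
P = randn(n); X = P*P' + n*eye(n);   % illustrative random SPD matrices
Q = randn(n); Y = Q*Q' + n*eye(n);
H = randn(n); H = (H + H')/2;        % symmetric tangent vector at X

norm_X = @(X, H) sqrt(real(trace((X\H)^2)));      % real() drops round-off only
dist   = @(X, Y) sqrt(real(trace(logm(X\Y)^2)));
Exp_X  = @(X, H) X*expm(X\H);
Log_X  = @(X, Y) X*logm(X\Y);

% Log_X(Exp_X(H)) should return H up to round-off.
err_roundtrip = norm(Log_X(X, Exp_X(X, H)) - H, 'fro');

% The distance should equal the norm of the logarithm at X.
err_dist = abs(dist(X, Y) - norm_X(X, Log_X(X, Y)));

% Frobenius norm of the nonsymmetric matrix X\H versus the trace-based norm.
n_fro = norm(X\H, 'fro');
n_tr  = norm_X(X, H);
fprintf('round trip: %.2e  dist check: %.2e  fro: %.6f  trace: %.6f\n', ...
        err_roundtrip, err_dist, n_fro, n_tr);
```

The last two printed values generally differ, which is consistent with sqrt(M.inner(X, Xdot, Xdot)) and M.norm(X, Xdot) disagreeing when the norm is defined through the Frobenius norm of X\eta.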