Euclidean Gradient not converging for L(Q)=|h Q Qᵀ g|² with Q=I−2vvᴴ (complex sphere)


Keyvan Aslansefat

unread,
Sep 4, 2025, 5:34:29 PMSep 4
to Manopt

I have a real-valued loss function as follows:

L(Q) = |h Q Qᵀ g|²,

where Q ∈ ℂ^{K×K}. I obtained the gradient of L with respect to Q. Moreover, the matrix Q itself can be written as Q = I − 2vvᴴ, where v ∈ ℂᴷ has unit norm, i.e., it lives on spherecomplexfactory. I calculated the gradient of L with respect to v* (treating v as constant), as implemented in gradFun below, but the optimization problem is not converging.

Does anybody have an idea about the problem? My code is as follows:

M = spherecomplexfactory(K, 1);
P.M = M;
P.cost = @(v) -costFun(hr, ht, v);      % negate: maximize |hr*Theta*ht|^2
P.egrad = @(v) -gradFun(P, hr, ht, v);
[v, ~] = conjugategradient(P, [], []);
checkgradient(P)

function c = costFun(hr, ht, v)
I = eye(length(hr));
Q = (I - 2*v*v');       % Q = I - 2 v v^H
Theta = Q*Q.';          % Theta = Q Q^T
c = abs(hr*Theta*ht)^2;
end

function g = gradFun(P, hr, ht, v)
I = eye(length(hr));
Q = (I - 2*v*v');
Theta = Q*Q.';
G = 2*hr*Theta*ht*((conj(ht)*conj(hr)).' + (conj(ht)*conj(hr)))*conj(Q);
g = -2*G*v;
end




Nicolas Boumal

unread,
Sep 5, 2025, 12:47:48 AMSep 5
to Manopt
Hello,
If you believe the issue may come from an incorrect derivation of the gradient because the variable is complex, then I suggest you write v = u + iw (separate v into real and imaginary parts, so that u and w are two real vectors), and compute the gradient with respect to u and w (as one normally would for a function of real variables). Then, you can recombine these into a complex vector (gradient wrt u + i · gradient wrt w).
Best,
Nicolas
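
A minimal numerical sketch of this recipe (an editorial illustration with a toy cost f(v) = |aᴴv|², not the thread's loss; the vector a is a stand-in):

```matlab
% Split v = u + i*w, estimate the real gradients of f wrt u and w by
% central differences, then recombine as grad_u + i*grad_w.
rng(0); K = 4;
a = randn(K,1) + 1i*randn(K,1);
f = @(v) abs(a'*v)^2;                 % real-valued function of complex v
v = randn(K,1) + 1i*randn(K,1);
u = real(v); w = imag(v);
h = 1e-6;
gu = zeros(K,1); gw = zeros(K,1);
for k = 1:K
    e = zeros(K,1); e(k) = 1;
    gu(k) = (f((u+h*e) + 1i*w) - f((u-h*e) + 1i*w)) / (2*h);
    gw(k) = (f(u + 1i*(w+h*e)) - f(u + 1i*(w-h*e))) / (2*h);
end
g_numeric  = gu + 1i*gw;              % recombined complex gradient
g_analytic = 2*(a*a')*v;              % known gradient of v'*(a*a')*v in this convention
norm(g_numeric - g_analytic)          % should be near zero
```

For f = vᴴAv with Hermitian A = aaᴴ, the recombined gradient is 2Av, which is also what Manopt expects in egrad for complex manifolds.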

Keyvan Aslansefat

unread,
Sep 8, 2025, 5:13:17 AMSep 8
to Manopt
Hello, Thank you for your feedback.

I posted my problem and my gradient derivation on Math Stack Exchange; the answer I received is at this link: https://math.stackexchange.com/questions/5094295/gradient-of-a-real-valued-function-wrt-to-complex-valued-vector/5094613?noredirect=1#comment10965891_5094613

But unfortunately, the code still has the same problem:

clear
clc
close all
K = 4;
g = randn(K,1) + 1i*randn(K,1); g = g/norm(g);
h = randn(K,1) + 1i*randn(K,1); h = h/norm(h);
I = eye(K);
Qfun = @(v) I - 2*(v*v');
Theta = @(v) Qfun(v) * Qfun(v).';
% ---- Manifold and problem
M = spherecomplexfactory(K); % complex unit sphere
P.M = M;
% Cost: f(v) = -| h^T (Q Q^T) g |^2
P.cost = @(v) -abs(h.'*Theta(v)*g)^2;
P.egrad = @(v) 4*(h.'*Theta(v)*g)*(conj(g)*h' + conj(h)*g')*conj(Qfun(v))*v;
checkgradient(P);
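
One way to cross-check the analytic egrad above independently of checkgradient is a central-difference estimate of ∇ᵤf + i∇𝓌f (a sketch, assuming K, P, and the handles from the snippet above are in scope):

```matlab
% Central-difference estimate of the Euclidean gradient in Manopt's
% convention (egrad = grad wrt real parts + i * grad wrt imaginary parts).
v0 = randn(K,1) + 1i*randn(K,1); v0 = v0/norm(v0);
h = 1e-6;
g_num = zeros(K,1);
for k = 1:K
    e = zeros(K,1); e(k) = 1;
    du = (P.cost(v0 + h*e)    - P.cost(v0 - h*e))    / (2*h);  % wrt real part
    dw = (P.cost(v0 + 1i*h*e) - P.cost(v0 - 1i*h*e)) / (2*h);  % wrt imag part
    g_num(k) = du + 1i*dw;
end
norm(g_num - P.egrad(v0)) / norm(g_num)   % large => analytic egrad is off
```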

Ryan Harvey

unread,
Sep 8, 2025, 7:15:51 PMSep 8
to Manopt
Hello Keyvan,

I'm a little confused. Do you need the gradient with respect to the complex unit vector v, or with respect to its complex conjugate? Your posts on here and stack exchange indicate that you're looking for the gradient of L with respect to v*, but your code reads to me like you are taking the gradient with respect to v itself. 

Best,
Ryan

Keyvan Aslansefat

unread,
Sep 9, 2025, 5:15:21 AMSep 9
to Manopt
Hello Ryan,

Thank you for your message. Actually, the gradient calculated in my code is the same as the one provided in the link, since the gradient is a function of both v and v*.

I appreciate you taking the time to double-check. I would be grateful if you could clarify your point in case I have misunderstood it.

Sincerely,
Keyvan Aslansefat

Ryan Harvey

unread,
Sep 9, 2025, 11:42:55 AMSep 9
to Manopt
Hello Keyvan,

When you have a function of a complex variable, your derivative with respect to the original variable will not be the same as the derivative with respect to the conjugate. The complex gradient will be a function of both v and v*, but it matters whether you are taking your partials with respect to v or v*.

This is a relevant section from the Matrix Cookbook (Ch. 4 pg 24):

[screenshot: Matrix Cookbook, Ch. 4, complex derivatives]

When you have a real function f of a complex vector z, the gradient of that function is the vector of partials with respect to the variable itself. While this gradient will be a function of both the variable v and its conjugate v*, it is important to distinguish which variable you are differentiating with respect to.

There are typically two ways you can look at complex gradient vectors of real-valued smooth functions:

[screenshot: the two conventions, omitted]

The second way is the one Prof. Boumal recommended to you, and I agree, because it removes a lot of the confusion and complication from the process. If you go back and find the (real) gradient vector with respect to the real components of v, and then the (real) gradient vector with respect to the imaginary components of v, you can produce the complex gradient as the sum of the first vector and i times the second.

In your Stack Exchange post, you ask for a gradient with respect to the conjugate, Q* (and later v*), while your code implements a gradient with respect to v (if I'm understanding correctly). The gradient with respect to v will not be the same as the gradient with respect to v*: both are vector functions of v and v*, but they are not the same function, because their imaginary components differ.
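
A concrete sketch of this distinction (again with the toy cost f(v) = |aᴴv|², not the thread's loss): for a real-valued f, the two Wirtinger gradients are complex conjugates of each other, so they coincide only when the gradient happens to be real.

```matlab
% For real f(v) = |a'*v|^2:
%   df/dv  = (v'*a)*conj(a)   (holding v* fixed)
%   df/dv* = (a'*v)*a         (holding v fixed) = conj(df/dv)
% Supplying one where the other is expected flips the sign of the
% imaginary part of every gradient entry.
rng(1); K = 4;
a = randn(K,1) + 1i*randn(K,1);
v = randn(K,1) + 1i*randn(K,1);
d_v     = (v'*a)*conj(a);
d_vconj = (a'*v)*a;
norm(d_vconj - conj(d_v))   % ~0: the two gradients are conjugates
norm(d_vconj - d_v)         % generally nonzero: not the same vector
```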

Hope that helps,
Ryan 