The checkgradient() function gives a different result each time it checks the gradient

Yuang Chen

Jan 10, 2025, 9:35:39 AM
to Manopt
When I check the gradient with the checkgradient() function, sometimes the reported slope is 2 (correct) and sometimes it is 1 (incorrect). Why is this? My cost function and gradient are deterministic and are computed without any problem. Even when the slope comes out correctly as 2 the first time, on subsequent runs the slope comes out as 1 and a warning is issued that the result is negative. Please help!
(Attachments: 1.png, 2.png)
and here are my cost function and egrad function:
____________________________________________________________________
function cost = cost_fun(w_opt,beta_opt,H,HE1,HE2,index,N)
% Objective over 6 users: for each user k, a log2 rate term (with interference
% from users k+1..6 in the denominator) minus an eavesdropper rate term,
% using HE1 when index(k) <= 3 and HE2 otherwise.

    cost = 0;
    sumInter = zeros(1,6);   % interference seen by user k from users k+1..6
   
    for k = 1:6

        if k <= 5
            for g = k+1:6
                sumInter(k) = sumInter(k) + real(beta_opt(g)*w_opt(:,g)'*H{k}*w_opt(:,g));
            end
        end
        if k == 6
            sumInter(k) = 0;
        end
        if index(k) <= 3 % R
            cost = cost + log2(1 + real(beta_opt(k)*w_opt(:,k)'*H{k}*w_opt(:,k))/(sumInter(k)+1)) ...
                - log2(1 + real(beta_opt(k)*w_opt(:,k)'*HE1*w_opt(:,k))/1);
        end
        if index(k) > 3 % T
            cost = cost + log2(1 + real(beta_opt(k)*w_opt(:,k)'*H{k}*w_opt(:,k))/(sumInter(k)+1)) ...
                - log2(1 + real(beta_opt(k)*w_opt(:,k)'*HE2*w_opt(:,k))/1);
        end

    end

end
____________________________________________________________________
function egrad = egrad_fun(w_opt,beta_opt,H,HE1,HE2,index,N)
% Euclidean gradient of cost_fun with respect to w_opt (N-by-6), assembled
% column by column.

    egrad = zeros(N,6);
    sumInter1 = zeros(1,6);   % interference at user k from users k+1..6
    sumInter2 = zeros(1,6);   % interference appearing in earlier users' SINR terms
    Item2 = zeros(N,6);       % gradient contribution from w_opt(:,k) appearing in earlier users' terms
   
    for k = 1:6
        if k <= 5
            for g = k+1:6
                sumInter1(k) = sumInter1(k) + beta_opt(g)*w_opt(:,g)'*H{k}*w_opt(:,g);
            end
        end
        if k == 6
            sumInter1(k) = 0;
        end
        if k >= 2
            for p = 1:k-1
                for c = p+1:6
                    sumInter2(k) = sumInter2(k) + beta_opt(c)*w_opt(:,c)'*H{p}*w_opt(:,c);
                end
                Item2(:,k) = Item2(:,k) + 2*beta_opt(k)*H{p}*w_opt(:,k) ...
                        /(log(2)*(1 + sumInter2(k) + beta_opt(p)*w_opt(:,p)'*H{p}*w_opt(:,p))) ...
                    - 2*beta_opt(k)*H{p}*w_opt(:,k)/(log(2)*(1 + sumInter2(k)));
            end
        end
        if k == 1
            Item2(:,k) = zeros(N,1);
        end
       
        if index(k) <= 3
            egrad(:,k) = egrad(:,k) + 2*beta_opt(k)*H{k}*w_opt(:,k)/(log(2)*(1 + sumInter1(k) + beta_opt(k)*w_opt(:,k)'*H{k}*w_opt(:,k))) ...
                - 2*beta_opt(k)*HE1*w_opt(:,k)/(log(2)*(1 + beta_opt(k)*w_opt(:,k)'*HE1*w_opt(:,k))) ...
                + Item2(:,k);
        end
        if index(k) > 3
            egrad(:,k) = egrad(:,k) + 2*beta_opt(k)*H{k}*w_opt(:,k)/(log(2)*(1 + sumInter1(k) + beta_opt(k)*w_opt(:,k)'*H{k}*w_opt(:,k))) ...
                - 2*beta_opt(k)*HE2*w_opt(:,k)/(log(2)*(1 + beta_opt(k)*w_opt(:,k)'*HE2*w_opt(:,k))) ...
                + Item2(:,k);
        end
    end

end
____________________________________________________________________
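For reference, here is a minimal sketch of how such a problem might be set up and checked in Manopt. The manifold, dimensions, and data below are purely hypothetical (the original post does not show them); the sketch only illustrates how checkgradient ends up being called.
____________________________________________________________________
% Hypothetical data and search space -- everything here is made up.
N = 4;                                        % assumed number of antennas
K = 6;                                        % number of users, as in the code above
H = cell(1, K);
for k = 1:K                                   % random Hermitian PSD "channels"
    A = randn(N) + 1i*randn(N);
    H{k} = A*A';
end
A = randn(N) + 1i*randn(N);  HE1 = A*A';      % assumed eavesdropper channels
A = randn(N) + 1i*randn(N);  HE2 = A*A';
beta_opt = rand(1, K);
index = 1:K;                                  % assumed R/T split: first 3 are R

problem.M = euclideancomplexfactory(N, K);    % assumed search space for w_opt
problem.cost  = @(w) cost_fun(w, beta_opt, H, HE1, HE2, index, N);
problem.egrad = @(w) egrad_fun(w, beta_opt, H, HE1, HE2, index, N);

checkgradient(problem);   % with no (x, v) given, a new random point and
                          % direction are drawn on every call
____________________________________________________________________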

Nicolas Boumal

Jan 10, 2025, 9:49:33 AM
to Manopt
Interesting. There is indeed a source of randomness inside of checkgradient, namely: if you call checkgradient(problem), then internally it selects a point x and a tangent vector v (at x) randomly. Then, it runs the check from that point along that direction.

You can also call checkgradient(problem, x, v), in which case the result should be entirely deterministic.

You could call checkgradient(problem, x, v) with a few choices of x and v (which you can generate randomly from x = problem.M.rand(); and v = problem.M.randvec(x); for example).

This way, you can verify that if you call the check several times with the same (x, v), then you do indeed get the same result (let's make sure of that first).
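
For example, something along these lines (a sketch, assuming your problem structure is already set up):

x = problem.M.rand();          % one random point on the manifold
v = problem.M.randvec(x);      % one random tangent vector at x
checkgradient(problem, x, v);  % first run
checkgradient(problem, x, v);  % second run: should report the exact same slope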

Afterwards, you could try to see if you can tell the difference between the pairs (x, v) for which you have the right slope, and the others. Perhaps your cost function has some special behavior in some regions and the gradient is correct for some but not all regions? There are some "if-else" clauses in your cost, so that seems plausible?
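
A quick sketch of that experiment (the trial count is arbitrary; each call prints the fitted slope and opens a figure):

for trial = 1:10
    x = problem.M.rand();
    v = problem.M.randvec(x);
    fprintf('--- trial %d ---\n', trial);
    checkgradient(problem, x, v);   % note which (x, v) pairs give slope 2
end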

Best,
Nicolas