The checkgradient() function gives a different result each time it checks the gradient

Yuang Chen

Jan 10, 2025, 9:35:39 AM
to Manopt
When I check the gradient with the checkgradient() function, sometimes the reported slope is 2 (correct) and sometimes it is 1 (incorrect). Why is this? My cost function and gradient are deterministic and are computed without any problem. Even when the slope comes out correctly as 2 the first time, on subsequent runs the slope comes out as 1 and a warning is issued that the result is negative. Please help!
(Attachments: 1.png, 2.png)
and here are my cost function and egrad function:
____________________________________________________________________
function cost = cost_fun(w_opt,beta_opt,H,HE1,HE2,index,N)
% Objective over 6 users: for each user k, a log2 rate term (with interference
% from users k+1..6 in the denominator) minus an eavesdropper rate term,
% using HE1 when index(k) <= 3 and HE2 otherwise.

    cost = 0;
    sumInter = zeros(1,6);   % interference seen by user k from users k+1..6
   
    for k = 1:6

        if k <= 5
            for g = k+1:6
                sumInter(k) = sumInter(k) + real(beta_opt(g)*w_opt(:,g)'*H{k}*w_opt(:,g));
            end
        end
        if k == 6
            sumInter(k) = 0;
        end
        if index(k) <= 3 % R
            cost = cost + log2(1 + real(beta_opt(k)*w_opt(:,k)'*H{k}*w_opt(:,k))/(sumInter(k)+1)) ...
                - log2(1 + real(beta_opt(k)*w_opt(:,k)'*HE1*w_opt(:,k))/1);
        end
        if index(k) > 3 % T
            cost = cost + log2(1 + real(beta_opt(k)*w_opt(:,k)'*H{k}*w_opt(:,k))/(sumInter(k)+1)) ...
                - log2(1 + real(beta_opt(k)*w_opt(:,k)'*HE2*w_opt(:,k))/1);
        end

    end

end
____________________________________________________________________
function egrad = egrad_fun(w_opt,beta_opt,H,HE1,HE2,index,N)
% Euclidean gradient of cost_fun with respect to w_opt (N-by-6), assembled
% column by column.

    egrad = zeros(N,6);
    sumInter1 = zeros(1,6);   % interference at user k from users k+1..6
    sumInter2 = zeros(1,6);   % interference appearing in earlier users' SINR terms
    Item2 = zeros(N,6);       % gradient contribution from w_opt(:,k) appearing in earlier users' terms
   
    for k = 1:6
        if k <= 5
            for g = k+1:6
                sumInter1(k) = sumInter1(k) + beta_opt(g)*w_opt(:,g)'*H{k}*w_opt(:,g);
            end
        end
        if k == 6
            sumInter1(k) = 0;
        end
        if k >= 2
            for p = 1:k-1
                for c = p+1:6
                    sumInter2(k) = sumInter2(k) + beta_opt(c)*w_opt(:,c)'*H{p}*w_opt(:,c);
                end
                Item2(:,k) = Item2(:,k) + 2*beta_opt(k)*H{p}*w_opt(:,k) ...
                        /(log(2)*(1 + sumInter2(k) + beta_opt(p)*w_opt(:,p)'*H{p}*w_opt(:,p))) ...
                    - 2*beta_opt(k)*H{p}*w_opt(:,k)/(log(2)*(1 + sumInter2(k)));
            end
        end
        if k == 1
            Item2(:,k) = zeros(N,1);
        end
       
        if index(k) <= 3
            egrad(:,k) = egrad(:,k) + 2*beta_opt(k)*H{k}*w_opt(:,k)/(log(2)*(1 + sumInter1(k) + beta_opt(k)*w_opt(:,k)'*H{k}*w_opt(:,k))) ...
                - 2*beta_opt(k)*HE1*w_opt(:,k)/(log(2)*(1 + beta_opt(k)*w_opt(:,k)'*HE1*w_opt(:,k))) ...
                + Item2(:,k);
        end
        if index(k) > 3
            egrad(:,k) = egrad(:,k) + 2*beta_opt(k)*H{k}*w_opt(:,k)/(log(2)*(1 + sumInter1(k) + beta_opt(k)*w_opt(:,k)'*H{k}*w_opt(:,k))) ...
                - 2*beta_opt(k)*HE2*w_opt(:,k)/(log(2)*(1 + beta_opt(k)*w_opt(:,k)'*HE2*w_opt(:,k))) ...
                + Item2(:,k);
        end
    end

end
____________________________________________________________________
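For reference, here is a minimal sketch of how such a problem might be set up and checked in Manopt. The manifold, dimensions, and data below are purely hypothetical (the original post does not show them); the sketch only illustrates how checkgradient ends up being called.
____________________________________________________________________
% Hypothetical data and search space -- everything here is made up.
N = 4;                                        % assumed number of antennas
K = 6;                                        % number of users, as in the code above
H = cell(1, K);
for k = 1:K                                   % random Hermitian PSD "channels"
    A = randn(N) + 1i*randn(N);
    H{k} = A*A';
end
A = randn(N) + 1i*randn(N);  HE1 = A*A';      % assumed eavesdropper channels
A = randn(N) + 1i*randn(N);  HE2 = A*A';
beta_opt = rand(1, K);
index = 1:K;                                  % assumed R/T split: first 3 are R

problem.M = euclideancomplexfactory(N, K);    % assumed search space for w_opt
problem.cost  = @(w) cost_fun(w, beta_opt, H, HE1, HE2, index, N);
problem.egrad = @(w) egrad_fun(w, beta_opt, H, HE1, HE2, index, N);

checkgradient(problem);   % with no (x, v) given, a new random point and
                          % direction are drawn on every call
____________________________________________________________________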

Nicolas Boumal

Jan 10, 2025, 9:49:33 AM
to Manopt
Interesting. There is indeed a source of randomness inside of checkgradient, namely: if you call checkgradient(problem), then internally it selects a point x and a tangent vector v (at x) randomly. Then, it runs the check from that point along that direction.

You can also call checkgradient(problem, x, v), in which case the result should be entirely deterministic.

You could call checkgradient(problem, x, v) with a few choices of x and v (which you can generate randomly from x = problem.M.rand(); and v = problem.M.randvec(x); for example).

This way, you can verify that if you call the check several times with the same (x, v), then you do indeed get the same result (let's make sure of that first).
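
For example, something along these lines (a sketch, assuming your problem structure is already set up):

x = problem.M.rand();          % one random point on the manifold
v = problem.M.randvec(x);      % one random tangent vector at x
checkgradient(problem, x, v);  % first run
checkgradient(problem, x, v);  % second run: should report the exact same slope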

Afterwards, you could try to see if you can tell the difference between the pairs (x, v) for which you have the right slope, and the others. Perhaps your cost function has some special behavior in some regions and the gradient is correct for some but not all regions? There are some "if-else" clauses in your cost, so that seems plausible?
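
A quick sketch of that experiment (the trial count is arbitrary; each call prints the fitted slope and opens a figure):

for trial = 1:10
    x = problem.M.rand();
    v = problem.M.randvec(x);
    fprintf('--- trial %d ---\n', trial);
    checkgradient(problem, x, v);   % note which (x, v) pairs give slope 2
end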

Best,
Nicolas