When I run this code, I get the following output (emphasis in bold is mine):
The slope should be 2. It appears to be: 2.00041.
If it is far from 2, then directional derivatives might be erroneous.
The residual should be 0, or very close. Residual: 2.65562e+25.
If it is far from 0, then the gradient is not in the tangent space.
The slope is fine, but the residual seems entirely off (it's huge). This would suggest that the gradient is not properly projected, but actually not.
Indeed, if we pick a random point on the manifold and compute the norm of the gradient there, we get an even bigger number:
>> x = problem.M.rand();
>> problem.M.norm(x, getGradient(problem, x))
The ratio between 10^25 and 10^42 is indeed machine precision, so your gradient appears to be correct.
And indeed, when we run nonlinear conjugate gradients, we get this output (from your code):
iter cost val grad. norm
0 +1.5189654937419912e+42 2.67572166e+42
1 +6.7334922053171432e+41 1.13133108e+42
2 +4.8204183533551820e+41 7.30252036e+41
3 +3.3511451234617862e+41 4.00287658e+41
4 +2.5272662321494862e+41 1.84276234e+41
5 +2.2195981445610432e+41 5.08341667e+40
6 +2.1935520964569974e+41 1.79949772e+40
7 +2.1902579606196875e+41 6.50844406e+39
8 +2.1898265305729956e+41 2.36058383e+39
9 +2.1897697667012173e+41 8.56556861e+38
10 +2.1897622924954353e+41 3.10852006e+38
11 +2.1897613080947600e+41 1.12822097e+38
12 +2.1897611784175141e+41 4.09520128e+37
13 +2.1897611613316668e+41 1.48660811e+37
14 +2.1897611590800744e+41 5.39706600e+36
15 +2.1897611587833031e+41 1.95956102e+36
16 +2.1897611587441753e+41 7.11540341e+35
17 +2.1897611587390177e+41 2.58392415e+35
18 +2.1897611587383399e+41 9.38424432e+34
19 +2.1897611587382494e+41 3.40845814e+34
20 +2.1897611587382386e+41 1.23809944e+34
21 +2.1897611587382335e+41 4.49771327e+33
22 +2.1897611587382335e+41 1.63405416e+33
23 +2.1897611587382335e+41 5.93716784e+32
24 +2.1897611587382335e+41 1.62271681e+32
25 +2.1897611587382335e+41 1.62306806e+32
This shows that the method behaves well: the gradient norm is reduced by 10 orders of magnitude, and the cost function value decreases by about one order of magnitude. It's just that the scaling of the cost function is humongous.
It is probably a good idea to rescale your cost function to bring those numbers back to more typical ranges, because they might cause some numerical issues. Apart from that, all looks normal to me.