Aha, so there might indeed be an issue with losing orthonormality. I ran the code with this options structure:
options.statsfun = @mystatsfun;
function stats = mystatsfun(problem, X, stats)
    disp(norm(X'*X - eye(size(X, 2))));
end
This way, at each iteration, we display the distance from X'*X to the identity. It should be zero, but numerically we can be off a bit.
With the retraction, this remains nicely close to 1e-15, but with the exponential we get numbers as high as 1e-10, and then the theorems that are supposed to make all of this work, well, don't work anymore.
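In case it helps to see the check in isolation, here is the same quantity computed in a small NumPy sketch (Python rather than MATLAB; the matrix sizes are arbitrary and just for illustration):

```python
import numpy as np

# Build a random n-by-p matrix with orthonormal columns, the way a point
# on the Grassmann manifold Gr(n, p) is represented in Manopt.
rng = np.random.default_rng(0)
n, p = 50, 3
X, _ = np.linalg.qr(rng.standard_normal((n, p)))

# Distance from X'X to the identity: this is the quantity the statsfun
# above prints at every iteration. Right after a QR factorization it
# sits at machine precision.
err = np.linalg.norm(X.T @ X - np.eye(p))
print(err)
```

The interesting part is watching this number grow over iterations when the iterates are not re-orthonormalized, which is exactly what the statsfun above reveals.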
All in all, here is what the code for the retraction and the exponential should look like in the grassmann.m file (the changes are the polar factorization in the retraction and the final re-orthonormalization in the exponential):
M.retr = @retraction;
function Y = retraction(X, U, t)
    if nargin < 3
        t = 1.0;
    end
    Y = X + t*U;
    for i = 1 : k
        % We do not need to worry about flipping signs of columns here,
        % since only the column space is important, not the actual
        % columns. Compare this with the Stiefel manifold.
        % [Q, unused] = qr(Y(:, :, i), 0); %#ok
        % Y(:, :, i) = Q;
        % Compute the polar factorization of Y = X + t*U.
        [u, s, v] = svd(Y(:, :, i), 'econ'); %#ok
        Y(:, :, i) = u*v';
    end
end
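For reference, the polar retraction inside the loop above amounts to the following (a NumPy sketch of a single slice; the helper name and sizes are mine):

```python
import numpy as np

def polar_retraction(X, U, t=1.0):
    """Retract X + t*U back to a matrix with orthonormal columns by
    taking the polar factor, computed from a thin SVD."""
    u, _, vt = np.linalg.svd(X + t * U, full_matrices=False)
    return u @ vt

rng = np.random.default_rng(1)
n, p = 50, 3
X, _ = np.linalg.qr(rng.standard_normal((n, p)))
# A tangent vector at X lies in the orthogonal complement of the
# columns of X: U = (I - X X') W for some arbitrary W.
W = rng.standard_normal((n, p))
U = W - X @ (X.T @ W)
Y = polar_retraction(X, U, 0.1)
print(np.linalg.norm(Y.T @ Y - np.eye(p)))
```

The polar factor is the closest matrix with orthonormal columns to X + t*U in the Frobenius norm, which is why it is a natural choice here; a QR factorization would work too, as the commented-out lines show.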
M.exp = @exponential;
function Y = exponential(X, U, t)
    if nargin == 3
        tU = t*U;
    else
        tU = U;
    end
    Y = zeros(size(X));
    for i = 1 : k
        [u, s, v] = svd(tU(:, :, i), 'econ');
        cos_s = diag(cos(diag(s)));
        sin_s = diag(sin(diag(s)));
        Y(:, :, i) = X(:, :, i)*v*cos_s*v' + u*sin_s*v';
        % From numerical experiments, it seems necessary to
        % re-orthonormalize. This is overall quite expensive.
        [q, unused] = qr(Y(:, :, i), 0); %#ok
        Y(:, :, i) = q;
    end
end
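The same exponential, with the final QR re-orthonormalization, transcribed as a NumPy sketch for a single slice (again, the function name and sizes are mine):

```python
import numpy as np

def grassmann_exp(X, U, t=1.0):
    """Exponential map on the Grassmann manifold, followed by a QR
    re-orthonormalization as in the MATLAB code above.
    With the thin SVD t*U = u * diag(s) * v', the geodesic is
    Y = X v cos(s) v' + u sin(s) v'."""
    u, s, vt = np.linalg.svd(t * U, full_matrices=False)
    Y = X @ vt.T @ np.diag(np.cos(s)) @ vt + u @ np.diag(np.sin(s)) @ vt
    # Re-orthonormalize; any sign flips from QR are harmless since only
    # the column space matters on the Grassmann manifold.
    q, _ = np.linalg.qr(Y)
    return q

rng = np.random.default_rng(2)
n, p = 50, 3
X, _ = np.linalg.qr(rng.standard_normal((n, p)))
W = rng.standard_normal((n, p))
U = W - X @ (X.T @ W)   # project W to the tangent space at X
Y = grassmann_exp(X, U, 0.1)
print(np.linalg.norm(Y.T @ Y - np.eye(p)))
```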
No oscillations anymore, but the exponential is, well, a tad expensive (which is not necessarily surprising: that is why we use retractions!).
Thank you very much for bringing this up!
Best,
Nicolas