MATLAB code for the Procrustes problem using conjugate gradient

Mårten Gulliksson

Feb 12, 2025, 9:16:52 AM
to Manopt
Hi!

I need a simple MATLAB code for solving the Procrustes problem on the Stiefel manifold using conjugate gradients. It is essential that it is the conjugate gradient method, since I want to compare it to an algorithm that builds up Krylov subspaces. Thanks in advance.

Best regards,

Mårten

Nicolas Boumal

Feb 12, 2025, 9:31:31 AM
to Manopt
Hello Mårten, what have you tried so far?

Nicolas Boumal

Feb 12, 2025, 9:36:26 AM
to Manopt
Remarkably, DeepSeek R1 has a pretty good answer (I didn't check it, but the code seems fine at first sight).

> I need a simple Matlab code using the Manopt toolbox for solving the Procrustes problem on the Stiefel manifold using conjugated gradients.

Here is the output: there is explicit code at the end.

<think>
Okay, the user now wants a simple MATLAB code using the Manopt toolbox to solve the Procrustes problem on the Stiefel manifold with conjugate gradients. Let me recall that Manopt is a specialized toolbox for optimization on manifolds, which simplifies implementing such algorithms because it handles the manifold structure and optimization solvers internally.

First, I need to remember how Manopt works. The typical steps are:

1. **Define the problem structure**: This includes specifying the manifold (Stiefel in this case) and the cost function.
2. **Set up the solver options**: Choosing the solver (conjugate gradient here) and any parameters like tolerance or maximum iterations.
3. **Run the solver**: Let Manopt handle the optimization using the defined problem and options.

For the Procrustes problem, the cost function is ||A*Q - B||_F², where Q is on the Stiefel manifold (Q'Q = I). In Manopt, I'll need to define this cost function, its Euclidean gradient, and optionally the Hessian. However, since we're using the conjugate gradient solver, Manopt can automatically compute the Riemannian gradient from the Euclidean gradient if we provide it.

The Stiefel manifold in Manopt is accessed via `stiefelfactory(n, p)`, where n is the number of rows and p the number of columns of Q. The factory function creates the manifold structure.

So, the steps in code would be:

- **Create the manifold**: Using `stiefelfactory`.
- **Define the problem structure**: With `problem.manifold`, `problem.cost`, and `problem.egrad` (Euclidean gradient).
- **Specify solver options**: Set `options.verbosity` for output details, `options.maxiter`, and `options.tolgradnorm` for stopping criteria.
- **Run the solver**: Call `conjugategradient(problem, [], options)`.

For the cost function, given a point Q on the manifold, compute f = norm(A*Q - B, 'fro')^2. The Euclidean gradient is 2*A'*(A*Q - B), as previously established.

Manopt automatically converts the Euclidean gradient to the Riemannian gradient by projecting it onto the tangent space. This avoids the need to manually handle the projection, retraction, or vector transport, which simplifies the code significantly.

Testing the code with sample matrices A and B would be good. For example, generate random matrices A (m x n), B (m x p), then solve for Q (n x p) on the Stiefel manifold.

I should also mention that the user needs to have Manopt installed and added to MATLAB's path. The code will be concise because Manopt abstracts away the manifold-related operations, allowing the user to focus on the problem-specific cost and gradient.

Let me verify the steps:

1. **Manifold Creation**: Correct, using `stiefelfactory`.
2. **Problem Definition**: The cost function and Euclidean gradient are correctly defined as anonymous functions or separate functions. Since A and B are fixed, they can be captured in the anonymous function handles.
3. **Solver Options**: Setting parameters like maximum iterations and tolerance ensures the solver stops appropriately.
4. **Solver Call**: Using `conjugategradient` with the problem and options.

Potential issues to note: Ensure that the dimensions of A, B, and Q are compatible. For instance, if A is m×n, B must be m×p, and Q will be n×p. The Stiefel factory is called with n and p accordingly.

Including comments in the code to explain each part will help the user understand the structure. Also, adding a test case with random matrices and displaying the result (like the cost value and final gradient norm) will make the example complete.
</think>

Here's a simple MATLAB code using the Manopt toolbox to solve the Procrustes problem on the Stiefel manifold with a conjugate gradient method:

```matlab
% Install and set up Manopt (ensure it's in your MATLAB path)
% Download from: https://www.manopt.org/

% Generate synthetic data
m = 5; % Rows of A and B
n = 3; % Columns of A
p = 2; % Columns of B and Q (Stiefel dimension)
A = randn(m, n);
B = randn(m, p);

% Create the Stiefel manifold (Q'Q = I)
stiefel = stiefelfactory(n, p);

% Define the Procrustes problem structure
problem = struct();
problem.M = stiefel;

% Cost function: ||A*Q - B||_F^2
problem.cost = @(Q) norm(A*Q - B, 'fro')^2;

% Euclidean gradient (automatically converted to Riemannian gradient)
problem.egrad = @(Q) 2*A'*(A*Q - B);

% Set conjugate gradient options
options = struct();
options.verbosity = 1;     % 0: quiet, 1: some output, 2: detailed
options.maxiter = 100;     % Maximum iterations
options.tolgradnorm = 1e-6; % Stopping tolerance for gradient norm

% Solve with conjugate gradient
[Q_opt, cost_opt, info] = conjugategradient(problem, [], options);

% Display results
fprintf('Final cost: %e\n', cost_opt);
fprintf('Final gradient norm: %e\n', info(end).gradnorm);

% Verify Stiefel constraint (should be ~eye(p))
fprintf('Q''Q = \n');
disp(Q_opt'*Q_opt);
```


**Key components:**
1. **Manifold definition:** `stiefelfactory(n, p)` creates the Stiefel manifold
2. **Cost function:** Standard Frobenius norm squared
3. **Euclidean gradient:** Direct matrix calculus result
4. **Conversion to Riemannian gradient:** Handled automatically by Manopt
5. **Conjugate gradient solver:** `conjugategradient()` handles manifold-aware optimization

**Features:**
- Automatic Riemannian gradient computation
- Built-in convergence criteria
- Step-size adaptation
- Vector transport handling
- Convergence monitoring

**Output shows:**
- Optimization progress
- Final cost value
- Final gradient norm
- Orthogonality constraint verification

To use this code:
1. Install Manopt (https://www.manopt.org)
2. Adjust matrix dimensions (m, n, p) as needed
3. Replace A and B with your matrices
4. Tune options (tolerances, iterations) as required

The main advantage of using Manopt is that it handles all manifold-specific operations (retractions, transports, gradients) automatically while maintaining the conjugate gradient's convergence properties.
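
If you run it, a quick sanity check with Manopt's checkgradient utility confirms that the cost and Euclidean gradient match before you trust the CG output. A minimal sketch, assuming the problem struct, options, and info output from the code above are in the workspace:

```matlab
% Verify the supplied Euclidean gradient against finite differences (Manopt utility).
checkgradient(problem);

% Plot the Riemannian gradient norm per iteration from the CG run above.
semilogy([info.iter], [info.gradnorm], '.-');
xlabel('Iteration'); ylabel('Riemannian gradient norm');
```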

Bamdev Mishra

Feb 12, 2025, 10:09:31 AM
to Nicolas Boumal, Manopt

Wow, that is some deep analysis by DeepSeek.

Best,



Mårten Gulliksson

Feb 13, 2025, 2:54:44 AM
to Manopt
Hi!

And thanks for the quick answer!

I started (a long time ago) by asking ChatGPT for a Manopt code, but it was messy and never worked. Maybe I did something wrong in the installation of Manopt, but in any case, no success. Then I tried myself, looking at the paper by Edelman, but there you can only find a short code fragment with conjugate gradient for solving the Newton step, and I never managed to translate that into a code for this Procrustes problem, mainly because my knowledge of the details of parallel transport is shallow.

I just recently asked DeepSeek for the code and got one running, understanding the details, but the performance is very poor; see the code below (I tried to speed it up a bit). For a random starting point it converges one time out of ten, which makes my own code look surprisingly good. That's why I put the question here. I would appreciate it if you had a glance at it; with your expert knowledge of differential geometry, I suspect it is far from efficient.

Question: Do you think parallel transport will make the code more efficient? And what would that actually be doing? Sorry for my ignorance.

I will try your suggestion via DeepSeek after reinstalling Manopt (I had an old version). Can I send you feedback on that through this conversation? I am sure you are busy.

Best regards,

Mårten


function Y = riemannian_conjugate_gradient(A, B, Y0, max_iter, tol)
% Riemannian Conjugate Gradient Method for the Orthogonal Procrustes Problem
% Input:
%   A: m x n matrix
%   B: m x p matrix
%   Y0: n x p initial guess (must be on the Stiefel manifold, Y0'*Y0 = I)
%   max_iter: maximum number of iterations
%   tol: tolerance for stopping criterion (based on Riemannian gradient norm)
% Output:
%   Y: n x p orthogonal matrix minimizing ||AY - B||_F^2

    % Initialize
    Y = Y0;
    G = A' * (A * Y - B);      % Euclidean gradient
    grad_F = G - Y * (Y' * G); % Riemannian gradient
    P = -grad_F;               % Initial search direction (negative gradient)
    iter = 0;

    while iter < max_iter
        % Compute the Euclidean gradient
        G_prev = G;
        G = A' * (A * Y - B);

        % Project the gradient onto the tangent space at Y
        grad_F = G - Y * (Y' * G); % Riemannian gradient

        % Compute objective value and gradient norm
        F = 0.5 * norm(A * Y - B, 'fro')^2;
        grad_norm = norm(grad_F, 'fro');

        % Print iteration details for debugging
        fprintf('Iter %d: F = %.6f, ||grad_F|| = %.6f\n', iter, F, grad_norm);

        % Check stopping criterion (Riemannian gradient norm)
        if grad_norm < tol
            fprintf('Converged after %d iterations.\n', iter);
            break;
        end

        % Update search direction using Riemannian conjugate gradient
        if iter == 0
            P = -grad_F; % First iteration: steepest descent
        else
            % Compute the Riemannian version of the Polak-Ribière parameter
            beta = trace(grad_F' * (grad_F - G_prev)) / trace(grad_F' * grad_F);
            beta = max(0, beta);    % Ensure beta is non-negative (PR+ restart rule)
            P = -grad_F + beta * P; % Update search direction
        end

        % Project the search direction onto the tangent space at Y
        P = P - Y * (Y' * P);

        % Line search to find the optimal step size
        alpha = line_search(Y, P, A, B);

        % Ensure alpha is not too small or too large
        alpha = max(1e-10, min(alpha, 1));

        % Update Y using the retraction (QR-based retraction)
        Y = retract(Y, P, alpha);
        iter = iter + 1;
    end

    if iter == max_iter
        fprintf('Reached maximum iterations (%d).\n', max_iter);
    end

    % Verify that the Riemannian gradient is close to zero at the solution
    grad_F_final = A' * (A * Y - B) - Y * (Y' * (A' * (A * Y - B)));
    grad_norm_final = norm(grad_F_final, 'fro');
    fprintf('Final Riemannian gradient norm: %.6f\n', grad_norm_final);
end

function alpha = line_search(Y, P, A, B)
% Line search to find the optimal step size alpha (Armijo condition only)
    alpha = 1;  % Initial step size
    rho = 0.5;  % Reduction factor
    c = 1e-4;   % Armijo constant
    max_line_search_iters = 20;
    F = @(Y) 0.5 * norm(A * Y - B, 'fro')^2; % Objective function
    grad_F = A' * (A * Y - B) - Y * (Y' * (A' * (A * Y - B))); % Riemannian gradient
    directional_derivative = trace(grad_F' * P); % Should be negative

    % Armijo backtracking line search
    for i = 1:max_line_search_iters
        Y_new = retract(Y, P, alpha);
        F_new = F(Y_new);
        if F_new <= F(Y) + c * alpha * directional_derivative
            break; % Armijo condition satisfied
        end
        alpha = rho * alpha; % Reduce step size
    end
end

function Y_new = retract(Y, P, alpha)
% Retraction onto the Stiefel manifold using QR decomposition
    Y_new = Y + alpha * P;
    [Q, ~] = qr(Y_new, 0); % Thin QR decomposition
    Y_new = Q;
end

Nicolas Boumal

Feb 14, 2025, 11:16:46 AM
to Manopt
The code generated by DeepSeek worked as is on my end, so indeed updating Manopt should help.

About parallel transport: we do not have one implemented for Stiefel. I would not a priori expect this to bring substantial gains compared to a transporter, but the only way to know for sure is to try.
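
To give a rough idea of what happens there: a transporter simply maps the previous search direction into the tangent space at the new iterate, typically by projection. A minimal illustrative sketch for the Stiefel manifold with the embedded metric (not Manopt's internal code):

```matlab
% Projection-based vector transport on the Stiefel manifold (embedded metric):
% map a direction P, tangent at the previous iterate, into the tangent space at Y2
% by removing the symmetric part of Y2'*P.
transp = @(Y2, P) P - Y2 * ((Y2'*P + P'*Y2) / 2);
```

Parallel transport would instead carry the vector along a curve while preserving inner products; as far as I know there is no simple closed form for that on Stiefel, so it would typically mean integrating an ODE, at extra cost per iteration.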

Of course, for Procrustes, the most direct approach is via SVD of the data, without optimization at all.
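
For the balanced case (p = n, so Q is square orthogonal), that closed form is Q = U*V' with U and V from an SVD of A'*B, since only the term trace(Q'*A'*B) depends on Q. A minimal sketch:

```matlab
% Closed-form solution of the balanced orthogonal Procrustes problem:
% minimize ||A*Q - B||_F over square orthogonal Q.
[U, ~, V] = svd(A' * B);
Q_svd = U * V';
```

For the unbalanced case on the Stiefel manifold (p < n), ||A*Q||_F is no longer constant in Q, so this closed form does not apply directly and the Riemannian approach is the natural route.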
