Thank you for this nice package! I want to find a low-rank matrix A that minimizes the Frobenius norm of Y - M*A, where Y and M are known. Both M and Y are large, and M is sparse. Here is my attempt on a simulated dataset:
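In other words, for each candidate rank r, I want to solve

\min_{A \in \mathbb{R}^{p \times n},\ \operatorname{rank}(A) = r} \ \tfrac{1}{2} \| Y - M A \|_F^2 .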
% Matrix dimensions and parameters
m = 1500; n = 700; p = 650; % Matrix dimensions
rank_values = 0:100:600; % Range of rank values to test
rank_values(1) = 1;      % rank 0 is not valid, so use rank 1 instead
lambda = 0; % Regularization parameter (not used yet; see problem 2 below)
k_folds = 5; % Number of cross-validation folds
% Generate a sparse binary matrix M and a dense low-rank A_true.
M = sprandn(m, p, 0.1); % 10% non-zero elements in M
M(M~=0) = 1;            % binarize the non-zeros
A_true = randn(p, 200) * randn(200, n); % rank-200 ground truth
Y = M * A_true + 10 * randn(m, n);      % noisy observations of M*A_true
% Initialize arrays to store losses for each rank and fold.
train_loss = zeros(length(rank_values), k_folds);
test_loss = zeros(length(rank_values), k_folds);
% Cross-validation: Generate random indices for 5 folds.
indices = crossvalind('Kfold', m, k_folds); % from the Bioinformatics Toolbox; cvpartition is an alternative
% Loop through each rank value.
for i = 1:length(rank_values)
    rank_val = rank_values(i);
    fprintf('Testing rank: %d\n', rank_val);
    % Define the manifold of p-by-n matrices with fixed rank.
    manifold = fixedrankembeddedfactory(p, n, rank_val);
    % Loop over the k folds.
    for fold = 1:k_folds
        fprintf('Fold: %d\n', fold);
        % Split the data into training and testing sets for this fold.
        test_idx = (indices == fold);
        train_idx = ~test_idx;
        % Create training and testing matrices.
        Y_train = Y(train_idx, :);
        Y_test = Y(test_idx, :);
        M_train = M(train_idx, :);
        M_test = M(test_idx, :);
        % Define the optimization problem for the current fold.
        problem.M = manifold;
        problem.cost = @(A) 0.5 * norm(Y_train - M_train * A.U * A.S * A.V', 'fro')^2;
        problem.egrad = @(A) M_train' * (M_train * A.U * A.S * A.V' - Y_train);
        % problem.ehess = @(A, Adot) M_train' * M_train * Adot; % errors: Adot is a structure (see problem 1 below)
        % Set optimizer options (trust-regions).
        options.verbosity = 1;
        options.maxiter = 100;
        % Initialize A randomly on the manifold.
        A_init = problem.M.rand();
        % Run the optimization.
        [A_opt, ~, ~] = trustregions(problem, A_init, options);
        % Reconstruct the optimized matrix A.
        A_matrix = A_opt.U * A_opt.S * A_opt.V';
        % Compute and store the training and testing losses for this fold.
        train_loss(i, fold) = 0.5 * norm(Y_train - M_train * A_matrix, 'fro')^2;
        test_loss(i, fold) = 0.5 * norm(Y_test - M_test * A_matrix, 'fro')^2;
    end
end
% Compute the average loss across all folds for each rank.
avg_train_loss = mean(train_loss, 2);
avg_test_loss = mean(test_loss, 2);
%% Plot training and testing losses vs. rank.
figure;
plot(rank_values, avg_train_loss, '-o', 'LineWidth', 2);
hold on;
plot(rank_values, avg_test_loss, '-s', 'LineWidth', 2);
xlabel('Rank');
ylabel('Loss');
legend('Training Loss', 'Testing Loss');
title('Training and Testing Loss vs. Rank');
grid on;
hold off;
There are three problems I am facing:
1. I tried to include the Hessian (the commented-out line above), but I got an error saying a dot product cannot be applied to a structure. My current guess at a fix is sketched below.
2. I want lasso (L1) regularization on the matrix A, but I don't know how to extend the vector example given at https://www.manopt.org/lifts.html to a matrix. A smoothing workaround I am considering is also sketched below.
3. How do I run the 5 folds in parallel for a given rank? My parfor attempt is at the end.
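Regarding problem 1: from the error message, I believe Adot is a tangent-vector structure of fixedrankembeddedfactory (fields Up, M, Vp, representing the ambient matrix U*Adot.M*V' + Adot.Up*V' + U*Adot.Vp'), not a plain matrix, so it cannot be multiplied by M_train directly. Assuming I read the factory's documentation correctly, my next attempt would be to expand Adot to its ambient p-by-n matrix first (the documented tangent2ambient helper should do the same conversion; the local function goes at the end of the script or in its own file):

problem.ehess = @(A, Adot) ehess_ls(A, Adot, M_train);

function H = ehess_ls(A, Adot, M_train)
    % Expand the tangent structure Adot into its ambient p-by-n matrix.
    Zmat = A.U * Adot.M * A.V' + Adot.Up * A.V' + A.U * Adot.Vp';
    % Euclidean Hessian of 0.5*||Y_train - M_train*A||_F^2 along Zmat.
    H = M_train' * (M_train * Zmat);
end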
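Regarding problem 2: while I figure out how the lifts machinery extends to matrices, a workaround I am considering is to replace the nonsmooth L1 penalty by a smooth surrogate, the sum over entries of sqrt(A_ij^2 + eps), which keeps both the cost and the Euclidean gradient differentiable. To be clear, this is plain smoothing, not the lifts technique from the link, and eps_smooth and the lambda value below are placeholders I picked:

eps_smooth = 1e-6; % smoothing parameter: smaller means closer to |.|
lambda = 0.1;      % regularization weight (placeholder value)
smooth_l1      = @(X) sum(sqrt(X(:).^2 + eps_smooth)); % approximates sum(abs(X(:)))
smooth_l1_grad = @(X) X ./ sqrt(X.^2 + eps_smooth);    % gradient of the surrogate
problem.cost  = @(A) 0.5 * norm(Y_train - M_train * A.U * A.S * A.V', 'fro')^2 ...
    + lambda * smooth_l1(A.U * A.S * A.V');
problem.egrad = @(A) M_train' * (M_train * A.U * A.S * A.V' - Y_train) ...
    + lambda * smooth_l1_grad(A.U * A.S * A.V');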
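Regarding problem 3: since the folds are independent once the rank is fixed, my understanding is that the inner fold loop can become a parfor (Parallel Computing Toolbox), as long as each worker builds its own problem structure and the losses are written into sliced variables. This is how I would rewrite the body of the rank loop; trustregions gets [] as the initial point so each worker draws its own random start:

train_loss_r = zeros(1, k_folds);
test_loss_r  = zeros(1, k_folds);
parfor fold = 1:k_folds
    test_idx  = (indices == fold);
    train_idx = ~test_idx;
    Y_train = Y(train_idx, :);  M_train = M(train_idx, :);
    Y_test  = Y(test_idx, :);   M_test  = M(test_idx, :);
    problem = struct();         % fresh problem structure per worker
    problem.M = manifold;
    problem.cost  = @(A) 0.5 * norm(Y_train - M_train * A.U * A.S * A.V', 'fro')^2;
    problem.egrad = @(A) M_train' * (M_train * A.U * A.S * A.V' - Y_train);
    A_opt = trustregions(problem, [], options);
    A_matrix = A_opt.U * A_opt.S * A_opt.V';
    train_loss_r(fold) = 0.5 * norm(Y_train - M_train * A_matrix, 'fro')^2;
    test_loss_r(fold)  = 0.5 * norm(Y_test  - M_test  * A_matrix, 'fro')^2;
end
train_loss(i, :) = train_loss_r;
test_loss(i, :)  = test_loss_r;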