How to avoid out-of-memory issues for large sparse matrices?


Noam

Jan 27, 2014, 9:15:57 PM
to yal...@googlegroups.com
Hi, 
I am trying to use YALMIP on a relatively large-scale sparse matrix. Up to some size the script runs from beginning to end in very reasonable time. However, when I go to larger sizes, basic operations like indexing start to cause out-of-memory errors. For example, this script:

yalmip('clear');
n = 100000;
x = sparse(1:n,1:n,sdpvar(n,1));
tic
fprintf('Objective... ');
toc;
fprintf('Constraints... ');
tic
o = sdpvar;
a = sub2ind(size(x),1:n,1:n);
c = (x(a') >= rand(size(x,1),1));
c = c + cone(x(a),o);
fprintf('solve... ');

prm.solve.yalmip_prm = sdpsettings;
prm.solve.yalmip_prm.solver = '+mosek';
prm.solve.yalmip_prm.quadprog.Algorithm = 'interior-point-convex';

solvesdp(c,o,prm.solve.yalmip_prm);


This gives:

Error using sdpvar/subsref (line 180)
Error using .'
Out of memory. Type HELP MEMORY for your options.


Error in test (line 11)
c=(x(a')>=rand(size(x,1),1));


Is this somehow avoidable, or is there nothing to be done?

Thanks!

Johan Löfberg

Jan 28, 2014, 2:11:24 AM
to yal...@googlegroups.com
I see where the code crashes due to memory issues, but before robustifying the code, I have to ask why you are writing your model in this weird way. That is, why are you creating a huge diagonal matrix and then extracting the diagonal?

Your model is
x = sdpvar(n,1);
o = sdpvar;
c = (x >= rand(size(x,1),1));
c = c + cone(x',o);

BTW, no reason to use the '+' directive when specifying the solver here, as you aren't setting up a model for the optimizer framework.




Noam

Jan 28, 2014, 8:01:30 AM
to yal...@googlegroups.com
Hi Johan,
Thanks for the quick response. This model is just a synthetic test: I wanted to see how the YALMIP+MOSEK framework would perform when optimizing very large, sparse, symmetric matrices. Had this test passed smoothly, I would have continued to my real goal, which is optimizing a sparse PSD matrix.

Johan Löfberg

Jan 28, 2014, 8:05:00 AM
to yal...@googlegroups.com
I think it would be better if you could show a more typical example instead of this artificially constructed one. That way, I can tune my fix (for this case) to the actual situation, or give a hint on a better way to model it in YALMIP. Basically, it should not happen that YALMIP is the bottleneck in the solution of an SDP; if it is, I want to remove that bottleneck.

Johan Löfberg

Jan 28, 2014, 8:12:00 AM
to yal...@googlegroups.com
A simple (but slow) fix is to replace lines 228-230 in sdpvar/subsref with

try
    Z = X.basis.';
    Z = Z(:,ind1);
    Z = Z.';
catch
    Z = X.basis(ind1,:);
end


Noam

Jan 29, 2014, 10:13:33 PM
to yal...@googlegroups.com
Hi Johan,
Well, I tried compiling a typical example. I have to say I am a little confused as to how YALMIP interacts with sparse matrices. Is there some documentation for that? For instance, I am not sure how to define a symmetric sparse matrix (no matter what I do, I always generate 'full' matrices). If I understand this, I hope I can give you a typical example. Basically, it seems that operations such as diag(), sub-indexing, etc. cause memory issues as mentioned above.

Thanks, Noam

Johan Löfberg

Jan 30, 2014, 12:36:05 AM
to yal...@googlegroups.com
Everything is sparse inside YALMIP; YALMIP never saves or works with any dense data. Perhaps you are misreading the term full printed when a variable is displayed:

>> x = sdpvar(2)
Linear matrix variable 2x2 (symmetric, real, 3 variables)
>> x = sdpvar(2,3)
Linear matrix variable 2x3 (full, real, 6 variables)

full in this context means that the matrix is not structurally symmetric, but fully parameterized. An unfortunate term.



Noam

Jan 30, 2014, 4:04:06 AM
to yal...@googlegroups.com
Hi Johan,

I do not think I was confusing the terms. The phrase was "I am not sure how to define a symmetric sparse matrix (no matter what I do I always generate 'full' matrices)", the emphasis being on the word "symmetric". That is, for instance, if I do
>> L = sparse(inds(:,1),inds(:,2),sdpvar(length(inds),1));
the resulting matrix will always be 'full' no matter how I choose inds. I was asking how I can instruct sparse() to generate a symmetric matrix in this case.

Allow me to rephrase my question. Say I want to define a sparse symmetric matrix of dimension 100,000 x 100,000 with only 10,000 nonzero entries (at known indices, of course). I then want to minimize some functional over this matrix while maintaining a given set of constraints (e.g., being PSD, and all rows summing to 0). Such a case arises, for instance, if one wishes to perform some optimization over a graph Laplacian.

To further clarify, below is an example of my initial attempt to solve such a problem. It is a crude attempt, however: it runs slowly, and I am not sure how to impose the PSD constraint in an elegant manner. Furthermore, if n is changed to 100,000, it never finishes running.

yalmip('clear');
n = 1000;

% create Delaunay triangulation of n points in the plane
X = rand(n,2);
T = delaunay(X);
T = triangulation(T,X);

% generate the matrix L, the graph Laplacian of the triangulation
edges = T.edges;
edges = [edges; edges(:,[2 1])];
inds = [edges; [1:n; 1:n]'];
vars = sdpvar(length(inds),1);
L = sparse(inds(:,1),inds(:,2),vars);

% generate the constraints

% all rows should sum to zero
c_1 = (sum(L) == 0);
% the objective - minimize L_ij*(x_i*x_j + y_i*y_j)
o = sdpvar;
c_2 = cone(diag(X'*L*X),o);
% to make sure L doesn't shrink to 0
c_3 = (L(1,1) == 1);
% make sure L is symmetric (couldn't find a better way to do it)
c_4 = (L == L') + (L + L' >= 0);
% all the constraints
c = c_1 + c_2 + c_3 + c_4;

fprintf('solve... ');

prm.solve.yalmip_prm = sdpsettings;
prm.solve.yalmip_prm.solver = '+mosek';
prm.solve.yalmip_prm.verbose = 0;
prm.solve.yalmip_prm.quadprog.Algorithm = 'interior-point-convex';
tic
res = solvesdp(c,o,prm.solve.yalmip_prm);
toc
res.info



Johan Löfberg

Jan 30, 2014, 6:57:55 AM
to yal...@googlegroups.com
I see what you mean. A small test case indicates that there is a bug in the sparse command. I'll dig into this tonight.

Johan Löfberg

Jan 30, 2014, 7:06:54 AM
to yal...@googlegroups.com
No, it is not a bug. Most likely you made the same mistake as I did when defining the sparse matrix. Something like

ind = [1 2 3 3 1
       1 2 3 1 3]';
v = sdpvar(length(ind),1);
x = sparse(ind(:,1),ind(:,2),v)

Not a symmetric matrix, since the elements (1,3) and (3,1) will have different variables.

Here is one way to define the symmetric matrix, based on the indices:

tri = find(ind(:,1) >= ind(:,2));
ind = ind(tri,:);
v = sdpvar(length(tri),1);

% Define diagonal and lower triangle
x = sparse(ind(:,1),ind(:,2),v)

% Symmetrize
x = x + x' - diag(diag(x));

Alternative approaches based on creating a vector v with repeated elements would also be possible, and would avoid the use of plus, transpose, and multiple diag calls. That would probably be much more efficient in huge cases.
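[For illustration, a sketch of such a repeated-variable construction; this is hypothetical helper code, assuming ind lists every nonzero position with both (i,j) and (j,i) present for off-diagonal entries:

% one variable per lower-triangular position, repeated so that
% (i,j) and (j,i) share the same variable
tri = find(ind(:,1) >= ind(:,2));            % diagonal + lower triangle
v = sdpvar(length(tri),1);
% map each row of ind to its lower-triangular representative
[~, loc] = ismember(sort(ind,2), sort(ind(tri,:),2), 'rows');
x = sparse(ind(:,1), ind(:,2), v(loc));      % symmetric by construction

Since (i,j) and (j,i) pick the same entry of v, no symmetrization step is needed afterwards.]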

I cannot run your code since I lack the function triangulation, so I cannot debug the memory issues.

Johan Löfberg

Jan 30, 2014, 7:12:24 AM
to yal...@googlegroups.com
It can easily be seen that this huge model will cause issues. A parameterized matrix

A0 + x(1)*A1 + x(2)*A2 + ...

is represented internally by

B = [A0(:) A1(:) A2(:) ...]

Even though all the matrices are extremely sparse, MATLAB simply cannot handle this huge representation once one starts manipulating it. As an example, this sparse matrix with zero nonzero elements cannot even be transposed:

B = sparse([],[],[],(100000)^2,1000);
X = B';
Error using '
Out of memory. Type HELP MEMORY for your options.

The sparse format requires pointers to every column, and when B is transposed (which happens during various operations internally in YALMIP) it all of a sudden requires 10^10 pointers.

Noam

Jan 30, 2014, 7:55:10 AM
to yal...@googlegroups.com
Thanks Johan,
I think I understand the obstacle you describe in the last post. Is it a consequence of SDP solvers accepting only dense (as opposed to sparse) inputs?
Is this a problem that can somehow be bypassed with a different formulation or fixed in some future version of YALMIP? Or is there no way around it until the SDP solvers themselves improve?
Thanks, Noam

Johan Löfberg

Jan 30, 2014, 8:17:09 AM
to yal...@googlegroups.com
Not sure what you mean. There is nothing dense anywhere. YALMIP and all solvers work entirely with sparse representations.

Noam

Jan 30, 2014, 8:26:02 AM
to yal...@googlegroups.com
I guess I really am missing the point here (maybe it is I who is dense :-) ). 

The way I see it is as follows: usually, when one says that a solver/algorithm handles sparse input, it means its performance is not affected by the dimension of the matrices or vectors, but only by their sparsity pattern, that is, by how many nonzero entries there are (more or less).
Hence, you would assume that a "sparse" solver could handle a 100,000 x 100,000 matrix with only 1000 nonzero entries (e.g., MATLAB's backslash, eigs, etc. handle it).
However, as I understood from what you wrote: 
A parameterized matrix
A0 + x(1)*A1 + x(2)*A2 + ...
is represented internally by
B = [A0(:) A1(:) A2(:) ...]
I gather that the solvers themselves require the input in this form. Otherwise, without the need to "flatten" the matrices A0...An into 1-D vectors, MATLAB would have been able to handle it without out-of-memory issues, so I assume it is a necessity.

Johan Löfberg

Jan 30, 2014, 8:45:14 AM
to yal...@googlegroups.com
If the matrix you are optimizing is 10^5x10^5, MATLAB typically has no problems if it is very sparse.

However, the 10^5 x 10^5 matrix you are talking about is the matrix obtained for a particular value of the variables parameterizing it. The solver, and YALMIP, needs some way to represent this parameterization: basically, indices specifying that element (i,j) contains the kth variable x(k) (times some scalar). The way many solvers represent this is by simply stacking the base matrices in one large (10^5*10^5) x numberofvariables matrix. That way, the whole indexing magic is kept trivial; it is taken care of by the underlying sparse support. Getting the matrix for a particular x is simply reshape(B*[1;x],n,n), etc.
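[For illustration, a minimal sketch of this stacked-basis representation with toy data; the names here are made up and not YALMIP internals:

% the parameterized matrix A0 + x(1)*A1 + x(2)*A2 stored column-wise
n  = 3;
A0 = speye(n);
A1 = sparse(1,2,1,n,n);
A2 = sparse(3,3,5,n,n);
B  = [A0(:) A1(:) A2(:)];          % n^2 x (1 + number of variables), sparse
x  = [2; -1];                      % a particular value of the variables
M  = reshape(B*[1; x], n, n);      % recovers A0 + 2*A1 - A2]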

However, sparse matrices take up a lot of memory even when they are completely empty. There is always an overhead, and this overhead is linear in the number of columns. Hence, if B is transposed, memory use explodes. YALMIP sometimes performs a transpose to speed up operations that are done faster column-wise than row-wise, and then you sink MATLAB. I will try to safeguard the code, though, detecting these memory issues and falling back to (possibly horribly slow) operations on the original form.

B = sparse([],[],[],(1000)^2,1000);
X = B';
whos
  Name            Size                   Bytes  Class     Attributes

  B         1000000x1000                  8024  double    sparse    
  X            1000x1000000            8000024  double    sparse  

Finally, a 10^5 x 10^5 SDP would never be solved by Mosek or any other traditional SDP solver. Since most solvers are primal-dual second-order solvers, they will have to introduce a dual variable somewhere, and this will typically be a dense 10^5 x 10^5 matrix, requiring 80 GB just to store. Operations on it, costing O(n^3) flops, would also fry the processor. When going above dimensions of a few thousand, purpose-built first-order solvers etc. are required.

Johan Löfberg

Jan 30, 2014, 8:47:19 AM
to yal...@googlegroups.com
If you get me the function triangulation, or create a mock example I can run, I will see how far YALMIP can go before running into problems. At least I should be able to send it to Mosek, and then Mosek can take the blame for not solving it :-)

Johan Löfberg

Jan 31, 2014, 4:44:35 AM
to yal...@googlegroups.com
The following test (tridiagonal matrix) now runs with essentially no overhead (35 s in Mosek, 0.7 s total overhead from YALMIP):

n = 1000;
ind1 = (1:n)';
ind2 = (2:n)';
ind3 = (3:n)';
x1 = sdpvar(n,1);
x2 = sdpvar(n-1,1);
x3 = sdpvar(n-2,1);
ind = [ind1 ind1; ind2 ind2-1; ind2-1 ind2; ind3 ind3-1; ind3-1 ind3];
X = sparse(ind(:,1),ind(:,2),[x1;x2;x2;x3;x3]);
C = sprandn(n,n,0.01);
solvesdp([X >= 0, sum(diag(X))==1],trace(X))

This is n = 1000; n = 2000 takes 350 s (1 s overhead). The case n = 100000 should be expected to take roughly 100^3 times more time and memory, i.e., 35*1e6 seconds, which is about a year.

To get it to run, I had to clean up a memory issue in the check for Hermitian parameterizations and in the conversion to Mosek-specific data. After that, it ran without issues.

Noam

Feb 1, 2014, 12:09:54 PM
to yal...@googlegroups.com
Thanks Johan, I guess that ends any idea I had of trying to solve such a large sparse system. But thanks for the fixes! I am sure they will be useful in the future.
Noam

Mahdi Gh

Oct 27, 2015, 1:24:54 PM
to YALMIP
Hello Johan Lofberg,


I aim to solve the following semidefinite program:

Maximize trace(M*X)

subject to trace(A_i*X) >= 0 for 1 <= i <= 1000000

The matrices M and A_i are all sparse in my code. As you know, Mosek accepts sparse matrices, so I can give the above SDP to Mosek directly in a short time. However, it takes a long time when I give it to YALMIP; in other words, it takes a long time to define the constraints of the above SDP for YALMIP in the MATLAB environment. How can I reduce the time spent passing the problem to YALMIP so it is as low as when I give the problem to Mosek directly?

Johan Löfberg

Oct 27, 2015, 1:47:37 PM
to YALMIP
Note that trace(C*X) should never be written as such, since it is an O(n^3) operation; for large problems you should write it as the O(n^2) operation C(:)'*X(:) and avoid wasting time and memory on explicitly forming C*X. Additionally, this model probably gains from being dualized (option dualize, see the Wiki).

However, this is hopefully not the problem you actually want to solve, as it is either infeasible or unbounded (and I assume you also have an actual cone constraint, X >= 0).
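[For illustration, a minimal sketch of the identity above with toy data; it relies on X being symmetric, which an SDP matrix variable is:

% trace(C*X) == C(:)'*X(:) for symmetric X; the right-hand side never
% forms the product C*X explicitly
n = 500;
C = sprandn(n,n,0.01);
X = sdpvar(n);                 % n x n symmetric matrix variable
Objective = C(:)'*X(:);        % O(n^2) instead of O(n^3)]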

Mahdi Gh

Oct 27, 2015, 2:43:53 PM
to YALMIP

The SDP problem I presented here is a simplified version of the actual problem I aim to solve. The original version has more constraints in addition to X >= 0, and is consequently feasible and bounded.

I used the technique that you explained. That is, I replaced the constraints

Constraint = [];
Constraint = [Constraint; trace(A*X) >= 0];

with the constraints  

Constraint = [];
Constraint = [Constraint; A(:)'*X(:) >= 0];

However, passing my SDP to YALMIP is still very slow compared to giving it to Mosek directly. Is there any other way I can define the above constraints for YALMIP that is faster?

Regarding the dualizing, I point out that YALMIP converts my problem to its dual and then gives it to Mosek.


Mahdi Gh

Oct 27, 2015, 3:13:41 PM
to YALMIP
I add that I think the following code


Constraint = zeros(N,1);

for i=1:N
    Constraint(i) = My-SDP-Constraint;
end

is probably faster than 

Constraint = [];

for i=1:N
    Constraint = [Constraint; My-SDP-Constraint];
end

when N is large. However, the first code leads to the following error:

Conversion to double from constraint is not possible

Johan Löfberg

Oct 27, 2015, 3:27:21 PM
to YALMIP
Without reproducible code, nothing more can be said.

With dualize, what happens is that YALMIP interprets (possibly with small changes) your model as a model given in standard primal form, which your problem almost is.
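[As a hedged illustration of the dualize option mentioned above, on a toy problem rather than the poster's model; all names here are made up:

% Ask YALMIP to interpret the model as one in standard primal form
X = sdpvar(10);
C = magic(10);
F = [X >= 0, trace(X) == 1];
ops = sdpsettings('solver','mosek','dualize',1);
solvesdp(F, C(:)'*X(:), ops);]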

Johan Löfberg

Oct 27, 2015, 3:32:19 PM
to YALMIP
Your first alternative makes no sense. Why would you want to place constraints in a vector of zero doubles?

A list of constraints is obviously not a simple array of doubles, where preallocation would be beneficial. There is a lot more going on behind the scenes when you create lists of objects.

Mahdi Gh

Oct 27, 2015, 5:43:57 PM
to YALMIP
Regarding the code:


Constraint = [];
for i=1:100000
    Constraint = [Constraint; A_i(:)'*X(:) >= 0];
end

I think this code is slow because it uses so much memory. Please notice that the growing list of constraints (that is, [Constraint; A_i(:)'*X(:) >= 0]) is copied into Constraint 100000 times.



Mahdi Gh

Oct 27, 2015, 10:30:57 PM
to YALMIP
For a specific SDP problem, please consider the following two cases:

1- The SDP problem is given to YALMIP, and YALMIP (after processing) passes it to Mosek. The example is solved in 13 seconds (solver time).
2- The SDP is given to Mosek directly (without using YALMIP). The example is solved in 360 seconds (solver time).

I am curious what processing is done to my SDP by YALMIP that reduces the solver time from 360 seconds to 13 seconds. The output of Mosek at the early stages (before the interior-point optimizer starts) is shown below for the two cases:


Case 1

Name                   :                 
  Objective sense        : min             
  Type                   : CONIC (conic optimization problem)
  Constraints            : 2016            
  Cones                  : 0               
  Scalar variables       : 8205            
  Matrix variables       : 1               
  Integer variables      : 0 

Case 2 

 Name                   :                 
  Objective sense        : max             
  Type                   : CONIC (conic optimization problem)
  Constraints            : 42145           
  Cones                  : 1               
  Scalar variables       : 3               
  Matrix variables       : 1               
  Integer variables      : 0 

I point out that the number of constraints in my SDP is 42145.


Johan Löfberg

Oct 28, 2015, 2:39:28 AM
to YALMIP
Impossible to answer, as you're not giving the full code. A small difference which you think is irrelevant can make a huge difference.

Of course, a vectorized version should be used when you already have the data in vectorized form (i.e., you've basically done the modelling already...). This
for i=1:m
    Constraint = [Constraint, A(:,i)'*X(:) >= 0];
end

is horribly inefficient in MATLAB, regardless of whether YALMIP is used or not. You should of course write it as
Constraint = [ Constraint, A'*X(:) >=0 ];
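[For illustration, a sketch of the fully vectorized pattern with made-up toy data, where column i of A plays the role of A_i(:):

n = 100; m = 1000;
X = sdpvar(n);
A = sprandn(n^2, m, 0.001);        % column i holds the data of A_i(:)
Constraint = [A'*X(:) >= 0];       % all m scalar constraints in one call]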


Mahdi Gh

Oct 28, 2015, 3:31:44 AM
to YALMIP
I aim to solve the semidefinite program shown in the attached PDF file. We can solve this problem in two different ways:



Approach 1 :     The SDP is given to Yalmip and then is passed to Mosek.

Approach 2 :      The SDP is directly given to Mosek (without using Yalmip as an interface). 

The MATLAB codes for the two approaches are included in the attached zip file (the necessary data for running the codes is also included, so the codes can be run). The two codes give the same optimal value of the SDP objective as their output. However, in the first code (Approach_1) the solver time is 13 seconds, while in the second code (Approach_2) it is 360 seconds. Would you please let me know what processing is done by YALMIP in the first code (Approach_1) that reduces the solving time so considerably?
Codes.zip

Mahdi Gh

Oct 28, 2015, 3:38:04 AM
to YALMIP
Sorry,

I attached the wrong PDF file. The correct PDF file is now attached.
SDP_Problem_Correct.pdf

Johan Löfberg

Oct 28, 2015, 3:54:12 AM
to yal...@googlegroups.com
Well, the model is simply more efficiently solved when looked at as a parameterization of a dual model, which is the default in YALMIP. The way you represent a model, which is unique on paper, is not unique when you actually compile the data.

http://www.control.isy.liu.se/~johanl/2009_OMS_DUALIZE.pdf (note: your model is a case which does not gain from being dualized, i.e., interpreted as primal. In terms of Example 3.2 in that paper, the "terrible" model is actually the good one, as the number of variables needed to represent V is pretty low compared to the number of constraints.)

BTW, this line
Objective_Matrix=zeros(Len,Len);

is completely redundant, and your for-loops can be removed:
Constraints = [Constraints, reshape([A{:}],Len^2,[])'*V(:) >= 0];
Constraints = [Constraints, reshape([B{:}],Len^2,[])'*V(:) == 0];
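[To see why the one-liner works, here is a small sketch with toy data, not taken from the attached code: [A{:}] concatenates the Len-by-Len cell contents side by side, and the reshape regroups them so that column i equals A{i}(:).

A = {magic(3), eye(3)};
S = reshape([A{:}], 3^2, []);      % 9x2 matrix; S(:,i) equals A{i}(:)]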



Mahdi Gh

Oct 28, 2015, 6:21:27 AM
to YALMIP
Thanks Johan.

What you explained was really helpful, especially the two lines of code that you suggested as an alternative to the for-loops.