Tobias Achterberg
Dec 4, 2017, 6:03:26 PM
to gur...@googlegroups.com
There are some calculations you can perform to get a guess about the minimal memory
requirements.
To store the problem data you need three double vectors over the variables (lower bounds,
upper bounds, and objective coefficients), one double vector and one char vector over the
constraints (right-hand sides and senses), and the coefficient matrix. The coefficient
matrix, stored column-wise, needs one int and one double for each non-zero coefficient.
Moreover, it needs two 64-bit integers for each column to specify the start and end of the
column in the coefficient matrix. So, we get
(8+8+8+8+8)*cols + (8+1)*rows + (4+8)*nonz = 40*cols + 9*rows + 12*nonz
bytes of memory consumption.
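As a small sketch, the per-model storage formula can be written out like this (the
function name is my own, not anything from the Gurobi API):

```python
def problem_data_bytes(cols, rows, nonz):
    """Estimate bytes to store one copy of the problem data:
    - per column: 3 doubles (lower bound, upper bound, objective)
                  plus 2 64-bit ints (column start/end) = 40 bytes
    - per row:    1 double (rhs) + 1 char (sense)        =  9 bytes
    - per nonz:   1 int (row index) + 1 double (value)   = 12 bytes
    """
    return 40 * cols + 9 * rows + 12 * nonz
```

For instance, a model with 1 million columns, 100k rows, and 10 million non-zeros needs
about 161 MB just for this raw problem data.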
Then, presolve is applied. During presolve, there is additional memory overhead, but this
is released afterwards. Presolve produces another problem representation, namely the
presolved model. The same size formula applies to it, but of course the problem
dimensions will be different (usually smaller).
Now, for this presolved MIP model we also need the transposed matrix, which consumes
another 12*nonz bytes. Then, the LP relaxation of the presolved model is set up, again
with transposed matrix. So, here go another 40*cols + 9*rows + 24*nonz bytes.
For the LP relaxation we need to calculate an LU factorization. This may consume quite a
bit of memory, but that greatly depends on the structure of the problem. For now, let's
just assume this gives another 12*nonz bytes.
Each thread needs its own LP relaxation to work with. Hence, for each thread after the
first, you have to add another 40*cols + 9*rows + 36*nonz bytes to store the model in both
column- and row-wise representation and an LU factorization.
Moreover, each thread may spawn a sub-MIP heuristic, which starts by copying the model.
There is your next 40*cols + 9*rows + 24*nonz bytes per thread.
There are quite a number of additional column and row vectors of working data flying
around during a MIP solve. I don't know exactly how many, but I guess something like
100*cols + 100*rows bytes per thread is some reasonable estimate.
Finally, there is the MIP search tree. This can become very big, but fortunately the open
nodes can be buffered pretty efficiently to the hard disk (if you enable the respective
parameter, NodefileStart). So, this is not so much a concern about RAM. If you just add
1 or 2 GB of RAM for the search tree, then you should usually be in good shape.
So, to summarize, you need something like:
80*cols + 18*rows + 36*nonz
+ (180*cols + 118*rows + 60*nonz) * threads
+ 2e+9 bytes
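Putting the pieces together, here is a compact Python sketch of the whole estimate (the
function name and the 2 GB tree default are my own choices for illustration):

```python
def estimate_mip_memory_bytes(cols, rows, nonz, threads, tree_bytes=2 * 10**9):
    """Rough lower-bound memory estimate for a Gurobi MIP solve,
    using the presolved dimensions for cols/rows/nonz."""
    # Original model plus presolved model (40c + 9r + 12n each),
    # with the presolved copy also stored transposed (another 12n):
    base = 80 * cols + 18 * rows + 36 * nonz
    # Per thread: LP relaxation in column- and row-wise form plus an
    # LU factorization (40c + 9r + 36n), a sub-MIP heuristic copy
    # (40c + 9r + 24n), and ~100 bytes per column and row of working
    # vectors (100c + 100r):
    per_thread = 180 * cols + 118 * rows + 60 * nonz
    return base + per_thread * threads + tree_bytes
```

Running this with the example below (1e+6 columns, 1e+5 rows, 1e+7 non-zeros, 16
threads) reproduces the roughly 15 GB figure.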
For example, consider a model with 1 million columns, 100k rows, and 10 million non-zeros,
and assume that presolve does not shrink the model significantly. You want to solve it
with 16 threads. Then you get
80*1e+6 + 18*1e+5 + 36*1e+7 + (180*1e+6 + 118*1e+5 + 60*1e+7) * 16 + 2e+9
which is approximately 15 GB.
This sounds pretty realistic to me.
But please beware of the barrier algorithm! If you need to solve your LP relaxation (e.g.,
the initial LP relaxation) with barrier, then you may need to have much more RAM as
barrier can consume lots of memory.
In any case, the more RAM you have, the better. For a 32-core box, I would go with at
least 128 GB if you can afford it.
Regards,
Tobias