add_constraint becomes slow in presence of many constraints

Peter Mueller

Apr 8, 2014, 10:07:05 AM
to sage-s...@googlegroups.com
While analyzing why a program becomes unexpectedly slow (despite moderate memory usage and no swapping), I found that the method add_constraint gets slower and slower once many constraints are already present. The following simple (and silly) program demonstrates this:

P = MixedIntegerLinearProgram(solver="Coin", maximization=False)
x = P.new_variable(real=True)
z = 0
t1 = cputime()
while True:
    # add one random constraint of the form x[0] + c*x[1] <= b
    P.add_constraint(x[0] + random()*x[1], max=random())
    z += 1
    if z % 10000 == 0:
        # CPU time spent on the last batch of 10000 constraints
        t2 = cputime()
        print z, t2 - t1
        t1 = t2

10000 0.860054
20000 1.528095
30000 2.23214
40000 3.788236
50000 4.580286
60000 7.488468
70000 15.997
80000 23.297456
90000 31.489968
100000 37.614351
110000 43.946747
120000 47.818988

So we see that adding the first 10000 constraints takes 0.86 seconds, while adding 10000 constraints when 110000 constraints are already present takes 47.82 seconds, and things keep getting worse.
Is this an implementation problem of the method add_constraint, or something that cannot really be avoided? Considering that the available powerful backends can solve huge systems, it would be a pity if merely building the linear program were the bottleneck.

-- Peter Mueller (Wuerzburg)

Nathann Cohen

Apr 21, 2014, 4:13:55 AM
to sage-s...@googlegroups.com
Hello Peter !

Sorry for my late answer; I am away from everything at the moment, and I am reduced to answering your message from a mobile phone :-)

I believe this is slow because we never inform the LP data structure (which we do not implement ourselves in Sage) of the problem's size. I guess it does not like never knowing how many constraints there will be, and I fear it makes a lot of useless copies when you add a constraint (perhaps it copies the whole matrix just to add one row?).

In the solvers' API there is a command to add many constraints at the same time (all set to zero) that you can fill in later. That way you do not add new rows, you just fill in some numbers. I guess this would probably solve your problem: you just need a function that allocates many empty rows up front (when you build the LP) instead of adding them one by one, something like the sketch below.
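
A minimal sketch of the idea, assuming Sage's low-level backend API (get_backend and add_linear_constraints; I write this from memory, so the exact names and signatures may differ):

P = MixedIntegerLinearProgram(solver="GLPK")
b = P.get_backend()        # talk to the solver backend directly
b.add_variables(2)         # columns must exist before rows can use them
# allocate 100000 empty constraints "... <= 1.0" in a single call;
# their coefficients can be filled in later, no new rows are created
b.add_linear_constraints(100000, None, 1.0)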

Nathann

john_perry_usm

Apr 21, 2014, 2:23:55 PM
to sage-s...@googlegroups.com
> I guess it does not like never knowing how many constraints there will be, and I fear it makes a lot of useless copies when you add a constraint (perhaps it copies the whole matrix just to add one row?).

I think this is right. Allocating and copying such a huge matrix repeatedly would be terrible. Perhaps we should introduce an API function "add_constraints" that takes a list of lists, or a matrix? If a solver doesn't support such a thing, we could fall back on the brute-force method, something like the sketch below.
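
A purely hypothetical sketch of that fallback (the name add_constraints and its signature are invented here; only get_backend and add_linear_constraint are existing Sage API, as far as I know):

def add_constraints(P, rows, upper_bounds):
    # rows: one dict {variable_index: coefficient} per constraint
    # upper_bounds: one upper bound per row
    b = P.get_backend()
    for row, ub in zip(rows, upper_bounds):
        # brute force: hand each row to the backend one at a time
        b.add_linear_constraint(list(row.items()), None, ub)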

john perry

Nathann Cohen

Apr 28, 2014, 8:49:15 AM
to sage-s...@googlegroups.com
Helloooooooooooooooooo !!!
 
> Perhaps we should introduce an API function "add_constraints" that takes a list of lists, or a matrix? If a solver doesn't support such a thing, we could fall back on the brute-force method.

I am almost sure that all solvers support that: some "allocate new constraints" function that would just be called before creating the actual constraints :-)

Nathann 

Stephen Hartke

Apr 29, 2014, 2:03:19 PM
to sage-s...@googlegroups.com
Adding row constraints one at a time is slow because of the internal data structures that simplex solvers use.  In large LPs/IPs, the constraint matrix is almost always sparse, and so allocating an entire dense matrix doesn't make sense.  Instead, a packed sparse matrix is used.  Because of the way the simplex algorithm works, quick access is needed to /columns/ of the constraint matrix, so the sparse matrix format used is ordered by columns.  When adding a row to such a matrix, the entire matrix needs to be reordered.  As the matrix gets large, this reordering becomes slower.
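
To make the column-ordered layout concrete, here is a small illustration using SciPy's CSC format (just for illustration; the solvers use their own equivalent structures):

import numpy as np
from scipy.sparse import csc_matrix

A = csc_matrix(np.array([[1, 0, 2],
                         [0, 3, 0]]))
print(A.data)     # nonzeros, column by column:  [1 3 2]
print(A.indices)  # row index of each nonzero:   [0 1 0]
print(A.indptr)   # where each column starts:    [0 1 2 3]
# Appending a row touches every column, so data and indices must be
# rebuilt each time; done repeatedly, that is exactly the slowdown above.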

One solution is to just give the entire matrix to the solver all at once.  In GLPK, this matrix does not have to be column ordered, but it is reordered before solving (see the glp_sort_matrix routine; the user doesn't have to call it explicitly, but it does get performed).  In CPLEX, the matrix does have to be column ordered, which is somewhat annoying to do (see CPXcopylp).  Gurobi seems to use a similar matrix format ("Compressed Sparse Column (CSC) format" according to their documentation).  For COIN, a packed sparse matrix is used.  The documentation doesn't seem to explicitly state that it's column ordered, but I assume it must be.
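
For instance, loading the whole matrix into GLPK in one call could look like this through the swiglpk Python bindings (an assumption on my part; swiglpk just mirrors the C API from the GLPK manual):

from swiglpk import (glp_create_prob, glp_add_rows, glp_add_cols,
                     glp_load_matrix, intArray, doubleArray)

lp = glp_create_prob()
glp_add_rows(lp, 2)   # allocate both constraint rows at once
glp_add_cols(lp, 2)   # and both variable columns
# triplet format, 1-indexed as in the C API: entry k sets A[ia[k], ja[k]] = ar[k]
ia, ja, ar = intArray(4), intArray(4), doubleArray(4)
ia[1], ja[1], ar[1] = 1, 1, 1.0
ia[2], ja[2], ar[2] = 1, 2, 2.0
ia[3], ja[3], ar[3] = 2, 2, 3.0
glp_load_matrix(lp, 3, ia, ja, ar)  # one bulk call; GLPK sorts it internally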

The upshot of all this is that it's worth thinking about what a nice Sage interface for large LPs/IPs would look like.  I think allocating a large dense matrix doesn't make sense because of the sizes involved.  As Peter's example shows, it's more natural to specify LPs/IPs by rows than by columns, but then a reordering must occur at some point.

Best wishes,
Stephen


