Handling Large Scale Data

jonp...@gmail.com

unread,
Oct 1, 2022, 9:41:27 AM10/1/22
to or-tools-discuss
Problem setup:
* ~2000 consignments
* ~30 vehicles
* "time" and "number" constraints

Model setup:
Currently I handle it by making every node a pickup-and-delivery node: for a plain "delivery", the pickup node has the same location as the hub and acts as a dummy node (so ~4k nodes in total). Constraints are set up as described in the guide.
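For context, here is a minimal plain-Python sketch of the dummy-pickup idea described above. All names and coordinates are illustrative, not from the original post:

```python
# Sketch: turn plain deliveries into pickup-and-delivery pairs by adding
# a dummy pickup node co-located with the hub.

HUB = (0, 0)  # hub coordinates (illustrative)

def build_pnd_nodes(delivery_locations):
    """Return (locations, pickup_delivery_pairs) where every delivery
    gets a dummy pickup node placed at the hub."""
    locations = [HUB]           # node 0 is the depot/hub
    pairs = []
    for loc in delivery_locations:
        pickup_index = len(locations)
        locations.append(HUB)   # dummy pickup at the hub
        delivery_index = len(locations)
        locations.append(loc)   # real delivery location
        pairs.append((pickup_index, delivery_index))
    return locations, pairs

locations, pairs = build_pnd_nodes([(10, 20), (30, 40)])
assert len(locations) == 5            # 1 hub node + 2 nodes per consignment
assert pairs == [(1, 2), (3, 4)]
```

Each pair would then be registered with the routing model (e.g. via `routing.AddPickupAndDelivery(...)` after converting nodes to indices), which is what doubles the node count to ~4k for ~2000 consignments.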

Requirement:
The solver should be able to plan all of them.
* Currently it reports "No Assignment Found" after 2122 seconds.
* Further, I set log_search = True, and not even "Root Node Processed" was printed.

Is 4k nodes with 30 vehicles unreasonable for OR-Tools? Or is my setup too naive? This thread tells me we can decompose a problem, solve the parts in parallel, and then join the results, but I'm not sure how exactly this would work.

Maybe I'm handling it naively here, so please don't hesitate to guide me towards best practices for large-scale data.

Specifications:
* OR-Tools version: 9.4.1874


Thanks!


J. E. Marca

unread,
Oct 1, 2022, 11:44:16 AM10/1/22
to or-tools-discuss
The easiest, biggest win is to make sure you have caching turned on.

I just posted about that recently in reply to a similar question.


Try that, and you should see at least a 10x speed-up. If it doesn't help enough, please post more details about what you are doing. You might have a bug, of course, but more likely you'll have to be clever about reducing the search space.
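As a rough illustration of what callback caching buys (this is a plain-Python analogue using `functools.lru_cache`, not the OR-Tools implementation — in OR-Tools the cache is, if I recall correctly, controlled by `max_callback_cache_size` in the routing model parameters): memoization means the expensive per-arc computation runs at most once per (from, to) pair, no matter how often the solver queries that arc during local search.

```python
# Plain-Python analogue of transit-callback caching. The callback body
# and node numbers are illustrative.
from functools import lru_cache

calls = 0

@lru_cache(maxsize=None)
def transit(from_node, to_node):
    """Stand-in for an expensive distance callback."""
    global calls
    calls += 1
    return abs(from_node - to_node)

# The solver may query the same arc thousands of times during search;
# with caching, the underlying computation runs only once per arc.
for _ in range(1000):
    transit(3, 7)

assert calls == 1
assert transit(3, 7) == 4
```

In OR-Tools the win is larger than plain memoization, because each avoided call is also an avoided crossing of the Python/C++ language boundary (as explained later in this thread).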

James

jonp...@gmail.com

unread,
Oct 2, 2022, 2:09:16 AM10/2/22
to or-tools-discuss
Dear James

Your suggestion worked phenomenally well. Before caching, the search reached Solution #457; with caching it reached Solution #889 in the same amount of time. That's roughly a 1.94x boost. Amazing so far!

I noticed something, though. After enabling caching, the search reached #587 within roughly the first 15% of the time limit, during which it used only `PairExchangeOperator`. From that point on, it switched to other operators such as ExchangeSubtrip, Exchange, RelocateNeighbors, and TwoOpt in seemingly random order. So it reached #587 within 15% of the time limit, but needed 99% of it to reach #889. Clearly there's a significant slowdown after the 15% mark, which coincides with moving on to operators other than PairExchangeOperator.

My question is: does the cache only work for PairExchangeOperator? Or, since that operator ran first, did the cache only store that part of the search? It's definitely not the promised 10x, but maybe my definition of "speed" is wrong.


Thanks & Regards


aire...@gmail.com

unread,
Oct 2, 2022, 7:57:36 AM10/2/22
to or-tools-discuss
You may also try RegisterTransitMatrix; it will give you more speed, as it stores the matrix in memory and avoids callbacks.
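The idea behind RegisterTransitMatrix is to precompute every arc cost up front instead of evaluating a callback per arc. A minimal sketch, with illustrative locations and a Manhattan metric of my choosing:

```python
# Sketch: precompute a full transit matrix so the solver never needs a
# per-arc callback. Locations and the distance metric are illustrative.
locations = [(0, 0), (1, 2), (4, 6)]

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

matrix = [[manhattan(a, b) for b in locations] for a in locations]

assert matrix[0][1] == 3   # (0,0) -> (1,2)
assert matrix[1][2] == 7   # (1,2) -> (4,6)
assert matrix[2][2] == 0   # diagonal is zero
```

With OR-Tools, this precomputed `matrix` would then be handed to the model via `routing.RegisterTransitMatrix(matrix)` in place of a Python callback.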

blind.line

unread,
Oct 2, 2022, 2:47:29 PM10/2/22
to or-tools...@googlegroups.com
The caching only saves the lookups locally. The speed-up comes from the fact that the C++ solver does not have to cross the language boundary out to Python (or whatnot) every time it calls the callback. Instead, it calls the callback once for each dimension/node/node-pair/vehicle-node-pair, etc.

So it takes longer to start the solver part, but once it gets going it is way faster each cycle. 

But that is all it does: it just prevents calling the callback repeatedly.

The matrix dimension stuff is equivalent, but IMHO it is easier to use the cache feature. 

The different operators etc are unrelated to caching. That’s all solver side things. 

James
