Best practice for solving large number of similar instances

61 views
Skip to first unread message

Hayssam Soueidan

unread,
May 28, 2018, 9:00:22 PM5/28/18
to MiniZinc
Hi,
I'm working on a project where I need to solve thousands of small to large "simple" instances of a knapsack-like problems. All my instances have the same structure, the same constraints but varies in the number of items (thus variables). The domains can be fixed for all the instances. 

I've successfully integrated minizinc into a workflow which 1. extract relevant data from a database, 2. generate the .mzn model (including the initial variable assignments) 3. call the minizinc driver for the CBC solver and 4. parse and interpret the solution. 

I've now hit a performance bottleneck at the flattening phase for large instances (cf verbose flattening stats). Flattening takes up to 30seconds while solving to optimality requires less than 1s. I've tried to disable the "optimized flattening" to no avail, I've also tried to split the toolchain (mzn2fzn then mzn-cbc); and finally tried to separate model and data definitions but didn't observe any significant improvement. 

If it helps, I've identified the set of constraint causing the problem: 
...
array[1..num_items] of int: item_label= [1,12,81,12, [10 000 more ints between 1..82], 53 ];

int: num_label = 82;
array[1..num_label] of var 0..num_items: cluster_distribution;
constraint forall(i in 1..num_label)(cluster_distribution[i]==sum(j in 1..num_items)(item_indicator[j] /\ item_label[j]==i));
var 1..num_label: nz_label;
constraint non_zero_label = sum(i in 1..num_label) (cluster_distribution[i]>0);
....


Basically, each of the 5000 items have a label (out of ~100 possible labels), and I try (amongst other objectives) to maximise the number of different labels (the non_zero_label var ). I've tried various formulation but this one seems to be the most efficient. 

Could anyone provide me with some guidance on how to accelerate this flattening phase? How would you approach the task of solving thousands of "similar" instances? 

Would it be beneficial to directly generate the MPS file and then call CBC natively? I expect minizinc to be quite efficient at compiling MPS file, but maybe I could get a speedup by exploiting the repeated structure of the instances? I however fear this would more or less boil down to re-coding a poorly written half-baked pseudo custom version of minizinc, which doesn't feel right. 

Thanks ! 



My system info: Os X and Ubuntu, mnz 2.1.7.

Compiling instance-2905-gd.mzn
MiniZinc to FlatZinc converter, version 2.1.7
Copyright (C) 2014-2018   Monash University, NICTA, Data61
Parsing file(s) 'instance-2905-gd.mzn' ...
processing file '../share/minizinc/std/stdlib.mzn'
processing file '../share/minizinc/std/builtins.mzn'
processing file '../share/minizinc/std/redefinitions-2.1.1.mzn'
processing file '../share/minizinc/std/redefinitions-2.1.mzn'
processing file '../share/minizinc/linear/redefinitions-2.0.2.mzn'
processing file '../share/minizinc/linear/redefinitions-2.0.mzn'
processing file '../share/minizinc/linear/redefinitions.mzn'
processing file '../share/minizinc/std/nosets.mzn'
processing file '../share/minizinc/linear/redefs_lin_halfreifs.mzn'
processing file '../share/minizinc/linear/redefs_lin_reifs.mzn'
processing file '../share/minizinc/linear/domain_encodings.mzn'
processing file '../share/minizinc/linear/redefs_bool_reifs.mzn'
processing file '../share/minizinc/linear/options.mzn'
processing file '../share/minizinc/std/flatzinc_builtins.mzn'
processing file 'instance-2905-gd.mzn'
 done parsing (70 ms)
Typechecking ... done (13 ms)
Flattening ... done (16504 ms), max stack depth 14
MIP domains ...82 POSTs [ 82,0,0,0,0,0,0,0,0,0, ], LINEQ [ 0,0,0,0,0,0,0,0,82, ], 82 / 82 vars, 82 cliques, 2 / 2 / 2 NSubIntv m/a/m, 0 / 127.085 / 20322 SubIntvSize m/a/m, 0 clq eq_encoded  ...  done (28 ms)
Optimizing ... done (8 ms)
Converting to old FlatZinc ... done (37 ms)
Generated FlatZinc statistics:
Variables: 21258 int, 20928 float
Constraints: 416 int, 20929 float
    This is a minimization problem.
Printing FlatZinc to '/var/folders/99/0zvzbfcj3h16g04d07w38wrw0000gn/T/MiniZinc IDE (bundled)-RzF4wk/instance-2905-gd.fzn' ... done (316 ms)
Printing .ozn to '/var/folders/99/0zvzbfcj3h16g04d07w38wrw0000gn/T/MiniZinc IDE (bundled)-RzF4wk/instance-2905-gd.ozn' ... done (111 ms)
Maximum memory 318 Mbytes.
  Flattening done, 17.09 s

Gleb Belov

unread,
May 29, 2018, 9:00:35 PM5/29/18
to mini...@googlegroups.com
Hi,

replace:

constraint forall(i in 1..num_label)(cluster_distribution[i]==sum(j in 1..num_items where item_label[j]==i)(item_indicator[j]));

It should save a few temporary variables during flattening which have to be created and eliminated.

In any case, it produces interesting fzn, I am happy to look further if you send the instance to my email.

--
You received this message because you are subscribed to the Google Groups "MiniZinc" group.
To unsubscribe from this group and stop receiving emails from it, send an email to minizinc+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/minizinc/3a23356e-5f1e-4245-9324-960e4c1590fc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Hayssam Soueidan

unread,
May 30, 2018, 11:22:28 AM5/30/18
to MiniZinc
Hi Gleb,

Thanks for your interest! 
Indeed, the list comprehension does save a lot of time, I've reduced the flattening from 20s to 4s, thanks a lot! 

If you are willing to delve further (or just being curious about minizinc usages :) ) I've sent you an email with the full model. 

Thanks again,
Hayssam.


To unsubscribe from this group and stop receiving emails from it, send an email to minizinc+u...@googlegroups.com.
Reply all
Reply to author
Forward
0 new messages