State and future of the MiniZinc benchmarks library


Michael Marte

Oct 5, 2020, 7:42:26 AM
to MiniZinc
As the author of Yuck, I am using the MiniZinc benchmarks library for regression and performance testing. Lately, I have become concerned about the state and the future of the library:

  • It seems that the library is not being maintained: new problems and instances from the MiniZinc challenges 2019 and 2020 have not yet been added, and some pending pull requests have been ignored for years.
  • Over the years, the only source of new problems and instances has been the MiniZinc challenges. Since problem submitters are restricted to integer variables, the resulting collection does not exercise all MiniZinc features.
  • Problem submitters are asked to provide many instances. Unfortunately, not all of them are published immediately; some are kept as a secret supply for future challenges. This practice limits access to test data, which hampers the development of solvers.
  • I noticed that, over the years, there were only a few instances at the MiniZinc challenges that no solver could cope with. Moreover, I recently had a look at the VRPLC instances as submitted to the MiniZinc challenge 2018. There are only five instances and, judging from the underlying paper "A branch-and-price-and-check model for the vehicle routing problem with location congestion" by Edward Lam and Pascal Van Hentenryck, these are the easy instances from a much larger instance set, with the other instances too hard for current CP technology. Is this absence of intractable instances a consequence of the fact that problem submitters are requested to submit only instances which range "from easy-to-solve to hard-to-solve for an ordinary CP system"? In any case, not adding instances harder than what current CP technology can cope with is not healthy and hampers the enhancement of solvers.
  • I noticed that many models are poorly documented and, as a fun exercise, I created the attached overview of the models from the MiniZinc challenge 2020.
    It's truly shocking: there are twelve models with neither a description nor a link to a publication, seven of which do not even mention the author! This is not helpful for anyone looking at these models, in particular students. Even if you can guess from the file name what the model is for, you are left with many questions: Which variant of the problem was modelled? Which approach was chosen and why? Omissions? Additions? Limitations?
    Another issue is the lack of licenses: if there is no license, then default copyright applies, limiting the (re)use of the respective models.

In conclusion, I think that the MiniZinc benchmarks library needs some love and a vision.

Regarding future calls for problem submissions, I suggest asking for self-explanatory models that come with the following information:

  • the author(s)
  • a problem description (preferably inline, links tend to rot)
  • relevant publications (preferably by means of bibliographical references, links tend to rot) 
  • the license
  • meaning of input parameters
  • meaning of variables
  • purpose of constraints
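By way of illustration, a standard header along these lines might look as follows. This is only a sketch; the field names, the example problem, and the tiny model fragment are my own suggestion, not an established convention of the library:

```minizinc
% Problem:     Single-machine scheduling (makespan minimization)
% Author(s):   A. Author <a.author@example.org>
% Description: Schedule n non-preemptive tasks on one machine so that
%              the makespan is minimized. (Inline and self-contained,
%              so the description survives link rot.)
% References:  bibliographical reference(s) to the relevant publications
% License:     MIT (or another explicit license)

include "globals.mzn";

int: n;                                     % input: number of tasks
array[1..n] of int: duration;               % input: processing time of task i
array[1..n] of var 0..sum(duration): start; % variable: start time of task i

% constraint: tasks run on a single machine without overlapping
constraint disjunctive(start, duration);

% objective: minimize the makespan
solve minimize max(i in 1..n)(start[i] + duration[i]);
```

Even a header this small answers the questions raised above: who wrote the model, what problem variant it captures, under which terms it may be reused, and what each parameter, variable, and constraint means.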

To increase the value of the library, I suggest

  • updating the library with the problems submitted to the 2019 and 2020 MiniZinc challenges
  • providing more test data by publishing all instances that have been submitted in the past
  • asking past submitters to create pull requests that make models self-explanatory and provide licenses
  • splitting the MiniZinc challenge into two parts (base and extended) and allowing the submission of models with float and set variables to the extended challenge
mznc2020-models.pdf

guido.tack

Oct 6, 2020, 2:04:47 AM
to MiniZinc
Hi Michael,

thanks for your thoughtful comments on the benchmarks repository. I agree that it would be nice to improve it along the lines you mentioned. Unfortunately, as you can imagine, this requires work.

Perhaps the name of the repository raises expectations that it currently can't meet. It should probably be called minizinc-challenge-models rather than minizinc-benchmarks, since the main purpose was to provide access to those models and instance data.

It is quite a struggle to even find 10 new models each year for the challenge (this was particularly true this year), and placing any additional burden on the submitters would probably not make that easier. But of course we can try to suggest a standard header to avoid some of the issues and encourage providing additional information. Using pull requests to add author and licensing information for the existing files is a great idea (this is going back over ten years now, but maybe we can fix at least some of the models).

Extending the challenge to float variables would also mean that we require even more new models and instances (and most solvers currently participating in the challenge wouldn't support them). So, while it would be nice, it's unfortunately not realistic right now. Set variables are different, since the translation to Boolean variables provided by MiniZinc should make those models compatible with all solvers. So from next year, we are happy to accept models that use set variables.

Regarding additional instance data, it's the same argument - we need 5 fresh instances each for 10 of the "old" models each year, otherwise we'd have to get even more new models submitted. So that's unlikely to change. However, for some problems a lot of instances are actually available, and we can probably release more than the 5 that have been used in the challenge.

I'll try to get the 2019 and 2020 models added in the next few days.

There are certainly other things that would be really nice to have. For example, auto generating 

All the things mentioned here depend on finding the time (or a volunteer) to actually do them, so if anyone wants to put their hand up, you're very welcome to help.

Cheers,
Guido

guido.tack

Oct 6, 2020, 2:33:36 AM
to MiniZinc
Sorry, hit "post" a bit too soon! What I wanted to say was that it would be nice to auto generate a web page with an overview of all the benchmark models and instances, best known objectives, included globals, best solvers for each instance and model, etc.

Cheers,
Guido

Michael Marte

Oct 14, 2020, 4:19:39 AM
to MiniZinc
Hi Guido,

I was wondering about the need for fresh instances, so I consulted the "Philosophy of the MiniZinc Challenge". I found that the main motivation for fresh instances is to avoid the "overfitting of solvers to a limited set of benchmarks", which is fair enough.

I hope that you will resolve to publish more instances (in cases where you have plenty) and that you will find the time to actually do it. Regarding future calls for models and instances, I suggest also asking for very hard and unsolved instances, with the aim of fostering research by publishing them. (Ignoring the hard instances might give a false impression of the abilities of CP solvers.)

Regarding the nature of the minizinc-benchmarks repository, it cannot be denied that it is the go-to place for MiniZinc models and instances, simply because (to the best of my knowledge) there is no other such repository except for Hakank's, which is more about modelling than benchmarking. Therefore, minizinc-benchmarks should not become a dumping ground; existing content should receive some maintenance, and future additions should meet basic quality criteria. Asking past and future submitters to improve documentation (by adding and filling in the standard header) is an important step forward. (I would not worry about deterring future submitters; after spending days, weeks, or maybe months on a model, filling in the header is negligible extra work.) If you need help with reviewing pull requests, I would be happy to support you. However, three of the currently pending pull requests are by me, and I would like someone else to approve (or reject) them.

Having a web page listing benchmarks and results would be very cool and helpful, indeed. Were you thinking about something like https://miplib.zib.de? Developing and maintaining such a page would require some commitment. But what about http://www.csplib.org? Would the minizinc-benchmarks page complement the CSPLib page or compete with it? I don't expect answers, I just want to point out the need for a clear vision before jumping into action.

Best,
Michael

PS. I have been maintaining a repository with MiniZinc challenge results: https://github.com/informarte/minizinc-challenge-results. It provides a script to put the results into a relational database which one can then use to mine the data.
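To give an idea of the kind of data mining such a database enables, here is a minimal self-contained sketch in Python with sqlite3. The table layout and the sample rows are invented for illustration; the actual schema produced by the script in the minizinc-challenge-results repository may differ.

```python
import sqlite3

# Hypothetical schema for challenge results; the real repository's
# script may use different table and column names.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE result (
        year INTEGER, solver TEXT, problem TEXT,
        instance TEXT, objective INTEGER, solved INTEGER
    )
""")
rows = [
    (2020, "solverA", "vrplc", "i1", 120, 1),
    (2020, "solverB", "vrplc", "i1", 110, 1),
    (2020, "solverA", "vrplc", "i2", None, 0),  # unsolved run
]
conn.executemany("INSERT INTO result VALUES (?, ?, ?, ?, ?, ?)", rows)

# Example query: best known (minimal) objective per instance,
# considering only runs that actually solved the instance.
best = conn.execute("""
    SELECT problem, instance, MIN(objective)
    FROM result
    WHERE solved = 1
    GROUP BY problem, instance
""").fetchall()
print(best)  # [('vrplc', 'i1', 110)]
```

Queries along these lines would also feed the overview page discussed above, e.g. best known objectives or the best solver per instance.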