Question about iRace number of configurations

Charles Prudhomme

Sep 1, 2021, 11:03:40 AM
to The irace package: Iterated Racing for Automatic Configuration
Hi,

I discovered iRace quite recently, and was able to use it very quickly to evaluate my library. I use a grid computer and slurm to run iRace.

However, there is still something I don't get regarding the number of configurations.
In a few words, I first selected 41 different problems and overrode the following settings in `scenario.txt`:
parameterFile = "./parameters.txt"
logFile = "./tun20202.Rdata"
targetEvaluator = "./target-evaluator"
maxExperiments = 10000
sampleInstances = 1
deterministic = 1
parallel = 60
batchmode = "slurm"
elitist = 1
elitistNewInstances = 5

I also got some files from GitHub (like `target-evaluator`), and adapted the output of my execution to return two values: cost and time.

Then I selected 4 parameters, each of which can take between 3 and 5 values (3 x 4 x 4 x 5 = 240 combinations).

Since the library is deterministic, I expected irace to evaluate up to 240 configurations on up to 41 instances. But irace printed this:

# 2021-08-23 16:32:32 CEST: Iteration 1 of 4
# experimentsUsedSoFar: 0
# remainingBudget: 10000
# currentBudget: 2500
# nbConfigurations: 416

Moreover, some configurations have different IDs but represent the same combination:

        varh  valh           restarts flush
73   DOMWDEG BLAST '[luby,200,50000]'    32
106  DOMWDEG BLAST '[luby,200,50000]'    32
131  DOMWDEG BLAST '[luby,200,50000]'    32
411 DOMWDEGR  BMIN '[luby,200,50000]'    16

which is very confusing.

Can someone point out what I did wrong?

Thank you in advance,
Best regards,
Charles Prud'homme


Manuel López-Ibáñez

Sep 1, 2021, 11:21:55 AM
to The irace package: Iterated Racing for Automatic Configuration
Dear Charles,

This question is so frequent that it appears as FAQ 12.10 in the user guide: https://mlopez-ibanez.github.io/irace/irace-package.pdf
but I think it was lacking some details, so I will update it as follows:

Typically, irace is applied to parameter spaces that are much larger than what can be explored within the budget given. Thus, irace does not try to detect whether all possible configurations can be evaluated for the given budget and it does not waste computation time to check for repeated configurations. Thus, if the parameter space is actually very small, the initial random sampling performed by irace may generate repeated configurations and/or never generate some configurations, which is not ideal. If you still want to use (non-iterated) racing, the recommended approach is to provide all configurations explicitly to irace and execute a single race (nbIterations=1) with exactly the number of configurations provided (e.g., nbConfigurations=240). A future version of irace may automatically detect this case and switch to non-iterated racing without having to set additional options. Future versions may also implement computationally cheap checks for repeated configurations. If you are interested in implementing this, please contact us!
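As an illustration of "provide all configurations explicitly", here is a minimal Python sketch that enumerates a full 3 x 4 x 4 x 5 grid and formats it as a whitespace-separated table with the parameter names on the first line. The parameter names and the values `DOMWDEG`, `DOMWDEGR`, `BLAST`, `BMIN`, `[luby,200,50000]`, `16` and `32` come from this thread; every other value, and the assignment of domain sizes to parameters, is a made-up placeholder. The exact table format and any quoting rules expected by irace should be checked against the user guide.

```python
import itertools

# Hypothetical domains: only the values that appear in this thread are real.
# Names ending in _C, _D, _E are placeholders used to reach 3 x 4 x 4 x 5 = 240.
DOMAINS = {
    "varh": ["DOMWDEG", "DOMWDEGR", "VARH_C"],
    "valh": ["BLAST", "BMIN", "VALH_C", "VALH_D"],
    "restarts": ["[luby,200,50000]", "RESTART_B", "RESTART_C", "RESTART_D"],
    "flush": ["16", "32", "FLUSH_C", "FLUSH_D", "FLUSH_E"],
}


def all_configurations(domains):
    """Return (parameter names, list of dicts), one dict per combination."""
    names = list(domains)
    rows = [dict(zip(names, combo))
            for combo in itertools.product(*domains.values())]
    return names, rows


def to_table(names, rows):
    """Format configurations as text: a header line, then one row each."""
    lines = [" ".join(names)]
    lines += [" ".join(row[name] for name in names) for row in rows]
    return "\n".join(lines) + "\n"


names, rows = all_configurations(DOMAINS)
table = to_table(names, rows)
print(len(rows))  # number of configurations in the full grid
```

The text in `table` could then be saved as `configurations.txt` and referenced from `scenario.txt` through the `configurationsFile` option, together with `nbIterations = 1` and `nbConfigurations = 240` as suggested above.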

In fact, I started working on this automatic detection some time ago, but I have so many other things to do that I haven't made much progress. Anyone interested in automatic configuration of small parameter spaces, please contact me! I think it would be an interesting topic of research. The best method may not be irace in that case, but having something that uses an interface similar to irace would surely be welcomed by the community.

I hope the above answers your question!

Cheers,

Manuel.
--
Dr Manuel López-Ibáñez | "Beatriz Galindo" Senior Distinguished Researcher | University of Málaga, Spain | http://lopez-ibanez.eu

Charles Prudhomme

Sep 2, 2021, 9:15:03 AM
to The irace package: Iterated Racing for Automatic Configuration
Dear Manuel,

Thank you for the quick and clear answer.
I didn't see that part in the user guide, thank you for pointing it out.

I will apply your suggestion as soon as my current experimentation ends.
I don't know if I can contribute: this was the first time I (indirectly) used R, so I'm not very comfortable pushing code.
But I can still try.

Btw, I hadn't realized that irace was better suited to very large configuration spaces...
And as far as I'm concerned, there is something to be done with respect to instance selection, or instance hierarchy.


Best regards, 
Charles

Manuel López-Ibáñez

Sep 2, 2021, 10:27:53 AM
to The irace package: Iterated Racing for Automatic Configuration
On Thursday, 2 September 2021 at 15:15:03 UTC+2 cpru...@gmail.com wrote:
> Btw, I hadn't realized that irace was better suited to very large configuration spaces...

This is relative to the budget that you have. If the budget is enough to sample all possible configurations, then there is no point in doing an initial random sampling (or any kind of sampling) or in exploring the parameter space to search for good configurations. The only difficulty in that scenario would be how to allocate the budget to each configuration (how many instances to see per configuration). The latter problem is solved by doing a single race instead of iterated races (or, if the budget is sufficiently large, doing a full factorial analysis).

There is a gray area where the budget is enough to evaluate most configurations but not all. Imagine that the parameter space is 240 configurations but your budget is only 720. In that case, you can do at most 3 runs per configuration, so it may be better to not evaluate some configurations at all and to evaluate other configurations 5 or more times, to be sure of the performance of the ones that you do evaluate. I don't think irace (as it is right now) will work well in such a scenario, even if we implemented duplicate detection. It would be better to use some kind of LHS+racing+greedy selection (μ+1), perhaps based on http://lopez-ibanez.eu/publications#WesLop2018ecj . That would be an interesting topic to explore if anyone is interested. Doing it in R has the benefit that you can reuse a lot from irace in terms of reading files, parallel execution, etc., but doing it in Python as a proof of concept just to see if it works would also be OK.
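The trade-off in the 240-configuration, 720-run example can be made concrete with a few lines of arithmetic. This is only a sketch of the counting argument; the function names are hypothetical and nothing here is part of irace.

```python
def runs_per_configuration(budget, n_configs):
    """Runs each configuration gets if the budget is spread evenly."""
    return budget // n_configs


def configs_affordable(budget, runs_each):
    """How many configurations can receive `runs_each` runs within the budget."""
    return budget // runs_each


budget, space_size = 720, 240

even_runs = runs_per_configuration(budget, space_size)
covered = configs_affordable(budget, 5)

print(even_runs)             # 3 runs each if all 240 are evaluated
print(covered)               # 144 configurations if each gets 5 runs
print(space_size - covered)  # 96 configurations never evaluated
```

With 5 runs per evaluated configuration, 96 of the 240 configurations are never seen, which is why the choice of which ones to skip matters in this gray area.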
 
My rule of thumb to define "large" would be precisely what you have seen: If the initial random sampling contains duplicated configurations, then the parameter space is small for the amount of budget given. Maybe we should add a check for that...
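A rough version of that check can be sketched in Python. The formulas below are my reading of how irace sets the number of iterations, the per-iteration budget and the number of sampled configurations (treat them as an assumption and check them against the user guide); they do reproduce the values printed in the first log of this thread (4 iterations, currentBudget 2500, nbConfigurations 416, with mu = 5). The expected number of distinct configurations additionally assumes uniform, independent sampling over the 240 combinations.

```python
import math


def n_iterations(n_params):
    """Assumed default: floor(2 + log2(number of parameters))."""
    return math.floor(2 + math.log2(n_params))


def first_iteration_sample_size(budget, n_params, mu=5, each_test=1):
    """Assumed formula for the number of configurations in iteration 1."""
    iteration = 1
    current_budget = budget // n_iterations(n_params)
    return current_budget // (mu + each_test * min(5, iteration))


def guaranteed_duplicates(n_samples, space_size):
    """Pigeonhole bound: samples beyond the space size must be repeats."""
    return max(0, n_samples - space_size)


def expected_distinct(space_size, n_samples):
    """Expected distinct values among uniform independent samples."""
    return space_size * (1 - (1 - 1 / space_size) ** n_samples)


space_size = 3 * 4 * 4 * 5
n_samples = first_iteration_sample_size(budget=10000, n_params=4)

print(n_samples)                                     # 416
print(guaranteed_duplicates(n_samples, space_size))  # 176
print(expected_distinct(space_size, n_samples) < space_size)  # True
```

Under these assumptions, at least 176 of the 416 sampled configurations must be repeats, and on average some of the 240 combinations are still never sampled, which matches both symptoms described in the FAQ entry.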

> And as far as I'm concerned, there is something to be done with respect to instance selection, or instance hierarchy.

Instance selection and instance hierarchy are also interesting topics that may even have applications for large configuration spaces. irace currently either randomly shuffles the instance list or uses the order given by the user, but there are surely smarter ways to dynamically handle the training instances. There is some work from Fawcett and Hoos on ordered races, but the order is not dynamic, if I remember correctly. You can already tell irace to use a static order; the interesting improvement would be to do dynamic ordering.

Cheers,

Manuel.

Charles Prudhomme

Sep 6, 2021, 3:44:24 AM
to The irace package: Iterated Racing for Automatic Configuration
OK, that's clearer now. Thank you for the pointer too.

I tried to follow the steps you gave me, but I got a strange error:

#------------------------------------------------------------------------------
# irace: An implementation in R of (Elitist) Iterated Racing
# Version: 3.4.1.9fcaeaf
# Copyright (C) 2010-2020
# Manuel Lopez-Ibanez     <manuel.lo...@manchester.ac.uk>
# Jeremie Dubois-Lacoste
# Leslie Perez Caceres    <leslie.per...@ulb.ac.be>
#
# This is free software, and you are welcome to redistribute it under certain
# conditions.  See the GNU General Public License for details. There is NO
# WARRANTY; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
#
# irace builds upon previous code from the race package:
#     race: Racing methods for the selection of the best
#     Copyright (C) 2003 Mauro Birattari
#------------------------------------------------------------------------------
# installed at: /usr/local/lib/R/site-library/irace
# called with:
Warning: A default scenario file './scenario.txt' has been found and will be read
# Read 240 configuration(s) from file '/home/cprudhom/tuning2020/configurations.txt'
# 2021-09-06 09:39:12 CEST: Initialization
# Elitist race
# Elitist new instances: 5
# Elitist limit: 2
# nbIterations: 1
# minNbSurvival: 4
# nbParameters: 4
# seed: 1755106463
# confidence level: 0.95
# budget: 10000
# mu: 5
# deterministic: TRUE

# 2021-09-06 09:39:12 CEST: Iteration 1 of 1
# experimentsUsedSoFar: 0
# remainingBudget: 10000
# currentBudget: 10000
# nbConfigurations: 240
Error in race(scenario = scenario, configurations = raceConfigurations,  :
  object 'raceConfigurations' not found
Calls: irace.cmdline -> irace.main -> irace -> race -> nrow
Execution halted

I double-checked `scenario.txt` and it seems right to me.

Can you (again) enlighten me?

Best

Manuel López-Ibáñez

Sep 7, 2021, 5:52:07 PM
to The irace package: Iterated Racing for Automatic Configuration
Dear irace users,

Just to keep the group (and potential future readers) informed: the error reported by Charles seems to be fixed in the development version (3.5), which we will hopefully release soon. If you find this thread because you see the same error, you can install the development version from GitHub by running within R:

library(devtools)
install_github("MLopez-Ibanez/irace")

Best,

Manuel.