Question about running on multiple datasets

Phoebe Liu

Apr 13, 2023, 12:06:50 PM
to The irace package: Iterated Racing for Automatic Configuration
Hi Manuel,

I asked previously about how to work with datasets in irace, and I am sorry to ask a similar question again.

I am using multiple datasets to tune the parameters; to keep things simple, say three. They are stored in a single list.

I want irace to search for the best configuration within a given parameter space across the three datasets. To illustrate, here is an example similar to one from another conversation:

//

library(irace)

# data generation
dataset1 <- matrix(runif(1000), nrow=500)
dataset2 <- matrix(runif(1000), nrow=500)
dataset3 <- matrix(runif(1000), nrow=500)

dataset <- list(dataset1, dataset2, dataset3)

# target runner function: evaluates one configuration on one instance
target.runner <- function(experiment, scenario) {
  instance <- experiment$instance
  configuration <- experiment$configuration

  # my_algo is my own algorithm; it returns a single cost value
  cost <- my_algo(instance, configuration)
  return(list(cost = cost))
}

# parameter space (contents elided here)
parameter_table <- '
....
'
parameters <- readParameters(text = parameter_table)

# scenario
scenario <- list(targetRunner = target.runner,
                 instances = list(dataset[[1]], dataset[[2]], dataset[[3]]),  
                 maxExperiments = 200,  
                 logFile = "")  

# check that the scenario is valid.
checkIraceScenario(scenario, parameters = parameters)

# run irace
my_results <- irace(scenario = scenario, parameters = parameters)

//

Building a list of datasets and then unpacking it into three separate elements is a little redundant, and that part can be changed. What I would like to confirm is whether the setup above is the right way to have irace evaluate the algorithm across multiple datasets. I do not want a single run to produce three results, one per dataset. Instead, each run should evaluate one configuration on one dataset and return a single cost value from my algorithm, and irace should repeat this across the datasets until it finds the best configuration. Is the code above the right way to do that?

If I instead put "instances = list(dataset)" in the scenario, rather than "instances = list(dataset[[1]], dataset[[2]], dataset[[3]])", I believe the meaning changes: a single run would receive all three datasets at once, instead of one dataset per run. Is my understanding correct?
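
To make the difference between the two forms concrete, here is a small base R check of the two list shapes. It only shows how R structures the lists and does not call irace, so it says nothing about how irace itself treats them; the names "separate" and "wrapped" are made up for this illustration.

```r
# Three stand-in datasets with the same shape as in the example above.
set.seed(1)
dataset <- list(matrix(runif(1000), nrow = 500),
                matrix(runif(1000), nrow = 500),
                matrix(runif(1000), nrow = 500))

# Form 1: one list element per dataset.
separate <- list(dataset[[1]], dataset[[2]], dataset[[3]])

# Form 2: the whole list wrapped as a single element.
wrapped <- list(dataset)

length(separate)              # number of elements in form 1
identical(separate, dataset)  # is form 1 just the original list?
length(wrapped)               # number of elements in form 2
is.matrix(separate[[1]])      # is an element of form 1 a single matrix?
is.list(wrapped[[1]])         # is the element of form 2 the full list?
```

So, at the R level, the first form has three elements (one matrix each) and is the same object as "dataset" itself, while the second form has a single element that holds all three matrices.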

Does passing the datasets this way work the same as using instancesFile? (My datasets need some processing by a function when they are read, so it is hard to use instancesFile directly.)

Thank you so much in advance!

Best regards,
Phoebe
