Hi Rachel,
I'm not sure I completely understand your setup and limitations. What does it mean to "split the parameter space" or "split the datasets"? In any case, there are four main options:
The simplest form of parallelization is to execute the calls to target-runner in parallel. If you can reserve N CPUs on a single machine, you can submit the job to the cluster and use the irace option "--parallel N" so that irace evaluates multiple calls to target-runner in parallel. This means that you submit 1 job (= 1 run of irace) to the cluster, but the job uses many CPUs within the same machine.
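As a rough sketch of this first option, a Slurm job script could look like the one below. I am assuming Slurm here; the CPU count, the time limit and the path to irace are placeholders to adapt to your cluster, and the script assumes it is submitted from the directory that holds your scenario files.

```shell
#!/bin/bash
#SBATCH --nodes=1            # keep everything on a single machine
#SBATCH --ntasks=1           # 1 job = 1 run of irace
#SBATCH --cpus-per-task=8    # N CPUs on that machine (8 is only an example)
#SBATCH --time=24:00:00      # hypothetical time limit

# Slurm exports the number of reserved CPUs; fall back to 1 outside Slurm.
N="${SLURM_CPUS_PER_TASK:-1}"

# Placeholder path: adjust to where irace is installed.
IRACE=/path/to/irace/bin/irace

if [ -x "$IRACE" ]; then
    # irace evaluates up to N calls to target-runner at the same time.
    "$IRACE" --parallel "$N"
else
    echo "irace not found at $IRACE; would run: $IRACE --parallel $N" >&2
fi
```

Reading N from SLURM_CPUS_PER_TASK keeps the value passed to --parallel in sync with what was actually reserved.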
The second option is to use MPI to use CPUs from multiple machines. This means that you submit 1 job (= 1 run of irace), but the job is set up to use MPI. For this you need to install Rmpi:
https://docs.alliancecan.ca/wiki/R#Rmpi (make sure that it works). Then replace:
R CMD BATCH test.R test.txt
with whatever way you use to call irace, maybe:
/path/to/irace/bin/irace --mpi 1 --parallel N
or
Rscript launch_irace.R
(In launch_irace.R, you need to use the scenario options mpi=TRUE and parallel=N)
N here is the number of CPUs to use (ideally as many as you can get, 64 or more) and it will be the same as
#SBATCH --ntasks=N # number of MPI processes
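Put together, the MPI job script could look roughly like this. It is only a sketch: the value 64, the time limit and the irace path are placeholders, and I am leaving out any module loading or MPI launcher prefix (for example mpirun) that your cluster's Rmpi instructions may require.

```shell
#!/bin/bash
#SBATCH --ntasks=64          # number of MPI processes (N; 64 is only an example)
#SBATCH --time=24:00:00      # hypothetical time limit

# Slurm exports the number of tasks; fall back to 1 outside Slurm.
N="${SLURM_NTASKS:-1}"

# Placeholder path: adjust to where irace is installed.
IRACE=/path/to/irace/bin/irace

if [ -x "$IRACE" ]; then
    # Same N as in --ntasks above. Depending on the cluster, this line may
    # need the MPI launcher prefix shown in the Rmpi instructions.
    "$IRACE" --mpi 1 --parallel "$N"
else
    echo "irace not found at $IRACE; would run: $IRACE --mpi 1 --parallel $N" >&2
fi
```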
The 4th option is to write your own function targetRunnerParallel(). In this function you can use whatever R/Python packages you wish to do the parallelization. For example, you could use the package:
https://mllg.github.io/batchtools/ or the package:
https://future.batchtools.futureverse.org/ or something else that you find easy to work with. Irace will call the function with the list of executions it wants to do, and the function must return the results. If you use this 4th option, please let me know as we don't have an example of this in the documentation and it would be great to have one if you wish to share your code.
Best wishes,
Manuel.