SCOOP/DEAP already supports mapping evaluation/crossover/mutation
within multiprocessing. I have used it to calibrate hydrologic models
with populations as large as 100, and I forget how many genes (maybe
15). Most would say that a population of 100 is way too small, but I
found that a population of 60 to 100 gave the best trade-off between
convergence time and maintaining genetic diversity over time (avoiding
bottlenecks). But that is an aside.
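To illustrate the mapping support mentioned above: DEAP lets you swap in a parallel map for evaluation, and with SCOOP that is `toolbox.register("map", futures.map)`. The sketch below uses a thread pool as a stand-in for SCOOP's `futures.map` so it runs anywhere (the map interface is the same); the `evaluate` function and toy population are made up for the example.

```python
# A thread pool stands in for scoop.futures.map here; with DEAP/SCOOP
# you would instead do:
#   from scoop import futures
#   toolbox.register("map", futures.map)
from concurrent.futures import ThreadPoolExecutor

def evaluate(individual):
    # Toy fitness: sum of genes. A real evaluation launches the model run.
    return (sum(individual),)

population = [[g, g + 1, g + 2] for g in range(4)]
with ThreadPoolExecutor(max_workers=2) as pool:
    # Same call shape as toolbox.map(evaluate, population) in DEAP.
    fitnesses = list(pool.map(evaluate, population))
print(fitnesses)  # -> [(3,), (6,), (9,), (12,)]
```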
Your title says you are looking to solve a multiprocessing/pickle
issue. That does not quite follow from the text of your message, but I
will tell you how I solved my problem and guess at how it relates to
yours.
Each run of my hydrologic model took anywhere from 2 to 7 CPU hours,
and the model itself was not written in parallel, so each member of the
population represents a single run and its evaluation results. So I set
up the master process to read the pickle file, passed the parameters of
the eval/run to each process, and ran one instance per node (each run
took many GB of RAM). The program's return code was the evaluation
result, which was then parsed and processed by DEAP/SCOOP and treated
accordingly. That way the master process is the only one touching the
pickle file, and I can also manage intermediate results and restart if
the SCOOP run terminates. I still ended up burning tens of thousands of
CPU hours calibrating these models, but it worked extremely well.
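The scheme above can be sketched roughly as follows. This is not EBo's actual code: the checkpoint layout, the parameter dicts, and the tiny Python one-liner standing in for the hydrologic model binary are all illustrative, and plain `map` stands in for SCOOP's `futures.map`. The point it shows is that only the master reads and writes the pickle, while workers just launch a subprocess and hand back its return code as the evaluation result.

```python
import os
import pickle
import subprocess
import sys
import tempfile

def run_model(params):
    """Worker side: launch one external model run as a subprocess.

    In the real setup this would be the hydrologic model binary, with
    SCOOP distributing one run per node; a Python one-liner stands in
    here so the sketch is runnable.
    """
    cmd = [sys.executable, "-c", f"import sys; sys.exit({params['x']} % 128)"]
    result = subprocess.run(cmd)
    # The model's exit status encodes the evaluation result.
    return result.returncode

def master(checkpoint_path):
    """Master side: the only process that ever touches the pickle."""
    with open(checkpoint_path, "rb") as fh:
        population = pickle.load(fh)
    # With SCOOP this map would be scoop.futures.map across nodes.
    fitnesses = list(map(run_model, population))
    # Write the checkpoint back so a terminated run can be restarted.
    with open(checkpoint_path, "wb") as fh:
        pickle.dump(population, fh)
    return fitnesses

# Demo: seed a checkpoint with two parameter sets, then evaluate.
ckpt = os.path.join(tempfile.mkdtemp(), "checkpoint.pkl")
with open(ckpt, "wb") as fh:
    pickle.dump([{"x": 3}, {"x": 5}], fh)
fitnesses = master(ckpt)
print(fitnesses)  # -> [3, 5]
```

Because workers never open the checkpoint, there is no contention on the pickle file, and the master can rewrite it between generations to support restarts.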
Hope this helps.
EBo --