Multiprocessing Pickle Issue


Burt Murphy

Nov 9, 2017, 8:28:57 AM
to deap-users
Hi there,

Thank you for the great library.

I'm currently trying to write a DEAP program based on the NSGA-II example. I have a custom individual class, and my mutation/evaluation/crossover methods run on separate servers via the gRPC (https://grpc.io/) toolkit. This works fine until I add SCOOP or multiprocessing, at which point I hit what I think is a pickling issue:

TypeError: no default __reduce__ due to non-trivial __cinit__

I'm pretty new to Python so if anyone can shed some light it'd be much appreciated.

Additionally, is adding futures/scoop to the toolbox enough to map the evaluation/crossover/mutation calls to separate processes, or is there other code that needs to be written?

Many thanks,
Burt

EBo

Nov 9, 2017, 8:53:25 AM
to deap-...@googlegroups.com
SCOOP/DEAP already supports mapping evaluation/crossover/mutation
across multiple processes. I have used it to calibrate hydrologic models
with populations as large as 100, and I forget how many genes (maybe
15). Most would say that a population of 100 is way too small, but I
found that 60 to 100 was the best trade-off between convergence time
and genetic diversity over time, avoiding bottlenecks.
But that is an aside.

Your title says you are looking to solve a multiprocessing/pickle
issue. This does not quite follow from the text of your message, but I
will tell you how I solved my problem and guess how it relates to yours.
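
First, on the specific TypeError: that message is raised when pickle meets a Cython-built object, and with gRPC in the picture the channel or stub held by your individual or registered functions is the likely culprit. A common workaround is to build the connection lazily inside each worker process so it is never pickled. A minimal sketch with a stand-in stub class (FakeStub is hypothetical; with grpc you would build the channel/stub the same way):

```python
# Lazy per-worker connection: only plain, picklable data crosses the
# process boundary; each worker constructs its own stub on first use.
import os
from multiprocessing import Pool

class FakeStub:
    """Stand-in for an unpicklable gRPC stub."""
    def __init__(self):
        self.pid = os.getpid()
    def evaluate(self, individual):
        return sum(individual)

_stub = None  # module-level cache, private to each worker process

def get_stub():
    global _stub
    if _stub is None:  # first call in this process: connect
        _stub = FakeStub()
    return _stub

def evaluate(individual):
    # The stub is never pickled because each worker builds its own.
    return (get_stub().evaluate(individual),)

if __name__ == "__main__":
    with Pool(2) as pool:
        print(pool.map(evaluate, [[1, 2], [3, 4]]))  # [(3,), (7,)]
```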

Each run of my hydrologic model took anywhere from 2 to 7 CPU hours,
and the model itself was not parallel, so each member of the
population represents a single run and its evaluation results. I set
up the master process to read the pickle file, pass the parameters of
each eval/run to a worker, and run one instance per node (each run
took many GB of RAM). The return code from the program carried the
evaluation result, which was then parsed and handled by DEAP/SCOOP
accordingly. This way the master process is the only one touching the
pickle file, and I can also manage intermediate results and effect
restarts if a SCOOP run terminates early. I still ended up burning
tens of thousands of CPU hours calibrating these models, but it
worked extremely well.

Hope this helps.

EBo --
