Scoop issues with garbage collection, ZMQerror pickle data was truncated

Derek Tishler

Aug 18, 2018, 1:31:15 PM
to deap-users
I ran into an issue today and wanted to share it in case others come across it. I had slightly modified my working code and could no longer evolve across multiple hosts once the algorithm was checkpointed and resumed the next day. It turned out that a silly one-line change had broken the networking for scoop.

This relates to using Scoop and a case where I began to see the error:

cPickle.UnpicklingError: pickle data was truncated

Searches suggest this commonly occurs with oversized messages, but on review that was a dead end.

It turns out this was occurring because of my use of gc.collect() to try to clean up after pandas. Garbage collection works well in my eval function and prevents the workers' memory usage from growing over time (it had been leaking from 500 MB to 2 GB+ over many generations). However, I accidentally placed an unneeded collect call (it was not improving memory use on the master) inside the DEAP algorithm, at the start of the for loop over generations 1+, and this broke things badly for zmq and scoop, as the master's scope seems to have been trashed. Once this line was removed, things were back to working perfectly.
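To make the placement concrete, here is a minimal sketch of the layout that worked for me, not my actual code. The names evaluate and evolve, the toy selection step, and the map_func parameter are all made up for illustration; on a cluster map_func would be scoop's futures.map (for example registered as the DEAP toolbox's map), and here it defaults to the builtin map so the sketch runs on one machine. It is written for Python 3.

```python
import gc
import random


def evaluate(individual):
    """Worker-side fitness function (hypothetical stand-in for a pandas-heavy eval)."""
    scratch = [x * x for x in individual]  # placeholder for temporary objects
    fitness = sum(scratch)
    del scratch
    gc.collect()  # collecting here, inside the eval on the worker, kept memory flat
    return (fitness,)


def evolve(population, ngen, map_func=map):
    """Master-side generational loop (hypothetical, heavily simplified)."""
    for gen in range(1, ngen + 1):
        # Deliberately no gc.collect() here: this is where the stray call sat
        # when the "pickle data was truncated" errors showed up.
        fitnesses = list(map_func(evaluate, population))
        # Toy selection and mutation, only to keep the sketch runnable.
        ranked = [ind for _, ind in sorted(zip(fitnesses, population), reverse=True)]
        survivors = ranked[: len(ranked) // 2]
        children = [[g + random.gauss(0, 0.1) for g in ind] for ind in survivors]
        population = survivors + children
    return population


if __name__ == "__main__":
    random.seed(0)
    pop = [[random.uniform(-1, 1) for _ in range(5)] for _ in range(10)]
    final = evolve(pop, ngen=3)
    print(len(final))
```

The only point of the sketch is where the collect call lives: in the function that runs on the workers, and not in the loop that runs on the master.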

If you are adding garbage collection to improve memory use and suddenly start seeing this error, you may have the same issue. If anyone could explain this properly, I would greatly appreciate it.

Derek Tishler

Aug 25, 2018, 7:24:58 PM
to deap-users
I was seeing additional trouble with scoop: ZMQErrors and sendResultBack errors. I tried recompiling zmq and other things, but my main problem seems to have been that I was using the current GitHub version of the code, which is not stable. I have switched back to the stable scoop release via pip and the cluster is more reliable now.
