Pypy and DEAP

Rupert Young

May 26, 2021, 3:31:52 PM
to deap-users

I thought I'd try to speed up my GA by running with pypy.

I am surprised that pypy is actually 2.5 times SLOWER than python. For 10 generations with a population of 100, I get 165s with pypy versus 65s with python.

Is this expected? Anyone know why this is the case?

I did see some posts on here about pypy, but they are quite old, so I wondered what's changed since.

Derek Tishler

Jun 1, 2021, 12:45:44 AM
to deap-users
Very hard to say without knowing what you are trying to speed up (i.e. your eval). I am unsure why pypy is not speeding up DEAP in general, but I'm also unsure whether it should; pypy seems not to be a magic bullet for every case:
https://stackoverflow.com/questions/49227389/possible-reasons-why-pypy-might-be-slower-than-default-compiler

Depending on what DEAP functionality you are using (selection algorithms, operators, and most other things use loops over objects), there may be very little worth trying to speed up in DEAP itself. You could fix dtypes and rewrite the operators in Numba, which would definitely speed things up, but that adds complexity for little payoff unless you are running big populations (100k or many more, I'd guess), so only focus on that if it's really worth it.
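
For what it's worth, a minimal sketch of what "fix dtypes and rewrite in Numba" can look like, applied to an eval's inner loop rather than a DEAP operator. The genome layout and the scoring formula here are made-up placeholders, not from this thread:

    import numpy as np
    from numba import njit

    @njit
    def raw_score(genome):
        # plain python loop, but compiled to machine code by Numba
        total = 0.0
        for i in range(genome.shape[0] - 1):
            total += (genome[i + 1] - genome[i]) ** 2
        return total

    def evaluate(individual):
        # DEAP individuals are list-like; fix the dtype before the jitted call
        return (raw_score(np.asarray(individual, dtype=np.float64)),)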

I focus mostly on making my eval as FAST (cpu) and LITE (memory) as possible on a single core, and then I make my evals parallel (on a single machine with multiprocessing, or on a cluster via Ray). That has its limits (the speedup is usually less than n_cpu per node, rather than scaling with every cpu fully utilized). The tricks for this are not DEAP-related: vectorize your math with numpy, or use something like Numba to REALLY speed up loops if you have to use them. Another trick is to use something like Ray to share memory objects (numpy arrays) across all cpus per node; this can help scale up big problems, versus replicating the memory on each core when your data is big.
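
To make the vectorizing trick concrete, here is a before/after sketch on a made-up sum-of-squares fitness; the point is just replacing the python loop with a single numpy call:

    import numpy as np

    def evaluate_loop(individual):
        # slow: pure-python loop over the genome
        total = 0.0
        for x in individual:
            total += x * x
        return (total,)

    def evaluate_vectorized(individual):
        # fast: one vectorized numpy operation on a fixed-dtype array
        g = np.asarray(individual, dtype=np.float64)
        return (float(np.dot(g, g)),)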

Using a profiler to see where your eval time is going can be really handy for complex eval code. For example, using pandas ops like .iloc is madness if you care about time, as is calling numpy functions on non-numpy objects like a plain float instead of using normal math. The profiler will show you these "duh" sort of python things you probably already know, but maybe let slide when coding a fast solution.
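
A quick way to do that with the standard-library profiler, assuming evaluate and population are your own objects, is something like:

    import cProfile
    import pstats

    # profile a batch of evaluations and dump the stats to a file
    cProfile.run("[evaluate(ind) for ind in population]", "eval.prof")

    # show the 20 functions with the most cumulative time
    pstats.Stats("eval.prof").sort_stats("cumulative").print_stats(20)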

Rupert Young

Jun 2, 2021, 1:52:10 PM
to deap-users
Thanks for your response.

I have been looking at isolating code and using a profiler. I think I have identified some of my own code which is slower under pypy, so it may not be related to DEAP at all.

I'll investigate further and then try again with DEAP once I have sorted that out.

I also want to look at multiprocessing. I did try it before, but there was an issue with pickling which I need to sort out.
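
For reference, a sketch of the layout usually suggested for DEAP + multiprocessing pickling errors: create the creator classes at module level (so worker processes can unpickle individuals), use a top-level function rather than a lambda for eval, and build the Pool under a __main__ guard. The fitness here is a placeholder:

    import multiprocessing
    from deap import base, creator

    # created at import time, so worker processes can unpickle individuals
    creator.create("FitnessMin", base.Fitness, weights=(-1.0,))
    creator.create("Individual", list, fitness=creator.FitnessMin)

    def evaluate(individual):
        # top-level function (not a lambda), so it pickles; placeholder fitness
        return (sum(individual),)

    toolbox = base.Toolbox()
    toolbox.register("evaluate", evaluate)

    if __name__ == "__main__":
        pool = multiprocessing.Pool()
        toolbox.register("map", pool.map)  # DEAP algorithms will use this map
        # ... run the GA here ...
        pool.close()
        pool.join()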
