Very hard to say without knowing what you are trying to speed up (in eval). I am unsure why PyPy is not speeding up DEAP generally, but I'm also unsure that it should (PyPy is not a magic bullet for every case):
https://stackoverflow.com/questions/49227389/possible-reasons-why-pypy-might-be-slower-than-default-compiler

Depending on what DEAP functionality you are using (selection algos, operators, and most things use loops over objects), there may be very little worth trying to speed up in DEAP itself. The exception is fixing dtypes and rewriting operators in Numba, which would definitely speed things up, but maybe focus on that only if it's really worth it: it adds complexity for little payoff unless you are running big populations (over 100k, or many more, I'd guess).
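As a rough sketch of what "fix dtypes and rewrite operators" can look like: DEAP's built-in `cxTwoPoint` swaps Python list slices, but if your individuals are typed NumPy arrays you can do the same swap on contiguous memory. The function name and exact point-picking here are illustrative, not DEAP's API:

```python
import numpy as np

def cx_two_point_np(ind1, ind2, rng=None):
    """Two-point crossover on 1-D NumPy arrays: same idea as
    deap.tools.cxTwoPoint, but operating on typed, contiguous memory."""
    if rng is None:
        rng = np.random.default_rng()
    # pick two distinct cut points, then swap the slice between them
    a, b = sorted(rng.choice(ind1.size, size=2, replace=False))
    ind1[a:b], ind2[a:b] = ind2[a:b].copy(), ind1[a:b].copy()
    return ind1, ind2
```

The `.copy()` calls matter: slice assignment on the raw views would alias the arrays mid-swap. Operators written this way are also easy to hand to Numba later if you ever need to.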
I focus mostly on making my eval as FAST (CPU) and LITE (memory) as possible on a single core, and then I make my evals parallel (single machine with multiprocessing, or a cluster via Ray), though that has its limits (speedup usually tops out below n_cpu per node rather than scaling with every CPU fully utilized). Tricks for this are non-DEAP-related, such as vectorizing math with NumPy, or using something like Numba to REALLY speed up loops if you have to use them. Another trick is using something like Ray to share memory objects (NumPy arrays) across all CPUs (per node); this can help scale up big problems vs replicating the memory on each core if your data is big.
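A minimal sketch of the single-machine route, assuming your eval is a pure function of the genome (the eval body here is a stand-in): with DEAP you would register the pool's `map` as `toolbox.map` so the algorithm loop parallelizes evaluation for you.

```python
import numpy as np
from multiprocessing import Pool

def evaluate(genome):
    # stand-in eval: vectorized NumPy math, no Python-level loop
    x = np.asarray(genome, dtype=np.float64)
    return (float(np.dot(x, x)),)  # DEAP-style fitness tuple

if __name__ == "__main__":
    population = [np.random.rand(100) for _ in range(1000)]
    with Pool() as pool:
        fitnesses = pool.map(evaluate, population)
    # with DEAP: toolbox.register("map", pool.map) before running eaSimple etc.
```

Keeping `evaluate` a top-level function (picklable) and keeping the genome small are what make this cheap; the Ray version looks similar but lets you `ray.put` one big read-only array once per node instead of shipping it to every worker.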
Using a profiler to see where your eval time is going can be really handy for complex eval code. For example, using pandas ops like .iloc is madness if you care about time, as is calling NumPy functions on non-NumPy objects like a plain float instead of using normal math. The profiler can show you these "duh"-sorta Python things you probably already know but maybe let slide when coding up a quick solution.
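A minimal profiling sketch with the stdlib `cProfile`, using two toy evals (same math, one looped and one vectorized) so the per-call cost of the loop shows up in the stats:

```python
import cProfile
import io
import pstats

import numpy as np

data = np.random.rand(100_000)

def eval_slow(arr):
    # per-element Python loop: this is what the profiler will flag
    total = 0.0
    for i in range(len(arr)):
        total += arr[i] * arr[i]
    return total

def eval_fast(arr):
    # same math, vectorized in one NumPy call
    return float(np.dot(arr, arr))

profiler = cProfile.Profile()
profiler.enable()
eval_slow(data)
eval_fast(data)
profiler.disable()

stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())  # eval_slow dominates the cumulative time column
```

The same pattern (wrap one generation of evals, print the top few entries) drops straight into a DEAP loop and will surface things like `.iloc` calls immediately.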