muPlusLambda isn't saving the best models


Elliot

Jun 29, 2016, 12:20:02 PM
to deap-users
I'm using muPlusLambda like this:

pop, logbook = algorithms.eaMuPlusLambda(pop, toolbox,
                                         mu=100,
                                         lambda_=1000,
                                         cxpb=0.7,
                                         mutpb=0.3,
                                         ngen=100,
                                         stats=stats,
                                         halloffame=hof)

According to my understanding, selecting from population + offspring ensures that really fit parent models don't get killed off in favor of less fit offspring. However, looking at the log of my last run, you can see a slamming hot model, with an MSE five times lower than anything seen previously, get killed off, and the min goes back up by a factor of ten. What's going on?

gen   nevals  avg       std       min          max
0     200     inf       inf       0.0194461    1.79769e+308
1     1000    10.0132   13.9275   0.0234131    73.8478
2     1000    5.46318   8.12007   0.021887     39.2811
3     1000    3.24259   4.66334   0.0194461    34.5157
4     1000    2.82575   4.01691   0.0194095    33.2678
5     1000    3.0655    4.29188   0.0194461    39.2811
6     1000    3.54069   4.48542   0.0194461    31.3281
7     1000    4.08759   4.96957   0.0194461    31.3964
8     1000    4.18326   5.02401   0.0107642    19
9     1000    4.42301   5.24379   0.00373861   21
10    1000    6.28996   18.4529   0.00226122   253.714
11    1000    5.22774   5.99755   0.000628665  27
12    1000    5.73174   6.63684   0.00542308   25
13    1000    6.22329   7.2754    0.00636029   29

Marc-André Gardner

Jun 29, 2016, 12:26:42 PM
to deap-users
Hi Elliot,

What you actually want is called elitism. Mu+lambda only ensures that both the parent and offspring populations are considered when creating the next generation; the actual selection still boils down to whatever selection method you chose. For instance, if you use a tournament, your "slamming hot" individual may well just not be picked, since tournament selection is random, especially if the tournament size is small. In the same way, a very lame individual can get selected simply because, by chance, it only had to compete against even lamer solutions...
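
To put rough numbers on it (my own back-of-the-envelope assumption, using your settings: mu = 100 and lambda = 1000 give 1100 candidates, with a tournament size of 3 drawn with replacement, which is how DEAP's selTournament samples): a single tournament misses any given individual with probability (1 - 1/1100)^3, so across the 100 tournaments that build the next generation, the best individual is never even drawn with probability about (1 - 1/1100)^300 ≈ 0.76. Three runs out of four, it vanishes without ever competing.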

In DEAP, you can apply elitism, that is, deterministically selecting the n best individuals, with the tools.selBest function. However, I would advise against using it alone: just select something like the 5-10 best individuals, and for the remainder of the selection, use another stochastic method such as a tournament. EAs do not like it when you apply too much selection pressure like that :)

Have fun with DEAP,

Marc-Andre

Elliot

Jun 29, 2016, 12:32:26 PM
to deap-users
Thanks for replying so quickly. Could you give me a snippet of code showing how I would modify this line in eaMuPlusLambda to ensure the top n models are included?
Line: population[:] = toolbox.select(population + offspring, mu)

Also, why are EAs designed to select randomly at the cost of preserving the best genes? If it's not too complicated for a brief explanation :)

Best,
Elliot

Marc-André Gardner

Jun 29, 2016, 12:54:48 PM
to deap-users
Well, there could be prettier ways, but something like this would work:

population[:] = toolbox.select(population + offspring, mu - N) + tools.selBest(population + offspring, N)
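
If it helps to see where that line lives, here is an untested sketch of the full loop, adapted from the eaMuPlusLambda source; N_ELITE is just a placeholder name of mine, and it assumes your toolbox registers evaluate, mate, mutate and select as usual:

from deap import algorithms, tools

N_ELITE = 5  # placeholder: number of elites, keep it small relative to mu

def ea_mu_plus_lambda_elitist(population, toolbox, mu, lambda_, cxpb, mutpb,
                              ngen, stats=None, halloffame=None):
    """eaMuPlusLambda variant: the N_ELITE best always survive, and the rest
    of the next generation comes from the registered stochastic selection."""
    logbook = tools.Logbook()
    logbook.header = ['gen', 'nevals'] + (stats.fields if stats else [])

    # Evaluate the initial individuals with an invalid (uncomputed) fitness
    invalid_ind = [ind for ind in population if not ind.fitness.valid]
    for ind, fit in zip(invalid_ind, toolbox.map(toolbox.evaluate, invalid_ind)):
        ind.fitness.values = fit
    if halloffame is not None:
        halloffame.update(population)
    logbook.record(gen=0, nevals=len(invalid_ind),
                   **(stats.compile(population) if stats else {}))

    for gen in range(1, ngen + 1):
        # Produce lambda_ offspring by crossover OR mutation
        offspring = algorithms.varOr(population, toolbox, lambda_, cxpb, mutpb)

        invalid_ind = [ind for ind in offspring if not ind.fitness.valid]
        for ind, fit in zip(invalid_ind, toolbox.map(toolbox.evaluate, invalid_ind)):
            ind.fitness.values = fit
        if halloffame is not None:
            halloffame.update(offspring)

        # Elitism: the N_ELITE best survive deterministically, the rest
        # are picked by the stochastic method (e.g. a tournament)
        population[:] = (tools.selBest(population + offspring, N_ELITE) +
                         toolbox.select(population + offspring, mu - N_ELITE))

        logbook.record(gen=gen, nevals=len(invalid_ind),
                       **(stats.compile(population) if stats else {}))

    return population, logbook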

As for your second question, a comprehensive explanation is actually complicated, but briefly, it is because of the importance of diversity in a population. Suppose you get one good solution at generation 3: selecting it no matter what would let it take over the whole population. Since EAs rely on exchanging genetic material to reach better solutions, a population containing essentially one individual (with minor variations) is detrimental. That individual might be good, but maybe an even better one would have arisen by mating two not-that-good solutions. A small elitism rate is probably harmless, but as in "real-life" evolution, you must leave some room for randomness; that is actually what makes EAs so distinct from other optimization methods such as gradient descent.

Other people here could certainly suggest interesting books on this topic, if you're interested :)

Marc-Andre

Elliot

Jun 29, 2016, 1:24:15 PM
to deap-users
Thanks. If my fitness tuples have more than one objective, does selBest just use the first one? Or does it follow the Pareto front?

François-Michel De Rainville

unread,
Jun 29, 2016, 1:26:03 PM6/29/16
to deap-users
It will sort them lexicographically (sort on the first objective, then, where those are equal, sort on the second objective, and so on).
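
For instance, here is a tiny self-contained demo of that ordering for a two-objective minimization (the fitness values are made up):

from deap import base, creator, tools

# Hypothetical two-objective minimization (both weights negative)
creator.create("FitnessMin2", base.Fitness, weights=(-1.0, -1.0))
creator.create("Individual", list, fitness=creator.FitnessMin2)

a, b, c = creator.Individual([0]), creator.Individual([1]), creator.Individual([2])
a.fitness.values = (0.5, 10.0)  # ties with b on the first objective
b.fitness.values = (0.5, 3.0)   # ...but loses to b on the second
c.fitness.values = (0.2, 99.0)  # best first objective, terrible second

# selBest sorts on the fitness objects, which compare lexicographically on
# their weighted values: c wins on the first objective alone, and b beats a
# only because their first objectives tie.
best = tools.selBest([a, b, c], 2)
print([ind.fitness.values for ind in best])  # [(0.2, 99.0), (0.5, 3.0)]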


Elliot

Jun 29, 2016, 1:32:01 PM
to deap-users
Great, that's exactly what I need; my first objective is much more important than my second.

Thanks for all your help!
Elliot