GP Static depth limit not enforced


Manuel Belmadani

Mar 15, 2016, 1:08:26 AM
to deap-users
Hi everyone,

My software uses a strongly typed GP. On some of my more challenging datasets, I have the impression the solutions get stuck in a local maximum and essentially just add redundant information, producing very deep trees with no improvement in the actual output. It used to be pretty bad, but I worked around it by setting a static limit like so:

""" Static limit for GP trees """
MAX_HEIGHT = 17  # Koza 1999
self.toolbox.decorate("mate", deap.gp.staticLimit(operator.attrgetter('height'), MAX_HEIGHT))
self.toolbox.decorate("mutate", deap.gp.staticLimit(operator.attrgetter('height'), MAX_HEIGHT))


And that seemed to fix it, until I moved to more challenging data. Out of 114 datasets, 18 terminated with this error:

Traceback (most recent call last):
  File "mysoftware.py", line 209, in <module>
    offspring, log = engine.run(options.ngen)
  File "/home/hpc3019/development/myproject/engine.py", line 133, in run
    timelimit=self.options.timelimit
  File "/home/hpc3019/development/myproject/evoalgo.py", line 96, in eaFortin
    fitnesses = toolbox.map(toolbox.evaluate, invalid_ind)
  File "/home/hpc3019/development/myproject/stgpfitness.py", line 160, in fitnessfunction
    pattern = self.toolbox.compile(expr=program)        
  File "/home/hpc3019/.local/lib/python2.6/site-packages/deap/gp.py", line 464, in compile
    return eval(code, pset.context, {})


How can I resolve this issue?

I'm using a modified version of eaSimple with the improved NSGA-IIR (Fortin, F.-A. & Parizeau, M., "Revisiting the NSGA-II crowding-distance computation," Proceedings of the 15th Annual Conference on Genetic and Evolutionary Computation, 623–630, ACM, 2013).
Where should I be looking, or what other information would help shed light on this? Could it be the way I implemented my custom EA? Is my grammar design flawed?

The strangest thing is that it used to be much worse; even then, it wouldn't do this on trivial datasets. Then I ran it on 13 medium-difficulty datasets, for NSGA-IIR and SPEA2 (with mu plus lambda), with 10 different random seeds (260 individual runs in total), and it happened only once. Now that I'm running harder datasets, it seems to occur more frequently.

Any suggestions or recommendations on where to look, or what to do, would be greatly appreciated.


François-Michel De Rainville

Mar 15, 2016, 8:49:02 AM
to deap-users
Can you please add the error?

Thx

--
You received this message because you are subscribed to the Google Groups "deap-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to deap-users+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Manuel Belmadani

Mar 15, 2016, 1:51:46 PM
to deap-users
Ahh, sorry, I didn't notice it was left out. It's the standard DEAP error for this situation:

MemoryError: DEAP : Error in tree evaluation : Python cannot evaluate a tree higher than 90. To avoid this problem, you should use bloat control on your operators. See the DEAP documentation for more information. DEAP will now abort.

Marc-André Gardner

Mar 15, 2016, 2:08:34 PM
to deap-users
Hi Manuel,

That's an interesting problem. I see maybe two causes:
1) It's not what we think. If we look at the code in gp.py (https://github.com/DEAP/deap/blob/master/deap/gp.py):

    try:
        return eval(code, pset.context, {})
    except MemoryError:
        _, _, traceback = sys.exc_info()

As you can see, DEAP assumes that a MemoryError is always caused by a tree that is too high. In practice, that may not always be the cause. For instance, if evaluating a tree on a dataset that is too big overflows the available memory, DEAP will display this error even though, in that particular case, the message is misleading. That would explain why you do not experience the issue with smaller datasets.
Could you add some code in gp.py to assert the height of the trees causing this error?
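Such an assert could look like the following (a toy sketch, not actual DEAP code; `compile_checked`, `MAX_EVAL_HEIGHT`, and the callable `evaluate` are all hypothetical names standing in for the `eval` call inside `gp.compile`):

```python
MAX_EVAL_HEIGHT = 90  # the nesting limit DEAP's error message refers to

def compile_checked(expr, evaluate):
    """Only blame a MemoryError on tree height when the tree really is
    too high; otherwise re-raise it as a genuine out-of-memory condition.
    `expr` is anything exposing a .height attribute, like a PrimitiveTree."""
    try:
        return evaluate(expr)
    except MemoryError:
        if expr.height <= MAX_EVAL_HEIGHT:
            # The tree is fine -- the dataset (or a leak) ate the memory.
            raise MemoryError("genuine out-of-memory; tree height is only %d"
                              % expr.height)
        raise MemoryError("tree higher than %d" % MAX_EVAL_HEIGHT)
```

If the re-raised message reports a modest height, the dataset size (or a memory leak) would be the real culprit rather than bloat.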

2) The staticLimit decorator applies only to the operators explicitly decorated (of course). Are you sure that you do not perform any other operation on the trees, including during initialization? If you want, you can easily output the min/mean/max height with the stats module, so you can see when the problem arises. Does it happen at the first generation or only later?
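With DEAP you would typically do this via `tools.Statistics(lambda ind: ind.height)` registered in a MultiStatistics; as a self-contained sketch of the same computation (the `Ind` class here is only a stand-in for a real PrimitiveTree, which exposes `.height`):

```python
class Ind:
    """Stand-in for a DEAP individual; real PrimitiveTrees expose .height."""
    def __init__(self, height):
        self.height = height

def height_stats(population):
    """Min/mean/max tree height, as a tools.Statistics record would report."""
    heights = [ind.height for ind in population]
    return {"min": min(heights),
            "avg": sum(heights) / float(len(heights)),
            "max": max(heights)}

population = [Ind(h) for h in (4, 7, 12, 18, 9)]
print(height_stats(population))  # {'min': 4, 'avg': 10.0, 'max': 18}
```

Logging this every generation shows immediately whether the max height creeps past the static limit, and at which generation.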

Thanks for your time, we will try to work this issue out!

Marc-André

François-Michel De Rainville

Mar 15, 2016, 2:16:49 PM
to deap-users
3) When using decorators, be sure to use the return value of the function, not the in-place modification. The former includes the result of the decorator; the latter only that of the variation.

Manuel Belmadani

Mar 15, 2016, 2:50:49 PM
to deap-users
Marc-André: 
The memory issue is interesting. I had not considered it.

real    900m32.246s
user    899m26.060s
sys     0m32.833s
1757    50         0.977758        1               0.926626        0.0199056       -3618.73        -687.94         -5565.8         1418.56
1758    50         0.977758        1               0.926626        0.0199056       -3618.73        -687.94         -5565.8         1418.56
1759    50         0.977758        1               0.926626        0.0199056       -3618.73        -687.94         -5565.8         1418.56
1760    50         0.977758        1               0.926626        0.0199056       -3618.73        -687.94         -5565.8         1418.56
Traceback (most recent call last):

  ...(same trace as before)...
MemoryError: DEAP : Error in tree evaluation : Python cannot evaluate a tree higher than 90. To avoid this problem, you should use bloat control on your operators. See the DEAP documentation for more information. DEAP will now abort.

real    725m42.452s
user    724m14.344s
sys     1m22.018s
1806    50         0.964538        1               0.882966        0.0318138       -3942.1         -498.117        -6169.68        1786.99
1807    50         0.964538        1               0.882966        0.0318138       -3942.1         -498.117        -6169.68        1786.99
1808    50         0.964538        1               0.882966        0.0318138       -3942.1         -498.117        -6169.68        1786.99
1809    50         0.964538        1               0.882966        0.0318138       -3942.1         -498.117        -6169.68        1786.99
Traceback (most recent call last):
  ...(same trace as before)...
MemoryError: DEAP : Error in tree evaluation : Python cannot evaluate a tree higher than 90. To avoid this problem, you should use bloat control on your operators. See the DEAP documentation for more information. DEAP will now abort.

In both cases, this occurs after a few hours, once the population seems to stall (those are my objectives' stats). I'll modify gp.py and report what I find.

My datasets aren't huge (maybe 20 MB at most), but I might have a memory leak somewhere. The software does seem to get slower and slower as the generations go on, even though the solutions shouldn't become much more complex to evaluate.

I'm also running this on a cluster; I hadn't considered that as a factor, but do you think it could be an issue? If the different nodes share memory, for example (I'm just throwing this out there, not sure if it would actually make sense).

I don't think I'm using any other operators than mating and mutating. I'll check the initialization, but I don't think I decorated anything else with the static limit. Logging the depth of the trees sounds like a good idea.

François-Michel: 
Could you give me an example of returning an inplace modification versus the value of the function? I'm using decorators pretty much exactly like in the documentation, unless it's a subtle difference and I goofed up.

François-Michel De Rainville

Mar 15, 2016, 2:53:51 PM
to deap-users
ind_2 = toolbox.mutate(ind_1)

ind_1 is mutated, but the decoration has not been applied.
ind_2 has both operations applied.
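To make the distinction concrete, here is a self-contained toy (not DEAP code; `static_limit`, `mutate`, and the dict-based individuals are all hypothetical) mimicking what `deap.gp.staticLimit` does: keep a copy of the arguments, run the operator, and return the copy whenever an offspring exceeds the limit:

```python
import copy

def static_limit(max_height):
    """Toy version of deap.gp.staticLimit: offspring exceeding max_height
    are replaced, in the RETURN VALUE only, by a copy of their parent."""
    def decorator(operator_fn):
        def wrapper(*args):
            keep = [copy.deepcopy(a) for a in args]   # parents before variation
            result = operator_fn(*args)
            return tuple(keep[i] if ind["height"] > max_height else ind
                         for i, ind in enumerate(result))
        return wrapper
    return decorator

def mutate(ind):
    ind["height"] += 5  # grows the tree in place, like a subtree mutation
    return (ind,)

bounded_mutate = static_limit(17)(mutate)

ind_1 = {"height": 15}
ind_2, = bounded_mutate(ind_1)
print(ind_1["height"])  # 20 -- the argument was mutated in place, past the limit
print(ind_2["height"])  # 15 -- only the return value respects the limit
```

So code that keeps using `ind_1` after the call silently carries an over-limit tree, even though the decorator "ran".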

Manuel Belmadani

Mar 21, 2016, 10:25:16 PM
to deap-users
Alright, sorry for being MIA for a few days. I'm going through your suggestions, and I think I may have a lead. I included the logging of the maximum depth of the trees in the population.

In logbook.record(), I added

maxdepth=numpy.max([x.height for x in population])

In gp.py, I added:

assert expr.height <= 17, "Static limit not respected. expr " + str(expr) + " has height " + str(expr.height)
(Hardcoded value, unfortunately. Does anyone know how to easily retrieve the decorated static limit from inside gp.py?)

And it seems maxdepth does exceed 17:
gen evals new maxdepth avg     max     min std       avg       max min     std     
0   0     0       4       0.489572 0.517073 0   0.0700974 -0.607054 1   -1.10812 0.267261
0   50   55     7       0.501637 0.523923 0.5 0.00495394 -0.770868 -0.648516 -3.14516 0.402548
Writing gen 0 at path ./OUT/default/FORTIN/42/0.checkpoint
1   50   81     7       0.505192 0.582418 0.5 0.0143706 -0.945452 -0.650123 -3.14516 0.603596
2   50   113     10       0.516542 0.583815 0.5 0.025932   -1.63016 -0.652377 -11.7192 1.7722  
3   50   149     10       0.531343 0.583815 0.501933 0.0269473 -2.29389 -0.777242 -11.7192 1.78354 
4   50   182     12       0.546434 0.619048 0.50678 0.0267584 -2.79703 -0.930487 -11.7192 1.72253 
5   50   206     11       0.569386 0.666667 0.529491 0.0319262 -3.66291 -1.01325 -11.7192 1.86898 
6   50   238     13       0.607038 1       0.53255 0.0874244 -3.85223 -0.693147 -11.7192 2.38831 
7   50   263     13       0.69657 1       0.560764 0.157543   -3.23488 -0.693147 -11.7192 2.93072 
8   50   292     11       0.765498 1       0.560764 0.187721   -3.66894 -0.693147 -11.7192 3.63597 
9   50   321     11       0.800616 1       0.560764 0.19084   -4.08592 -0.693147 -11.7192 3.84219 
10 50   345     14       0.816608 1       0.583815 0.169276   -4.02753 -0.693147 -15.1533 3.62791 
11 50   376     14       0.839427 1       0.588235 0.151574   -4.07066 -0.693147 -15.1533 4.07989 
12 50   409     15       0.863144 1       0.596916 0.141477   -4.13747 -0.693147 -15.1533 4.30664 
13 50   432     15       0.76096 1       0.596916 0.128621   -6.07579 -1.38674 -15.1533 4.01724 
Traceback (most recent call last):
  File "mysoftware.py", line 213, in <module>
    offspring, log = engine.run(options.ngen)
  File "/home/y/development/myproject/engine.py", line 134, in run
    CLUTCH=self.options.clutch
  File "/home/y/development/myproject/evoalgo.py", line 111, in eaFortin
    fitnesses = toolbox.map(toolbox.evaluate, invalid_ind)
  File "/home/y/development/myproject/stgpfitness.py", line 157, in fitness_function
    pattern = self.toolbox.compile(expr=program)        
  File "build/bdist.linux-x86_64/egg/deap/gp.py", line 467, in compile
AssertionError: Static limit not respected. expr add(primitive_charclass(False, True, True, or_(False, True)), add(add(add(add('C', primitive_charclass(and_(or_(True, False), or_(True, True)), not_(not_(True)), and_(or_(True, not_(not_(or_(True, True)))), not_(True)), or_(True, True))), add(primitive_charclass(False, False, True, False), primitive_charclass(not_(False), or_(False, False), and_(True, False), and_(and_(or_(not_(not_(True)), not_(and_(False, False))), not_(and_(not_(or_(not_(True), or_(False, and_(or_(and_(or_(True, True), or_(False, False)), and_(not_(not_(True)), not_(True))), not_(or_(not_(False), or_(False, True))))))), and_(True, False)))), True)))), primitive_charclass(not_(and_(or_(True, and_(and_(and_(or_(True, False), not_(True)), not_(not_(True))), not_(and_(or_(False, False), and_(True, True))))), not_(True))), or_(False, False), and_(True, True), and_(False, not_(not_(or_(and_(not_(and_(not_(True), and_(False, True))), and_(not_(not_(True)), not_(not_(True)))), True)))))), add('G', 'A'))) has height 18


Any suggestions?

I also realize that my grammar has a lot of recursive boolean expressions. Would that obscure the evaluation of the depth in some way?

I checked that decoration was properly applied (François-Michel's suggestion) and didn't see anything suspicious, but I'll double-check, because my trace seems to support that idea.

Manuel Belmadani

Mar 21, 2016, 11:27:12 PM
to deap-users
I should've double-checked earlier. Turns out that the code I used from the NSGA-IIR repository did not return the decorated individual.

Before:
        offspring = toolbox.preselect(population, len(population))
        offspring = [toolbox.clone(ind) for ind in offspring]

        for ind1, ind2 in zip(offspring[::2], offspring[1::2]):
            if random.random() <= cxpb:
                toolbox.mate(ind1, ind2)
            if random.random() > mutpb:
                toolbox.mutate(ind1)
            if random.random() > mutpb:
                toolbox.mutate(ind2)
            del ind1.fitness.values, ind2.fitness.values

After:
        offspring = toolbox.preselect(population, len(population))
        offspring = [toolbox.clone(ind) for ind in offspring]
        offspring = varAnd(offspring, toolbox, mu, cxpb, mutpb)


The new version does not seem to produce the error anymore. varAnd would essentially accomplish the same thing as the manual version, correct?
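For reference, the variation loop in `deap.algorithms.varAnd` has roughly this shape (a simplified sketch with integers standing in for individuals; `mate` and `mutate` here are toys, and the real varAnd also invalidates fitness values, which this omits). The crucial difference from the manual version above is that every operator result is reassigned back into the offspring list, so decorator effects survive:

```python
import random

def mate(a, b):
    # toy crossover; a decorated toolbox.mate likewise returns NEW values
    return (a + b, b - a)

def mutate(a):
    return (a + 1,)  # DEAP mutation operators return a 1-tuple

def var_and(population, cxpb, mutpb):
    """Shape of deap.algorithms.varAnd: results are REASSIGNED,
    never relied upon as in-place side effects."""
    offspring = list(population)
    for i in range(1, len(offspring), 2):
        if random.random() < cxpb:
            offspring[i - 1], offspring[i] = mate(offspring[i - 1], offspring[i])
    for i in range(len(offspring)):
        if random.random() < mutpb:
            offspring[i], = mutate(offspring[i])
    return offspring

print(var_and([1, 2], cxpb=1.0, mutpb=0.0))  # [3, 1]
print(var_and([1, 2], cxpb=0.0, mutpb=1.0))  # [2, 3]
```

Calling `toolbox.mate(ind1, ind2)` and discarding the return value, as in the "Before" version, keeps only the undecorated in-place variation.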

I'll rerun my batch of jobs to see if the error recurs, but hopefully this did the trick. 

François-Michel De Rainville

Mar 22, 2016, 9:10:00 AM
to deap-users
Where did you find the NSGA-IIR code?

Manuel Belmadani

Mar 22, 2016, 2:14:57 PM
to deap-...@googlegroups.com
I think this was it: https://github.com/DEAP/experimental/blob/master/fortin2013/nsga2.py#L100

I may have added a few things; it's been a while.
