Strongly typed GP without mathematical functions?

800 views
Skip to first unread message

Randy Olson

unread,
Jun 17, 2016, 4:04:00 PM6/17/16
to deap-users
Hello,

We've been using DEAP for a project with strongly typed GP. We have primitives that take parameters such as:

_primitive_operator(DataFrame, int, float): # returns DataFrame

We're having issues with DEAP when we want to have a fixed set of int and float terminals to pass as int and float parameters to the primitives. Yet DEAP keeps throwing errors when it wants to expand a GP tree and there are no primitives that produce an int or float.

We've found that creating "dummy" primitives that take a float and return the same float (and same for ints) makes the error go away, but that leads to "dummy primitive bloat" that we don't want. How can we prevent DEAP from attempting to expand the int and float parameters and only use the terminals for those instead?

Cheers,
Randy Olson

Manuel Belmadani

unread,
Jun 20, 2016, 4:46:18 PM6/20/16
to deap-users
Is this the error message you're getting?

IndexError: The gp.generate function tried to add a primitive of type '<type 'bool'>', but there is none available.

My terminals are bools, so I got around this using unary boolean operators as primitives, so at least I'm getting some kind of change in my individuals rather than bloated primitives.  But I'd be interested to know if there's a better/correct way to work around this.

Randy Olson

unread,
Jun 20, 2016, 5:01:33 PM6/20/16
to deap-users
Yes, we've been getting errors similar to that. Previously we had mathematical operators acting on the int and float parameters, but we ended up with bloated mathematical operators -- sometimes a dozen or more of them just to represent a single value. Very wasteful, and causes the mutations to primarily focus on making minor changes to a single int/float parameter rather than the more important aspects of the GP tree.

Daniel Angell

unread,
Jun 20, 2016, 9:09:47 PM6/20/16
to deap-users
def _gen_grow_safe(self, pset, min_, max_, type_=None):
    def condition(height, depth, type_):
        """Expression generation stops when the depth is equal to height
        or when it is randomly determined that a a node should be a terminal.
        """
        return type_ != pd.DataFrame or depth == height

    return self._generate(pset, min_, max_, condition, type_)

# Generate function stolen straight from deap.gp.generate
def _generate(self, pset, min_, max_, condition, type_=None):
    if type_ is None:
        type_ = pset.ret
    expr = []
    height = random.randint(min_, max_)
    stack = [(0, type_)]
    while len(stack) != 0:
        depth, type_ = stack.pop()

        # We've added a type_ parameter to the condition function
        if condition(height, depth, type_):
            try:
                term = random.choice(pset.terminals[type_])
            except IndexError:
                _, _, traceback = sys.exc_info()
                raise IndexError("The gp.generate function tried to add "
                                  "a terminal of type '%s', but there is "
                                  "none available." % (type_,)).with_traceback(traceback)
            if inspect.isclass(term):
                term = term()
            expr.append(term)
        else:
            try:
                prim = random.choice(pset.primitives[type_])
            except IndexError:
                _, _, traceback = sys.exc_info()
                raise IndexError("The gp.generate function tried to add "
                                  "a primitive of type '%s', but there is "
                                  "none available." % (type_,)).with_traceback(traceback)
            expr.append(prim)
            for arg in reversed(prim.args):
                stack.append((depth+1, arg))
    return expr

This is what Randy and I have in TPOT right now to get around this issue. Here pd.DataFrame is the input and output data type, and the only data type with primitives.

Marc-André Gardner

unread,
Jun 21, 2016, 2:34:53 AM6/21/16
to deap-users
Hi everyone,

This is a recurring question with DEAP, so I guess it should be the time to think of a more comprehensive fix than "you should not do that". To sum up, three problems can happen when there is no terminal / primitive of a given type :
  • If there is no primitive of a given type, then DEAP may have to stop the tree generation here for a branch: it has no other choices than adding a terminal.
  • If there is no terminal of a given type, then DEAP may have to continue the tree generation process, even if the stopping condition was reached.
  • If there is no primitive nor terminal of a given type, then it is simply impossible to generate a tree.

I think we all agree that the third case is and should stay an error: there is no point in requiring a non-existing type in a tree. The issue is with the other cases. Both of them may make the gp.gen** functions behave unpredictably. For instance, the doc currently says that the min and max parameters will be the minimum and maximum height of the produced tree, but this does not hold if DEAP cannot assume that there is at least one primitive and one terminal returning each type.


If there is no primitive of a given type, then DEAP may also produce an "unbreadable" tree, that would not be able to grow with crossover or mutations since only terminals can be swapped. Besides, if there is no terminal of a given type, DEAP might enter in a recursion loop, adding repeatedly a primitive taking as argument a type that no terminal has -- depending on the primitive set, this may be highly likely or unlikely. Even if it does finally end, the produced tree may then exceed the fatal 90, after which Python parser refuses to evaluate the tree.


As DEAP developpers, we do want to listen to our users, especially when there are sensible requests. However, we also want to avoid a change that would negatively affect many other users, including beginners. These exceptions are currently more like safe-guards.

So, I ask you, since you clearly have a good knowledge of STGP and DEAP internals, what should we do? I suggest three possible ways here, don't hesitate if you see others :


- Add a member to the primitive set stating if we want to use it in strict or "tolerant" mode. Strict mode would be the default, and would provide the same behavior as the one currently implemented. Tolerant mode (I'd have to find a better name, too) would deactivate all error catching: DEAP would just run its generate function until the tree is finished or until everything crashed.


- Same thing, but by adding an additionnal argument to the generate functions (after the type_ argument). I'm not a fan of adding more and more arguments, but this has the advantage to be able to jointly use strict and "tolerant" mode, with the same pset.


- We remove the exception, but only for the no primitive case. It remains an error when there is no terminal of a given type, the rationale behind that being that a smaller tree it usually not a problem, while a larger one (or even a recursion error) is. DEAP would first try to add a primitive (to fit the required min height), and only if there's none it would fallback to terminal (and throw an error if there is no suitable terminal too). But that would break the compatibility with older versions of DEAP, since the min argument would not mean the same thing.


Do you have other ideas or thoughts about it? Don't forget that we try to stay generic here.


Have a good day,


Marc-André

Daniel Angell

unread,
Jun 21, 2016, 4:54:31 PM6/21/16
to deap-users
Honestly I'm mostly in favor of your third suggestion, only removing the exception in the case of no primitives.

Manuel Belmadani

unread,
Jun 28, 2016, 10:13:28 PM6/28/16
to deap-users
I have a few points to share:

On how to transition the changes: I think I prefer suggestion #1 with the strict/tolerant optional argument on the pset. It could be bad if someone assumes the exception is still there and checks out the latest version of DEAP, then makes changes that would normally be caught by the exception. Also, for those who *do* want the tolerant mode, it's a very simple addition to their code

If I understand tolerant mode/suggestion #3 correctly, DEAP would try to run the generate function, potentially until it crashes (tree depth > 90), and if it doesn't crash, it could still take a very long time to complete, correct? My issue with this is that you could have a run going for a few hours only to end up with a crash, by pure randomness.  I imagine this workaround will typically make individual creation much more costly, making runtime more so unpredictable, right? That's another reason why I prefer strict/tolerant rather than suggestion #3; if I try a brand new grammar that runs into this issue, I'll be warned/errored unless I explicitly said not to.

On how to replace the exception: Instead of running the generate function until the tree is finished, would it be possible to invalidate/strike out the individual altogether? This sounds simpler to me, as min and max is still respected. Of course, then if we're expecting lambda new individuals per generation, we'd have (lambda - (#invalid inds)). In my case, I would prefer to have a slightly smaller batch of offspring but know their constraints, and know that they won't crash during generation. I suppose a crash is still possible if you run into a case where all your individuals are invalidated in the initial population. I'm not saying this method is better than the other suggestions, but in my specific (selfish!) requirements, I think this would work better. Just thought I'd mention it in case anyone else finds it interesting.

Finally, would it be difficult to override this manually? Worst comes to worst, I wouldn't mind extending this myself to try it out and keep DEAP as is. Maybe it'll turn out that the "strict" grammars are better at  converging on my specific problem anyways.

Thank you for opening the conversation, and sorry for not posting this sooner (was away on vacation).

Cheers

Robin Emig

unread,
Apr 22, 2021, 9:04:57 PM4/22/21
to deap-users
I wanted to check in on this to see if there is now a more preferred way. I have a problem which takes 8 input numbers and generate 8 output numbers but often fails with
(<class 'IndexError'>, IndexError("The gp.generate function tried to add a terminal of type '<class 'list'>', but there is none available."), <traceback object at 0x00000193FC1B1040>)

using the method where there is a primative which takes 8 numbers and returns a list


Derek Tishler

unread,
Apr 23, 2021, 8:44:14 AM4/23/21
to deap-users
As per the error and past experience with custom types in STGP; I believe In order to complete the trees the gp is missing a terminal of type List. If you can try a ones or zero List or some sort of terminal of this type your trees should generate successfully.

Annapareddy Satyanarayana reddy

unread,
Nov 18, 2021, 3:48:46 PM11/18/21
to deap-users

dp.PNGNeed help for how to code this formula in python 
Reply all
Reply to author
Forward
0 new messages