_primitive_operator(DataFrame, int, float): # returns DataFrame
IndexError: The gp.generate function tried to add a primitive of type '<type 'bool'>', but there is none available.
import inspect
import random
import sys

import pandas as pd


def _gen_grow_safe(self, pset, min_, max_, type_=None):
    def condition(height, depth, type_):
        """Expression generation stops when the depth is equal to height
        or when the requested type is not a DataFrame.
        """
        return type_ != pd.DataFrame or depth == height

    return self._generate(pset, min_, max_, condition, type_)


# Generate function stolen straight from deap.gp.generate
def _generate(self, pset, min_, max_, condition, type_=None):
    if type_ is None:
        type_ = pset.ret
    expr = []
    height = random.randint(min_, max_)
    stack = [(0, type_)]
    while len(stack) != 0:
        depth, type_ = stack.pop()

        # We've added a type_ parameter to the condition function
        if condition(height, depth, type_):
            try:
                term = random.choice(pset.terminals[type_])
            except IndexError:
                _, _, traceback = sys.exc_info()
                raise IndexError("The gp.generate function tried to add "
                                 "a terminal of type '%s', but there is "
                                 "none available." % (type_,)).with_traceback(traceback)
            if inspect.isclass(term):
                term = term()
            expr.append(term)
        else:
            try:
                prim = random.choice(pset.primitives[type_])
            except IndexError:
                _, _, traceback = sys.exc_info()
                raise IndexError("The gp.generate function tried to add "
                                 "a primitive of type '%s', but there is "
                                 "none available." % (type_,)).with_traceback(traceback)
            expr.append(prim)
            # Push the arguments in reverse so they are expanded left to right
            for arg in reversed(prim.args):
                stack.append((depth + 1, arg))
    return expr
I think we all agree that the third case is and should stay an error: there is no point in requiring a non-existing type in a tree. The issue is with the other cases. Both of them may make the gp.gen* functions behave unpredictably. For instance, the documentation currently says that the min and max parameters will be the minimum and maximum height of the produced tree, but this does not hold if DEAP cannot assume that there is at least one primitive and one terminal returning each type.
If there is no primitive of a given type, DEAP may also produce an "unbreedable" tree, one that cannot grow through crossover or mutation since only terminals can be swapped. Conversely, if there is no terminal of a given type, DEAP might enter a runaway loop, repeatedly adding a primitive that takes as argument a type that no terminal has; depending on the primitive set, this may be highly likely or unlikely. Even if generation does eventually end, the produced tree may exceed the fatal height of 90, beyond which the Python parser refuses to evaluate the tree.
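To make the two problem cases concrete, here is a minimal sketch of a check that lists which requested types lack a terminal and which lack a primitive. It deliberately does not use DEAP: `Prim` and `audit_types` are hypothetical names, plain dicts stand in for `pset.primitives` and `pset.terminals`, and `list` stands in for the DataFrame type from the example above.

```python
from collections import namedtuple

# Hypothetical stand-in for a typed primitive: name, argument types, return type.
Prim = namedtuple("Prim", ["name", "args", "ret"])


def audit_types(primitives, terminals, ret_type):
    """Return (types_without_terminal, types_without_primitive).

    `primitives` maps a return type to a list of Prim and `terminals` maps a
    return type to a list of terminal values. Only types that can actually be
    requested (the root type or some primitive's argument) are reported.
    """
    requested = {ret_type}
    for prims in primitives.values():
        for prim in prims:
            requested.update(prim.args)
    no_terminal = {t for t in requested if not terminals.get(t)}
    no_primitive = {t for t in requested if not primitives.get(t)}
    return no_terminal, no_primitive


# One primitive returning the "data" type, taking (data, int, float).
primitives = {list: [Prim("scale", (list, int, float), list)]}
terminals = {list: ["input_data"], int: [1, 2], float: [0.5]}

no_terminal, no_primitive = audit_types(primitives, terminals, list)
```

With this set, every requested type has a terminal, but `int` and `float` have no primitive, which is exactly the situation that triggers the "no primitive available" error when generation asks for one of them above the chosen height.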
As DEAP developers, we do want to listen to our users, especially when the requests are sensible. However, we also want to avoid a change that would negatively affect many other users, including beginners. These exceptions currently act as safeguards.
So, I ask you, since you clearly have a good knowledge of STGP and DEAP internals: what should we do? I suggest three possible ways here; don't hesitate to propose others:
- Add a member to the primitive set stating whether we want to use it in strict or "tolerant" mode. Strict mode would be the default, and would provide the same behavior as the one currently implemented. Tolerant mode (I'd have to find a better name) would deactivate all error catching: DEAP would just run its generate function until the tree is finished or until everything crashes.
- Same thing, but by adding an additional argument to the generate functions (after the type_ argument). I'm not a fan of adding more and more arguments, but this has the advantage of allowing strict and "tolerant" mode to be used jointly, with the same pset.
- We remove the exception, but only for the no-primitive case. It remains an error when there is no terminal of a given type, the rationale being that a smaller tree is usually not a problem, while a larger one (or even a recursion error) is. DEAP would first try to add a primitive (to fit the required min height), and only if there is none would it fall back to a terminal (and throw an error if there is no suitable terminal either). But that would break compatibility with older versions of DEAP, since the min argument would no longer mean the same thing.
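
The third option could look roughly like the sketch below. This is only an illustration under stated assumptions, not DEAP code: `Prim` and `generate_with_fallback` are hypothetical names, plain dicts stand in for `pset.primitives` and `pset.terminals`, and the loop follows the "full" strategy (always prefer a primitive until the chosen height is reached).

```python
import random
from collections import namedtuple

# Hypothetical stand-in for a typed primitive: name, argument types, return type.
Prim = namedtuple("Prim", ["name", "args", "ret"])


def generate_with_fallback(primitives, terminals, ret_type, min_, max_):
    """Build a prefix-order expression, preferring primitives above `height`.

    If no primitive returns the requested type, fall back to a terminal
    instead of raising. It is still an error when no terminal exists.
    """
    height = random.randint(min_, max_)
    expr = []
    stack = [(0, ret_type)]
    while stack:
        depth, type_ = stack.pop()
        prims = primitives.get(type_, [])
        if depth < height and prims:
            prim = random.choice(prims)
            expr.append(prim)
            # Push arguments in reverse so they are expanded left to right.
            for arg in reversed(prim.args):
                stack.append((depth + 1, arg))
        else:
            terms = terminals.get(type_, [])
            if not terms:
                raise IndexError("No terminal of type %r is available." % (type_,))
            expr.append(random.choice(terms))
    return expr


primitives = {list: [Prim("scale", (list, int, float), list)]}
terminals = {list: ["input_data"], int: [1, 2], float: [0.5]}

# int and float have no primitive, so those branches end early with a terminal.
expr = generate_with_fallback(primitives, terminals, list, 2, 2)
```

Because the depth never exceeds the chosen height, the loop always terminates, but branches of a type without primitives stop early. That is the compatibility break mentioned above: min no longer guarantees a minimum height on every branch.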
Do you have other ideas or thoughts about it? Don't forget that we try to stay generic here.
Have a good day,
Marc-André