# How do I generate a diverse set of individuals without infeasible solutions?


### Caroline Gebara

Jun 28, 2022, 6:47:16 AM
to deap-users
Hi

I am trying to find "all" possible diets consisting of combinations of different food items, subject to a set of constraints (e.g. protein > 60 g, 1800 < kcal < 2200, etc.). I found that a GA might be useful for solving this problem, although I'm new to this topic. Thus my questions are:
1. How can I avoid evaluating infeasible individuals, so that I only keep individuals that fulfil my constraints?
2. Is there a way to store all feasible solutions from the generations, including non-optimal?
I found a similar example using DEAP here: https://jooskorstanje.com/genetic-algorithm-optimize-your-diet.html; however, the example does not cover constraints and it only investigates the optimal solutions. From the webpage of the DEAP library, I found that I could use a penalty function to penalise infeasible individuals, e.g. the example below:

```python
def feasible(individual):
    individual = individual[0]
    if (1800. < sum(x*y for x, y in zip(kcal_data, individual)) < 2200. and
            60.5 < sum(x*y for x, y in zip(prot_data, individual)) and
            ... and so on ...):
        return True
    return False

toolbox.decorate("evaluate", tools.DeltaPenalty(feasible, 0))
```

However, my understanding is that this just adds a value to the fitness but still keeps the individual. I have approx. 35 constraints. Would there be a way to put the constraints inside the evaluation function? Or to delete infeasible individuals entirely instead of penalising them?
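A sketch of the first idea, folding the constraint check into the evaluation function itself (the data arguments, bounds, and names here are illustrative, not DEAP's actual `evaluate` signature, and this assumes a minimising fitness):

```python
# Sketch only: constraint check inside the evaluation function.
# `kcal_data` / `prot_data` and the bounds are placeholders; a real
# DEAP evaluate() would usually close over the data rather than take
# it as extra arguments.
INFEASIBLE = (float("inf"),)  # sentinel fitness, assuming minimisation

def evaluate(individual, kcal_data, prot_data):
    kcal = sum(x * y for x, y in zip(kcal_data, individual))
    prot = sum(x * y for x, y in zip(prot_data, individual))
    if not (1800.0 < kcal < 2200.0 and prot > 60.5):
        return INFEASIBLE  # never competitive with any feasible diet
    return (kcal,)  # stand-in for the real objective
```

With a sentinel like this, selection quickly drives infeasible individuals out, which is close in spirit to what `DeltaPenalty` does, just hand-rolled.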

For my second question, I could store the last population/offspring, but then I only obtain the best individuals. Since I am interested in a diverse set of solutions that are not necessarily optimal, is there a way to globally store every solution that respects my set of constraints?

Any hint would be appreciated,

Thanks!
Caroline

### Taavi Luik

Jun 29, 2022, 12:02:15 PM
to deap-users
Hi Caroline,

You can have a look at Quality-Diversity algorithms and the qdpy library, which is compatible with DEAP (and perhaps this example). Quality-Diversity algorithms introduce a container to track solutions globally (have a look at the MAP-Elites algorithm as well), in which you define which dimensions you are interested in and their feasibility regions. Solutions that fall outside the feasibility region are not added to the container. Adding new dimensions becomes quite easy, since all the heavy lifting is done by the library. The library has not been updated in a year, though, and when I dug into the source code at some point I found some bugs. But it's still worth checking out!
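For a flavour of what such a container does, here is a minimal MAP-Elites-style sketch (this is not the qdpy API; the descriptors, bounds, and bin counts are made up): solutions are binned by two feature descriptors, anything outside the feasibility region is rejected, and each bin keeps only its best solution.

```python
import random

random.seed(1)

KCAL_BOUNDS = (1800.0, 2200.0)   # illustrative feasibility region
PROT_MIN = 60.0
N_BINS = 10

container = {}  # (kcal_bin, prot_bin) -> (fitness, solution)

def try_add(solution, kcal, protein, fitness):
    lo, hi = KCAL_BOUNDS
    if not (lo < kcal < hi and protein > PROT_MIN):
        return False  # outside the feasibility region: never stored
    # Map the descriptors to grid cells.
    kcal_bin = min(int((kcal - lo) / (hi - lo) * N_BINS), N_BINS - 1)
    prot_bin = min(int(protein // 20), N_BINS - 1)
    key = (kcal_bin, prot_bin)
    # Each cell keeps only its best (elitism per bin).
    if key not in container or fitness > container[key][0]:
        container[key] = (fitness, solution)
        return True
    return False

# Toy usage: random "diets" with random descriptors.
for i in range(200):
    try_add(solution=i,
            kcal=random.uniform(1500, 2500),
            protein=random.uniform(30, 150),
            fitness=random.random())
```

The container ends up holding a diverse grid of feasible solutions, not just the single best one, which is exactly the "store all feasible, non-optimal solutions" behaviour you asked about.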

Without the qdpy library: if the evaluation function does not take too much time (which I think it shouldn't), you can do the evaluation first and filter out invalid solutions afterwards - just add one intermediate step.
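A minimal sketch of that intermediate step, with toy stand-ins for the toolbox functions (`evaluate`, `feasible`, and the constraint are all placeholders): evaluate everyone, then keep only feasible individuals, and append them to a global archive so nothing feasible is ever lost.

```python
import random

random.seed(0)

# Toy stand-ins for the real DEAP toolbox functions / constraints.
def evaluate(individual):
    return (sum(individual),)  # single-objective fitness tuple

def feasible(individual):
    return 3 <= sum(individual) <= 7  # placeholder constraint

population = [[random.randint(0, 2) for _ in range(5)] for _ in range(20)]
archive = []  # global store of every feasible solution seen so far

# The intermediate step: evaluate first, then filter.
fitnesses = [evaluate(ind) for ind in population]  # in DEAP, assign these to ind.fitness.values
feasible_pop = [ind for ind in population if feasible(ind)]
archive.extend(feasible_pop)
```

In the real loop this filter-and-archive step would run once per generation, right after the evaluation of the offspring.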

Best,

Taavi
Junior Research Fellow
University of Tartu, Estonia

### Caroline Gebara

Jun 29, 2022, 3:32:01 PM
to deap-users
Hi Taavi

Thank you, I'll take a look at the qdpy library - the container sounds cool!

I have tried to add the filter and store the intermediate solutions. However, I don't get any feasible solutions, although I know that some exist. I suspect this is because the model converges towards the optimised indicator without checking for feasibility, so the solutions that I store do not cover the feasible region. Running more iterations and individuals than I do now (i.e. 1000 individuals * 500 generations) already takes quite a long time. Do you know of a smart way to easily delete infeasible solutions every time an individual is evaluated?

What I have tried so far is adding a feasibility function:

```python
def feasible(individual):
    individual = individual[0]
    if (
        # Nutritional constraints
        7*1800 < sum(x*y for x, y in zip(list(adict['Calories']), individual)) < 7*2200 and
        7*50 < sum(x*y for x, y in zip(list(adict['Protein']), individual)) and
        ... and so on ...
    ):
        return True
    return False
```

And then I have run the model and filtered by the feasible solutions:

```python
pop = toolbox.population(n=1000)

# Evaluate the entire population
fitnesses = list(map(toolbox.evaluate, pop))
for ind, fit in zip(pop, fitnesses):  # individual and fitness value in population
    ind.fitness.values = fit

# CXPB  is the probability with which two individuals are crossed
# MUTPB is the probability for mutating an individual
CXPB, MUTPB = 0.8, 0.3

# Extract all the fitnesses of the population
fits = [ind.fitness.values[0] for ind in pop]

# Variable keeping track of the number of generations
g = 0

# Check for feasible solutions in the initial population
pop_init_valid = list(filter(feasible, pop))

# Make an empty DataFrame to store feasible solutions at intermediate steps
pop_all = pd.DataFrame()
pop_all = pop_all.append(pop_init_valid)

fits_all = pd.DataFrame()

# Begin the evolution
while g < 500:
    # A new generation
    g = g + 1
    # print("-- Generation %i --" % g)

    # Select the next generation individuals
    offspring = toolbox.select(pop, len(pop))

    # offspring = list(filter(feasible, offspring))
    # Clone the selected individuals
    offspring = list(map(toolbox.clone, offspring))

    # Apply crossover and mutation on the offspring
    for child1, child2 in zip(offspring[::2], offspring[1::2]):
        if random.random() < CXPB:
            toolbox.mate(child1[0], child2[0])
            del child1.fitness.values
            del child2.fitness.values

    for mutant in offspring:
        if random.random() < MUTPB:
            toolbox.mutate(mutant[0])
            del mutant.fitness.values

    # valid_ind = [ind for ind in offspring if ind.fitness.valid]

    # Evaluate the individuals with an invalid fitness
    invalid_ind = [ind for ind in offspring if not ind.fitness.valid]
    fitnesses = map(toolbox.evaluate, invalid_ind)
    for ind, fit in zip(invalid_ind, fitnesses):
        ind.fitness.values = fit

    pop[:] = offspring

    # Gather all the fitnesses in one list
    fits = [ind.fitness.values[0] for ind in pop]

    pop_all = pop_all.append(list(filter(feasible, pop)))
```

This is probably what qdpy could take care of (?), but maybe there is a way to do it without :)

Best
Caroline

### Taavi Luik

Jun 30, 2022, 6:06:51 AM
to deap-users
Check that you are evaluating the feasibility and discarding infeasible solutions at every generation, not just at the end (I can't check it that well here because indentation is shown quite poorly in this forum).
I think that replacing `pop[:] = offspring` with `pop[:] = list(filter(feasible, offspring))` might do the trick. But then you might have very few individuals left in your population after a few generations - it depends on your population size, how the initial population is generated, and how selection is done.
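One way to sketch that trade-off (all names and the constraint here are placeholders): keep only the feasible offspring, then top up the survivors with fresh individuals so the population never shrinks.

```python
import random

random.seed(2)

POP_SIZE = 30

def random_individual():
    return [random.randint(0, 3) for _ in range(5)]

def feasible(ind):
    return 4 <= sum(ind) <= 12  # placeholder constraint

pop = [random_individual() for _ in range(POP_SIZE)]
offspring = [random_individual() for _ in range(POP_SIZE)]

# Keep only the feasible offspring...
survivors = [ind for ind in offspring if feasible(ind)]
# ...then top up with fresh feasible individuals so the
# population does not collapse over the generations.
while len(survivors) < POP_SIZE:
    candidate = random_individual()
    if feasible(candidate):
        survivors.append(candidate)
pop[:] = survivors
```

With 35 tight real-world constraints, random top-ups may rarely be feasible, so refilling from an archive of previously found feasible solutions is usually cheaper than rejection sampling like this.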

Yep, qdpy would do the filtering of solutions and the tracking of diverse, high-quality solutions for you, but you can probably implement these things yourself. If you care about the diversity of the final solutions (regardless of whether you use qdpy), select parents from the global archive (your pop_all variable), which is what MAP-Elites does. A steady-state population tends to get stuck in local optima. With qdpy, adding new dimensions in the future means changing just a few lines when you create the container, and I suspect that qdpy's filtering is also faster than your current approach.
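A sketch of archive-based parent selection (the names are illustrative, and a uniform random pick stands in for MAP-Elites' per-cell selection): parents come from the global archive rather than the current population, so new offspring are generated all over the feasible region instead of only near the current optimum.

```python
import random

random.seed(3)

# Pretend archive of feasible solutions found so far (e.g. pop_all).
archive = [[random.randint(0, 2) for _ in range(4)] for _ in range(50)]

def mutate(parent):
    child = list(parent)             # copy, don't touch the archive
    i = random.randrange(len(child))
    child[i] = random.randint(0, 2)  # illustrative point mutation
    return child

# Draw parents uniformly from the archive instead of the population.
batch = [mutate(random.choice(archive)) for _ in range(10)]
```

Each new batch would then be evaluated, and any feasible children added back into the archive, closing the loop.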

Best,
Taavi