Is there a better way of doing this?

mattia

unread,

Mar 6, 2009, 5:19:22 AM3/6/09

to

Hi, I'm new to python, and as the title says, can I improve this snippet
(readability, speed, tricks):

def get_fitness_and_population(fitness, population):
return [(fitness(x), x) for x in population]

def selection(fitness, population):
'''
Select the parent chromosomes from a population according to their
fitness (the better fitness, the bigger chance to be selected)
'''
selected_population = []
fap = get_fitness_and_population(fitness, population)
pop_len = len(population)
# elitism (it prevents a loss of the best found solution)
# take the only 2 best solutions
elite_population = sorted(fap)
selected_population += [elite_population[pop_len-1][1]] +
[elite_population[pop_len-2][1]]
# go on with the rest of the elements
for i in range(pop_len-2):
# do something

Chris Rebert

unread,

Mar 6, 2009, 5:32:33 AM3/6/09

to mattia, pytho...@python.org

Removing the unnecessary use of sorted() and using list.pop() rather
than explicit indices:

def selection(fitness, population):
'''
Select the parent chromosomes from a population according to their
fitness (the better fitness, the bigger chance to be selected)
'''

fap = get_fitness_and_population(fitness, population)
fap.sort()

# elitism (it prevents a loss of the best found solution)
# take the only 2 best solutions

selected_population = [fap.pop()[1] for i in range(2)]

# go on with the rest of the elements

for fit, pop in fap:
#do something

Cheers,
Chris

--
I have a blog:
http://blog.rebertia.com

mattia

unread,

Mar 6, 2009, 6:07:22 AM3/6/09

to

Great, the for statement has not to deal with fap anymore, but with
another sequence, like this:

def get_roulette_wheel(weight_value_pairs):
roulette_wheel = []
for weight, value in weight_value_pairs:
roulette_wheel += [value]*weight
return roulette_wheel

def selection(fitness, population):
...
rw = get_roulette_wheel(fap)
for i in range(pop_len-2):
selected_population += [choice(rw)]
return selected_population

I think that using [choice(rw)]*len(fap) will produce the same sequence
repeted len(fap) times...

Chris Rebert

unread,

Mar 6, 2009, 6:43:22 AM3/6/09

to mattia, pytho...@python.org

On Fri, Mar 6, 2009 at 3:07 AM, mattia <ger...@gmail.com> wrote:
> Great, the for statement has not to deal with fap anymore, but with
> another sequence, like this:
>
> def get_roulette_wheel(weight_value_pairs):
> roulette_wheel = []
> for weight, value in weight_value_pairs:
> roulette_wheel += [value]*weight
> return roulette_wheel
>
> def selection(fitness, population):
> ...
> rw = get_roulette_wheel(fap)
> for i in range(pop_len-2):
> selected_population += [choice(rw)]
> return selected_population
>
> I think that using [choice(rw)]*len(fap) will produce the same sequence
> repeted len(fap) times...

Revision to this new code:

def get_roulette_wheel(weight_value_pairs):
return [[value]*weight for weight, value in weight_value_pairs]

def selection(fitness, population):
...
rw = get_roulette_wheel(fap)

for i in range(len(fap)):
selected_population.append(choice(rw))
return selected_population

mattia

unread,

Mar 6, 2009, 6:52:11 AM3/6/09

to

Great, append is equivalent to += right? or more efficient?

Chris Rebert

unread,

Mar 6, 2009, 6:59:56 AM3/6/09

to mattia, pytho...@python.org

Yes. .append(item) is equivalent to += [item] and is more efficient.

Peter Otten

unread,

Mar 6, 2009, 8:06:14 AM3/6/09

to

mattia wrote:

def selection1(fitness, population, N=2):
rest = sorted(population, key=fitness, reverse=True)
best = rest[:N]
del rest[:N]
# work with best and rest

def selection2(fitness, population, N=2):
decorated = [(-fitness(p), p) for p in population]
heapq.heapify(decorated)

best = [heapq.heappop(decorated)[1] for _ in range(N)]
rest = [p for f, p in decorated]
# work with best and rest

Both implementations assume that you are no longer interested in the
individuals' fitness once you have partitioned the population in two
groups.

In theory the second is more efficient for "small" N and "large"
populations.

Peter

Paul Rubin

unread,

Mar 6, 2009, 8:18:00 AM3/6/09

to

Chris Rebert <cl...@rebertia.com> writes:
> for i in range(len(fap)):
> selected_population.append(choice(rw))

"for i in range(len(something))" is a bit of a code smell. You could
instead say:

selected_population.extend(choice(rw) for x in fap)

The unused "x" is also a slight code smell, but the most obvious cures
involve using the itertools module in ways that are worse than the disease.

mattia

unread,

Mar 6, 2009, 12:16:05 PM3/6/09

to

Ok, but the fact is that I save the best individuals of the current
population, than I'll have to choose the others elements of the new
population (than will be N-2) in a random way. The common way is using a
roulette wheel selection (based on the fitness of the individuals, if the
total fitness is 200, and one individual has a fitness of 10, that this
individual will have a 0.05 probability to be selected to form the new
population). So in the selection of the best solution I have to use the
fitness in order to get the best individual, the last individual use the
fitness to have a chance to be selected. Obviously the old population anf
the new population must have the same number of individuals.

Peter Otten

unread,

Mar 6, 2009, 4:28:00 PM3/6/09

to

mattia wrote:

You're right, it was a bad idea.

Peter

mattia

unread,

Mar 6, 2009, 4:40:31 PM3/6/09

to

Here is my last shot, where I get rid of all the old intermediate
functions:

def selection(fitness, population):
lp = len(population)
roulette_wheel = []
for x in population:
roulette_wheel += [x]*fitness(x)
selected_population = [[]]*lp
selected_population[:2] = sorted(population, key=fitness,
reverse=True)[:2]
selected_population[2:] = [choice(roulette_wheel) for _ in range
(lp-2)]

andrew cooke

unread,

Mar 6, 2009, 4:46:44 PM3/6/09

to pytho...@python.org

i have not been following this discussion in detail, so someone may have
already explained this, but it should not be necessary to actually
construct the roulette wheel to select values from it. what you are doing
is selecting from a list where the there are different probabilities of
selecting different entries. i am pretty sure that can be done more
efficiently than by constructing a new list with many more entries whose
aim is to simulate that (which is what the roulette wheel seems to be in
your code, if i have understood correctly).

more precisely, i think you can adapt the trick used to select a line at
random from a file by scanning the file just once.

sorry if i have misunderstood,
andrew

Scott David Daniels

unread,

Mar 6, 2009, 5:13:47 PM3/6/09

to

mattia wrote:
> Here is my last shot, where I get rid of all the old intermediate
> functions:
>
> def selection(fitness, population):
> lp = len(population)
> roulette_wheel = []
> for x in population:
> roulette_wheel += [x]*fitness(x)
> selected_population = [[]]*lp
> selected_population[:2] = sorted(population, key=fitness,
> reverse=True)[:2]
> selected_population[2:] = [choice(roulette_wheel) for _ in range
> (lp-2)]

Try something like this to choose likely couples:

import random
import bisect

def choose_pairs(fitness_population, decider=random):
'''Pick and yield pairs weighted by fitness for crossing.

We assume weighted_population has fitness already calculated.
decide is a parameter to allow testing.
'''
total = 0
cumulative = []
candidates = []
for fitness, individual in set(fitness_population):
# calculate total weights, extract real candidates
if fitness > 0:
total += fitness
cumulative.append(total)
candidates.append(individual)
assert len(candidates) > 1
while True:
# pick a candidate by weight
c0 = decider.random() * total
first = bisect.bisect_left(cumulative, c0)
if first:
weighting = cumulative[first] - cumulative[first - 1]
else:
weighting = cumulative[0]
# pick another distinct candidate by fitness
c1 = choice = decider.random() * (total - weighting)
if choice >= cumulative[first] - weighting:
choice += weight # adjust to avoid selecting first
second = bisect.bisect_left(cumulative, choice)
yield candidates[first], candidates[second]

--Scott David Daniels
Scott....@Acm.Org

mattia

unread,

Mar 6, 2009, 5:13:43 PM3/6/09

to

Well, I believe that using the right distribution I can for sure find a
better way for doing the roulette wheel selection. When I'll have enough
time I'll pick up my statistics book.

mattia

unread,

Mar 6, 2009, 6:31:01 PM3/6/09

to

Thanks, I've found another solution here: http://www.obitko.com/tutorials/
genetic-algorithms/selection.php
so here is my implementation:

def create_chromosome(min, max, length):
return [randint(min, max) for i in range(length)]

def create_population(nelem, min, max, length):
# preconditions: nelem > 1 and nelem is even
if not nelem > 1:
nelem = 2
if not nelem%2 == 0:
print("The population must have an even number of elements.
Correcting...")
nelem += 1
return [create_chromosome(min, max, length) for i in range(nelem)]

def get_fap(fitness, population):
fap = []
total = 0
for x in population:
f = fitness(x)
fap += [(f, x)]
total += f
return sorted(fap, reverse=True), total

def my_rw():
list, tot = get_fap(sum, pop)
r = randint(0, tot-1)
i = 0
print(r)
for f, e in list:
i += f
print(i)
if i > r:
return e
return [] # never reached

if __name__ == "__main__":
pop = create_population(5, 0, 1, 10)
# selection_mat(sum, pop)
#print(rw(sum, pop))
list, tot = get_fap(sum, pop)
print(list)
print(tot)
for i in range(6):
print(my_rw())

Gabriel Genellina

unread,

Mar 6, 2009, 9:05:53 PM3/6/09

to pytho...@python.org

En Fri, 06 Mar 2009 21:31:01 -0200, mattia <ger...@gmail.com> escribió:

> Thanks, I've found another solution here:
> http://www.obitko.com/tutorials/
> genetic-algorithms/selection.php
> so here is my implementation:
>
>

> def get_fap(fitness, population):
> fap = []
> total = 0
> for x in population:
> f = fitness(x)
> fap += [(f, x)]
> total += f
> return sorted(fap, reverse=True), total

Imagine you're working with someone side by side. You write a note in a
piece of paper, put it into an envelope, and hand it to your co-worker. He
opens the envelope, throws it away, takes the note and files it inside a
folder right at the end. And you do this over and over. What's wrong in
this story?

Please save our trees! Don't waste so many envelopes - that's just what
this line does:

fap += [(f, x)]

Environmentally friendly Pythoneers avoid using discardable intermediate
envelopes:

fap.append((f, x))

Please recycle!

--
Gabriel Genellina

Paul Rubin

unread,

Mar 6, 2009, 9:14:44 PM3/6/09

to

"Gabriel Genellina" <gags...@yahoo.com.ar> writes:
> > for x in population:
> > f = fitness(x)
> > fap += [(f, x)]
> > total += f
> > return sorted(fap, reverse=True), total

> ...

> Environmentally friendly Pythoneers avoid using discardable
> intermediate envelopes:
>
> fap.append((f, x))

I'd probably use:

fap = list((fitness(x),x) for x in population)
total = sum(x for x,y in fap)
return sorted(fap, reverse=True), total

mattia

unread,

Mar 7, 2009, 4:12:07 AM3/7/09

to

Yes, sorry, I have to recycle! But how about this:
>>> rw = [[2,4], [4,5,6],[5,5]]
>>> rw += [[1,1]]*2
>>> rw
[[2, 4], [4, 5, 6], [5, 5], [1, 1], [1, 1]]
>>> rw = [[2,4], [4,5,6],[5,5]]
>>> rw.append([1,1]*2)
>>> rw
[[2, 4], [4, 5, 6], [5, 5], [1, 1, 1, 1]]
>>> rw = [[2,4], [4,5,6],[5,5]]
>>> rw.append([[1,1]]*2)
>>> rw
[[2, 4], [4, 5, 6], [5, 5], [[1, 1], [1, 1]]]
>>>
How can I recicle in this way using append?

Lie Ryan

unread,

Mar 7, 2009, 7:11:22 AM3/7/09

to

Not .append() but .extend()

>>> rw = [[2,4], [4,5,6],[5,5]]

>>> rw.extend([[1,1]]*2)

Peter Otten

unread,

Mar 7, 2009, 7:28:27 AM3/7/09

to

Lie Ryan wrote:

> mattia wrote:

>> Yes, sorry, I have to recycle! But how about this:

>>>>> rw = [[2,4], [4,5,6],[5,5]]
>>>>> rw += [[1,1]]*2
>>>>> rw
>> [[2, 4], [4, 5, 6], [5, 5], [1, 1], [1, 1]]

>> How can I recicle in this way using append?
>
> Not .append() but .extend()

Whether you use

items += [item]*N

or

items.extend([item]*N)

is mostly a matter of style. You can avoid the intermediate list with

items.extend(itertools.repeat(item, N))

but I don't think this approach is faster.

Peter

Steven D'Aprano

unread,

Mar 7, 2009, 11:49:56 PM3/7/09

to

Gabriel Genellina wrote:

> Imagine you're working with someone side by side. You write a note in a
> piece of paper, put it into an envelope, and hand it to your co-worker. He
> opens the envelope, throws it away, takes the note and files it inside a
> folder right at the end. And you do this over and over. What's wrong in
> this story?
>
> Please save our trees! Don't waste so many envelopes

Nice story, but the moral "conserve what you use" is not always good advice.
Bits are not envelopes -- sometimes it is more environmentally friendly to
throw them away and create new ones. Consider:

mylist[:] = [x for x in mylist if not condition(x)]

versus:

for i in xrange(len(mylist)-1, -1, -1):
x = mylist[i]
if condition(x):
del mylist[i]

The first "wastefully" creates a new list, and the second tries to recycle
bits by deleting the items in place. Unless mylist is so huge that your
computer starts thrashing trying to make two copies in memory, the first is
not only simpler to write and understand, but almost certainly much, much
faster than the second.

That's not the case in this specific example, but as a general principle,
it's worth remembering that it's often better to be wasteful with temporary
objects than to use miserly algorithms invented for machines with 64K of
memory.

(The same lessons can apply for re-world considerations as well. Recycling
doesn't just happen, it requires energy and water and other costs. If the
environmental costs of recycling something are worse than the environmental
costs of throwing it away and making a new one, then recycling that object
is actually harmful. But I digress.)

--
Steven

Tim Wintle

unread,

Mar 8, 2009, 3:02:59 AM3/8/09

to pytho...@python.org

On Sun, 2009-03-08 at 15:49 +1100, Steven D'Aprano wrote:
> If the environmental costs of recycling something are worse than the
> environmental costs of throwing it away and making a new one, then
> recycling that object is actually harmful. But I digress.

Unless you live in a country that imports most of these goods, in which
case by recycling you keep money in the economy rather than buying goods
from elsewhere - it's never mentioned, but I'm fairly certain that's one
of the main reasons that the UK government loves forcing us to recycle
so much.

(obviously doesn't change how "environmentally harmful" something is)

Tim Wintle