1) Don't use recursion.
2) Google is your friend.
-Carsten
I thought that this was a really interesting question, so I wrote up a
solution that doesn't use recursion. I didn't test it a whole lot, but
I'm pretty sure it works -- let me know if there are any oversights or
if you can make any improvements ;-) Also, not sure if the line breaks
will be mangled by my mail client, so bear with me if there are any
errors.
def fact(n):
'''Factorial'''
r = 1
for i in xrange(1, n + 1):
r *= i
return r
def nCr(n, r):
'''Number of combinations of r items from n things'''
return fact(n) / (fact(r) * fact(n - r))
def get_combinations(slots, tokens):
'''A generator yielding combinations from slots of size tokens'''
maxcombos = nCr(len(slots), tokens)
for index in xrange(maxcombos):
token_copy = tokens
combo = []
for val in xrange(1, len(slots) + 1):
if not token_copy:
break
thresh = nCr(len(slots) - val, token_copy - 1)
if index < thresh:
combo.append(slots[val-1])
token_copy -= 1
else:
index -= thresh
yield tuple(combo)
def powerset(s):
'''Returns the powerset of s'''
pset = set()
for num_tokens in xrange(1, len(s)):
for combo in get_combinations(s, num_tokens):
pset.add(combo)
# These two are special cases
pset.add(s)
pset.add(tuple())
return pset
if __name__ == '__main__':
print powerset((1, 2, 3, 4))
--
Evan Klitzke <ev...@yelp.com>
One more obvious thing that I omitted. You can make this a lot faster
by caching the values of nCr. This is a simple modification, that I'll
leave to the reader ;-)
--
Evan Klitzke <ev...@yelp.com>
My idea would be the following.
1) Turn your set into a list: lst
2) let lng be the number of elements.
3) let n range from 0 to 2 ** lng
4) now n represents subset as follows
consider n as a binary number
bit k is set in n <=> lst[k] is a member of the subset.
--
Antoon Pardon
That may help a little but my guess is the slowness comes from
the size (2**n) of the power set.
That's true if by "a little bit" you mean "a lot." Observe:
"""
def recursive_powerset(s):
if not s: yield set()
for x in s:
s2 = s - set([x])
for subset in recursive_powerset(s2):
yield subset
for subset in recursive_powerset(s2):
yield subset.union(set([x]))
def nonrecursive_powerset(s):
# Four lines of code hidden.
# I want the OP to figure this out for himself.
import time
print " N Recursive Non-Recursive"
print " - ------------- -------------"
for n in range(8):
t1 = time.time()
x = list(recursive_powerset(set(range(n))))
t2 = time.time()
x = list(nonrecursive_powerset(set(range(n))))
t3 = time.time()
print "%4d %12.6fs %12.6fs" % (n,t2-t1,t3-t2)
"""
Results:
N Recursive Non-Recursive
- ------------- -------------
0 0.000029s 0.000026s
1 0.000027s 0.000028s
2 0.000063s 0.000036s
3 0.000379s 0.000067s
4 0.004795s 0.000191s
5 0.045020s 0.001054s
6 0.633989s 0.013931s
7 14.881078s 0.212958s
It is correct that a power set algorithm can never be truly fast because the
run time is exponential by definition, but the non-recursive (iterative)
method is still a lot faster than the recursive method.
-Carsten
Pardon the soliloquy, but now that I'm a bit more awake, I realize that
this is unnecessarily slow due to the duplicate invocation of the
recursive call. Changing it thusly cuts the run time roughly in half:
def recursive_powerset(s):
if not s: yield set()
for x in s:
s2 = s - set([x])
for subset in recursive_powerset(s2):
yield subset
yield subset.union(set([x]))
However, this doesn't change the fact that the iterative method blows
the recursive method out of the water.
--
Carsten Haese
http://informixdb.sourceforge.net
def recursive_powerset(s):
def do_recursive(S):
if not S:
yield set([])
return
x=set(S.pop())
for t in do_recursive(S):
yield t
yield t|x
return do_recursive(s.copy())
Right you are. Note however that x=set(S.pop()) should be
x=set([S.pop()]) for this to run.
Forget everything I've said about timing comparisons, the correct
recursive implementation is actually faster than the iterative
implementation I was testing.
-Carsten
Here's a much simpler (and faster) solution I got from a coworker:
s = range(18)
result = []
l = len(s)
for i in range(2**l):
n = i
x = []
for j in range(l):
if n & 1:
x.append(s[j])
n >>= 1
result.append(x)
print result
--
Evan Klitzke <ev...@yelp.com>
def powerset(s):
d = dict(zip(
(1<<i for i in range(len(s))),
(set([e]) for e in s)
))
subset = set()
yield subset
for i in range(1, 1<<len(s)):
subset = subset ^ d[i & -i]
yield subset
>>> list(powerset('abc'))
[set([]), set(['a']), set(['a', 'b']), set(['b']), set(['c', 'b']),
set(['a', 'c', 'b']), set(['a', 'c']), set(['c'])]
If you're using the subsets as they appear and don't need to store
them all at once, then it's significantly faster (on my machine) if
you replace the line subset = subset ^ d[i & -i] with an in-place
update: subset ^= d[i & -i].
Mark