Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Re: Fast powerset function

316 views
Skip to first unread message

Carsten Haese

unread,
Jul 13, 2007, 1:41:02 AM7/13/07
to pytho...@python.org
On Thu, 12 Jul 2007 21:33:11 -0700, Arash Arfaee wrote
> I need a powerset generator function. It's really slow with recursion. Does
anybody have any idea or code(!!) to do it in an acceptable time?

1) Don't use recursion.

2) Google is your friend.

-Carsten

Evan Klitzke

unread,
Jul 13, 2007, 2:32:22 AM7/13/07
to Arash Arfaee, pytho...@python.org
On 7/12/07, Arash Arfaee <Ar...@ece.ucsb.edu> wrote:
> I need a powerset generator function. It's really slow with recursion. Does
> anybody have any idea or code(!!) to do it in an acceptable time?
> Thanks
> -Arash

I thought that this was a really interesting question, so I wrote up a
solution that doesn't use recursion. I didn't test it a whole lot, but
I'm pretty sure it works -- let me know if there are any oversights or
if you can make any improvements ;-) Also, not sure if the line breaks
will be mangled by my mail client, so bear with me if there are any
errors.

def fact(n):
'''Factorial'''
r = 1
for i in xrange(1, n + 1):
r *= i
return r

def nCr(n, r):
'''Number of combinations of r items from n things'''
return fact(n) / (fact(r) * fact(n - r))

def get_combinations(slots, tokens):
'''A generator yielding combinations from slots of size tokens'''
maxcombos = nCr(len(slots), tokens)
for index in xrange(maxcombos):
token_copy = tokens
combo = []
for val in xrange(1, len(slots) + 1):
if not token_copy:
break
thresh = nCr(len(slots) - val, token_copy - 1)
if index < thresh:
combo.append(slots[val-1])
token_copy -= 1
else:
index -= thresh
yield tuple(combo)

def powerset(s):
'''Returns the powerset of s'''
pset = set()
for num_tokens in xrange(1, len(s)):
for combo in get_combinations(s, num_tokens):
pset.add(combo)
# These two are special cases
pset.add(s)
pset.add(tuple())
return pset

if __name__ == '__main__':
print powerset((1, 2, 3, 4))


--
Evan Klitzke <ev...@yelp.com>

Evan Klitzke

unread,
Jul 13, 2007, 2:53:02 AM7/13/07
to pytho...@python.org

One more obvious thing that I omitted. You can make this a lot faster
by caching the values of nCr. This is a simple modification, that I'll
leave to the reader ;-)

--
Evan Klitzke <ev...@yelp.com>

Antoon Pardon

unread,
Jul 13, 2007, 5:16:27 AM7/13/07
to
On 7/12/07, Arash Arfaee <Ar...@ece.ucsb.edu> wrote:
> I need a powerset generator function. It's really slow with recursion. Does
> anybody have any idea or code(!!) to do it in an acceptable time?
> Thanks

My idea would be the following.

1) Turn your set into a list: lst

2) let lng be the number of elements.

3) let n range from 0 to 2 ** lng

4) now n represents subset as follows

consider n as a binary number
bit k is set in n <=> lst[k] is a member of the subset.

--
Antoon Pardon

Paul Rubin

unread,
Jul 13, 2007, 5:25:59 AM7/13/07
to
Antoon Pardon <apa...@forel.vub.ac.be> writes:
> On 7/12/07, Arash Arfaee <Ar...@ece.ucsb.edu> wrote:
> > I need a powerset generator function. It's really slow with recursion. Does
> > anybody have any idea or code(!!) to do it in an acceptable time?
> My idea would be the following. ...

> 3) let n range from 0 to 2 ** lng

That may help a little but my guess is the slowness comes from
the size (2**n) of the power set.

Carsten Haese

unread,
Jul 13, 2007, 8:15:38 AM7/13/07
to pytho...@python.org
On 13 Jul 2007 02:25:59 -0700, Paul Rubin wrote

That's true if by "a little bit" you mean "a lot." Observe:

"""
def recursive_powerset(s):
if not s: yield set()
for x in s:
s2 = s - set([x])
for subset in recursive_powerset(s2):
yield subset
for subset in recursive_powerset(s2):
yield subset.union(set([x]))

def nonrecursive_powerset(s):
# Four lines of code hidden.
# I want the OP to figure this out for himself.

import time

print " N Recursive Non-Recursive"
print " - ------------- -------------"
for n in range(8):
t1 = time.time()
x = list(recursive_powerset(set(range(n))))
t2 = time.time()
x = list(nonrecursive_powerset(set(range(n))))
t3 = time.time()
print "%4d %12.6fs %12.6fs" % (n,t2-t1,t3-t2)
"""

Results:

N Recursive Non-Recursive
- ------------- -------------
0 0.000029s 0.000026s
1 0.000027s 0.000028s
2 0.000063s 0.000036s
3 0.000379s 0.000067s
4 0.004795s 0.000191s
5 0.045020s 0.001054s
6 0.633989s 0.013931s
7 14.881078s 0.212958s

It is correct that a power set algorithm can never be truly fast because the
run time is exponential by definition, but the non-recursive (iterative)
method is still a lot faster than the recursive method.

-Carsten

Carsten Haese

unread,
Jul 13, 2007, 9:34:41 AM7/13/07
to pytho...@python.org
On Fri, 2007-07-13 at 08:15 -0400, I wrote:
> [...]

> def recursive_powerset(s):
> if not s: yield set()
> for x in s:
> s2 = s - set([x])
> for subset in recursive_powerset(s2):
> yield subset
> for subset in recursive_powerset(s2):
> yield subset.union(set([x]))
> [...]

Pardon the soliloquy, but now that I'm a bit more awake, I realize that
this is unnecessarily slow due to the duplicate invocation of the
recursive call. Changing it thusly cuts the run time roughly in half:

def recursive_powerset(s):
if not s: yield set()
for x in s:
s2 = s - set([x])
for subset in recursive_powerset(s2):
yield subset

yield subset.union(set([x]))

However, this doesn't change the fact that the iterative method blows
the recursive method out of the water.

--
Carsten Haese
http://informixdb.sourceforge.net


Jyotirmoy Bhattacharya

unread,
Jul 13, 2007, 1:38:24 PM7/13/07
to
On Jul 13, 6:34 pm, Carsten Haese <cars...@uniqsys.com> wrote:
> def recursive_powerset(s):
> if not s: yield set()
> for x in s:
> s2 = s - set([x])
> for subset in recursive_powerset(s2):
> yield subset
> yield subset.union(set([x]))
>
Your recursive_powerset is buggy--it generates too many sets. The loop
over the elements of s is unnecessary. Here is one alternative:

def recursive_powerset(s):
def do_recursive(S):
if not S:
yield set([])
return
x=set(S.pop())
for t in do_recursive(S):
yield t
yield t|x
return do_recursive(s.copy())

Carsten Haese

unread,
Jul 13, 2007, 1:56:51 PM7/13/07
to pytho...@python.org

Right you are. Note however that x=set(S.pop()) should be
x=set([S.pop()]) for this to run.

Forget everything I've said about timing comparisons, the correct
recursive implementation is actually faster than the iterative
implementation I was testing.

-Carsten


Evan Klitzke

unread,
Jul 13, 2007, 2:13:37 PM7/13/07
to Arash Arfaee, pytho...@python.org
On 7/12/07, Arash Arfaee <Ar...@ece.ucsb.edu> wrote:
> I need a powerset generator function. It's really slow with recursion. Does
> anybody have any idea or code(!!) to do it in an acceptable time?
> Thanks
> -Arash

Here's a much simpler (and faster) solution I got from a coworker:

s = range(18)
result = []

l = len(s)
for i in range(2**l):
n = i
x = []
for j in range(l):
if n & 1:
x.append(s[j])
n >>= 1
result.append(x)

print result


--
Evan Klitzke <ev...@yelp.com>

Mark Dickinson

unread,
Jul 14, 2007, 2:07:17 AM7/14/07
to
If you don't care about the order of the results, you can use a Gray
code (http://en.wikipedia.org/wiki/Gray_code): this has the advantage
of only adding or removing a single element to get from one subset to
the next.

def powerset(s):
d = dict(zip(
(1<<i for i in range(len(s))),
(set([e]) for e in s)
))

subset = set()
yield subset
for i in range(1, 1<<len(s)):
subset = subset ^ d[i & -i]
yield subset

>>> list(powerset('abc'))
[set([]), set(['a']), set(['a', 'b']), set(['b']), set(['c', 'b']),
set(['a', 'c', 'b']), set(['a', 'c']), set(['c'])]

If you're using the subsets as they appear and don't need to store
them all at once, then it's significantly faster (on my machine) if
you replace the line subset = subset ^ d[i & -i] with an in-place
update: subset ^= d[i & -i].

Mark


0 new messages