Google 網路論壇不再支援新的 Usenet 貼文或訂閱項目,但過往內容仍可供查看。

Re: Fast powerset function

瀏覽次數:316 次
跳到第一則未讀訊息

Carsten Haese

未讀,
2007年7月13日 凌晨1:41:022007/7/13
收件者:pytho...@python.org
On Thu, 12 Jul 2007 21:33:11 -0700, Arash Arfaee wrote
> I need a powerset generator function. It's really slow with recursion. Does
anybody have any idea or code(!!) to do it in an acceptable time?

1) Don't use recursion.

2) Google is your friend.

-Carsten

Evan Klitzke

未讀,
2007年7月13日 凌晨2:32:222007/7/13
收件者:Arash Arfaee、pytho...@python.org
On 7/12/07, Arash Arfaee <Ar...@ece.ucsb.edu> wrote:
> I need a powerset generator function. It's really slow with recursion. Does
> anybody have any idea or code(!!) to do it in an acceptable time?
> Thanks
> -Arash

I thought that this was a really interesting question, so I wrote up a
solution that doesn't use recursion. I didn't test it a whole lot, but
I'm pretty sure it works -- let me know if there are any oversights or
if you can make any improvements ;-) Also, not sure if the line breaks
will be mangled by my mail client, so bear with me if there are any
errors.

def fact(n):
'''Factorial'''
r = 1
for i in xrange(1, n + 1):
r *= i
return r

def nCr(n, r):
'''Number of combinations of r items from n things'''
return fact(n) / (fact(r) * fact(n - r))

def get_combinations(slots, tokens):
'''A generator yielding combinations from slots of size tokens'''
maxcombos = nCr(len(slots), tokens)
for index in xrange(maxcombos):
token_copy = tokens
combo = []
for val in xrange(1, len(slots) + 1):
if not token_copy:
break
thresh = nCr(len(slots) - val, token_copy - 1)
if index < thresh:
combo.append(slots[val-1])
token_copy -= 1
else:
index -= thresh
yield tuple(combo)

def powerset(s):
'''Returns the powerset of s'''
pset = set()
for num_tokens in xrange(1, len(s)):
for combo in get_combinations(s, num_tokens):
pset.add(combo)
# These two are special cases
pset.add(s)
pset.add(tuple())
return pset

if __name__ == '__main__':
print powerset((1, 2, 3, 4))


--
Evan Klitzke <ev...@yelp.com>

Evan Klitzke

未讀,
2007年7月13日 凌晨2:53:022007/7/13
收件者:pytho...@python.org

One more obvious thing that I omitted. You can make this a lot faster
by caching the values of nCr. This is a simple modification, that I'll
leave to the reader ;-)

--
Evan Klitzke <ev...@yelp.com>

Antoon Pardon

未讀,
2007年7月13日 清晨5:16:272007/7/13
收件者:
On 7/12/07, Arash Arfaee <Ar...@ece.ucsb.edu> wrote:
> I need a powerset generator function. It's really slow with recursion. Does
> anybody have any idea or code(!!) to do it in an acceptable time?
> Thanks

My idea would be the following.

1) Turn your set into a list: lst

2) let lng be the number of elements.

3) let n range from 0 to 2 ** lng

4) now n represents subset as follows

consider n as a binary number
bit k is set in n <=> lst[k] is a member of the subset.

--
Antoon Pardon

Paul Rubin

未讀,
2007年7月13日 清晨5:25:592007/7/13
收件者:
Antoon Pardon <apa...@forel.vub.ac.be> writes:
> On 7/12/07, Arash Arfaee <Ar...@ece.ucsb.edu> wrote:
> > I need a powerset generator function. It's really slow with recursion. Does
> > anybody have any idea or code(!!) to do it in an acceptable time?
> My idea would be the following. ...

> 3) let n range from 0 to 2 ** lng

That may help a little but my guess is the slowness comes from
the size (2**n) of the power set.

Carsten Haese

未讀,
2007年7月13日 上午8:15:382007/7/13
收件者:pytho...@python.org
On 13 Jul 2007 02:25:59 -0700, Paul Rubin wrote

That's true if by "a little bit" you mean "a lot." Observe:

"""
def recursive_powerset(s):
if not s: yield set()
for x in s:
s2 = s - set([x])
for subset in recursive_powerset(s2):
yield subset
for subset in recursive_powerset(s2):
yield subset.union(set([x]))

def nonrecursive_powerset(s):
# Four lines of code hidden.
# I want the OP to figure this out for himself.

import time

print " N Recursive Non-Recursive"
print " - ------------- -------------"
for n in range(8):
t1 = time.time()
x = list(recursive_powerset(set(range(n))))
t2 = time.time()
x = list(nonrecursive_powerset(set(range(n))))
t3 = time.time()
print "%4d %12.6fs %12.6fs" % (n,t2-t1,t3-t2)
"""

Results:

N Recursive Non-Recursive
- ------------- -------------
0 0.000029s 0.000026s
1 0.000027s 0.000028s
2 0.000063s 0.000036s
3 0.000379s 0.000067s
4 0.004795s 0.000191s
5 0.045020s 0.001054s
6 0.633989s 0.013931s
7 14.881078s 0.212958s

It is correct that a power set algorithm can never be truly fast because the
run time is exponential by definition, but the non-recursive (iterative)
method is still a lot faster than the recursive method.

-Carsten

Carsten Haese

未讀,
2007年7月13日 上午9:34:412007/7/13
收件者:pytho...@python.org
On Fri, 2007-07-13 at 08:15 -0400, I wrote:
> [...]

> def recursive_powerset(s):
> if not s: yield set()
> for x in s:
> s2 = s - set([x])
> for subset in recursive_powerset(s2):
> yield subset
> for subset in recursive_powerset(s2):
> yield subset.union(set([x]))
> [...]

Pardon the soliloquy, but now that I'm a bit more awake, I realize that
this is unnecessarily slow due to the duplicate invocation of the
recursive call. Changing it thusly cuts the run time roughly in half:

def recursive_powerset(s):
if not s: yield set()
for x in s:
s2 = s - set([x])
for subset in recursive_powerset(s2):
yield subset

yield subset.union(set([x]))

However, this doesn't change the fact that the iterative method blows
the recursive method out of the water.

--
Carsten Haese
http://informixdb.sourceforge.net


Jyotirmoy Bhattacharya

未讀,
2007年7月13日 下午1:38:242007/7/13
收件者:
On Jul 13, 6:34 pm, Carsten Haese <cars...@uniqsys.com> wrote:
> def recursive_powerset(s):
> if not s: yield set()
> for x in s:
> s2 = s - set([x])
> for subset in recursive_powerset(s2):
> yield subset
> yield subset.union(set([x]))
>
Your recursive_powerset is buggy--it generates too many sets. The loop
over the elements of s is unnecessary. Here is one alternative:

def recursive_powerset(s):
def do_recursive(S):
if not S:
yield set([])
return
x=set(S.pop())
for t in do_recursive(S):
yield t
yield t|x
return do_recursive(s.copy())

Carsten Haese

未讀,
2007年7月13日 下午1:56:512007/7/13
收件者:pytho...@python.org

Right you are. Note however that x=set(S.pop()) should be
x=set([S.pop()]) for this to run.

Forget everything I've said about timing comparisons, the correct
recursive implementation is actually faster than the iterative
implementation I was testing.

-Carsten


Evan Klitzke

未讀,
2007年7月13日 下午2:13:372007/7/13
收件者:Arash Arfaee、pytho...@python.org
On 7/12/07, Arash Arfaee <Ar...@ece.ucsb.edu> wrote:
> I need a powerset generator function. It's really slow with recursion. Does
> anybody have any idea or code(!!) to do it in an acceptable time?
> Thanks
> -Arash

Here's a much simpler (and faster) solution I got from a coworker:

s = range(18)
result = []

l = len(s)
for i in range(2**l):
n = i
x = []
for j in range(l):
if n & 1:
x.append(s[j])
n >>= 1
result.append(x)

print result


--
Evan Klitzke <ev...@yelp.com>

Mark Dickinson

未讀,
2007年7月14日 凌晨2:07:172007/7/14
收件者:
If you don't care about the order of the results, you can use a Gray
code (http://en.wikipedia.org/wiki/Gray_code): this has the advantage
of only adding or removing a single element to get from one subset to
the next.

def powerset(s):
d = dict(zip(
(1<<i for i in range(len(s))),
(set([e]) for e in s)
))

subset = set()
yield subset
for i in range(1, 1<<len(s)):
subset = subset ^ d[i & -i]
yield subset

>>> list(powerset('abc'))
[set([]), set(['a']), set(['a', 'b']), set(['b']), set(['c', 'b']),
set(['a', 'c', 'b']), set(['a', 'c']), set(['c'])]

If you're using the subsets as they appear and don't need to store
them all at once, then it's significantly faster (on my machine) if
you replace the line subset = subset ^ d[i & -i] with an in-place
update: subset ^= d[i & -i].

Mark


0 則新訊息