
The FASTEST subsets function [Was: Subsets of a list]


ol...@pobox.com

Jan 12, 2002, 3:56:01 AM
This article will show the design of the fastest ever interpreted
subsets function. Sorry I'm too excited about this. Again, we start
with the mathematical definition of the problem, which leads to a
simple, correct and stunningly efficient solution. The final, fastest
ever solution is still pure functional.

We start with the definition of the problem: given a set 'l' and a
number 'n', return the set of all subsets of 'l' of cardinality
'n'. Sets are represented by lists.

The key was to choose the right definition.

Let ps(L) be the powerset of L without the empty set as an element
(that is, ps(L) = P(L) - {{}}, where {{}} is the singleton set whose
only element is the empty set).
length(ps(L)) = (- (expt 2 (length L)) 1)

Let L be a non-empty set and let L = A U B where A and B are two
disjoint subsets of L.

Obviously,
ps(L) = ps(A U B) =
ps(A) U ps(B) U { y U x | x <- ps(A), y <- ps(B) }
Let (subsets L n) = (filter (lambda (el) (= n (length el))) ps(L) )

Thus the desired function 'subsets' is a filtered powerset ps. This
seems to be a stupid definition. Let's not be hasty however. Note that
filter and union commute: the filter of a union of sets is the union
of filtered sets. Therefore, from the previous expression

(subsets L n) = (subsets (union A B) n)
              = (subsets A n) U (subsets B n)
                U Union{ y U x | x <- (subsets A k), y <- (subsets B (- n k)),
                         k = 1 .. n-1 }

Well, this is it. Note, we didn't say how to split L into two disjoint
subsets A and B. We can do as we wish. For example, we can choose to
split in such a way that (length B) is n. In the most difficult
case, where n = (/ (length L) 2), this corresponds to a "divide and
conquer" strategy, so to speak (at least in the first stages).
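To make the identity concrete: with L = (a b c d) and n = 2, take
B = (c d) and A = (a b). Then (subsets A 2) contributes (a b),
(subsets B 2) contributes (c d), and the k = 1 cross terms pair each
element of A with each element of B, giving the remaining four
subsets -- C(4,2) = 6 in all.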

The Scheme code below implements this idea verbatim. It uses the
accumulator-passing style that was expounded earlier.

(define (subsets-v5 l n)

  ; The initialization function. Check the boundary conditions
  (define (loop l ln n accum)
    (cond
      ((<= n 0) (cons '() accum))
      ((< ln n) accum)
      ((= ln n) (cons l accum))
      ((= n 1)
       (let fold ((l l) (accum accum))
         (if (null? l) accum
             (fold (cdr l) (cons (cons (car l) '()) accum)))))
      (else
       (split l ln n accum))))

  ; split l in two parts a and b so that (length b) is n
  ; Invariant: (equal? (append a b) l)
  ; ln is the length of l
  (define (split l ln n accum)
    (let loop ((a '()) (b l) (lna 0) (lnb ln))
      (if (= lnb n) (cont a lna b lnb n accum)
          (loop (cons (car b) a) (cdr b) (+ 1 lna) (- lnb 1)))))

  ; The main body of the algorithm
  (define (cont a lna b lnb n accum)
    (let* ((accum
            (loop a lna n accum))
           (accum ; this is actually (loop b lnb n accum)
            (cons b accum)))
      (let inner ((k 1) (accum accum))
        (if (> k (min lna (- n 1))) ; don't loop past meaningful boundaries
            accum
            (let ((as (loop a lna k '()))
                  (bs (loop b lnb (- n k) '())))
              (inner (+ 1 k)
                     ; compute the cross-product of as and bs
                     ; and append it to accum
                     (let fold ((bs bs) (accum accum))
                       (if (null? bs) accum
                           (fold (cdr bs)
                                 (append
                                  (map (lambda (lst) (append lst (car bs))) as)
                                  accum))))))))))

  (loop l (length l) n '()))
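
A small call, just to show the interface (assuming the definition
above has been loaded; the order of the result is whatever the
algorithm happens to produce):

(subsets-v5 '(a b c d) 2)
  ; => the six two-element subsets of (a b c d), each as a list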


The benchmark runs in 993 ms of user time and allocates only 36.5 MB
of memory, on Gambit-C interpreter. This is the absolute, incredible
record. Under SCM:

subsets-v3 (called combos by John David Stone)
;Evaluation took 1596 mSec (98 in gc) 657662 cells work, 4721364 env, 97 bytes other

subsets-v5:
;Evaluation took 700 mSec (322 in gc) 1708112 cells work, 610264 env, 105 bytes
other
That is, more than twice as fast.

Continuing the table from the previous post:

Procedure Gambit-C Bigloo 2.4b Bigloo 2.4b
interpreter, s interpreter, s compiler, s
subsets-v0 285.0 11.59 5.62 3.14
subsets-v1 6.3 5.45 2.22 0.34
subsets-v3 8.1 4.78 0.96 0.27
subsets-v20 14.1 5.53 0.96 0.26
subsets-v21 7.7 4.88 0.66 0.26
subsets-v22 5.0 3.18 0.62 0.25
subsets-v23 4.1 2.86 0.82 0.25
subsets-v5 0.9 1.56 1.10 0.76

Well, compiled code isn't that fast -- but that's because subsets-v5
isn't optimal yet. The append and map ought to be converted into the
accumulator-passing style; that will reduce the amount of garbage as
well. But it's past midnight.
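
For what it's worth, here is a sketch of what that conversion might
look like (my guess at it, not Oleg's follow-up): the inner
(append (map ...) accum) in cont becomes a second accumulator loop,
so no intermediate list is ever built.

(define (cross-onto as bs accum)
  ; for every b in bs and a in as, cons (append a b) onto accum
  (let outer ((bs bs) (accum accum))
    (if (null? bs) accum
        (outer (cdr bs)
               (let inner ((as as) (accum accum))
                 (if (null? as) accum
                     (inner (cdr as)
                            (cons (append (car as) (car bs)) accum))))))))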

The conclusion of this Friday night exercise is astonishingly trite.
What a Math teacher told us is true: we have to attack the algorithm
if we want really big improvements. And Math rules!

Ji-Yong D. Chung

Jan 12, 2002, 2:43:41 PM
I did not google on this topic -- it is probably
too esoteric.

But what _is_ subsets function? What does it
compute, and what is it used for?

I would be grateful for additional comments.

<ol...@pobox.com> wrote in message
news:7eb8ac3e.02011...@posting.google.com...

John David Stone

Jan 12, 2002, 4:18:16 PM
ol...@pobox.com (ol...@pobox.com) writes:

> The benchmark runs in 993 ms of user time and allocates only 36.5 MB
> of memory, on Gambit-C interpreter. This is the absolute, incredible
> record. Under SCM:
>
> subsets-v3 (called combos by John David Stone)
> ;Evaluation took 1596 mSec (98 in gc) 657662 cells work, 4721364 env, 97 bytes other
>
> subsets-v5:
> ;Evaluation took 700 mSec (322 in gc) 1708112 cells work, 610264 env, 105 bytes
> other
> That is, more than twice as fast.

Hmm. In my testing environment:

> (time (begin (subsets-v3 (list 1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20) 10) (values)))
(time (begin (subsets-v3 (...) ...) ...))
4 collections
200 ms elapsed cpu time, including 110 ms collecting
195 ms elapsed real time, including 106 ms collecting
4300568 bytes allocated, including 1818144 bytes reclaimed

> (time (begin (subsets-v5 (list 1 2 3 4 5 6 7 8 9 10
11 12 13 14 15 16 17 18 19 20) 10) (values)))
(time (begin (subsets-v5 (...) ...) ...))
12 collections
400 ms elapsed cpu time, including 320 ms collecting
456 ms elapsed real time, including 365 ms collecting
13524544 bytes allocated, including 4595512 bytes reclaimed

That is: Apart from garbage collection, the two procedures are
very closely comparable -- but SUBSETS-V3 creates only a third as much
garbage.

My guess is that, in my testing environment, SUBSETS-V3 is getting
a bigger advantage from Chez Scheme's optimizations. The results I get under
MzScheme are similar to Oleg's:

> (time (begin (subsets-v3 (list 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20) 10) (values)))
cpu time: 1530 real time: 1532 gc time: 580
> (time (begin (subsets-v5 (list 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20) 10) (values)))
cpu time: 640 real time: 635 gc time: 330

John David Stone - Lecturer in Computer Science and Philosophy
Manager of the Mathematics Local-Area Network
Grinnell College - Grinnell, Iowa 50112 - USA
st...@cs.grinnell.edu - http://www.cs.grinnell.edu/~stone/

Bradley J Lucier

Jan 12, 2002, 6:15:07 PM
In article <7eb8ac3e.02011...@posting.google.com>,

ol...@pobox.com <ol...@pobox.com> wrote:
>The conclusion of this Friday night exercise is astonishingly trite.
>What a Math teacher told us is true: we have to attack the algorithm
>if we want to really big improvements. And Math rules!

John Rice at Purdue likes to say that in the past forty years algorithmic
improvements (FFT, multigrid, fast multipole methods, ...) have accounted
for more of the speedup in scientific computing than improvements in
hardware.

And he's right.

Brad Lucier

Jeffrey M. Vinocur

Jan 14, 2002, 11:55:22 PM
In article <a1q390$3bg$1...@bob.news.rcn.net>,

Ji-Yong D. Chung <virtua...@erols.com> wrote:
> I did not google on ths topic -- it is probably
>too esoteric.
>
> But what _is_ subsets function? What does it
>compute, and what is it used for?

It was a question someone asked in this group last week.


--
Jeffrey M. Vinocur * jm...@cornell.edu
http://www.people.cornell.edu/pages/jmv16/

Eli Barzilay

Jan 15, 2002, 5:34:13 AM
ol...@pobox.com (ol...@pobox.com) writes:
> This article will show the design of the fastest ever interpreted
> subsets function. Sorry I'm too excited about this. Again, we start
> with the mathematical definition of the problem, which leads to a
> simple, correct and stunningly efficient solution. The final,
> fastest ever solution is still pure functional.
> [...]

I had a little scripting fun with measuring the times for the
different programs, and it looks like the memoized simple version
still beats everything else. subsets-v23 has the best GC time, and
subsets-v5 has a slight advantage in its (GC - CPU) time for some
inputs. All this is on MzScheme 103, so other systems might look a
little different...

http://sardine.cs.columbia.edu:8080/subsets/

All graphs are timings of different lengths out of a set of 22
elements; msubset is the memoization of Oleg's first version, and
msubset1 is even simpler, removing the special case for len=1. The
code is at

http://sardine.cs.columbia.edu:8080/subsets/subsets.ss

but this is just for reference -- it contains all the code pieces plus
some junk that makes gnuplot do all the graphs.

--
((lambda (x) (x x)) (lambda (x) (x x))) Eli Barzilay:
http://www.barzilay.org/ Maze is Life!

Doug Quale

Jan 16, 2002, 4:10:33 AM
The subsets implementations posted so far have been interesting. I
would like to throw a lazy stream implementation into the ring. I
haven't benchmarked it extensively but it seems competitive with the
code suggested so far (much faster than the slow implementations but
not quite as fast as the fastest).

Obviously a lazy functional language like Haskell shows this
implementation in its best light. As Oleg posted, we can define the
subsets function mathematically as

subsets(S, 0) = {{}}
subsets({}, n) = {}
subsets({a} U S, n) = {{a}} x subsets(S, n-1) U subsets(S, n)

This translates directly into Scheme or Haskell. Scheme code has
already been posted. In Haskell it looks like this:

subsets _ 0 = [[]]
subsets [] _ = []
subsets (x:xs) n = map (x :) (subsets xs (n - 1)) ++ (subsets xs n)

This is slow, partly because it conses too much but mostly because the
same subproblems are solved repeatedly. Memoization can be used to
attack this performance problem. The lazy streams approach will give
many of the benefits of memoization while avoiding most of the
disadvantages.

Memoization pros

- can be applied automatically without changing the naive
implementation

- memoized results can be saved across top level invocations of the
function

Memoization cons

- not always faster

- doesn't work well if function arguments are complex (the required
hash compare might be too slow)

- usually implemented using imperative machinery under the hood
(although the resulting memoized function has functional behavior)

- can lead to horrendous space leaks in the memo tables; recovering
this space might require manual control partly negating the first
pro listed above
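
To make the comparison concrete, here is a minimal sketch of what
memoizing the naive Scheme version might look like (my illustration,
not the msubset code from Eli's page; an assoc list stands in for a
real hash table, and the equal? comparison on keys is exactly the
kind of cost mentioned above):

(define (make-memoized-subsets)
  (let ((table '()))                          ; ((key . result) ...)
    (define (subsets l n)
      (cond ((= n 0) '(()))
            ((null? l) '())
            (else
             (let* ((key (cons l n))
                    (hit (assoc key table)))
               (if hit
                   (cdr hit)
                   (let ((result
                          (append (map (lambda (s) (cons (car l) s))
                                       (subsets (cdr l) (- n 1)))
                                  (subsets (cdr l) n))))
                     (set! table (cons (cons key result) table))
                     result))))))
    subsets))

; ((make-memoized-subsets) '(1 2 3 4) 2)  => the six 2-element subsets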

In a lazy stream formulation, the nth element of the lazy stream will
be a list of the subsets of cardinality n. In other words, the stream
will be [subsets(S,0), subsets(S,1), subsets(S,2), ...]. The first
stream element will always be the list containing the empty set. For
all n greater than the length of the input list the element will be
null. This is very smooth in Haskell:

subsets s n = (subsets_stream s) !! n

subsets_stream [] = [[]] : repeat []
subsets_stream (x:xs) =
  let r = subsets_stream xs
      s = map (map (x:)) r
  in [[]] : zipWith (++) s (tail r)

As a small bonus we can easily use subsets_stream to implement the
powerset function:

powerset s = concat (takeWhile (not . null) (subsets_stream s))

Unfortunately this isn't as pretty in Scheme. Comparing the Scheme
implementation below to Haskell, Scheme is far more verbose: we
have to explicitly request laziness, we have to provide lazy
equivalents of standard library functions (CONS, CAR, CDR, LIST-REF
and MAP, although these could be put in an external library), and
finally currying is more convenient to use in Haskell than it is in
Scheme. (Often Haskell's pattern matching would be an advantage as
well, but here it plays no important part.) One tricky part about
explicitly requesting laziness is that it's easy to get wrong. Look
at the mix of uses of STREAM-MAP vs MAP and STREAM-CONS vs CONS in
SUBSETS-STREAM below.

Since most Scheme implementations don't natively support R5RS macros,
I have simply replaced all uses of STREAM-CONS with
(CONS (DELAY a) (DELAY b)). Notice that I use fully lazy streams.
The streams in SICP have lazy tails but eager heads.

; Most Scheme implementations don't have R5RS macros, alas.
;(define-syntax stream-cons
; (syntax-rules ()
; ((stream-cons hd tl) (cons (delay hd) (delay tl)))))

(define (subsets s n)
  (stream-ref (subsets-stream s) n))

;; Return a lazy stream of all subsets of list XS. The nth element of
;; the stream is a list of the subsets of cardinality n (counting from 0).
(define (subsets-stream xs)
  (if (null? xs)
      (cons (delay (list '())) (delay (stream-repeat '())))
      (let* ((r (subsets-stream (cdr xs)))
             (s (stream-map (lambda (ss)
                              (map (lambda (y) (cons (car xs) y)) ss))
                            r)))
        (cons (delay (list '()))
              (delay (stream-map append s (stream-cdr r)))))))

;; The remainder should be part of a lazy streams library.
;; These are the lazy list equivalents of CAR, CDR, LIST-REF and MAP.
;; STREAM-REPEAT is inspired by Haskell's repeat function.

(define (stream-car s)
  (force (car s)))

(define (stream-cdr s)
  (force (cdr s)))

;; Create an infinite stream of X's.
(define (stream-repeat x)
  (cons (delay x) (delay (stream-repeat x))))

(define (stream-ref s n)
  (if (= n 0)
      (stream-car s)
      (stream-ref (stream-cdr s) (- n 1))))

;; Answer to an easy exercise in SICP. Most SICP exercises are harder ;-)
(define (stream-map f . streams)
  (if (null? (car streams))
      '()
      (cons (delay (apply f (map stream-car streams)))
            (delay (apply stream-map
                          (cons f (map stream-cdr streams)))))))
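
A small example of the interface, assuming the definitions above:

; (stream-car (subsets-stream '(a b c d)))  => (())   ; cardinality 0
; (subsets '(a b c d) 2)  => the six two-element subsets of (a b c d)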

--
Doug Quale

Eli Barzilay

Jan 16, 2002, 6:05:48 AM
Doug Quale <qua...@charter.net> writes:

> [...] The lazy streams approach will give many of the benefits of
> memoization while avoiding most of the disadvantages.
>
> Memoization pros
>
> - can be applied automatically without changing the naive
> implementation
>
> - memoized results can be saved across top level invocations of the
> function

Well, your implementation fails *both* of these; the second one is not
important, but I consider the first one very important.


> Memoization cons
>
> - not always faster

Of course you should know when to use it, but when you *do*, it's
better than changing the function with an explicit matrix or whatever.


> - doesn't work well if function arguments are complex (the required
> hash compare might be too slow)

Same as above -- the basic assumption is that an argument hash lookup
is much faster than the computation for that argument, so you should
know when to use it. (Also, know what hash to use, the subsets was
particularly easy since an eq? test is enough.)


> - usually implemented using imperative machinery under the hood
> (although the resulting memoized function has functional
> behavior)

Ah, but that's not a bad thing -- it's a good thing that you can throw
efficiency machinery on the underlying implementation and keep your
code functional -- this is exactly the advantage of memoization over
standard dynamic programming techniques. (The implementation of
Haskell is imperative too...)


> - can lead to horrendous space leaks in the memo tables; recovering
> this space might require manual control partly negating the first
> pro listed above

Either you know what you're doing (for example, have a wrapper that
uses a fresh table for every top level call), or you can simply use
weak pointers.
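
A sketch of the first option (my illustration; MAKE-MEMOIZED stands
for any hypothetical constructor that returns a freshly memoized copy
of the function, memo table included):

(define (per-call-memoized make-memoized)
  ; build a brand new memoized closure for each top-level call, so its
  ; memo table becomes garbage as soon as the call returns
  (lambda args
    (apply (make-memoized) args)))

; e.g. (define subsets (per-call-memoized make-memoized-subsets)),
; where make-memoized-subsets is any such memoizing constructor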


> In a lazy stream formulation, the nth element of the lazy stream
> will be a list of the subsets of cardinality n. [...]

That's one formulation, but it points to one thing which is nice when
you're lazy -- you never mind generalizing your code to do more than
you really need, since redundant computations will not be done...
(My standard example is that instead of finding *a* solution for
the n-queens problem, you find all of them and pull out the first.)


> As a small bonus We can easily use subsets_stream to implement the
> powerset function:
>
> powerset s = concat (takeWhile (not . null) (subsets s))

The funny thing is that this powerset is much simpler in any case...
(Probably one line in Haskell)


> (define (subsets s n)
> (stream-ref (subsets-stream s) n))

> [...]

I've added this to my thing, called `ssubsets' (the last one).

http://sardine.cs.columbia.edu:8080/subsets/

Doug Quale

Jan 16, 2002, 4:53:42 PM
Eli Barzilay <e...@barzilay.org> writes:

> Doug Quale <qua...@charter.net> writes:
>
> > [...] The lazy streams approach will give many the benefits of
> > memoization while avoiding most of the disadvantages.
> >
> > Memoization pros
> >
> > - can be applied automatically without changing the naive
> > implementation
> >
> > - memoized results can be saved across top level invocations of the
> > function
>
> Well, your implementation fails *both* of these, the second one is not
> important, but I consider the first one very important.

I don't think of lazy streams as requiring changes to the naive
implementation. I think lazy is the naive implementation if you look
at it the right way -- it follows directly from the mathematical
definition of the problem. It does require a different perspective,
so maybe it isn't as naive as I would like to think.

Saving results across top level invocations is at odds with the space
leak problem I mention later. For a restricted set of arguments for
this specific problem, you can encapsulate the answers for all of the
integer subsets in one place. This is a bit ugly in Scheme, and it
also changes the arguments to the subsets function. (Instead of
passing a list and a number we need two numbers.)

(define (ints-from n) (cons (delay n) (delay (ints-from (+ n 1)))))

(define (int-subsets-stream-from n)
  (cons (delay (subsets-stream (ints-from 1)))
        (delay (int-subsets-stream-from (+ n 1)))))

;; Return a list of all subsets of the integers {1, ..., N} of
;; cardinality R.
(define subsets
  (let ((all-subsets (cons (delay '(()))
                           (delay (int-subsets-stream-from 1)))))
    (lambda (n r)
      (stream-ref (stream-ref all-subsets n) r))))

Space leaks are a problem here and would require manual intervention.
(No better than memoization on this point.)

Ignoring the space issues, this is a one-liner in Haskell:

int_subsets = [subsets_stream [1 .. n] | n <- [0 ..]]
subsets n r = (int_subsets !! n) !! r

I didn't mention the performance benefits of memoization from avoiding
repeated computation of the same subproblems since it is an obvious
feature. I was thinking of it when I wrote "many", but perhaps many
wasn't the right word when I meant just two out of three.

> > Memoization cons
> >
> > - not always faster
>
> Of course you should know when to use it, but when you *do*, it's
> better than changing the function with an explicit matrix or whatever.

I agree. One thing to watch out for is that it isn't always clear
whether memoization will help. It is possible to have a single
function that when invoked with certain arguments will benefit from
memoization and with others will not -- how do you choose when to
memoize?

Of course laziness can certainly hurt performance as well. In Haskell
this problem shows up in the other direction -- you have to decide
when you want to be strict instead of lazy. My guess is that this is
normally harder than deciding when memoization will help.

> > - doesn't work well if function arguments are complex (the required
> > hash compare might be too slow)
>
> Same as above -- the basic assumption is that an argument hash lookup
> is much faster than the computation for that argument, so you should
> know when to use it. (Also, know what hash to use, the subsets was
> particularly easy since an eq? test is enough.)

Certainly. This basic assumption for memoization means that lazy
streams will work some times when memoization is not appropriate. The
lazy implementation doesn't need to compare function arguments.

> > - usually implemented using imperative machinery under the hood
> > (although the resulting memoized function has functional
> > behavior)
>
> Ah, but that's not a bad thing -- it's a good thing that you can throw
> efficiency machinery on the underlying implementation and keep your
> code functional -- this is exactly the advantage of memoization over
> standard dynamic programming techniques. (The implementation of
> Haskell is imperative too...)

I agree that mutation isn't always a bad thing, but I think sometimes
using it can be a bad thing. In particular, mutation has performance
implications for some types of generational garbage collection. With
this noted, I agree with you and I like to have mutation available.

I agree completely on the advantage of memoization over dynamic
programming. The memoized function treated as a black box is a
functional unit, and standard dynamic programming techniques can't be
treated that way. The dynamic programming techniques are normally a
lot harder to code correctly too, so bugs are much more likely.

Does anyone have a good dynamic programming implementation of the
classic count_change problem? I looked at this, but it's hard to get
the dynamic programming implementation correct. This problem memoizes
very nicely, and there is an elegant lazy streams implementation.
"Concrete Mathematics" gives the solution in closed form, but the
closed form looks to be about as hard to compute as (or harder than!)
just using efficient ways to enumerate the combinations.
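
For illustration, a sketch of the memoized (table-driven) version in
Scheme -- my own sketch with the usual SICP denominations, not a
posted solution, and not the bottom-up dynamic programming
formulation asked about:

(define (count-change amount)
  (define coins (vector 1 5 10 25 50))
  (define table (make-vector (* (+ amount 1) 6) #f))   ; (amount+1) x 6 cells
  (define (idx a k) (+ (* a 6) k))
  ; cc = number of ways to make amount a from the first k coin kinds
  (define (cc a k)
    (cond ((= a 0) 1)
          ((or (< a 0) (= k 0)) 0)
          ((vector-ref table (idx a k)))      ; cached result, if any
          (else
           (let ((v (+ (cc a (- k 1))
                       (cc (- a (vector-ref coins (- k 1))) k))))
             (vector-set! table (idx a k) v)
             v))))
  (cc amount 5))

; (count-change 100)  => 292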

> > - can lead to horrendous space leaks in the memo tables; recovering
> > this space might require manual control partly negating the first
> > pro listed above
>
> Either you know what you're doing (for example, have a wrapper that
> uses a fresh table for every top level call), or you can simply use
> weak pointers.

If you use the wrapper, you lose advantage 2 (automatic saving of
results across top level calls). That's an advantage that laziness
doesn't necessarily automatically confer anyway, and I agree with your
earlier assessment that it often isn't very important. More
troublesome is the fact that even using wrappers can leave space leaks
in the memo tables. (More complex wrappers can be used to mitigate
both these issues. The memo table can be treated as an LRU cache that
is never permitted to grow larger than a certain size, for instance.
On the down side, this complicates the implementation and adds the
potential for poor performance. All the same, this sort of caching is
essential for reasonable performance in many programs. One example
out of thousands is the caching of icons in a GUI file manager.)

Weak pointers are an excellent solution with one caveat: they aren't a
standard part of Scheme so you can't write your code in R5RS. Weak
pointers need gc support so they aren't available in all Scheme
implementations. (But I think they should be. Hash consing is a
specific application of memoization that is very useful but really
requires weak pointers to be practical.) Of course lazy evaluation is
subject to space leaks also, and I think those leaks are usually
harder to track down, understand and fix than memo table leaks.

I like memoization and I didn't intend to try to bury it. (I should
have simply provided the lazy streams code and avoided
editorializing!) I only wanted to compare memoization to laziness
since both techniques are often useful in the same problem domains.
When both techniques apply to a problem, I think memoization is
generally easier and more natural to use in strict languages like
Scheme and SML. Lazy streams are shown to better advantage in lazy
languages like Haskell. (Of course I first saw lazy streams in Peter
Henderson's book which uses them in a lisp environment. At that time
Haskell didn't yet exist and I was unfamiliar with KRC and Miranda
etc. Actually I must have seen Burge's book "Recursive Programming
Techniques" around that time, but I didn't understand it!)

> That's one formulation, but it points to one thing which is nice when
> you're lazy -- you never care to generalize your code making it do
> more than you really need since redundant computations will not be
> done... (My standard example is instead of finding *a* solution for
> the n-queen problem, you find all of them and pull out the first.)

Right. It's pretty cute in Haskell. Since Haskell is fully lazy, you
can try something like

subsets [1 .. 1000] 10

in an interactive Haskell system like Hugs and it will start printing
solutions immediately. I don't think the computation would complete
before the universe ends (there are 1000!/((1000 - 10)! x 10!)
different subsets in the result list), but you can immediately start
processing as many of them as you have time for. (The Scheme
implementation could be made to do this too. It would require using a
special writer to display the resulting lazy list, forcing each
element just before printing it. Fully lazy languages like Haskell do
this automatically.)
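
Such a writer is easy to sketch for the stream representation used
above (my illustration; it assumes the thing being printed is itself
a lazy stream terminated by '()):

(define (stream-write s)
  (if (null? s)
      (newline)
      (begin
        (write (stream-car s))    ; force and print just this element
        (display " ")
        (stream-write (stream-cdr s)))))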


--
Doug Quale

Eli Barzilay

Jan 16, 2002, 6:47:38 PM
Doug Quale <qua...@charter.net> writes:

> I don't think of lazy streams as requiring changes to the naive
> implementation. I think lazy is the naive implementation if you
> look at it the right way -- it follows directly from the
> mathematical definition of the problem. It does require a different
> perspective, so maybe it isn't as naive as I would like to think.

Yeah, well, I don't think there is any way to know for sure whether it
is only you who takes it as the simpler way, or me who is too used to
eager computations. It's just one of those things that you either get
a feeling for or you don't, so I guess some people would consider it a
better approach. (Probably the other example of a completely different
mode of thinking is Prolog...)


> [...] Space leaks are a problem here and would require manual
> intervention. (No better than memoization on this point.)

But you could have it clear the hash table on every call. The point
is that you should know what decisions to make, and then have some
library to handle them (so you'll get different memoization
functions), so your function is always the same. But even a direct
hash table allows you to keep your function almost unmodified.


> I agree. One thing to watch out for is that it isn't always clear
> whether memoization will help. It is possible to have a single
> function that when invoked with certain arguments will benefit from
> memoization and with others will not -- how do you choose when to
> memoize?

So we're definitely saying the same thing here -- you have to know
what you're doing. Initially, I thought that something like laziness
in Haskell would take care of these things automatically too, but I
never got too deep into it, while I do frequently catch a thread on
c.l.functional from some poor soul trying to figure out why some tiny
piece of code ate away every free byte around... The conclusion that
could be made here is that you can use laziness too, if you *know*
how to use it. In any case, my opinion is that memoization is better
than hand-optimizing solutions, and I can accept that if you see lazy
evaluation as natural then that's another way to go. Of course there
are always cases where there's nothing left but sitting and thinking
about how to optimize things manually; this one just turned out to be
a nice proof of concept...


> Certainly. This basic assumption for memoization means that lazy
> streams will work some times when memoization is not appropriate.
> The lazy implementation doesn't need to compare function arguments.

Can you think of a simple example? (That would be useful for future
teaching...)


> I agree completely on the advantage of memoization over dynamic
> programming. The memoized function treated as a black box is a
> functional unit, and standard dynamic programming techniques can't
> be treated that way. The dynamic programming techniques are
> normally a lot harder to code correctly too, so bugs are much more
> likely.

Heh, that's a good enough description to save for future arguments.
(Then again, the problem with people who do only theory is that they
consider all this as just low-level software engineering, throwing it
to the I-don't-care-about-it-as-long-as-it-works zone. Probably the
same reason we're still stuck with Turing Machines for teaching CS
foundations...)


[I'll continue very briefly due to time constraints and avoiding "me
too"s.]

> Weak pointers are an excellent solution with one caveat: they aren't
> a standard part of Scheme so you can't write your code in R5RS.

Which is why I prefer the newer approach of Scheme being a general
family and committing to a good implementation...


> I like memoization and I didn't intend to try to bury it.

I didn't think you did... The nice thing that came out is that it
was also pretty close (the difference can just as well be the extra
function call overhead). I wonder if this is related to the fact that
a lazy value is computed only once so you also get memoization... Let
me see (replacing delays and forces by thunks), yeah -- looks like
it's taking exponential time (couldn't even get more than 8/22), so no
wonder they're the same. Now I'm even more curious if there's a
problem where laziness is better than memoization!

> Fully lazy languages like Haskell do this automatically.)

Well, R5RS allows some flexibility:

| * Some implementations may implement "implicit forcing," where the
| value of a promise is forced by primitive procedures like cdr and +:
| (+ (delay (* 3 7)) 13) ==> 34

I wonder if it is possible to hack something up to make it do so
without a full evaluator implementation...
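
One rough guess at such a hack (nothing more than a sketch): shadow a
few primitives so they force promise arguments first. PROMISE? is a
hypothetical predicate -- it isn't in R5RS, though several
implementations provide something like it -- and only the wrapped
procedures get the implicit-forcing behaviour.

(define (force-if-promise x)
  (if (promise? x) (force x) x))   ; PROMISE? is implementation-specific

(define + (let ((plain+ +))
            (lambda args (apply plain+ (map force-if-promise args)))))
(define car (let ((plain-car car))
              (lambda (p) (force-if-promise (plain-car p)))))
(define cdr (let ((plain-cdr cdr))
              (lambda (p) (force-if-promise (plain-cdr p)))))

; with the wrapped +, the R5RS example gives (+ (delay (* 3 7)) 13) => 34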

Bradley J Lucier

Jan 17, 2002, 1:35:43 AM
BTW, there are 184756 subsets of size 10 in a length-20 list,
so in Gambit, which takes 12 bytes per cons cell, this takes

> (* 184756 (+ 12 (* 10 12)))
24387792

bytes. Interpreted, the computation takes 38298824 bytes,
compiled, it takes 31401552 bytes using subsets-v5. That's
not so bad.

Brad

Eli Barzilay

Jan 17, 2002, 1:42:44 AM

But what about lists that share parts (which most of it is)?

Eli Barzilay

Jan 17, 2002, 3:48:53 AM
b...@cs.purdue.edu (Bradley J Lucier) writes:

> BTW, there are 184756 subsets of size 10 in a length-20 list set,,
> so in Gambit, which takes 12 bytes per cons cell, this takes
>
> > (* 184756 (+ 12 (* 10 12)))
> 24387792
>
> bytes. Interpreted, the computation takes 38298824 bytes,
> compiled, it takes 31401552 bytes using subsets-v5. That's
> not so bad.

Just for fun:

;; Count the distinct pairs reachable from L, destructively marking
;; each visited pair so shared structure is counted only once.
(define (count! l)
  (define c 0)
  (define !! (list '!!))
  (let loop ((l l))
    (if (and (pair? l) (not (eq? (car l) !!)))
        (let ((x (car l)) (y (cdr l)))
          (set-car! l !!)
          (set! c (+ 1 c))
          (loop x) (loop y))
        0))
  c)

and then:

| > (define x (subsets-v5 '(a b c d e f g h i j k l m n o p q r s t) 10))
| > (length x)
| 184756
| > (count! x)
| 1111280

That is, only ~1.1M pairs were actually used.

Al Gore Rhythm Petrofsky

Jan 17, 2002, 4:59:26 AM
Eli Barzilay <e...@barzilay.org> writes:
> b...@cs.purdue.edu (Bradley J Lucier) writes:
>
> > BTW, there are 184756 subsets of size 10 in a length-20 list set,,
> > so in Gambit, which takes 12 bytes per cons cell, this takes
> >
> > > (* 184756 (+ 12 (* 10 12)))
> > 24387792
> >
> > bytes.
>
> But what about lists that share parts (which most of it is)?

I believe the minimum number of unique pairs in the result is

537471 = [ C(20,10) + C(19,9) + C(18,8) + ... + C(11,1) ] + C(20,10)

where C(n,m) = n!/(m!(n-m)!). (The extra C(20,10) are the pairs
connecting the subsets into a list.) I'll leave the proof of this
claim as an exercise for Oleg.

537471 is the number of conses done by the accumulator-based version:

(define (subsets l n)
  (let ss ((l l) (len (length l)) (n n) (little-acc '()) (big-acc '()))
    (cond ((zero? n) (cons little-acc big-acc))
          ((> n len) big-acc)
          (else (ss (cdr l) (- len 1) (- n 1) (cons (car l) little-acc)
                    (ss (cdr l) (- len 1) n little-acc big-acc))))))

You can save an additional whopping ten pairs by reusing parts of the
argument. I proclaim the following the STINGIEST subsets function
ever known to man:

(define (subsets l n)
  (let ss ((len (length l)) (n n) (little-acc '()) (big-acc '()))
    (cond ((zero? n) (cons little-acc big-acc))
          ((> n len) big-acc)
          (else (let ((last-pair (list-tail l (- len 1))))
                  (ss (- len 1) (- n 1)
                      (if (eq? (cdr last-pair) little-acc)
                          last-pair
                          (cons (car last-pair) little-acc))
                      (ss (- len 1) n little-acc big-acc)))))))

-al


P.S. A drawback of the stream and memoization versions is that they
take a ridiculous amount of time to find all the length 18 subsets of
a 20 element list.

Eli Barzilay

Jan 17, 2002, 5:42:08 AM
Al Gore Rhythm Petrofsky <al-gore...@petrofsky.org> writes:

> I believe the minimum number of unique pairs in the result is
>
> 537471 = [ C(20,10) + C(19,9) + C(18,8) + ... + C(11,1) ] + C(20,10)
>

> [...] I'll leave the proof of this claim as an exercise for Oleg.

Hahaha!!


> 537471 is the number of conses done by the accumulator-based version:

> [...]
> (define (subsets l n) [...])


>
> You can save an additional whopping ten pairs by reusing parts of the
> argument. I proclaim the following the STINGIEST subsets function
> ever known to man:
>

> (define (subsets l n) [...])

Well, my count! thing shows that (when you use it on both the original
and the result, so sharing with the argument is taken into account),
and it also shows that it is indeed the stingiest, winning by those 10
cells. Anyway, this is the count for all the rest:

subsets 537471
subs 537471
subs2 537471
combos 537471
subset 2032316
msubset 537526
msubset1 537471
subsets-v23 537471
subsets-v5 1111280
ssubsets 537471


> P.S. A drawback of the stream and memoization versions is that they
> take a ridiculous amount of time to find all the length 18 subsets
> of a 20 element list.

Huh?? Either you're doing something wrong, or I did something wrong, or
something is strange with the implementation you use, but I get a
"uniform" 10ms time for both (I think that's as small as it can get
above zero).

Doug Quale

Jan 17, 2002, 11:00:40 AM
Al Gore Rhythm Petrofsky <al-gore...@petrofsky.org> writes:

> P.S. A drawback of the stream and memoization versions is that they
> take a ridiculous amount of time to find all the length 18 subsets of
> a 20 element list.

Huh??? Did you actually try this? It ain't true.

--
Doug Quale

Allegation Petrofsky

Jan 17, 2002, 2:14:18 PM
Al Gore Rhythm Petrofsky <al-gore...@petrofsky.org> writes:
> P.S. A drawback of the stream and memoization versions is that they
> take a ridiculous amount of time to find all the length 18 subsets of
> a 20 element list.

Oops. That was two stupid claims in one. Sorry. Please ignore.

Allegiance Petrofsky

Jan 17, 2002, 2:33:42 PM
Eli Barzilay <e...@barzilay.org> writes:

> Well, my count! thing shows that (when you use it on both the original
> and the result so it takes it into account), and it also shows that it
> is the stingiest winning by these 10 cells. Anyway, this is the count
> for all the rest:
>
> subsets 537471
> subs 537471
> subs2 537471
> combos 537471
> subset 2032316
> msubset 537526
> msubset1 537471
> subsets-v23 537471
> subsets-v5 1111280
> ssubsets 537471

That's the count of unique pairs in the result, right? There's also
the question of how many pairs are consed but not included in the
result. That can be measured by redefining cons to keep a count.
Then there's the question of how much auxiliary garbage is generated
by the system (for environments, continuations, etc.). That can't be
generically measured. The accumulator-based version conses no pairs
other than those in the result, and under most compilers I believe
there will be close to zero auxiliary garbage.
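
A sketch of that measurement (my illustration, not code from the
thread): shadow CONS with a counting wrapper before loading the code
under test; only conses that go through this definition are counted.

(define cons-count 0)
(define cons
  (let ((plain-cons cons))
    (lambda (a d)
      (set! cons-count (+ cons-count 1))
      (plain-cons a d))))

; usage:
;   (set! cons-count 0)
;   (subsets '(a b c d e f) 3)
;   cons-count          ; pairs allocated by CONS during the call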

-al

Eli Barzilay

Jan 17, 2002, 2:47:57 PM
Arbitrary Petrofsky <alleg...@petrofsky.org> writes:

> That's the count of unique pairs in the result, right? There's also
> the question of how many pairs are consed but not included in the
> result. That can be measured by redefining cons to keep a count.
> Then there's the question of how much auxiliary garbage is generated
> by the system (for environments, continuations, etc.). That can't
> be generically measured.

So for a complete analysis you'll have to see how much space the
process takes, but that will be very platform dependent. So, I think
I should stop playing with this thing...

Allegory Petrofsky

Jan 17, 2002, 3:33:58 PM
Al Gore Rhythm Petrofsky <al-gore...@petrofsky.org> writes:
> Eli Barzilay <e...@barzilay.org> writes:
> > b...@cs.purdue.edu (Bradley J Lucier) writes:
> >
> > > BTW, there are 184756 subsets of size 10 in a length-20 list set,

> > But what about lists that share parts (which most of it is)?


>
> I believe the minimum number of unique pairs in the result is
>
> 537471 = [ C(20,10) + C(19,9) + C(18,8) + ... + C(11,1) ] + C(20,10)
>
> where C(n,m) = n!/(m!n-m!).

Belated realization: the first term is more succinctly written as C(21,10) - 1.

An elegant proof still eludes me.

-al

Doug Quale

Jan 17, 2002, 7:55:37 PM
Eli Barzilay <e...@barzilay.org> writes:

> Doug Quale <qua...@charter.net> writes:
>
> > Certainly. This basic assumption for memoization means that lazy
> > streams will work some times when memoization is not appropriate.
> > The lazy implementation doesn't need to compare function arguments.
>
> Can you think of a simple example? (That would be useful for future
> teaching...)

Uh, ..., you've pinned me down here. I can't think of any good
examples, so maybe this is more a theoretical argument than a
practical one. There are certainly many examples of large data
structures passed between functions (the intermediate data structures
used by a compiler is one good example), but I don't see any need for
memo functions there. One practical example that works against my
argument is hash consing. Hash consing is an application of
memoization where the arguments can be very large, but it works very
well all the same.


--
Doug Quale

Bob

Jan 17, 2002, 9:33:17 PM
hmmm...it's all platform dependent...when the registers are full, you have
to use the frame, so depending on how many registers your processor has
available, and how your implementation allocates register and frame space,
you can get varying results.....but, I am already satisfied with your tests
and results from the 15th.....so, agreed, it was fun while it lasted....

Bob


"Eli Barzilay" <e...@barzilay.org> wrote in message
news:sk8zaw7...@mojave.cs.cornell.edu...

Eli Barzilay

Jan 17, 2002, 10:45:11 PM
"Bob" <rwge...@indiana.edu> writes:

> hmmm...it's all platform dependent...when the registers are full, you have
> to use the frame, so depending on how many registers your processor has
> available, and how your implementation allocates register and frame space,
> you can get varying results.....but, I am already satisfied with your tests
> and results from the 15th.....so, agreed, it was fun while it lasted....

[This is just what I said: you have to measure the process itself, but
even that wouldn't be too useful...]

Thomas Baruchel

Jan 18, 2002, 12:09:57 PM
On Sat, 12 Jan 2002 14:43:41 -0500, Ji-Yong D. Chung wrote:
> I did not google on ths topic -- it is probably
>too esoteric.
>
> But what _is_ subsets function? What does it
>compute, and what is it used for?

Hi, I'm the one who posted the initial question ;-)
What they are speaking about isn't at all esoteric.
I very often work with sets of integers, and have to
compute combinations of various kinds. I need the following
function: given a set (written as a list) of integers, I want
the list of all its subsets having n elements: if the initial
set is '(4 5 6),
then (subsets '(4 5 6) 2) should return:
((4 5) (5 6) (4 6))
(the order isn't important, either for the sublists or for
the elements in a sublist). You see it isn't at all esoteric...
I was nevertheless surprised to see how enthusiastic everyone
in the group was, and I want to thank them all.

--
QlpoOTFBWSZTWcwiz1oAAC1fgHQTwOeABVAABAT7Zp4lMAC4hET1DQNGhoBoyGaQYyaZAyaG
QZGmBGDTJE01NMTJkZDQaAUhm7W8Wu9WYGQZg2Vd+s8PsaiAZJoF5jaDsEQUaCEQHgnxdw5H
siRDfoqLyg4gHe6/TCLCgm0gY3zjVSswgknIk85qBbV7GNcqz8yWcUOcrT4SlYICcQUgKxM2
gumlEIhPgCSCC4gUHVb3pREx/vdlGkW5r2P5Z+LuSKcKEhmEWetA | mimencode + bzip2

0 new messages