(clojure.contrib.lazy-seqs/combinations) should return [[]], not nil?

1 view
Skip to first unread message

Jason Wolfe

unread,
Jan 21, 2009, 8:34:30 PM1/21/09
to Clojure
I just got bit by (clojure.contrib.lazy-seqs/combinations) returning
nil, not [[]] as I expected. I could see arguments for either being
the "correct" output, but let me give my case for [[]].

In my mind, asking what the output of (combinations) should be is
completely analogous to asking what the value of 0^0 should be.
Either 0 or 1 seems defensible, but the accepted answer is 1 (for
reasons I won't go into here). Similarly, Clojure (*) ==> 1.

If you don't see the connection, note that for all non-empty seqs of
seqs "args",
(= (count (apply combinations args)) (apply * (map count args))).

If (= (combinations) [[]]), then this would hold always. Plus, this
is what would make it work in my application. I can go into more
details if you like, but the basic idea use case is finding the set of
all allowed variable bindings, given a possible set of possible values
for each variable. If there are no variables, this set should consist
of the single "empty binding" rather than "no bindings possible".

Will anyone sign off on adding this as a clojure.contrib issue?

Thanks,
Jason

Jason Wolfe

unread,
Jan 23, 2009, 12:53:16 PM1/23/09
to Clojure
Just to clarify: I dont' care what the type of the output is. I guess
(seq [[]]) would actually be more consistent with the existing
function. The point is that the output of 0-arg combinations should
be a seq containing an empty vector (or whatever other types, I don't
care), not an empty seq.

Anyone willing to sign off??

Stephen C. Gilardi

unread,
Jan 23, 2009, 1:11:06 PM1/23/09
to clo...@googlegroups.com

On Jan 23, 2009, at 12:53 PM, Jason Wolfe wrote:

> Just to clarify: I dont' care what the type of the output is. I guess
> (seq [[]]) would actually be more consistent with the existing
> function. The point is that the output of 0-arg combinations should
> be a seq containing an empty vector (or whatever other types, I don't
> care), not an empty seq.
>
> Anyone willing to sign off??

Please do enter it as an issue. I'd be interested in hearing from
Chouser before making the change. He added combinations to lazy-seqs.

Thanks!

--Steve

Stephen C. Gilardi

unread,
Jan 23, 2009, 1:23:56 PM1/23/09
to clo...@googlegroups.com

On Jan 23, 2009, at 1:11 PM, Stephen C. Gilardi wrote:

> I'd be interested in hearing from Chouser before making the change.
> He added combinations to lazy-seqs.

I think it's quite cool that simply removing the "when" accomplishes
this.

--Steve

Jason Wolfe

unread,
Jan 23, 2009, 1:29:17 PM1/23/09
to clo...@googlegroups.com

Yes, exactly :) Returning nil for no-args was "hacked on" in a sense,
in that the loop would have naturally done the "right thing" otherwise.

Issue here: http://code.google.com/p/clojure-contrib/issues/detail?id=6

Thanks,
Jason

Chouser

unread,
Jan 23, 2009, 11:49:52 PM1/23/09
to clo...@googlegroups.com
On Fri, Jan 23, 2009 at 1:11 PM, Stephen C. Gilardi <sque...@mac.com> wrote:
>
> Please do enter it as an issue. I'd be interested in hearing from Chouser
> before making the change. He added combinations to lazy-seqs.

I did what now? My memory must be going. Here are some other
implementations I apparently went through:

http://paste.lisp.org/display/68701

Reviewing the IRC log from that day, it appears I put no particular
thought into then no-args case, so I'm happy to defer to Jason's logic
on the matter. This is the best version?

(defn combinations
"Takes any number of seqs and returns a lazy list of ordered
combinations (pick 1 from each seq). See also 'for'"
[& seqs]
(if (empty? seqs) '(())
(for [item (first seqs)
more (apply combinations (rest seqs))]
(cons item more))))

--Chouser

Mark Engelberg

unread,
Jan 24, 2009, 12:36:40 AM1/24/09
to clo...@googlegroups.com
On Fri, Jan 23, 2009 at 8:49 PM, Chouser <cho...@gmail.com> wrote:
> This is the best version?

Depends on your definition of "best".

I just posted a version at http://paste.lisp.org/display/74134 which
is many times faster.

Also, I have to say that calling this function "combinations" is
fairly nonstandard. Usually combinations refers to all the ways of
taking n elements from a given collection.

So (combinations [1 2 3] 2) would give you back ((1 2) (1 3) (2 3)).

selections seems like a better name to me for this particular function.

jaw...@berkeley.edu

unread,
Jan 24, 2009, 2:43:00 AM1/24/09
to clo...@googlegroups.com
> On Fri, Jan 23, 2009 at 8:49 PM, Chouser <cho...@gmail.com> wrote:
>> This is the best version?
>
> Depends on your definition of "best".
>
> I just posted a version at http://paste.lisp.org/display/74134 which
> is many times faster.

I haven't had a chance to look at that in detail yet, but I'm 100% in
favor of faster.

> Also, I have to say that calling this function "combinations" is
> fairly nonstandard. Usually combinations refers to all the ways of
> taking n elements from a given collection.
>
> So (combinations [1 2 3] 2) would give you back ((1 2) (1 3) (2 3)).

That had nagged me a bit too ... I agree.

> selections seems like a better name to me for this particular function.

I think the usual mathematical name is be cartesian-product (or maybe
cross-product), although selections is OK with me too.

-Jason

Mark Engelberg

unread,
Jan 24, 2009, 3:47:36 AM1/24/09
to clo...@googlegroups.com
On Fri, Jan 23, 2009 at 11:43 PM, <jaw...@berkeley.edu> wrote:
> I think the usual mathematical name is be cartesian-product (or maybe
> cross-product), although selections is OK with me too.

Yes, now that you've reminded me, I agree that cartesian-product is
the usual mathematical name, and that would probably be my first
choice.

Jason Wolfe

unread,
Jan 26, 2009, 2:08:46 AM1/26/09
to Clojure
OK, cartesian_product it is.

Two comments on your version. First, unlike mine (and the current
"combinations"), it takes a single argument, a seq of seqs (rather
than multiple seq arguments); which of these ways is preferred?

Second, I had the clauses of my for loop backwards, which was slowing
things down. I think this version:

(defn combinations "Take a seq of seqs and return a lazy list of
ordered combinations (pick 1 from each seq)"
[seqs]
(if (empty? seqs) '(())
(for [rst (combinations (rest seqs))
item (first seqs)]
(cons item rest))))

is actually significantly faster than the iterative one you posted
(and also much simpler); do your benchmarks agree?




On Jan 24, 12:47 am, Mark Engelberg <mark.engelb...@gmail.com> wrote:

Mark Engelberg

unread,
Jan 26, 2009, 2:59:47 AM1/26/09
to clo...@googlegroups.com
For simple inputs, the two approaches have similar performance. On
complex inputs, my tests show the iterative version tends to run about
twice as fast.

Try running on an extreme input like:
["ACDFG" "A" "B" "C" "D" "E" "F" "ABCD" "G" "H" "I" "ABEG" "J" "K"
"BCDE" "L" "ABCDG" "M" "EF" "NABC" "ABCDFG" "ABDEFG" "DGHI" "ABCDEFG"]

and let me know if you also see the 2x performance difference.

Mark Engelberg

unread,
Jan 26, 2009, 1:18:49 PM1/26/09
to clo...@googlegroups.com
Also, it's worth pointing out that your newer version prints the
combinations out in a non-standard order. I don't know whether people
should really rely on the order, but I think most people would expect
it to print out in the same order as a series of nested for loops.

Jason Wolfe

unread,
Jan 26, 2009, 1:22:03 PM1/26/09
to clo...@googlegroups.com

Yes, yours runs more than 2x faster on this input. On the other hand,
mine runs almost 2x faster on
(range 3000) (range 3000) ... As far as I can tell, yours is faster
when there are a large number of input seqs, especially if any of them
have 0 or 1 elements. Anyway, I'm happy with either implementation,
although probably they should take multiple seq arguments rather than
a seq of seqs?

-Jason

Jason Wolfe

unread,
Jan 26, 2009, 1:26:16 PM1/26/09
to clo...@googlegroups.com
> Also, it's worth pointing out that your newer version prints the
> combinations out in a non-standard order.

Good point ... I shouldn't have tried to avoid adding the "let":

(defn combinations "Take a seq of seqs and return a lazy list of
ordered combinations (pick 1 from each seq)"
[seqs]
(if (empty? seqs) '(())

(let [rec (combinations (rest seqs))]
(for [item (first seqs)
rst rec]
(cons item rst)))))


Mark Engelberg

unread,
Jan 26, 2009, 1:28:30 PM1/26/09
to clo...@googlegroups.com
I've tested that already, and it takes even longer for all but trivial
inputs, because rec now prevents the combinations sequence from being
garbage collected, and the memory allocation bogs things down
tremendously.

Jason Wolfe

unread,
Jan 26, 2009, 1:38:19 PM1/26/09
to clo...@googlegroups.com
On Jan 26, 2009, at 10:28 AM, Mark Engelberg wrote:

>
> I've tested that already, and it takes even longer for all but trivial
> inputs, because rec now prevents the combinations sequence from being
> garbage collected, and the memory allocation bogs things down
> tremendously.

Ah, I noticed the unpredictable timings too but didn't make that
connection -- thanks. OK, I'm happy to go with your version then :)

-Jason

Mark Engelberg

unread,
Jan 26, 2009, 7:14:42 PM1/26/09
to clo...@googlegroups.com
I updated the paste with the changes we've discussed.

http://paste.lisp.org/display/74134#3

I'll keep playing around to see if I can get it faster, but so far,
this is the fastest way I've found to generate the cartesian product
in standard order.

Mark Engelberg

unread,
Jan 26, 2009, 7:24:28 PM1/26/09
to clo...@googlegroups.com
Hey, I just looked back at the original post that started the thread.
At some point, I had changed my function so (cartesian-product)
returned nil instead of (nil), but based on Jason's post, I've changed
it back in another paste annotation.

Jason Wolfe

unread,
Jan 26, 2009, 8:34:53 PM1/26/09
to clo...@googlegroups.com

Hah, that would have been funny if after all this, the original point
of the thread got lost :). Anyway, thanks for all your hard work on
this! It's been very interesting for me, since I think I finally
understand (at least somewhat) these "iterative" versions of things
I've seen floating around.

-Jason

Reply all
Reply to author
Forward
0 new messages