Usually my 'bugs' are in my own code, but this time I talked to three of four people on IRC and came up with a one-liner that suffers the same problem my project has.
I was optimizing some code to not retain the head of the sequence, but when I put a map inside a reduce over a gigantic partitioned seq, I got a stack overflow error.
Some people on IRC thought it'd be a problem with reduce or chunked seqs.
The problem is very much reproducible on Clojure 1.2 and one workaround is to use pmap, because threads get their own stack. But as said in the docstring of pmap, it's very inefficient for small operations like this.
The one-liner:
http://gist.github.com/659491
The project:
http://github.com/pepijndevos/Clomian/blob/master/src/clomian.clj#L33
I hope this can get fixed or that someone can give me a better solution.
Groeten,
Pepijn de Vos
--
Sent from my iPod Shuffle
http://pepijndevos.nl
Am 02.11.2010 um 12:58 schrieb Pepijn de Vos:
> The one-liner:
> http://gist.github.com/659491
I would expect this is because you pile lazy seq on lazy seq, which then get realised, unfolding the whole thing resulting in the stack overflow. Try this:
user=> (reduce #(doall (map + %1 %2)) (partition 5 (range 1e6)))
(99999500000 99999700000 99999900000 100000100000 100000300000)
Sincerely
Meikel
I'm sorry... I don't quite understand this explanation. Do you mean
that reduce is realizing the entire list all at once? I would think
it would grab an element one at a time. Sorry for the stupid
question, but there's something subtle here that I'm not
understanding.
> user=> (reduce #(doall (map + %1 %2)) (partition 5 (range 1e6)))
> (99999500000 99999700000 99999900000 100000100000 100000300000)
Is it because of the way the lazy sequence is being generated? Would
different implementations of partition or range do better?
Thanks in advance!
-John
I think the problem is that this reduction will build an expression like this:
(map + ... (map + ... (map + ... (map + ... <one million nesting levels>))))
When clojure tries to realize an element of the resulting lazy seq,
every level will result in a nested method call, which will eventually
blow the stack.
// raek
Thank you both for the explanation. I hadn't realized that reduce
could work with lazy sequences in that way. I figured it was doing
the equivalent of a "doall" already for each element of the sequence,
but apparently not. :-) So now I see how the nesting can happen.
Thanks!
-John