The first instance must be holding onto the head of (range 50000000)
for some reason.
If I had to hazard a guess I'd say it's something in the
implementation of the def special form that causes it. Further, I'd
hypothesize that the let form somehow shields the range from having
its head held onto. One piece of evidence is this:
user=> (def x (let [y 100] (reduce + 0 (range 50000000))))
#'user/x
user=> x
1249999975000000
Even if the let binding is completely irrelevant to the range and the
reduce, it doesn't OOME.
You may want to consider the heap size you have allocated to the JVM.
I believe the default is 128 MB.
For example, you can set this yourself:
java -Xms256m -Xmx1024m
This provides a 256 MB initial heap and permits the heap to grow to
1024 MB.
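For instance, if you launch the REPL straight from clojure.jar, the
flags go on that command line, something like:
java -Xms256m -Xmx1024m -cp clojure.jar clojure.main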
I've been using Leiningen, so in my case I just changed the settings
in the source, before install.
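Depending on your Leiningen version you may be able to set this
per-project in project.clj instead; a sketch, assuming the :jvm-opts
key is supported by your version:

(defproject my-project "1.0.0"
  ;; :jvm-opts is an assumption on my part; it should pass these
  ;; flags to the JVM that Leiningen launches
  :jvm-opts ["-Xms256m" "-Xmx1024m"])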
There's probably a list of pros/cons to upping the default heap size
that you may want to consider.
Tim
On Dec 21, 7:09 am, Miles Trebilco <miles.van...@gmail.com> wrote:
> Why does this cause an out of memory error:
>
> (def out_of_mem
>   (reduce + 0 (range 50000000)))
>
> while this does not:
>
> (def not_out_of_mem
>   (let [result 0]
>     (reduce + result (range 50000000))))
>
> and neither does this in the REPL:
> (reduce + 0 (range 50000000))
>
> - Miles
If the workarounds mentioned actually work (I haven't tried), I really
don't understand why. This *looks* like a genuine bug to me, but I
really don't know Clojure's internals well enough (yet) to be able to
have the slightest hint where to start looking. I don't see any
reason why (reduce + (range <largenumber>)) should take so much
memory.
--
Chris Riddoch
Range doesn't return a lazy seq? Or reduce somehow doesn't work
lazily? This is a little discouraging; it seems like a perfect
example of a case where laziness could significantly improve things.
--
Chris Riddoch
No, both are lazy. Something else is going on here, involving def
holding onto the head of any sequence whose expression appears in the
def form but not inside a nested let, fn, or similar scope-creating
form.
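If that's right, wrapping the expression in any such form should
work; for instance, this sketch (untested) ought to behave like the
let version above:

user=> (def also-not-oom ((fn [] (reduce + 0 (range 50000000)))))

The reduce runs inside the fn's scope, so the def itself should never
hold a reference to the head of the range.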
Reduce is a consumer of sequences rather than a producer (or both);
as such, it's lazy inasmuch as it does not hold onto the head of the
sequence: if you pass it a lazy sequence, it does not keep the whole
thing realized and in memory at one time.
Lazy producer -- sequence is made a bit at a time, on demand, instead
of realized all at once.
Lazy consumer -- sequence is traversed without retaining parts
internally (holding onto the head).
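To make the two terms concrete, a couple of throwaway examples (names
are mine):

;; lazy producer: elements are computed only on demand
user=> (take 3 (map inc (iterate inc 0)))
(1 2 3)

;; lazy consumer: walks a seq without retaining its head
(defn count-all [s]
  (loop [s (seq s), n 0]
    (if s (recur (next s) (inc n)) n)))

(iterate inc 0) is infinite, yet take realizes only three elements;
count-all rebinds s as it walks, so nothing keeps the front of the
seq alive.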
Map is both; range is a lazy producer; reduce is a lazy consumer but
an eager producer, though not generally of a transformed version of
the seq.

It's possible to be a lazy consumer and an eager producer of even a
transformed seq, though: (doall (map f s)) presumably generates and
discards an element of s for each element of output it generates, and
holds onto the head of the output only, never holding more than one
element of s at a time.

A seq view of a collection will be an eager consumer but might be a
lazy producer: (apply list a-coll) will eager-consume and
eager-produce, but makes a new data structure the size of the input,
while a lazy seq view may only need to generate the equivalent of a
single cons cell at a time, holding an element in "first" and a
closure wrapping an Iterator or index of some sort in "rest" as the
generator of (next the-seq-view). This reduces memory overhead
somewhat, especially if traversal is never completed, compared to a
seq view generator outputting (list element1 element2 element3 ...).
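A sketch of such a lazy seq view over an indexed collection (toy code
of my own, not how clojure.core actually builds its seqs), using
lazy-seq:

(defn index-seq
  "Lazy seq view of an indexed collection: one element in 'first',
  a closure over the next index hiding in 'rest'."
  ([v] (index-seq v 0))
  ([v i]
   (lazy-seq
     (when (< i (count v))
       (cons (nth v i) (index-seq v (inc i)))))))

user=> (take 3 (index-seq [10 20 30 40]))
(10 20 30)

Traversal only ever materializes one cell at a time, unlike (apply
list a-coll), which builds the whole list up front.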
So there are two sorts of "lazy behavior" at issue: on the producer
side, does it generate a whole list or other data structure all at
once, or only a bit at a time as needed? On the consumer side, does it
hold onto the head of a passed-in seq or does it consume and discard
one element at a time? The difficulty with clarity of language arises
when something is a lazy consumer but an eager producer, like reduce,
or vice versa.
If you want to use "lazy" to mean only "lazy producer", that's OK,
but it would be helpful to have some alternative term for a lazy
consumer (that is, anything that traverses a seq without internally
hanging onto the head).