Out of memory


Miles Trebilco

Dec 21, 2010, 9:09:36 AM
to Clojure
Why does this cause an out of memory error:

(def out_of_mem
  (reduce + 0 (range 50000000)))

while this does not:

(def not_out_of_mem
  (let [result 0]
    (reduce + result (range 50000000))))

and neither does this in the REPL:
(reduce + 0 (range 50000000))


- Miles

Ken Wesson

Dec 21, 2010, 2:04:00 PM
to clo...@googlegroups.com

The first instance must be holding onto the head of (range 50000000)
for some reason.

If I had to hazard a guess I'd say it's something in the
implementation of the def special form that causes it. Further, I'd
hypothesize that the let form somehow shields the range from having
its head held onto. One piece of evidence is this:


user=> (def x (let [y 100] (reduce + 0 (range 50000000))))
#'user/x
user=> x
1249999975000000

Even if the let binding is completely irrelevant to the range and the
reduce, it doesn't OOME.

Tim Robinson

Dec 21, 2010, 2:24:33 PM
to Clojure
You may want to consider the heap size you have allocated to Java. I
believe the default maximum is 128 MB.

For example you can set this yourself:

java -Xms256m -Xmx1024m

This provides a 256 MB initial heap and permits the heap to grow to 1024 MB.

I've been using Leiningen, so in my case I just changed the settings
in the source, before install.
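
Alternatively, if your Leiningen version supports the :jvm-opts key in
project.clj, you can set the heap per project without patching the
source. A minimal sketch -- the project name and versions here are made
up:

;; project.clj
(defproject oom-demo "0.1.0"
  :dependencies [[org.clojure/clojure "1.2.0"]]
  ;; same flags as above: 256 MB initial heap, 1 GB maximum
  :jvm-opts ["-Xms256m" "-Xmx1024m"])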

There's probably a list of pros/cons to upping the default heap size
that you may want to consider.

Tim

Laurent PETIT

Dec 21, 2010, 2:56:11 PM
to clo...@googlegroups.com
2010/12/21 Tim Robinson <tim.bl...@gmail.com>

> You may want to consider the heap size you have allocated to Java. I
> believe the default maximum is 128 MB.
>
> For example you can set this yourself:
>
> java -Xms256m -Xmx1024m

Indeed, but there is a real problem in this example. As Ken said, it seems that the "locals clearing" does not apply to def.
Maybe there's a technical reason for this, but the problem is not related to the heap size, IMHO.
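
A rough back-of-envelope, with my own estimated numbers rather than
anything measured: if def retains the head, all 50,000,000 elements stay
reachable at once, and at roughly 48 bytes per element (a boxed number
plus a cons/lazy-seq node) that is about 50e6 * 48 B ≈ 2.4 GB. No
plausible -Xmx covers that, so raising the heap only moves the failure
point.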


Jeff Palmucci

Dec 22, 2010, 11:04:44 AM
to Clojure
I've worked around this sort of thing in the past by wrapping the
initialization in a closure. My macros:

(defmacro once-fn
  "Define a function that should only be called once.
  Releases local storage earlier."
  [args & body]
  `(^{:once true} fn* ~args ~@body))

(defmacro top-level-run
  "Work around a memory leak in the repl."
  [& body]
  `((once-fn [] ~@body)))

You'll find that:

(def out_of_mem (top-level-run (reduce + 0 (range 500000000))))

does not run out of memory.
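
For what it's worth, I believe the ^{:once true} hint on fn* is the
same trick clojure.core's lazy-seq and delay macros use: it tells the
compiler the function will be invoked at most once, so it is free to
clear the closure's references once the body has run. Expanded by hand,
the def above becomes roughly this immediately-invoked, once-only thunk
(a sketch, not the exact expansion):

(def out_of_mem
  ((^{:once true} fn* []
     (reduce + 0 (range 500000000)))))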

Laurent PETIT

Dec 22, 2010, 11:54:37 AM
to clo...@googlegroups.com
2010/12/22 Jeff Palmucci <jpal...@gmail.com>
Couldn't it just be wrapped with (let [] ...), with the choice of running it once or not made by using either def or defonce:

user=> (def x (reduce + (range 50000000)))
java.lang.OutOfMemoryError: Java heap space (NO_SOURCE_FILE:1)
user=> (def x (let [] (reduce + (range 50000000))))
#'user/x
user=> (defonce y (let [] (reduce + (range 50000000))))
#'user/y
user=> (defonce y (let [] (reduce + (range 50000000))))
nil
user=> (defonce z (reduce + (range 50000000)))
#'user/z
user=> (defonce z (reduce + (range 50000000)))
nil
user=> x
1249999975000000
user=> y
1249999975000000
user=> z
1249999975000000
user=>

Chris Riddoch

Dec 22, 2010, 5:32:14 PM
to clo...@googlegroups.com
On Wed, Dec 22, 2010 at 9:54 AM, Laurent PETIT <lauren...@gmail.com> wrote:
> 2010/12/22 Jeff Palmucci <jpal...@gmail.com>
>>
>> I've worked around this sort of thing in the past by wrapping the
>> initialization in a closure. My macros:
>
> Couldn't it just be wrapped with (let [] ...), with the choice of running it
> once or not made by using either def or defonce:

If the workarounds mentioned actually work (I haven't tried) I really
don't understand why. This *looks* like a genuine bug to me, but I
really don't know Clojure's internals well enough (yet) to be able to
have the slightest hint where to start looking. I don't see any
reason why (reduce + (range <largenumber>)) should take so much
memory.

--
Chris Riddoch

David Nolen

Dec 22, 2010, 5:46:33 PM
to clo...@googlegroups.com
On Wed, Dec 22, 2010 at 5:32 PM, Chris Riddoch <ridd...@gmail.com> wrote:

If the workarounds mentioned actually work (I haven't tried) I really
don't understand why.  This *looks* like a genuine bug to me, but I
really don't know Clojure's internals well enough (yet) to be able to
have the slightest hint where to start looking.  I don't see any
reason why (reduce + (range <largenumber>)) should take so much
memory.

--
Chris Riddoch

An entire collection of 5e7 *objects* is being realized into memory as it is being reduced down to a single value to be stored into a var. I would expect this to perform poorly in any language.

David  

Chris Riddoch

Dec 22, 2010, 6:08:21 PM
to clo...@googlegroups.com
On Wed, Dec 22, 2010 at 3:46 PM, David Nolen <dnolen...@gmail.com> wrote:
> An entire collection of 5e7 *objects* is being realized into memory as it is
> being reduced down to a single value to be stored into a var. I would expect
> this to perform poorly in any language.

Range doesn't return a lazy seq? Or reduce somehow doesn't work
lazily? This is a little discouraging - it seems like this is a
perfect example of a case where laziness could significantly improve
things.

--
Chris Riddoch

Ken Wesson

Dec 22, 2010, 6:10:19 PM
to clo...@googlegroups.com

No, both are lazy. Something else is going on here, involving def
holding onto the head of any sequence whose expression is in the def
form but not in a nested let, fn, or similar scope-creating form.
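
A quick sketch of that hypothesis -- the let case matches what Laurent
showed earlier, but treat the fn variant as an educated guess, since
any scope-creating wrapper should behave the same way:

(def oom     (reduce + 0 (range 50000000)))            ; OOMEs
(def via-let (let [] (reduce + 0 (range 50000000))))   ; fine
(def via-fn  ((fn [] (reduce + 0 (range 50000000)))))  ; should also be fine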

David Nolen

Dec 22, 2010, 6:38:41 PM
to clo...@googlegroups.com

reduce is not lazy. It is eager. chouser corrected my misconception.

David 

Ken Wesson

Dec 22, 2010, 8:06:24 PM
to clo...@googlegroups.com

It's a consumer of sequences rather than a producer (or both); and, as
such, it's lazy inasmuch as it does not hold onto the head of the
sequence: if you pass it a lazy sequence, it does not keep the whole
thing realized and in memory at one time.

Lazy producer -- sequence is made a bit at a time, on demand, instead
of realized all at once.
Lazy consumer -- sequence is traversed without retaining parts
internally (holding onto the head).

Map is both; range is a lazy producer; reduce is a lazy consumer but
an eager producer, though not generally of a transformed version of
the seq. It's possible to be a lazy consumer and an eager producer of
even a transformed seq though; (doall (map f s)) presumably generates
and discards an element of s for each element of output it generates,
and holds onto the head of the output only, never holding more than
one element of s at a time.

A seq view of a collection will be an eager consumer but might be a
lazy producer; (apply list a-coll) will eager-consume and
eager-produce but makes a new data structure the size of the input,
while a lazy seq view may only need to generate the equivalent of a
single cons cell at a time, holding an element in "first" and a
closure wrapping an Iterator or index of some sort in "rest" as the
generator of (next the-seq-view). This reduces memory overhead
somewhat, especially if traversal is never completed, compared to a
seq view generator outputting (list element1 element2 element3 ...).

So there are two sorts of "lazy behavior" at issue: on the producer
side, does it generate a whole list or other data structure all at
once, or only a bit at a time as needed? On the consumer side, does it
hold onto the head of a passed-in seq or does it consume and discard
one element at a time? The difficulty with clarity of language arises
when something is a lazy consumer but an eager producer, like reduce,
or vice versa.

If you want to use lazy to generally only mean "lazy producer", that's
OK, but it would be helpful to have some alternative term for a lazy
consumer (that is, anything that traverses a seq without internally
hanging onto the head).
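
To make the two axes concrete, here's a sketch of my own (assuming a
default-sized heap):

;; lazy producer + lazy consumer: elements are created on demand and
;; dropped as soon as reduce has consumed them, so memory use stays flat
(reduce + 0 (range 50000000))

;; same producer and consumer, but the local s is still needed after
;; the reduce, so the head is retained and all 5e7 elements stay
;; reachable at once -- expect an OOME
(let [s (range 50000000)]
  [(reduce + 0 s) (count s)])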

Paul Mooser

Dec 23, 2010, 11:29:27 AM
to Clojure
So, doesn't this represent a bug at least? I'm sometimes confused
when this sort of issue doesn't get more attention, and I'm uncertain
what the process is for filing a bug, since my impression is that we
are supposed to have issues validated by discussion on the group
before filing an actual ticket or issue.

David Nolen

Dec 23, 2010, 11:42:52 AM
to clo...@googlegroups.com
It probably doesn't get much attention because it's a case that no one actually employs in real programs.

I'd be surprised if a proper patch for this, once submitted, were to be rejected.

David 