Is this behavior due to some artifact of destructuring I'm not aware
of (or something else I'm missing), or is there a bug? If it sounds
like a bug, can anyone else reproduce?
Thanks!
Right now I can't see how loop can be made to support both cases.
Hopefully someone else will.
Let's macro-expand Christophe's example with both the old loop and the
new loop.
(loop [[x & xs] s
       y xs]
  ...)
So with the old loop we get this:
(loop*
  [G__13702 s
   G__13703 xs]
  (let [[x & xs] G__13702
        y G__13703]
    ...))
See the problem? "xs" is used before it's defined.
The new loop uses the outer-let to get around this:
(let [G__13697 s
      [x & xs] G__13697
      y xs]
  (loop* [G__13697 G__13697
          y y]
    (let [[x & xs] G__13697
          y y]
      ...)))
What initially occurs to me is to move the outer loop into loop*'s vector:
(loop*
  [G__13702 s
   G__13703 (let [[x & xs] G__13702] xs)]
  (let [[x & xs] G__13702
        y G__13703]
    x))
A problem with that is that we'd have to insert a destructuring let of
all the previous arguments for each loop* argument, so the expansion
will blow up in size pretty quickly, and I suspect the JVM won't
optimize the unused bindings away, since it can't be sure that the
(first s) and (rest s) that [[x & xs] s] expands into are side-effect
free.
I understand the pragmatism of your approach, but it's really
unfortunate. Seqs are a really convenient abstraction, and the ability
to model arbitrarily large or infinite ones (with laziness) is really
useful. In my opinion, only using seqs when all of the data fits
into memory really undermines the value of the abstraction (by
narrowing its usages so severely), and also makes laziness far less
useful (except possibly as a way to amortize costs over time, rather
than as a way to model infinite things).
This path is well-trodden, but the danger of hanging on to the
head of the list is due to the caching behavior of lazy seqs, which is
important for consistency - otherwise, walking the same seq twice
might yield different results.
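To sketch that caching behavior (the names `calls` and `coll` here are
made up for illustration): a lazy seq realizes each element at most
once, so even an impure step function runs only once per element, no
matter how many times the seq is walked:

```clojure
;; Illustrative sketch: a lazy seq caches each realized element,
;; so repeated walks see consistent values and the (impure) mapped
;; fn is not re-run.
(def calls (atom 0))

(def coll (map (fn [x] (swap! calls inc) x) (range 3)))

(doall coll)   ; realizes all three elements
(doall coll)   ; walks the cached values; no new calls
@calls         ;=> 3, not 6
```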
As with most engineering efforts, there are trade-offs, but I've been
willing to accept the extra caution I need to employ when dealing with
lazy seqs. I've run into a few of these kinds of bugs over time, and
I'm guessing it's generally because in my uses, I'm dealing with
millions of records, and far more data than I can fit in memory. I'm
not sure that this indicates that seqs are the wrong tool in this
instance (as you seem to say), but the answer isn't clear to me.
In Clojure's early days, I complained about this and described some of
my own experiments with uncached sequences. Rich said he was
experimenting with another model for uncached iterator-like
constructs, I think he called them streams. As far as I know, none of
that has ever made it into Clojure. So I still feel there's a need
here that should eventually be addressed.
Clojure's built-in "range" function (last time I looked) essentially
produces an uncached sequence. And that makes a lot of sense.
Producing the next value in a range on-demand is way more efficient
and practical than caching those values. I think that Clojure
programmers should have an easy way to make similarly uncached
sequences if that's what they really want/need. (Well, obviously you
can drop down into Java and use some of the same tricks that range
uses, but I mean it should be easy to do this from within Clojure).
> The new loop uses the outer-let to get around this:
> (let [G__13697 s
>       [x & xs] G__13697
>       y xs]
>   (loop* [G__13697 G__13697
>           y y]
>     (let [[x & xs] G__13697
>           y y]
>       ...)))
Well, in truth, there is a way to fix the loop macro, but it's too
ugly: the idea is to wrap the outer let in a closure and replace the
loop with a function, so that we benefit from the locals clearing that
happens on a tail call.
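To sketch the idea with a made-up example (this is not the actual
macro expansion, and `sum-first` is a hypothetical name): the loop body
becomes a self-recursive fn, so the seq travels as an argument and the
reference to it is cleared on each tail call instead of being pinned by
an outer let:

```clojure
;; Hypothetical illustration of the closure idea: because recur
;; rebinds the fn's parameters, the previous seq head is no longer
;; referenced after the tail call, so it can be collected.
(defn sum-first [n coll]
  ((fn step [[x & xs] acc n]
     (if (zero? n)
       acc
       (recur xs (+ acc x) (dec n))))
   coll 0 n))

(sum-first 5 (range))   ;=> 10, even on an infinite seq
```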
The real solution would be pervasive locals clearing but I seem to
remember Rich saying he'd like to delay such work until clojure in
clojure.
> What's the procedure for creating a ticket for this? Is it at least
> acknowledged that this IS a bug?
It's better to wait for Rich's opinion on this problem before creating a ticket.
'range' has since changed and now produces a chunked lazy seq
(master branch post-1.0).
> Producing the next value in a range on-demand is way more efficient
> and practical than caching those values. I think that Clojure
> programmers should have an easy way to make similarly uncached
> sequences if that's what they really want/need.
This can be done by implementing the ISeq interface, today with
proxy, in the future with newnew/reify/deftype/etc.
(defn incs [i]
  (proxy [clojure.lang.ISeq] []
    (seq [] this)
    (first [] i)
    (next [] (incs (inc i)))))
user=> (let [r (range 1e9)] [(first r) (last r)])
java.lang.OutOfMemoryError: GC overhead limit exceeded (NO_SOURCE_FILE:0)
user=> (let [r (incs 10)] [(first r) (nth r 1e9)])
[10 1000000010]
Both those examples retain the head, but since 'incs' isn't
a lazy-seq, the intermediate values can be garbage-collected.
Note the difference between the seq abstraction and the lazy-seq
implementation.
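For what it's worth, once reify is available the same uncached seq
might look something like this sketch (`incs2` is a made-up name, and
only the subset of ISeq needed for first/next/take is implemented;
unimplemented interface methods would throw if called):

```clojure
;; Sketch: an uncached, on-demand seq via reify. Like the proxy
;; version, nothing is cached, so intermediate values can be GC'd
;; even when the head is retained.
(defn incs2 [i]
  (reify clojure.lang.ISeq
    (seq   [this] this)
    (first [_] i)
    (next  [_] (incs2 (inc i)))
    (more  [this] (or (next this) ()))))

(take 3 (incs2 10))   ;=> (10 11 12)
```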
--Chouser
I've been following this thread, and I must say I'm puzzled that Rich
hasn't said anything at all about this issue yet. It seems important
enough to hear his own opinion.
Right - pervasive locals clearing will definitely do the trick here.

Interestingly, when I was at Microsoft and asked them about handling
this issue for the CLR they stated plainly it wasn't an issue at all -
their system can fully detect that any such references are in fact
unreachable and are subject to GC. So, in a sense, all locals clearing
on my part is a workaround for a JVM weakness in this area.