Fine-grained locals clearing

Rich Hickey

Dec 10, 2009, 9:10:44 AM

to Clojure

One of the objectives of Clojure is to reduce incidental complexity.
And one of the biggest sources of incidental complexity in Clojure was
the retention of the head of a lazy sequence due to its being
referenced by some local (argument or local (let) binding). One might
expect that, if no subsequent code in the body of a function uses that
arg/local, it would be subject to GC. Unfortunately, on the JVM, that
is, in many cases, not true - the local is considered a live reference
and is thus not GCed. This yields the infamous 'holding onto the head'
problem, and subsequent Out Of Memory errors on large data sets.
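
To make the shape of the problem concrete, here is a minimal sketch. The function names are hypothetical and not part of the original post. In both functions the seq is needed only for a single non-tail call, which is the situation described above; whether the head is actually retained while that call runs depends on the compiler version and the JVM.

```clojure
;; Hypothetical examples, for illustration only.

(defn summarize
  "Counts xs, then builds a message that no longer needs xs."
  [xs]
  (let [n (count xs)]          ; last use of xs, but not in tail position
    (str "saw " n " items")))  ; xs is never referenced again here

(defn first-and-rest-count
  "Destructures xs; the destructuring form expands into extra locals
  that user code cannot see or clear by hand."
  [xs]
  (let [[x & more] xs]
    [x (count more)]))         ; last use of more, again a non-tail call

(summarize (range 5))             ; => "saw 5 items"
(first-and-rest-count (range 5))  ; => [0 4]
```

With a small input both calls are harmless. The concern in the post is the same code applied to a seq too large to fit in memory at once.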

I had put in place a workaround, which was the 'clearing/nulling-out'
of locals on the tail call of the function. This helps in many, but
not all, cases. Not all logic flows are amenable to rearrangement to
leverage this cleanup. And there are many cases where the local is not
visible - e.g. when destructuring.

The full solution is to track, during compilation, the lifetime of all
locals on all branches of the execution path and to emit code that
clears them at the point of last use in any particular branch.
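
As a rough illustration of "last use in any particular branch", consider the sketch below. The function is hypothetical, and the comments mark where a per-branch analysis could clear the local; they describe the idea in the paragraph above, not the exact code the compiler emits.

```clojure
;; Hypothetical example: xs has a different last use on each branch.

(defn describe
  [xs whole?]
  (if whole?
    (let [n (count xs)]        ; branch 1: last use of xs is this call
      (str n " items"))        ; xs could already be cleared by now
    (let [x (first xs)]        ; branch 2: last use of xs is this call
      (str "starts with " x))))

(describe (range 3) true)   ; => "3 items"
(describe (range 3) false)  ; => "starts with 0"
```

Clearing only at the function's tail call would not help on either branch here, because neither use of xs is in tail position.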

I'm happy to announce I have implemented this fine-grained locals
clearing in the compiler, in the 'new' branch. It should automatically
cover all cases in which the code doesn't explicitly reuse the head -
including non-tail usage, destructuring etc. In short, such cases
should 'just work' from now on.

N.B. that this is strictly a lifetime management issue and does not
change the nature of lazy sequences - they are real, linked data
structures, the tail of which might not yet have been created. They
are most emphatically *not* ephemeral streams of values. However, with
fine-grained locals clearing, they are subject to GC 'as you go',
delivering the benefits of both.
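
A small sketch of the "real data structure" point, assuming a hypothetical helper named realizations: a lazy seq caches its elements once they are realized, so walking it a second time through a reference that is still in use does not recompute anything.

```clojure
;; Hypothetical example: count how often the mapping function runs.

(defn realizations
  "Walks the same lazy seq twice and returns how many times the
  mapping function was called."
  []
  (let [calls (atom 0)
        xs    (map (fn [x] (swap! calls inc) x) [1 2 3])]
    (dorun xs)   ; first walk realizes and caches the three elements
    (dorun xs)   ; second walk reuses the cached elements
    @calls))

(realizations)  ; => 3
```

Here the code explicitly reuses the head, so the seq has to stay reachable between the two walks. That is the case the post excludes from automatic clearing.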

If you've got a pet case of incidental head-retention, please try out
the 'new' branch and let me know how it goes.

Thanks,

Rich

Garth Sheldon-Coulson

Dec 10, 2009, 9:23:43 AM

to clo...@googlegroups.com

Rockin'.


Stephen C. Gilardi

Dec 10, 2009, 9:40:37 AM

to clo...@googlegroups.com

On Dec 10, 2009, at 9:10 AM, Rich Hickey wrote:

> I'm happy to announce I have implemented this fine-grained locals
> clearing in the compiler, in the 'new' branch. It should automatically
> cover all cases in which the code doesn't explicitly reuse the head -
> including non-tail usage, destructuring etc.

What a great change! We ran into the problem this solves just this week: running out of memory on long lazy seqs with large elements that we intended to be processing one item at a time.

> In short, such cases should 'just work' from now on.

... adding yet another entry on the long list of Clojure facilities that adhere to the principle of least surprise.

Nicely done!

--Steve

George Jahad

Dec 10, 2009, 11:24:12 AM

to Clojure

+1 As cool as the new branch is, this is the first compelling reason
I've seen to go to my boss and say we need to switch to it now.

Thanks Rich!

Richard Newman

Dec 10, 2009, 2:26:24 PM

to clo...@googlegroups.com

> +1 As cool as the new branch is, this is the first compelling reason
> I've seen to go to my boss and say we need to switch to it now.
>
> Thanks Rich!

Speaking of which... I know the new branch is where Rich publishes his
current work. Does anyone (Rich included!) have any opinion on whether
it's reasonable to do 'mainstream' development against new (rather
than working against master and occasionally testing against new)?

I currently work against master, but I keep my checkout up-to-date.

Imaginary counter-points would be "bugs that get fixed on master don't
get merged into new", "it's really only for experimentation", "it's
often broken for days at a time", "things behave differently to
master", "functions keep changing their names", "building a jar
doesn't work", etc.

I ask because I maintain a bunch of libraries and do quite a lot of
new development, and so I see a possible win/win: I get to enjoy new
improvements, but I also act as a real-world-code tester for the
community. That only applies, though, if new has a modicum of
stability and reliability.

Chouser

Dec 10, 2009, 4:15:37 PM

to clo...@googlegroups.com

On Thu, Dec 10, 2009 at 2:26 PM, Richard Newman <holy...@gmail.com> wrote:
>> +1 As cool as the new branch is, this is the first compelling reason
>> I've seen to go to my boss and say we need to switch to it now.
>>
>> Thanks Rich!
>
> Speaking of which... I know the new branch is where Rich publishes his
> current work. Does anyone (Rich included!) have any opinion on whether
> it's reasonable to do 'mainstream' development against new (rather
> than working against master and occasionally testing against new)?
>
> I currently work against master, but I keep my checkout up-to-date.

I would not recommend doing large amounts of development against
'new'. Things change there in ways that break compatibility with
earlier changes there (not usually with 'master', though), so if
you start using features from 'new' and keep your checkout
up-to-date, your code may break at any time without notice.

For context: I don't think I've ever recommended against using
master.

I would however recommend playing around with 'new', trying out
some of the new features on key parts of your code. Also, it may
be very useful to try all your code on 'new' *without* taking
advantage of the new features, and reporting back on any
breakage.

You won't have to wait too long though, I suspect. Once 1.1 is
out the door I assume 'new' will be merged into 'master' fairly
quickly.

--Chouser

Paul Mooser

Dec 10, 2009, 4:25:26 PM

to Clojure

I can't express how thrilled I am that you did this work. Thanks so
much - since I've run into a few of these classes of bugs, I'll see if
I can switch over to new and try to run against some big data sets and
give some feedback, if I can find the time.

Richard Newman

Dec 10, 2009, 5:04:49 PM

to clo...@googlegroups.com

> Also, it may be very useful to try all your code on 'new' *without*
> taking advantage of the new features, and reporting back on any
> breakage.

That's more what I was thinking. While I find the new features
interesting, I'm less jazzed about spending the time to build on
features that might go away. (Chicken and egg, of course.)

It sounds like it's mostly the bleeding edge of `new` that's bleeding,
so to speak, and the stable part is fairly reliable. If that's a
correct impression, then it's probably worth developing against it.

I'm not overly concerned about the occasional regression, because it's
easy enough to switch to master to verify. It's systematic regression
that I'd like to avoid.

> You won't have to wait too long though, I suspect. Once 1.1 is
> out the door I assume 'new' will be merged into 'master' fairly
> quickly.

All the more reason to bang on it now, I suppose!

Daniel Werner

Dec 12, 2009, 10:57:16 AM

to Clojure

On Dec 10, 3:10 pm, Rich Hickey <richhic...@gmail.com> wrote:
> I'm happy to announce I have implemented this fine-grained locals
> clearing in the compiler, in the 'new' branch.

Is there a chance for this feature to find its way into master before
Clojure 1.1 is released?

Rich Hickey

Dec 12, 2009, 12:21:58 PM

to clo...@googlegroups.com

Unlikely. It is a pretty extensive change, and hasn't yet seen the
kind of exercise the rest of the code in 1.1 has.

That said, the bulk of what might constitute a version 1.2 is already
done, so the gap between 1.1 and 1.2 might be short.

Rich
