Resource cleanup when lazy sequences are finalized


David Andrews

Aug 3, 2010, 3:21:00 PM
to Clojure
I want to create a lazy seq backed by an open file (or db connection,
or something else that needs cleanup). I can't wrap the consumer in a
with-anything.

Is there a general method for cleaning up after the consumer discards
its reference to that lazy seq? I'm vaguely aware of Java finalize,
but am also aware that it is unpredictable (e.g. it isn't guaranteed
to run at the next gc).

Does Clojure even provide the ability to define a finalize method for
a lazy seq?

(Sipping water from a firehose...)

Brian Hurt

Aug 3, 2010, 4:40:03 PM
to clo...@googlegroups.com

The descriptor(/db connection/etc) will be cleaned up eventually, when the finalizer runs.  The problem with this is that you don't know when the finalizer will run, and it can be arbitrarily delayed.  So it's possible for a program to run out of file descriptors, if there are file descriptors which are garbage but not yet collected.

Another possibility, if you know that the consumer will force the entire list before discarding it, is to concat on a zero-element list which closes the descriptor, like:

(concat orig-list
    (lazy-seq (do (.close fd) [])))

On the other hand, if the consumer doesn't force the whole list, then the descriptor never gets closed.
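
A minimal sketch of the concat approach, assuming the resource is a java.io.BufferedReader consumed through line-seq. The name closing-line-seq is hypothetical and is not taken from any library:

```clojure
(import '(java.io BufferedReader StringReader))

(defn closing-line-seq
  "Returns a lazy seq of the lines of rdr. The trailing lazy-seq
  contributes no elements; its only effect is to close rdr when a
  consumer walks past the last line."
  [^BufferedReader rdr]
  (concat (line-seq rdr)
          (lazy-seq (.close rdr) nil)))

;; Usage, with an in-memory reader standing in for a file:
(comment
  (def rdr (BufferedReader. (StringReader. "a\nb\n")))
  (vec (closing-line-seq rdr)))   ; forces the whole seq, so rdr gets closed
```

If the consumer stops early, for example by calling only first on the result, the trailing lazy-seq is never realized and the reader stays open, which is the caveat described above.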

So the real answer is: this isn't a good use for seqs.  It looks like one, but it isn't.

Brian

David Andrews

Aug 3, 2010, 5:41:36 PM
to Clojure
On Aug 3, 4:40 pm, Brian Hurt <bhur...@gmail.com> wrote:
> So the real answer is: this isn't a good use for seqs.

I was afraid that was the case. Too bad, 'cause seqs are otherwise
elegant. Thanks for the sanity check.

Jeff Palmucci

Aug 3, 2010, 5:28:18 PM
to Clojure
See my library at http://github.com/jpalmucci/clj-yield, which makes
this trivial.

For example, here is a function I use to read a sequence of java
serialized objects from a stream:

(defn read-objects [path]
  (with-yielding [out 1000]
    (with-open [stream (java.io.ObjectInputStream.
                        (java.io.BufferedInputStream.
                         (java.util.zip.GZIPInputStream.
                          (java.io.FileInputStream. (io/file path)))))]
      (loop []
        (try
          (let [next (.readObject stream)]
            (yield out next)
            (recur))
          (catch java.io.EOFException e (.close stream)))))))

When the sequence returned by with-yielding becomes garbage
collectable, yield will throw an exception causing with-open to close
the file.

Note that with-yielding will use a thread from the thread pool, so it's
a bad idea to have hundreds of active with-yieldings at once.

Cameron

Aug 4, 2010, 10:56:54 AM
to Clojure
Not 100% on this, but this is what I do when reading files...

(with-open [rdr (BufferedReader. (FileReader. file-name))]
  (reduce conj [] (line-seq rdr)))

That ensures that the whole seq is realized before the handle is
closed, but it also allows you to wrap the whole block with a take
function if you only care about the first few lines. As far as I
know, this would still close the resource whether you realize
the whole sequence or only take part of it. Can someone who knows a
bit better confirm?


On Aug 3, 5:28 pm, Jeff Palmucci <jpalmu...@gmail.com> wrote:
> See my library at http://github.com/jpalmucci/clj-yield, which makes

Meikel Brandmeyer

Aug 4, 2010, 11:08:15 AM
to Clojure
Hi,

On Aug 4, 4:56 pm, Cameron <cpuls...@gmail.com> wrote:

> Not 100% on this, but this is what I do when reading files...
>
> (with-open [rdr (BufferedReader. (FileReader. file-name))]
>     (reduce conj [] (line-seq rdr)))

An easier way to do this is doall.

> That ensures that the whole seq is realized before the handle is
> closed, but it also allows you to wrap the whole block with a take
> function if you only care about the first few lines. As far as I
> know, this would still close the resource whether you realize
> the whole sequence or only take part of it. Can someone who knows a
> bit better confirm?

When you place the take around the with-open block or around the
doall, the whole file will be read into memory and stay there until
the take sequence is fully realised. If you really want only a certain
number of lines of the file, you have to move the take into the doall
in the with-open.
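
A rough sketch of that arrangement, assuming clojure.java.io is available under the alias io. The function name first-lines is hypothetical:

```clojure
(require '[clojure.java.io :as io])

(defn first-lines
  "Returns the first n lines of source as a fully realized seq.
  take limits how much is read, doall forces those lines while the
  reader is still open, and with-open closes the reader afterwards."
  [n source]
  (with-open [rdr (io/reader source)]
    (doall (take n (line-seq rdr)))))

;; Usage, with an in-memory reader standing in for a file name:
(comment
  (first-lines 2 (java.io.StringReader. "a\nb\nc\n")))
```

With the take inside the doall, only the requested lines are held in memory, and the reader is closed no matter how much of the file remains unread.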

Sincerely
Meikel

Jeff Palmucci

Aug 4, 2010, 12:12:37 PM
to Clojure
Right, that'll work, but it is no longer lazy in the sense that it
will read the whole sequence into memory (a problem for me because my
sequences are 10s of GB long, compressed).

The feature I was trying to show is that the "yield" function allows
you to make *arbitrary* non-lazy code lazy. (not just for cleanup, but
for anything)

In this particular case, the producer thread will only read 1000
objects ahead before blocking (inside the yield function) and waiting
for the consumers to catch up. Won't blow up memory.

Also, in this example, with-open doesn't really work for producing
lazy sequences. As you pointed out you must read the whole file to
avoid closing too early. However, with 'yield' it works because the
body of the with-yielding has its own thread. The body of the
with-open doesn't finish until the file is completely read.
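
The general pattern (a producer thread feeding a lazy seq through a bounded buffer) can be sketched roughly as below. This is not clj-yield's implementation: the name buffered-seq and its argument shape are hypothetical, and the sketch leaves out the garbage-collection detection described earlier in the thread, so a producer whose consumer walks away will simply stay blocked.

```clojure
(import '(java.util.concurrent LinkedBlockingQueue))

(defn buffered-seq
  "Runs (produce! emit) on another thread and returns a lazy seq of
  the values passed to emit. emit blocks once n values are waiting
  unconsumed, so the producer never runs more than n items ahead.
  Assumption: emitted values are non-nil (the queue rejects nil)."
  [n produce!]
  (let [q    (LinkedBlockingQueue. (int n))
        done (Object.)]                      ; unique end-of-stream marker
    (future
      (try
        (produce! (fn [x] (.put q x)))       ; put blocks while the queue is full
        (finally (.put q done))))            ; always signal the end
    ((fn step []
       (lazy-seq
         (let [x (.take q)]                  ; take blocks while the queue is empty
           (when-not (identical? x done)
             (cons x (step)))))))))

;; Usage: the producer body can hold a with-open, because it runs on
;; its own thread and only returns once everything has been emitted.
(comment
  (buffered-seq 1000 (fn [emit] (doseq [i (range 5)] (emit i)))))
```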

David Andrews

Aug 5, 2010, 5:59:50 PM
to Clojure
On Aug 3, 5:28 pm, Jeff Palmucci <jpalmu...@gmail.com> wrote:
> See my library at http://github.com/jpalmucci/clj-yield, which makes
> this trivial.

This looks really nice, Jeff. Thanks. Exactly what I was looking
for.

I notice that the garbage-monitor deftype yields a classname error in
IBM Java6. I renamed it to garbage_monitor and all seems copacetic.

Jeff Palmucci

Aug 12, 2010, 10:33:14 AM
to Clojure
Any interest in moving this to clojure-contrib? It seems like a pretty
useful facility to have for a language like Clojure that relies so
heavily on lazy sequences.

Also, it's an easy way to implement a pretty common parallel
programming pattern, two threads communicating through a buffered
queue.

David Andrews

Aug 13, 2010, 1:49:03 PM
to Clojure
Easy enough to use via lein (thanks for uploading it to clojars BTW).
I think it deserves a place in cc.