Proposal: clojure.io

Phil Hagelberg

Dec 31, 2009, 9:58:49 PM
to clo...@googlegroups.com

I've been looking over our use of contrib in our large-ish project
at work. About 90% of the invocations of contrib functions are
I/O-related. I wonder if it would be a good idea to include a clojure.io
namespace in Clojure itself. I've mentioned the idea a few times on IRC,
and people seemed to be very much in favour.

I've prototyped this in the attached file. Most of the functions come
from contrib's duck-streams library. I've taken everything from
duck-streams except read-lines, file-str, make-parents, and pwd.

read-lines was left out since using it will quite often result in a
leak; closing resources inside a lazy-seq is almost always
problematic. with-in-reader should be used instead. file-str was omitted
because file from java-utils is much nicer. make-parents was omitted in
favour of a new more general function, mkdir. pwd was omitted because
the JVM doesn't have a notion of a "working directory" as most
environments do. But it could be moved to contrib's java-utils if people
are fond of it.
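
To make the read-lines concern concrete, here is a minimal sketch that uses only clojure.core and java.io. The names read-lines-sketch, lines-eagerly and open? are hypothetical and are not taken from the attached io.clj; the first one only approximates the shape of a lazy reader that closes itself once it is exhausted.

```clojure
(import '(java.io BufferedReader StringReader))

;; Hypothetical approximation of a lazy line reader that closes its
;; reader only after the final line has been consumed.
(defn read-lines-sketch [^BufferedReader rdr]
  (lazy-seq
    (if-let [line (.readLine rdr)]
      (cons line (read-lines-sketch rdr))
      (.close rdr))))   ; .close returns nil, which ends the seq

;; Hypothetical eager alternative: every line is realized inside
;; with-open, so the reader is closed before the function returns.
(defn lines-eagerly [^BufferedReader rdr]
  (with-open [r rdr]
    (doall (line-seq r))))

;; Helper for the demonstration: true if rdr has not been closed.
;; BufferedReader.ready throws IOException on a closed reader.
(defn open? [^BufferedReader rdr]
  (try
    (.ready rdr)
    true
    (catch java.io.IOException _ false)))

(def r1 (BufferedReader. (StringReader. "a\nb\n")))
(first (read-lines-sketch r1))  ; the consumer stops after one line
(open? r1)                      ; => true, nothing ever closes r1

(def r2 (BufferedReader. (StringReader. "a\nb\n")))
(lines-eagerly r2)              ; => ("a" "b")
(open? r2)                      ; => false
```

The lazy version only releases the reader if the caller walks the sequence to its end, which is the leak described above; the eager version trades laziness for a guaranteed close.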

I've also taken the file, delete-file, and delete-file-recursively
functions from java-utils. I've added support for treating ~ as $HOME to
file since that was present in duck-streams' file-str function. delete-file
and delete-file-recursively are especially necessary if you ever write
tests for functions that create files. relative-path-string and as-file
were also taken from java-utils simply because they are used for the
file function.
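
For readers without the attachment, the two behaviours described here could look roughly like the sketch below. All three names are hypothetical stand-ins rather than the code in io.clj, and using the user.home system property in place of $HOME is an assumption.

```clojure
(import '(java.io File))

;; Hypothetical helper: expands a leading ~ using the user.home system
;; property (an assumption; the proposal speaks of $HOME).
(defn expand-home [^String path]
  (if (or (= path "~") (.startsWith path "~/"))
    (str (System/getProperty "user.home") (subs path 1))
    path))

;; Hypothetical stand-in for the proposed file function.
(defn file-sketch [path]
  (File. ^String (expand-home path)))

;; Hypothetical recursive delete: children first, then the entry itself.
;; Returns true if the final delete succeeded.
(defn delete-recursively-sketch [^File f]
  (when (.isDirectory f)
    (doseq [child (.listFiles f)]
      (delete-recursively-sketch child)))
  (.delete f))
```

In a test, the recursive delete would typically run in a teardown step against a scratch directory that the test itself created.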

I welcome discussion about this proposal. Do you think it's necessary? Are
there any functions we should leave out? Any others we should promote from
contrib?

-Phil

io.clj

Konrad Hinsen

Jan 1, 2010, 4:40:24 AM
to clo...@googlegroups.com
On 01.01.2010, at 03:58, Phil Hagelberg wrote:

> I welcome discussion about this proposal. Do you think it's
> necessary? Are
> there any functions we should leave out? Any others we should
> promote from
> contrib?

I am very much in favour of a clojure.io library. It's difficult to do
much useful computing without I/O, and what there is in clojure.core
is insufficient.

Your selection covers all I need and more, so it's fine with me. My
only suggestion is to move all I/O-related functions from clojure.core
to clojure.io for consistency.

Konrad.

Stefan Kamphausen

Jan 1, 2010, 6:27:43 AM
to Clojure
Hi,


On Jan 1, 3:58 am, Phil Hagelberg <p...@hagelb.org> wrote:
> I've been looking over our use of contrib in our large-ish project
> at work. About 90% of the invocations of contrib functions are
> I/O-related.

evil mutable little bastards, these files ;-)

> I wonder if it would be a good idea to include a clojure.io
> namespace in Clojure itself.

inc

> I've prototyped this in the attached file. Most of the functions come
> from contrib's duck-streams library. I've taken everything from
> duck-streams except read-lines, file-str, make-parents, and pwd.
>

> read-lines was left out since ...

Sounds reasonable.

> I welcome discussion about this proposal. Do you think it's
> necessary? Are there any functions we should leave out?
> Any others we should promote from contrib?

I don't know whether it's in contrib already but I'd very much
appreciate a few macros lifting the burden of Java's io-lib with
Reader, BufferedReader, File, PushbackReader, WhateverReader
and ...Writers. Coming from the Lisp world this feels rather ugly to
my fingers.

Kind regards,
Stefan

Sean Devlin

Jan 1, 2010, 10:49:44 AM
to Clojure
Phil,
Overall I think this is a good idea, but I get the feeling duck-
streams isn't quite ready, at least not today. However, this isn't to
say that it couldn't be ready if we worked hard on it over the next
few months.

Here are some things to look into, off the top of my head:

1. I'd recommend adding support for general unix file utilities.
I've written some of them myself, and you can review/borrow/steal code
from here:

http://github.com/francoisdevlin/devlinsf-clojure-utils/blob/master/src/lib/sfd/file_utils.clj

2. duck-streams works very well for text, but not binary formats. I
know there was discussion about it here:

http://groups.google.com/group/clojure/browse_thread/thread/416ca90d3ce2fa3/d64648f34e5c8668

3. There should be a brainstorming session to see what objects reader/
writer will dispatch over, and the multimethods improved accordingly.
This should probably happen anyway.

Also, there's a bigger question of how Clojure will support IO. This
becomes very platform specific, at least part of it does. Rich/the
community will have to decide how to handle IO on multiple platforms,
or to stick w/ the JVM for now.

I know what I've described is a lot of work, but it's what is involved
with creating a really well polished library. I think in the end this
will be a library we're all proud of. Thanks for bringing the idea
up :)

Sean

Stuart Sierra

Jan 1, 2010, 2:34:47 PM
to Clojure
On Dec 31 2009, 9:58 pm, Phil Hagelberg <p...@hagelb.org> wrote:
> I wonder if it would be a good idea to include a clojure.io
> namespace in Clojure itself. I've mentioned the idea a few times on IRC,
> and people seemed to be very much in favour.

I've considered this too, but I know Rich Hickey has plans for a
dedicated I/O library. It's been on http://clojure.org/todo for a
while. Details are sketchy, but presumably it would be functional,
stream-based, and thread-safe.

The I/O contrib libs are mostly convenience wrappers around the
java.io classes and do not represent a well-thought-out API. (I know
because I wrote some of them.)

-SS

Phil Hagelberg

Jan 1, 2010, 4:34:07 PM
to clo...@googlegroups.com
Sean Devlin <francoi...@gmail.com> writes:

> 1. I'd recommend adding support for general unix file utilities.
> I've written some of them myself, and you can review/borrow/steal code
> from here:
>
> http://github.com/francoisdevlin/devlinsf-clojure-utils/blob/master/src/lib/sfd/file_utils.clj

These are all one-line wrappers around Java methods. I know Rich has
stated that the Clojure philosophy is to avoid such wrappers unless they
provide significant additional value. Perhaps this will need to change
as people start to write more cross-platform code, but I don't know if
we're to that point yet.

> 2. duck-streams works very well for text, but not binary formats. I
> know there was discussion about it here:
>
> http://groups.google.com/group/clojure/browse_thread/thread/416ca90d3ce2fa3/d64648f34e5c8668

I believe it's the case that reader and writer don't work for binary
formats, but copy should. Copying between two files or between a file
and a byte array should work fine unless I'm missing something. This
should be better-documented though.
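
As a rough illustration of why a byte-oriented copy is safe for binary data while reader and writer are not, the sketch below pumps raw bytes through a fixed buffer with no character decoding. copy-bytes is a hypothetical stand-in, not the copy function from duck-streams, and the 4096-byte buffer size is an arbitrary assumption.

```clojure
(import '(java.io ByteArrayInputStream ByteArrayOutputStream
                  InputStream OutputStream))

;; Hypothetical stand-in for copy: moves raw bytes from in to out.
;; No Reader or Writer is involved, so no charset conversion can
;; alter the data.
(defn copy-bytes [^InputStream in ^OutputStream out]
  (let [buf (byte-array 4096)]
    (loop []
      (let [n (.read in buf)]
        (when (not= n -1)          ; -1 signals end of stream
          (.write out buf 0 n)
          (recur))))))

;; Round trip through in-memory streams, including bytes that are not
;; valid text in most encodings.
(def src (byte-array [(byte -1) (byte 0) (byte 127) (byte -128)]))
(def sink (ByteArrayOutputStream.))
(copy-bytes (ByteArrayInputStream. src) sink)
```

For the file-to-byte-array case, a java.io.FileInputStream could be passed as the input and a ByteArrayOutputStream as the output.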

> 3. There should be a brainstorming session to see what objects reader/
> writer will dispatch over, and the multimethods improved accordingly.
> This should probably happen anyway.

It covers everything I can think of, but if you've got suggestions let's
hear them.

-Phil

Kevin Downey

Jan 1, 2010, 5:14:54 PM
to clo...@googlegroups.com
I think something more abstract would be good. A function or macro
where you pass it an "IO Spec" and it takes care of all the class
stuff.

(io/read [:bytes :from <SOMETHING> :as p]
  (do-stuff-with-a-byte p))

(io/read [:lines :from <SOMETHING> :as p]
  (do-stuff-with-a-string p))

;; no :as binding or body: results in a lazy-seq of lines
(io/read [:lines :from <SOMETHING>])

so you can specify the IO behavior you want, and let the
implementation map the spec to a class.

that being said, you would still want something like duckstreams for
interop, and possibly io/read and io/write would be implemented using
duckstreams. So I see no harm in pulling duckstreams in.
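
One way the spec idea could be prototyped is a macro that hands the unit keyword to a multimethod, which then picks the Java class. This is only a sketch under assumptions: io-read and read-units are hypothetical names, the source is assumed to already be a java.io.Reader or InputStream, and the body-less form that returns a lazy seq is left out.

```clojure
(import '(java.io BufferedReader ByteArrayInputStream InputStream
                  Reader StringReader))

;; Hypothetical dispatch point: calls f on each unit read from source.
(defmulti read-units (fn [unit source f] unit))

;; :lines wraps the source in a BufferedReader and closes it when done.
(defmethod read-units :lines [_ ^Reader source f]
  (with-open [rdr (BufferedReader. source)]
    (doseq [line (line-seq rdr)]
      (f line))))

;; :bytes reads one byte at a time; f receives an integer from 0 to 255.
(defmethod read-units :bytes [_ ^InputStream source f]
  (with-open [in source]
    (loop [b (.read in)]
      (when-not (neg? b)
        (f b)
        (recur (.read in))))))

;; Hypothetical macro accepting the spec vector from the message above.
(defmacro io-read [[unit & {:keys [from as]}] & body]
  `(read-units ~unit ~from (fn [~as] ~@body)))

;; Usage: print each line of an in-memory source.
(io-read [:lines :from (StringReader. "first\nsecond") :as p]
  (println p))
```

Supporting a new unit in this design means adding one defmethod; callers never name a Reader or stream class themselves.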

P.S. scopes are great for the lazy-seq case

--
And what is good, Phaedrus,
And what is not good—
Need we ask anyone to tell us these things?

Steven E. Harris

Jan 1, 2010, 7:51:22 PM
to clo...@googlegroups.com
Kevin Downey <red...@gmail.com> writes:

> (io/read [:lines :from <SOMETHING> :as p]
> (do-stuff-with-a-string p))
>
> (io/read [:lines :from <SOMETHING>]) ;no :as binding or body, results
> in a lazy-seq of lines

And it's important to specify whether "p" in the former, or the head item
in the sequence in the latter, remains valid after stepping to the
next line. Though obviously easier to use, it annoys me when a
line-by-line library function forces fresh allocation for every line.

I'd like to see an option to reuse a buffer that holds the
most-recently-read line, and stepping to the next line overwrites that
buffer. The line offered is a read-only /view/ of the mutable
buffer. "Saving" a line for future use hence requires an explicit copy
-- something that's not likely idiomatic in Java or Clojure.
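
A minimal sketch of the buffer-reuse idea, under stated assumptions: each-line-reusing is a hypothetical name, lines are split on \n only, and the callback receives the StringBuilder itself rather than a truly read-only view.

```clojure
(import '(java.io Reader StringReader))

;; Hypothetical sketch: one StringBuilder is cleared and refilled for
;; every line, so no per-line String is allocated unless the callback
;; asks for one with (str view).
(defn each-line-reusing [^Reader rdr f]
  (let [buf (StringBuilder.)]
    (loop [c (.read rdr)]
      (cond
        ;; end of input: flush a final line that had no newline
        (neg? c) (when (pos? (.length buf)) (f buf))
        ;; newline: hand over the buffer, then reuse it
        (= c 10) (do (f buf)
                     (.setLength buf 0)
                     (recur (.read rdr)))
        :else    (do (.append buf (char c))
                     (recur (.read rdr)))))))

;; Saving a line requires an explicit copy via str.
(def kept (atom []))
(each-line-reusing (StringReader. "ab\ncd\n")
                   (fn [view] (swap! kept conj (str view))))
@kept   ; => ["ab" "cd"]
```

Holding on to the view itself instead of a copy would leave every saved entry pointing at the same buffer, whose contents change as reading proceeds.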

--
Steven E. Harris

Alexander Kjeldaas

Jan 1, 2010, 9:32:19 PM
to clo...@googlegroups.com
My 2c.

In any clojure.io library, make sure none of the warts that are planned to be fixed in NIO2 are codified.

JDK7 includes work on Path, large directory traversal, event notifications etc.

Alexander


Sean Devlin

Jan 1, 2010, 10:42:10 PM
to Clojure
On Jan 1, 4:34 pm, Phil Hagelberg <p...@hagelb.org> wrote:

> Sean Devlin <francoisdev...@gmail.com> writes:
> > 1.  I'd recommend adding support for general unix file utilities.
> > I've written some of them myself, and you can review/borrow/steal code
> > from here:
>
> >http://github.com/francoisdevlin/devlinsf-clojure-utils/blob/master/s...

>
> These are all one-line wrappers around Java methods. I know Rich has
> stated that the Clojure philosophy is to avoid such wrappers unless they
> provide significant additional value. Perhaps this will need to change
> as people start to write more cross-platform code, but I don't know if
> we're to that point yet.

You missed a small detail. There is a call to the to-file function in
there. This allows the file utilities to work on file objects or
string paths the same way. I like this trick because it lets the fns
work on a broader set of inputs
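
The coercion trick can be sketched as below. to-file-sketch and exists? are hypothetical names chosen for illustration and are not copied from the linked file_utils.clj.

```clojure
(import '(java.io File))

;; Hypothetical coercion: File objects pass through untouched, anything
;; else is treated as a path string.
(defn to-file-sketch [x]
  (if (instance? File x)
    x
    (File. (str x))))

;; A wrapper that is still a single method call, but now accepts either
;; a File or a string path.
(defn exists? [x]
  (.exists ^File (to-file-sketch x)))

(def tmp (System/getProperty "java.io.tmpdir"))
(exists? tmp)           ; string path
(exists? (File. tmp))   ; File object
```

The added value over calling the Java method directly is confined to the coercion step, which is the point under debate in this exchange.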

> > 2.  duck-streams works very well for text, but not binary formats.  I
> > know there was discussion about it here:
>

> >http://groups.google.com/group/clojure/browse_thread/thread/416ca90d3...


>
> I believe it's the case that reader and writer don't work for binary
> formats, but copy should. Copying between two files or between a file
> and a byte array should work fine unless I'm missing something. This
> should be better-documented though.
>
> > 3.  There should be a brainstorming session to see what objects reader/
> > writer will dispatch over, and the multimethods improved accordingly.
> > This should probably happen anyway.
>
> It covers everything I can think of, but if you've got suggestions let's
> hear them.

The java.sql.Clob & Blob interfaces might be nice cases to add
(where possible).

Sean

>
> -Phil

Phil Hagelberg

Jan 2, 2010, 12:27:35 AM
to clo...@googlegroups.com
Stuart Sierra <the.stua...@gmail.com> writes:

> On Dec 31 2009, 9:58 pm, Phil Hagelberg <p...@hagelb.org> wrote:
>> I wonder if it would be a good idea to include a clojure.io
>> namespace in Clojure itself. I've mentioned the idea a few times on IRC,
>> and people seemed to be very much in favour.
>
> I've considered this too, but I know Rich Hickey has plans for a
> dedicated I/O library. It's been on http://clojure.org/todo for a
> while. Details are sketchy, but presumably it would be functional,
> stream-based, and thread-safe.

Hmm... last I heard those features were abandoned or shelved, but I
could be wrong. If there are plans and ideas for a more functional I/O
library in the near future we should be discussing that instead. I'm
just speaking from what's worked for me in the past year.

-Phil

ianp

Jan 1, 2010, 2:11:08 PM
to Clojure
> > I wonder if it would be a good idea to include a clojure.io
> > namespace in Clojure itself.

+1.

> > read-lines was left out since ...

On the basis that it's less painful to add new stuff in later than to
remove stuff, I agree that erring on the side of caution is the correct
approach.

> Overall I think this is a good idea, but I get the feeling duck-
> streams isn't quite ready, at least not today.  However, this isn't to
> say that it couldn't be ready if we worked hard on it over the next
> few months.

Well, this would be a fairly large change so I'm guessing that it
wouldn't hit master until 1.2 at least, which gives us some time to
work with.

> Also, there's a bigger question of how Clojure will support IO.  This
> becomes very platform specific, at least part of it does.  Rich/the
> community will have to decide how to handle IO on multiple platforms,
> or to stick w/ the JVM for now.

I think that sticking with the JVM is the way to go, at least for the
time being. Running atop the JVM is one of Clojure's real strengths,
and not something that we should discard lightly (no offence meant to
the folks working on the CLI port of course!).

Cheers,
Ian.
