The case for a dedicated I/O spec


Hannes Wallnoefer

Sep 3, 2009, 6:43:28 AM
to comm...@googlegroups.com
CommonJS I/O functionality is currently only described in the
"Streams" section of the filesystem API proposal:

https://wiki.mozilla.org/ServerJS/Filesystem_API/A

However, there are many places other than file access where stream
based I/O plays a central role. Networking comes to mind (both generic
sockets and higher level protocols such as HTTP), and Narwhal has
introduced a BinaryIO class that wraps a stream interface around a
ByteArray buffer. Having a unified API for all these domains would be
highly desirable.

Currently, neither module nor class names are specified for CommonJS
I/O functionality. I propose to create a dedicated I/O spec, starting
with the streams section of the file system proposal and fill in the
missing pieces. If there's a rough consensus for this on the list, I
volunteer to do the initial wiki editing.

Hannes

Wes Garland

Sep 3, 2009, 10:04:10 AM
to comm...@googlegroups.com
Hannes:

You are exactly on target.

I have been planning to implement generic streams around Filesystem/A, but simply have not had time. 

I believe what's in Filesystem/A is sound. For the move to generic streams, my mental notes include:
  - single Stream class
  - properties/methods are not defined unless they can be used (e.g. no seek method on sockets, no write method on read-only streams, etc.)
  - seenEOF property   (true if we've tried to read and could not because of EOF, or select()=yes, read()=0 on stream, etc)
  - canRead/canWrite implemented with select() or similar
  - partial writes required
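
A minimal sketch of how these notes could fit together. It is an illustration, not code from the thread or from Filesystem/A: `GenericStream`, `readOnlySource` and `limitedSink` are hypothetical names, and the backends are in-memory arrays standing in for files or sockets. canRead/canWrite are left out because they would need select() or similar from the host.

```javascript
// Hypothetical sketch: methods exist only when the backend supports them.
function GenericStream(backend) {
  var self = this;
  this.seenEOF = false;

  if (typeof backend.read === "function") {
    // read(n) returns up to n items; an empty result marks end of stream.
    this.read = function (n) {
      var chunk = backend.read(n);
      if (chunk.length === 0) self.seenEOF = true;
      return chunk;
    };
  }
  if (typeof backend.write === "function") {
    // Partial writes: return how many items were actually accepted.
    this.write = function (data) {
      return backend.write(data);
    };
  }
  if (typeof backend.seek === "function") {
    this.seek = function (pos) { backend.seek(pos); };
  }
}

// A read-only, non-seekable source (think of a socket's receive side).
function readOnlySource(items) {
  var pos = 0;
  return {
    read: function (n) {
      var chunk = items.slice(pos, pos + n);
      pos += chunk.length;
      return chunk;
    }
  };
}

// A sink that accepts at most `capacity` items per write call.
function limitedSink(capacity) {
  var stored = [];
  return {
    stored: stored,
    write: function (data) {
      var accepted = data.slice(0, capacity);
      Array.prototype.push.apply(stored, accepted);
      return accepted.length;
    }
  };
}

var input = new GenericStream(readOnlySource([1, 2, 3]));
console.log(typeof input.write); // prints undefined: no write on a read-only stream
```

Callers then test for a capability (`typeof stream.seek === "function"`) instead of checking a subclass.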

Wes
--
Wesley W. Garland
Director, Product Development
PageMail, Inc.
+1 613 542 2787 x 102

Kris Kowal

Sep 3, 2009, 5:19:51 PM
to comm...@googlegroups.com
On Thu, Sep 3, 2009 at 3:43 AM, Hannes Wallnoefer<han...@helma.at> wrote:
> Currently, neither module nor class names are specified for CommonJS
> I/O functionality. I propose to create a dedicated I/O spec, starting
> with the streams section of the file system proposal and fill in the
> missing pieces. If there's a rough consensus for this on the list, I
> volunteer to do the initial wiki editing.

Thank you, it's definitely time to formalize an "io" module. It would
be nice to be able to reference the IO spec in my next draft of the
file system API instead of including it.

My only concern with the IO module is that the file.open method should
be the exclusive gatekeeper for the privilege of creating file-based
streams. I think that it needs to be clear that constructing raw
streams from files is a feature that would not be available in a
sandbox. That's my reasoning behind keeping the stream API abstract
and not specifying any IO constructors. There obviously *will* be
constructors for these objects, but I'm content to leave those as an
implementation detail. To be clear about my reasoning: in any secure
system there will be two layers. The outer layer is privileged; it is
capable of constructing IO streams and has a way to instantiate a
hardened copy of "open" and the file API. The inner layer receives the
secured file API and the IO APIs.

I think this can be sufficiently addressed with two sections. One
that specifies the interfaces, and another that specifies the
constructors for those interfaces but also specifies that they may not
be available.
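
The two layers can be sketched as follows. This is a rough illustration under stated assumptions, not an API from any proposal: `rawOpen`, `makeSandboxedOpen` and `untrustedCode` are hypothetical names, and the "file system" is a plain object so the example runs anywhere.

```javascript
// Privileged outer layer: the "file system" here is just an object.
var files = { "/sandbox/notes.txt": "hello", "/etc/passwd": "root:x:0:0" };

// Hypothetical privileged primitive: builds a raw read stream for any path.
function rawOpen(path) {
  if (!Object.prototype.hasOwnProperty.call(files, path)) {
    throw new Error("no such file: " + path);
  }
  var content = files[path];
  var pos = 0;
  return {
    read: function (n) {
      var chunk = content.slice(pos, pos + n);
      pos += chunk.length;
      return chunk;
    }
  };
}

// Returns a hardened "open" that only reaches paths under `root`.
function makeSandboxedOpen(root) {
  var prefix = root.charAt(root.length - 1) === "/" ? root : root + "/";
  return function open(path) {
    // Reject anything outside the root, including ".." segments.
    if (path.indexOf(prefix) !== 0 || path.split("/").indexOf("..") !== -1) {
      throw new Error("access denied: " + path);
    }
    return rawOpen(path);
  };
}

// Inner layer: receives only the hardened function, never rawOpen.
function untrustedCode(open) {
  return open("/sandbox/notes.txt").read(5);
}

var sandboxedOpen = makeSandboxedOpen("/sandbox");
console.log(untrustedCode(sandboxedOpen)); // prints hello
```

The inner layer still gets ordinary stream objects; what it never sees is a constructor that can turn an arbitrary path into a stream.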

Kris Kowal

Wes Garland

Sep 3, 2009, 9:05:31 PM
to comm...@googlegroups.com
On Thu, Sep 3, 2009 at 5:19 PM, Kris Kowal <cowber...@gmail.com> wrote:
> My only concern with the IO module is that the file.open method should
> be the exclusive gatekeeper for the privilege of creating file-based
> streams.

I personally would like to see a unified io module which can do things like  [precise syntax/names unimportant]:

var stream1 = new io.Path("/etc/passwd").open("r");
var stream2 = new io.TCPIP_Socket("any/0", 80).listen("r+");
var stream3 = new io.TCPIP_Socket("127.0.0.1", 80).connect("r+");
var stream4 = new io.UNIX_Socket("/dev/log").open("w");
var stream5 = new io.Stream(0); /* file descriptor */

In order for these to all work -- all the constructors return something which is instanceof Stream -- the rules aren't really hard.  Everything that's currently laid out in filesystem/A works, provided we use duck typing to indicate stream capabilities.

Then, to do nice sandboxing, we can pass in either instantiated Streams or the constructors. For example, if you want a sandbox to be able to make TCP/IP sockets, you give the sandbox the TCPIP_Socket constructor, but not the Path constructor.

This is really useful, because once all the streams start behaving the same, we have the same reusability that we get with file descriptors in UNIX -- code need not care about the type of the stream, only its capabilities, which are duck-typed in.
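
As a rough illustration of that reusability, here is a copy routine that checks only for `read` and `write` and never asks what kind of stream it was given. `copy`, `stringReader` and `collectingWriter` are hypothetical names, and the assumed conventions (an empty read means end of stream, write returns the count accepted) are this sketch's own, not taken from Filesystem/A.

```javascript
// Copies everything from `source` to `sink`, caring only about capabilities.
function copy(source, sink) {
  if (typeof source.read !== "function") throw new TypeError("source is not readable");
  if (typeof sink.write !== "function") throw new TypeError("sink is not writable");
  var total = 0;
  for (;;) {
    var chunk = source.read(4);
    if (chunk.length === 0) break; // end of stream
    // Honour partial writes: retry with whatever was not accepted.
    while (chunk.length > 0) {
      var written = sink.write(chunk);
      if (written === 0) throw new Error("sink accepted no data");
      chunk = chunk.slice(written);
      total += written;
    }
  }
  return total;
}

// Stand-in for any readable stream (file, socket, buffer).
function stringReader(text) {
  var pos = 0;
  return {
    read: function (n) {
      var chunk = text.slice(pos, pos + n);
      pos += chunk.length;
      return chunk;
    }
  };
}

// Stand-in for any writable stream; accepts at most maxPerCall characters.
function collectingWriter(maxPerCall) {
  var parts = [];
  return {
    write: function (data) {
      var accepted = data.slice(0, maxPerCall);
      parts.push(accepted);
      return accepted.length;
    },
    contents: function () { return parts.join(""); }
  };
}

var sink = collectingWriter(3);
var copied = copy(stringReader("hello world"), sink);
console.log(copied, sink.contents()); // prints 11 hello world
```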

Wes

Daniel Friesen

Sep 3, 2009, 11:34:06 PM
to comm...@googlegroups.com
I've been playing around with similar ideas in MonkeyScript for a while.

stream1 is similar to my File object:
var stream1 = (new File("/etc/passwd")).open("r");
When I think about it my "File" is a lot like a strict path (as opposed
to FilePath which is based off of an abstract path system). I could
consider merging my File, Directory, and FilePath into one class.
(Though the name File is attractive since it matches up with w3c's api;
And I feel strongly on FilePath over simply Path as I already am working
on a generic path system which can be extended into subsystems for
various purposes; FilePath, URLPath (the path portion of a url),
mutating paths from a http request, etc...)
For sockets, don't mind it being a little old: (And I think I meant
`"tcp"` not `tcp`)
http://draft.monkeyscript.org/api/io/Socket.html
I'm used to address/port being on bind/connect, but that's probably just
a show of hands thing.
I didn't understand enough about UNIX Sockets to jump into writing about
them in the api.

My Stream is a little different than that there. A Stream as I've been
implementing is basically a dumb wrapper (in fact I'm implementing it in
pure js with no Rhino dependency). The Stream class itself knows
absolutely nothing about how to read anything from anywhere, write, or
whatever.
You construct a Stream passing it an object, that object contains
certain keys like contentConstructor (required), read:, write:, etc...
These keys have extremely simple (and fairly flexible) apis that do the
most basic operations for read/write, etc...
Rather than a static definition of *Reader must have these methods,
etc... It's based on what you pass to it. If you tell it how to read:
the resulting stream will have .read and .skip, If you tell it how to
write: the resulting stream will have .write, If you tell it how to
.seek (haven't completely thought about this one yet) you'll be able to
.seek the stream.
The values you pass it don't become the actual end .read, etc...
methods. Those are implemented in the Stream class itself, they make use
of your raw methods to do the rest of the work. As an example:
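function BufferStream(buf) {
    return new Stream({
        contentConstructor: buf.contentConstructor,
        read: function(len, bufNoSkip) {
            // Nothing left to read: signal EOF to Stream
            if ( this.position >= buf.length )
                throw Stream.EOF;
            // Skip request: report the length skipped instead of returning data
            if ( !bufNoSkip )
                return len;
            // slice() takes an end index, not a length
            return buf.slice(this.position, this.position + len);
        },
        ...
    });
}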
read's first parameter is a length;
- This may be a number > 1 of bytes/chars to read, or Infinity if
.read() or .read(Infinity) was used.
- If Infinity is used and you don't have everything available (ie:
You're handling a real stream rather than doing something like that
BufferStream) then you can simply change it to a number to use as a max.
-- You don't have to do any buffering; Stream handles buffering for
.read() on its own. Remember that .read(Infinity) is not the same as
.read(), so that would be a bad idea anyway.
read's second parameter is a little weird, but seems to make sense once
I explain it. I could never come up with a good name and called it bufNoSkip.
- For normal reading it will be truthy, for skips it will be falsy
-- This means if you have an optimized way of skipping you can look for
if( !bufNoSkip ) and return the length skipped. Otherwise you can just
return data like normal and Stream will handle discarding the data and
incrementing position by itself.
- If read is buffering (it already has a buffer) then bufNoSkip will be
a buffer object (truthy). You can return data and Stream will handle
adding it to the buffer on its own, or if you have a more efficient way
to do it, you can append directly to the buffer itself and instead
return the length that you wrote into the buffer.
- Returning the respective sequence type (String/Blob) with .length == 0
is considered EOF. You see a Stream.EOF there (I'm thinking of
switching to return rather than throw). That's NOT something that gets
thrown to the user; it's a helper. Returning/throwing that is merely a
signal to Stream that EOF has been reached but you don't already have a
proper sequence type. The stream will simply create an empty String or
empty Blob, so you can write something abstract.

So a read can be as simple as reading data and returning it, yet it is
flexible enough to allow optimizations for skipping and reading directly
into buffers. And the user gets the comfortable .read API we've defined.
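
For readers who want to see the shape of such a wrapper, here is a small sketch based only on the description above. It is not the MonkeyScript implementation: internal buffering and the case where bufNoSkip is a buffer object are left out, `StringStream` is a hypothetical helper, and Stream.EOF is modelled as a plain sentinel object.

```javascript
// Sketch of a closure-based wrapper: public methods are built from raw handlers.
function Stream(spec) {
  var self = this;
  this.position = 0;

  // Calls the raw handler; a thrown Stream.EOF becomes an empty sequence.
  function rawRead(len, bufNoSkip) {
    try {
      return spec.read.call(self, len, bufNoSkip);
    } catch (e) {
      if (e === Stream.EOF) return spec.contentConstructor();
      throw e;
    }
  }

  if (typeof spec.read === "function") {
    this.read = function (len) {
      var data = rawRead(len === undefined ? Infinity : len, true);
      self.position += data.length;
      return data;
    };
    this.skip = function (len) {
      var result = rawRead(len, false);
      // A number means the handler skipped by itself; otherwise discard the data.
      var skipped = typeof result === "number" ? result : result.length;
      self.position += skipped;
      return skipped;
    };
  }
  if (typeof spec.write === "function") {
    this.write = function (data) { return spec.write.call(self, data); };
  }
}
Stream.EOF = {}; // sentinel, never shown to the user

// Hypothetical example source: a stream over an in-memory string.
function StringStream(text) {
  return new Stream({
    contentConstructor: String,
    read: function (len, bufNoSkip) {
      if (this.position >= text.length) throw Stream.EOF;
      var end = Math.min(text.length, this.position + len);
      if (!bufNoSkip) return end - this.position; // optimized skip
      return text.slice(this.position, end);
    }
  });
}

var s = StringStream("abcdef");
console.log(s.read(2)); // prints ab
```

Because no write handler was passed, `s.write` is simply absent, which is the duck typing discussed earlier in the thread.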


My Stream system does actually fit in with what you're talking about.
instanceof Stream always works. All my .open() like methods are
implemented as something that returns the generic Stream. Capabilities
like .read, .write, .seek are all duck typed in.
As a bonus, it's so generic you can write a Stream for almost ANYTHING
with a simple chunk of code that tells Stream how to do a few raw actions.
((Of course, I'm not saying we should ALL go and use my stream class))

+1 for a generic Stream class.

~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]

Hannes Wallnoefer

Sep 5, 2009, 10:12:15 AM
to CommonJS
I've set up a minimal IO main page:

https://wiki.mozilla.org/ServerJS/IO

I don't want to mindlessly copy over the Streams section from the
Filesystem page, so I'll take some time over the weekend to clean it
up into a proper proposal. I'm with Kris in that raw streams will be
described as an abstract base class or interface, whereas derived classes
such as text or buffered streams will include constructors (wrapping a
raw stream).
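
A minimal sketch of that split, with assumed names throughout: `rawBytes` stands in for whatever the host uses to produce a raw stream (no public constructor is implied), while `TextStream` is a derived class whose constructor wraps any object with a `read` method. Decoding is ASCII-only to keep the example short.

```javascript
// Hypothetical raw stream over an array of byte values.
function rawBytes(bytes) {
  var pos = 0;
  return {
    read: function (n) {
      var chunk = bytes.slice(pos, pos + n);
      pos += chunk.length;
      return chunk;
    }
  };
}

// Derived class with a public constructor that wraps any raw stream.
function TextStream(raw) {
  if (typeof raw.read !== "function") throw new TypeError("raw stream must be readable");
  this.raw = raw;
}

// Reads up to and including the next "\n"; returns "" at end of stream.
TextStream.prototype.readLine = function () {
  var line = "";
  for (;;) {
    var bytes = this.raw.read(1);
    if (bytes.length === 0) break;
    var ch = String.fromCharCode(bytes[0]); // ASCII only in this sketch
    line += ch;
    if (ch === "\n") break;
  }
  return line;
};

var text = new TextStream(rawBytes([104, 105, 10, 121, 111])); // bytes of "hi\nyo"
console.log(JSON.stringify(text.readLine())); // prints "hi\n"
```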

If anyone feels like posting their own proposals please feel free to
do so.

Hannes

On Sep 4, 5:34 am, Daniel Friesen <nadir.seen.f...@gmail.com> wrote:
> I've been playing around with similar ideas in MonkeyScript for awhile.
>
> stream1 is similar to my File object:
> var stream1 = (new File("/etc/passwd")).open("r");
> When I think about it my "File" is a lot like a strict path (as opposed
> to FilePath which is based off of an abstract path system). I could
> consider merging my File, Directory, and FilePath into one class.
> (Though the name File is attractive since it matches up with w3c's api;
> And I feel strongly on FilePath over simply Path as I already am working
> on a generic path system which can be extended into subsystems for
> various purposes; FilePath, URLPath (the path portion of a url),
> mutating paths from a http request, etc...)
> For sockets, don't mind it being a little old: (And I think I meant
> `"tcp"` not `tcp`)
> http://draft.monkeyscript.org/api/io/Socket.html

Daniel Friesen

Sep 5, 2009, 7:47:33 PM
to comm...@googlegroups.com
Think I should throw up my generic closure based Stream class as a
proposal or something for people to mull over in their heads?

~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]

Ash Berlin

Sep 5, 2009, 7:51:46 PM
to comm...@googlegroups.com

On 6 Sep 2009, at 00:47, Daniel Friesen wrote:

>
> Think I should throw up my generic closure based Stream class as a
> proposal or something for people to mull over in their heads?
>
> ~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]

Feel free to post it as an interface, but specs should not include
implementation details, just behavioral+interface

Daniel Friesen

Sep 5, 2009, 8:10:09 PM
to comm...@googlegroups.com
Would the:
    new Stream({
        contentConstructor: String,
        read: function(len, bufNoSkip) {
            ...
        }
    });
definition be behavioral+interface or an implementation detail?

~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]

George Moschovitis

Sep 6, 2009, 4:09:37 AM
to CommonJS
> I/O functionality. I propose to create a dedicated I/O spec, starting
> with the streams section of the file system proposal and fill in the
> missing pieces.

+1