JSGI "Stream" extension


Isaac Z. Schlueter

Jan 28, 2010, 8:53:36 PM
to CommonJS
Hey, CommonJS, what do you think of this?

http://wiki.commonjs.org/wiki/JSGI/StreamExtension

Wanna share your opinions? I know you got em!

--i

mob

Jan 28, 2010, 11:01:20 PM
to CommonJS

Quick thoughts ...

- How do you read from input? There is no read method.
- Streams should be capable of full duplex (read + write). This spec
may not use it, but Streams should be capable of it.
- How do you know when you can write without blocking? Need a
writable event and perhaps rename "data" to readable.
- Why do you have to mandate deferred execution? A more relaxed
definition of 'non-recursive' should suffice; it makes
implementation easier and permits non-queued implementations.
- Need removeListener for completeness.
- Is there a definition of how to fire an event? Specifically, is
there an Emitter definition?

I think you are on the right track. Way back when we were trying to
get async JSGI 0.3 a few months ago, I mentioned that streams were a
possible answer. We've implemented something close to this, but we
found we needed a bit more power than this proposal currently has. We
needed more control over the actual eventing to prevent blocking.
Pausing events was not really sufficient. Also, the semantics of read
and write with respect to blocking and short returns need to be
closely specified.

- mob

Daniel Friesen

Jan 29, 2010, 12:14:42 AM
to comm...@googlegroups.com
Take a look at my HTTP-Gateway plan as well.
http://wiki.accessjs.org/wiki/HTTP-Gateway
Erm... *cough* Yes, on two separate spurs of wishful or idealistic
thinking I ended up purchasing both accessjs.org and standardjs.org...

~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]

Ryan Dahl

Jan 29, 2010, 1:27:56 AM
to comm...@googlegroups.com

I think a good interface would be this.

1. A request handler function which gets called for each request. It
is passed two arguments, a request object and a response object.

2. The request object is

2A. to receive an HTTP request body, the request object implements a
"readable stream" meaning it has the following:
- event: 'data'
- event: 'eof'
- method: pause()
- method: resume()

2B. also the following
- member: headers
- member: version
- member: url

3. The response object is

3A. to send the response body, the response object implements a
"writable stream" meaning it has the following
- method: write()
- method: close()
- event: 'drain'

3B. also the following
- member: headers
which is an object, but may already contain suggested response headers
when passed to the request handler
- method: begin(statusCode[, reasonPhrase])
which will begin sending the response and its headers.

Ryan Dahl

Jan 29, 2010, 1:37:52 AM
to comm...@googlegroups.com
On Thu, Jan 28, 2010 at 10:27 PM, Ryan Dahl <coldre...@gmail.com> wrote:
> I think an good interface would be this.

For slightly more background on this
http://thread.gmane.org/gmane.comp.lang.javascript.nodejs/1850

> 1. A request handler function which gets called for each request. It
> is passed two arguments, a request object and a response object.
>
> 2. The request object is
>
> 2A. to receive an HTTP request body, the request object implements a
> "readable stream" meaning it has the following:
> - event: 'data'
> - event: 'eof'

Maybe 'end' is better

> - method: pause()
> - method: resume()

This is to throttle the upload. If you're streaming to a socket, or
even to disk, it's possible the request comes in faster than you can
handle it. Calling pause() will tell the underlying TCP socket to stop
pulling in new data.
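
A minimal sketch of the pause()/resume() throttle described above, assuming Node's EventEmitter. The class name SketchReadable and the push() method are hypothetical: push() stands in for the platform delivering bytes, and a real server would stop reading from the TCP socket while paused instead of holding chunks in memory as this sketch does.

```javascript
const EventEmitter = require("events");

// Hypothetical readable stream: 'data' events carry the chunk itself.
class SketchReadable extends EventEmitter {
  constructor() {
    super();
    this.paused = false;
    this.held = []; // chunks that arrived while paused
  }

  // Platform side: a chunk arrived from the network.
  push(chunk) {
    if (this.paused) this.held.push(chunk);
    else this.emit("data", chunk);
  }

  // Consumer side: stop receiving 'data' events.
  pause() {
    this.paused = true;
  }

  // Consumer side: deliver anything held back, then continue as normal.
  resume() {
    this.paused = false;
    while (!this.paused && this.held.length > 0) {
      this.emit("data", this.held.shift());
    }
  }
}
```

A handler that writes each chunk to a slow destination could call pause() when it falls behind and resume() once it has caught up.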

> 2B. also the following
> - member: headers
> - member: version
> - member: url
>
> 3. The response object is
>
> 3A. to send the response body, the response object implements a
> "writable stream" meaning it has the following
> - method: write()
> - method: close()
> - event: 'drain'

write() returns true or false. True if it was able to flush the data
to the socket immediately. False if not. If it wasn't able to, then
the underlying software should buffer the data and send it when
possible. If false was returned, the 'drain' event will be fired when
the buffer waiting to go to the socket is finally flushed.
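
The write()/'drain' contract above can be sketched roughly as follows, assuming Node's EventEmitter. SketchWritable, makeSink, tryWrite() and flush() are hypothetical names: the sink stands in for the socket, and flush() stands in for whatever the platform does when the socket becomes writable again.

```javascript
const EventEmitter = require("events");

// Hypothetical writable stream over a sink exposing tryWrite(chunk) -> boolean.
class SketchWritable extends EventEmitter {
  constructor(sink) {
    super();
    this.sink = sink;
    this.pending = []; // chunks the sink could not accept yet
  }

  // Returns true if the chunk went straight to the sink, false if buffered.
  write(chunk) {
    if (this.pending.length === 0 && this.sink.tryWrite(chunk)) return true;
    this.pending.push(chunk);
    return false;
  }

  // Platform side: the sink can accept data again.
  flush() {
    if (this.pending.length === 0) return; // nothing was waiting, no 'drain'
    while (this.pending.length > 0 && this.sink.tryWrite(this.pending[0])) {
      this.pending.shift();
    }
    if (this.pending.length === 0) this.emit("drain");
  }
}

// Fake sink that accepts `capacity` chunks until `room` is reset.
function makeSink(capacity) {
  const sink = { sent: [], room: capacity };
  sink.tryWrite = function (chunk) {
    if (sink.room === 0) return false;
    sink.room -= 1;
    sink.sent.push(chunk);
    return true;
  };
  return sink;
}
```

Callers that do not care about buffering can ignore the return value; callers streaming large bodies can stop writing on false and continue from a 'drain' listener.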

> 3B. also the following
> - member: headers
> which is an object, but may already contain suggested response headers
> when passed to the request handler
> - method: begin(statusCode[, reasonPhrase])
> which will begin sending the response and its headers.

Maybe 'start' instead of 'begin'? Maybe 'open'?

mob

Jan 29, 2010, 1:49:13 AM
to CommonJS
> I think an good interface would be this.
>
> 1. A request handler function which gets called for each request. It
> is passed two arguments, a request object and a response object.

Ejscript uses one object for both request/response. ie. a single
Request object. This is also a full-duplex stream. ie. you can read
incoming body data and write outgoing data on the one object. It is
much easier to pass round just one object and full-duplex streams work
very nicely.

>
> 2. The request object is
>
> 2A. to receive an HTTP request body, the request object implements a
> "readable stream" meaning it has the following:
> - event: 'data'
> - event: 'eof'
> - method: pause()
> - method: resume()

I presume data means readable? If so, do you also have a read() method
to read the incoming data?
I would also assume that if you get a data/readable event, then read()
is defined not to block?

>
> 2B. also the following
> - member: headers
> - member: version
> - member: url

That is the bare minimum. It gets a bit more complex when middleware
starts routing and changing the url components (pathInfo and
scriptName)

>
> 3. The response object is
>
> 3A. to send the response body, the response object implements a
> "writable stream" meaning it has the following
> - method: write()
> - method: close()
> - event: 'drain'

I think a better model is to only write in response to a "writable"
event. The node paradigm where you can write and it will just buffer
all written data is not ideal. The process can grow and you have to
manually back off. A better evented model is using "writable" events.

What does "drain" do?

>
> 3B. also the following
> - member: headers
> which is an object, but may already contain suggested response headers
> when passed to the request handler
> - method: begin(statusCode[, reasonPhrase])
> which will begin sending the response and its headers.

So middleware can modify both status and headers, you need to have
separate properties for both that are readable and writable.

-mob

mob

Jan 29, 2010, 1:51:57 AM
to CommonJS
> > - method: pause()
> > - method: resume()
>
> This is to throttle the upload. If you're streaming to a socket, or
> even to disk, it's possible the request comes in faster than you can
> handle it. Calling pause() will tell the underlying TCP socket to stop
> pulling in new data.

Why not just not read it? If you get readable (data) events when
incoming data is available, you just respond to those events as data
arrives. If the data comes too fast, just ignore it and read it later
when you are ready. The eventing system should only issue readable/
data events when new data comes in.

This is simpler than having to manually throttle incoming data. Same
for output.

-mob

Ryan Dahl

Jan 29, 2010, 2:12:35 AM
to comm...@googlegroups.com
On Thu, Jan 28, 2010 at 10:49 PM, mob <m...@embedthis.com> wrote:
>> I think an good interface would be this.
>>
>> 1. A request handler function which gets called for each request. It
>> is passed two arguments, a request object and a response object.
>
> Ejscript uses one object for both request/response. ie. a single
> Request object. This is also a full-duplex stream. ie. you can read
> incoming body data and write outgoing data on the one object. It is
> much easier to pass round just one object and full-duplex streams work
> very nicely.

Yeah. That might be okay. Something like this?

  function (r) {
    setTimeout(function () {
      r.responseHeaders["Content-type"] = "text/plain";
      r.beginResponse(200);
      r.write("hello world\n");
      r.close();
    });
  }

>> 2. The request object is
>>
>> 2A. to receive an HTTP request body, the request object implements a
>> "readable stream" meaning it has the following:
>> - event: 'data'
>> - event: 'eof'
>> - method: pause()
>> - method: resume()
>
> I presume data means readable?

Yes. req.addListener('data', function (data) { puts(data); });

> If so, do you also have a read() method to read the incoming data?

No.

> Why not just not read it? If you get readable (data) events when
> incoming data is available, you just respond to those events as data
> arrives. If the data comes too fast, just ignore it and read it later
> when you are ready. The eventing system should only issue readable/
> data events when new data comes in.
>
> This is simpler than having to manually throttle incoming data. Same
> for output.

I disagree. This models better what is actually happening. Data
arrives, you deal with it.

>> 2B. also the following
>> - member: headers
>> - member: version
>> - member: url
>
> That is the bare minimum. It gets a bit more complex when middleware
> starts routing and changing the url components (pathInfo and
> scriptName)

That's not necessary and almost nobody uses them.

>> 3. The response object is
>>
>> 3A. to send the response body, the response object implements a
>> "writable stream" meaning it has the following
>> - method: write()
>> - method: close()
>> - event: 'drain'
>
> I think a better model is to only write in response to a "writable"
> event. The node paradigm where you can write and it will just buffer
> all written data is not ideal. The process can grow and you have to
> manually back off. A better evented model is using "writable" events.
>
> What does "drain" do?

Most people don't want to worry about buffering data. The people who
do can use the 'drain'.

>> 3B. also the following
>> - member: headers
>> which is an object, but may already contain suggested response headers
>> when passed to the request handler
>> - method: begin(statusCode[, reasonPhrase])
>> which will begin sending the response and its headers.
>
> So middleware can modify both status and headers, you need to have
> separate properties for both that are readable and writable.

response.statusCode - sure.
I question the usefulness of this though. Only one "middleware" can
begin the response - I imagine that one does know the status code.

mob

Jan 29, 2010, 2:34:18 AM
to CommonJS
> Yeah. That might be okay. Something like this?
>
>   function (r) {
>     setTimeout(function () {
>       r.responseHeaders["Content-type"] = "text/plain";
>       r.beginResponse(200);
>       r.write("hello world\n");
>       r.close();
>     });
>   }

Yes. You can just put the two objects together into one.

> > If so, do you also have a read() method to read the incoming data?
>
> No.
>
>

> I disagree. This models better what is actually happening. Data
> arrives, you deal with it.

Huh? How does it arrive? We use the same notification event "data" or
"readable", but we read from a stream to get it. The data has already
arrived as you say, but read() is the API to extract it into wherever
you want to put it.

How do you propose to "deal with it"? What is the API?

>
> > What does "drain" do?
>
> Most people don't want to worry about buffering data. The people who
> do can use the 'drain'.

If you just write and have the platform absorb all the data, you're not
really event based. You're really just trading memory for time.
You need to be deterministic with memory too.

A better model is to have a writable event and to also buffer what the
user writes. That way, as you say, the user doesn't have to worry
about buffering if they don't choose to. But when you are downloading
say a 1GB file, you can just respond to writable events and write a
block per writable event.

Like this:

  request.addListener("writable", function(event, rq) {
    // getMoreData() is the app's own helper that returns the next block
    var data = getMoreData();
    rq.write(data);
  });


> > So middleware can modify both status and headers, you need to have
> > separate properties for both that are readable and writable.
>
> response.statusCode - sure.
> I question the usefulness of this though. Only one "middleware" can
> begin the response - I imagine that one does know the status code.

Not a big point or issue, but downstream middleware can modify status
and headers. The final stage in the chain defines the definitive
status response code and headers.

-mob

Ryan Dahl

Jan 29, 2010, 2:47:41 AM
to comm...@googlegroups.com
>> > If so, do you also have a read() method to read the incoming data?
>>
>> No.
>>
>> I disagree. This models better what is actually happening. Data
>> arrives, you deal with it.
>
> Huh?  How does it arrive. We use the same notification event "data" or
> "readable", but we read from a stream to get it. The data has already
> arrived as you say, but read() is the API to extract it into wherever
> you want to put it.

Ah you want to have a "readable" event and then read() it? I think
that might be okay - but it seems too complicated for the users.
They're still going to need a way to stop the "readable" event.

> How do you propose to "deal with it"?  what is the api?

You do something with it. You write it to a file, you print it to a
socket, you parse it. The API is

req.addListener('data', function (data) { ... });

When you get too much you throttle with pause().

>> > What does "drain" do?
>>
>> Most people don't want to worry about buffering data. The people who
>> do can use the 'drain'.
>
> If you just write and have the platform absorb all the data, your not
> really event based. Your really just trading memory for time.
> You need to be deterministic with memory too.
>
> A better model is to have a writable event and to also buffer what the
> user writes. That way, as you say, the user doesn't have to worry
> about buffering if they don't choose to. But when you are downloading
> say a 1GB file, you can just respond to writable events and write a
> block per writable event.
>
> Like this:
>
> request.addListener("writable", function(event, rq) {
>   getMoreData(data)
>   rq.write(data)
> }

This complicates the interface without adding any more functionality.
You now have to add something to start and stop the "writable" events
from occurring.

mob

Jan 29, 2010, 2:49:07 AM
to CommonJS
I'm being a bit dense. I now get it. In your suggestion, the "data"
event supplies the data with the event.
That can work, but it means you have to read it or store it.

The alternative is to make Request a stream, so you read from it
using read(). The advantage is that you can choose to read or not
read the data. You can also potentially have higher performing
interfaces, as you can read data without copying through an event
data parameter. But the big advantage is that you can pass the
Request stream to anything that accepts a stream. For example, to
read lines, you can wrap it with a TextStream. To decompress it, wrap
it with an Uncompress stream, etc. Being a stream is very flexible.
Lastly, you don't ever need to tell the stream to pause or stop
sending data. It will flow control automatically.
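
A rough sketch of this pull model, assuming Node's EventEmitter. PullStream, deliver() and LineReader are hypothetical names; LineReader plays the role of a TextStream-style wrapper. One assumption is baked in: 'readable' fires only on the transition from empty to non-empty, so it does not repeat while unread data is waiting.

```javascript
const EventEmitter = require("events");

// Hypothetical pull-style stream: the event is only a notification,
// and read() extracts the data without blocking.
class PullStream extends EventEmitter {
  constructor() {
    super();
    this.buffer = [];
  }

  // Platform side: new data arrived.
  deliver(chunk) {
    const wasEmpty = this.buffer.length === 0;
    this.buffer.push(chunk);
    if (wasEmpty) this.emit("readable");
  }

  // Returns the next chunk, or null when nothing is buffered.
  read() {
    return this.buffer.length > 0 ? this.buffer.shift() : null;
  }
}

// Hypothetical wrapper: works on any object that has a read() method.
class LineReader {
  constructor(source) {
    this.source = source;
    this.partial = ""; // text after the last newline seen so far
  }

  // Drains the source and returns the complete lines read so far.
  readLines() {
    let chunk;
    while ((chunk = this.source.read()) !== null) {
      this.partial += chunk;
    }
    const parts = this.partial.split("\n");
    this.partial = parts.pop();
    return parts;
  }
}
```

Because the wrapper depends only on read(), the same LineReader could sit on top of a decompressing stream or any other stage in a stack of streams.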

-mob

mob

Jan 29, 2010, 2:52:39 AM
to CommonJS
> Ah you want to have a "readable" event and then read() it? I think
> that might be okay - but it seems too complicated for the users.
> They're still going to need away to stop the "readable" event.

You don't need to stop the readable event. You just make the semantics
that you only get a readable event
when new data arrives. It doesn't repeat.

> > request.addListener("writable", function(event, rq) {
> >   getMoreData(data)
> >   rq.write(data)
> > }
>
> This complicates the interface without adding any more functionality.
> You now have to add something to start and stop the "writable" events
> from occurring.

It adds real value. It makes memory use totally deterministic. The
underlying buffering won't grow out of control.
As with readable, you get one writable event on a transition to a
writable state. You don't need to start or stop them.

Again, the big win is to make it a stream. You can wrap it with a
TextStream, Compress, or other stream wrapper and write
through a stack of streams. That is the real win.

-mob

Kris Zyp

Jan 29, 2010, 9:44:05 AM
to comm...@googlegroups.com, Isaac Z. Schlueter
Thank you for getting this proposal together, Isaac, some great ideas in
there. A few questions/thoughts on this though.
* The proposal is called an "extension" to JSGI. Does this mean it would
work with current specification of JSGI? If so, can a response body be
both forEachable and be a stream at the same time? Or is stream an
alternate possible value of the response's body (it would seem not,
because the spec says response.body MUST be a Stream object)?

* Is this possible to implement on GAE? Obviously this is an important
platform to be able to support. I am not sure, but the normative
requirements of the Timing section seem like they might rule out a
compliant implementation on GAE. If this is just an extension, I suppose
it could just be ignored on those platforms, but I am not sure of the
intent there.

* If this is replacement for current forEach handling of a response
body, it seems this API would force (or at least strongly push towards)
buffering of the entire response body into memory in the case of
synchronous streaming (the very situation it is trying to avoid). For
example, if I was serializing an array of data to JSON in streaming
style, it seems like the most natural way to code with ES-JSGI would be:
  function(request){
    var response = {
      status: 200,
      headers: {},
      body: new request.jsgi.stream()
    };
    dataSet.forEach(function(item){
      response.body.write(JSON.stringify(item));
    });
    response.body.close();
    return response;
  }
In this situation the entire stream of data has to be buffered, since the
JSGI server does not have access to the status and headers (to start the
HTTP response) until the function returns. On the other hand, with a
forEachable body:
  function(request){
    return {
      status: 200,
      headers: {},
      body: {
        forEach: function(write){
          dataSet.forEach(function(item){
            write(JSON.stringify(item));
          });
        }
      }
    };
  }
The forEach sequence does not start until the JSGI server has access to
the status and headers, so this can be streamed without any internal buffering.

* It seems this proposal conflates the concerns of reading and writing
onto the same object/interface, which greatly and unnecessarily
increases the complexity of the interface. As a rough counterproposal
for streams, why couldn't a stream interface consist of the two interfaces:
Stream
- ready(writer) - This is called when the stream is ready to receive
data. This must be called when the stream is initially ready, and may be
called multiple times in the future if and when the stream reaches a state
where it can receive data without adding to internal buffers. The call
must provide a single parameter, a writer object that conforms to the
Writer interface. This should only be called once per event turn.

- pause() - This is called by the reader to indicate that the writer
should refrain from sending data to the stream. This method could even
be optional, only existing if the writer actually supports the ability
to pause.

Writer
- write(data) - This should be called to send data through the stream.

- close() - This indicates that no more data will be sent.

This seems like a much slimmer API, does not require a branded class,
and I believe meets the goal of avoiding buffering of data. It actually
makes it easier to avoid data buffering, as it avoids the problems with
synchronous streaming. The example from above:
  function(request){
    return {
      status: 200,
      headers: {},
      body: {
        ready: function(writer){
          dataSet.forEach(function(item){
            writer.write(JSON.stringify(item));
          });
          writer.close();
        },
        pause: function(){ ... }
      }
    };
  }
Which does not require buffering any data.
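
To make the no-buffering claim concrete, here is a sketch of how a server might drive such a body. The names serve, makeConnection, sendHeader, sendBody and finish are hypothetical and are not part of any JSGI spec; the fake connection only records what would go on the wire.

```javascript
// Hypothetical server-side driver for a body that exposes ready(writer).
function serve(app, request, connection) {
  const response = app(request);
  // Headers go out first, so nothing written afterwards needs buffering.
  connection.sendHeader(response.status, response.headers);
  const writer = {
    write: function (data) { connection.sendBody(data); },
    close: function () { connection.finish(); }
  };
  // Only now is the app told that the stream can receive data.
  response.body.ready(writer);
}

// Fake connection that logs each call in order.
function makeConnection() {
  const log = [];
  return {
    log: log,
    sendHeader: function (status) { log.push("header " + status); },
    sendBody: function (data) { log.push("body " + data); },
    finish: function () { log.push("finish"); }
  };
}

// Sample app in the style of the example above.
const sampleData = [1, 2];
function sampleApp(request) {
  return {
    status: 200,
    headers: {},
    body: {
      ready: function (writer) {
        sampleData.forEach(function (item) {
          writer.write(JSON.stringify(item));
        });
        writer.close();
      }
    }
  };
}
```

Each write() reaches the connection as soon as it is called, because the server does not invoke ready() until it has already read the status and headers from the return value.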

Alternately, if this is an extension, it seems like it would be more
compatible with JSGI if it actually worked with forEach. We have
discussed allowing forEach to return a promise to support asynchronous
streaming of bodies. I think this fits best with the JSGI model, and
properly leverages promises to separate asynchronous and behavioral
concerns. However, one of the great ideas of your proposal is the stream
control functionality, where stream readers can indicate to writers to
slow down/pause, and when to pick things up again (drained or resume).
It seems like this functionality could easily be added to a
promise-returning forEach mechanism, as a truly compatible extension. This
would ultimately be my preference, as forEach meets such a broad range of use
cases, from simply returning an array to complex async streaming
situations, without requiring multiple mental models. I could write that
into a proposal if desired.
Thanks,
Kris

inimino

Jan 29, 2010, 11:20:56 AM
to comm...@googlegroups.com
On 2010-01-29 00:49, mob wrote:
> You can also potentially have higher performing interfaces as you can
> read data without copying through an event data parameter.

What does this mean?

--
http://inimino.org/~inimino/blog/

mob

Jan 29, 2010, 11:43:04 AM
to CommonJS

On Jan 29, 8:20 am, inimino <inim...@inimino.org> wrote:
> On 2010-01-29 00:49, mob wrote:
>
> > You can also potentially have higher performing interfaces as you can
> > read data without copying through an event data parameter.
>
> What does this mean?

To pass the data to the event callback requires reading it from the
underlying http storage with some kind of conversion for encodings.
If you just get a notification without copying the data into a JS
"data" object, then a read routine can copy the data directly into the
best data type. For example, a read() could copy into a byte array,
readString() could copy into a string etc.

However, I think this is a small benefit compared with being able to
use the request input and output streams as objects that can be
passed to other modules for processing. That is the real gain by not
passing data as a parameter to the data/readable event.

-mob

inimino

Jan 29, 2010, 12:13:45 PM
to comm...@googlegroups.com
On 2010-01-29 09:43, mob wrote:

> On Jan 29, 8:20 am, inimino wrote:
>> On 2010-01-29 00:49, mob wrote:
>>
>>> You can also potentially have higher performing interfaces as you can
>>> read data without copying through an event data parameter.
>>
>> What does this mean?
>
> To pass the data to the event callback requires reading it from the
> underlying http storage with some kind of conversion for encodings.
> If you just get a notification without copying the data into a JS
> "data" object, then a read routine can copy the data directly into the
> best data type. For example, a read() could copy into a byte array,
> readString() could copy into a string etc.

Ah, I see what you mean. The alternative in the case when the data
is passed with the event is to have the HTTP request object carry
around some flag or mode that determines how the incoming data is
turned into a JavaScript object. For example, node has a
setBodyEncoding() method or some such on the HTTPRequest interface.

In either case, the caller must set the flag or mode before the
data has been read, so these are equivalent from an efficiency
perspective.

> However, I think this is a small benefit compared with being able to
> use the requeset input and output streams as objects that can be
> passed to other modules for processing. That is the real gain by not
> passing data as a parameter to the data/readable event.

Of course events can be proxied to some other handler as well, but
I agree there is a convenience to passing streams around directly.

--
http://inimino.org/~inimino/blog/

Isaac Z. Schlueter

Jan 29, 2010, 3:23:04 PM
to CommonJS
This is awesome. Lot of good stuff in this thread already.

Re: "extension"

So, I'm not really married to any type of process or wording here. I
just want a really good API that is embarrassingly performant, and can
meet a lot of use cases, and be simple enough for us all to start
throwing middleware out there that works together on it. I'd like to
see lots of friendly competition and collaboration between server
developers and middleware developers. That's what CommonJS is for,
right?

If we bat this toy around for a while, and decide it doesn't make
sense to call it an extension, but we all like it enough to prefer it
to the current JSGI, then maybe we can call it JSGI 0.4. Or if we
don't want to supplant the current thing with this, then we can give
it an entirely different name, and fork off in another direction.
Whatever. I just want the platform, and I want you all to help me
build cool stuff for it.


Re: maintaining forEachable support, etc.

Yeah, when I wrote the spec, my feeling was that forEachable objects
could be supported easily by a middleware, so there's no need to allow
it in the spec. (Essentially, I have the same answer for the
forEachable crowd that the forEachable crowd has for the String
crowd.)

However, you bring up some interesting cases that I hadn't considered.

  function(request){
    return {
      status: 200,
      headers: {},
      body: { forEach: function (write) {
        dataSet.forEach(function(item){
          write(JSON.stringify(item));
        });
      }}
    };
  }

What about something like this? It's not quite as pretty, but
certainly possible, and I'm guessing someone else here could probably
do an even better job. (The good news is, it would only have to be
done once; once you've got your forEachable-streaming middleware, you
just wrap it around any forEachable app, and you're done.)

  function (request) {
    var s = new (request.jsgi.stream);
    setTimeout(function () {
      dataSet.forEach(function (item) {
        s.write(JSON.stringify(item));
      });
      s.close();
    });
    return { status : 200, headers : {}, body : s };
  }

This way, we don't start writing to the body stream until the server
has started sending the response and wired the out.body stream to the
actual HTTP pipe. Of course, it requires some way to defer
execution. Maybe we could either build that into the spec, or enhance
the Stream API with some kind of event to signal that a reader is
attached?


Re: GAE

I don't really know much about GAE. Care to flesh out why this would
be a problem there? Is there no way to do deferred execution on GAE?

The timing section is there to enable this kind of behavior for simple
cases:

  exports.app = function (req) {
    var out = {
      status : 200,
      headers : { "content-type" : "text/plain" },
      body : new (req.jsgi.stream)
    };
    out.body.write("Hello, world");
    out.body.close();
    return out;
  };

The server hooks into the underlying HTTP lib, and does something like
this:

  function MyServer (theActualRequest, theResponseHandle) {
    var response = app( makeJSGIish(theActualRequest) );
    theResponseHandle.sendHeader(response.status, response.headers);
    // The stream lives on response.body, so the listeners attach there.
    response.body.addListener("data", function (chunk) {
      theResponseHandle.sendBody(chunk);
    });
    response.body.addListener("eof", function () {
      theResponseHandle.finish();
    });
  }

So, if out.body.write emits the "data" event immediately, in the
current execution tick, then the chunks will be missed, and I'll have
to do some setTimeout or event magic to facilitate getting all the
bytes to send down the output pipe.

*Some* kind of deferred execution is simply necessary to do
asynchronous streaming of data, and *any* kind of deferred execution
can be made to work. The question is whether it's built into the
Stream implementation, or if the app has to do silly stuff like using
setTimeout and whatnot. I'd prefer to put it into the Stream spec,
since that simplifies app and middleware development.
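
As an illustration of building the deferral into the Stream itself, here is a sketch assuming Node's EventEmitter. DeferredStream is a hypothetical name and is not the ejsgi implementation. The `schedule` argument is also an assumption: it defaults to setTimeout, but any mechanism that runs a function after the current tick would do.

```javascript
const EventEmitter = require("events");

// Hypothetical stream whose write() never emits in the caller's own tick,
// so listeners attached right after the app returns still see every chunk.
class DeferredStream extends EventEmitter {
  constructor(schedule) {
    super();
    this.schedule = schedule || function (fn) { setTimeout(fn, 0); };
    this.queue = [];        // events waiting to be emitted
    this.scheduled = false; // true while a flush is already pending
  }

  write(chunk) {
    this.enqueue(["data", chunk]);
  }

  close() {
    this.enqueue(["eof"]);
  }

  enqueue(event) {
    this.queue.push(event);
    if (this.scheduled) return;
    this.scheduled = true;
    this.schedule(() => {
      this.scheduled = false;
      const events = this.queue;
      this.queue = [];
      events.forEach((args) => this.emit(...args));
    });
  }
}
```

With this in place, the simple "Hello, world" style of app can write and close before returning, and the server can attach its "data" and "eof" listeners afterwards without losing anything.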

Can GAE apps do *anything* asynchronously? If so, then the timing
mandate should be achievable. If not, then I disagree with it being
an "obviously important platform", and would instead consider it an
"obviously broken platform".


Re: read vs write vs both

> - Streams should be capable of full duplex (read + write). This spec
> may not use it, but Streams should be capable of it

For the purpose of a server, and an underlying Stream API that talks
to an actual socket, yes, Streams are of course generally either
readable or writeable and not both. However, in order to support
complex middleware, the Stream implementation sitting on
request.jsgi.stream MUST be both readable and writable, which is what
the proposal specifies.

The JSGI server can of course tie a write-only stream to the app
(request).body's "data" event for actually writing to the HTTP
connection, and have a read-only stream connected to the request body
which passes its data into the request.jsgi.input*. The ejsgi
reference implementation does exactly this with the very stream-like
APIs that node provides.

* off topic: maybe one day we can name this request.body, to exploit
the parallel with response.body?


> - How do you read from input. There is no read method.

You read by attaching a listener to the "data" event.


Re: removeListener

I'd rather not go down that road here. Can we just say that
addListener will guarantee that the listener will be called when the
event fires, and then fork off into another conversation about a
commonjs event API? For the purpose of this proposal, the only thing
that matters is addListener. In my opinion, removeListener is rarely
needed, rarely implemented well, and is a code smell.


Re: allowing forEach to return a promise to support asynchronous
streaming of bodies.

So, something like this, then?

  exports.app = function (req) {
    var p = new Promise;

    return {
      status : 200,
      headers : {},
      body : { forEach : function (write) {
        someStream.addListener("data", write);
        someStream.addListener("eof", function () { p.emitSuccess() });
        return p;
      }}
    };
  };

Compare to:

  exports.app = function (req) {
    var s = new (req.jsgi.stream);
    someStream.addListener("data", function (chunk) { s.write(chunk) });
    someStream.addListener("eof", function () { s.close() });

    return {
      status : 200,
      headers : {},
      body : s
    };
  };

I'm not sure I agree with you about a promise-returning forEach not
requiring "multiple mental models". That forEach is very different
from the one on Array.prototype. Having an input stream and an output
stream, with a symmetrical API, solves the same use cases more
elegantly, imo, especially when you have several layers of middleware
wrapped around the app.


Re: renaming "eof" to "end"

+1. <3 this. Especially since "eof" means "end of file", and it will
be something other than a file a lot of the time.


Re: Various proposals to radically change the shape of the JSGI API

There are some very interesting ideas that mob and ryah are throwing
around. And now I see inimino getting into it, so it's only going to
get crazier ;)

These are good ideas. But (a) I don't want to derail the streaming
JSGI discussion, and (b) I don't want those ideas to be buried in this
thread. Can we accept, at least for the purpose of this discussion,
that we're just talking about replacing the forEachable body with a
writeable stream, and the implications of that change, and that's it?

Just to clarify:
The Server gets the request, and calls the app, passing the JSGI
request object as an argument.
The return value of the app function gives the Server the status and
headers, and a body (stream or forEachable, debating now.)
The Server sends the response.

So, the app is just a function that's called and returns a value. It
doesn't start the response or do any of that. The argument and the
return aren't the same thing. Middleware is a function that takes an
app as an argument, and returns a transformed app as a result.
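
A small sketch of that shape, using a forEachable body for brevity. The names upcase and hello are hypothetical, chosen only for illustration.

```javascript
// Hypothetical middleware: takes an app, returns a transformed app whose
// body chunks are upper-cased. Status and headers pass through untouched.
function upcase(app) {
  return function (request) {
    const response = app(request);
    const inner = response.body;
    return {
      status: response.status,
      headers: response.headers,
      body: {
        forEach: function (write) {
          return inner.forEach(function (chunk) {
            write(String(chunk).toUpperCase());
          });
        }
      }
    };
  };
}

// A trivial app: called with the request, returns status, headers and body.
function hello(request) {
  return {
    status: 200,
    headers: { "content-type": "text/plain" },
    body: ["hello, ", "world"]
  };
}
```

The server would call upcase(hello) exactly as it would call hello; neither the app nor the middleware starts the response itself.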

If you want to do something fundamentally different than this, then
that's great, choices are great, and we should talk all about it in a
different thread. Let's deal with one radical change at a time.


Thanks for all the feedback.


--i

Kris Zyp

Jan 29, 2010, 4:18:22 PM
to comm...@googlegroups.com, Isaac Z. Schlueter
On 1/29/2010 1:23 PM, Isaac Z. Schlueter wrote:
> Re: GAE
>
> I don't really know much about GAE. Care to flesh out why this would
> be a problem there? Is there no way to do deferred execution on GAE?
>
GAE forces a thread-per-request-until-complete model. So no, there is no
way to truly defer execution in GAE. However, it is possible to have an
event queue that would hold enqueued functions that would get executed
immediately after the JSGI app function was executed, and before the
response. But this couldn't really block or wait on anything; all the
events would need to be immediately ready to execute.
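
A rough sketch of the kind of queue described above, assuming a thread-per-request platform where nothing can wait. Everything here (`makeQueue`, `handle`, `req.enqueue`) is a hypothetical name, not part of JSGI or GAE.

```javascript
// Hypothetical run-to-completion queue: nothing here blocks or waits.
function makeQueue() {
  var pending = [];
  return {
    enqueue: function (fn) { pending.push(fn); },
    // Run every queued function, including ones queued while draining.
    drain: function () {
      while (pending.length > 0) {
        pending.shift()();
      }
    }
  };
}

// Hypothetical server step: call the app, run queued work, then respond.
function handle(app, req) {
  var queue = makeQueue();
  req.enqueue = queue.enqueue;
  var response = app(req);
  queue.drain();
  return response;
}

var chunks = [];
var response = handle(function (req) {
  req.enqueue(function () {
    chunks.push("a");
    req.enqueue(function () { chunks.push("b"); });
  });
  return { status: 200, headers: {}, body: chunks };
}, {});
```

The queued functions run in order after the app returns, but only because each one is ready to run immediately.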

> Can GAE apps do *anything* asynchronously? If so, then the timing
> mandate should be achievable. If not, then I disagree with it being
> an "obviously important platform", and would instead consider it an
> "obviously broken platform".
>

Heh, don't you wish you could dismiss platforms so easily :). Like it or
not, the G in GAE pretty much guarantees it is an important platform.


> * off topic: maybe one day we can name this request.body, to exploit
> the parallel with response.body?
>

+1 to that, it is only an input from one side's perspective.

Certainly, if your input is a stream, it would presumably be easiest to
make your output be a stream. Converting is the pain. My assertion is
that promise-returning forEach (a lazy array, if you will) is a more
generic, more broadly suitable mechanism outside of transferring raw
bytes or characters. If you had a lazy array returned as a result set
from a DB (which seems like a pretty frequent probable use case for
applications), we are back to a pretty clean pipeline of streaming
objects to text:

exports.app = function (req) {


return {
status : 200,
headers : {},

body : { forEach : function (write) {
return lazyResultSet.forEach(function(item){
write(JSON.stringify(item));
});
}}
};
};


> Re: Various proposals to radically change the shape of the JSGI API
>
> There are some very interesting ideas that mob and ryah are throwing
> around. And now I see inimino getting into it, so it's only going to
> get crazier ;)
>

I've mentioned this before, but I still think it might be a good idea to
standardize on something quite different from JSGI, an event-oriented
HTTP interface, more similar to Node's. Promise-style JSGI could then
easily layer on top of it for more application level concerns, while the
more event-oriented interface could potentially be used for code that
needs to deal more directly with minimizing buffers, events and so
forth. There would also be platforms like GAE that are more limited,
that could just implement JSGI, and wouldn't drag down the more advanced
streaming and eventing capabilities that we want to achieve. Layering
promise level functionality on top of event level functionality has been
suggested for other APIs, this might be a good place to do so.

--

Thanks,
Kris

inimino

Jan 29, 2010, 4:49:29 PM
to comm...@googlegroups.com
On 2010-01-29 14:18, Kris Zyp wrote:
> I've mentioned this before, but I still think it might be a good idea to
> standardize on a something quite different from JSGI, an event-oriented
> HTTP interface, more similar to Node's.

This seems a promising approach. With the popularity of Comet, Web
Sockets, etc, I think it's better to start with a greenfield design
based on the Web application use cases a modern HTTP server interface
needs to support, rather than starting from JSGI and trying to make
incremental changes.

What are the use cases that this needs to support? What are all the
kinds of middleware people expect to be able to write? What kinds of
things do people expect to be able to do within an app or request
handler?

Some of the more mature SSJS platforms have experience with what's
required here, while node has a powerful platform but no middleware
ecosystem to speak of yet. If we're going to end up with something
that works for everyone, it would help to start by finding what the
requirements are.

--
http://inimino.org/~inimino/blog/

Isaac Z. Schlueter

Jan 29, 2010, 4:54:52 PM
to CommonJS
On Jan 29, 1:18 pm, Kris Zyp <kris...@gmail.com> wrote:
> On 1/29/2010 1:23 PM, Isaac Z. Schlueter wrote:
> > Re: GAE
> > I don't really know much about GAE. Care to flesh out why this would
> > be a problem there? Is there no way to do deferred execution on GAE?
>
> GAE forces a thread-per-request-until-complete model. So no, there is no
> way to truly defer execution in GAE.

Ok, so, that's definitely Doing It Wrong. But you could implement an
event loop in JavaScript, of course.


> If you had a lazy array returned as a result set
> from a DB (which seems like a pretty frequent probable use case for
> applications)

Why couldn't your DB return an object stream? Then you're back to
this:

function app (req) {


var s = new (req.jsgi.stream);

dbstream.addListener("data", function (item) { s.write(JSON.stringify(item)) });
dbstream.addListener("end", function () { s.close() });


return { status : 200, headers : {}, body : s };
};

With just a tiny bit of library love, we could do something like this:

function app (req) { return {
status : 200,
headers : {},
body : filter(dbstream, JSON.stringify)
}};

function filter (input, fn) {
var out = new Stream;
input.addListener("data", function (c) { out.write( fn(c) ) });
input.addListener("end", function () { out.close() });
out.addListener("pause", function () { input.pause() });
out.addListener("resume", function () { input.resume() });
return out;
};


Ryan, you brought up a good point about stream.write returning false
if it's not able to send the data right now. Ie, for the purposes of
streaming JSGI, if the buffer is not empty, and not paused. But for
files or tcp it could mean other things. That opens a lot of doors.
I'm going to implement it that way. So, does that mean that "drain"
only gets emitted if a previous write() returned false?
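
One possible reading of that, sketched under the assumption that the answer is "yes": write() returns false whenever the chunk had to be buffered, and "drain" fires only after such a write has been flushed. `SketchStream`, `flush` and the `sink` callback are hypothetical names, not the ejsgi API.

```javascript
// sink(chunk) returns true if the chunk could be sent right now.
function SketchStream(sink) {
  this.sink = sink;
  this.buffer = [];
  this.listeners = {};
  this.needDrain = false;
}
SketchStream.prototype.addListener = function (name, fn) {
  (this.listeners[name] = this.listeners[name] || []).push(fn);
};
SketchStream.prototype.emit = function (name, arg) {
  (this.listeners[name] || []).forEach(function (fn) { fn(arg); });
};
// Returns true if sent immediately, false if the chunk was buffered.
SketchStream.prototype.write = function (chunk) {
  if (this.buffer.length === 0 && this.sink(chunk)) return true;
  this.buffer.push(chunk);
  this.needDrain = true;
  return false;
};
// Called by the server when the connection can take more data.
SketchStream.prototype.flush = function () {
  while (this.buffer.length > 0 && this.sink(this.buffer[0])) {
    this.buffer.shift();
  }
  if (this.buffer.length === 0 && this.needDrain) {
    this.needDrain = false;
    this.emit("drain");
  }
};

var ready = false;
var sent = [];
var s = new SketchStream(function (chunk) {
  if (!ready) return false;
  sent.push(chunk);
  return true;
});
var drains = 0;
s.addListener("drain", function () { drains += 1; });

var first = s.write("a");   // connection not ready: buffered
ready = true;
s.flush();                  // sends "a", then emits "drain"
var second = s.write("b");  // sent immediately
s.flush();                  // nothing was buffered, so no second "drain"
```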


--i

Kris Zyp

Jan 29, 2010, 4:58:44 PM
to comm...@googlegroups.com

On 1/29/2010 2:54 PM, Isaac Z. Schlueter wrote:
>
>> If you had a lazy array returned as a result set
>> from a DB (which seems like a pretty frequent probable use case for
>> applications)
>>
> Why couldn't your DB return an object stream? Then you're back to
> this:
>
>

Because it wouldn't work with a normal array. You would need to know
that the input is an object stream, you couldn't (easily) generically
apply this to normal arrays and async streaming agnostically.
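
A small sketch of what "agnostic" means here: one serializer that accepts either a plain array or a lazy source. The standard `Promise` stands in for whatever promise type the thread has in mind, and `writeAll` and `lazySource` are hypothetical names.

```javascript
// Works on anything with forEach: a plain Array or a lazy, async source.
function writeAll(source, write) {
  return source.forEach(function (item) {
    write(JSON.stringify(item));
  });
}

// Hypothetical lazy source: delivers items later and returns a thenable.
function lazySource(items) {
  return {
    forEach: function (cb) {
      return new Promise(function (resolve) {
        setTimeout(function () {
          items.forEach(cb);
          resolve();
        }, 0);
      });
    }
  };
}

var fromArray = [];
writeAll([{ a: 1 }, { b: 2 }], function (s) { fromArray.push(s); });

var fromLazy = [];
var done = writeAll(lazySource([{ c: 3 }]), function (s) { fromLazy.push(s); });
```

With an array the call finishes synchronously and returns nothing; with the lazy source the caller gets something to wait on.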

--
Thanks,
Kris

Irakli Gozalishvili

Jan 29, 2010, 5:19:12 PM
to comm...@googlegroups.com
Personally I do like the idea of a readable / writable stream, even though I'm not sure what the point of addListener is. Do you expect more than one listener per event type?

I especially like that, on the one hand, the paradigm is close to HTML5 WebSockets and, on the other, as was pointed out, it is pretty simple to layer middleware mimicking forEach-able and promise-based APIs.

Actually, I would propose staying even closer to the WebSocket spec, which would allow wider adoption and a lower barrier for newcomers. Why not just pass an object similar to a WebSocket instance, on which you can implement

onmessage -> addListener("data"
onclose -> addListener("end"
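
For illustration, the mapping above could be layered on top of addListener with a small adapter. This is only a sketch: `makeStream` is a stand-in emitter and `socketLike` is a hypothetical helper, neither taken from any proposal.

```javascript
// Tiny stand-in for a stream with addListener; not any real library.
function makeStream() {
  var listeners = {};
  return {
    addListener: function (name, fn) {
      (listeners[name] = listeners[name] || []).push(fn);
    },
    emit: function (name, arg) {
      (listeners[name] || []).forEach(function (fn) { fn(arg); });
    }
  };
}

// Hypothetical adapter exposing WebSocket-style handler properties.
function socketLike(stream) {
  var socket = { onmessage: null, onclose: null };
  stream.addListener("data", function (chunk) {
    if (socket.onmessage) socket.onmessage({ data: chunk });
  });
  stream.addListener("end", function () {
    if (socket.onclose) socket.onclose({});
  });
  return socket;
}

var stream = makeStream();
var socket = socketLike(stream);
var log = [];
socket.onmessage = function (event) { log.push(event.data); };
socket.onclose = function () { log.push("closed"); };
stream.emit("data", "hi");
stream.emit("end");
```

The trade-off is visible in the sketch: handler properties allow one listener per event, while addListener allows several.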


I do think we shouldn't ignore GAE or other similar platforms and have an idea of how to achieve this.

I also agree with inimino and think that a streaming-based web is the future, so it is better to think in that paradigm rather than reshape the current one to fit it. We're just starting, so let's use that advantage rather than going old school and layering something on top.

--
Irakli Gozalishvili
Web: http://rfobic.wordpress.com/
Phone: +31 614 205275
Address: Taksteeg 3 - 4, 1012PB Amsterdam, Netherlands





George Moschovitis

Jan 29, 2010, 5:56:32 PM
to CommonJS
> Can GAE apps do *anything* asynchronously?  If so, then the timing
> mandate should be achievable.  If not, then I disagree with it being
> an "obviously important platform", and would instead consider it an
> "obviously broken platform".

I think GAE works asynchronously in the low level: async protocol-
buffers RPC.
The high level interface is synchronous at the moment.

However, things are starting to change. For example, the latest Python
SDK exposes the asynchronous nature of the urlfetch infrastructure by
providing a high level async api:

http://code.google.com/appengine/docs/python/urlfetch/asynchronousrequests.html

I would expect similar async apis for the datastore etc.

BTW, GAE is obviously *not* a broken platform, it leverages Google's
infrastructure and more than 10 years of R&D.

I am using Narwhal/Jack on top of GAE for real world projects and it
works great!

regards,
-g.

--
http://www.appenginejs.org
http://www.gmosx.com/blog

Isaac Z. Schlueter

Jan 29, 2010, 6:10:10 PM
to CommonJS
On Jan 29, 2:19 pm, Irakli Gozalishvili <rfo...@gmail.com> wrote:
> onmessage -> addListener("data"
> onclose -> addListener("end"
I dig this. (Adding to my list of options for an eventual show of
hands.)

> I do think we shouldn't ignore GAE or other similar platforms and have idea
> of how to achieve this.

Awesome.

On Jan 29, 2:56 pm, George Moschovitis <george.moschovi...@gmail.com>
wrote:


> BTW, GAE is obviously *not* a broken platform, it leverages Google's
> infrastructure and more than 10 years of R&D.
>
> I am using Narwhal/Jack on top of GAE for real world projects and it
> works great!

By saying "obviously broken", I didn't mean that it's not fixable, or
not useful. Just that requiring programs to be synchronous is a
flaw. Google has enough of an infrastructure to throw lots of
hardware at the problem, but thread-per-request isn't just bad for
memory reasons. It also limits what your programs can do. I'm
hopeful that it'll support some kind of deferring soon. But if not,
like I said before, you could probably implement an event loop in JS
and use that to simulate streaming.


--i

Isaac Z. Schlueter

Jan 29, 2010, 6:12:47 PM
to CommonJS

Ok. Couldn't you do something like this?

Stream.prototype.forEach = function (cb) {


var p = new Promise;

this.addListener("data", function (chunk) { cb(chunk) });
this.addListener("end", function () { p.emitSuccess() });
return p;
};

--i

Kris Zyp

Jan 29, 2010, 7:51:23 PM
to comm...@googlegroups.com, inimino

On 1/29/2010 2:49 PM, inimino wrote:
> On 2010-01-29 14:18, Kris Zyp wrote:
>
>> I've mentioned this before, but I still think it might be a good idea to
>> standardize on a something quite different from JSGI, an event-oriented
>> HTTP interface, more similar to Node's.
>>
> This seems a promising approach. With the popularity of Comet, Web
> Sockets, etc, I think it's better to start with a greenfield design
> based on the Web application use cases a modern HTTP server interface
> needs to support, rather than starting from JSGI and trying to make
> incremental changes.
>
>

My intent with this was that JSGI and Node's API could co-exist as
standards at different levels in the stack. Some servers could choose to
implement Node's API, and one could use a Node-JSGI adapter, and others
(like those running on GAE that couldn't really implement Node's API
properly) could directly implement JSGI. Middleware may choose to target
JSGI due to the ease of doing so, but code could still target the
evented interface when needed. The nice thing is that we already have
most of this in place. Obviously we have an implementation of the Node API,
and I don't think it would be too onerous to get another. We already
have a JSGI-Node adapter [1], and we already have existing
implementations of JSGI 0.3.
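
To give a feel for how thin such an adapter layer can be, here is a sketch. It is not the jsgi-node code linked below; it assumes only a response object with `writeHead`, `write` and `end` methods, and `jsgiToHandler` and `fakeRes` are names invented here.

```javascript
// Hypothetical adapter: wraps a JSGI-style app as a (request, response)
// handler. The response only needs writeHead, write and end.
function jsgiToHandler(app) {
  return function (nodeReq, nodeRes) {
    var response = app({
      method: nodeReq.method,
      pathInfo: nodeReq.url,
      headers: nodeReq.headers
    });
    nodeRes.writeHead(response.status, response.headers);
    var done = response.body.forEach(function (chunk) {
      nodeRes.write(chunk);
    });
    // A promise-returning forEach ends the response later; an array ends it now.
    if (done && typeof done.then === "function") {
      done.then(function () { nodeRes.end(); });
    } else {
      nodeRes.end();
    }
  };
}

// Fake response that records calls, so the sketch runs without a server.
var calls = [];
var fakeRes = {
  writeHead: function (status) { calls.push("head " + status); },
  write: function (chunk) { calls.push("write " + chunk); },
  end: function () { calls.push("end"); }
};
jsgiToHandler(function (req) {
  return { status: 200, headers: {}, body: ["hello ", req.pathInfo] };
})({ method: "GET", url: "/x", headers: {} }, fakeRes);
```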

With this approach, there may still be some minor adjustments to be had.
Having implemented the JSGI-Node adapter, I definitely feel the need for
JSGI support for asynchronously reading the request body (perhaps
mirroring the response body), and the character encoding in Node felt
awkward and ethnocentric to me.

[1] http://github.com/kriszyp/jsgi-node

> Some of the more mature SSJS platforms have experience with what's
> required here, while node has a powerful platform but no middleware
> ecosystem to speak of yet. If we're going to end up with something
> that works for everyone, it would help to start by finding what the
> requirements are.
>
>

IMO, the encapsulation of asynchronicity with promises is part of what
makes it so easy to pass around requests and responses in JSGI, and has
made it possible to build a good library of JSGI middleware. I've
certainly enjoyed using it, the Pintura framework has a stack of about a
dozen middleware, easily handling asynchronous Comet applications on
top of them.
--

Thanks,
Kris

Shanti Rao

Jan 31, 2010, 7:46:11 PM
to CommonJS
Hi gang,

JSDB (www.jsdb.org) has been able to implement a web server for a few
years now, and I've learned a few things about JSGI-type functions.

1. There's no one best way to do it. Sometimes, you want to cache the
response body before you generate the response code and header.
Sometimes, you don't. Sometimes, you can implement the body generator
as an anonymous function. Sometimes, you'd rather not. Sometimes you
want to transfer from disk to socket as fast as you can, without
taking the time to create a JS String.

2. It can be really, really hard to debug server-side code, especially
when the code would make more sense if it were written in Scheme. By
separating the web server function from the CGI function, you're
giving yourself the opportunity to test them separately. The interface
wants to be simple (like Stream.write() on one end, and Stream.read()
on the other). Anything fancy, and your test harness will work
differently than your actual server.

3. HTTP can be a really simple protocol. One JSGI-like function form I
use sometimes looks like this. It assumes the HTTP server, having
found the function, already returned 200 OK and some minimal headers.
I offer it as a demonstration of an effective pattern for using a
Stream to communicate between a JSGI-like function and a HTTP server.

function foobar(client)
{
client.writeln("Content-type: text/html")
client.writeln("Content-length: 12")
client.writeln()
client.writeln("Hello, world")
}

4. Then again, sometimes you want to assume less about what your HTTP
server has already said to the client, and you don't know what or how
much you're going to send until after you've done the processing.
Then, returning a response object is the way to go.
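
A sketch of the case in point 4, where the length is only known after processing. It assumes a JSGI-style response object and uses Node's `Buffer.byteLength` to count bytes; `reportApp` and its data are invented for the example.

```javascript
// Build the whole body first, then describe it in the headers.
function reportApp(req) {
  var rows = ["alpha", "beta", "gamma"];
  var html = "<ul>" + rows.map(function (row) {
    return "<li>" + row + "</li>";
  }).join("") + "</ul>";
  return {
    status: 200,
    headers: {
      "Content-Type": "text/html",
      // Byte length, which can differ from string length for non-ASCII text.
      "Content-Length": String(Buffer.byteLength(html, "utf8"))
    },
    body: [html]
  };
}

var reply = reportApp({});
```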

Also, I already have a Stream class (http://www.jsdb.org/
jsdbhelp.html#Stream) and I'd hate for it to conflict with CommonJS's
Stream.

Keep It Simple and Successful,

Shanti

On Jan 28, 5:53 pm, "Isaac Z. Schlueter" <i...@foohack.com> wrote:
> Hey, CommonJS, what do you think of this?
>
> http://wiki.commonjs.org/wiki/JSGI/StreamExtension
>
> Wanna share your opinions?  I know you got em!
>

> --i

Daniel Friesen

Jan 31, 2010, 8:21:51 PM
to comm...@googlegroups.com
That's basically what I've been speccing at
http://wiki.accessjs.org/wiki/HTTP-Gateway

> 4. Then again, sometimes you want to assume less about what your HTTP
> server has already said to the client, and you don't know what or how
> much you're going to send until after you've done the processing.
> Then, returning a response object is the way to go.
>
> Also, I already have a Stream class (http://www.jsdb.org/
> jsdbhelp.html#Stream) and I'd hate for it to conflict with CommonJS's
> Stream.
>
> Keep It Simple and Successful,
>
> Shanti
>
> On Jan 28, 5:53 pm, "Isaac Z. Schlueter" <i...@foohack.com> wrote:
>
>> Hey, CommonJS, what do you think of this?
>>
>> http://wiki.commonjs.org/wiki/JSGI/StreamExtension
>>
>> Wanna share your opinions? I know you got em!
>>
>> --i
>>
>
>

~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]

Isaac Z. Schlueter

Feb 1, 2010, 5:29:08 PM
to CommonJS
Ok. Seems like we've got a few different ideas here, and in the
interest of moving this towards some kind of resolution, I thought we
could have a show of hands about which direction you think makes the
most sense regarding the open questions:

http://wiki.commonjs.org/wiki/JSGI/StreamExtension/vote

Regarding foreachable-with-promise vs body-is-a-stream, Kris and I
have our preferences, and it's fairly trivial to do anything in one
that you can do in the other, so as far as I can see, it's just a
matter of getting some consensus around which one should be the spec
and which one should be sugared on with middleware or by the server
implementation.

Any other open questions you think should be on that list?

I left the "what is a stream, exactly" question open for now, because
I'd like to resolve whether or not it matters before we start debating
the finer points. If there's a reasonable quorum in favor of that
approach, we can move to working out exactly what the interface should
look like. (I think it's pretty close now, just don't want to
prematurely shut down that line of investigation.)

Thanks!

--i

mob

Feb 1, 2010, 5:58:59 PM
to CommonJS

On Feb 1, 2:29 pm, "Isaac Z. Schlueter" <i...@foohack.com> wrote:
> Ok.  Seems like we've got a few different ideas here, and in the
> interest of moving this towards some kind of resolution, I thought we
> could have a show of hands about which direction you think makes the
> most sense regarding the open questions:
>
> http://wiki.commonjs.org/wiki/JSGI/StreamExtension/vote
>
> Regarding foreachable-with-promise vs body-is-a-stream, Kris and I
> have our preferences, and it's fairly trivial to do anything in one
> that you can do in the other, so as far as I can see, it's just a
> matter of getting some consensus around which one should be the spec
> and which one should be sugared on with middleware or by the server
> implementation.

+1 for stream.

>
> Anything other open questions you think should be on that list?
>
> I left the "what is a stream, exactly" question open for now, because
> I'd like to resolve whether or not it matters before we start debating
> the finer points.  If there's a reasonable quorum in favor of that
> approach, we can move to working out exactly what the interface should
> look like. (I think it's pretty close now, just don't want to
> prematurely shut down that line of investigation.)

I actually think that not having Stream specified is hampering our
efforts.
I'd like to do a new proposal for Streams which leverages the latest
async discussions and learning. I'll post this in a day or so.

-mob

Kris Zyp

Feb 1, 2010, 6:45:06 PM
to comm...@googlegroups.com, Isaac Z. Schlueter
So I guess I need to actually make a proposal for forEach with promise
and the rationale for it, so we know what we are voting on (to date it
has just been some discussions). Also to clarify, I definitely want the
body to be a stream. My assertion has been that forEach with promise is
a consistent, simple mechanism for making the body a stream. So to be more
accurate it seems that if we were to vote with the information below
it would be foreachable-with-promise-as-the-body-stream vs
something-else-as-the-body-stream (and it seems a little poorly posed to
be pitting a specific mechanism against a generic idea, if that is the
intent). Anyway, I'll try to get a proposal up soon.
Kris

--
Thanks,
Kris

Isaac Z. Schlueter

Feb 1, 2010, 7:19:03 PM
to CommonJS
On Feb 1, 3:45 pm, Kris Zyp <kris...@gmail.com> wrote:
> Also to clarify, I definitely want the
> body to be a stream. My assertion has been that forEach with promise is
> consistent simple mechanism for making the body a stream.

I see. It was not my intent to try to pit a general concept against a
specific implementation, but rather to pit a mostly-settled
implementation against another mostly-settled implementation. (That
is, even saying "with promises" leaves open the question as to exactly
what interface a "promise" exposes.)

From the discussions floating about here, it was my understanding that
a stream is pretty close to the specification in http://github.com/isaacs/ejsgi,
with perhaps some room for some more bikeshedding.

In addition, "stream as response body" is intended to exploit the
similarity between the request.input, request.jsgi.error, and
response.body. That is, they should all be the same kind of thing,
with the same API for reading/writing/pausing/etc.

ANYway, looks like you're feeling prompted to write up what
"foreachable+promise" means exactly, and mob is drawing up plans for a
"stream" interface proposal, and everyone seems to be on board about
doing some kind of async something, so things are looking good :)

--i

mob

Feb 1, 2010, 7:48:44 PM
to CommonJS
> ANYway, looks like you're feeling prompted to write up what
> "foreachable+promise" means exactly, and mob is drawing up plans for a
> "stream" interface proposal, and everyone seems to be on board about
> doing some kind of async something, so things are looking good :)

Agreed. General direction == good.

However, I'm going to throw a bit of a wrench with a Streams
discussion. Hopefully I can keep it as a separate discussion and not
slow down this discussion. We all use the term Stream, but there is
quite a difference in what it means when you start to talk about
async, full/half duplex, flow control, events etc. I think it is time
to really nail this element down as it is destabilizing other higher
order functions.

-mob

Isaac Z. Schlueter

Feb 1, 2010, 8:04:46 PM
to CommonJS
On Feb 1, 4:48 pm, mob <m...@embedthis.com> wrote:
> We all use the term Stream, but there is
> quite a difference in what it means when you start to talk about
> async, full/half duplex, flow control, events etc.

With that in mind, let's try to spec a simple generic API that can be
as agnostic as possible as to the underlying architecture. IMO, the
stream spec at http://github.com/isaacs/ejsgi is a pretty good
start. ;)

--i

mob

Feb 1, 2010, 8:19:51 PM
to CommonJS
> With that in mind, let's try to spec a simple generic API that can be
> as agnostic as possible as to the underlying architecture.  IMO, the
> stream spec at http://github.com/isaacs/ejsgi is a pretty good
> start. ;)

Understand the ;-) But that does have some weaknesses with regard to
flow control.
I'll put my asbestos underpants on when I post the proposal.

-mob

Ryan Dahl

Feb 1, 2010, 10:21:07 PM
to comm...@googlegroups.com

Glad to hear your opinion. I agree, one-way (in particular
half-closed) streams are important to support. You probably already
have some concrete ideas, but may I suggest looking at (if you haven't
already) http://search.cpan.org/dist/AnyEvent/lib/AnyEvent/Handle.pm
It's a little more specific ( for TCP streams) but I think it's a good API.

Daniel Friesen

Feb 1, 2010, 11:57:21 PM
to comm...@googlegroups.com
When you think of ideas for specing a generic stream api don't forget to
look over the current proposals.
http://wiki.commonjs.org/wiki/IO/A
http://wiki.commonjs.org/wiki/IO/B/Stream/Level0
http://wiki.commonjs.org/wiki/IO/B/Stream/Level1

I particularly spent a fair bit of time on the idea of a level 0 stream
pattern that works abstractly of any implementation or which type of
content comes out of the stream.

Also don't forget that a high-level stream api with events and pauses
won't make people trying to write very low level protocols that can
switch between text and binary very happy.

mob

Feb 2, 2010, 12:07:00 AM
to CommonJS
Thanks for the links. I'll re-read them all to make sure.

I've got something in mind that evolves what we have. Works sync and
async with events and flow control. It is a bit simpler than most of
the prior proposals, but we'll see if it withstands the harsh glare of
many eyes.

-mob

Kris Zyp

Feb 2, 2010, 12:43:57 PM
to comm...@googlegroups.com
Here is my proposal for body streams as forEach-able objects returning
promises (I tried to build upon and mix in some of the ideas from
Isaac's EJSGI proposal):
http://wiki.commonjs.org/wiki/JSGI/ForEachStreaming

Before we even try to compare non-forEachable streams with forEachable
streams, I first wanted to make sure we are clear about something in the
EJSGI proposal. This current design leads to buffering the entire
response body for the most straightforward usage. This is a serious
flaw. One can employ workarounds, but the most likely usage of EJSGI
completely fails to meet its own design goal. One shouldn't have to
resort to putting the response function in a setTimeout/queue function
just to avoid sinking one's own server. It is untenable to implement
something with such negative consequences.

However, that being said, this is certainly a correctable issue with
EJSGI. One simply needs to provide the stream writing object through a
callback, rather than through a direct constructor available on the
request. The callback can be called by the server when it is ready to
stream the response, and then the response writing mechanism can begin. Such
a structure is much easier to work with on the request input stream as
well, because the reader calls the stream when it is ready to receive
data, and it doesn't need to worry about missed events.
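
A sketch of that correction, assuming the body exposes a single `startStream(writer)` method that the server calls after the status and headers have gone out. The name `startStream` is the one used later in this message; `respond` and the array standing in for the connection are hypothetical.

```javascript
// Hypothetical server step: send the head first, then hand the app a writer.
function respond(response, connection) {
  connection.push("HEAD " + response.status);
  var writer = {
    write: function (chunk) { connection.push("BODY " + chunk); },
    close: function () { connection.push("END"); }
  };
  // Only now is the app asked to produce output, so nothing needs buffering.
  response.body.startStream(writer);
}

function app(req) {
  return {
    status: 200,
    headers: {},
    body: {
      startStream: function (writer) {
        writer.write("hello world");
        writer.close();
      }
    }
  };
}

var connection = [];
respond(app({}), connection);
```

Because the app never touches the writer before the server offers it, every write goes to a live connection.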

This correction actually would bring EJSGI closer in structure to
forEachable streams. The crux differences to decide between would then
basically boil down to whether the callback occurs through a
forEach(writeFunction) that returns a promise, or whether it occurs
through some startStream(writeObject) where writeObject holds the write
and close methods. The functionality of either approach is identical,
all the same streaming abilities: writing, closing, listening, pausing
and resuming, can easily be achieved with either API. This would come
down to simple API style preference, except for two key
advantages of the forEach design. First, forEach works with arrays. This not
only makes it extremely easy to write very simple functions (since you
can simply return an array), we should also be looking at how this
design is going to affect other parts of an application. As we discussed
previously in this thread, it is very likely that one would use a
similar data structure to hold streaming objects that would end up
being streamed in serialization. Being able to treat existing static
arrays as a valid form of a streamed object data structure provides a
much greater opportunity for code reuse. Compare the amount of existing
code that uses JavaScript arrays to the amount of code that uses (or
will ever use) EJSGI streams.

The second key advantage to forEach is that it is fully backwards
compatible with JSGI 0.3. With a completely different API, we are forced
to break compatibility or support two different paradigms at the same
time, forEach and some other streaming API which induces unnecessary
mental overload. I know some of you don't care about backwards
compatibility, but the reality is that there are existing
implementations, and the spec is in a much better position if it can work
with these implementations rather than telling them that they are now
invalid.

Daniel Friesen pointed out that we do already have a streaming
proposal for I/O as well:
http://wiki.commonjs.org/wiki/IO/B/Stream/Level1
Another possibility is that we follow this streaming API for JSGI. This
may be a good idea for streaming. I would certainly prefer that our
response bodies be as consistent as possible with other mechanisms.
Having a streaming API that is inconsistent with both arrays and File
I/O streaming seems like unnecessary fragmentation. However, I do think
that having HTTP streaming match File streaming could be a red herring
in terms of benefit. Matching these APIs certainly makes streaming files
easier, but the reality is that application developers rarely need to
actually implement static file handling. We implement it in a middleware
or appliance provided by a framework or server, and/or set up a proxy to
handle static files and we are done with it. The real work of most web
applications lies in taking data from a database and
rendering/serializing it into a proper layout in HTML (or JSON or XML).
And looking at the database drivers for Node
(http://wiki.github.com/ry/node/modules#database), AFAICT, every one of
them represents database query result sets as arrays (or an array within
an object), not as some special object stream. Once again, if our data
sources are forEach-able, it is most consistent to serialize with
forEach-ables.

--
Thanks,
Kris

Daniel N

Feb 2, 2010, 4:16:25 PM
to comm...@googlegroups.com
On Wednesday, February 3, 2010, Kris Zyp <kri...@gmail.com> wrote:
> Here is my proposal for body streams as forEach-able objects returning
> promises (I tried to build upon and mix in some of the ideas from
> Isaac's EJSGI proposal):
> http://wiki.commonjs.org/wiki/JSGI/ForEachStreaming
>
> Before we even try to compare non-forEachable streams with forEachable
> streams, I first wanted to make sure we are clear about something in the
> EJSGI proposal. This current design leads to buffering the entire
> response body for the most straightforward usage. This is a serious
> flaw. One can employ workarounds, but the most likely usage of EJSGI
> completely fails to meet its own design goal. One shouldn't have to
> resort to putting the response function in a setTimeout/queue function
> just avoid sinking one's own server. It is untenable to implement
> something with such negative consequences.
>
I've been following this discussion with a lot of interest. Most know
I'm in the camp that we should not hamper our efforts to design a new
spec for async web apps with a previous spec that is designed from a
sync point of view.

W.r.t. Ejsgi having to buffer the response body in full, would you
care to explain why you are under this impression? I've written
streamers using the interface that I don't believe buffer more than a
few chunks of each response at any given time.

~Daniel

Kris Zyp

Feb 2, 2010, 5:26:39 PM
to comm...@googlegroups.com, Daniel N

On 2/2/2010 2:16 PM, Daniel N wrote:
> On Wednesday, February 3, 2010, Kris Zyp <kri...@gmail.com> wrote:
>
>> Here is my proposal for body streams as forEach-able objects returning
>> promises (I tried to build upon and mix in some of the ideas from
>> Isaac's EJSGI proposal):
>> http://wiki.commonjs.org/wiki/JSGI/ForEachStreaming
>>
>> Before we even try to compare non-forEachable streams with forEachable
>> streams, I first wanted to make sure we are clear about something in the
>> EJSGI proposal. This current design leads to buffering the entire
>> response body for the most straightforward usage. This is a serious
>> flaw. One can employ workarounds, but the most likely usage of EJSGI
>> completely fails to meet its own design goal. One shouldn't have to
>> resort to putting the response function in a setTimeout/queue function
>> just avoid sinking one's own server. It is untenable to implement
>> something with such negative consequences.
>>
>>
> I've been following this discussion with a lot of interest. Most know
> I'm in the camp that we should not hamper our efforts to design a new
> spec for async web apps with a previous spec that is designed from a
> sync point of view.
>
> W.r.t. Ejsgi having to buffer the response body in full, would you
> care to explain why you are under this impression? I've written
> streamers using the interface that I don't believe buffer more than a
> few chunks of each response at any given time.
>

The very simplest, most straightforward usage I can think of:

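function(request){
    // The stream is created and written to before the server has seen
    // the status and headers, so the chunk can only be buffered.
    var body = new request.jsgi.stream();
    var response = {
        status: 200,
        headers: {},
        body: body
    };
    body.write("hello world");
    body.close();
    return response;
}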

With the body.write() call, it's impossible to do anything except
buffer, because the server has not yet received the status and headers
to start the response. There is nothing wrong with the streaming
interface itself in this situation, but the streaming interface should
be provided with a callback when the server is ready to send something,
otherwise you end up writing to a stream that isn't connected to
anything yet.
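
The callback idea in the paragraph above can be sketched roughly as follows. This is only an illustration: passing a function as `body`, and the toy `serve` stand-in for a server, are assumptions made for the sketch and are not taken from the EJSGI or ForEachStreaming proposals.

```javascript
// Hypothetical app: instead of writing to a stream before returning,
// the app hands back a function that the server calls once it is
// ready to send the body.
function app(request) {
  return {
    status: 200,
    headers: {},
    body: function (stream) {
      stream.write("hello world");
      stream.close();
    }
  };
}

// Toy stand-in for a server. It records what would go out on the wire
// so the ordering is visible.
function serve(app, request) {
  var wire = [];
  var response = app(request);
  // Status and headers are known here, before any body chunk exists.
  wire.push("status " + response.status);
  response.body({
    write: function (chunk) { wire.push("chunk " + chunk); },
    close: function () { wire.push("end"); }
  });
  return wire;
}

var wire = serve(app, {});
```

Here the first write happens only after the status line has gone out, so nothing has to sit in a buffer waiting for the function to return.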

--
Thanks,
Kris

Ryan Dahl

unread,
Feb 2, 2010, 5:35:13 PM2/2/10
to comm...@googlegroups.com, Daniel N

I liked mob's idea about just having one duplex object. So what about

function(request){
request.status = 200;
request.write("hello world");
request.close();
}

Ryan Dahl

unread,
Feb 2, 2010, 5:36:27 PM2/2/10
to comm...@googlegroups.com
On Tue, Feb 2, 2010 at 2:35 PM, Ryan Dahl <coldre...@gmail.com> wrote:
> I liked mob's idea about just having one duplex object. So what about
>
>  function(request){
>   request.status = 200;
>   request.write("hello world");
>   request.close();
>  }
>

Oops, that should have been

function(request){
request.writeHeader(200, {});

Kris Zyp

unread,
Feb 2, 2010, 5:47:33 PM2/2/10
to comm...@googlegroups.com, Ryan Dahl

I like Node's API a lot more than this, it makes clear that you are
writing to response, not to the request (which obviously isn't
possible). Do you have any objections to us creating a CommonJS HTTP
Event Interface specification based on Node's API? (I am still
suggesting that this would be distinct from JSGI since it is so
radically different).

--
Thanks,
Kris

Ryan Dahl

unread,
Feb 2, 2010, 5:58:55 PM2/2/10
to Kris Zyp, comm...@googlegroups.com
On Tue, Feb 2, 2010 at 2:47 PM, Kris Zyp <kri...@gmail.com> wrote:
>
> I like Node's API a lot more than this, it makes clear that you are
> writing to response, not to the request (which obviously isn't
> possible).

I think people will get used to it.

> Do you have any objections to us creating a CommonJS HTTP
> Event Interface specification based on Node's API? (I am still
> suggesting that this would be distinct from JSGI since it is so
> radically different).

Yeah - I do mind. :)
I want to change Node's HTTP API soon and I'm enjoying this discussion.

mob

unread,
Feb 2, 2010, 6:28:36 PM2/2/10
to CommonJS

On Feb 2, 2:35 pm, Ryan Dahl <coldredle...@gmail.com> wrote:

> I liked mob's idea about just having one duplex object. So what about
>
>   function(request){
>    request.status = 200;
>    request.write("hello world");
>    request.close();
>   }

+1 (of course)

I like having a low level status, write and close API. Often it is
easier to use than trying to force the code to return a {} like JSGI
sync. This works well.

-mob

mob

unread,
Feb 2, 2010, 6:34:19 PM2/2/10
to CommonJS
> I like Node's API a lot more than this, it makes clear that you are
> writing to response, not to the request (which obviously isn't
> possible). Do you have any objections to us creating a CommonJS HTTP
> Event Interface specification based on Node's API? (I am still
> suggesting that this would be distinct from JSGI since it is so
> radically different).

I've got some issues with the Node API as currently stands. But we
could use it as a starting point and evolve, but I think we should
apply the best of recent learning to get this right. There are some
error cases that are not coverable with the current Node API. But that
is just my opinion.

If we start with Node, great. But let's fix the issues. If we are just
going to standardize on the existing Node API -- it will hurt in the
long run. But I suppose it will short-circuit the need for CommonJS
discussions ;-)

-mob

George Moschovitis

unread,
Feb 2, 2010, 6:56:58 PM2/2/10
to CommonJS
> I like having a low level status, write and close API. Often it is
> easier to use than trying to force the code to return a {} like JSGI
> sync.
> this works well.


On the other hand I like the 'functional flavor' of the JSGI API: The
app is a function, you pass an input (request/env) you get an output
(response).

-g.

Tom Robinson

unread,
Feb 2, 2010, 7:15:51 PM2/2/10
to comm...@googlegroups.com

On Feb 2, 2010, at 2:35 PM, Ryan Dahl wrote:

> I liked mob's idea about just having one duplex object. So what about
>
> function(request){
> request.status = 200;
> request.write("hello world");
> request.close();
> }

I worry about this style API making middleware prohibitively difficult.

There's a bit of elegance in passing the request object as an argument and returning a response object.

I maintain that middleware is one of the most interesting aspects of JSGI-like interfaces.
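
A minimal sketch of the request-in, response-out style of middleware, assuming a JSGI-like app that returns an object with status, headers, and a forEach-able body. The names `addHeader` and `hello` are made up for illustration.

```javascript
// Middleware is just a function that takes an app and returns an app.
function addHeader(app, name, value) {
  return function (request) {
    var response = app(request);    // call the wrapped app
    response.headers[name] = value; // adjust the returned response
    return response;
  };
}

// A trivial JSGI-style app.
function hello(request) {
  return { status: 200, headers: {}, body: ["hello world"] };
}

var wrapped = addHeader(hello, "X-Example", "sketch");
var result = wrapped({});
```

Because the response is a plain return value, the wrapper can inspect or change it without the wrapped app knowing.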

-tom

mob

unread,
Feb 2, 2010, 7:20:34 PM2/2/10
to CommonJS

On Feb 2, 4:15 pm, Tom Robinson <tlrobin...@gmail.com> wrote:
> I worry about this style API making middleware prohibitively difficult.
>
> There's a bit of elegance in passing the request object as an argument and returning a response object.
>
> I maintain that middleware is one of the most interesting aspects of JSGI-like interfaces.

How does this make it prohibitive to make middleware? Can you
explain?

-mob

Daniel N

unread,
Feb 2, 2010, 7:26:56 PM2/2/10
to comm...@googlegroups.com
Middleware that is expecting to handle a response now may not get at that response, or may try to deal with it in addition to the application endpoint. 

These are not show stoppers, and it's what chain dealt with, but it is not all straightforward and could require some extra grey matter expenditure.

Chain dealt with this by allowing for the addition of maybe callbacks (callbacks that are called if the request comes back this way) and callbacks that are executed before the headers are sent.  

Listeners can be added to the response so that cleanup can happen later. Other than that, the idea was that you could respond directly if you wanted, or hand back the callback to deal with the response. When I changed that to allow dealing with a stream, it became so similar to ejsgi that I decided to deprecate it in favor of ejsgi and common ground.

The biggest difference between ejsgi and chain is that chain allows for direct sending of headers and body chunks.

~Daniel

Daniel Friesen

unread,
Feb 3, 2010, 1:50:34 AM2/3/10
to comm...@googlegroups.com
Daniel N wrote:
>
>
> On Wed, Feb 3, 2010 at 11:20 AM, mob <m...@embedthis.com
> <mailto:m...@embedthis.com>> wrote:
>
>
>
> On Feb 2, 4:15 pm, Tom Robinson <tlrobin...@gmail.com
I think we had a discussion a while back that could provide some
reference and ideas on this topic.
Though I can't seem to find it.

It was a discussion on an API where some sort of response function was
passed to the app and called to start with the status and headers to
start the response.
I came up with a variation of it and provided some examples of how
MiddleWare could work.

Ack... T_T Now that I think about it. We never finished dealing with
that copyright issue on the wiki.

Mike Wilson

unread,
Feb 18, 2010, 4:43:07 PM2/18/10
to comm...@googlegroups.com
Ryan Dahl wrote:
> On Thu, Jan 28, 2010 at 10:49 PM, mob <m...@embedthis.com> wrote:
> > This is simpler than having to manually throttle incoming data.
> > Same for output.
>
> I disagree. This models better what is actually happening. Data
> arrives, you deal with it.

mob wrote:
> The alternative is to make Request a stream and so you read from
> it using read(). The advantage is that you can choose to read or
> not read the data. [...]
> Lastly, you don't ever need to tell the stream to pause or stop
> sending data. It will flow control automatically.

Yes, I think this is better as it uses the buffers and flow-control
of the actual communication device (socket etc) instead of doing
unlimited/throttle-controlled transfers on user space buffers.

Ryan Dahl wrote:
> Only one "middleware" can begin the response - I imagine that one
> does know the status code.

It is possible to design streaming middleware where status code and
headers are modified by any stage in the middleware chain. The
response is, so to speak, "begun" at every stage in the chain, as each
middleware can consider itself the head of the chain it is
aware of, but of course only the outer middleware begins the real
response. (I sketched on this in my Pipe proposal.)
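
As a rough sketch of that idea (not the Pipe proposal itself): each stage hands the inner app a wrapped response object, so every stage sees the header call on its way out and only the outermost object touches the wire. The `writeHeader(status, headers)` signature is borrowed from Ryan's correction earlier in the thread; everything else here is a hypothetical name.

```javascript
// Outermost "real" response: records what actually reaches the wire.
function realResponse(wire) {
  return {
    writeHeader: function (status, headers) {
      wire.push({ status: status, headers: headers });
    },
    write: function (chunk) { wire.push(chunk); },
    close: function () { wire.push(null); } // null marks end of response
  };
}

// One middleware stage: it gives the inner app a response whose
// writeHeader passes through this stage before going further out.
function withServerHeader(app) {
  return function (request, response) {
    app(request, {
      writeHeader: function (status, headers) {
        headers["Server"] = "sketch"; // this stage modifies the headers
        response.writeHeader(status, headers);
      },
      write: function (chunk) { response.write(chunk); },
      close: function () { response.close(); }
    });
  };
}

// Innermost app: as far as it can tell, it begins the response itself.
function hello(request, response) {
  response.writeHeader(200, {});
  response.write("hello world");
  response.close();
}

var wire = [];
withServerHeader(hello)({}, realResponse(wire));
```

Body chunks stream straight through each stage, so a stage that only cares about headers adds no buffering.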

Kris Zyp wrote:
> I've mentioned this before, but I still think it might be a
> good idea to standardize on a something quite different from JSGI,
> an event-oriented HTTP interface, more similar to Node's.

+1

Best regards
Mike
