streaming api

Dominic Sisneros

Feb 2, 2010, 11:47:59 PM
to rack-...@googlegroups.com
The Perl folks jumped on the Rack and WSGI trend in a big way with PSGI and are doing a lot of interesting stuff.

One of the things they have done differently is to offer an optional streaming interface.

http://bulknews.typepad.com/blog/2009/10/psgiplack-streaming-is-now-complete.html

So, basically the idea is the same as the original Python WSGI's start_response but this callback is NOT an optional parameter to the app because that stands in the way of everybody in the chain including middleware and that sucks. Instead, an app can optionally return a callback that accepts another callback to which you can return the response array ref (code, headers and body) if you want to delay your response.

my $app = sub {
  my $env = shift;
  return sub {
    my $respond = shift;
    # do some event stuff
    $event->callback(sub { $respond->([ $code, $headers, $body ]) });
  };
};

If you also want to delay delivery of the content body (i.e. streaming), you can omit the body, in which case you'll get a writer object that has write(), close() and poll_cb().

my $app = sub {
  my $env = shift;
  return sub {
    my $respond = shift;
    my $w = $respond->([ 200, [ 'Content-Type' => 'text/plain' ]]); # no $body here
    # do more event stuff
    $event->callback(sub { $w->write($body) });
    $event->done_callback(sub { $w->close });
  };
};
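
For readers more at home in Ruby, the delayed-response shape above might be sketched roughly as follows. This is only an illustration of the idea, not part of the Rack spec or any server: StreamWriter and serve are hypothetical names, and a real server would write to the client socket instead of collecting chunks in memory.

```ruby
# Hypothetical writer handed back when the app omits the body.
# It only records chunks; a real server would write them to the socket.
class StreamWriter
  attr_reader :chunks

  def initialize
    @chunks = []
    @closed = false
  end

  def write(chunk)
    raise IOError, "writer is closed" if @closed
    @chunks << chunk
  end

  def close
    @closed = true
  end

  def closed?
    @closed
  end
end

# The app returns a callable instead of a [status, headers, body] triplet.
app = lambda do |env|
  lambda do |respond|
    # No body given, so we get a writer back and can stream at our own pace.
    writer = respond.call([200, { "Content-Type" => "text/plain" }])
    writer.write("hello ")
    writer.write("world")
    writer.close
  end
end

# Hypothetical server-side glue: call the app, and if it hands back a
# callable, pass it a respond callback.
def serve(app, env)
  result = app.call(env)
  return result if result.is_a?(Array) # ordinary, non-delayed response

  captured = {}
  respond = lambda do |response|
    status, headers, body = response
    captured[:status] = status
    captured[:headers] = headers
    if body
      captured[:body] = body
      nil
    else
      captured[:writer] = StreamWriter.new
    end
  end
  result.call(respond)
  captured
end

out = serve(app, {})
```

Because the callable is returned rather than passed in as a parameter, middleware that does not care about streaming can pass the return value through untouched.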

That post also links to other posts about their streaming, Rack-like interface.

Should Rack pursue something like this?


Eric Wong

Feb 3, 2010, 5:59:39 AM
to rack-...@googlegroups.com

Hi Dominic,

Rack already lets you stream the body. The body response only has to
respond to #each and the #each call can delay as much as it wants. Ruby
1.8 has (somewhat) cheap threads, and 1.9 has even cheaper Fibers so
you can sleep synchronously inside them.
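
A minimal sketch of that point, assuming nothing beyond the Rack 1.x contract that a body responds to #each: the body below yields one chunk at a time and sleeps between chunks. SlowBody is a made-up name, and the short sleep stands in for whatever the app is actually waiting on.

```ruby
# A Rack body only needs #each; here each chunk is produced lazily,
# with a pause standing in for slow upstream work.
class SlowBody
  def initialize(chunks, delay = 0.01)
    @chunks = chunks
    @delay = delay
  end

  def each
    @chunks.each do |chunk|
      sleep @delay # blocks this thread (or Fiber, under a Fiber-aware server)
      yield chunk
    end
  end
end

app = lambda do |env|
  [200, { "Content-Type" => "text/plain" }, SlowBody.new(["tick\n", "tock\n"])]
end

# What a server does, in essence: iterate the body and write each chunk out.
status, headers, body = app.call({})
received = []
body.each { |chunk| received << chunk }
```

Whether the client sees the chunks as they are produced depends on the server and on any middleware that buffers the body.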

Additionally, James Tucker made some non-standard Rack extensions in
Thin (using EventMachine) that let you use EventMachine callbacks
similar to the Perl examples above. Take a look at the examples
packaged with Thin and their use of env["async.callback"] and also the
async_sinatra gem. James had plans to start working on a Rack 2.x based
entirely around asynchronous code, but (afaik) hasn't gotten to
publishing anything yet.

The Ebb server also has env["async.callback"] support using Rev instead
of EM. However, Ebb+Rev has far less traction than Thin+EM.
async_sinatra relies on EventMachine and can't use Rev at the moment.

There's also Rainbows! (and Zbatery) which may use EventMachine and
should[1] emulate async extensions found in Thin.


IMHO, Ruby 1.9 Fibers are a "close enough" concurrency solution for Ruby
and already fit the Rack 1.x specs without modifications. While Fibers
will always be more expensive than async models, I can live with that
given the application is already implemented in Ruby...


[1] - I'm the author of Rainbows! and Zbatery, but I don't know of
anybody actively using EventMachine with them. Please let us know if
you find anything broken. I have only had some small demos running
Revactor and some of the Fiber-based concurrency models. The only two
production sites I've ever heard of using Rainbows! were running the
ThreadSpawn concurrency model.

--
Eric Wong

James Tucker

Feb 3, 2010, 6:02:33 AM
to rack-...@googlegroups.com
Yes, this looks disturbingly similar to my Thin patches from 1.2.

Tom Robinson

Feb 3, 2010, 6:25:58 AM
to rack-...@googlegroups.com
On Feb 2, 2010, at 8:47 PM, Dominic Sisneros wrote:

> The perl folks jumped on the rack & wsgi trend in a big way with psgi and are doing a lot of interesting stuff.
>
> One of the things that they have done differently is have an optional streaming interface.
>
> http://bulknews.typepad.com/blog/2009/10/psgiplack-streaming-is-now-complete.html
>
> So, basically the idea is the same as the original Python WSGI's start_response but this callback is NOT an optional parameter to the app because that stands in the way of everybody in the chain including middleware and that sucks. Instead, an app can optionally return a callback that accepts another callback to which you can return the response array ref (code, headers and body) if you want to delay your response.

FWIW, JSGI (Rack/WSGI for JavaScript) is investigating/debating using promises for asynchronous / streaming. Here's a comparison of two versions of "AJSGI" to PSGI's approach (with your PSGI examples ported to JavaScript):


Lots of discussion going on in the CommonJS mailing list too BTW: http://groups.google.com/group/commonjs

-tom

Jeremy Hinegardner

Feb 3, 2010, 1:08:36 PM
to rack-...@googlegroups.com

You might want to also check out Rack::StreamingProxy

http://github.com/aniero/rack-streaming-proxy

enjoy,

-jeremy


--
========================================================================
Jeremy Hinegardner jer...@hinegardner.org

Randy Fischer

Feb 3, 2010, 1:14:16 PM
to rack-...@googlegroups.com
> Rack already lets you stream the body. The body response only has to
> respond to #each and the #each call can delay as much as it wants. Ruby
> 1.8 has (somewhat) cheap threads, and 1.9 has even cheaper Fibers so
> you can sleep synchronously inside them.


I'm more interested in getting streaming content on the request
side of things (go ahead and parse the headers, then just give
me an IO object for the content).

Reason: I occasionally get largish (> 4G) bodies in a particular
archival service; I need to compute md5 and sha1 and perhaps other
checksums soon. I don't want the web server copying the body
to some temp space, thanks; I'll write it and checksum it and
so on, on the fly.
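
As a rough sketch of the on-the-fly part, assuming the server hands the app an IO-like object that supports read(length, buffer) (in Rack that would be env["rack.input"]): the helper below copies the input to an output in fixed-size chunks while updating both digests. copy_with_checksums is a hypothetical name, and whether the server has already buffered the body somewhere is outside its control.

```ruby
require "digest/md5"
require "digest/sha1"
require "stringio"

# Copy an IO-like input to an IO-like output in chunks, computing
# MD5 and SHA1 as the data passes through. Nothing is held in memory
# beyond one chunk.
def copy_with_checksums(input, output, chunk_size = 65_536)
  md5 = Digest::MD5.new
  sha1 = Digest::SHA1.new
  buf = String.new # reused for every read to avoid reallocating

  # read(length, buffer) returns nil at end of input
  while input.read(chunk_size, buf)
    md5.update(buf)
    sha1.update(buf)
    output.write(buf)
  end

  { md5: md5.hexdigest, sha1: sha1.hexdigest }
end

# Usage with in-memory stand-ins for the request body and the target file.
payload = "hello world" * 1000
input = StringIO.new(payload)
output = StringIO.new
sums = copy_with_checksums(input, output, 1024)
```

In an app, input would be the request body and output an open File for the archive target.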

Is anyone doing work on this at the Rack level?

-Randy Fischer

James Tucker

Feb 4, 2010, 5:32:16 AM
to rack-...@googlegroups.com
Not really, no. At this time it's down to the web server, not Rack: Rack expects a fully formed request, and most servers deliver just that.



Eric Wong

Feb 4, 2010, 5:39:26 AM
to rack-...@googlegroups.com

lib/rack/file.rb is an example of how to use the #each method
to stream a regular file off the filesystem:

  # Inside Rack::File, F is an alias for ::File
  def each
    F.open(@path, "rb") { |file|
      while part = file.read(8192)
        yield part
      end
    }
  end

You should easily be able to adapt it to any IO object, not just File.
If you happen to be using Rainbows! with a Fiber-based concurrency
model, you can actually just wrap a Rainbows::Fiber::IO[1] object around
your socket and use it as a response body:

I actually just started working on this earlier today for another
experiment:

  # -*- encoding: binary -*-
  # ResponseBody is not tied to Rainbows! and may be used with any
  # IO-like class. This is intended to be used as a body in a Rack
  # response.
  class ResponseBody < Struct.new(:upstream, :initial)

    def each(&block)
      buf = initial || ""
      yield buf unless buf.empty?

      # IO#readpartial signals end of stream by raising EOFError
      while upstream.readpartial(16384, buf)
        yield buf
      end
    rescue EOFError
      # upstream is exhausted, nothing more to yield
    end

    def close
      upstream.close
    end
  end

  # This is very much tied to Rainbows!
  class Upstream < Rainbows::Fiber::IO

    # non-blocking
    def initialize(addr = Socket.sockaddr_in(80, '192.168.0.1'))
      socket = Socket.new(Socket::AF_INET, Socket::SOCK_STREAM, 0)
      super(socket, ::Fiber.current)
      begin
        socket.connect_nonblock(addr)
      rescue Errno::EINPROGRESS
        wait_writable # schedules other Fibers while this waits
        retry
      rescue Errno::EISCONN
        # already connected, nothing left to do
      end
    end
  end

  # Usage in the Rack app should be something like this:
  lambda { |env|
    # open a socket to the upstream server, this may schedule other fibers
    upstream = Upstream.new(sockaddr)

    # send request, should be small enough to be non-blocking
    upstream.write("GET /foo HTTP/1.0\r\n\r\n")

    # wait for it, this can schedule and run other fibers while waiting
    buf = upstream.readpartial(16384)

    # TODO the parse_initial method used below is not implemented
    status, headers, initial_body = parse_initial(buf)

    [ status, headers, ResponseBody.new(upstream, initial_body) ]
  }

[1] - http://git.bogomips.org/cgit/rainbows.git/tree/lib/rainbows/fiber/io.rb?id=v0.90

--
Eric Wong

Eric Wong

Feb 4, 2010, 5:41:22 AM
to rack-...@googlegroups.com
Eric Wong <normal...@yhbt.net> wrote:
> Randy Fischer <randy....@gmail.com> wrote:
> > > Rack already lets you stream the body. The body response only has to
> > > respond to #each and the #each call can delay as much as it wants. Ruby
> > > 1.8 has (somewhat) cheap threads, and 1.9 has even cheaper Fibers so
> > > you can sleep synchronously inside them.
> > >
> > I'm more interested in getting streaming content on the request
> > side of things (go ahead and parse the headers, then just give
> > me an IO object on the content).

Never mind, I didn't notice the "request side" of things. I need sleep :x

--
Eric Wong

Tom Robinson

Feb 4, 2010, 10:00:34 AM
to rack-...@googlegroups.com

Also, this PSGI interface allows for returning a response *asynchronously*. Rack does not allow for async responses (e.g. long polling) nor async streaming (e.g. streaming comet).

James Tucker

Feb 4, 2010, 11:34:16 AM
to rack-...@googlegroups.com

Actually, Rack /allows/ for streaming bodies in the API, although specific scheduling of the body rendering is restricted by middleware code and the server's approach to body#each.

That being said, the latter two do work /around/ Rack's API in Thin and Rainbows!, although writing middleware for this API is not so trivial.
