S3 streaming upload and download


João Pereira

Jan 24, 2012, 10:28:12 AM
to golia...@googlegroups.com
Hi,

Does anyone have a code sample or experience doing this in goliath with large files? I need to process these downloads in chunks, yielding them to #response from my S3 store backend class (I'm doing the same with the current filesystem backend class without problems).

I have already looked at the uber-s3 gem, but it makes no mention of streaming, and with the happening gem the server goes down when I try this:

    # store/s3.rb
    def get
      item = Happening::S3::Item.new(@bucket, ...)
      item.get.stream do |chunk|
        yield chunk
      end
    end

    # goliath server app file
    def response(env)
      # ...
      operation = proc { store.get { |chunk| env.chunked_stream_send chunk } }
      callback  = proc { env.chunked_stream_close }

      EM.defer operation, callback
      # Proper streaming response ...
    end

After issuing the request with curl, the server aborts with:

[9742:INFO] 2012-01-24 15:21:20 :: Starting server on 0.0.0.0:9000 in development mode. Watch out for stones.
(eval):11:in `yield': can't yield from root fiber (FiberError)
from (eval):11:in `get'
        ...
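
One possible reading of this trace, offered as an assumption rather than a confirmed diagnosis: the block passed to EM.defer runs on a thread-pool thread, and any library call that uses Fiber.yield there has no resumed fiber to hand control back to. The sketch below uses plain Ruby only (no goliath or happening) and the method names are made up for illustration. It shows that Fiber.yield raises FiberError on a bare thread but works inside a resumed fiber.

```ruby
# Fiber.yield only works inside a fiber that something has resumed.
# A freshly started thread (like a thread-pool worker) runs on its root fiber.

# Hypothetical helper: try to yield from a plain thread.
def yield_from_thread
  Thread.new do
    begin
      Fiber.yield :chunk   # no enclosing fiber to hand control back to
      :ok
    rescue FiberError => e
      e.class              # the message text differs between Ruby versions
    end
  end.value
end

# Hypothetical helper: the same yield, but inside a resumed fiber.
def yield_from_fiber
  fiber = Fiber.new { Fiber.yield :chunk }
  fiber.resume             # returns the value handed to Fiber.yield
end
```

If this reading is right, the download has to be started from the reactor side (inside the request's fiber or from plain callbacks) rather than from a deferred thread.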



Peter Kieltyka

Jan 24, 2012, 12:55:39 PM
to golia...@googlegroups.com
Are you just downloading a file stored in an S3 bucket and streaming it as the response?

Why not just use em-http-request? It would work fine if the file is publicly accessible. Otherwise you'll need S3 URL auth to access the resource, but that's still doable.
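
A minimal sketch of what "S3 URL auth" could look like, assuming the legacy query-string scheme (Signature Version 2) that was current when this thread was written; newer S3 regions require Signature Version 4 instead. The function name and its parameters are made up for illustration, and the object key is assumed to need no extra URL escaping.

```ruby
require "openssl"
require "uri"

# Builds a time-limited GET URL for a private S3 object (legacy SigV2 style).
# expires_at is a Time or an integer Unix timestamp.
def signed_s3_url(bucket, key, access_key, secret_key, expires_at)
  expires = expires_at.to_i
  # Verb, Content-MD5, Content-Type, Expires, then the canonical resource.
  string_to_sign = "GET\n\n\n#{expires}\n/#{bucket}/#{key}"
  digest = OpenSSL::HMAC.digest(OpenSSL::Digest.new("sha1"), secret_key, string_to_sign)
  # Base64 without line breaks, then escaped for use in a query string.
  signature = URI.encode_www_form_component([digest].pack("m0"))
  "https://#{bucket}.s3.amazonaws.com/#{key}" \
    "?AWSAccessKeyId=#{access_key}&Expires=#{expires}&Signature=#{signature}"
end
```

The resulting URL could then be fetched like any public one, for example with EventMachine::HttpRequest.new(url).get and its #stream callback.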

Joao Pereira

Jan 24, 2012, 1:39:36 PM
to golia...@googlegroups.com
Hi,

Yes, I need to be able to:

1. POST a file to the server, buffer it in a tempfile and upload it to S3 later
2. GET a file from S3 and stream it in chunks to the client

The files are private, but I have the private key, the access key and the bucket name. Wouldn't em-http-request run into the same error? In this sample, happening returns an em-http response, and #stream is actually just the em-http #stream method.
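
For the first requirement, the buffering step could be sketched roughly as below. This is only an illustration using Ruby's standard Tempfile; the UploadBuffer class and its method names are hypothetical, and the later upload to S3 is left out.

```ruby
require "tempfile"

# Collects upload chunks in a tempfile so the S3 upload can happen later.
class UploadBuffer
  attr_reader :bytes_written

  def initialize(prefix = "upload")
    @file = Tempfile.new(prefix)
    @file.binmode            # uploads are arbitrary bytes, not text
    @bytes_written = 0
  end

  # Appends one chunk; returns self so calls can be chained.
  def <<(chunk)
    @bytes_written += @file.write(chunk)
    self
  end

  # Flushes and rewinds, returning the file ready to be read back.
  def finish
    @file.flush
    @file.rewind
    @file
  end

  # Closes and deletes the tempfile.
  def discard
    @file.close!
  end
end
```

In a goliath app, the chunks would presumably be appended from the on_body(env, data) hook, with finish called once the request body is complete.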

Peter Kieltyka

Jan 24, 2012, 1:52:16 PM
to golia...@googlegroups.com
Cool. I wasn't aware that Happening returned an em-http object; at least that's a step in the right direction. Look at the source of em-http-request and try to figure out what is going on.

Also, did you look at the example in the goliath source, "stream.rb"?

Joao Pereira

Jan 24, 2012, 1:54:33 PM
to golia...@googlegroups.com
Hi,

Currently I have this POST/GET streaming process working fine for the filesystem backend option; I'm now trying to figure out what's happening with that error in my S3 backend implementation.

João Pereira

Jan 24, 2012, 4:51:29 PM
to golia...@googlegroups.com
As expected, I tried using em-http with the file made public on S3, and the problem is still there: it can't yield from there. Is there any way to make this work?

Ilya Grigorik

Jan 25, 2012, 3:27:15 AM
to golia...@googlegroups.com
Which of the gets is failing?

item.get.stream do |chunk| -- ?

Hard to tell from your stack trace. It may be due to how the variables are getting captured once you start descending down multiple callbacks.

operation = proc { |env| store.get { |chunk| env.chunked_stream_send chunk } }

Try explicitly capturing the env param...

ig

João Pereira

Jan 25, 2012, 7:20:58 AM
to golia...@googlegroups.com
Hi,

Sorry for the lack of detail in the backtrace. I've created a simplified server example of my own (I've stripped out a lot of details, and it tests with a public Dropbox file for convenience). It is ready to run, with the backtrace in comments, in this small gist: https://gist.github.com/1675971

I've been at this for hours since yesterday and can't get it to work... :S

Ilya Grigorik

Jan 26, 2012, 2:13:07 AM
to golia...@googlegroups.com
https://gist.github.com/1681479 - does the trick for me.

João Pereira

Jan 26, 2012, 7:06:17 AM
to golia...@googlegroups.com
Great! It works perfectly. I was going crazy over that, thanks :) Also, this is a better approach than my previous defer, as it keeps the connection open until the download ends.

One last tricky question:

I need to move the download logic to a separate class, so I have done it this way: https://gist.github.com/1675971 , moving the proc and callbacks to the store class. It successfully downloads the file, but when it reaches env.chunked_stream_close the server aborts, reporting that it doesn't know about 'env'. So the server goes down: the downloaded file already has the expected size, but it is corrupted because the connection was aborted.

How can I close the stream properly from there (outside server file)?

I have worked around this by changing the finish callback proc to yield nil and then, inside next_tick, closing the stream if chunk.nil?, but perhaps there is a better way to do this.

Ilya Grigorik

Jan 27, 2012, 1:28:48 AM
to golia...@googlegroups.com
Wouldn't "store.get(env) { |chunk| env.chunked_stream_send chunk }" do the trick?
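
A rough sketch of that idea: env is handed to the store explicitly, so the store can close the stream itself once the last chunk has gone out. Both classes below are stand-ins made up for illustration. Store serves chunks from an in-memory array instead of S3, and FakeEnv only mimics the two env methods used in this thread so the example runs without goliath.

```ruby
# Hypothetical store: an array of chunks stands in for the S3 download.
class Store
  def initialize(chunks)
    @chunks = chunks
  end

  # env is a parameter here, not something the store can see on its own.
  def get(env)
    @chunks.each { |chunk| yield chunk }
    env.chunked_stream_close   # close once every chunk has been yielded
  end
end

# Stand-in for the goliath env; records what would be sent to the client.
class FakeEnv
  attr_reader :sent, :closed

  def initialize
    @sent = []
    @closed = false
  end

  def chunked_stream_send(chunk)
    @sent << chunk
  end

  def chunked_stream_close
    @closed = true
  end
end

env = FakeEnv.new
Store.new(%w[part1 part2]).get(env) { |chunk| env.chunked_stream_send chunk }
```

In an asynchronous store the close call would move into the download's completion callback, but the point is the same: whatever needs env has to receive it as an argument.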

Joao Pereira

Jan 27, 2012, 7:29:18 AM
to golia...@googlegroups.com
Yes, that would do it. I was assuming env would be available everywhere, thanks.
