Concurrent file uploads, mongrel clusters, and Amazon S3

Jake Cahoon

Apr 19, 2007, 12:16:55 PM

to ur...@googlegroups.com

I am trying to get a stable deployment of my site but keep running into issues with mongrel, file uploads, and using Amazon S3.

Is anyone using S3 to store attachments? I am using Rick Olson's plugin, attachment_fu, with the storage mechanism set to S3. It is a very fast way to get S3 storage working with your site. My problem seems to be a combination of two issues. First, Rails can't handle concurrent file uploads, because each mongrel process blocks while it is answering a request. Second, the AWS-S3 library that attachment_fu uses to communicate with S3 isn't thread-safe.

The site keeps crashing if more than a couple of people start using it.

Has anyone solved this problem, or does anyone have suggestions or workarounds?

Thanks,

Jake Cahoon

Jamis Buck

Apr 19, 2007, 12:25:48 PM

to ur...@googlegroups.com

Jake,

We're using S3 storage at 37signals for Campfire, Highrise, and
Basecamp. Here's (more or less) how we're doing it.

1. We have an upload.cgi script that we use to proxy the uploads into
our app. This prevents a large upload from tying up a listener. The
upload.cgi simply accepts an upload, saves it to a shared directory,
rewrites the parameters to include the filename instead of the
upload, and then forwards the request on to the app.

2. We store the file on our own filesystem first, and have an
asynchronous batch process running that uploads pending files to S3
in the background.

3. For the first 30 minutes after a file has been uploaded to S3, we
continue to serve the file from our local filesystem, to give the
file time to propagate throughout S3's system. 30 minutes is way
overkill, but we ran into issues (especially with Campfire) where a
file would occasionally not be available immediately after it was
uploaded to S3, so we implemented this buffer period.
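
(A rough sketch of the three steps above, for illustration only. This is
not 37signals' actual code: every method name, the parameter layout, and
the random file naming are assumptions, and the real S3 call is left to
a block so that no particular library API is implied.)

```ruby
require "securerandom"
require "stringio"
require "tmpdir"

# Hypothetical constant: the 30-minute buffer period from step 3, in seconds.
LOCAL_SERVE_WINDOW = 30 * 60

# Step 1 (what an upload proxy might do): save the uploaded data to a
# shared directory and return a copy of the parameters in which the upload
# is replaced by the path of the saved file.
def stash_upload(params, key, shared_dir)
  upload = params.fetch(key) # any IO-like object
  path = File.join(shared_dir, SecureRandom.hex(8))
  File.open(path, "wb") { |f| IO.copy_stream(upload, f) }
  params.merge(key => path)
end

# Step 2 (background batch run): hand each pending path to the block,
# which is assumed to perform the actual upload to S3. Successful uploads
# are timestamped in uploaded_at; failed ones are returned so that the
# next run can retry them.
def upload_pending(pending, uploaded_at, now: Time.now)
  pending.reject do |path|
    begin
      yield path
      uploaded_at[path] = now
      true
    rescue StandardError
      false
    end
  end
end

# Step 3: serve from the local filesystem until the file has been on S3
# for the whole buffer period (or if it has not been uploaded at all).
def serve_locally?(uploaded_at, now: Time.now)
  uploaded_at.nil? || now - uploaded_at < LOCAL_SERVE_WINDOW
end
```

The app itself would only ever see a file path in its parameters, and
the batch run can be invoked from cron or a long-running worker; either
way the listener is free again as soon as the file is on disk.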

We are using the AWS::S3 library (Marcel wrote it for us while he was
working here, after all), but we are using our own glue code to tie
it into our applications. (I hadn't heard of attachment_fu until you
mentioned it here.)

So, that works for us. It's a lot of infrastructure, and a lot of
moving parts, but it has been extremely robust for us, and allows us
to process a very high volume of uploads and downloads. I understand
that the S3 team is working on a way to allow applications to upload
directly to S3...we're eagerly awaiting that feature, and if it turns
out to be compatible with our applications, we'll definitely take
advantage of that since it would allow us to get rid of the
upload.cgi proxy on the front end.

- Jamis

Jake Cahoon

Apr 19, 2007, 2:47:51 PM

to ur...@googlegroups.com

Jamis,

Thanks for the advice. It's nice to hear some confirmation of the road I thought we might need to go down. I was hoping for a simpler solution, but it doesn't look like I'll wiggle my way out of this one.

This will be a fun challenge.

Jake

--
Jake Cahoon
M: 801-369-9813
Skype: jake.cahoon
