On 22.04.2013, at 13:37, James Abbott <abbo...@gmail.com> wrote:
>
> Hi Ben,
>
> many thanks for addressing my question.
>
>> I assume your users are authenticated? You have a user record of some description? Once your upload is received you could store the generated ID somewhere in your user record - whether that's something like user#uploaded_files or similar.
>
> No, that's not the case (at least at this stage). I have no models / records of any kind - I'm envisioning this as a web page with an upload form that anyone can use so long as the files are in the right format. So if you uploaded a file, you'd see its output displayed on the same page.
You should be aware that it would be very easy to run a DoS attack against your server: someone only has to upload a large file, or one that takes a long time to process. And while you don't need a DB for this, you do need some way to store data temporarily, such as Redis.
>> As a general point: it's generally considered best practice to do any heavy lifting out-of-band. If possible you should run a separate worker that watches for uploaded files and processes them outside of your web's worker process. You could modify your UI to poll for completion of the job and present the result to your user when it has finished.
>
> Do you have suggestions to any reading material / tutorials that explain this, ideally through an example? Would EventMachine be suitable for this?
No, you usually use a queue for that (e.g. Redis with Resque, though there are many, many other solutions). Your web process adds a job with a payload to the queue (the payload can be basically anything; you probably want to include the file name/path and maybe some other info). One or more workers on the other end listen for incoming jobs and process them, putting the result somewhere a web process can later retrieve it.
That way, your web process will never take too long to handle a single request, and your web page will stay responsive. You can then use something like long polling to have the client wait until the result is ready - this is something where EventMachine may come in handy.
So, the whole thing could look like this:
* User uploads a file to your web server.
* Web process stores the file somewhere, and puts its name and other info (including a key under which the result should be stored) into the worker queue. It then tells the browser to start waiting for the result.
* A background worker receives the job, and starts processing it.
* After it has finished processing, it puts the result into Redis using the key the web process gave it.
* As the web browser keeps polling the server for the result, it sees that there's now data available in Redis under the given key.
* Your web server returns that data.
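
To make the steps above a bit more concrete, here is a rough sketch in plain Ruby. It is not real Resque or Redis code: an in-process Queue stands in for the job queue and a Hash stands in for Redis, so that it runs without any extra services. All method names (enqueue_upload, process_file, start_worker, fetch_result, poll_for_result) are made up for illustration, and in a real setup the worker would be a separate process rather than a thread.

```ruby
require "securerandom"

# Stand-ins (assumptions, so the sketch runs on its own):
# JOBS plays the role of the job queue, RESULTS the role of Redis.
JOBS = Queue.new
RESULTS = {}
RESULTS_LOCK = Mutex.new

# Web process: the upload is already stored at `path`. Enqueue a job and
# return the key the browser will use when polling for the result.
def enqueue_upload(path)
  key = SecureRandom.uuid
  JOBS << { key: key, path: path }
  key
end

# Hypothetical processing step; the real heavy lifting would go here.
def process_file(path)
  "processed #{File.basename(path)}"
end

# Background worker: take jobs off the queue one at a time and store
# each result under the key that came with the job.
def start_worker
  Thread.new do
    while (job = JOBS.pop)
      result = process_file(job[:path])
      RESULTS_LOCK.synchronize { RESULTS[job[:key]] = result }
    end
  end
end

# Web process, polling endpoint: returns nil while the result is not ready.
def fetch_result(key)
  RESULTS_LOCK.synchronize { RESULTS[key] }
end

# Simulates the browser polling until data shows up or the timeout expires.
def poll_for_result(key, timeout: 5, interval: 0.05)
  deadline = Time.now + timeout
  while Time.now < deadline
    result = fetch_result(key)
    return result if result
    sleep interval
  end
  nil
end
```

To try it, call start_worker once, then key = enqueue_upload("/some/path/file.csv") followed by poll_for_result(key). The point is only that the request that accepts the upload returns right away, and the result is picked up later by key.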
g, Markus