Background processing options for Rails app


Steve Jorgensen

Jan 19, 2013, 6:17:07 AM
to pdx...@googlegroups.com
Hi all,

There seem to be a gazillion ways to run background processes from
Rails, and I'd like some advice on good options for a particular kind of
application.

I have Rails code working now to communicate with a piece of hardware and
store data in a database. Next, I need to run this in 2 different ways:
on a schedule and on demand by a Rails user clicking a button. The
process is a bit slow, so I'll want to run it in the background, but I
want it to start running as quickly as possible when requested, and the
process should never be executed twice at the same time in different
processes.

It would seem that I need a daemon running an event queue. For scheduled
execution, I figure I can just have a cron job execute a script that
adds a request to the queue, and with only one daemon process running, 2
instances of the task won't ever be executed concurrently.

For responsiveness, beaneater/beanstalkd looks like a good choice. The
only trouble is, how do I manage it cleanly? I'll want to make sure that
it completes the task in progress before exiting if it is asked to stop,
and I'll want to make sure that it is restarted automatically if it dies.

The daemons gem looks like a good tool for handling clean shutdown and
auto-restart, but it's not clear to me how to integrate daemons and
beaneater to get the result that I'm looking for.
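
A minimal sketch of the "finish the job in progress, then exit" behavior described above. It assumes a plain in-process Ruby `Queue` as a stand-in for a beanstalkd tube, and the names `SerialWorker`, `request_stop`, and `trap_signals` are hypothetical, not part of beaneater or daemons. The signal handler only sets a flag; the loop checks that flag between jobs, so a job is never interrupted midway.

```ruby
# Hypothetical single worker: runs one job at a time and finishes the job
# in progress before honoring a stop request.
class SerialWorker
  def initialize(queue, idle_sleep: 0.05)
    @queue = queue            # stand-in for a real job queue (assumption)
    @idle_sleep = idle_sleep  # how long to wait when the queue is empty
    @stop_requested = false
  end

  # Only sets a flag, so it is cheap enough to call from a signal handler.
  def request_stop
    @stop_requested = true
  end

  # Wire TERM/INT to a clean stop, e.g. for `kill <pid>` or Ctrl-C.
  def self.trap_signals(worker)
    %w[TERM INT].each { |sig| Signal.trap(sig) { worker.request_stop } }
  end

  def run
    until @stop_requested
      job = begin
        @queue.pop(true)      # non-blocking pop; raises ThreadError if empty
      rescue ThreadError
        nil
      end

      if job
        job.call              # runs to completion; the flag is checked afterwards
      else
        sleep @idle_sleep     # nothing queued, poll again shortly
      end
    end
  end
end
```

With only one such worker running, two jobs cannot execute at the same time, and a cron script or a Rails controller would both just add work to the same queue. Polling adds up to `idle_sleep` seconds of latency; a real beanstalkd client would block on a reserve call instead. Restarting the process if it dies is left to whatever supervises it.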

Any thoughts?

Thanks,

-- Steve J.

Mike Perham

Jan 19, 2013, 9:35:12 AM
to pdx...@googlegroups.com
Sidekiq! Use the clockwork gem to manage cron and inject jobs every time period.

Guaranteeing unique execution is a harder problem; I don't know of any OSS that can do that.
> --
> You received this message because you are subscribed to the Google Groups "pdxruby" group.
> To post to this group, send email to pdx...@googlegroups.com.
> To unsubscribe from this group, send email to pdxruby+u...@googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.

Matthew Boeh

Jan 19, 2013, 10:06:36 AM
to Portland Ruby Brigade
The simplest way to ensure that a given task is only executing once at a time is to work outside the queuing system and work with some kind of lock in the database. ActiveRecord's optimistic locking combined with a "job in process at" timestamp should be sufficient. The caveat is that you need to be prepared to deal with the worker process crashing/the server going down/etc. and leaving the timestamp in place. Having the timestamp be ignored after a certain period might work for you.
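
A rough sketch of the "timestamp that is ignored after a certain period" idea. To stay self-contained it does not use ActiveRecord: an in-memory object guarded by a `Mutex` stands in for the database row, and `JobLock`, `stale_after`, and `clock` are hypothetical names. In a real app the check-and-set would have to be a single atomic database operation (for example a conditional UPDATE) rather than a Ruby mutex.

```ruby
# In-memory stand-in for one database row holding a "job in process at"
# timestamp. (Assumption: a real version would live in a table.)
class JobLock
  def initialize(stale_after:, clock: -> { Time.now })
    @stale_after = stale_after # seconds after which a held lock is ignored
    @clock = clock             # injectable so the expiry logic can be tested
    @started_at = nil
    @mutex = Mutex.new         # plays the role of the database's atomic update
  end

  # Returns true if the caller now holds the lock, false if a job is running.
  def acquire
    @mutex.synchronize do
      now = @clock.call
      held = @started_at && (now - @started_at) < @stale_after
      return false if held
      @started_at = now        # fresh or stale: take (over) the lock
      true
    end
  end

  def release
    @mutex.synchronize { @started_at = nil }
  end

  # Runs the block only if the lock was acquired; always releases afterwards.
  def with_lock
    return false unless acquire
    begin
      yield
    ensure
      release
    end
    true
  end
end
```

Usage would look like `lock.with_lock { poll_hardware }`, where `poll_hardware` is a placeholder for the slow task. One limitation of this simple form: a worker whose lock expired while it was still running can release a lock that a newer worker has since taken, so `stale_after` should be comfortably longer than the slowest expected run.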

If you can get away with allocating only a single worker process per type of job, obviously that would work. If you can segment out your queue into smaller classes of job, each served by only one worker, you can still perform jobs concurrently without doing the same job more than once at a time. I've taken this approach in situations where I have a particular class of long-running, resource-intensive jobs that I can't afford to run more than once at a time per node.
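
The segmented-queue approach can be sketched roughly as follows, using threads and in-process `Queue` objects as stand-ins for separate worker processes and a real queue server. `SegmentedWorkers` and the job class names in the usage note are hypothetical.

```ruby
# One queue and exactly one worker thread per job class, so different
# classes run concurrently while jobs of the same class run one at a time.
class SegmentedWorkers
  def initialize(job_classes)
    @queues = job_classes.to_h { |name| [name, Queue.new] }
    @threads = @queues.map do |_name, queue|
      Thread.new do
        # Each thread drains only its own queue; a nil (pushed by
        # #shutdown) ends the loop.
        while (job = queue.pop)
          job.call
        end
      end
    end
  end

  # Adds a job (given as a block) to the queue for its class.
  def enqueue(job_class, &job)
    @queues.fetch(job_class) << job
  end

  # Lets already-queued jobs finish, then stops every worker.
  def shutdown
    @queues.each_value { |queue| queue << nil }
    @threads.each(&:join)
  end
end
```

For example, `SegmentedWorkers.new(%i[hardware_poll report])` followed by `enqueue(:hardware_poll) { ... }` would run hardware polls strictly one after another while reports proceed independently.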

Finally, the approach I'm working on right now is an external (on another machine, preferably) watchdog service which keeps track of the unique ID of jobs and the unique ID of the workers which are handling them. If the job doesn't finish and the worker has gone away, the watchdog can reissue the job (and unlock the record so a new job can run against it.)



Steve Jorgensen

Jan 19, 2013, 12:31:23 PM
to pdx...@googlegroups.com
It won't be as responsive if the job won't run when queued until the
next cron time period, which is why beanstalkd looks nice.

Won't pretty much any request queueing system (including Sidekiq) be
able to guarantee unique execution since if there's only one process
allocated, it can only process one task at a time and let additional
incoming requests wait in the queue?

Sam Livingston-Gray

Jan 19, 2013, 1:17:53 PM
to pdx...@googlegroups.com
Re: single process: I should probably just let Mike answer, but isn't Sidekiq's selling point that it uses threads to do I/O-intensive jobs in parallel?

--
(Sent from phone; please excuse brevity.)

Matthew Boeh

Jan 19, 2013, 3:16:35 PM
to Portland Ruby Brigade
Cron would be just for the scheduled runs, not the on-demand runs. You'd still set up a message queue with something like RabbitMQ, Redis, or beanstalkd to handle the on-demand jobs.

Of course, if you're only serving one long-running background job at a time, it's not going to be very responsive anyway. Unless you're doing a very low volume of jobs, I suppose.



Mike Perham

Jan 19, 2013, 4:16:34 PM
to pdx...@googlegroups.com
The job will run within an ms of being queued. Cron is to enqueue jobs on a schedule, which you asked for.

Using one single-threaded process is a trivial way to ensure serial, unique execution, but it doesn't scale terribly well. Using a database as a lock server is probably your simplest solution, to Matthew's point.

Steve Jorgensen

Jan 19, 2013, 5:41:57 PM
to pdx...@googlegroups.com
Sidekiq lets you configure the thread pool size, so you can specify 1.

Steve Jorgensen

Jan 19, 2013, 5:43:11 PM
to pdx...@googlegroups.com
There's only 1 user in this case, which results in a very low volume of jobs. <g>

