(possibly web based) alternative to cron for running scheduled jobs?

Michael Pearson

Aug 29, 2011, 8:14:19 PM
to rails-...@googlegroups.com
Hi,

I just posted this to serverfault[0] and was hoping that somebody here had solved the problem with a nice drop-in Ruby-friendly solution:

We're using cron to manage our backups and other jobs in multiple locations. Using chef to populate files in cron.daily, cron.hourly, etc has worked pretty well for us so far, but with some issues:

    • I don't want to have to manage a mailserver on the system just to receive cron output
    • I want to be able to put output in my cron jobs without receiving email about them if nothing went wrong
    • I don't want to have to check /var/log/messages to see if jobs failed without output
    • I don't want to have to log in to the system to find that the backup job is still running

Optimally, I'd like a web-based frontend that I can use to see this information, either as an extension on cron or a complete replacement.

I can solve the above problems myself with a bit of scripting, but I'm sure that this is a problem that others have solved already.

Note that I acknowledge that this is a completely separate issue from verifying the backups after they've been completed.

Nicholas Faiz

Aug 29, 2011, 9:08:34 PM
to rails-...@googlegroups.com
I think https://github.com/tobi/delayed_job is a straightforward and Rails friendly way of doing it. I set it up once but went back to cron, as it was simpler for what we wanted.

Michael Pearson

Aug 29, 2011, 9:12:12 PM
to rails-...@googlegroups.com
Yeah, we're already using delayed_job here, but as far as I can tell they solve different problems.





James Healy

Aug 29, 2011, 9:18:26 PM
to rails-...@googlegroups.com
On 30 August 2011 11:12, Michael Pearson <mipe...@gmail.com> wrote:
> Yeah, we're already using delayed_job here, but as far as I can tell they
> solve different problems.

I use delayed job to solve many of the issues you mention, but it
fails your multiple locations requirement.

I have simple cron tasks that push jobs (including backups) onto the
delayed job queue.

Hoptoad is set up to track job errors, and a simple Rails controller
queries the delayed_jobs table to report failed, running and upcoming
jobs.

However, there's no support for multiple queues, so you won't easily be
able to back up multiple machines.
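
A minimal sketch of that kind of status report, written in plain Ruby so it runs without Rails. It assumes the usual delayed_jobs columns run_at, locked_at and failed_at; JobRow, job_state and job_report are hypothetical names, and a real controller would load the rows through ActiveRecord rather than a Struct.

```ruby
# Stand-in for a row of the delayed_jobs table (hypothetical struct; a
# real app would read these columns through ActiveRecord).
JobRow = Struct.new(:id, :run_at, :locked_at, :failed_at)

# Classify one job from its timestamps. Assumption: failed_at is set once
# a job has permanently failed, and locked_at is set while a worker holds it.
def job_state(job)
  return :failed  if job.failed_at
  return :running if job.locked_at
  :upcoming
end

# Group job ids by state, ready to render on a dashboard page.
def job_report(jobs)
  report = { failed: [], running: [], upcoming: [] }
  jobs.each { |job| report[job_state(job)] << job.id }
  report
end
```

Jobs that have errored but are still being retried land in :upcoming here; splitting those out would need the attempts and last_error columns as well.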

James

Pat Allan

Aug 29, 2011, 9:19:09 PM
to rails-...@googlegroups.com
I did reply to the ServerFault question, but just sharing my thoughts here for those who don't click through...

* whenever gem to provide a nice interface over the top of cron:
https://github.com/javan/whenever
* backup gem that's super flexible/configurable for defining backups (if backups are the focus - sounds like there's other elements as well):
https://github.com/meskyanichi/backup
* free account on SendGrid as an SMTP server for < 200 emails/day:
http://sendgrid.com/

--
Pat

> --
> You received this message because you are subscribed to the Google Groups "Ruby or Rails Oceania" group.
> To post to this group, send email to rails-...@googlegroups.com.
> To unsubscribe from this group, send email to rails-oceani...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/rails-oceania?hl=en.

Michael Pearson

Aug 29, 2011, 9:22:48 PM
to rails-...@googlegroups.com
That's an excellent solution - in all honesty, I hadn't even thought of just using cron to add to the DJ queue.

Each backup location manages its own backups - pull, not push, so multiple locations will be managed by chef.

Any good drop-in dashboards for DJ? Or does it have its own that I haven't found yet?


Ben Taylor

Aug 29, 2011, 9:19:50 PM
to rails-...@googlegroups.com
It seems like what you want is relatively custom. Perhaps the best plan is to implement it yourself? As awful as it might sound, a quick approach could be:

 * Rails app with a Task model (rule:text, script:file): use the text to work out how often to run it, and use paperclip for the file
 * Rake task in that app which iterates over every task and checks if the rule applies; if so, run the script and write the output to a Run model, and if it fails, email you and write the output to the Run model as an error
 * Active Admin interface? Or your own CRUD solution; even scaffolding would be fine.

Add features for things you want. Make sure it never dies. Run the rake task with a cron job that runs every minute or every hour, whatever you want your granularity to be. You should probably also fork the work out of the rake task in case it takes too long and misses another job.
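
A rough sketch of the core loop, in plain Ruby rather than as a Rails rake task. Everything here is an assumption for illustration: the Task and Run structs stand in for the models, the rule format "every N minutes" is made up, and emailing on failure is left out (the success flag on each Run is where that would hook in).

```ruby
require "open3"

# Hypothetical stand-ins for the Task and Run models.
Task = Struct.new(:name, :rule, :command)
Run  = Struct.new(:task_name, :started_at, :output, :success)

# Made-up rule format: "every N minutes", checked against the number of
# minutes since midnight. Unrecognised rules are never due.
def due?(rule, now)
  match = /\Aevery (\d+) minutes?\z/.match(rule)
  return false unless match
  interval = match[1].to_i
  return false if interval.zero?
  ((now.hour * 60 + now.min) % interval).zero?
end

# Run every task whose rule applies at `now` and record the combined
# stdout/stderr plus whether the command exited successfully.
def run_due_tasks(tasks, now = Time.now)
  tasks.select { |task| due?(task.rule, now) }.map do |task|
    output, status = Open3.capture2e(task.command)
    Run.new(task.name, now, output, status.success?)
  end
end
```

This runs the commands one after another, so a slow script delays the rest; forking each one, as suggested above, would avoid that.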

 - Ben

James Healy

Aug 29, 2011, 9:27:15 PM
to rails-...@googlegroups.com
On 30 August 2011 11:22, Michael Pearson <mipe...@gmail.com> wrote:
> Any good drop-in dashboards for DJ? Or does it have its own that I haven't
> found yet?

Nope, but I haven't really looked very hard. If you find one, please
let us know as my homebrew dashboard is pretty damn utilitarian.

James

Chris Berkhout

Aug 29, 2011, 9:32:48 PM
to rails-...@googlegroups.com
Another option along the same lines is to use Resque (with its
built-in web interface) to manage jobs:
http://railscasts.com/episodes/271-resque
https://github.com/defunkt/resque

And the gem resque-scheduler:
https://github.com/bvandenbos/resque-scheduler
to schedule things to be added to queues at certain times.

There's a good blog post on it:
http://www.perfectline.ee/blog/cron-tasks-for-your-rails-application-with-resque

Cheers,
Chris

Robert Gravina

Aug 29, 2011, 9:46:19 PM
to rails-...@googlegroups.com
Yes, Resque is really nice. The blog post goes through GitHub's
evaluation of the existing popular job handlers and why they wrote
Resque.

You can schedule jobs using crontab syntax in a yml file, and/or from
code via Resque.enqueue(JobClass) or
Resque.enqueue_at(30.seconds.from_now, JobClass).

Jobs are just Ruby classes with a perform method, so you can run them
from the Rails console if you like. The web UI allows you to see
failed jobs, and retry them (very handy if a job takes parameters, so
you want to re-run *that* job).
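
To illustrate the shape of such a job, here is a small sketch. BackupFileJob, its arguments and the :backups queue name are hypothetical; the class only uses the standard library, so it can be run directly, and the Resque call is shown as a comment because it needs the resque gem and a Redis server.

```ruby
require "fileutils"

# Hypothetical job: copies one file into a backup directory.
class BackupFileJob
  # Resque picks the queue name up from this class-level instance variable.
  @queue = :backups

  # Resque calls the class method `perform` with the enqueued arguments.
  # Returns the path of the copy.
  def self.perform(source, destination_dir)
    FileUtils.mkdir_p(destination_dir)
    target = File.join(destination_dir, File.basename(source))
    FileUtils.cp(source, target)
    target
  end
end

# Run it inline, e.g. from a console:
#   BackupFileJob.perform("/etc/hosts", "/tmp/backups")
# Queue it for a worker (requires resque and Redis):
#   Resque.enqueue(BackupFileJob, "/etc/hosts", "/tmp/backups")
```

The arguments are plain strings because Resque serialises job arguments as JSON.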

Robert

Andrew Boag

Aug 29, 2011, 11:25:56 PM
to rails-...@googlegroups.com
We use delayed job and it's a very powerful bit of kit. We also moved from a cron-based system of dealing with regular tasks.

The one thing to bear in mind is, as with all daemons, you need to make sure that delayed_job itself is still running (it might explode). You can use God for this but there is always the chance that this itself will fail (which has happened to us).

Our approach was to regularly put a job (use cron for this) in the delayed_job queue that touches a file somewhere in /var/run or /tmp ... this way you can easily set up a nagios check on the machine to test if the file is getting created (and hence, delayed job is processing the jobs in the queue).
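
A minimal sketch of both halves of that check, under some assumptions: the function names and the heartbeat path are hypothetical, and the return codes follow the usual Nagios plugin convention of 0 for OK and 2 for CRITICAL.

```ruby
require "fileutils"

NAGIOS_OK = 0
NAGIOS_CRITICAL = 2

# Job side: the body of the job that cron pushes onto the queue.
def touch_heartbeat(path)
  FileUtils.touch(path)
end

# Check side: returns [exit_code, message] depending on how recently the
# heartbeat file was touched.
def heartbeat_status(path, max_age_seconds, now = Time.now)
  unless File.exist?(path)
    return [NAGIOS_CRITICAL, "CRITICAL: #{path} does not exist"]
  end
  age = (now - File.mtime(path)).to_i
  if age > max_age_seconds
    [NAGIOS_CRITICAL, "CRITICAL: #{path} last touched #{age}s ago"]
  else
    [NAGIOS_OK, "OK: #{path} last touched #{age}s ago"]
  end
end

# In a check script (hypothetical path and threshold):
#   code, message = heartbeat_status("/tmp/delayed_job.heartbeat", 600)
#   puts message
#   exit code
```

The threshold should be comfortably longer than the cron interval, so one slow run does not page anyone.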

Just something that we discovered in our travels ...


-- 

----

Andrew Boag - Director
Catalyst IT
andre...@catalyst-au.net

mob: +61 421 528 125
ddi: +61 2 8002 1758

www.catalyst-au.net

Samuel Richardson

Aug 29, 2011, 11:32:00 PM
to rails-...@googlegroups.com
Any reason you didn't go with monit to keep delayed job running?

Samuel Richardson
www.richardson.co.nz | 0405 472 748

Andrew Boag

Aug 29, 2011, 11:47:53 PM
to rails-...@googlegroups.com
Various reasons; we have a lot of nagios experience and infrastructure. The questions with all monitoring (in this case meaning "if application/service death happens, take action") are often more around escalation path policy and documentation than the mechanics of how you check things are alive, i.e. "What do we do if we get an SMS at 3am about a delayed job failure?", assuming the sysadmin who receives the message has had very limited exposure to this server or to delayed job.

We have had a lot of activity with systems such as SMS gateway servers and we tend to lean towards end-to-end tests. In the case of SMS gateways, we sent an sms to and from a GSM device that was physically plugged into our infrastructure. Note that in BAU we are not sending out the messages via the GSM device, it's just for testing. All we do is check for the sms message on the GSM device every few minutes. If a message has not arrived in 10 minutes then we know there is a problem and pagers start beeping. This could be anything from the SMSC connection having died, to a power cut, to the GSM device having fried itself (all valid reasons for sysadmin to get involved).

We haven't used monit, but it looks to me like it's more of a solution that will try to "solve" the failure i.e. restart apache/postgres if the system goes away. This is fine but what if _this_ (i.e. the monit script) process fails? How do you notice? At what stage, and by what means, should an actual person get involved?

It's certainly horses for courses with this sort of stuff. The main thing is that you understand your own approach and that it makes sense to all involved.

Dmytrii Nagirniak

Aug 30, 2011, 12:40:48 AM
to rails-...@googlegroups.com
> We haven't used monit, but it looks to me like it's more of a solution that will try to "solve" the failure i.e. restart apache/postgres if the system goes away.

I think the main purpose of monit is to monitor and take an action. The action can be "solving" the issues as well as sending notifications and waking up stuff.

> This is fine but what if _this_ (i.e. the monit script) process fails?

It looks like the guys already thought about that, so it should be handled pretty well.


Andrew Boag

Aug 30, 2011, 1:03:58 AM
to rails-...@googlegroups.com
Ok, more banter ... we should probably go off list if this goes on any more (it's not really ruby chatter any more :-)


On 30/08/11 14:40, Dmytrii Nagirniak wrote:
>> We haven't used monit, but it looks to me like it's more of a solution that will try to "solve" the failure i.e. restart apache/postgres if the system goes away.
>
> I think the main purpose of monit is to monitor and take an action. The action can be "solving" the issues as well as sending notifications and waking up stuff.

Sure, like I said, I'm not an expert on monit, and if you can get it working, awesome.

Nagios has a relatively shiny GUI which allows you to selectively downtime checks and all that.

>> This is fine but what if _this_ (i.e. the monit script) process fails?
>
> It looks like the guys already thought about that, so it should be handled pretty well.

Once again, looks good. What about if it jams after having init-ed?

Still, there ain't no perfect solution to this. We just happen to have a lot of experience (from trial and lots of error) with our approach.

Samuel Richardson

Aug 30, 2011, 1:09:10 AM
to rails-...@googlegroups.com
I think it has relevance to Ruby, at least I know I like to monitor my delayed_jobs/redis/sphinx using something (usually Monit).

I originally asked because I was curious how others were handling doing the same thing.

You've mentioned a couple of times, "what happens if monit/init falls over". How do you manage Nagios falling over? Do you have redundant systems, or do you just notice if an SMS hasn't come in for a while?

Samuel Richardson
www.richardson.co.nz | 0405 472 748


Andrew Boag

Aug 30, 2011, 1:31:19 AM
to rails-...@googlegroups.com, Samuel Richardson
On 30/08/11 15:09, Samuel Richardson wrote:
> I think it has relevance to Ruby, at least I know I like to monitor my delayed_jobs/redis/sphinx using something (usually Monit).
>
> I originally asked because I was curious how others were handling doing the same thing.
>
> You've mentioned a couple of times, "what happens if monit/init falls over". How do you manage Nagios falling over? Do you have redundant systems, or do you just notice if an SMS hasn't come in for a while?

Good question. So this descends into a war over which system will go forever without falling over, monit or nagios ... neither, I'm sure; everything breaks eventually for some reason.

But with our setup if nagios died, we would notice.

On the "monitored" machine the nagios nrpe agent would return a failure if it died and we get alerts. Same for if someone pulls the plug out of the machine.

And yes, we have multiple nagios instances (in different data centres) connecting in to test the monitored machines.

Look, monit and nagios are doing different things with a slight overlap in some cases - neither can do all that the other can. There is also the likes of cacti which is also a "monitoring" solution - for trend analysis. Same verb in English, but different roles completely.

When I think "monitoring and nagios" I'm really thinking - something that will page me in the instance the ethernet cord is proverbially kicked out when someone is moving a rack ... We want to be made aware so that we can take some action. And we try to monitor as many of our services and applications in this way in a simplistic "Is it working?" fashion which is not always easy or practical.

This is a topic I've given some thought to and been involved in from a management point of view. And to repeat myself, the "enterprise" monitoring package is far more about staff management and documentation than technology. I don't care how the SMS gets generated to tell me there is an issue, as long as whoever receives it is actually able to resolve the issue, which requires knowledge and competence.

We always put our alerts to the "3am Sysadmin test" - i.e. would a sysadmin who has logged in to this box once before in their life be able to make sense of what to do based on wiki documentation and the information in the alert? If not, we don't include the monitoring (or we may do only business hours).