Hi,
I have some batch jobs scheduled with Quartz Scheduler. They run on a fixed time schedule (e.g. at 25 and 55 minutes past each hour, around the clock).
Thing is, I have (at a minimum) 2 nodes running the same software for disaster recovery reasons, so I need a way to make sure each job is run by only one of the nodes.
Turns out it's quite tricky to get right, and I keep finding bugs which are hard to reproduce!
The current implementation requires each node to first write a record to the db to say it's going to run the job, then read the record back to see whether its write succeeded (there's a unique key on the job name), and if so, run the job.
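
To make the claim step concrete, here is a minimal in-memory sketch of the idea. It is not the actual implementation: the class and method names are hypothetical, a ConcurrentHashMap stands in for the db table and its unique key, and the key is assumed to be job name plus scheduled fire time rather than job name alone, so that each scheduled run is claimed separately.

```java
import java.time.Instant;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical in-memory stand-in for the claim table. In a real setup this
// would be a db table with a unique key; the map only models that constraint.
class JobClaimSketch {

    // Key: job name + scheduled fire time (an assumption, see lead-in).
    // Value: the id of the node that claimed that run.
    private final ConcurrentMap<String, String> claims = new ConcurrentHashMap<>();

    // Returns true only for the single caller whose "insert" wins.
    // putIfAbsent is atomic, much like an INSERT against a unique key,
    // so the result of the write itself tells the node whether it won.
    boolean tryClaim(String jobName, Instant scheduledFireTime, String nodeId) {
        String key = jobName + "@" + scheduledFireTime;
        return claims.putIfAbsent(key, nodeId) == null;
    }

    public static void main(String[] args) {
        JobClaimSketch table = new JobClaimSketch();
        Instant fireAt = Instant.parse("2024-01-01T10:25:00Z");

        // Both nodes try to claim the same scheduled run; only one succeeds.
        boolean nodeA = table.tryClaim("batchJob", fireAt, "node-A");
        boolean nodeB = table.tryClaim("batchJob", fireAt, "node-B");
        System.out.println("node-A runs: " + nodeA);
        System.out.println("node-B runs: " + nodeB);

        // The next scheduled run (30 minutes later) is a separate claim.
        boolean next = table.tryClaim("batchJob", fireAt.plusSeconds(1800), "node-B");
        System.out.println("node-B runs next: " + next);
    }
}
```

Against a real db, the equivalent would be treating the unique key violation on insert as "another node already has this run", which would make the separate read-back step unnecessary.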
However, there seem to be issues with overlapping jobs (I think; they are intermittent and hard to diagnose after the fact).
Is there a simpler, existing solution out there that I could use instead?
Thanks
Rakesh