Don't run a job when the last scheduled run is still running


Hector Barranco

Dec 17, 2018, 7:12:54 AM
to schedulix
Hello,

I have one job that is launched every 30 minutes. Sometimes the job doesn't finish within that time. Is it possible to skip a run if the previously scheduled run is still running?

Thanks,

Dieter Stubler

Dec 17, 2018, 8:59:48 AM
to schedulix

Hi Hector,


It is quite easy to solve your problem.

Since this is a common requirement and might be needed for more than one job, we first create a template for it.
We use the following script, which is more compact to communicate than the corresponding steps in the GUI.

Replace the environment 'SERVER@LOCALHOST' with a valid environment in your setup.
Feel free to adapt the folder structure to your requirements.

begin multicommand

create or alter named resource RESOURCE.TEMPLATES with usage = CATEGORY;
create or alter named resource RESOURCE.TEMPLATES.BATCHLOCK with usage = SYNCHRONIZING;

create or alter folder SYSTEM.TEMPLATES;
create or alter folder SYSTEM.TEMPLATES.SKIP_IF_RUNNING;
create or alter resource RESOURCE.TEMPLATES.BATCHLOCK in SYSTEM.TEMPLATES.SKIP_IF_RUNNING with online;

create or alter job definition SYSTEM.TEMPLATES.SKIP_IF_RUNNING.LOCK
with
    environment = 'SERVER@LOCALHOST',
    profile = 'STANDARD',
    run program = '0',
    resource = ( RESOURCE.TEMPLATES.BATCHLOCK STICKY lockmode = X ),
    timeout = 0 MINUTE state 'SKIPPED',
    type = JOB;

create or alter job definition SYSTEM.TEMPLATES.SKIP_IF_RUNNING.MAIN
with
    profile = 'STANDARD',
    type = BATCH,
    required = ( SYSTEM.TEMPLATES.SKIP_IF_RUNNING.LOCK state = ALL REACHABLE);

create or alter job definition SYSTEM.TEMPLATES.SKIP_IF_RUNNING.UNLOCK
with
    environment = 'SERVER@LOCALHOST',
    profile = 'STANDARD',
    run program = '0',
    resource = ( RESOURCE.TEMPLATES.BATCHLOCK STICKY lockmode = X),
    type = JOB,
    required = ( SYSTEM.TEMPLATES.SKIP_IF_RUNNING.MAIN state = ALL REACHABLE);

create or alter job definition SYSTEM.TEMPLATES.SKIP_IF_RUNNING.SKIP_IF_RUNNING
with
    profile = 'STANDARD',
    type = BATCH,
    master,
    children = (
        SYSTEM.'TEMPLATES'.'SKIP_IF_RUNNING'.'LOCK',
        SYSTEM.'TEMPLATES'.'SKIP_IF_RUNNING'.'MAIN',
        SYSTEM.'TEMPLATES'.'SKIP_IF_RUNNING'.'UNLOCK'
    );

end multicommand;

After running the above script, you can use the template as follows.
You have a batch like SYSTEM.YOUR_BATCH which should be skipped if it is still running.
Remove the time schedule (repeat every 30 minutes) from your batch.
Copy the folder SYSTEM.TEMPLATES.SKIP_IF_RUNNING to another folder, for example just SYSTEM.
Rename the copy (SYSTEM.SKIP_IF_RUNNING in our example) to a name like SYSTEM.YOUR_BATCH_PROTECTED.
Add your original batch as a child to the batch SYSTEM.YOUR_BATCH_PROTECTED.MAIN.
Schedule the batch SYSTEM.YOUR_BATCH_PROTECTED.MAIN.YOUR_BATCH_PROTECTED to run every 30 minutes.
Everything should now work as you expect.
For a deeper explanation, have a look at the documentation on the STICKY flag of resource requirements.
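
As a rough illustration of the idea behind the template (a Python analogy, not schedulix code): the LOCK step tries to take the exclusive BATCHLOCK resource without waiting and is skipped if it cannot get it, MAIN does the actual work, and the resource becomes free again after UNLOCK. The names batch_lock and run_protected are made up for this sketch, and a threading.Lock only approximates a sticky resource allocation.

```python
import threading

# Stand-in for the exclusive BATCHLOCK resource (hypothetical analogy).
batch_lock = threading.Lock()


def run_protected(main):
    """Run main() unless a previous run still holds the lock."""
    # LOCK step: try to take the lock without waiting (like timeout = 0).
    if not batch_lock.acquire(blocking=False):
        return "SKIPPED"
    try:
        main()  # MAIN step: the protected work
    finally:
        batch_lock.release()  # UNLOCK step: free the lock for the next run
    return "RAN"
```

Calling run_protected while an earlier call is still inside main returns "SKIPPED" instead of waiting. Note that this sketch always releases the lock, even when main raises an exception.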

Please let us know whether this was helpful.

Regards
Dieter

Hector Barranco

Dec 27, 2018, 9:18:36 AM
to schedulix
Hello Dieter,

It worked perfectly, thank you very much for your assistance.

Kind regards,

Hector Barranco

Jan 17, 2019, 10:55:38 AM
to schedulix
Hello Dieter,

I tried your script in a test environment and it worked perfectly, but in the production environment it doesn't work. Both environments are the same.

This is the screenshot with the error.

Captura18.PNG

Do you have any idea what the problem is?

Kind regards,

Ronald Jeninga

Jan 17, 2019, 11:42:49 AM
to schedulix
Hi Hector,

Obviously there's something wrong with your LOCK job.
I think it terminated prematurely with a "Cannot run in any scope because of resource shortage" (or similar).
That would mean (I'm guessing) that you created the named resource (that's a definition) but forgot to instantiate the resource in the folder.

So please check this. (I might be wrong here).

If I'm wrong, please give us more details on the lock job.

Best regards,

Ronald

Hector Barranco

Feb 18, 2019, 7:38:07 AM
to schedulix
Hello,

I finally found the problem: some time ago I changed the default exit state in the production environment. After restoring the default states, it worked again.

Thanks for your support.

Ronald Jeninga

Feb 18, 2019, 8:25:40 AM
to schedulix
Hi Hector,

Happy to hear it works again. :-)

It's always dangerous to change defaults in a running system.

Thank you for your feedback!

Best regards,

Ronald

Hector Barranco

Feb 27, 2019, 6:46:45 AM
to schedulix
Hello,

Could you please help me with this problem?

Occasionally, when the batch fails, the resource is not released, so the next scheduled run is skipped when it is launched. How can it be configured to run again even if the last job failed?

Kind regards,


fail schedulix.png


Ronald Jeninga

Feb 28, 2019, 4:10:13 AM
to schedulix
Hi Hector,

Jobs that are "RED" require an operator intervention and don't count as final.
You've defined FAILURE as a restartable state, which is OK, but that implies that the system asks you how to proceed with those Jobs that have Exit State FAILURE.
If you allocate your resources with KEEP_FINAL, which is also OK in this use case (no concurrent executions), the allocations won't vanish if a Job runs into FAILURE.
And this is why your subsequent batch is skipped.

If you don't want to do anything other than cancel the "RED" jobs, it will be easier to use an Exit State Profile that defines FAILURE as a final state.
The downside is that you can't rerun the Job if it fails, but you don't actually want to do that anyway.

Obviously you'll have to reconsider the definitions of your dependencies. If a predecessor runs into FAILURE, what should the successor do? Run? Skip?
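
To make the difference concrete, here is a small toy model in Python (an illustration only, not how schedulix is implemented). It assumes a single lock per batch that is released when a run reaches a final state; the names ProtectedBatch, failure_is_final and operator_cancel are invented for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class ProtectedBatch:
    """Toy model of a batch guarded by one exclusive lock."""
    failure_is_final: bool          # True: FAILURE is a final exit state
    lock_held: bool = False
    history: list = field(default_factory=list)

    def trigger(self, succeeds: bool) -> str:
        """Simulate one scheduled run and return its outcome."""
        if self.lock_held:
            outcome = "SKIPPED"     # previous run never became final
        else:
            self.lock_held = True
            # The lock is only released once the run reaches a final state.
            if succeeds or self.failure_is_final:
                self.lock_held = False
            outcome = "SUCCESS" if succeeds else "FAILURE"
        self.history.append(outcome)
        return outcome

    def operator_cancel(self) -> None:
        """Operator intervention on a 'RED' job: releases the lock."""
        self.lock_held = False
```

With failure_is_final=False, a failed run keeps the lock and the next trigger is skipped until operator_cancel is called. With failure_is_final=True, the run after a failure starts normally.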

HTH

Best regards,

Ronald

Hector Barranco

Mar 1, 2019, 11:19:36 AM
to schedulix
Hello Ronald,

I followed your instructions and now I think it is working as we need. Whether a job finishes successfully or with a failure, it is launched again at the next scheduled time.

Thank you very much for your support, Ronald.

solved.png

