Batch Dependency

aked...@gmail.com

Oct 27, 2023, 9:15:31 AM
to schedulix
Hello,
We have a batch that needs to run periodically throughout the day. Right now the batch is scheduled every 30 minutes, but occasionally a run takes longer than 30 minutes, so the running batch conflicts with the next submitted one.

Is there a good way in Schedulix to have the next submitted batch wait until the prior batch is completed? 

Thank you,
Andrew 

Ronald Jeninga

Oct 27, 2023, 12:27:34 PM
to schedulix
Hi Andrew,

Sure, this is where resources come into play.
I assume that the batch consists of multiple jobs (and maybe other batches).
That would cover the more general case.

You are going to need a synchronizing named resource. The name itself doesn't matter, but it makes sense to choose one that makes its purpose clear.

Now you can create a resource in scope GLOBAL (or in some scope below it, as long as it is visible to the jobs that are going to allocate it).
From here you have two options:
1. Use a LOCK and UNLOCK job
2. Let the Master (the top level scheduling entity) allocate the resource and keep it until it is FINAL

In the first case you create a job that allocates the resource with the STICKY flag set. This job should run as the first job of your batch.
It should also lock the resource with an exclusive lock (X).
And you create another job that allocates the resource and runs as the last job of the batch. Again with lockmode Exclusive and the Sticky Flag set.
As a run program you can use 0 (just the number zero), which is basically a /bin/true without overhead.
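
To make the first option more tangible, here is a minimal Python sketch of the idea. It is a toy model only, not schedulix code: the class `SyncResource` and the names `run1` and `run2` are made up for illustration. It assumes that the sticky exclusive allocation stays with the batch run from its LOCK job until its UNLOCK job has finished, and that waiting runs are served in request order.

```python
from collections import deque


class SyncResource:
    """Toy model of one synchronizing resource instance with an exclusive lock."""

    def __init__(self):
        self.owner = None       # batch run (master) currently holding the lock
        self.queue = deque()    # batch runs waiting for the lock, in request order

    def request(self, master):
        """A run's LOCK job asks for the resource; returns True if it may run."""
        if self.owner is None:
            self.owner = master          # sticky: the allocation stays with this run
            return True
        if master != self.owner and master not in self.queue:
            self.queue.append(master)    # LOCK job has to wait
        return master == self.owner

    def release(self, master):
        """The run's UNLOCK job (its last sticky request) has finished."""
        assert self.owner == master
        self.owner = self.queue.popleft() if self.queue else None
        return self.owner


rss = SyncResource()
log = []
log.append(("run1 LOCK", rss.request("run1")))  # first run gets the lock
log.append(("run2 LOCK", rss.request("run2")))  # second run has to wait
next_owner = rss.release("run1")                # run1's UNLOCK job is done
log.append(("run2 LOCK", rss.request("run2")))  # now run2 may proceed
print(log, next_owner)
```

The point of the model is that everything between LOCK and UNLOCK is protected, no matter how many jobs the batch contains in between.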

The second option is easy. The master is converted into a job (with run program 0) because it needs to allocate the resource with exclusive lock.
Be sure to specify KEEP_FINAL as the keep option. This guarantees that the resource is released only once all children are final.
There's another crucial detail to take care of: the first job that can run should wait for its master. Hence you'll need a dependency that tells the first job to wait until the master is JOB_FINAL.
This way it has to wait until the master has been able to allocate the resource.
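
The second option can be pictured with ordinary threads. This is only an analogy under stated assumptions, not schedulix code: a `threading.Lock` stands in for the exclusively locked resource, `run_batch` stands in for the master, and the step names are invented. The children only start once the master holds the lock, and the lock is given back only after the last child has finished.

```python
import threading
import time

batch_sync = threading.Lock()    # stands in for the exclusively locked resource
events = []                      # (run, step) tuples in the order they happened
events_guard = threading.Lock()  # protects the events list itself


def record(run, step):
    with events_guard:
        events.append((run, step))


def run_batch(run):
    # The master acquires the resource before any child may start ...
    with batch_sync:
        record(run, "master has resource")
        for child in ("step1", "step2"):
            time.sleep(0.01)     # pretend the child job does some work
            record(run, child)
        record(run, "all children final")
    # ... and the resource is released only here, after the last child.


threads = [threading.Thread(target=run_batch, args=(r,)) for r in ("run1", "run2")]
for t in threads:
    t.start()
for t in threads:
    t.join()

runs_in_order = [run for run, _ in events]
print(runs_in_order)
```

Whichever run gets the lock first, its four events always appear as one uninterrupted block before the other run's events.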

I hope that this all makes sense to you.
If not, please ask again and I'll try to answer in a little more detail.

Enjoy your weekend!

Best regards,

Ronald

aked...@gmail.com

Oct 27, 2023, 3:10:41 PM
to schedulix
Thanks Ronald, this is a huge help!! We will take a look and see what we can figure out.

I hope you have a great weekend as well!

Thank you,
Andrew

aked...@gmail.com

Oct 30, 2023, 10:33:04 AM
to schedulix
Hello Ronald,
We made an attempt but haven't had any luck yet.

This is what we currently have set up for testing, trying to follow these steps:

In the first case you create a job that allocates the resource with the STICKY flag set. This job should run as the first job of your batch.
It should also lock the resource with an exclusive lock (X).
And you create another job that allocates the resource and runs as the last job of the batch. Again with lockmode Exclusive and the Sticky Flag set.
As a run program you can use 0 (just the number zero), which is basically a /bin/true without overhead.

The batch:
Step1Resource.JPG


In PROCESS_STEP_2_SCHEDULIX_TEST_SPIKE_CONCURRENCY, our run program just waits for 20 seconds before completing.
Required Resource Details:
Step2Resource.JPG


When the batch runs a second time while the first one is still running, both run:
TestRun.JPG

Does anything stand out to you that we might have missed?

Thank you again for all your help!
Andrew

Ronald Jeninga

Oct 30, 2023, 11:33:43 AM
to schedulix
Hi Andrew,

In fact, what you've done looks pretty good.
I'll create a working example and post it here. You can then try it, analyze it, and work out what you've missed.
You've used a few more options than you need, but that shouldn't do any harm. (You don't need the sticky name, the keep mode, or the resource state mapping.)

Best regards,

Ronald

aked...@gmail.com

Oct 30, 2023, 3:45:27 PM
to schedulix
Thanks Ronald, that sounds like a great plan!

Thank you,
Andrew

Ronald Jeninga

Oct 31, 2023, 5:39:07 AM
to schedulix
Hi Andrew,

I've attached a small example as "sdms-script".
You can load it into your system by running

    sdmsh < ronalds_example.sdms

But it'll make sense to look at it first. It is a simple text file, any editor will do.
You might want to replace the specified environment (server@localhost) with an environment of your choice.

It will create a folder with the example:

folder.png

It will also create a Category and a Named Resource:

rss.png

An instance of the named resource is created in scope GLOBAL:

rss_in_global.png

To get rid of the example, first remove the resource in GLOBAL, then the folder, and finally the synchronizing resource.

If you start two instances of the example, the master list will look like:

second_waiting.png

Now, if you click on the waiting master, you'll find that the LOCK job is in SYNCHRONIZE_WAIT.
To see which resource(s) it is waiting for, click on the Resources(Req) tab, where you'll find something like

rss_req_2nd_batch.png
 
If you now navigate to Jobserver and Resources and click on the BATCH_SYNC resource in GLOBAL, the resource information is shown.
On the Allocations tab you'll find the request queue:

rss_request_queue.png

The UNLOCK job was executing an SDMSpopup.sh to give me time to create the screenshots I wanted. This is why you see its allocation.
Normally you will see a REQUEST from the UNLOCK job together with the MASTER_RESERVATION while your PAYLOAD is running.

I hope that gives you better insight into how things work.

Best regards,

Ronald
ronals_example.sdms

aked...@gmail.com

Oct 31, 2023, 8:42:39 AM
to schedulix
Thanks Ronald! The step we missed was adding the resource in GLOBAL, and it looks like we have a working prototype now.

I greatly appreciate all the help!

Thank you,
Andrew

Ronald Jeninga

Oct 31, 2023, 9:32:18 AM
to schedulix
Hi Andrew,

I'm glad I could help!
But I'm a little confused now. In order to run your original attempt, the resource must have been allocated somewhere.
If not, you'd get a "Cannot run in any scope because of resource shortage" error message and the job would be set to ERROR.
My guess is that you've created the resource as a defined resource at master level (or even at job level).
If I am right, that would explain why your batch ran and why two of them could run concurrently.

Basically the defined resources are used to manage things that are specific to that batch.
In my TicTacToe example I need a bunch of resources to represent the board.
That way each instance of the TicTacToe batch has its own set of resources that reflects the current state of the game.
More or less exactly the opposite of what you wanted to achieve :-)
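
The difference can be illustrated with a small Python sketch. Again, this is a toy model and not schedulix code: `ExclusiveResource` and `start_runs` are invented names. It assumes that a resource defined at master level gives every batch run its own private instance, whereas an instance in a shared scope such as GLOBAL is the same object for all runs.

```python
class ExclusiveResource:
    """Toy exclusive lock: at most one holder at a time."""

    def __init__(self):
        self.holder = None

    def try_lock(self, run):
        if self.holder is None:
            self.holder = run
            return True
        return False


def start_runs(resource_for):
    """Try to start two overlapping runs.

    resource_for(run) returns the resource instance that this run sees.
    """
    return [resource_for(run).try_lock(run) for run in ("run1", "run2")]


# Resource defined at master level: every run gets its own private instance.
per_master = start_runs(lambda run: ExclusiveResource())

# Resource instance in a shared scope: both runs compete for the same one.
shared = ExclusiveResource()
in_global = start_runs(lambda run: shared)

print(per_master)  # [True, True]  -> both runs proceed concurrently
print(in_global)   # [True, False] -> the second run has to wait
```

With private instances nobody ever competes for a lock, which matches the behaviour of two batches running side by side.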

Best regards,

Ronald
