mass update job definitions

106 views
Skip to first unread message

dpo...@shaklee.com

unread,
Apr 5, 2017, 8:06:47 PM4/5/17
to schedulix
Hello all,

I am looking to do a mass update. Essentially I have a large number of jobs running the same command with different options. I now want to find those jobs and update their kill program with the same command for all of them. Below is a sample of one that has been updated. So I want to target all jobs that are running sdscom.sh. I want to change the kill program to the one shown. Most of the kill programs are currently blank. All of the jobs I want to target are in the system.ops.nightly processing folder (yes I did not include my notes in the name. They will get removed later). From another post, I have zope telling me what commands it is running. So I have an idea what my alter statement needs to be (I am assuming ZOPE provides more than needed). I am just trying to figure out how do I create my list of target jobs and how I feed that list into the alter command.


Auto Generated Inline Image 1

Ronald Jeninga

unread,
Apr 6, 2017, 3:01:07 AM4/6/17
to schedulix
Hi dpower,

this is where "sdmsh" comes into play.
Basically the procedure is to generate a "sdmsh script" from the database and execute it.

In your case, your job definitions (scheduling entities) can be identified by the run program.
The command you want to generate looks like

ALTER JOB DEFINITION <full-qualified-path>
WITH KILL PROGRAM = 'some kill program';

The hard part is the <full-qualified-path> at first glance. But a small extension of the SQL select syntax solves this.

Try something like

echo "select 'ALTER JOB DEFINITION', id, ' WITH KILL PROGRAM = ''colt shoots \$PID'';' from sci_c_scheduling_entity where run_program = 'xyz' with id job;" | sdmsh

This will give you a list of commands that do the bulk change.
Note the non SQL "with" option. This is capable of translating an ID to a (non quoted!) full qualified name.

HTH

Regards,

Ronald


Ronald Jeninga

unread,
Apr 6, 2017, 4:31:30 AM4/6/17
to schedulix
Hi dpower,

an additional small remark. Executing the command line above yields something like

[SYSTEM@localhost:2506] SDMS> -bash-4.2$ echo "select 'ALTER JOB DEFINITION', id, ' WITH KILL PROGRAM = ''colt shoots \$PID'';' from sci_c_scheduling_entity where run_program = 'xyz' with id job;" | sdmsh

Connect

CONNECT_TIME : 06 Apr 2017 06:56:09 GMT

Connected

[SYSTEM@localhost:2506] SDMS> select 'ALTER JOB DEFINITION', id, ' WITH KILL PROGRAM = ''colt shoots $PID'';' from sci_c_scheduling_entity where run_program = 'xyz' with id job;

Selected Values

?COLUMN?             ID                                                                          ?COLUMN?                                
-------------------- --------------------------------------------------------------------------- ----------------------------------------
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0440_DOG.WON                                                WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0190_CRITICAL_REGION_STICKY.UNLOCK                          WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0255_BATCH_SKIP.UNLOCK                                      WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0255_BATCH_SKIP.LOCK                                        WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0250_BATCH_WAIT.UNLOCK                                      WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0193_CRITICAL_REGION_STATE.UNLOCK                           WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0500_GRANTS.REPORTS.REPORT 1                                WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0420_FROG.WON                                               WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0190_CRITICAL_REGION_STICKY.LOCK                            WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0250_BATCH_WAIT.LOCK                                        WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0420_FROG.ESCAPED                                           WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0430_FROG2.ESCAPED                                          WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0430_FROG2.WON                                              WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0193_CRITICAL_REGION_STATE.LOCK                             WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0196_CRITICAL_REGION_KEEP_FINAL.CRITICAL_REGION_KEEP_FINAL  WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0440_DOG.ESCAPED                                            WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0430_FROG2.SHOW_LABYRINTH_MASTER                            WITH KILL PROGRAM = 'colt shoots $PID';


17 Row(s) selected

Now you may want to execute all those commands atomically, just to prevent "half" updates.
In that case you prepend the list of commands with "BEGIN MULTICOMMAND" and append a "END MULTICOMMAND;". This way all changes will be made within one transaction.
If any of the commands fails, none of them will be executed.

The resulting script would look like

BEGIN MULTICOMMAND
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0440_DOG.WON                                                WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0190_CRITICAL_REGION_STICKY.UNLOCK                          WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0255_BATCH_SKIP.UNLOCK                                      WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0255_BATCH_SKIP.LOCK                                        WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0250_BATCH_WAIT.UNLOCK                                      WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0193_CRITICAL_REGION_STATE.UNLOCK                           WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0500_GRANTS.REPORTS.REPORT 1                                WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0420_FROG.WON                                               WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0190_CRITICAL_REGION_STICKY.LOCK                            WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0250_BATCH_WAIT.LOCK                                        WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0420_FROG.ESCAPED                                           WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0430_FROG2.ESCAPED                                          WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0430_FROG2.WON                                              WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0193_CRITICAL_REGION_STATE.LOCK                             WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0196_CRITICAL_REGION_KEEP_FINAL.CRITICAL_REGION_KEEP_FINAL  WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0440_DOG.ESCAPED                                            WITH KILL PROGRAM = 'colt shoots $PID';
ALTER JOB DEFINITION SYSTEM.EXAMPLES.E0430_FROG2.SHOW_LABYRINTH_MASTER                            WITH KILL PROGRAM = 'colt shoots $PID';
END MULTICOMMAND;

Regards,

Ronald

PS. I replaced the xyz ('xyz') by a simple zero ('0') to produce my list of statements. If you have installed the examples, a similar list would be the result.

dpo...@shaklee.com

unread,
Apr 6, 2017, 12:53:05 PM4/6/17
to schedulix
Ronald,

I expected to use sdmsh but the documentation is severely lacking in examples. Now, I think you are heading in the direction I need to go. My table name was different for some reason. I am not sure why. I did fix that but now I am getting an error that I don't know what to do with nor how to address.
 
[mysql_gen]$ echo "select 'ALTER JOB DEFINITION', id, ' WITH KILL PROGRAM = ''/opt/schedulix/scripts/sdscom.sh \$SDS_HOSTNAME \$SDS_PORT \$SDS_DEFAULT_TIMEOUT abort \$SUPERMAN'';' from SCHEDULING_ENTITY where run_program like '/opt/schedulix/scripts/sdscom.sh' with id job;" | sdmsh
Connect
CONNECT_TIME : 06 Apr 2017 16:44:29 GMT
Connected
[SYSTEM@scheduli:2506] SDMS> select 'ALTER JOB DEFINITION', id, ' WITH KILL PROGRAM = ''/opt/schedulix/scripts/sdscom.sh $SDS_HOSTNAME $SDS_PORT $SDS_DEFAULT_TIMEOUT abort $SUPERMAN'';' from SCHEDULING_ENTITY where run_program like '/opt/schedulix/scripts/sdscom.sh' with id job;
ERROR:03210252200, Version 3888402 of id 14576 not found
[SYSTEM@scheduli:2506] SDMS> [mysql_gen]$ 
 
What does this mean and how do I address it?

dpo...@shaklee.com

unread,
Apr 6, 2017, 1:04:03 PM4/6/17
to schedulix
Ahh....
Why does the version keep changing?

[SYSTEM@scheduli:2506] SDMS> show job definition 14576;
ERROR:03210252200, Version 3890694 of id 14576 not found
[SYSTEM@scheduli:2506] SDMS> show job definition 14576;
ERROR:03210252200, Version 3890730 of id 14576 not found
[SYSTEM@scheduli:2506] SDMS> show job definition 14576;
ERROR:03210252200, Version 3890733 of id 14576 not found
[SYSTEM@scheduli:2506] SDMS> show job definition 14576;
ERROR:03210252200, Version 3890745 of id 14576 not found
[SYSTEM@scheduli:2506] SDMS> show job definition 14576;
ERROR:03210252200, Version 3890748 of id 14576 not found

dpo...@shaklee.com

unread,
Apr 6, 2017, 1:52:30 PM4/6/17
to schedulix
ugg!!!

I restarted everything and now I get this:

[SYSTEM@scheduli:2506] SDMS> select 'ALTER JOB DEFINITION', id, ' WITH KILL PROGRAM = ''/opt/schedulix/scripts/sdscom.sh $SDS_HOSTNAME $SDS_PORT $SDS_DEFAULT_TIMEOUT abort $SUPERMAN'';' from SCHEDULING_ENTITY where run_program = '/opt/schedulix/scripts/sdscom.sh' with id job;

ERROR:03110251037, Key 14576 not found (de.independit.scheduler.server.repository.SDMSSchedulingEntityTable)


 

Ronald Jeninga

unread,
Apr 7, 2017, 2:50:08 AM4/7/17
to schedulix
Hi dpower,

admitted, if you don't know how the system works internally, this seems to be a strange behaviour, but in fact it isn't.

The view sci_c_scheduling_entity _should_ be there. If not, something went wrong while building the DB schema during install.
The view is actually quite simple and filters out all removed objects (you'll find the definition in the sql/*_gen directories).

Since you used the base table in your query, also removed objects are found.
While resolving the _actual_ full qualified path, the server stumbles over a non current object and aborts the query.

How does this work?
If you start a read/write transaction, a new transaction id is drawn. If something changes, a new version of the object is created and valid_from/valid_to of both the old and the new object are set.
If you remove an object, only the valid_to of the old object is set.
If you start a read/only transaction, the last issued r/w transaction number is used as a reference. This way it is guaranteed that you always see a consistent view (read committed/before image read).

Now imagine you removed object 14576 at transaction 3,000,000.
The last issued R/W transaction number was 3,888,402. If the server searches for this version of object 14576, it doesn't find anything. Hence the error message.
Your jobservers continue to poll, or any other writing transactions are performed. This increases the last R/W transaction number.
Thus the next R/O query you try will be performed with a new version number, but the removed object still doesn't exist (and won't ever come into existence again). So the server reports the same error, but this time with the new version number.
This is why the version number keeps changing on retries. And of course a restart doesn't fix this "problem".

To get this right:
1. investigate why the sci_c_scheduling_entity view doesn't exist and use the view instead of the base table (preferred solution)
2. as an alternative, you can test the valid_to column to be 2^63 - 1 (resp: ... WHERE VALID_TO = 9223372036854775807)

Now your queries should work.

Regards,

Ronald

PS. Feel free to send me examples you'd like to see in the syntax documentation.
From my perspective most statements are pretty straightforward, but my perspective is severely biased as I had a large part in designing and implementing the language.

dpo...@shaklee.com

unread,
Apr 7, 2017, 2:59:26 PM4/7/17
to schedulix
I guess mine is case sensitive and using the view in uppercase worked.

dpo...@shaklee.com

unread,
Apr 7, 2017, 3:06:53 PM4/7/17
to schedulix
Ronald,

So I got it working now but in one of your posts you mention:
Note the non SQL "with" option. This is capable of translating an ID to a (non quoted!) full qualified name.

It looks like I will need it quoted. Can I get the fully qualified name that is quoted?

dpo...@shaklee.com

unread,
Apr 7, 2017, 3:14:44 PM4/7/17
to schedulix
Don't get me wrong. I have been able to use a text editor to get the commands with quotes. I was just wondering if there was a way to get them properly quoted.

Ronald Jeninga

unread,
Apr 7, 2017, 3:55:05 PM4/7/17
to schedulix
Hi dpower,

we've written some functions (stored procedures) to do the same in the database as well.
But we didn't publish them yet. So I attached the file for mariadb/mysql to this post.
So far we didn't develop these functions for all supported DBMS, it's more a work in progress thing.

You only need it quoted if you decided to use case sensitive names and/or to use strange characters like white space in names.
The fact that your underlying RDBMS is case sensitive has no influence on the scheduling system.

Maybe I am going to extend the "with <columnname> = <object type>" functionality to support quoting.
It won't be very much work, but quite useful.

Regards,

Ronald
functions.sql_os

dpo...@shaklee.com

unread,
Apr 7, 2017, 4:04:02 PM4/7/17
to schedulix
Ronald,

Wow you are up!!

Thank you for the documentation. I will look it over. However, I did want to thank you as I now have been able to mass update the 180 affected jobs with the updated commands that I needed!!

Thank you so much for your patients and your answers.
Reply all
Reply to author
Forward
0 new messages