How to set up webusers, users, groups and servers

wim.ve...@billinghouse.nl

Sep 24, 2018, 7:58:54 AM

to schedulix

Hi,

I am configuring a schedulix system to work for us:)

I have made my jobs, I can run them and they work as expected. But this is all done with the default sdmsadm user.

What I roughly want is to create web-users that belong to one of three groups:

- tester

- user acceptance tester

- production

Users in the production group are allowed to submit jobs on the production and UAT systems. Users in the user acceptance tester group are allowed to submit on the test and UAT systems. Testers can only submit to the test servers.
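The access rules above can be written down as a small lookup table. This is only a sketch in Python to make the intended matrix explicit; the group and system names are illustrative labels, not schedulix identifiers, and nothing here talks to the schedulix server.

```python
# Which group may submit to which kind of system, as described above.
# All names are illustrative labels, not schedulix objects.
SUBMIT_RIGHTS = {
    "production": {"production", "uat"},
    "user acceptance tester": {"uat", "test"},
    "tester": {"test"},
}


def may_submit(group: str, system: str) -> bool:
    """Return True if members of `group` may submit jobs on `system`."""
    return system in SUBMIT_RIGHTS.get(group, set())
```

For example, `may_submit("tester", "uat")` is false, while `may_submit("production", "uat")` is true.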

I have created a job structure like

SYSTEM

+- <SERVER1>

| +- JOBS

| | +- job 1, ....

| +- BATCH-1

| | +- batch and children (using jobs)

| +- BATCH-2

+- <SERVER-2>

+- <copy of structure under SERVER-1>

SERVER could be a development server, a UAT server or production system. Each system has its own top level folder.

What I understood so far is:

WebUsers in schedulix are actually zope things. They do not exist in the schedulix server (and there is no create or alter command to make them there). A WebUser is associated with one or more users in the Schedulix Server. These Users are associated with groups in the server.

So I would have to create 3 Users (e.g. UAT, Tester and Production) and 3 groups (possibly using the same names) and assign the correct group to each user.

Then I will have to create WebUsers and let them use one of the 3 Users to access the server. So if Tom and Jerry are both user acceptance testers, they should both use the same UAT user to access the server.

At SERVER folder level I would have to set the correct group so everything below that is only visible for the WebUsers that use a User which has the correct group.

Because we want to script everything (for various reasons beyond this discussion) I initially would like to create a script that creates all this. But I can only find server commands to create the User and the Groups, but not the WebUsers. They seem to be in the zope scope.

Question 1. How can we script creating the WebUsers ?

In due time the user wants to have the user management (of the WebUsers) to be centralized using an LDAP/AD central authentication system. We found a link to use that for the schedulix server (https://groups.google.com/forum/#!searchin/schedulix/ldap%7Csort:date/schedulix/1RyubrzYF7w/TS6W385fBQAJ), but I guess this is not for the WebUsers.

Question 2. Do we need to consult the Zope manual to use LDAP/AD for webusers (as these are the user credentials known by the actual users) ?

I also have a question about the default setup of jobservers (the setup_jobserver script). I noticed it defines three resources: RESOURCE.STATIC.HOSTS.hostname, RESOURCE.STATIC.USERS.username and RESOURCE.STATIC.SERVERS.jobserver name.

So the environment needs all three of them to be able to use the jobserver I suppose.

We want at most one job active per server (the actual implementation does not allow jobs to run simultaneously). But it must be possible to run jobs in parallel on different servers. In our case servers are added and removed regularly. Adding/removing a system for the jobs is very easily scripted, but adding/removing jobservers is more difficult (it requires steps at the OS level and creating client processes with ports etc., which in turn have an impact on the proxy server etc.). Therefore we would like to have the jobservers independent of the servers (so not one jobserver per server). Ideally we would like the system to dynamically select a random jobserver/environment from a pool, so that different jobs can run simultaneously. We will create a resource requirement per server, so a second master job for the same server would be blocked until the resource becomes available, to avoid parallel runs on the same server.
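To make the wished-for behaviour concrete, here is a minimal Python sketch of it. It only models the rule "one job per target server, any free jobserver from the pool"; it is not how schedulix implements resource allocation, and the function and argument names are made up for illustration.

```python
import random


def try_start(target, busy_targets, free_jobservers, rng=random):
    """Try to start a job aimed at the external server `target`.

    Returns the name of the chosen jobserver, or None if the job has to
    wait (the target is already in use, or the pool is exhausted).
    Both sets are updated in place when a job is started.
    """
    if target in busy_targets or not free_jobservers:
        return None
    # Pick any free jobserver; sorted() gives rng.choice a sequence.
    jobserver = rng.choice(sorted(free_jobservers))
    free_jobservers.remove(jobserver)
    busy_targets.add(target)
    return jobserver
```

With a pool of two jobservers, a first job for server 1 starts, a second job for server 1 has to wait, and a job for server 2 gets the remaining jobserver.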

Question 3. What would be the best approach to create such a jobserver/environment pool ?

I understand we should probably create environments for each jobserver and possibly select the environment from a pool. I do not know if we also need a scope or something like that.

Regards,

Wim Veldhuis.

Ronald Jeninga

Sep 25, 2018, 11:07:54 AM

to schedulix

Hi Wim,

let me start with your last question.

The setup_jobserver script is an aid for setting up jobservers. It isn't 100% production-proof, as you've already discovered, but it does its job pretty well.

But of course this script doesn't know anything about semantics. It only knows you want to install a jobserver called X on node Y as user Z, or so.

And exactly this information is used to setup an environment and a set of resources such that the new jobserver can be uniquely addressed.

This in itself is an important result, but it's only a start.

Our view on jobservers and environments addressing them is that jobservers have some purpose and at least the environment addressing that jobserver should reflect that purpose.

So you could define an environment FTP_SERVER, NUMBER_CRUNCHER, DB_SERVER, ... requiring resources like RESOURCE.STATIC.FTP_SERVER, RESOURCE.STATIC.NUMBER_CRUNCHER, etc.

All jobservers that can be used as an FTP_SERVER get a resource RESOURCE.STATIC.FTP_SERVER. The jobservers on the Crays get a resource RESOURCE.STATIC.NUMBER_CRUNCHER, and so on.

A job that requests Environment NUMBER_CRUNCHER will automatically be executed by a jobserver that has the resource RESOURCE.STATIC.NUMBER_CRUNCHER.

Because you've already set up a tree for each environment (Dev, Test, Prod), you can attach an environment to each top level folder.

You create an environment DEV, TEST and PROD, requiring the resources RESOURCE.STATIC.RUNMODE.DEV, RESOURCE.STATIC.RUNMODE.TEST, RESOURCE.STATIC.RUNMODE.PROD.

The DEV environment is then required by the top level development folder, and so on.

The effect is that a job residing somewhere in the development tree requires, let's say, a NUMBER_CRUNCHER, and the requirement from the (grand)parent folder is added automatically.

This way this job will be executed by a NUMBER_CRUNCHER jobserver that also offers the DEV resource.

If development finishes and the job needs to be deployed, you simply move or copy it to the production tree. Started from there, it will automatically select a production NUMBER_CRUNCHER.
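The selection logic described above can be sketched in a few lines of Python. This is a toy model, not schedulix code: the environment and resource names are the ones from this post, while the jobserver names (`cray_dev`, `cray_prod`, `ftp_dev`) and the helper functions are invented for the example.

```python
# Environments map to the static resources they require (names from the post).
ENVIRONMENTS = {
    "NUMBER_CRUNCHER": {"RESOURCE.STATIC.NUMBER_CRUNCHER"},
    "FTP_SERVER": {"RESOURCE.STATIC.FTP_SERVER"},
    "DEV": {"RESOURCE.STATIC.RUNMODE.DEV"},
    "PROD": {"RESOURCE.STATIC.RUNMODE.PROD"},
}

# Hypothetical jobservers and the resources they offer.
JOBSERVERS = {
    "cray_dev": {"RESOURCE.STATIC.NUMBER_CRUNCHER", "RESOURCE.STATIC.RUNMODE.DEV"},
    "cray_prod": {"RESOURCE.STATIC.NUMBER_CRUNCHER", "RESOURCE.STATIC.RUNMODE.PROD"},
    "ftp_dev": {"RESOURCE.STATIC.FTP_SERVER", "RESOURCE.STATIC.RUNMODE.DEV"},
}


def required_resources(job_env, folder_envs):
    """Resources required by the job's own environment plus those of the
    environments attached to its (grand)parent folders."""
    needed = set(ENVIRONMENTS[job_env])
    for env in folder_envs:
        needed |= ENVIRONMENTS[env]
    return needed


def eligible_jobservers(job_env, folder_envs):
    """Names of jobservers that offer every required resource."""
    needed = required_resources(job_env, folder_envs)
    return sorted(name for name, offered in JOBSERVERS.items() if needed <= offered)
```

In this model the same NUMBER_CRUNCHER job resolves to `cray_dev` under the DEV folder and to `cray_prod` after it has been moved under the PROD folder, without any change to the job itself.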

Still the initial setup is important. If you happen to have a pool of 200 NUMBER_CRUNCHERs and you encounter a problem every time when node 137 is executing a certain job, it is more than convenient if you can address exactly this physical environment in order to investigate the cause of the problem.

The other two questions aren't that easy to answer, apart from confirming that both creating users by script and accessing AD from Zope are possible.

I know I once wrote a script to create web users. But that's quite some years ago. In the meantime a lot happened, which means that my original script wouldn't even work any more. (It would be a good starting point though).

But as a matter of fact I developed it for a customer and the script isn't my property.

The basic idea is to use wget to send the message to the zope server that normally is sent by the browser.
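As a rough illustration of that idea, here is a Python sketch that builds the same kind of form POST a browser (or wget) would send. Everything specific in it is an assumption: the URL, the path and the form field names are placeholders, and the real ones would have to be taken from what the browser actually sends to the Zope server. The sketch only builds the request; it does not send it and does not handle authentication.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_form_post(url, form_fields):
    """Build (but do not send) a form POST like the one a browser submits."""
    body = urlencode(form_fields).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder URL and field names; the real ones must be captured from
# the browser traffic to the Zope server.
example_request = build_form_post(
    "http://zope.example:8080/hypothetical/create_webuser",
    {"name": "tom", "server_user": "UAT"},
)
```

Sending would then be a matter of passing the request to `urllib.request.urlopen`, once the real URL, fields and credentials are known.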

Enabling Zope to use AD in fact requires an Apache server in front of it.

Then the Apache server needs some configuration changes as well as the Zope server.

That's a little bit much to explain here. And quite a lot of effort at this stage where you're talking about creating three users.

Best regards,

Ronald

wim.ve...@billinghouse.nl

Sep 26, 2018, 8:25:10 AM

to schedulix

Somehow my response got lost :(

Part of the question was a misunderstanding of resources. I thought having acquired the resources meant another job could not use the same jobserver while the first was still running.

I finally figured out the synchronizing resources (I needed that to avoid running two jobs on the same external system at the same time). However, I have a question related to synchronizing resources and batch jobs. It seems that synchronizing resources can only be acquired by jobs, not by batches. That could mean batches could interfere with each other, which I would like to prevent.

Assume two batches A and B, each with two sequential jobs (1 and 2). These batches are submitted for the same external system. As I understand it this can lead to the following orders of execution:
A1 B1 B2 A2, A1 B1 A2 B2 and A1 A2 B1 B2. Is there an easy way, using synchronizing resources, to make certain that A1 A2 B1 B2 is the order of the jobs?
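The possible orders can be enumerated with a few lines of Python. This is just a check of the combinatorics, assuming the only constraint is that job 1 runs before job 2 within each batch and that no two jobs run at the same time; it says nothing about schedulix itself.

```python
from itertools import permutations

JOBS = ("A1", "A2", "B1", "B2")


def valid_orders(jobs=JOBS):
    """All execution orders in which each batch runs its job 1 before its job 2."""
    orders = []
    for order in permutations(jobs):
        a_sequential = order.index("A1") < order.index("A2")
        b_sequential = order.index("B1") < order.index("B2")
        if a_sequential and b_sequential:
            orders.append(order)
    return orders


# Orders in which batch A happens to start first.
a_first = [order for order in valid_orders() if order[0] == "A1"]
```

There are six valid orders in total; the three that start with A1 are exactly the ones listed above, and the other three are their mirror images with B1 first.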

Regards,

Wim Veldhuis.

Ronald Jeninga

Sep 26, 2018, 8:44:52 AM

to schedulix

Hi Wim,

maybe you sent that post directly to me? At least I tried to answer a mail from you this morning.

Basically you just change the batch into a job with run program "0". Now you can allocate the resource with KEEP FINAL at "batch" level, which will prevent the race condition from occurring.

But still you could end up with B1, B2, A1, A2 if B happens to be submitted a little earlier or A has to wait for other reasons. And the only other possible order of execution is of course the desired A1,A2,B1,B2.
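The effect of holding the resource at "batch" level can be checked with a small Python enumeration. It is a model of the outcome only, under the assumption that whichever parent gets the resource keeps it until both of its children are done; it does not reproduce how schedulix allocates resources.

```python
from itertools import permutations


def orders_with_parent_lock():
    """Execution orders that remain when each parent holds the resource
    from its first child job until its last one has finished."""
    remaining = []
    for order in permutations(("A1", "A2", "B1", "B2")):
        # While a parent holds the resource, its two jobs run back to back.
        a_back_to_back = order.index("A2") - order.index("A1") == 1
        b_back_to_back = order.index("B2") - order.index("B1") == 1
        if a_back_to_back and b_back_to_back:
            remaining.append(order)
    return remaining
```

Only two orders survive in this model: A1 A2 B1 B2 and B1 B2 A1 A2, depending on which parent acquires the resource first.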

A second approach is to use sticky resource requests. The effect in this example will be identical.

The advantage of the use of sticky resources is that you can create critical regions that span several subtrees. The downside is that it's more work.

In most cases my initial suggestion will work perfectly fine.

HTH

Regards,

Ronald

wim.ve...@billinghouse.nl

Sep 27, 2018, 5:04:21 AM

to schedulix

Hi Ronald,

Life can be so simple. Just convert the batch into a job. I was already trying to think out complex schedules with sticky state and extra jobs.

Works indeed like a charm.

The difference between a batch and job is a lot smaller than I thought.

I guess a batch is just a placeholder for batch-specific resource instances. Is this correct, or are there more differences between a batch and a job?

Wim.

Ronald Jeninga

Sep 27, 2018, 7:29:01 AM

to schedulix

Hi Wim,

indeed, the difference between a batch and a job isn't that big. In fact a job is, more or less, a batch with a command line that can issue resource requests.

In most situations batches will be used to structure the job flow. Comparable with using a function call instead of an inline list of statements in programming.

There's a small difference in defaulting. Children of batches are static children by default, because a batch can't have a run program and won't submit dynamic children.

Children of jobs are dynamic children by default. But batches can have dynamic children and jobs can have static children.

A job will always try to execute its run program and will always end up with either an ERROR, or some exit code.

An ERROR means that some mistake in the definition has been made, so I can ignore that. The exit code maps to some exit state.

This is not true for batches, which is why you can define the batch default in an exit state profile. If you don't define a batch default, the default will be something like the final state with the lowest preference.

The processing of a batch involves much less overhead than the processing of a job.

The state model of a batch is like: SUBMITTED -> DEPENDENCY_WAIT -> FINISHED -> FINAL

That's a lot less compared to a real job: SUBMITTED -> DEPENDENCY_WAIT -> SYNCHRONIZE_WAIT -> RESOURCE_WAIT -> RUNNABLE -> STARTING -> STARTED -> RUNNING -> FINISHED -> FINAL

Since nearly every state change needs a database transaction, a job costs about 6 extra transactions compared with a batch.

The recommended run program "0" is strictly speaking a job, but is handled a little differently. Instead of being executed by some jobserver it immediately "terminates" with exit code 0.

So we have a state model like: SUBMITTED -> DEPENDENCY_WAIT -> SYNCHRONIZE_WAIT -> RESOURCE_WAIT -> RUNNABLE -> FINISHED -> FINAL
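For comparison, the three state models can be put side by side in a short Python sketch. The state names are copied from this post; the assumption that each state change costs exactly one transaction is a simplification of "nearly every state change" above.

```python
# State sequences as listed in this post.
BATCH = ["SUBMITTED", "DEPENDENCY_WAIT", "FINISHED", "FINAL"]

JOB = [
    "SUBMITTED", "DEPENDENCY_WAIT", "SYNCHRONIZE_WAIT", "RESOURCE_WAIT",
    "RUNNABLE", "STARTING", "STARTED", "RUNNING", "FINISHED", "FINAL",
]

JOB_RUN_PROGRAM_0 = [
    "SUBMITTED", "DEPENDENCY_WAIT", "SYNCHRONIZE_WAIT", "RESOURCE_WAIT",
    "RUNNABLE", "FINISHED", "FINAL",
]


def state_changes(states):
    """Number of state changes needed to walk through the whole sequence."""
    return len(states) - 1


# Simplification: one transaction per state change.
extra_job_vs_batch = state_changes(JOB) - state_changes(BATCH)
extra_run0_vs_batch = state_changes(JOB_RUN_PROGRAM_0) - state_changes(BATCH)
```

Under that simplification a real job needs 9 state changes against 3 for a batch, so 6 more, and a job with run program "0" sits in between with 6 state changes, 3 more than a batch.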

(The rule is: a command line that can be interpreted as an integer won't be executed, but it will be regarded to have run and exited with the integer from the command line. This evaluation is done after parameter substitution).
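The rule can be illustrated with a small Python function. It is a sketch of the rule as stated here, not the actual server code: the exact parsing (signs, whitespace, number formats) is not specified in this post, so the sketch assumes plain decimal digits with an optional minus sign, and it assumes the command line has already gone through parameter substitution.

```python
import re


def pseudo_exit_code(command_line):
    """Return the exit code if the command line is just an integer,
    or None if it is a real command that has to be executed.

    Assumption: `command_line` is the text after parameter substitution.
    """
    text = command_line.strip()
    if re.fullmatch(r"-?[0-9]+", text):
        return int(text)
    return None
```

So `pseudo_exit_code("0")` gives 0 without anything being executed, while `pseudo_exit_code("ls -l")` gives None, meaning the command would go to a jobserver.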