How to build a Windows job server?


zheng...@gmail.com

Feb 7, 2017, 6:20:14 AM
to schedulix
Hi Mr Stubler,

I installed schedulix on my company's CentOS server and it works normally. Now I want to set up my PC as a jobserver so that schedulix can run some jobs on Windows. I searched for relevant information in the topic "Not able to get schedulix job server running on Windows" and also downloaded the additional files to install schedulix on Windows. When I run the command

"%BICSUITEHOME%\bin\RUN_JOBSERVER" "%BICSUITECONFIG%\myjobserver.conf"

it reports:

FATAL   [Jobserver]     07-02-2017 15:55:51 CST ***ERROR*** C:\bicsuite\etc\myjobserver.conf, line 1: (04301271423) Identifier expected
FATAL   [Jobserver]     07-02-2017 15:55:51 CST Program aborted

My myjobserver.conf is:

RepoHost= 192.168.1.24
RepoPort= 2506
RepoUser= "GLOBAL.'AT_GROUP'.'KEVIN_PC'"
RepoPass= G0H0ME

This is the configuration of my jobserver.

Is there a mistake I should fix, or is there a manual on how to set up a jobserver on Windows?



Many thanks!




Ronald Jeninga

Feb 7, 2017, 8:13:59 AM
to schedulix
Hi,

Since you use the IP address of the scheduling server's host, you'll have to quote it (otherwise the scanner recognizes a number instead of a name).
Hence:


RepoHost= "192.168.1.24"
RepoPort= 2506
RepoUser= "GLOBAL.'AT_GROUP'.'KEVIN_PC'"
RepoPass= G0H0ME
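
The quoting rule above can be checked before starting the jobserver. The following Python sketch is only an illustration: it assumes the simple "Key= value" layout shown in this thread, and the helper name find_unquoted_hosts is hypothetical, not part of schedulix.

```python
import re

# A value made only of digits and dots looks like an IP address.
# Assumption: the schedulix scanner reads such an unquoted value as a number.
IP_LIKE = re.compile(r"^\d+(\.\d+)+$")

def find_unquoted_hosts(conf_text):
    """Return (line_number, line) pairs whose RepoHost value is an unquoted IP."""
    problems = []
    for number, line in enumerate(conf_text.splitlines(), start=1):
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a "Key= value" line
        if key.strip().lower() == "repohost" and IP_LIKE.match(value.strip()):
            problems.append((number, line.strip()))
    return problems

bad = 'RepoHost= 192.168.1.24\nRepoPort= 2506\n'
good = 'RepoHost= "192.168.1.24"\nRepoPort= 2506\n'
print(find_unquoted_hosts(bad))   # [(1, 'RepoHost= 192.168.1.24')]
print(find_unquoted_hosts(good))  # []
```

A host name such as "myserver" would pass this check unquoted, which matches the explanation that only a value scanned as a number causes the error.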


Off Topic:
Of course we feel honoured that you like the default password so much.
But its strength is comparable to "secret", or even worse: it is the published default password in a schedulix environment, and probably the first thing an attacker would try.
We only store the SHA256 hash of a password. Nearly every character is permitted (I'd be careful with \0 since that character has a special meaning to the main parser/scanner).
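
To illustrate what storing only a hash means, here is a minimal Python sketch using the standard hashlib module. It is not schedulix's actual code: the UTF-8 encoding, the hex representation, and the absence of a salt are assumptions made for the example.

```python
import hashlib

def sha256_hex(password):
    """Return the SHA-256 digest of a password as 64 hex characters."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# Only the digest is kept; the clear-text password is not stored.
stored = sha256_hex("a much longer, non-default passphrase")

# Verification hashes the supplied password and compares digests.
print(sha256_hex("a much longer, non-default passphrase") == stored)  # True
print(sha256_hex("G0H0ME") == stored)                                 # False
print(len(stored))                                                    # 64
```

Hashing protects the stored value, but it does not make a well-known default password any harder to guess.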

And just because it is nice to know where some defaults come from:
G0H0ME is like GOHOME is like GO HOME is GEH HEIM (in German) is like GEHEIM is SECRET (in English). (Another "proof" that the strength of G0H0ME is comparable to SECRET.)

Best regards,

Ronald

zheng...@gmail.com

Feb 7, 2017, 9:25:28 PM
to schedulix
Hi,

    Thanks for your help. I quoted the host as you said and ran the command again. Now it reports this:

ERROR   [Jobserver]     08-02-2017 10:03:21 CST (04305141859) Invalid config entry: HTTPPORT=: (04305141714) Error converting value: For input string: ""
ERROR   [Jobserver]     08-02-2017 10:03:21 CST (04305141920) Missing required config entry: JOBFILEPREFIX
ERROR   [Jobserver]     08-02-2017 10:03:22 CST (04301271455) localhost:2506: Connection refused: connect (java.net.ConnectException)
ERROR   [Jobserver]     08-02-2017 10:03:53 CST (04301271455) localhost:2506: Connection refused: connect (java.net.ConnectException)
ERROR   [Jobserver]     08-02-2017 10:04:24 CST (04301271455) localhost:2506: Connection refused: connect (java.net.ConnectException)
ERROR   [Jobserver]     08-02-2017 10:04:55 CST (04301271455) localhost:2506: Connection refused: connect (java.net.ConnectException)

   It seems the entry is not set correctly, and the RepoHost is changed to "localhost" automatically. What does this mean? Could you tell me how to fix it?
I hope to hear from you soon. Many thanks!

zheng...@gmail.com

Feb 7, 2017, 10:23:06 PM
to schedulix
Hi!
   I noticed the error message "Missing required config entry: JOBFILEPREFIX" and found that JOBFILEPREFIX is not set in my jobserver's config tab. I looked at the configuration of the example server, "/home/schedulix/taskfiles/host_1-", but I don't understand what it means. There is no file named 'host_1' in this folder, only:
starttimes.GLOBAL.'EXAMPLES'.'HOST_1'.'SERVER'
starttimes.GLOBAL.'EXAMPLES'.'HOST_2'.'SERVER'
starttimes.GLOBAL.'EXAMPLES'.'LOCALHOST'.'SERVER'
So what should I do to solve these problems?
Thank you very much!

Ronald Jeninga

Feb 8, 2017, 2:11:52 AM
to schedulix
Hi,

If a jobserver has to execute a job (a command line), it writes a taskfile containing all relevant information.
This taskfile is read by a jobexecutor, which then executes the command line.
As a consequence, jobservers can be restarted at any time without loss of information.

If no jobs are running, you won't find any taskfiles. That's why the directory contains none.
The JOBFILEPREFIX is extended with the job ID to ensure unique file names.

In short: to get rid of the problem, set JOBFILEPREFIX (the entry named in your error message) to point to some location in your file system.
Ideally the taskfiles are stored on a separate device to prevent errors caused by a full file system (in which case the jobserver can't start any more jobs).
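
As a rough illustration of how a prefix plus a job ID yields unique taskfile names, consider the Python sketch below. The exact naming used by schedulix is not shown in this thread, so the simple concatenation, the helper names, and the job IDs are assumptions; the prefix is the example value quoted earlier.

```python
import os

def taskfile_path(prefix, job_id):
    """Append the job ID to the prefix to get a unique file name (assumed scheme)."""
    return f"{prefix}{job_id}"

def check_prefix(prefix):
    """Return True if the directory part of the prefix exists and is writable."""
    directory = os.path.dirname(prefix) or "."
    return os.path.isdir(directory) and os.access(directory, os.W_OK)

prefix = "/home/schedulix/taskfiles/host_1-"   # example value quoted in this thread
print(taskfile_path(prefix, 4711))             # /home/schedulix/taskfiles/host_1-4711
print(taskfile_path(prefix, 4712))             # /home/schedulix/taskfiles/host_1-4712
print(check_prefix(prefix))                    # depends on the local file system
```

This also shows why "host_1-" is not itself a file: it is only the leading part of the file names, so the directory in front of it must exist and be writable by the jobserver.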

Regards,

Ronald

Ronald Jeninga

Feb 8, 2017, 2:17:18 AM
to schedulix
Hi,

If you change the entry REPOHOST (or any other value), make sure you uncheck the inherit box to the right of it.
As you noticed, your changes won't take effect if you don't.

Usually the REPOHOST will be the same for all jobservers. This means that the value needs to be set only once, at GLOBAL level.
Everything below GLOBAL (hence everything) will inherit the value. O(1) effort for configuration ;-)
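
The inheritance behaviour described here can be pictured with a small Python model. This is a conceptual sketch, not schedulix code: the Scope class and its resolve method are hypothetical, and a key is stored locally only where the inherit box is unchecked.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Scope:
    """One node in a hypothetical scope tree (GLOBAL, a group, a jobserver)."""
    name: str
    parent: Optional["Scope"] = None
    # Values set locally; a key appears here only if "inherit" is unchecked.
    local: dict = field(default_factory=dict)

    def resolve(self, key):
        """Walk up towards GLOBAL until some scope defines the key itself."""
        scope = self
        while scope is not None:
            if key in scope.local:
                return scope.local[key]
            scope = scope.parent
        raise KeyError(key)

global_scope = Scope("GLOBAL", local={"REPOHOST": "192.168.1.24", "REPOPORT": 2506})
group = Scope("AT_GROUP", parent=global_scope)
server = Scope("KEVIN_PC", parent=group)

print(server.resolve("REPOHOST"))   # 192.168.1.24 (inherited from GLOBAL)

# A value set on an intermediate scope hides the GLOBAL value for everything below.
group.local["REPOHOST"] = "localhost"
print(server.resolve("REPOHOST"))   # localhost
```

In this model, a value entered at the jobserver while inherit is still checked is simply never stored locally, which is why such a change appears to have no effect.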

Regards,

Ronald

Ni Kevin

Feb 8, 2017, 3:29:17 AM
to schedulix
Hi,

I set REPOHOST at GLOBAL level and inherit it in the jobserver config. Now the value no longer changes in myjobserver.conf,
but I still get this error:
Something went wrong : java.net.BindException: Address already in use: JVM_Bind

I read the manual section on the configuration of jobservers and resources, and it describes REPOHOST as "Host name or IP address of the scheduling server".
I'm wondering whether the IP of the scheduling server is the same as the IP of my CentOS server?

Thank you for your help !
Zhengfei

Ronald Jeninga

Feb 8, 2017, 3:53:16 AM
to schedulix
Hi,

If you unchecked the inherit box somewhere in between GLOBAL and your jobserver, no inheritance will take place.

The BindException tells you that you are using a port which is already in use, most probably the HTTPPORT or the NOTIFYPORT.
In both cases the error isn't fatal, but you won't have access to your log files (HTTPPORT), or your jobserver won't be notified when there's something to do (NOTIFYPORT) and you'll have to wait until the jobserver does a GET NEXT JOB.
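
One way to find out which port is occupied is to try binding it yourself. The Python sketch below does this for TCP only; whether schedulix uses TCP or UDP for each port is not stated in this thread, and the port numbers are placeholders, not defaults.

```python
import socket

def port_is_free(port, host="0.0.0.0"):
    """Try to bind the TCP port; failure means it is unavailable right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return False
        return True

# Hypothetical port numbers; use the HTTPPORT and NOTIFYPORT values of your jobserver.
for name, port in {"HTTPPORT": 8905, "NOTIFYPORT": 47000}.items():
    print(name, port, "free" if port_is_free(port) else "already in use")
```

Run it on the Windows machine while the jobserver is stopped. A port reported as "already in use" is then held by some other process, and a different value should be configured.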

The statement in the manual is a bit sloppy, agreed. It could say something like "the hostname or IP address of the system that runs the scheduling server process" to make it correct.
But that's a long sentence and doesn't really add anything. The original statement shortens this at the cost of some ambiguity ("server" in the sense of a computer that runs server processes vs. a server process).
Since processes don't have an IP address, the intended meaning can be derived. So yes: it is the IP address of your CentOS machine.

Regards,

Ronald

Ni Kevin

Feb 8, 2017, 6:53:21 AM
to schedulix
Hi,

I have now finished setting up the Windows jobserver and created a test job. I followed the steps in "Not able to get schedulix job server running on Windows"
to add a named resource and an environment, and added the resource to the jobserver.
But the job failed:


Error Message: Job cannot run in any scope because of resource shortage
Job Exit State: FAILURE
Final: false
Restartable: true
Rerun: 0
Exit Code: NONE
Exit State Mapping: UNIX
Priority ([0,100], lower value means higher priority):
Raw Priority: 0
Dynamic Priority: 0
Server: NONE   (why?)
Program Pid: NONE
Run Program: echo hello world
Run Commandline: NONE
Rerun Program: NONE
Rerun Commandline: NONE
Workdir Definition: NONE
Workdir: NONE
Logfile Definition: ${JOBID}.log
Logfile:
Error Logfile Definition: ${JOBID}.log
Error Logfile:
Environment: KEVIN_PCSERVER
Footprint: NONE
Expected Runtime [Sec]: 0
Kill Program: NONE
Kill Id: NONE
Kill Exit Code: NONE


It seems I am close to success.

Many thanks!
Zhengfei

Ronald Jeninga

Feb 8, 2017, 7:20:13 AM
to schedulix
Hi,

First of all, you'll have to check whether the Windows jobserver is running and connected to the scheduling server.
If you look at the properties tab of the jobserver, you'll see the fields REGISTERED (the checkbox should be checked) and CONNECTED (should be checked as well).
If this is not the case, the jobserver either isn't running or didn't manage to connect to the scheduling server and register itself.
The jobserver's log file might provide some information.

If the jobserver is registered but not connected, it got some connect data from the server, but that data isn't correct for some reason.
A popular mistake is to configure the REPOHOST as "localhost", which is only correct on the system that hosts the scheduling server itself, not on any remote system.
In this case you'll have to do two things:
1. change the configuration
2. update the jobserver.conf file
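
Before changing anything, it can help to test from the Windows machine whether the scheduling server's port is reachable at all. The following Python sketch is a generic TCP check, not a schedulix tool; the host and port are the values posted earlier in this thread, and the helper name can_reach is hypothetical.

```python
import socket

def can_reach(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# On a remote jobserver, "localhost" points at the jobserver itself,
# so only the scheduling server's real address can be expected to answer.
for host in ("localhost", "192.168.1.24"):
    print(host, 2506, "reachable" if can_reach(host, 2506) else "not reachable")
```

If the real address is not reachable either, a firewall or a stopped scheduling server is a more likely cause than the jobserver configuration.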

If the jobserver is registered and connected, that part is fine.

Now if the jobserver is fine (registered and connected), we'll have to look at the resources.
Since you have a clear idea of where the job has to run, the situation is easy to investigate.
First, open your job definition and look at the REQUIRED RESOURCES tab. There you see a table of all required resources and where each requirement is defined (Environment, Footprint or the Job Definition itself).
Then open the jobserver's RESOURCES tab. Here you can see which resources are visible to that jobserver. You'll probably have to expand the tree.
The next step is to compare the requirements with the availability. This will lead you to the missing resource. Note: if a resource is "offline", it is not allocatable (effectively not present).
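
The comparison of requirements and availability amounts to a set difference. The Python sketch below shows the idea with made-up resource names; in practice you would read both lists from the two tabs mentioned above.

```python
def missing_resources(required, available):
    """Return the required resources that the jobserver cannot allocate.

    `available` maps a resource name to True (online) or False (offline).
    Offline resources count as not present.
    """
    usable = {name for name, online in available.items() if online}
    return sorted(set(required) - usable)

# Hypothetical resource names, for illustration only.
required = ["KEVIN_PC", "WINDOWS", "EXTRA_RESOURCE"]
available = {"KEVIN_PC": True, "WINDOWS": True, "BACKUP_DISK": False}

print(missing_resources(required, available))  # ['EXTRA_RESOURCE']
```

Any name in the result is either not visible to the jobserver at all or currently offline, and a job requiring it cannot be scheduled there.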

HTH

Regards,

Ronald



Ni Kevin

Feb 8, 2017, 8:06:15 AM
to schedulix
Hi Ronald,

 It's awesome, it just worked! I had added an extra resource, so I just removed it. Thank you so much for patiently helping me solve these problems.
schedulix is a great project and I will recommend it to my friends. If you come to Shanghai, please
email me; my CEO and other colleagues will be very glad to talk with you about technology.

Thanks a lot!
Zhengfei

Ronald Jeninga

Feb 8, 2017, 8:33:52 AM
to schedulix
Hi Zhengfei,

Great that everything works now.

On a large enough scale, Shanghai is just around the corner from where I live (a third of the way around the earth) ;-)
But seriously, thank you for the invitation, and I certainly won't hesitate to contact you.

If enough colleagues want to work with schedulix as well, you (or your CEO) could consider booking a workshop.
Normally a workshop for users takes about 3 days. If you want to know the ins and outs of installation and administration, you'll have to add another day.
You can send me a personal e-mail if you want to know the details (ronald.jeninga at independit.de).

Anyway, I am absolutely happy you like the project so much. And please spread the word!
There's no such thing as too many users!
We're working hard to improve the system, and it is a real reward to receive praise for it.

Regards,

Ronald
