Bots job queue modification

BikeMike

May 30, 2012, 7:56:48 AM
to Bots Open Source EDI Translator
Hi everyone,

The concept of jobs and job queues comes from my AS/400 background,
where they are used to help control resource usage. My implementation of
this concept in Bots means that you can request an engine run at any
time, even if another is running. It allows scheduling of bots engine
runs by an external scheduler without worrying too much about timing. I
was also considering creating a built-in scheduler for bots, but with
this change it seems less necessary.

Each run is added to a job queue and runs in job number sequence. To
keep it simple, each queued job is actually a process that is just
waiting in a 5 second "sleep" loop. When the previous job ends, the next
one drops out of its loop and runs. Duplicate jobs (same parameters) are
not allowed on the queue; these requests are discarded. I see no need to
allow duplicates because of the way bots works; this helps limit the
number of queued jobs. Once a job is actually running, a duplicate of it
can be queued to run.
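
As a rough illustration of the rules above (not the code in the attached engine.py), the sketch below keeps the queue in memory inside one process, whereas the real change uses one waiting process per job. The names JobQueue, add, my_turn and wait_and_run are made up for this example, and it is written for Python 3.

```python
import time


class JobQueue:
    """In-memory stand-in for the job queue (hypothetical names throughout)."""

    def __init__(self):
        self.counter = 0      # source of sequential job numbers
        self.queued = {}      # job number -> parameters, still waiting
        self.running = None   # job number currently running, or None

    def add(self, params):
        """Queue a run; return its job number, or None for a discarded duplicate."""
        if params in self.queued.values():
            return None       # same parameters are already waiting
        self.counter += 1
        self.queued[self.counter] = params
        return self.counter

    def my_turn(self, jobnumber):
        """True when nothing is running and this is the lowest queued job number."""
        return (self.running is None
                and jobnumber in self.queued
                and jobnumber == min(self.queued))

    def wait_and_run(self, jobnumber, run, poll_seconds=5):
        """Sleep in a loop until it is this job's turn, then run it."""
        while not self.my_turn(jobnumber):
            time.sleep(poll_seconds)
        # Leaving the queue here is what lets a duplicate be queued
        # while this job is running.
        params = self.queued.pop(jobnumber)
        self.running = jobnumber
        try:
            return run(params)
        finally:
            self.running = None
```

Parameters are assumed to be comparable values such as tuples, for example `JobQueue().add(("--new",))`.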

Even though bots may allow parallel running of bots engine in a future
version, it will probably not work with the default SQLite database; so
this modification would still be useful for a default installation.

Using the job queue is optional; it is controlled by a new setting in bots.ini.
Henk-jan, if you wanted to use this and make it the normal way bots
works, this setting and some code could be eliminated.
# Use a job queue for running bots engine. Default: False
usejobq=True

A counter (bots_jobnumber) provides sequential job numbers.
Job details are stored in the persist table (key: bots_jobq) and deleted
after running.

engine.py
    New functions to manage the job queue:
        addtojobq
        clearjobq
        jobisqueued
        jobisrunning
    Changed the sequence of initialisation slightly:
        first connect to database
        job queue processing - timer loop - wait your turn to run
        initialise logging only when ready to run
    New command-line option --clearjobq
        Used for the new menu option.
        The job queue is also cleared if you run crash recovery.
    New return codes:
        4: duplicate of job already in job queue
        5: job cancelled from job queue
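
For anyone calling the engine from their own scheduling script, the two new return codes could be checked roughly as below. This is a sketch for Python 3: run_engine, JOBQ_RETURN_CODES and DEMO_COMMAND are names invented for the example, and the meaning of any other return code is not covered here.

```python
import subprocess
import sys

# The two return codes added by the job queue change, as listed above.
JOBQ_RETURN_CODES = {
    4: "duplicate of job already in job queue",
    5: "job cancelled from job queue",
}

# Stand-in command that simply exits with code 4, so the sketch can be
# tried without a bots installation. Replace it with the engine command
# line your scheduler already uses.
DEMO_COMMAND = [sys.executable, "-c", "import sys; sys.exit(4)"]


def run_engine(command):
    """Run a command and return (return code, job queue message or None)."""
    returncode = subprocess.call(command)
    return returncode, JOBQ_RETURN_CODES.get(returncode)
```

A scheduler script would typically treat code 4 as harmless, since an identical run is already waiting in the queue.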

bots_context.py
    Retrieve the usejobq value to control menu option visibility.

menu.html
    New option "Clear job queue" on the Systasks menu, if using job queue.
    There is currently no way to cancel an individual job from the queue,
    or to view the queue, but I do not see any need for this.

views.py
    Do not check for database lock before run, if using job queue.

I have been using this in our production environment for a few days now
and it is working well. It replaces a previous "kludge" solution using
batch scripts and a job queue folder, which was Windows-specific.

--
Kind Regards,
Mike
engine.py
bots_context.py
views.py
menu.html

henk-jan ebbers

May 30, 2012, 8:23:20 AM
to bots...@googlegroups.com
hi Mike,

very interesting.

what exactly is the difference with a scheduler? a scheduler just runs, without queue?

this might be a very good solution for the 'parallel routes' (when combined with the option to limit the time in a channel).
(it solves the same problem, doesn't it?)
why 'parallel routes' have been requested:
- start a route, and know that it will be run. this is a problem now, because a run can just be discarded (when engine is already running). Might be an idea to introduce 'priority' in the queue.
- performance: parallel processes can be started, which run on separate processor-cores. This is nice, but AFAIK performance is not a problem now (never been reported).
when thinking about how to implement parallel routes, I have this conceptual problem:
now I do a global lock (on the database); for my idea I have to introduce other, lower-level locks. Eg locks on routes, channels....this is hard to get right.

I will check this out!

kind regards,
henk-jan

BikeMike

May 30, 2012, 9:10:46 AM
to Bots Open Source EDI Translator
On May 30, 9:23 pm, henk-jan ebbers <eppye.b...@gmail.com> wrote:
> hi Mike,
>
> very interesting.
>
> what exactly is the difference with a scheduler? a scheduler just runs, without queue?

Yes, a scheduler (Windows task scheduler, Unix cron) just runs a
process. In the PC/server world, there does not seem to be any concept
of queues or batch processing. Every process just runs, and the OS
manages multitasking. The AS/400 has a scheduler too, but jobs go from
the scheduler to a queue to be run when resource is available. In a
well running system the queue never gets long ;-)) There are multiple
queues, and each queue and each job has a priority. So some jobs can
jump ahead. I don't think we need that complexity for bots though!

So with the job queue in bots you can more reliably use task
scheduler / cron etc.

> this might be a very good solution for the 'parallel routes' (when combined with the option to limit the time in a channel).
> (it solves the same problem, doesn't it?)
> why 'parallel routes' have been requested:
> - start a route, and know that it will be run. this is a problem now, because a run can just be discarded (when engine is already running). Might be an idea to introduce 'priority' in the queue.
> - performance: parallel processes can be started, which run on separate processor-cores. This is nice, but AFAIK performance is not a problem now (never been reported).
> when thinking about how to implement parallel routes, I have this conceptual problem:
> now I do a global lock (on the database); for my idea I have to introduce other, lower-level locks. Eg locks on routes, channels....this is hard to get right.

It is an alternative to parallel routes, but does not solve every
problem. If you have a performance problem, this will not help it. For
example, a route that takes 30 minutes to process a single very large
file could still hold up other high priority routes. I have seen
several requests like this in the mailing list, but it is not an issue
for me. It does solve the problem of discarded runs, which was my
problem. I still deliberately discard duplicates of already queued
runs.

henk-jan ebbers

May 30, 2012, 9:33:11 AM
to bots...@googlegroups.com


On 05/30/2012 03:10 PM, BikeMike wrote:
> It is an alternative to parallel routes, but does not solve every problem. If you have a performance problem, this will not help it. For example, a route that takes 30 minutes to process a single
> very large file could still hold up other high priority routes. I have seen several requests like this in the mailing list, but it is not an issue for me. It does solve the problem of discarded runs,
> which was my problem. I still deliberately discard duplicates of already queued runs.
only thing I can find in the mailing list is the combination of a run with a high priority, while a low priority run is running too long because of big edi volume.
I think this should be manageable by limiting the input via a channel?

kind regards,
henk-jan

BikeMike

May 30, 2012, 6:29:05 PM
to Bots Open Source EDI Translator
Hi henk-jan,
The limit on channels will help with this, unless the input is one
huge file that can't be split up. I think someone in the mailing list
mentioned a situation like this (related to aircraft freight?) but not
sure. This would be an unusual situation though.

Kind Regards,
Mike

henk-jan ebbers

May 30, 2012, 7:50:14 PM
to bots...@googlegroups.com
yes, I looked at that one.
this was the one that has high-priority routes, that could be 'shadowed' by longer running low priority jobs.
they were fine when the high priority transactions did not arrive on time.

in my performance tests I have some big edi files (5 MB each; I have never seen them that big in reality).
4 of these files in 1 run take 6:41 minutes altogether. (bots 2.2.0)

henk-jan

BikeMike

May 30, 2012, 10:40:58 PM
to bots...@googlegroups.com
Hi henk-jan
I have added a priority command line parameter: eg. -p1 for priority 1.
The default priority is 5. This will allow you to submit a lower or
higher priority job according to your needs. Of course it still has to
wait if another job is already running.

Job number is now always 6 digits: the first digit is the priority and
the last 5 digits are the unique job number.
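
The numbering scheme can be illustrated with a couple of small helpers. This is only a sketch in Python 3, not the attached code: the function names are invented, a single-digit priority is assumed, and what happens once the 5-digit counter is used up is not stated above, so the sketch simply rejects such values.

```python
def make_jobnumber(priority, sequence):
    """Build a 6-digit job number: priority digit followed by a 5-digit sequence."""
    if not 0 <= priority <= 9:
        raise ValueError("priority must be a single digit")
    if not 0 <= sequence <= 99999:
        raise ValueError("sequence must fit in 5 digits")
    return "{}{:05d}".format(priority, sequence)


def split_jobnumber(jobnumber):
    """Return (priority, sequence) for a 6-digit job number."""
    return int(jobnumber[0]), int(jobnumber[1:])
```

Because jobs run in job number sequence, sorting these numbers puts a priority 1 job ahead of a default priority 5 job even if it was submitted later, e.g. "100043" sorts before "500042".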

Kind Regards,
Mike
engine.py