Are queue manager updates written to disk immediately?

Simon Clubley

Apr 11, 2013, 11:27:06 AM

When a batch job completes, does the queue manager _immediately_ write
that information to its data structures on disk, or is that information
cached in memory for a short time first?

I had a problem this morning which I have not seen before and it looks
like VMS is not _immediately_ writing queue updates away to disk, which
to put it mildly is bl**dy dangerous if true.

A routine batch job ran at 08:20 and completed; I have the log file
from this first run on disk.

The power failed about a minute or so later. (This box is not UPS
protected.)

When the system restarted, the same job ran again. I also have the log
file from this second run on disk.

The system disk (and all disks) are in write-through mode:

Volume Status: ODS-2, subject to mount verification, protected subsystems
enabled, file high-water marking, write-through caching enabled.

These are directly attached disks on a PCI controller and it's an Alpha
V8.3 system.

From the end of the first run:

$ submit/queue=ba0/after:"tomorrow+08:20" ownsrc:check_queues.com
Job CHECK_QUEUES (queue BA0, entry 3927) holding until 12-APR-2013 08:20
$ exit
[deleted] job terminated at 11-APR-2013 08:20:02.57

Accounting information:
  Buffered I/O count:           204    Peak working set size:       5088
  Direct I/O count:             229    Peak virtual size:         173040
  Page faults:                 1455    Mounted volumes:                0
  Charged CPU time:   0 00:00:00.52    Elapsed time:       0 00:00:02.56

As the job had completed, the entry should have disappeared from the
queue database at that point. Instead after the system restart, the
same job ran again. Here is the end of the second run:

$ submit/queue=ba0/after:"tomorrow+08:20" ownsrc:check_queues.com
Job CHECK_QUEUES (queue BA0, entry 3923) holding until 12-APR-2013 08:20
$ exit
[deleted] job terminated at 11-APR-2013 08:25:11.69

Accounting information:
  Buffered I/O count:           204    Peak working set size:       5680
  Direct I/O count:             253    Peak virtual size:         173040
  Page faults:                 1387    Mounted volumes:                0
  Charged CPU time:   0 00:00:00.67    Elapsed time:       0 00:00:05.25

Also, there was no entry in the queue database for the resubmitted job
(entry 3927) from the first run, but the resubmitted job from the second
run (entry 3923) is in the queue database as expected.

Does anyone have any ideas?

Thanks,

Simon.

PS: Before someone asks, I will fire this off to VMS support in a day or
so, but I just wanted to do a quick check here to see if anyone here had
seen this first.

--
Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world

Jan-Erik Soderholm

Apr 11, 2013, 12:09:22 PM

I guess accounting (acc/queue=ba0) also shows both jobs as run?
Nothing weird with the start/finish timestamps in accounting?

I would look more at something with the system clock at
startup that caused the holding job to be released, such
as a startup with the wrong time setting or similar.

That is, queue ba0 is started by the startup before the
clock is corrected. Or something like that.

Jan-Erik.

Simon Clubley

Apr 11, 2013, 12:28:04 PM

On 2013-04-11, Jan-Erik Soderholm <jan-erik....@telia.com> wrote:
>
> I guess accounting (acc/queue=ba0) also shows both jobs as run?

There's no accounting entry for the first job even though there is
a full logfile, but since accounting log updates are buffered, then
that's not really a major surprise.

> Nothing weird with the start/finish timestamps in accounting?
>
> I would look more at something with the system clock at
> startup that caused the holding job to be released, such
> as a startup with the wrong time setting or similar.
>

All the timestamps on the log files and accounting are correct; the only
time the system clock is corrected on this system is once a day
during the night from an NTP source and that job had already run a couple
of hours previously. There was nothing strange in this NTP update job
log file either.

In addition, the queue entry number from the accounting record for the
second job didn't match the entry number from the submit command in
the first job.

Thanks for the suggestions, however; I do appreciate them.

Simon.

Jan-Erik Soderholm

Apr 11, 2013, 12:53:03 PM

Simon Clubley wrote 2013-04-11 18:28:
> On 2013-04-11, Jan-Erik Soderholm <jan-erik....@telia.com> wrote:
>>
>> I guess accounting (acc/queue=ba0) also shows both jobs as run?
>
> There's no accounting entry for the first job even though there is
> a full logfile, but since accounting log updates are buffered, then
> that's not really a major surprise.
>
>> Nothing weird with the start/finish timestamps in accounting?
>>
>> I would look more at something with the system clock at
>> startup that caused the holding job to be released, such
>> as a startup with the wrong time setting or similar.
>>
>
> All the timestamps on the log files and accounting are correct; the only
> time the system clock is corrected on this system is once a day
> during the night from an NTP source and that job had already run a couple
> of hours previously.

But there was a reboot in between, no?

> In addition, the queue entry number from the accounting record for the
> second job didn't match the entry number from the submit command in
> the first job.

Higher/later or lower/earlier?

Does it match the entry number from *any* previous job?
I don't know how long you keep logs, of course... :-)

Sounds like the quemgr thought that the job hadn't been
run (or hadn't completed) and simply restarted it.

But then, will a restarted batch job not run using the
original entry number? Maybe not...

I would expect the queue database to be updated synchronously
at the time of batch job "rundown". That is where SHOW QUEUE
looks, no?

Weird.
I guess this could be simulated on a test system. :-)

Jan-Erik.

Paul Sture

Apr 11, 2013, 1:24:49 PM

In article <kk6n7i$6br$1...@news.albasani.net>,

Jan-Erik Soderholm <jan-erik....@telia.com> wrote:

> I guess accounting (acc/queue=ba0) also shows both jobs as run?
> Nothing weird with the start/finish timestamps in accounting?
>
> I would look more at something with the system clock at
> startup that caused the holding job to be released, such
> as a startup with the wrong time setting or similar.
>
> That is, queue ba0 is started by the startup before the
> clock is corrected. Or something like that.

I have definitely had Alphas come back from a reboot with the wrong
time. I believe I once had a system which came back with the date out by
1 year (or a year +/- one day), which fooled me at first, though that
might have been a VAX.

--
Paul Sture

Johnny Billquist

Apr 11, 2013, 1:30:53 PM

On 2013-04-11 18:28, Simon Clubley wrote:
> On 2013-04-11, Jan-Erik Soderholm <jan-erik....@telia.com> wrote:
>>
>> I guess accounting (acc/queue=ba0) also shows both jobs as run?
>
> There's no accounting entry for the first job even though there is
> a full logfile, but since accounting log updates are buffered, then
> that's not really a major surprise.
>
>> Nothing weird with the start/finish timestamps in accounting?
>>
>> I would look more at something with the system clock at
>> startup that caused the holding job to be released, such
>> as a startup with the wrong time setting or similar.
>>
>
> All the timestamps on the log files and accounting are correct; the only
> time the system clock is corrected on this system is once a day
> during the night from an NTP source and that job had already run a couple
> of hours previously. There was nothing strange in this NTP update job
> log file either.
>
> In addition, the queue entry number from the accounting record for the
> second job didn't match the entry number from the submit command in
> the first job.
>
> Thanks for the suggestions, however; I do appreciate them.

My first reaction would be that even though you have the log and so on,
the job had not yet completed. However, the fact that the job submitted
in the first run doesn't exist in the queue makes it weirder.

Maybe some corruption of the batch queue file?

Johnny

Simon Clubley

Apr 11, 2013, 1:35:23 PM

On 2013-04-11, Jan-Erik Soderholm <jan-erik....@telia.com> wrote:
> Simon Clubley wrote 2013-04-11 18:28:
>> On 2013-04-11, Jan-Erik Soderholm <jan-erik....@telia.com> wrote:
>>>
>>> I guess accounting (acc/queue=ba0) also shows both jobs as run?
>>
>> There's no accounting entry for the first job even though there is
>> a full logfile, but since accounting log updates are buffered, then
>> that's not really a major surprise.
>>
>>> Nothing weird with the start/finish timestamps in accounting?
>>>
>>> I would look more at something with the system clock at
>>> startup that caused the holding job to be released, such
>>> as a startup with the wrong time setting or similar.
>>>
>>
>> All the timestamps on the log files and accounting are correct; the only
>> time the system clock is corrected on this system is once a day
>> during the night from an NTP source and that job had already run a couple
>> of hours previously.
>
> But there was a reboot in between, no?
>

Sequence:

NTP job (06:30) -> check_queues (08:20) -> power failure ->
check_queues run again (08:25).

>> In addition, the queue entry number from the accounting record for the
>> second job didn't match the entry number from the submit command in
>> the first job.
>
> Higher/later or lower/earlier?
>
> Does it match the entry number from *any* previous job?
> I don't know how long you keep logs, of course... :-)
>

I also have the log file from yesterday's run (10-Apr-2013 08:20).

The entry number from yesterday's submit command matches the entry
number in the accounting log for the second run today.

So the queue manager has indeed run the job submitted yesterday twice
to completion today.

> Sounds like the quemgr thought that the job hadn't been
> run (or hadn't completed) and simply restarted it.
>
> But then, will a restarted batch job not run using the
> original entry number? Maybe not...
>
> I would expect the queue database to be updated synchronously
> at the time of batch job "rundown". That is where SHOW QUEUE
> looks, no?
>

That's _exactly_ what I would expect as well, and on disk as well;
not just in memory.

Even if there was some queue manager bug caused by a power failure
during some unusual tight timing window of a few milliseconds [*],
that still does not explain the disappearing job from the submit
command in the first job.

Simon.

[*] A few milliseconds maximum, because don't forget I have a _full_
logfile from the first run of the job today.

Simon Clubley

Apr 11, 2013, 1:53:04 PM

On 2013-04-11, Johnny Billquist <b...@softjar.se> wrote:
>
> My first reaction would be that even though you have the log and so on,
> the job had not yet completed. However, the fact that the job submitted
> in the first run doesn't exist in the queue makes it weirder.
>
> Maybe some corruption of the batch queue file?
>

I don't see any evidence of that; all the queues and other jobs
look just fine.

Paul Sture

Apr 11, 2013, 2:00:42 PM

In article <kk6s8q$39e$1...@dont-email.me>,

That sounds like a small window in the batch rundown code. If you
think about it, there must be a point between the batch job completing
and its return status being processed, with the queue manager either
deleting or retaining the job in the queue database.

--
Paul Sture

abrsvc

Apr 11, 2013, 2:50:15 PM

From the IDSM (V1.5), Chapter 30, page 1078:

"... the job controller indicates that the process should be terminated, LOGINOUT terminates it through the following steps:

1) It writes a logout message to the logfile.
2) It closes the logfile.
3) If the logfile is to be printed, then LOGINOUT requests the $SNDJBCW system service again, this time to queue the file to a print queue.
4) It then requests the $EXIT system service from executive mode. After any executive mode exit handlers have performed their work, the $EXIT system service requests the $DELPRC system service, which removes the process from the system."

Page 1079 lists the work for process deletion:

-> All resources allocated to the process must be returned to the system.
-> Accounting information must be sent to the job controller.
-> Any subprocesses of the process being deleted must be deleted.
-> etc...

I would guess that your process was in the middle of the above when the system went down. Until the process is actually deleted, I would suspect that the queue manager is not informed of its completion.

Dan

Simon Clubley

Apr 11, 2013, 3:27:44 PM

The problem with the window theory is that while it explains a repeating
job, it doesn't explain why the submit from the first run went walkies.

_If_ the queue manager is not delaying the on disk updates, then I wonder
if I have come across more than one problem here.

It will be interesting to see what HP have to say.

Thanks everyone,

Simon.

David Froble

Apr 11, 2013, 6:02:48 PM

Jan-Erik Soderholm wrote:

> I guess accounting (acc/queue=ba0) also shows both jobs as run?
> Nothing weird with the start/finish timestamps in accounting?
>
> I would look more at something with the system clock at
> startup that caused the holding job to be released, such
> as a startup with the wrong time setting or similar.
>
> That is, queue ba0 is started by the startup before the
> clock is corrected. Or something like that.
>
> Jan-Erik.

Good idea of something to look at.

I too wondered about the clock, but what Simon posted didn't seem to
have any time stamps that looked improper.

If it's as reported, then I'd say it's definitely a problem.

That said, I'd also suggest that if the batch job running more than once
is a problem, then perhaps some type of flag showing the last run time of
the procedure might be in order, provided it is at least written to disk
in a timely manner. Regardless, there could be multiple reasons for a
batch job to run when you don't want it to run. Now, if it's just a
report, so what if it's run twice.

I usually build such sequence info into my designs when appropriate.
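
As an illustration of the "last run" flag idea, here is a minimal sketch in Python rather than DCL. Everything in it is an assumption made for the example: the marker file, the run identifier (for instance the scheduled date) and the function names are hypothetical and are not part of any VMS facility. The guard only narrows the window; a crash after the work finishes but before the marker is recorded would still allow a second run, and the marker is only as durable as the storage underneath it.

```python
import os


def already_ran(marker_path, run_id):
    """Return True if the marker file records run_id as completed."""
    try:
        with open(marker_path, "r", encoding="ascii") as marker:
            return marker.read().strip() == run_id
    except FileNotFoundError:
        return False


def record_run(marker_path, run_id):
    """Write run_id to the marker file and ask the OS to flush it to disk."""
    with open(marker_path, "w", encoding="ascii") as marker:
        marker.write(run_id + "\n")
        marker.flush()
        os.fsync(marker.fileno())


def run_once(marker_path, run_id, work):
    """Run work() unless run_id is already recorded; return True if it ran."""
    if already_ran(marker_path, run_id):
        return False
    work()
    record_run(marker_path, run_id)
    return True
```

A job would call something like run_once("last_run.txt", "2013-04-11", do_checks) at its start, so a second release of the same scheduled run becomes a no-op.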

Jan-Erik Soderholm

Apr 12, 2013, 3:58:35 AM

Well, from the name of the procedure ("CHECK_QUEUES") I thought that
it would be safe to run it twice, but the question about what actually
happened is still interesting. :-)

And maybe you *do* want to "check your queues" right after
a reboot anyway! :-)

Actually, it might even be designed to run that particular
script right after any boot...

Jan-Erik.


Stephen Hoffman

Apr 12, 2013, 8:54:25 AM

On 2013-04-11 15:27:06 +0000, Simon Clubley said:

> When a batch job completes, does the queue manager _immediately_ write
> that information to its data structures on disk, or is that information
> cached in memory for a short time first?
>
> I had a problem this morning which I have not seen before and it looks
> like VMS is not _immediately_ writing queue updates away to disk, which
> to put it mildly is bl**dy dangerous if true.

This reply will not answer your question, nor identify the culprit
software or hardware here.

You'll want the answer from HP, whatever that might be.

The queue manager has occasionally lost data or corrupted data, so a
check of the patch status is certainly warranted. HP will undoubtedly
ask about that, too.

While I don't recall the details of the queue manager implementation
off the top and whether it's using careful updates or caching data for
whatever reason, VMS has classically tried to use so-called careful
updates; ordering the writes so that partial or bad data doesn't end up
available if/when a crash or power outage arises. You get either the
whole update, or nothing. These careful updates are tricky to
implement. This approach is also a great and wonderful scheme, right
up until you meet your first caching controller, or your first caching
disk. Particularly if the caching controller or the caching disk
misrepresents the state of the data back up to the host, or if the
cache batteries don't ride over the outage, or if the controller gets
reset and loses its marbles and loses the data.
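
To make the "whole update, or nothing" ordering concrete, here is a small sketch of one common way to do a careful update on a POSIX-style file system, written in Python. It is not a description of how the VMS queue manager works; the function name and file layout are assumptions for illustration. The new data is written to a temporary file and flushed first, and only then is the name switched over, so a crash leaves either the old contents or the new ones.

```python
import os
import tempfile


def careful_update(path, data):
    """Replace the contents of path so a reader sees old or new data, never a mix."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # step 1: the new data must be stable first
        os.replace(tmp_path, path)   # step 2: switch the name to the new data
    except BaseException:
        # On any failure, keep the old file and remove the temporary copy.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Step 3 (where supported): flush the directory entry recording the rename.
    if hasattr(os, "O_DIRECTORY"):
        dir_fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
        try:
            os.fsync(dir_fd)
        finally:
            os.close(dir_fd)
```

The ordering only helps if every layer below honors the flush request. A controller or drive that acknowledges the flush while the data is still in volatile cache defeats it.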

Rummage around for an OpenSolaris mailing list posting from Jeff
Bonwick (then working on ZFS at Sun) from around October, 2008 for some
related details on what ZFS was encountering with some SATA gear.
(OpenSolaris is the predecessor of what's now known as illumos, and used
in OpenIndiana and related.) Here's an excerpt from that post:

"FYI, I'm working on a workaround for broken devices. As you note,
some disks flat-out lie: you issue the synchronize-cache command, they
say "got it, boss", yet the data is still not on stable storage. Why
do they do this? Because "it performs better". Well, duh -- you can
make stuff *really* fast if it doesn't have to be correct.

Before I explain how ZFS can fix this, I need to get something off my
chest: people who knowingly make such disks should be in federal
prison. It is *fraud* to win benchmarks this way. Doing so causes real
harm to real people. Same goes for NFS implementations that ignore
sync. We have specifications for a reason. People assume that you
honor them, and build higher-level systems on top of them. Change the
mass of the proton by a few percent, and the stars explode. It is
impossible to build a functioning civil society in a culture that
tolerates lies. We need a little more Code of Hammurabi in the storage
industry."

Years ago, DEC had some SCSI configurations with batteries right in the
StorageWorks storage shelves, intended to allow the shelves and disks
to complete multiblock writes that might have been in flight. The
general problem with not getting all the data written to non-volatile
storage has only gotten more complex in the years since then, with
caching controllers (particularly with bad RAID batteries or no
batteries), and with caching drives, and the quest for higher and
higher performance I/O.

If this queue manager misbehavior is a sufficient issue for you,
consider getting yourself a Less-Interruptible Power Supply (LIPS, as
I've never met a truly uninterruptible power supply) for the system.
And as others have mentioned, add some checks against a job that really
can't run twice. (I've seen a few of these cases in clusters, when the
cluster time was skewed among hosts. Your "tomorrow+08:20" should have
avoided problems from the usual minor skews, unless the time in the
cluster — on the host that was running the queue manager, which is not
necessarily the host that was running the batch job — was very skewed.)
I've ended up with a batch scheduler for these and related tasks.

What's available for process control and process management using the
default VMS mechanisms and APIs is very low-level, unfortunately, and
what starts out as DCL and related baggage inevitably gets unwieldy as
cases and updates are added to the code; best to either acquire a
scheduler, or roll your own properly. This also gets into having a
transactional database, another of my "peeves" about application code
"rolling its own". Journaling and the rest are all because the power
and the hardware can be somewhere between untrustworthy and, well, see
Jeff's posting...

As for your question, donno. But this stuff can be (is) more
complicated than it looks.

--
Pure Personal Opinion | HoffmanLabs LLC

Stephen Hoffman

Apr 12, 2013, 9:06:34 AM

On 2013-04-12 12:54:25 +0000, Stephen Hoffman said:

> "synchronize-cache command"

ps: There are very few invocations of that command within VMS.

Simon Clubley

Apr 12, 2013, 10:03:58 AM

On 2013-04-12, Stephen Hoffman <seao...@hoffmanlabs.invalid> wrote:
> On 2013-04-11 15:27:06 +0000, Simon Clubley said:
>
> If this queue manager misbehavior is a sufficient issue for you,
> consider getting yourself a Less-Interruptible Power Supply (LIPS, as
> I've never met a truly uninterruptible power supply) for the system.

Thanks for the feedback, Hoff.

The problem with that is that it feels like a hardware workaround for a
software bug.

> And as others have mentioned, add some checks against a job that really
> can't run twice.

The problem with ad-hoc checks is, just as you mention, that they _are_ ad-hoc.

The normal application level production jobs (this was not one of them)
are part of a site specific scheduler which means that when they run is
under that scheduler's control (job specific .com files are created and
submitted by the scheduler as required).

This design also means there are no holding jobs waiting to be released
manually by mistake when they should not be; the scheduler in use was
designed that way on purpose to stop just this problem of the job being
run when it should not be. What it will not currently protect against
however is VMS itself running the same submitted job twice.

In case it's not obvious by now :-), I tend to be rather paranoid when
it comes to data integrity and security and even I did not think about
the possibility of VMS itself doing something like this (if indeed that
turns out to be the case).

> (I've seen a few of these cases in clusters, when the
> cluster time was skewed among hosts. Your "tomorrow+08:20" should have
> avoided problems from the usual minor skews, unless the time in the
> cluster -- on the host that was running the queue manager, which is not
> necessarily the host that was running the batch job -- was very skewed.)
> I've ended up with a batch scheduler for these and related tasks.
>

It's a standalone system; no cluster involved.

All hardware is official HP supported hardware; no unsupported third
party equipment for either the controller or disks.

Everything is configured as write-through; no deferred writes involved.

BTW, it also occurred to me after my last batch of responses that if
a window exists during job rundown when the queue manager thinks the
job is still active even though the log file is complete, then the job
should have been marked with the "system failed during execution" status
you would normally get in that situation upon system restart.

That makes me think nothing about the job actually starting was written
to the queue manager database on disk even though a full logfile was
written to those same disks. (The logfile was on a different disk, but
that disk was attached to the same controller.)

I've now logged the issue with HP and they are currently looking at it.

Thanks everyone,

Simon.

Jan-Erik Soderholm

Apr 12, 2013, 10:27:54 AM

And it wasn't as simple as an extra submit of the job
from the startup scripts? Do you have any log files
with the startup/console output?

Jan-Erik.

Stephen Hoffman

Apr 12, 2013, 10:42:11 AM

On 2013-04-12 14:03:58 +0000, Simon Clubley said:

> On 2013-04-12, Stephen Hoffman <seao...@hoffmanlabs.invalid> wrote:
>> On 2013-04-11 15:27:06 +0000, Simon Clubley said:
>>
>> If this queue manager misbehavior is a sufficient issue for you,
>> consider getting yourself a Less-Interruptible Power Supply (LIPS, as
>> I've never met a truly uninterruptible power supply) for the system.
>
> Thanks for the feedback, Hoff.
>
> The problem with that is that it feels like a hardware workaround for a
> software bug.

Um, so? If throwing hardware at the problem reduces or avoids the
problem, it's goodness.

I'm all for not testing the handling and recovery of others' software
during hard-crashes, too.

> The normal application level production jobs (this was not one of them)
> are part of a site specific scheduler which means that when they run is
> under that scheduler's control (job specific .com files are created and
> submitted by the scheduler as required).

Time to move this job into the production scheduler?

> This design also means there are no holding jobs waiting to be released
> manually by mistake when they should not be; the scheduler in use was
> designed that way on purpose to stop just this problem of the job being
> run when it should not be. What it will not currently protect against
> however is VMS itself running the same submitted job twice.

I'm not fond of the VMS APIs here. They're far too primitive, and far
too limited.

The queue manager and the operator subsystem and some other core
functions are positively ancient designs. They weren't even
particularly advanced when they were designed and written, either.
TOPS was doing better in a number of areas. IMO, an overhaul is long
overdue. (Which ties back to my comments in another recent thread
around whether seeking to emulate VMS is really such a good idea. But
I digress.)

> In case it's not obvious by now :-), I tend to be rather paranoid when
> it comes to data integrity and security and even I did not think about
> the possibility of VMS itself doing something like this (if indeed that
> turns out to be the case).

You should see what a storage controller once did to a database I was managing.

> All hardware is official HP supported hardware; no unsupported third
> party equipment for either the controller or disks.
>
> Everything is configured as write-through; no deferred writes involved.

Write-through, write-back and deferred writes are not related to what I
was referencing in my reply. Even with full-on fully-synchronous $qiow or
$io_performw I/O calls followed by an explicit synchronize command -- a
command which VMS seldom uses, AFAIK -- a controller or a disk that
caches a synchronize-cache command could show the data-loss behavior
mentioned. If you're not synchronizing the I/O with a caching
controller or a caching disk -- even with a controller and disks that do
correctly implement the synchronization request -- then a power failure
or hard crash here could also lose data.

But I don't recall the queue manager I/O off-hand. Maybe queue manager
isn't coded to deal with a power outage...

Paul Sture

Apr 12, 2013, 10:50:36 AM

In article <kk95l8$lqt$1...@news.albasani.net>,

Jan-Erik Soderholm <jan-erik....@telia.com> wrote:

> And it wasn't as simple as an extra submit of the job
> from the startup scripts? Do you have any log files
> with the startup/console output?

Not likely because it had the same job entry number.

--
Paul Sture

Simon Clubley

Apr 12, 2013, 11:01:41 AM

On 2013-04-12, Jan-Erik Soderholm <jan-erik....@telia.com> wrote:
>
> And it wasn't as simple as an extra submit of the job
> from the startup scripts? Do you have any log files
> with the startup/console output?
>

Nice idea, but unfortunately I never submit routine scheduled jobs such as
this one from the system startup. The only place the submit command for
this job exists across the whole of this system is in the command procedure
itself.

The only batch jobs I submit at system startup are for those tasks designed
to be running 24/7.

Stephen Hoffman

Apr 12, 2013, 11:16:35 AM

On 2013-04-12 15:01:41 +0000, Simon Clubley said:

> Nice idea, but unfortunately I never submit routine scheduled jobs such as
> this one from the system startup. The only place the submit command for
> this job exists across the whole of this system is in the command procedure
> itself.
>
> The only batch jobs I submit at system startup are for those tasks designed
> to be running 24/7.

I remember having to write code to detect other copies of the job in
the queue, too.
Code to detect jobs started at the wrong time.
Etc.
Those were fun times.

Johnny Billquist

Apr 12, 2013, 11:32:39 AM

Since it had the same job number that would imply that indeed it was
rerunning the same job. Is the job set to restart?

There is a delicate race condition in batch in general, when a job itself
submits the next run, in that you can end up with two jobs if the job
manages to do the submit but doesn't manage to finish. This requires some
careful thinking and design to at least minimize.
I know I've had a similar issue under RSX with my backup jobs.
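
The race can be sketched with a toy model. The Python below is purely illustrative: the Queue class, the step names and the assumption that every queue change is recorded instantly and durably are all inventions for this example, and that last assumption is exactly what this thread is questioning. The model has a job submit tomorrow's run first and then do its work, and shows what is left in the queue when a crash lands at different steps, with and without an automatic rerun.

```python
from dataclasses import dataclass, field


@dataclass
class Queue:
    """Toy queue: pending run days plus a log of days whose work completed."""
    pending: list = field(default_factory=list)
    done: list = field(default_factory=list)


class Crash(Exception):
    """Stands in for a power failure at a chosen step."""


def run_job(queue, day, crash_after=""):
    """One run of a self-resubmitting job: submit the next run, then work."""
    queue.pending.append(day + 1)      # step "submit"
    if crash_after == "submit":
        raise Crash
    queue.done.append(day)             # step "work"
    if crash_after == "work":
        raise Crash
    queue.pending.remove(day)          # step "rundown": entry leaves the queue


def simulate(crash_after, restart):
    """Run day 1, crashing at the given step, then rerun or abort the entry."""
    queue = Queue(pending=[1])
    try:
        run_job(queue, 1, crash_after)
    except Crash:
        if restart:
            run_job(queue, 1)          # the same entry is executed again
        else:
            queue.pending.remove(1)    # entry aborted and dropped, not rerun
    return queue
```

In this model, simulate("submit", restart=True) ends with two pending copies of the next day's run, which is the duplicate-job outcome described above, while restart=False leaves a single copy but loses the interrupted day's work.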

However, the one really strange thing I see in all this is that the log
file seems to have confirmed that your first run did submit a job, but
that job seemingly disappeared into thin air.
That is the one thing I can't account for.

Johnny

Paul Sture

Apr 12, 2013, 11:47:40 AM

In article <kk948d$mju$1...@dont-email.me>,

Simon Clubley <clubley@remove_me.eisner.decus.org-Earth.UFP> wrote:

> BTW, it also occurred to me after my last batch of responses that if
> a window exists during job rundown when the queue manager thinks the
> job is still active even though the log file is complete, then the job
> should have been marked with the "system failed during execution" status
> you would normally get in that situation upon system restart.
>
> That makes me think nothing about the job actually starting was written
> to the queue manager database on disk even though a full logfile was
> written to those same disks. (The logfile was on a different disk, but
> that disk was attached to the same controller.)

That started me thinking about the batch job restart mechanism.

From

http://h71000.www7.hp.com/doc/731final/6489/6489pro_046.html#exch_58

"If the system fails while your batch job is executing, your job does
not complete. When the system recovers and the queue is restarted, your
job is aborted and the next job in the queue is executed."

Which makes your incident sound like a bug. The information that the
job was running didn't make it back to disk so that the job could be
aborted on the queue restart.

FWIW I did a fair bit of testing of the batch restart functionality back
in VAX days, without problems. However given the stuff I was running
then I could never justify the extra effort to use the functionality in
a production environment.

I do slightly wonder whether, if you had had restart checkpoints built
into your procedure, that might have forced some write back to the queue
manager database, but a check_queues job doesn't sound like a lengthy
procedure to me.

And with the benefit of hindsight, that testing was restricted to things
like doing a console halt then reboot rather than a complete power fail.

> I've now logged the issue with HP and they are currently looking at it.

--
Paul Sture

Paul Sture

Apr 12, 2013, 12:13:09 PM

In article <kk98bh$lrp$1...@dont-email.me>,

Ditto. The complexity of the DCL lexical didn't help either. I recall
finding some flavour that worked and copying it all around the place.

$sndjbc was so unfriendly to userland programmers that a colleague wrote
some wrappers for it.

--
Paul Sture

Paul Sture

Apr 12, 2013, 12:53:17 PM

In article <nospam-7156DB....@news.chingola.ch>,

Paul Sture <nos...@sture.ch> wrote:

> That started me thinking about the batch job restart mechanism.
>
> From
>
> http://h71000.www7.hp.com/doc/731final/6489/6489pro_046.html#exch_58
>
> "If the system fails while your batch job is executing, your job does
> not complete. When the system recovers and the queue is restarted, your
> job is aborted and the next job in the queue is executed."

Oops, partial copy and paste. Append to that:

"However, by specifying the /RESTART qualifier when you submit a batch
job, you indicate that the system should reexecute your job if the
system fails before the job is finished."

--
Paul Sture

Simon Clubley

Apr 12, 2013, 1:06:23 PM

On 2013-04-12, Johnny Billquist <b...@softjar.se> wrote:
> On 2013-04-12 16:50, Paul Sture wrote:
>> In article <kk95l8$lqt$1...@news.albasani.net>,
>> Jan-Erik Soderholm <jan-erik....@telia.com> wrote:
>>
>>> And it wasn't as simple as an extra submit of the job
>>> from the startup scripts? Do you have any log files
>>> with the startup/console output?
>>
>> Not likely because it had the same job entry number.
>
> Since it had the same job number that would imply that indeed it was
> rerunning the same job. Is the job set to restart?
>

No; if a random job fails my policy is for it to be examined manually
first before restart just to be safe.

The 24/7 jobs OTOH are started at system startup and are designed to be
restartable at any time.

> There is a delicate race condition in batch in general, when it itself
> submits the next run, in that you can end up with two jobs if the job
> manage to do the submit, but don't manage to finish. This requires some
> careful thinking and design to at least minimize.
> I know I've had a similar issue under RSX with my backup jobs.
>

On VMS, if the system fails during execution of a /NORESTART (the default)
batch job, then when the system restarts the job is marked as "system
failed during execution" and is not restarted.

> However, the one really strange thing I see in all this is that the log
> file seems to have confirmed that your first run did submit a job, but
> that job seemingly disappeared into thin air.
> That is the one thing I can't account for.
>

Neither can I, unless the queue manager does not write updates to disk
immediately.

Simon Clubley

Apr 12, 2013, 1:26:06 PM

On 2013-04-12, Paul Sture <nos...@sture.ch> wrote:
>
> From
>
> http://h71000.www7.hp.com/doc/731final/6489/6489pro_046.html#exch_58
>
> "If the system fails while your batch job is executing, your job does
> not complete. When the system recovers and the queue is restarted, your
> job is aborted and the next job in the queue is executed."
>
> Which makes your incident sound like a bug. The information that the
> job was running didn't make it back to disk so that the job could be
> aborted on the queue restart.
>

Combined with the submit from the first job being lost, the scenario of
updates not being written back to disk is the only viable explanation I
can think of at the moment.

>
> I am slightly wondering that if you had had restart checkpoints built
> into your procedure that might have forced some write back to the queue
> manager database, but a check_queues job doesn't sound like a lengthy
> procedure to me.
>

The elapsed time of the first run was 2.56 seconds according to the
log file.

_If_ the queue manager is caching on-disk updates for even 30-60 seconds,
that job could easily start, submit a new job, and complete without any
on-disk queue manager database updates before the power failure.

Jan-Erik Soderholm

Apr 12, 2013, 1:40:43 PM

Simon Clubley wrote 2013-04-12 17:01:
> On 2013-04-12, Jan-Erik Soderholm <jan-erik....@telia.com> wrote:
>>
>> And it wasn't as simple as an extra submit of the job
>> from the startup scripts? Do you have any log files
>> with the startup/console output ?
>>
>
> Nice idea, but unfortunately I never submit routine scheduled jobs such as
> this one from the system startup. The only place the submit command for
> this job exists across the whole of this system is in the command procedure
> itself.
>
> The only batch jobs I submit at system startup are for those tasks designed
> to be running 24/7.
>
> Simon.
>

Not really answering the original question; it's still
a mystery I guess. :-)

I'm generally against "self-submitting" COM files because it's
hard to get an overview of all jobs running. And if something
crashes (before the submit command), the job is "lost".

It is also messy for jobs that run multiple times during
each 24h period. And as I think Hoff said, there is the risk of
someone hitting "/release" on a holding job... :-)

I use a cron-like tool (called CRON :-) ) that I found
many years ago. This way all scheduled jobs are in one
configuration file and it's simple to get an overview
using one TYPE command.

It also takes care of such things as "not on Sundays",
"only on the 1st of each month" and so on.

The CRON server itself is a DCL-only COM file.

Example of configuration file:

$ type util:<cron>crontab.dat

# cron example
# command order:
# Field:   1      2     3      4      5
#          hour   day   month  D-O-W  command
#
# CRON matches against first four fields and then
# considers command
#
# First four fields (time) can be any of the following:
# 1) Single number --> must match exactly
# 2) number - number --> match inclusive range
# 3) number,number --> match any in the list
# 4) * --> wildcard, matches anything
#
#########################################################################
# Example:
# 1,15 12-3 1,2 2-4 SUBMIT ACCOUNTING.JOB
#########################################################################
#
# hour
# ! day
# ! ! month
# ! ! ! D-O-W
# ! ! ! ! command
# ! ! ! ! !
#
# Restart of CRON server (if it crashes)
# Runs each hour
* * * * submit/aft="+01:15:00" util:[cron]cron_startup
#
# Sync of system clock against eBay/Tradera
# Runs 01:00, 08:00 and 12:00 each day
1,8,12 * * * submit dev:[dir]get_trad_time_2
#
# Three jobs run at 01:00 each night
#
# Recalc of TRADSCHED date.
1 * * * submit dev:[dir]ts_recalc_date
#
# Resubmit of TRADSCHED items.
1 * * * submit dev:[dir]ts_check
#
# Run TRADSCHED report.
1 * * * submit dev:[dir]list_sche /param=("1234")
#
* * * * wait 00:00:05
# ! ! ! ! !
# ! ! ! ! command
# ! ! ! D-O-W
# ! ! month
# ! day
# hour
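
The four matching rules listed in the file's comments can be sketched in a few lines. This is only an illustration in Python, not the DCL of the CRON tool itself: the names `field_matches` and `line_matches` are made up for the example, and a range is assumed to run from a lower to a higher number (the comments do not say what a range such as 12-3 means).

```python
def field_matches(field: str, value: int) -> bool:
    """Return True if one crontab time field matches the current value."""
    if field == "*":                       # rule 4: wildcard
        return True
    if "," in field:                       # rule 3: list of numbers
        return value in {int(item) for item in field.split(",")}
    if "-" in field:                       # rule 2: inclusive range (low-high assumed)
        low, high = (int(part) for part in field.split("-"))
        return low <= value <= high
    return int(field) == value             # rule 1: exact match


def line_matches(line: str, hour: int, day: int, month: int, dow: int) -> bool:
    """Match the first four fields of a crontab line; the rest is the command."""
    fields = line.split(None, 4)[:4]
    return all(field_matches(f, v) for f, v in zip(fields, (hour, day, month, dow)))


entry = "1,8,12 * * * submit dev:[dir]get_trad_time_2"
print(line_matches(entry, 8, 13, 4, 6))    # True: hour 8 is in the list
print(line_matches(entry, 9, 13, 4, 6))    # False: hour 9 is not
```

Only when all four time fields match would the command part of the line be considered.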

Jan-Erik.

Stephen Hoffman

Apr 12, 2013, 2:37:44 PM

On 2013-04-12 16:13:09 +0000, Paul Sture said:

> $sndjbc was so programmer userland unfriendly that a colleague wrote
> some wrappers for it.

No shortage of those APIs around, either. Salient examples:
smg$create_menu... decc$to_vms()...
There's OpenSSL, with the wrapper that was recently posted...
My latest and newest and bestest API friend sys$acm[w]...
Or... well, all sorts of other calls within VMS and various LPs.

David Froble

Apr 12, 2013, 5:31:58 PM

Complex can also be flexible. Easy to use can also be rigid.

And it's possible for things to go to extremes in either direction.
That's why we end up writing wrappers for some things that provide some
ease of use for our particular usage of a tool.

I cannot let the reference to OpenSSL go without observing that it
wasn't complex and hard to use, it was downright unusable outside of C.

In general, I feel that the complexity of VMS system services is pretty
much in line with the complexity of the specific operation. For
example, assigning a channel is rather simple. But creating a process
is not, and without all the options, it probably wouldn't be of much use.

David Froble

Apr 12, 2013, 5:42:05 PM

And doesn't this just lead us back to the discussions years ago, before
VMS had the caching, about how fast other systems were, and how slow VMS
was because it insisted on writing everything to disk?

The rant Steve quoted (Jeff Bonwick) was one of the best things I've
read in a while. I think similar things could be written about other
things, Microsoft comes to mind ....

Then there was David Mathog who didn't care about writing to disk, he
just wanted super fast number crunching.

Not sure what any of this means ....

Phillip Helbig---undress to reply

Apr 13, 2013, 2:20:15 AM

In article <kk9guo$fr6$1...@news.albasani.net>, Jan-Erik Soderholm

<jan-erik....@telia.com> writes:

> I uses a cron-like tool (called CRON :-) ) that I found
> many years ago. This way all scheduled job are in one
> configuration file and it's simple to get an overview
> using one TYPE command.

I ended up writing something similar. A command starts all the
configured batch jobs. I usually have 12 such jobs which are either
running constantly or resubmitted at regular intervals. 2 of these
jobs periodically check if all the jobs are there and if not resubmit
the missing ones. (I need 2 since if the watchdog job dies, no other
job could resurrect it.)
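
The two-watchdog arrangement can be sketched as follows. This is a hypothetical Python illustration, not the poster's DCL: the job names and the helpers `missing_jobs` and `watchdog_pass` are invented, and the set of jobs currently in the queue is passed in rather than read from a real queue.

```python
# Hypothetical job names; the real list would come from a configuration file.
EXPECTED_JOBS = {"WATCHDOG_A", "WATCHDOG_B", "BACKUP", "CHECK_QUEUES"}


def missing_jobs(expected: set, present: set) -> list:
    """Return the expected jobs that are absent from the queue, sorted by name."""
    return sorted(expected - present)


def watchdog_pass(me: str, present: set) -> list:
    """One pass of a watchdog: list every missing job other than itself.

    A watchdog that is not running cannot resubmit anything, which is why a
    second watchdog is needed to notice when the first one has died.
    """
    return [job for job in missing_jobs(EXPECTED_JOBS, present) if job != me]


# WATCHDOG_A has died; WATCHDOG_B is still queued and notices the gap.
to_resubmit = watchdog_pass("WATCHDOG_B", {"WATCHDOG_B", "BACKUP", "CHECK_QUEUES"})
print(to_resubmit)    # ['WATCHDOG_A']
```

If both watchdogs disappear at the same time, nothing in this scheme brings them back; that case still needs the startup procedure or an operator.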

AEF

Apr 16, 2013, 9:43:01 PM

On Apr 11, 12:28 pm, Simon Clubley <clubley@remove_me.eisner.decus.org-Earth.UFP> wrote:

> On 2013-04-11, Jan-Erik Soderholm <jan-erik.soderh...@telia.com> wrote:
>
>
>
> > I guess accounting (acc/queue=ba0) also shows both jobs as runed?
>

> There's no accounting entry for the first job even though there is
> a full logfile, but since accounting log updates are buffered, then
> that's not really a major surprise.
>

> > Nothing weird with the start/finish timestamps in accounting?
>
> > I would look mare at something with the system clock at
> > startup that made the holding job to be released. Such
> > as a startup with wrong time setting or similar.
>

> All the timestamps on the log files and accounting are correct; the only
> time the system clock is corrected on this system is once a day during
> during the night from a NTP source and that job had already run a couple

> of hours previously. There was nothing strange in this NTP update job
> log file either.

>
> In addition, the queue entry number from the accounting record for the

> second job didn't match the entry number from the submit command in
> the first job.

What was that queue entry number?

So the first job submitted a job with entry number 3927.

The second job ran and submitted a new job with entry number 3923.

So you're saying that the second job did not have an entry number of
3927? You're saying that nothing ran with an entry number of 3927?

Can you please be more specific?

You wrote:

From the end of the first run:

$ submit/queue=ba0/after:"tomorrow+08:20" ownsrc:check_queues.com
Job CHECK_QUEUES (queue BA0, entry 3927) holding until 12-APR-2013 08:20
$ exit
[deleted] job terminated at 11-APR-2013 08:20:02.57

Accounting information:
Buffered I/O count: 204 Peak working set size: 5088
Direct I/O count: 229 Peak virtual size: 173040
Page faults: 1455 Mounted volumes: 0
Charged CPU time: 0 00:00:00.52 Elapsed time: 0 00:00:02.56

As the job had completed, the entry should have disappeared from the
queue database at that point. Instead after the system restart, the
same job ran again. Here is the end of the second run:

$ submit/queue=ba0/after:"tomorrow+08:20" ownsrc:check_queues.com
Job CHECK_QUEUES (queue BA0, entry 3923) holding until 12-APR-2013 0
$ exit
[deleted] job terminated at 11-APR-2013 08:25:11.69

Accounting information:
Buffered I/O count: 204 Peak working set size: 5680
Direct I/O count: 253 Peak virtual size: 173040
Page faults: 1387 Mounted volumes: 0
Charged CPU time: 0 00:00:00.67 Elapsed time: 0 00:00:05.25

End of job details.

>
> Thanks for the suggestions however, I do appreciate them.

>
> Simon.
>
> --
> Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
> Microsoft: Bringing you 1980s technology to a 21st century world

AEF

Apr 16, 2013, 9:45:44 PM

On Apr 11, 1:35 pm, Simon Clubley <clubley@remove_me.eisner.decus.org-Earth.UFP> wrote:
> On 2013-04-11, Jan-Erik Soderholm <jan-erik.soderh...@telia.com> wrote:

> > Simon Clubley wrote 2013-04-11 18:28:

> >> On 2013-04-11, Jan-Erik Soderholm <jan-erik.soderh...@telia.com> wrote:
>
> >>> I guess accounting (acc/queue=ba0) also shows both jobs as runed?
>
> >> There's no accounting entry for the first job even though there is
> >> a full logfile, but since accounting log updates are buffered, then
> >> that's not really a major surprise.
>
> >>> Nothing weird with the start/finish timestamps in accounting?
>
> >>> I would look mare at something with the system clock at
> >>> startup that made the holding job to be released. Such
> >>> as a startup with wrong time setting or similar.
>
> >> All the timestamps on the log files and accounting are correct; the only
> >> time the system clock is corrected on this system is once a day during
> >> during the night from a NTP source and that job had already run a couple
> >> of hours previously.
>

> > But there was a reboot in between, not ?
>
> Sequence:
>
> NTP job (06:30) -> check_queues (08:20) -> power failure ->
> check_queues run again (08:25).
>

> >> In addition, the queue entry number from the accounting record for the
> >> second job didn't match the entry number from the submit command in
> >> the first job.
>

> > Higher/later or lower/earlier?
>
> > Does it match the entry number from *any* previous job ?
> > I don't know ho long you keep logs, of course... :-)
>
> I also have the log file from yesterday's run (10-Apr-2013 08:20).
>
> The entry number from yesterday's submit command matches the entry
> number in the accounting log for the second run today.

Which of the two submit commands from "yesterday"?

>
> So the queue manager has indeed run the job submitted yesterday twice
> to completion today.
>

[...]

AEF

Apr 16, 2013, 9:54:29 PM

On Apr 11, 3:27 pm, Simon Clubley <clubley@remove_me.eisner.decus.org-Earth.UFP> wrote:
> On 2013-04-11, Paul Sture <nos...@sture.ch> wrote:

> > In article <kk6s8q$39...@dont-email.me>,

> > Simon Clubley <clubley@remove_me.eisner.decus.org-Earth.UFP> wrote:
>
> >> Even if there was some queue manager bug caused by a power failure
> >> during some unusual tight timing window of a few milliseconds [*],
> >> that still does not explain the disappearing job from the submit
> >> command in the first job.
>
> >> Simon.
>
> >> [*] A few milliseconds maximum, because don't forget I have a _full_
> >> logfile from the first run of the job today.
>
> > That sounds like a small window in the batch run down code. If you
> > think about it there must be a point between the batch job completing
> > and its return status being processed so that the queue manager either
> > deleting or retaining the job in the queue database.
>
> The problem with the window theory is that while it explains a repeating
> job, it doesn't explain why the submit from the first run went walkies.

"Walkies"? What's that?

> _If_ the queue manager is not delaying the on disk updates, then I wonder
> if I have come across more than one problem here.
>
> It will be interesting to see what HP have to say.
>
> Thanks everyone,
>
> Simon.
>
> --
> Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
> Microsoft: Bringing you 1980s technology to a 21st century world

Theory:

Space aliens. Please check your security settings and shields. (^_^)

I need to get your story down more accurately to do any better. I'm
really confused about which jobs ran with which entry numbers
according to which sources. Please try again.

At one place I wrote some DCL that would manage jobs. I don't recall
just exactly what it did, but I think that's what you need. I believe
others said that, too.

OK.

AEF

Paul Sture

Apr 17, 2013, 4:18:52 AM

In article <kk9u79$1ms$1...@dont-email.me>,

David Froble <da...@tsoft-inc.com> wrote:

> Stephen Hoffman wrote:
> > On 2013-04-12 16:13:09 +0000, Paul Sture said:
> >
> >> $sndjbc was so programmer userland unfriendly that a colleague wrote
> >> some wrappers for it.
> >
> > No shortage of those APIs around, either. Salient examples:
> > smg$create_menu... decc$to_vms()...
> > There's OpenSSL, with the wrapper that was recently posted...
> > My latest and newest and bestest API friend sys$acm[w]...
> > Or... well, all sorts of other calls within VMS and various LPs.
> >
> >
>
> Complex can also be flexible. Easy to use can also be rigid.
>
> And it's possible for things to go to extremes in either direction.
> That's why we end up writing wrappers for some things that provide some
> ease of use for our particular usage of a tool.

Some wrappers don't have sensible defaults for omitted parameters either.

I once worked with a set of RMS wrappers which defaulted to read with
lock. We had some battles about that one :-)

--
Paul Sture

VAXman-

Apr 17, 2013, 7:41:31 AM

In article <e5ee91c0-5b97-44b0...@e13g2000vbn.googlegroups.com>, AEF <spamsi...@yahoo.com> writes:
>On Apr 11, 3:27 pm, Simon Clubley <clubley@remove_me.eisner.decus.org-Earth.UFP> wrote:
>> On 2013-04-11, Paul Sture <nos...@sture.ch> wrote:
>
>> > In article <kk6s8q$39...@dont-email.me>,

>> > Simon Clubley <clubley@remove_me.eisner.decus.org-Earth.UFP> wrote:
>>
>> >> Even if there was some queue manager bug caused by a power failure
>> >> during some unusual tight timing window of a few milliseconds [*],
>> >> that still does not explain the disappearing job from the submit
>> >> command in the first job.
>>
>> >> Simon.
>>
>> >> [*] A few milliseconds maximum, because don't forget I have a _full_
>> >> logfile from the first run of the job today.
>>

>> > That sounds like a small window in the batch run down code. If you

>> > think about it there must be a point between the batch job completing
>> > and its return status being processed so that the queue manager either
>> > deleting or retaining the job in the queue database.
>>
>> The problem with the window theory is that while it explains a repeating
>> job, it doesn't explain why the submit from the first run went walkies.
>
>"Walkies"? What's that?

Walkies: lost or missing.

--
VAXman- A Bored Certified VMS Kernel Mode Hacker VAXman(at)TMESIS(dot)ORG

Well I speak to machines with the voice of humanity.

Simon Clubley

Apr 17, 2013, 3:35:53 PM

On 2013-04-16, AEF <spamsi...@yahoo.com> wrote:
>
> What was that queue entry number?
>

I ran through the details in the other messages, however it's now in
the hands of HP who are doing some testing so I will now wait and see
what they have to say. Thanks anyway however.

I will report back here with what they found when they come back to me.

Simon.

PS: Thanks to whoever in HP pointed the HP VMS support person to this
thread (assuming he didn't find it himself); it answered a number of his
questions.

AEF

Apr 17, 2013, 7:45:54 PM

On Apr 17, 3:35 pm, Simon Clubley <clubley@remove_me.eisner.decus.org-Earth.UFP> wrote:

> On 2013-04-16, AEF <spamsink2...@yahoo.com> wrote:
>
>
>
> > What was that queue entry number?
>
> I ran through the details in the other messages, however it's now in
> the hands of HP who are doing some testing so I will now wait and see
> what they have to say. Thanks anyway however.

I read the other messages and was still confused. In one case you
referred to "the job yesterday" when in fact there were two. It was
confusing and I am partly curious as to exactly what happened.

It seems to me that what happened was that your job ran and submitted
another job. Then came the power outage, then power came back, and then a
job ran that wasn't the submitted one. There were also some remarks about
accounting entries that were not clearly matched to specific jobs. I was
hoping for a single timeline of events.

>
> I will report back here with what they found when they come back to me.

OK. And I hope that report will clarify what the actual sequence of
events was and clearly show which entry numbers did or did not match
which jobs ran or did not run.

>
> Simon.
>
> PS: Thanks to whoever in HP pointed the HP VMS support person to this
> thread (assuming he didn't find it himself); it answered a number of his
> questions.
>
> --
> Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
> Microsoft: Bringing you 1980s technology to a 21st century world

AEF

Johnny Billquist

Apr 18, 2013, 9:36:05 AM

Simon can correct me if I understood things wrong, but this is my
understanding:

Job X ran.
Job X submitted job Y for running tomorrow.
System crash
Job X ran again.
Job X submitted job Z for running tomorrow.
Job X finished

Job Y disappeared. Job Z exists as expected. Job X got two log files
from the two runs. Job X would appear to have completed both times.

My guess is that job X was rerun because it didn't finish, but that is
perhaps not a good enough explanation for the rerun.
There is no obvious explanation at all for job Y not being around.

The "best" explanation might just be that the queue manager work file was
not updated at all, so job X would appear to not have run at all. Hence
the second run after the reboot, and also the loss of job Y.

Johnny

Paul Sture

Apr 18, 2013, 11:37:45 AM

In article <kkoss5$ije$1...@Iltempo.Update.UU.SE>,

Johnny Billquist <b...@softjar.se> wrote:

> Simon can correct me if I understood things wrong, but this is my
> understanding:
>
> Job X ran.
> Job X submitted job Y for running tomorrow.
> System crash
> Job X ran again.
> Job X submitted job Z for running tomorrow.
> Job X finished
>
> Job Y disappeared. Job Z exists as expected. Job X got two log files
> from the two runs. Job X would appear to have completed both times.
>
> My guess is that job X was rerun because it didn't finish, but that is
> perhaps not a good enough explanation for the rerun.
> There is no obvious explanation at all for job Y not being around.
>
> The "best" explanation might just be that the quemanager work file was
> not updated at all, so job X would appear to not have run at all. Thus
> the run after the reboot again, and also the loss of job Y.

As I understand it, the queue manager file wasn't even updated to
reflect that job X had started, let alone finished. According to the
document I quoted up thread, running jobs should be aborted when the
machine comes back.

--
Paul Sture

Simon Clubley

Apr 18, 2013, 1:20:16 PM

On 2013-04-18, Paul Sture <nos...@sture.ch> wrote:
> In article <kkoss5$ije$1...@Iltempo.Update.UU.SE>,
> Johnny Billquist <b...@softjar.se> wrote:
>
>> Simon can correct me if I understood things wrong, but this is my
>> understanding:
>>
>> Job X ran.
>> Job X submitted job Y for running tomorrow.
>> System crash
>> Job X ran again.
>> Job X submitted job Z for running tomorrow.
>> Job X finished
>>
>> Job Y disappeared. Job Z exists as expected. Job X got two log files
>> from the two runs. Job X would appear to have completed both times.
>>

Yes, this is correct. It was a power failure instead of a system crash
but that does not make any difference to the sequence of events.

>> My guess is that job X was rerun because it didn't finish, but that is
>> perhaps not a good enough explanation for the rerun.
>> There is no obvious explanation at all for job Y not being around.
>>
>> The "best" explanation might just be that the quemanager work file was
>> not updated at all, so job X would appear to not have run at all. Thus
>> the run after the reboot again, and also the loss of job Y.
>

This is still what I think happened as well.

I still think this was queue manager specific. If something at VMS level
or at hardware level was wrongly deferring writes to disk (everything is
write-through), then the log file should not have been written away either.

> As I understand it, the queue manager file wasn't even updated to
> reflect that job X had started, let alone finished. According to the
> document I quoted up thread, running jobs should be aborted when the
> machine comes back
>

Yes, I would also have expected to see the job retained with the usual
"system failed during execution" message if the job starting had been
written away to the disk based queue manager database.

BTW, the HP support person handling this still has not had a response
from the HP people upstream yet so either they don't understand the
problem or they have duplicated it and are wondering what to do about
it...

Simon.

AEF

Apr 19, 2013, 10:35:09 AM

On Apr 18, 11:37 am, Paul Sture <nos...@sture.ch> wrote:
> In article <kkoss5$ij...@Iltempo.Update.UU.SE>,

Thanks!

Just one more:

X - entry number not in accounting
Y - entry number 3927
Z - entry number 3923

Is this right? If so -- fascinating.

AEF

Johnny Billquist

Apr 19, 2013, 10:40:13 AM

The entry number for X has not been given in any of the posts, as far as I
know. But I'm sure it could be provided, if we really think we need it.
(Can't say I think it makes any difference if I manage to put an
explicit value to X.) But I fail to see any connection to any
accounting. These are queue job numbers.

> Y - entry number 3927
> Z - entry number 3923
>
> Is this right? If so -- fascinating.

Yes...

Johnny

AEF

Apr 19, 2013, 11:01:53 AM

Entry numbers are recorded in ACCOUNTING records.

Also, regards job X, was _that_ recorded in ACCOUNTNG.DAT?

AEF

Jan-Erik Soderholm

Apr 19, 2013, 11:03:59 AM

Johnny Billquist wrote 2013-04-19 16:40:

> But I fail to see any connection to any accounting.
> These are queue job numbers.

Queue entry/job numbers are recorded in accounting :

$ acc/queue=sys$batch/fu

BATCH Process Termination
-------------------------
...
...
Queue entry: 1973
Queue name: SYS$BATCH
Job name: xxxxx

Jan-Erik.

Johnny Billquist

Apr 19, 2013, 12:13:20 PM

My mistake. I was thinking of accounts.

> Also, regards job X, was _that_ recorded in ACCOUNTNG.DAT?

Now, that is another good question.

Johnny

Simon Clubley

Apr 19, 2013, 12:57:13 PM

On 2013-04-19, Johnny Billquist <b...@softjar.se> wrote:

> On 2013-04-19 17:01, AEF wrote:
>>
>> Entry numbers are recorded in ACCOUNTING records.
>
> My mistake. I was thinking of accounts.
>
>> Also, regards job X, was _that_ recorded in ACCOUNTNG.DAT?
>
> Now, that is another good question.
>

I already thought of that one, but thanks anyway for the question.

There was no accounting entry from the first job, but that's not really
a surprise because accounting entries are either buffered before being
written to disk or there is a delay in updating the EOF marker.

You can test this for yourself.

In one session, log into a VMS box and have "$ acc/since={10 minutes ago}"
ready to run.

Now log in (and out again immediately) to a VMS box from another session
then run the $ acc command above immediately. You will see there is a
delay before the accounting entry from that session becomes visible.
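
The "buffered before being written" possibility is easy to reproduce in miniature. The sketch below is Python on an ordinary file, not VMS accounting: the file name is hypothetical, and it illustrates only the buffering explanation, not a delayed EOF update. It makes no claim about how the accounting writer is actually implemented.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "accounting_demo.dat")  # hypothetical file

writer = open(path, "w", buffering=8192)    # records collect in a memory buffer
writer.write("LOGIN record for session 2\n")

with open(path) as reader:                  # a second "session" looks at the file
    before = reader.read()

writer.flush()                              # the buffer is finally written out

with open(path) as reader:
    after = reader.read()
writer.close()

print(repr(before))    # '' : the record exists only in the writer's buffer
print(repr(after))     # 'LOGIN record for session 2\n'
```

Had the writing process died between the `write` and the `flush`, the record would never have reached the file at all.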

This is on VMS Alpha v8.3, in case this behaviour is different in other
VMS versions.

BTW, still no reply from HP yet...

Simon.

Simon Clubley

Apr 19, 2013, 2:52:26 PM

On 2013-04-19, Simon Clubley <clubley@remove_me.eisner.decus.org-Earth.UFP> wrote:
>
> I already thought of that one, but thanks anyway for the question.
>
> There was no accounting entry from the first job, but that's not really
> a surprise because accounting entries are either buffered before been
> written to disk or there is a delay in updating the EOF marker.
>
> You can test this for yourself.
>

That came across as a bit abrupt, sorry. (I was heading out the door.) :-)

What I meant to say is that you can test it for yourself in case you
find it hard to believe that I've come across yet another example of
VMS deferring writes to disk. :-)

I've known about accounting logs being deferred for a number of years,
but it's the only part of VMS I _thought_ used deferred updates.

Of course, it's one thing to defer writing away historical data and it's
a completely different situation to defer writing away updates to an active
queue manager database (if that is what is going on).

> In one session, log into a VMS box and have "$ acc/since={10 minutes ago}"
> ready to run.
>
> Now log in (and out again immediately) to a VMS box from another session
> then run the $ acc command above immediately. You will see there is a
> delay before the accounting entry from that session becomes visible.
>

The box I tested it on is a standalone box, not a cluster, in case it
makes any difference. The longest delay I managed to get before the
entry became visible was about 1 minute.

> This is on VMS Alpha v8.3, in case this behaviour is different in other
> VMS versions.
>

Paul Sture

Apr 19, 2013, 3:40:35 PM

Another thought occurred to me today.

Was volume shadowing in operation on the disk that holds the queue
manager database?

I am thinking along the lines of the "wrong" disk coming back as the
shadow master here.

What caused the power loss?

--
Paul Sture

Simon Clubley

Apr 19, 2013, 8:53:43 PM

On 2013-04-19, Paul Sture <nos...@sture.ch> wrote:
> Another thought occurred to me today.
>
> Was volume shadowing in operation on the disk that holds the queue
> manager database?
>

No. Multiple hardware RAID 1 sets all connected to the same controller,
no writeback caching in operation (everything is write-through), no disk
failures, no filesystem corruption (i.e. from disks being out of sync).

That last item is critical. If the disks were out of sync you would
see filesystem corruption being reported by analyze/disk because the
disk for read I/O in a RAID 1 set is chosen effectively at random.

No application level detected corruptions either.

> I am thinking along the lines of the "wrong" disk coming back as the
> shadow master here.
>
> What caused the power loss?
>

Something external to the building containing the machine.

Peter 'EPLAN' LANGSTOeGER

Apr 22, 2013, 2:31:15 PM

In article <kkoss5$ije$1...@Iltempo.Update.UU.SE>, Johnny Billquist <b...@softjar.se> writes:
>Simon can correct me if I understood things wrong, but this is my
>understanding:
>
>Job X ran.
>Job X submitted job Y for running tomorrow.
>System crash
>Job X ran again.
>Job X submitted job Z for running tomorrow.
>Job X finished
>
>Job Y disappeared. Job Z exists as expected. Job X got two log files
>from the two runs. Job X would appear to have completed both times.

I tend to read this as an incomplete coding of job X.
One has to use a SET RESTART_VALUE checkpoint to prevent submitting
tomorrow's job in case of a second run of today's job...
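
The checkpoint idea can be sketched outside DCL. The Python below is a hedged illustration only: it uses a marker file where a command procedure would use SET RESTART_VALUE, the names `run_job` and `check_queues.checkpoint` are invented, and it assumes the marker itself reaches disk before any failure.

```python
import os
import tempfile


def run_job(checkpoint_path: str, submit) -> bool:
    """Do today's work and submit tomorrow's job at most once.

    Returns True if this run performed the submit, False if a checkpoint
    left by an earlier, interrupted run shows it was already done.
    """
    if os.path.exists(checkpoint_path):
        return False                        # second run: skip the resubmit
    submit()                                # submit tomorrow's job
    with open(checkpoint_path, "w") as marker:
        marker.write("SUBMITTED\n")
        marker.flush()
        os.fsync(marker.fileno())           # ask the OS to push the marker to disk
    return True


submitted = []
checkpoint = os.path.join(tempfile.mkdtemp(), "check_queues.checkpoint")
first = run_job(checkpoint, lambda: submitted.append("tomorrow"))
second = run_job(checkpoint, lambda: submitted.append("tomorrow"))
print(first, second, submitted)    # True False ['tomorrow']
```

A failure between the submit and the write of the marker still leads to a double submit, so the guard narrows the window rather than closing it. A real procedure would also have to clear or date-stamp the marker before the next day's run.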

just my 0.02

--
Peter "EPLAN" LANGSTÖGER
Network and OpenVMS system specialist
E-mail Pe...@LANGSTOeGER.at
A-1030 VIENNA AUSTRIA I'm not a pessimist, I'm a realist

Simon Clubley

Sep 6, 2013, 1:15:59 PM

On 2013-04-11, Simon Clubley <clubley@remove_me.eisner.decus.org-Earth.UFP> wrote:
> When a batch job completes, does the queue manager _immediately_ write
> that information to it's data structures on disk or is that information
> cached in memory for a short time first ?
>
> I had a problem this morning which I have not seen before and it looks
> like VMS is not _immediately_ writing queue updates away to disk, which
> to put it mildly is bl**dy dangerous if true.
>
> A routine batch job ran at 08:20 and completed; I have the log file
> from this first run on disk.
>
> The power failed about a minute or so later. (This box is not UPS
> protected.)
>
> When the system restarted, the same job ran again. I also have the log
> file from this second run on disk.
>

[snip]

Anyone remember this from a few months ago ?

I _finally_ got some answers today from VMS Engineering. My thanks to
Mandar for getting someone in HP who actually understood what we were
trying to tell them to look at this and to come back with a well
reasoned analysis which was the kind of thing Nashua used to produce.

It turns out my initial suspicions were correct; VMS sacrifices robust
behaviour in favour of increased performance in the queue manager design.

VMS maintains an in-memory copy of the permanent on-disk queue manager
database and routine changes to that database (such as the above timed
release job starting) are immediately made _only_ to the in-memory copy
of the database.

Those in-memory updates are only committed to the permanent on-disk queue
manager database when a timer expires every couple of minutes. There is no
way to change the timer interval or to force the immediate updating of the
on-disk queue manager database.

This means that when there's a power failure shortly after a timed release
job starts, then the job will start for a second time when power is restored
instead of being marked as retained on error with a "system failed
during execution" status.
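
The effect of that design can be modelled with a toy write-back cache. The Python below is a sketch under stated assumptions, not the queue manager's actual internals: the class and method names are invented, the flush copies the whole database at once, and the job names X and Y follow Johnny Billquist's summary earlier in the thread.

```python
class QueueManager:
    """Toy model: an in-memory queue database committed to disk by a timer."""

    def __init__(self, on_disk: dict):
        self.disk = dict(on_disk)       # survives a power failure
        self.memory = dict(on_disk)     # lost on a power failure

    def start(self, job: str) -> None:
        self.memory[job] = "executing"

    def submit(self, job: str) -> None:
        self.memory[job] = "holding"

    def complete(self, job: str) -> None:
        del self.memory[job]

    def timer_flush(self) -> None:
        self.disk = dict(self.memory)   # periodic commit of in-memory changes

    def power_fail_and_restart(self) -> None:
        self.memory = dict(self.disk)   # only the on-disk copy remains


qm = QueueManager({"X": "holding"})     # yesterday's submit, already on disk
qm.start("X")
qm.submit("Y")                          # X resubmits itself for tomorrow
qm.complete("X")
qm.power_fail_and_restart()             # power fails before the timer expires
print(qm.memory)                        # {'X': 'holding'}
```

After the restart X is still holding and due, so it runs a second time, and Y has vanished. Had `timer_flush()` happened to run between `start` and the power failure, X would have been found in the executing state instead.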

I have expressed the opinion that this behaviour is _very_ dangerous and
is totally inconsistent with the default robust behaviour that VMS
traditionally prides itself on. I have asked for changes to be made so
that a site specific parameter can be set to either change the flushing
timeout interval or to force immediate commits by disabling it entirely.

I have asked for this as I consider it to be a design flaw/bug that this
functionality was not designed in and set to a safe setting by default.
(Of course, for compatibility reasons, the current behaviour will have
to remain as the default behaviour now.)

What do you (comp.os.vms) think ?

Simon.

PS: Now that I've documented this behaviour, I hope this helps someone else
in the future so they don't find out about this little feature the hard
way. :-)

Simon Clubley

Sep 6, 2013, 1:45:03 PM

On 2013-09-06, Simon Clubley <clubley@remove_me.eisner.decus.org-Earth.UFP> wrote:
>
> This means that when there's a power failure shortly after a timed release
> job starts, then the job will start for a second time when power is restored
> instead of being marked as retained on error with a "system failed
> during execution" status.
>

I should also point out this means that for quick jobs which start and
complete in between on-disk updates, there's no record at all within the
queue manager database that the job ever started and completed. This was
the case for me.
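
As a rough worked example of how likely that is, the sketch below assumes
a free-running flush timer and a job start time that is uniformly random
relative to the timer's ticks. The function name is hypothetical. The
2.56 second figure is the elapsed time from the first run's log, and the
120 second interval is the two-minute value reported elsewhere in this
thread.

```python
def chance_no_flush_during_job(job_seconds, flush_interval_seconds):
    """Probability that no flush tick lands while the job is running.

    Assumes a free-running timer and a start time that is uniformly
    random within the flush interval.
    """
    if job_seconds >= flush_interval_seconds:
        return 0.0
    return (flush_interval_seconds - job_seconds) / flush_interval_seconds


p = chance_no_flush_during_job(2.56, 120.0)
```

Under those assumptions p comes out at roughly 0.98, so nearly every run
of a job this short would start and finish between two on-disk updates.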

Simon.

Stephen Hoffman

Sep 6, 2013, 2:44:28 PM

On 2013-09-06 17:15:59 +0000, Simon Clubley said:

> What do you (comp.os.vms) think ?

I don't think there'll ever be a generic and uniformly-accepted
short-enough cache-flush timer setting here. Somebody will always want
caching disabled.

In general, I'd rather have access to an integrated, transactional and
(for this particular queue manager case) distributed database within
VMS, and for applications I've been writing for use on VMS. And a more
capable integrated job scheduler. But I've mentioned these desires
before. <http://labs.hoffmanlabs.com/node/50>
<http://labs.hoffmanlabs.com/node/97>
<http://labs.hoffmanlabs.com/node/872>

This rather than being left to roll my own data storage implementation
— whether for OS-level cases such as the queue manager, or PCSI, or any
other operating system or end-user applications that are storing and
retrieving structured data — and then dealing with the ensuing work
that rolling my own database solution (via RMS) seemingly inevitably
encounters. Whether it's implementing journaling, distributed updates,
database recovery, data migrations to newer formats, or simply
performing coordinated online backups of the data, there's seemingly
always some work left undone here. By default, RMS just doesn't
provide enough. Not by (my) present standards. Not without help,
that is; via the use of DECdtm distributed transaction manager and RMS
journaling, or via a transactional database package, for sure.

This addition of and migration to a transactional database likely won't
happen within VMS as it presently exists in the market, and it'll be up
to HP whether this particular queue manager behavior will be considered
for remediation, or whether this case will be considered to be
functioning as intended.

More reliable power is the usual brute-force solution to cases such as
this queue manager issue; to reduce the exposure to data loss from
"unbacked" data caches. For smaller AlphaServer, AlphaStation and
Integrity systems, various capable less-interruptible power supplies
(LIPS) are readily available, and at least one of the usual denizens
here has available add-on software that can monitor certain vendors and
models of LIPS devices connected to VMS systems.

A more complex and brute-force solution would involve custom-developed
server symbionts within the current queue manager environment, or a
fully-custom queue manager replacement. Don't turn the stuff over to
the VMS queue manager until the server symbiont or the scheduler has
logged it (an approach which has its issues), or don't use the VMS
queue manager at all. Whether that's a product such as CA's package
(formerly DECscheduler) or such, or a port of a grid client. (I don't
know of any recent grid computing clients for VMS, but that'd be
another obvious approach
<http://en.wikipedia.org/wiki/Grid_computing>.) Rolling your own job
scheduler, or grid client, or porting same. Preferably with a
transactional database underneath, but that much should be obvious.

--
Pure Personal Opinion | HoffmanLabs LLC

David Froble

Sep 6, 2013, 4:34:29 PM

Simon Clubley wrote:
> On 2013-04-11, Simon Clubley <clubley@remove_me.eisner.decus.org-Earth.UFP> wrote:
>> When a batch job completes, does the queue manager _immediately_ write
>> that information to it's data structures on disk or is that information
>> cached in memory for a short time first ?
>>
>> I had a problem this morning which I have not seen before and it looks
>> like VMS is not _immediately_ writing queue updates away to disk, which
>> to put it mildly is bl**dy dangerous if true.
>>
>> A routine batch job ran at 08:20 and completed; I have the log file
>> from this first run on disk.
>>
>> The power failed about a minute or so later. (This box is not UPS
>> protected.)
>>
>> When the system restarted, the same job ran again. I also have the log
>> file from this second run on disk.
>>
>
> [snip]
>
> Anyone remember this from a few months ago ?
>
> I _finally_ got some answers today from VMS Engineering. My thanks to
> Mandar for getting someone in HP who actually understood what we were
> trying to tell them to look at this and to come back with a well
> reasoned analysis which was the kind of thing Nashua used to produce.

You mean such as the leap year explanation ???

> It turns out my initial suspicions were correct; VMS sacrifices robust
> behaviour in favour of increased performance in the queue manager design.
>
> VMS maintains a in-memory copy of the permanent on disk queue manager
> database and routine changes to that database (such as the above timed
> release job starting) are immediately made _only_ to the in-memory copy
> of the database.
>
> Those in-memory updates are only committed to the permanent on-disk queue
> manager database when a timer expires every couple of minutes. There is no
> way to change the timer interval or to force the immediate updating of the
> on-disk queue manager database.
>
> This means that when there's a power failure shortly after a timed release
> job starts, then the job will start for a second time when power is restored
> instead of being marked as retained on error with a "system failed
> during execution" status.

You posed such a simple question, and the fact that it wasn't answered
in a like manner is depressing. What happens when you pose a tough
question?

> I have expressed the opinion that this behaviour is _very_ dangerous and
> is totally inconsistent with the default robust behaviour that VMS
> traditionally prides itself on. I have asked for changes to be made so
> that a site specific parameter can be set to either change the flushing
> timeout interval or to force immediate commits by disabling it entirely.
>
> I have asked for this as I consider it to be a design flaw/bug that this
> functionality was not designed in and set to a safe setting by default.
> (Of course, for compatibility reasons, the current behaviour will have
> to remain as the default behaviour now.)
>
> What do you (comp.os.vms) think ?

I think there should be options to define the timeout period, and to
cause immediate updates.

I also think you should have some battery backup on your system(s).

Simon Clubley

Sep 6, 2013, 5:26:01 PM

On 2013-09-06, David Froble <da...@tsoft-inc.com> wrote:

> Simon Clubley wrote:
>>
>> Anyone remember this from a few months ago ?
>>
>> I _finally_ got some answers today from VMS Engineering. My thanks to
>> Mandar for getting someone in HP who actually understood what we were
>> trying to tell them to look at this and to come back with a well
>> reasoned analysis which was the kind of thing Nashua used to produce.
>
> You mean such as the leap year explanation ???
>

Actually I was thinking of the kinds of things I used to throw at them.
For some reason, I always seem to do things slightly differently and as
a result seem to find things others don't. :-)

>> It turns out my initial suspicions were correct; VMS sacrifices robust
>> behaviour in favour of increased performance in the queue manager design.
>>
>> VMS maintains a in-memory copy of the permanent on disk queue manager
>> database and routine changes to that database (such as the above timed
>> release job starting) are immediately made _only_ to the in-memory copy
>> of the database.
>>
>> Those in-memory updates are only committed to the permanent on-disk queue
>> manager database when a timer expires every couple of minutes. There is no
>> way to change the timer interval or to force the immediate updating of the
>> on-disk queue manager database.
>>
>> This means that when there's a power failure shortly after a timed release
>> job starts, then the job will start for a second time when power is restored
>> instead of being marked as retained on error with a "system failed
>> during execution" status.
>
> You posed such a simple question, and the fact that it wasn't answered
> in a like manner is depressing. What happens when you pose a tough
> question?
>

Congratulations David, you got it in one. (And yes, it's very depressing.)

This is _exactly_ what I have been saying to various people and it is
exactly why I forced this issue _now_. Given these doubts about what
quality of support you could rely on, would you rather know for sure
when everything is working ok or when you have systems down and
priority 1 calls logged ?

There were times during all this when the answers were so inane and out
of touch with the questions and comments, that I actually and seriously
wondered if they were taking the mickey.

I don't know if you remember the people who showed up in comp.os.vms a
few months ago wanting to do VMS system management from their IBM box
and they just would not take advice and kept repeating the same thing
without any real connection to what the comp.os.vms regulars were saying
while trying to help them.

At times, this interaction with VMS Engineering felt _exactly_ like that.

I now know there are still people you can get through to if you shout loud
enough, but that type of service should come as standard just as it did
in the Nashua days.

>> I have expressed the opinion that this behaviour is _very_ dangerous and
>> is totally inconsistent with the default robust behaviour that VMS
>> traditionally prides itself on. I have asked for changes to be made so
>> that a site specific parameter can be set to either change the flushing
>> timeout interval or to force immediate commits by disabling it entirely.
>>
>> I have asked for this as I consider it to be a design flaw/bug that this
>> functionality was not designed in and set to a safe setting by default.
>> (Of course, for compatibility reasons, the current behaviour will have
>> to remain as the default behaviour now.)
>>
>> What do you (comp.os.vms) think ?
>
> I think there should be options to define the timeout period, and to
> cause immediate updates.
>

I agree. If the queue manager were being designed from scratch today, the
default behaviour should be to bypass the in-memory cache and write
directly to disk. Unfortunately, that cannot happen with the situation
today because you need to maintain backwards compatibility and cannot
risk degrading sites with massive batch and print queues.

> I also think you should have some battery backup on your system(s).

The battery backup is an interesting comment. The site in question has
been running DEC systems for a _long_ time (decades) and the DEC systems
they have used have traditionally been extremely robust in power failures.
The Linux systems (both servers and desktops) are configured in ways
that have also proven to be robust against power failures.

There's also the fact that the local power grid is usually extremely
stable and otherwise reliable. That, combined with various other
factors (such as all the other equipment that would need UPS backup
if you were to go down that route), meant that a decision was taken
not to use one.

In different circumstances, I would be the first pushing for a UPS
setup, but I agree with the decision that was taken in this _specific_
case.

Simon.

JF Mezei

Sep 6, 2013, 5:42:07 PM

On 13-09-06 13:15, Simon Clubley wrote:

> Those in-memory updates are only committed to the permanent on-disk queue
> manager database when a timer expires every couple of minutes. There is no
> way to change the timer interval or to force the immediate updating of the
> on-disk queue manager database.

At the very least, this should be well documented. Documenting this
might allow some sites to palliate the situation by using control files to
indicate that a job has begun, deleting said file after successful
completion. So when a job starts, it can check whether the previous run
completed or not.
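
A minimal Python sketch of that control-file idea follows. The names
run_once, state_dir and run_id are hypothetical, and the ".done" stamp is
an addition beyond what is described above: a marker that is deleted on
success only reveals an interrupted run, whereas the case in this thread
(the job finished and then ran again) also needs a durable record that the
run completed.

```python
import os
import tempfile


def _touch_durably(path):
    # Create an empty file and ask the OS to push it to stable storage.
    # (On POSIX the containing directory may need its own fsync as well;
    # that is left out to keep the sketch short.)
    fd = os.open(path, os.O_WRONLY | os.O_CREAT)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def run_once(state_dir, run_id, job):
    """Run job() at most once per run_id, using marker files as the record."""
    started = os.path.join(state_dir, run_id + ".started")
    done = os.path.join(state_dir, run_id + ".done")
    if os.path.exists(done):
        return "already-completed"
    if os.path.exists(started):
        # A previous run began but never finished: leave it for a human.
        return "previous-run-incomplete"
    _touch_durably(started)
    job()
    _touch_durably(done)
    os.remove(started)
    return "ran"


# Usage: the second call stands in for the re-run after a power failure.
with tempfile.TemporaryDirectory() as state_dir:
    runs = []
    first = run_once(state_dir, "2013-04-11", lambda: runs.append("run"))
    second = run_once(state_dir, "2013-04-11", lambda: runs.append("run"))
```

The run_id here is assumed to be the scheduled date, so that a resubmitted
job for the next day gets a fresh pair of marker files.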

This problem also means that if I delete/entry the job that was to
transfer $10,000 to Mr VAXman, and the power fails right after the
command, that job will still run when the system restarts despite having
been explicitly deleted.

In other words: don't trust the queue manager.

As to fixing this, how far back does this behaviour go ? How many
versions of VMS would need patches built ? Or would HP only fix IA64 8.4 ?

I don't know enough about the queue database architecture to know what
it means to cache the database and update it every few minutes to know
what the best approach would be to solve this.

Does a cache flush involve blindly rewriting a whole lot of disk blocks
even when they contain unchanged job entries ? Or just selectively
updating individual records that have been updated since the last flush ?

Ideally, a customer should be able to request immediate flushing to
disk. I don't think being able to change the delay in cache flushes
would help much. It should be an on or off thing.

And with no caching, hopefully, they would only update the modified
record instead of rewriting the whole database to disk every time a job
starts/ends or a queue is modified with new or deleted jobs.

Simon Clubley

Sep 6, 2013, 6:12:47 PM

On 2013-09-06, JF Mezei <jfmezei...@vaxination.ca> wrote:
> On 13-09-06 13:15, Simon Clubley wrote:
>
>> Those in-memory updates are only committed to the permanent on-disk queue
>> manager database when a timer expires every couple of minutes. There is no
>> way to change the timer interval or to force the immediate updating of the
>> on-disk queue manager database.
>
> At the very least, this should be well documented. Documenting this
> might allow some site to paliate the situation by using control files to
> indicate a job has begun and deleting said file after succesful
> completion. So when a job starts, it can check if previous run was
> completed or not.
>

Strongly agree. I don't know if there's a throwaway line in the
documentation somewhere, but if there is I don't remember seeing it.

This is actually worse than what you have with Unix (which has traditionally
used deferred write caching by default). The old days of "always run
your Unix systems on a UPS" have long vanished.

These days, there is a wide range of Unix filesystem options, including
the sysadmin being able to turn off deferred write caching on a
per-filesystem basis at mount time. I actually think quite a bit about what
are the best mount options to use with various filesystems.

The VMS queue manager is supposed to be a critical transaction based
database but it turns out it does not have any of these options.

> This problem also means that if I delete/entry the job that was to
> transfer $10,000 to Mr VAXman, and the power fails right after the
> command, that job will stll run when the system restarts despite having
> been explicitely deleted.
>
> In other words: don't trust the queue manager.
>
> As to fixing this, how far back does this behaviour go ? How many
> versions of VMS would need patches built ? Or would HP only fix IA64 8.4 ?
>

The current mainstream supported versions are V8.3 and V8.4.

>
>
> I don't know enough about the queue database architecture to know what
> it means to cache the database and update it every few minutes to know
> what the best approach would be to solve this.
>
> Does a cache flush involve blindly rewriting a whole lot of disk blocks
> even when they contain unchanged job entries ? Or just selectively
> updating individual records that have been updated since the last flush ?
>
> Ideally, a customer should be able to request immediate flushing to
> disk. I don't think being able to change the delay in cache flushes
> would help much. It should be an on or off thing.
>

One of the things which was pointed out to me today is that some sites
are running massive print and batch queues and they will need a finer
control over this, including retaining the current behaviour by default
so the performance of the queues is not impacted as a result of applying
a patch.

Simon.

Phillip Helbig---undress to reply

Sep 6, 2013, 6:33:46 PM

In article <l0d2ke$ia$1...@dont-email.me>, Simon Clubley

<clubley@remove_me.eisner.decus.org-Earth.UFP> writes:

> I have asked for this as I consider it to be a design flaw/bug that this
> functionality was not designed in and set to a safe setting by default.
> (Of course, for compatibility reasons, the current behaviour will have
> to remain as the default behaviour now.)

It should remain the same only if someone is relying on it. I doubt
anyone is, and surely performance is not an issue here anymore.

Simon Clubley

Sep 6, 2013, 6:47:52 PM

I wouldn't be too sure about that. As mentioned in another response,
HP mentioned to me today that some people are running massive print
and batch queues and besides, one of the things about VMS is that it
has always focused on not negatively impacting performance due to a
sudden change in defaults, especially as a result of merely applying
a patch.

The problem here is that the queue manager should have started out as
a default-safe configuration (in keeping with the rest of the VMS design
mentality) with deferred write caching disabled and the system manager
could have then made an informed decision to relax that configuration
in order to achieve a required site specific performance level if
massive print and batch queues were required.

As it turns out, at the moment you cannot even turn off the deferred
write caching even if you wanted to (and I do :-)).

Simon.

Keith Parris

Sep 6, 2013, 6:48:42 PM

On 9/6/2013 11:15 AM, Simon Clubley wrote:
> VMS maintains a in-memory copy of the permanent on disk queue manager
> database and routine changes to that database (such as the above timed
> release job starting) are immediately made _only_ to the in-memory copy
> of the database.
>
> Those in-memory updates are only committed to the permanent on-disk queue
> manager database when a timer expires every couple of minutes.

Another factor to ponder is that the capabilities of hardware platforms
have actually degraded in some dimensions since the days when decisions
about the VMS queue manager design might have been made.

In 1985, I had a VAX 11/785 with battery-backed memory. Upon a power
failure, it was perfectly capable at the time of restarting where it
left off, without losing any of those in-memory queue manager transactions.

That battery-backed memory sure helped the day my 4-year old daughter
found she was just tall enough now to reach the emergency power-off
button located just inside the datacenter doorway.

Simon Clubley

Sep 6, 2013, 6:58:02 PM

On 2013-09-06, Keith Parris <keithparris...@yahoo.com> wrote:
> On 9/6/2013 11:15 AM, Simon Clubley wrote:
>> VMS maintains a in-memory copy of the permanent on disk queue manager
>> database and routine changes to that database (such as the above timed
>> release job starting) are immediately made _only_ to the in-memory copy
>> of the database.
>>
>> Those in-memory updates are only committed to the permanent on-disk queue
>> manager database when a timer expires every couple of minutes.
>
> Another factor to ponder is that the capabilities of hardware platforms
> have actually degraded in some dimensions since the days when decisions
> about the VMS queue manager design might have been made.
>
> In 1985, I had a VAX 11/785 with battery-backed memory. Upon a power
> failure, it was perfectly capable at the time of restarting where it
> left off, without losing any of those in-memory queue manager transactions.
>

That's an interesting take on the situation that I had not considered.

So you are saying that the queue manager may have been designed on the
basis that memory was permanent even when the power was pulled and that
the design decisions were never reviewed when memory moved again to
being volatile ?

If so, I wonder how that got missed.

> That battery-backed memory sure helped the day my 4-year old daughter
> found she was just tall enough now to reach the emergency power-off
> button located just inside the datacenter doorway.
>

Ouch. :-)

How long did it take you to live _that_ one down ? :-)

Simon.

David Froble

Sep 6, 2013, 8:14:51 PM

Simon Clubley wrote:
> On 2013-09-06, David Froble <da...@tsoft-inc.com> wrote:
>> Simon Clubley wrote:
>>> Anyone remember this from a few months ago ?
>>>
>>> I _finally_ got some answers today from VMS Engineering. My thanks to
>>> Mandar for getting someone in HP who actually understood what we were
>>> trying to tell them to look at this and to come back with a well
>>> reasoned analysis which was the kind of thing Nashua used to produce.
>> You mean such as the leap year explanation ???
>>
>
> Actually I was thinking of the kinds of things I used to throw at them.
> For some reason, I always seem to do things slightly differently and as
> a result seem to find things others don't. :-)

Leap year was still the best ever ...

>>> It turns out my initial suspicions were correct; VMS sacrifices robust
>>> behaviour in favour of increased performance in the queue manager design.
>>>
>>> VMS maintains a in-memory copy of the permanent on disk queue manager
>>> database and routine changes to that database (such as the above timed
>>> release job starting) are immediately made _only_ to the in-memory copy
>>> of the database.
>>>
>>> Those in-memory updates are only committed to the permanent on-disk queue
>>> manager database when a timer expires every couple of minutes. There is no
>>> way to change the timer interval or to force the immediate updating of the
>>> on-disk queue manager database.
>>>
>>> This means that when there's a power failure shortly after a timed release
>>> job starts, then the job will start for a second time when power is restored
>>> instead of being marked as retained on error with a "system failed
>>> during execution" status.
>> You posed such a simple question, and the fact that it wasn't answered
>> in a like manner is depressing. What happens when you pose a tough
>> question?
>>
>
> Congratulations David, you got it in one. (And yes, it's very depressing.)

I've been saying this kind of stuff ever since they fired the people who
had any clues ...

I disagree. I don't understand additional requirements. You could have
a monitored UPS and VMS could be gracefully shut down when required.
Frankly, I think anything else is reckless.

Michael Moroney

Sep 6, 2013, 9:07:21 PM

Is it possible that this issue has been there for a while, and nobody
tripped over it until now? I don't see the current VMS Engineering
going back and trying to speed up VMS by messing with the queue manager
like that. On the other hand, I don't see the old VMS Engineering ever
making such a speedup in the first place, they'd have known what could
happen.

JF Mezei

Sep 6, 2013, 11:50:45 PM

On 13-09-06 21:07, Michael Moroney wrote:
> Is it possible that this issue has been there for a while, and nobody

> tripped over it until now? ...

As I recall, the queue manager was rewritten sometime between 5.0 and
7.3 with big speed improvements. Perhaps the caching was added at that
time ?

The lack of writing to disk really has to be documented because that is
different from expected behaviour. And since this was never documented,
they can fix it to match expected behaviour. The one or two shops for whom
performance is more important than reliability can set some logical to
enable the caching.

What is also not clear is whether that application level cache still
provides performance improvement when you consider caching done by disk
arrays today.

Sure, delaying the writes does skip the "job started" change to database
and only writes the "job completed" one.

BTW, if there is a status to indicate a job was running while the system
failed, but this status is not set for short jobs, then this is a bug
that needs to be fixed.

My guess: with only one customer complaining, I doubt HP corporate will
allocate sufficient budget to HP India to develop a fix for this. But
had this come out in the early 1990s, this would have gotten a fix
really quickly. Hoff might even have skipped lunch to get the fix out by
early afternoon :-)

Simon Clubley

Sep 7, 2013, 6:01:43 AM

On 2013-09-06, JF Mezei <jfmezei...@vaxination.ca> wrote:
>

> The lack of writing to disk really has to be documented because that is
> different from expected behaviour.

I completely and totally agree. For decades, people have been told that
VMS is not the fastest performer but that it was designed to be the safest
performer by default. And then we find this lurking underneath. :-)

>
> Sure, delaying the writes does skip the "job started" change to database
> and only writes the "job completed" one.
>

This isn't what happens. Neither the job started nor the job complete status
change is written away until after the free-running two-minute timer has
expired or something major, like the queue manager being shut down, occurs.

If the job complete status had been written away, then the job would not
have run for a second time when power was restored.

> BTW, if there is a status to indicate a job was running while the system
> failed, but this status is not set for short jobs, then this is a bug
> that needs to be fixed.
>

It can also happen on long jobs if the power fails between the time the
job started and the time this is recorded in the on-disk database.

BTW, I have been talking about power failures because that is what happened
here, but surely exactly the same thing is going to happen with a system
crash (unless there's some flushing to disk of in-memory structures when
the system crashes I am unaware of).

Simon.

Simon Clubley

Sep 7, 2013, 6:09:09 AM

On 2013-09-06, JF Mezei <jfmezei...@vaxination.ca> wrote:
>

> My guess: with only one customer complaining, I doubt HP coporate will
> allocate sufficient budget to HP India to develop a fix for this.

[I forgot to address this bit.]

I've now put in a formal request via the UK support person for this to
be fixed on the grounds that it can be considered a design flaw and even
an outright bug, that it can have very significant real world impacts on
customers (especially if they are unaware of this little feature as I was),
and that it is completely contrary to the general VMS safety first by
default ethos.

I will let you know what comes of it.

Simon.

Phillip Helbig---undress to reply

Sep 7, 2013, 6:27:09 AM

In article <l0dm2o$jj2$1...@dont-email.me>, Simon Clubley

<clubley@remove_me.eisner.decus.org-Earth.UFP> writes:

> one of the things about VMS is that it
> has always focused on not negatively impacting performance due to a
> sudden change in defaults, especially as a result of merely applying
> a patch.

Not always. I remember when SHOW INTERFACE/CLUSTER stopped working. It
wasn't even deprecated to give people a chance to come up with something
else.

Jan-Erik Soderholm

Sep 7, 2013, 6:40:50 AM

Simon Clubley wrote 2013-09-06 19:45:
> On 2013-09-06, Simon Clubley <clubley@remove_me.eisner.decus.org-Earth.UFP> wrote:
>>
>> This means that when there's a power failure shortly after a timed release
>> job starts, then the job will start for a second time when power is restored
>> instead of being marked as retained on error with a "system failed
>> during execution" status.
>>
>
> I should also point out this means that for quick jobs which start and
> complete inbetween on-disk updates, there's no record at all within the
> queue manager database that the job ever started and completed. This was
> the case for me.
>
> Simon.
>

So there would be a "problem" with all batch jobs shorter than a
"couple of minutes" ? That can't be so, we have 100's of batch
jobs lasting a few seconds and we do not see this.

And what "record" do you expect to see in the qmgr database ?

Or did you mean that the problem is if the system crashes within
a "couple of minutes" after the short job has run? Hm...

For a job that completes successfully, and not using some
"retain" flag, what record is left in the qmgr database?

Jan-Erik.

Simon Clubley

Sep 7, 2013, 7:12:14 AM

On 2013-09-07, Jan-Erik Soderholm <jan-erik....@telia.com> wrote:
> Simon Clubley wrote 2013-09-06 19:45:
>> On 2013-09-06, Simon Clubley <clubley@remove_me.eisner.decus.org-Earth.UFP> wrote:
>>>
>>> This means that when there's a power failure shortly after a timed release
>>> job starts, then the job will start for a second time when power is restored
>>> instead of being marked as retained on error with a "system failed
>>> during execution" status.
>>>
>>
>> I should also point out this means that for quick jobs which start and
>> complete inbetween on-disk updates, there's no record at all within the
>> queue manager database that the job ever started and completed. This was
>> the case for me.
>>
>

> So there would be a "problem" with all batch jobs shorter then a
> "couple of minutes" ? That can't be so, we have 100's of batch
> jobs lasting a few seconds and we do not se this.
>

All those updates will be written to disk just fine, but possibly not for
up to two minutes (it's a free-running timer and I have been told the
timer's interval is two minutes). Until then, the only record is in the
in-memory copy of the queue manager database.

> And what "record" do you exepect to see in the qmgr database ?
>
> Or did you meen that the problem is if the system crashes within
> a "couple of minutes" efter the short job has run? Hm...
>

Yes. This is exactly what I mean. This was discovered when a power
failure occurred shortly after a job ran and the same job ran
for a second time when power was restored.

> For a job that compleets successfully, and not using some
> "retain" flag, what record is left in the qmgr database?
>

If the job started record was not flushed to disk before the power
failure, then there will be no record in the queue manager database
of the job starting or being removed after it completed. It will be as
if the job never started in the first place and will run for a second
time after the power is restored.

If the job started record was flushed to disk before the power failure
but not the job completed record then the job will be marked as system
failed during execution when the power is restored.

If the job completed record was flushed to disk before the power failure
then the job will have been removed from the queue manager database just
as you would expect.
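
The three cases above can be summarised as a small decision function.
This is only a restatement of the description in Python, for a job that
had already completed when the power failed. The function name and the
returned strings are made up for illustration and are not queue manager
status codes.

```python
def state_after_restart(start_flushed, completion_flushed):
    """What the on-disk queue database implies once power is restored.

    Both arguments say whether that record reached disk before the
    power failure. A flushed completion implies the start was flushed.
    """
    if completion_flushed:
        return "removed from queue"
    if start_flushed:
        return "marked as system failed during execution"
    return "still pending, runs a second time"


# The case reported in this thread: nothing reached disk in time.
outcome = state_after_restart(start_flushed=False, completion_flushed=False)
```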

Jan-Erik Soderholm

Sep 7, 2013, 7:22:56 AM

OK. I understand.
B.t.w., I also thought that the updates were flushed at once.
And 2 minutes seems far too long anyway by today's
standards. Also, any disk system with a (battery-backed)
write-back cache, such as in any SAN solution today, will
help with not banging on the disks all the time.

If this was something added in the 7.x timeframe, it might
be something that is not too hard to "fix" today.

Interesting...

Jan-Erik.

Paul Sture

Sep 7, 2013, 6:50:45 AM

In article <l0ev1t$e3s$1...@online.de>,
hel...@astro.multiCLOTHESvax.de (Phillip Helbig---undress to reply)
wrote:

But that's TCP/IP Services, which has not been noted for following "the
rules".

IMHO of course.

--
Paul Sture

Paul Sture

Sep 7, 2013, 8:43:06 AM

In article <l0dqvu$9m1$1...@dont-email.me>,

I'm going to take Simon's side here.

It's a matter of risk analysis and cost. Adding the complication of
Less Uninterruptible Power Supplies (LUPS) can easily add a significant
risk and there's a substantial capital cost too.

I have seen a misbehaving LUPS take out disk controllers, and at another
place one took out a whole building which included several football
pitch sized server rooms for several hours.

At another place with a very large assembly line driven by VMS the
decision was taken to opt for a DEC hardware support contract during
office hours only. Full 24 hour Monday to Friday support was so
expensive that management decided to take the financial hit on the chin
in the form of a very large bill from DEC if it ever came to needing
that support outside contracted hours.

--
Paul Sture

VAXman-

Sep 7, 2013, 10:19:34 AM

To the contrary, when JCP&L's transformer outside blew up during hurricane
Sandy, the EMF surge blew up a 3kVA UPS (the one with 2 external/additional
battery packs) and caused it to catch fire. Happily, every single piece of
equipment it backed up was spared of the same fate which befell this UPS.

That UPS was replaced this spring. I must thank my Uncle Sam's bullshit
promises of helping victims, both large and small, of superstorm Sandy --
the second worst storm in US history WRT costs, damage and loss of life.

--
VAXman- A Bored Certified VMS Kernel Mode Hacker VAXman(at)TMESIS(dot)ORG

Well I speak to machines with the voice of humanity.

Stephen Hoffman

Sep 7, 2013, 11:09:25 AM

On 2013-09-07 10:01:43 +0000, Simon Clubley said:

> On 2013-09-06, JF Mezei <jfmezei...@vaxination.ca> wrote:
>>
>> The lack of writing to disk really has to be documented because that is
>> different from expected behaviour.
>
> I completely and totally agree. For decades, people have been told that
> VMS is not the fastest performer but that it was designed to be the
> safest performer by default. And then we find this lurking underneath.
> :-)

There are other similar gremlins lurking. Multi-block writes to
various SCSI shelves can fail to reach the rotating rust, for instance.
VMS and the app tried to get the data to disk and the disk may well
have indicated the operation was completed, and it didn't get there.
There were StorageWorks options that were shelf-local batteries
intended to reduce the exposure to these cache-loss cases and to keep
the power long enough to get the disk or the controller caches flushed,
but AFAIK the shelf "brick" battery options were retired some years ago.

Making this more complex, there are also disks around which consider a
transfer into local cache to be the I/O completion; the host receives a
write-completed response from the disk. If the end-user or the OEM
system vendor that happened to select these disks doesn't detect that the
completion arrives before the data is written to non-volatile storage,
then there's a window that can lead to data loss.

For a related product — which was never supported on VMS — see the DEC
PrestoServe NVRAM hardware. Various mid- and upper-end RAID storage
controllers can now have local batteries intended to keep the caches
valid (hopefully) until power is recovered, as well.

A design that's robust against power outages and hardware failures and
related is difficult, and expensive. This is the purview of VAXft and
NSK hardware. But then I've also had high-end FC SAN storage
controllers corrupt RAID volume disk blocks underneath a critical
database, so there are always new and creative ways for a computer to
clobber your application processing.

> BTW, I have been talking about power failures because that is what
> happened here, but surely exactly the same thing is going to happen
> with a system crash (unless there's some flushing to disk of in-memory
> structures when the system crashes I am unaware of).

AFAIK, the only data that's typically automatically recovered from the
carcass of a crashdump (by VMS) is the error log data. The (cached)
hardware errors that might have triggered the crash.

Various databases can be configured to recover in-flight and cached
transactions using journals, though there are undoubtedly some few
cases around where even that effort will fail, too.

Paul Sture

Sep 7, 2013, 11:00:37 AM

In article <00AD8F3C...@SendSpamHere.ORG>,

Brain fart. I meant Less Interruptible Power Supplies (LIPS).

> >I have seen a misbehaving LUPS take out disk controllers, and at another
> >place one took out a whole building which included several football
> >pitch sized server rooms for several hours.
>
> To the contrary, when JCP&L's transformer outside blew up during hurricane
> Sandy, the EMF surge blew up a 3kVA UPS (the one with 2 external/additional
> battery packs) and caused it to catch fire. Happily, every single piece of
> equipment it backed up was spared of the same fate which befell this UPS.

Yep, every installation should be tailored to cover local conditions.

> That UPS was replaced this spring. I must thank my Uncle Sam's bullshit
> promises of helping victims, both large and small, of superstorm Sandy --
> the second worst storm in US history WRT costs, damage and loss of life.

Business as usual I note :-(

--
Paul Sture

Paul Sture

Sep 7, 2013, 11:10:56 AM

In article <l0f2ae$s24$1...@news.albasani.net>,
Jan-Erik Soderholm <jan-erik....@telia.com> wrote:

> Simon Clubley wrote 2013-09-07 13:12:
> >
> > If the job started record was not flushed to disk before the power
> > failure, then there will be no record in the queue manager database
> > of the job starting or being removed after it completed. It will be as
> > if the job never started in the first place and will run for a second
> > time after the power is restored.
> >
> > If the job started record was flushed to disk before the power failure
> > but not the job completed record then the job will be marked as system
> > failed during execution when the power is restored.
> >
> > If the job completed record was flushed to disk before the power failure
> > then the job will have been removed from the queue manager database just
> > as you would expect.
> >
> > Simon.
> >
>
> OK. I understand.
> B.t.w, I also thought that the updates were flushed at once.
> And 2 minutes seems far too long anyway by today's
> standards. And also, any disk system with (battery-backed)
> write-back cache, such as in any SAN solution today, will
> help with not banging on the disks all the time.
>
> If this was something added in the 7.x timeframe, it might
> be something that is not too hard to "fix" today.
>
> Interesting...

IIRC (which is doubtful), the queue manager got a lot of features added
in V4.0 and then again in 5.4.

I've got rusty in this area. What part does the queue manager journal
file play here? I (perhaps mistakenly) thought that the journal bit of
the name was to do with recovery from crashes.

There was a bug in the early 6.n timeframe (documented in the V6.2
Release Notes or Cover Letter IIRC) where the journal file would grow to
hundreds of blocks on a busy system, and an otherwise undocumented value
you fed to JBC$COMMAND.EXE to fix this.

--
Paul Sture

Stephen Hoffman

Sep 7, 2013, 11:18:21 AM

On 2013-09-07 10:40:50 +0000, Jan-Erik Soderholm said:

> So there would be a "problem" with all batch jobs shorter than a
> "couple of minutes" ? That can't be so, we have 100's of batch jobs
> lasting a few seconds and we do not see this.

So fire up one of your test systems, spool up a couple hundred
one-second jobs, and pull the plug. Then check to see
what gets restarted, and compare that with the log files that have already
been written. If the job controller queue data is being cached,
you'll detect some number of duplicate runs, depending on how long it's
been since the queue data has been flushed.

> And what "record" do you expect to see in the qmgr database ?

The job start and job completion would be typical.

> Or did you mean that the problem is if the system crashes within a
> "couple of minutes" after the short job has run? Hm...

Welcome to what inevitably happens with any volatile data caches.

> For a job that completes successfully, and not using some "retain"
> flag, what record is left in the qmgr database?

Per the reported behavior, none, at least until the job controller
queue manager data cache is flushed.

In summary: the queue manager uses cached data for maintaining the
status of the queues, and up to ~two minutes of that cached
queue activity data can be lost after a power failure, disk or RAID
failure or other hardware error. Increasing the frequency of flushes
will reduce but cannot eliminate this window.
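
A rough sketch of why a shorter flush interval shrinks the window without closing it, using the one-second test jobs suggested above. The model is an assumption for illustration only: jobs run back to back from t = 0, the timer ticks at whole multiples of the interval, and none of the names correspond to anything in VMS.

```python
def unflushed_completions(crash: float, interval: float, job_length: float = 1.0) -> int:
    """Count jobs that finished after the last flush preceding the crash.

    Jobs are assumed to run back to back from t = 0 and the timer is
    assumed to tick at whole multiples of `interval`.
    """
    last_flush = (crash // interval) * interval
    finished_by_crash = int(crash // job_length)
    finished_by_flush = int(last_flush // job_length)
    return finished_by_crash - finished_by_flush


# Pull the plug 235 seconds in and compare three flush intervals.
for interval in (120, 60, 10):
    print(interval, unflushed_completions(crash=235, interval=interval))
```

Even at a 10-second interval some completions are still unrecorded at the moment of the crash, so those jobs remain candidates for a duplicate run.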

David Froble

Sep 7, 2013, 11:56:52 AM

The bottom line, there are no guarantees ....

Design your applications with this concept, and perhaps you'll be less
vulnerable. But there will always be some vulnerability.

I've seen, and not just recently, people who do not consider failures.
I was told "if you send a request over the internet, you'll always get a
reply, why would you need a timeout on your read?"

Seems to be a lot of this type of thinking in "modern programming" ...
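
As a minimal illustration of the timeout point, here is a Python sketch of a read that gives up when no reply arrives. It uses a local socket pair as a stand-in for a real network peer; the function name and the buffer size are arbitrary choices for the example.

```python
import socket
from typing import Optional


def read_reply(sock: socket.socket, timeout_s: float) -> Optional[bytes]:
    """Wait for a reply, but give up after timeout_s seconds."""
    sock.settimeout(timeout_s)
    try:
        return sock.recv(4096)
    except socket.timeout:
        return None  # the peer never answered; the caller decides what to do


# A connected pair of local sockets stands in for client and server.
client, server = socket.socketpair()
print(read_reply(client, 0.1))   # the "server" stays silent, so this gives up
server.sendall(b"done")
print(read_reply(client, 0.1))   # now there is something to read
client.close()
server.close()
```

Without the timeout, the first read would simply block until someone killed the process.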

Simon Clubley

Sep 7, 2013, 12:10:06 PM

On 2013-09-07, Paul Sture <nos...@sture.ch> wrote:
>
> IIRC (which is doubtful), the queue manager got a lot of features added
> in V4.0 and then again in 5.4.
>
> I've got rusty in this area. What part does the queue manager journal
> file play here? I (perhaps mistakenly) thought that the journal bit of
> the name was to do with recovery from crashes.
>

No, you are correct. That's how it is described in the manuals:

|Contains information allowing the queue manager to return to the last known
|state if:
|
| A standalone machine stops unexpectedly
| An OpenVMS Cluster node that is running the queue manager leaves the
| OpenVMS Cluster environment
|
|The journal file also contains job definitions.

Taken from http://h71000.www7.hp.com/doc/73final/6017/6017pro_054.html

Notice the absence of any information about delays in updating the
on-disk journal file or the fact that there's both an on-disk and a
memory-based component to the journaling.

This is clearly a new use of the word journal I was previously unfamiliar
with. (With apologies to Douglas Adams. :-))

> There was a bug in the early 6.n timeframe (documented in the V6.2
> Release Notes or Cover Letter IIRC) where the journal file would grow to
> hundreds of blocks on a busy system, and an otherwise undocumented value
> you fed to JBC$COMMAND.EXE to fix this.
>

Try up to hundreds of thousands of blocks. :-)

mcr jbc$command
diag 7

It's been a long time since I did that, but I just checked the sequence
before posting and yes, I did remember it correctly...

(Please don't ask why this is so permanently embedded in my brain. :-))

Stephen Hoffman

Sep 7, 2013, 12:15:24 PM

On 2013-09-07 15:56:52 +0000, David Froble said:

> The bottom line, there are no guarantees ....

Mr. Murphy would disagree. There's at least one guarantee around:
errors can and will happen.

> Design your applications with this concept, and perhaps you'll be less
> vulnerable. But there will always be some vulnerability.
>
> I've seen, and not just recently, people who do not consider failures.
> I was told "if you send a request over the internet, you'll always get
> a reply, why would you need a timeout on your read?"

<http://en.wikipedia.org/wiki/Fallacies_of_Distributed_Computing>
<http://aphyr.com/posts/288-the-network-is-reliable>

But there's also the discussion around whether higher reliability is
affordable, and for many applications, a solution that's "good enough"
can win. Much like security, for that matter. Trade-offs abound.

> Seems to be a lot of this type of thinking in "modern programming" ...

Always has been, really. I've chased more than a little buggy VMS code
around that completely ignored synchronization or error checking, and
similar issues go back an eon or two. Adding in networking provides a
plethora of pitfalls for a programmer, and more choices meant for Mr.
Murphy.

Paul Sture

Sep 7, 2013, 2:17:05 PM

In article <l0fj4u$ckh$1...@dont-email.me>,

Simon Clubley <clubley@remove_me.eisner.decus.org-Earth.UFP> wrote:

> On 2013-09-07, Paul Sture <nos...@sture.ch> wrote:
> >
> > IIRC (which is doubtful), the queue manager got a lot of features added
> > in V4.0 and then again in 5.4.
> >
> > I've got rusty in this area. What part does the queue manager journal
> > file play here? I (perhaps mistakenly) thought that the journal bit of
> > the name was to do with recovery from crashes.
> >
>
> No, you are correct. That's how it is described in the manuals:
>
> |Contains information allowing the queue manager to return to the last known
> |state if:
> |
> | A standalone machine stops unexpectedly
> | An OpenVMS Cluster node that is running the queue manager leaves the
> | OpenVMS Cluster environment
> |
> |The journal file also contains job definitions.
>
> Taken from http://h71000.www7.hp.com/doc/73final/6017/6017pro_054.html
>
> Notice the absence of any information about delays in updating the
> on-disk journal file or the fact that there's both an on-disk and a
> memory-based component to the journaling.

Yep. I am also unsure about what "The journal file also contains job
definitions." really means.

> This is clearly a new use of the word journal I was previously unfamiliar
> with. (With apologies to Douglas Adams. :-))

Sigh. It's times like this that I wish I had stashed all the DSNlink
articles when they were available.

> > There was a bug in the early 6.n timeframe (documented in the V6.2
> > Release Notes or Cover Letter IIRC) where the journal file would grow to
> > hundreds of blocks on a busy system, and an otherwise undocumented value
> > you fed to JBC$COMMAND.EXE to fix this.
> >
>
> Try up to hundreds of thousands of blocks. :-)

I think I saw it at several thousand blocks but never so huge.

> mcr jbc$command
> diag 7
>
> It's been a long time since I did that, but I just checked the sequence
> before posting and yes, I did remember it correctly...
>
> (Please don't ask why this is so permanently embedded in my brain. :-))

Nuff said.

The copy of the VMS FAQ I have (Revision Date/September 2006) has this
under section 5.18 "How do I move the queue manager database?":

----
To move the queue database:

1 Checkpoint the journal file. This reduces the file
size to the in-memory database size. This will cause
the noted delay.

$ RUN SYS$SYSTEM:JBC$COMMAND
JBC$COMMAND> DIAG 0 7
----

That's the only mention of the in-memory database I am aware of.

I am afraid I am having difficulty with this 2 minute value; it seems
way too high to me. I did test the batch restart stuff pretty
thoroughly way back when, but that was when JBCSYSQUE.DAT was the queue
file (pre-V5.4 IIRC). I was involved with hefty batch and print
processing (up to 140,000 batch/print jobs a day) with V6.1 and V6.2 (a
mix of VAX and Alpha) and find it hard to believe we would have been
subject to a potential 2 minute data loss there. We didn't have any
power cuts in that period so it's hard to say.

It would be interesting to know when this 2 minute value arrived. Part
or all of the "cure" for the large journal files perhaps?

--
Paul Sture

JF Mezei

Sep 7, 2013, 2:49:22 PM

On 13-09-07 14:17, Paul Sture wrote:

> It would be interesting to know when this 2 minute value arrived. Part
> or all of the "cure" for the large journal files perhaps?

Is there any documentation on what the queue manager file structures are
? This may give a clue on why caching was "necessary".

How big is the cache ? The whole queue database ? Or just a window into
the active area ? How much is rewritten every 2 minutes ? The whole
thing or just blocks that were modified or just individual records ?

My guess is that 2 minutes was some totally arbitrary number arrived at
by looking at the one large customer who complained about performance.
If you process a gazillion very short jobs that last a second, each
initiator gets to do 120 jobs per 2 minute interval. With no caching,
that means 240 writes to update status of each job as it starts and then
as it ends. With caching, the writes are reduced to 1.

But if you process 1 job per hour, then the caching gives you nothing in
terms of improvement.
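
The arithmetic above can be written out as a small Python sketch. It assumes one initiator, two record updates per job (start and end), a 2-minute timer, and that a timer tick writes nothing when nothing has changed; that last point is a guess, since the thread leaves open how much is rewritten on each flush.

```python
from typing import Tuple

SECONDS_PER_HOUR = 3600


def writes_per_hour(jobs_per_hour: int, interval_s: int = 120) -> Tuple[int, int]:
    """Return (uncached, cached) queue-file writes per hour for one initiator."""
    uncached = 2 * jobs_per_hour                # one write at job start, one at job end
    flushes = SECONDS_PER_HOUR // interval_s    # timer ticks per hour
    cached = min(uncached, flushes)             # assumed: no write when nothing changed
    return uncached, cached


print(writes_per_hour(3600))  # back-to-back one-second jobs
print(writes_per_hour(1))     # one job per hour
```

For one-second jobs this gives 7200 writes against 30 flushes per hour, which is the 240-to-1 ratio per 2-minute interval; for one job per hour both numbers are 2, so the cache saves nothing.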

Simon Clubley

Sep 7, 2013, 3:04:13 PM

You are not the only one having problems with this whole delayed writeback
business in the first place, let alone a deferred writeback of up to
a couple of minutes. It's so incompatible with the rest of VMS's design
goals that I don't see how any VMS system managers would have thought of
this upfront unless it had been pointed out in great big bold letters in
the VMS documentation.

Note that the UK based HP support person seemed to think it was 1 minute
after looking at the code himself before the final VMS Engineering
analysis came back but he wasn't completely sure.

However, the VMS Engineering analysis I received on Friday was mainly
focused on how my specific timed release job was handled so I don't know
if there's a different code path for timed release jobs which increases
the deferred writeback delay on the cache to 2 minutes or if there's
another part of the VMS code which the UK person didn't pick up on
which also increases the delay in general.

Chris Scheers

Sep 7, 2013, 7:04:33 PM

Simon Clubley wrote:
> On 2013-09-06, David Froble <da...@tsoft-inc.com> wrote:

>> I also think you should have some battery backup on your system(s).
>

> The battery backup is an interesting comment. The site in question has
> been running DEC systems for a _long_ time (decades) and the DEC systems
> they have used have traditionally been extremely robust in power failures.
> The Linux systems (both servers and desktops) are configured in ways
> that have also proven to be robust against power failures as well.

While having battery backup is generally a good idea, it is a red
herring for these discussions.

Deferred writing of data opens you up to all sorts of failures. Loss of
power is only one of them. There are also hardware failures, system
crashes, etc.

Since the queue manager can deal in the exportation of internal data
from the system to an external destination, anything that can cause this
exportation to be unreliable is a problem.

I once worked with a VMS system that, among other things, used the queue
manager in the path of delivering buy/sell orders. It was extremely
important to be able to note if a particular message was an original or
a resend because of a possibly failed first delivery.

This caching feature/bug of the queue manager would be very problematic.

--
-----------------------------------------------------------------------
Chris Scheers, Applied Synergy, Inc.

Voice: 817-237-3360 Internet: ch...@applied-synergy.com
Fax: 817-237-3074

Simon Clubley

Sep 7, 2013, 8:06:26 PM

On 2013-09-07, Chris Scheers <ch...@applied-synergy.com> wrote:
> Simon Clubley wrote:
>> On 2013-09-06, David Froble <da...@tsoft-inc.com> wrote:
>
>>> I also think you should have some battery backup on your system(s).
>>
>> The battery backup is an interesting comment. The site in question has
>> been running DEC systems for a _long_ time (decades) and the DEC systems
>> they have used have traditionally been extremely robust in power failures.
>> The Linux systems (both servers and desktops) are configured in ways
>> that have also proven to be robust against power failures as well.
>
> While having battery backup is generally a good idea, it is a red
> herring for these discussions.
>
> Deferred writing of data opens you up to all sorts of failures. Loss of
> power is only one of them. There are also hardware failures, system
> crashes, etc.
>

I strongly agree with this.

> Since the queue manager can deal in the exportation of internal data
> from the system to an external destination, anything that can cause this
> exportation to be unreliable is a problem.
>
> I once worked with a VMS system that, among other things, used the queue
> manager in the path of delivering buy/sell orders. It was extremely
> important to be able to note if a particular message was an original or
> a resend because of a possibly failed first delivery.
>
> This caching feature/bug of the queue manager would be very problematic.
>

I've also got important stuff running on the batch queues for which
this will be a very real problem.

The more I think about this, the more I am having trouble reconciling
the fact that (a) I am apparently the first one to have found this after
all these years with (b) the level of use the queue manager clearly
gets in a number of organisations.

This deferred updates problem has been confirmed for my case of time
released jobs, but while the general queue manager design is flawed I
am beginning to wonder if the problem might be mitigated in some
circumstances.

For example, if you submit a brand new job off the command line, does
that force a flushing of the in-memory structures to disk ? But OTOH,
if you look at the log file extracts in the original posting you will
see that the problem job did exactly that and it didn't make a
difference.

Or is it just that people wrote off a "that's weird, that job ran
again after the system came back up" versus my digging into this
problem to get to the bottom of it ? However, given the mindset of VMS
people, I can't believe I would be the first one to investigate this.

While the problem has been confirmed and the design flaw within the
queue manager identified, I am having an increasingly hard time
understanding why I am the first one to have found this.

JF Mezei

Sep 8, 2013, 1:00:01 AM

Is there a queue manager command that forces the memory copy to be
flushed to disk ?

If so, then you start every batch job with it so that the "job started"
status is recorded to disk.

At end of job, it is a bit harder. You can submit a tiny separate job
which just does the command to have queue manager flushed. This way, as
the real job ends, another one begins which causes the former job's
"completed" status to be written to disk.

Simon Clubley

Sep 8, 2013, 6:09:39 AM

On 2013-09-08, JF Mezei <jfmezei...@vaxination.ca> wrote:
> Is there a queue manager command that forces the memory copy to be
> flushed to disk ?
>

The VMS system manager has absolutely no control over this.

There are no options available to control the free running timer's period
or to disable deferred write caching in the first place.

That's what makes this so bad; if those options existed I would have
already used them and this discussion would be in the form of a heads-up
only. At least in Unix land, the sys admin has a range of options
available to trade off safety versus performance.

As mentioned in a previous message, I have filed a formal request
for such options on the grounds that this behaviour is a design flaw
and can be considered a bug and therefore my request can be considered
a bugfix and not an enhancement.

What I have in mind are options which set the behaviour of the queue manager
generally and not something you could run at the start of a job (although
if the latter was available, and non privileged, it's something you
could drop into sylogin for execution when in batch).

> If so, then you start every batch job with it so that the "job started"
> status is recorded to disk.
>
> At end of job, it is a bit harder. You can submit a tiny separate job
> which just does the command to have queue manager flushed. This way, as
> the real job ends, another one begins which causes the former job's
> "completed" status to be written to disk.
>

Although still important, the end of the job status isn't as important.
The job would just be marked as retained on error with a status of system
failed during execution if it was marked as started, but the job complete
record had not been written.

The critical thing is to stop the same job running twice.
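
Since the queue manager offers no such control, one application-level way to guard against a second run is for the job itself to record durably that it has started and to refuse to run again for the same period. The Python sketch below is a generic illustration of that idea, not anything VMS-specific: the marker directory, the `run_id` naming and the function are all hypothetical.

```python
import os


def claim_run(marker_dir: str, run_id: str) -> bool:
    """Return True only for the first caller to claim run_id.

    The marker is created exclusively and forced to disk before the
    function returns, so a later restart can see that the run happened.
    """
    path = os.path.join(marker_dir, run_id + ".started")
    try:
        # O_EXCL makes creation fail if the marker already exists.
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    try:
        os.write(fd, b"started\n")
        os.fsync(fd)  # ask the OS to push the file contents to the device
    finally:
        os.close(fd)
    return True
```

A job would call `claim_run(some_dir, "2013-04-11")` as its first step and exit quietly if the result is False. As discussed elsewhere in this thread, this still depends on the disk honestly reporting completed writes, and on some filesystems the directory entry needs its own flush as well.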

Jan-Erik Soderholm

Sep 8, 2013, 6:51:49 AM

Does "checkpointing" have anything to do with this "caching of updates"?

This command forces a "checkpoint" of the queue manager :

$ mc JBC$COMMAND diag 0 7
%JBC-I-DIAGNOSTIC,
Log for playback = 0
Save old Journal files = 0
Log all requests = 0
Dump on error = 0
Checkpoint: State = 1, In-memory blocks = 100
$

That creates a new SYS$QUEUE_MANAGER.QMAN$JOURNAL, if that
has anything to do with the issue at hand.

See also: http://keithobrien.co.uk/openvms-pages/jbc$command.html

Jan-Erik.

VAXman-

Sep 8, 2013, 6:58:39 AM

In article <l0hid3$lr6$1...@dont-email.me>, Simon Clubley <clubley@remove_me.eisner.decus.org-Earth.UFP> writes:
>On 2013-09-08, JF Mezei <jfmezei...@vaxination.ca> wrote:
>> Is there a queue manager command that forces the memory copy to be
>> flushed to disk ?
>>
>
>The VMS system manager has absolutely no control over this.
>
>There are no options available to control the free running timer's period
>or to disable deferred write caching in the first place.

$ RUN SYS$LIBRARY:DELTA ;)

Simon Clubley

Sep 8, 2013, 8:19:20 AM

On 2013-09-08, Jan-Erik Soderholm <jan-erik....@telia.com> wrote:
>
> Does "checkpointing" have anything to do with this "caching of updates"?
>
> This command forces a "checkpoint" of the queue manager :
>
> $ mc JBC$COMMAND diag 0 7
> %JBC-I-DIAGNOSTIC,
> Log for playback = 0
> Save old Journal files = 0
> Log all requests = 0
> Dump on error = 0
> Checkpoint: State = 1, In-memory blocks = 100
> $
>

Typical. You don't use a command for ~15 years and then it comes
along twice within 24 hours. :-)

It's not clear from the description in the link that you posted if it
actually flushes the in-memory structures to disk or if it's just like
doing, say, a vacuum on a SQL database.

In any case however, even if it does flush the in-memory blocks, I can't
use it because it would need to be something which could be dropped into
sylogin for use by a non-privileged user's batch jobs.

I do appreciate the suggestion however.

Jan-Erik Soderholm

Sep 8, 2013, 8:43:58 AM

Simon Clubley wrote 2013-09-08 14:19:
> On 2013-09-08, Jan-Erik Soderholm <jan-erik....@telia.com> wrote:
>>
>> Does "checkpointing" have anything to do with this "caching of updates"?
>>
>> This command forces a "checkpoint" of the queue manager :
>>
>> $ mc JBC$COMMAND diag 0 7
>> %JBC-I-DIAGNOSTIC,
>> Log for playback = 0
>> Save old Journal files = 0
>> Log all requests = 0
>> Dump on error = 0
>> Checkpoint: State = 1, In-memory blocks = 100
>> $
>>
>
> Typical. You don't use a command for ~15 years and then it comes
> along twice within 24 hours. :-)
>

Yes, I saw it in the first post, then googled a bit and found the
page where the description of the DIAG options was documented. :-)

> It's not clear from the description in the link that you posted if it
> actually flushes the in-memory structures to disk or if it's just like
> doing, say, a vacuum on a SQL database.
>

Yes, I wasn't sure either. Hence my initial question above... :-)

My thought was that you could take this back through your
currently open channel(s).

> In any case however, even if it does flush the in-memory blocks, I can't
> use it because it would need to be something which could be dropped into
> sylogin for use by a non-privileged user's batch jobs.
>

*IF* this flushes the in-memory cache, maybe it could be run, say,
every 10 sec in a separate (privileged) job. That will at least
make the "window" smaller. That is, of course, assuming that the command
has a low overhead in itself...

Jan-Erik.

VAXman-

Sep 8, 2013, 9:02:18 AM

In article <l0href$6ro$1...@news.albasani.net>, Jan-Erik Soderholm <jan-erik....@telia.com> writes:
>Simon Clubley wrote 2013-09-08 14:19:

>> On 2013-09-08, Jan-Erik Soderholm <jan-erik....@telia.com> wrote:
>>>
>>> Does "checkpointing" have anything to do with this "caching of updates"?
>>>
>>> This command forces a "checkpoint" of the queue manager :
>>>
>>> $ mc JBC$COMMAND diag 0 7
>>> %JBC-I-DIAGNOSTIC,
>>> Log for playback = 0
>>> Save old Journal files = 0
>>> Log all requests = 0
>>> Dump on error = 0
>>> Checkpoint: State = 1, In-memory blocks = 100
>>> $
>>>
>>
>> Typical. You don't use a command for ~15 years and then it comes
>> along twice within 24 hours. :-)
>>
>
>Yes, I saw it in the first post, then googled a bit and found the
>page where the description of the DIAG options was documented. :-)
>
>> It's not clear from the description in the link that you posted if it
>> actually flushes the in-memory structures to disk or if it's just like
>> doing, say, a vacuum on a SQL database.
>>
>
>Yes, I wasn't sure either. Hence my initial question above... :-)
>
>My thought was that you could take this back through your
>currently open channel(s).
>
>> In any case however, even if it does flush the in-memory blocks, I can't
>> use it because it would need to be something which could be dropped into
>> sylogin for use by a non-privileged user's batch jobs.
>>
>
>*IF* this flushes the in-memory cache, maybe it could be run, say,
>every 10 sec in a separate (privileged) job. That will at least
>make the "window" smaller. That is, of course, assuming that the command
>has a low overhead in itself...

$SNDJBC ;)

Jan-Erik Soderholm

Sep 8, 2013, 10:09:16 AM

Sorry, but I do not understand anything.
What has that routine to do with this?
Can you send the same "diag" commands through SNDJBC ?

Jan-Erik.

Paul Sture

Sep 8, 2013, 11:20:27 AM

In article <00AD8FFA...@SendSpamHere.ORG>,

Now that's cruel.

(The V8.3 $SNDJBC documentation in PDF format is 60 pages long,
including the single page for $SNDJBCW)

--
Paul Sture

Jan-Erik Soderholm

Sep 8, 2013, 11:54:03 AM

Paul Sture wrote 2013-09-08 17:20:
> In article <00AD8FFA...@SendSpamHere.ORG>,
> VAXman- @SendSpamHere.ORG wrote:
>
>> In article <l0href$6ro$1...@news.albasani.net>, Jan-Erik Soderholm
>> <jan-erik....@telia.com> writes:
>>>
>>> *IF* this flushes the in-memory cache, maybe it could be run, say,
>>> every 10 sec in a separate (privileged) job. That will at least
>>> make the "window" smaller. That is, of course, assuming that the command
>>> has a low overhead in itself...
>>
>> $SNDJBC ;)
>
> Now that's cruel.

Why?

Jan-Erik.

hb

Sep 8, 2013, 12:29:13 PM

On 09/08/2013 02:43 PM, Jan-Erik Soderholm wrote:

> *IF* this flushes the in-memory cache, maybe it could be run, say,
> every 10 sec in a separate (privileged) job. That will at least
> make the "window" smaller. That is, of course, assuming that the command
> has a low overhead in itself...

*IF* you want to make the window smaller, try to find the delta time in
the image file and patch it. If you have the VMS listings and maps it
should be easy to locate the quadword and its current value. I'm sure
someone has these files and can point you to the right file and offset.
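
For anyone hunting for that quadword, here is a hedged Python sketch of what the byte pattern would look like, assuming the interval is stored as an ordinary VMS delta time (a negative 64-bit count of 100-nanosecond units, little-endian on Alpha). Whether the image really holds it in that form, and whether the value is one minute or two, is not confirmed anywhere in this thread.

```python
import struct

TICKS_PER_SECOND = 10_000_000  # VMS system time counts 100-nanosecond units


def delta_time_quadword(seconds: int) -> bytes:
    """Encode a delta time as a little-endian signed 64-bit value.

    Delta (relative) times are stored as negative numbers.
    """
    return struct.pack("<q", -seconds * TICKS_PER_SECOND)


# Candidate patterns for the two intervals mentioned in this thread.
for seconds in (60, 120):
    print(seconds, delta_time_quadword(seconds).hex())
```

The hex strings are only search candidates; a match in the image proves nothing by itself, which is why the listings and maps are the right way to locate the real cell.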

VAXman-

Sep 8, 2013, 2:37:01 PM

In article <l0i0ec$go5$1...@news.albasani.net>, Jan-Erik Soderholm <jan-erik....@telia.com> writes:
>VAXman- @SendSpamHere.ORG wrote 2013-09-08 15:02:

>Sorry, but I do not understand anything.
>What has that routine to do with this?
>Can you send the same "diag" commands through SNDJBC ?

These diag functions are not "formally" documented but if you were to open,
say, SYS$LIBRARY:STARLET.REQ and search for SJC$_DIAG and SJC$V_DIAG, you'd
see that the functions and associated bits are defined and there are even a
few comments therein as well.

JF Mezei

Sep 8, 2013, 4:18:31 PM

On 13-09-08 06:58, VAXman- @SendSpamHere.ORG wrote:

> $ RUN SYS$LIBRARY:DELTA ;)

Patching the image to have a reduced timer value for more frequent
database flushes could have unintended consequences.

If, every 2 minutes, the whole memory structure is flushed back to disk
representing a large number of blocks, then doing this every 10 seconds
could really slow your system.

On the other hand, if the flushing only affects modified blocks, then a
system with no activity could "flush" every 10 seconds with no disk
writes unless some jobs have been started in the last 10 seconds.

Bear in mind that TCPIP services uses the queue manager for emails. So
a high email activity level could still cause a lot of disk writes even
if there are no batch/print jobs.
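
The two possibilities above can be compared with a small Python sketch. It is a back-of-envelope model only: which of the two strategies the queue manager actually uses is unknown here, and the 100-block database size is simply borrowed from the "In-memory blocks = 100" line in the DIAG output posted earlier.

```python
def blocks_written_per_hour(interval_s: int, db_blocks: int,
                            dirty_blocks: int, dirty_only: bool) -> int:
    """Blocks written by the periodic flush in one hour.

    dirty_blocks is the number of blocks modified during each interval.
    """
    flushes = 3600 // interval_s
    per_flush = min(dirty_blocks, db_blocks) if dirty_only else db_blocks
    return flushes * per_flush


# Whole-database flush: cost grows as the interval shrinks, even when idle.
print(blocks_written_per_hour(120, 100, 0, dirty_only=False))
print(blocks_written_per_hour(10, 100, 0, dirty_only=False))

# Modified-blocks-only flush: an idle system writes nothing at any interval.
print(blocks_written_per_hour(10, 100, 0, dirty_only=True))
```

Under the whole-database assumption, going from 120 to 10 seconds multiplies the write load by twelve; under the modified-blocks assumption the cost tracks activity (batch, print or mail) rather than the timer.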