NaT consumption faults with COBOL?

Jim Duff

17 Nov 2009, 15:22:27
All,

I'm currently seeing some NaT consumption faults on straight COBOL code
compiled with HP COBOL V2.9-1453 on OpenVMS IA64 V8.3-1H1. The compiler
is invoked with the following qualifiers:

/ansi/tie/standard=v3/reserved=noxopen/check=(all,nodec)/convert=leading

Looking up the NATFAULT error, we have (in part):

"A NaT value can be generated by a user program using an I64 feature
called control speculation. However, compiler-generated code should
never take a NaT fault."

Based on this, a call was logged with HP, and they came back with "This
is a function of the memory load on your machine."

This doesn't ring right with me. Has anyone else seen anything similar?
Any known issues around this area?

Thanks,
Jim.
--
www.eight-cubed.com

John Reagan

17 Nov 2009, 18:59:45

"Jim Duff" <spam...@127.0.0.1> wrote in message
news:49dbt6-...@SendSpamHere.ORG...

That wasn't the most helpful answer you received.

I haven't seen any NaT consumption faults with COBOL code. There were some
bugs in the GEM code generator that might expose incoming NaTs that just
happen to be sitting around in registers. We only saw them in C and BLISS
code. However, my memory says they should be fixed in the GEM inside of
COBOL V2.9 regardless.

Is this easily reproducible? Most NaT consumption faults I've seen are
random and hard to pin down.

You should submit a case against the COBOL compiler and provide a reproducer
if possible. If not, the team would certainly want to know enough info to
track it back to the COBOL source line that tripped across the NaT (so .MAP
files, /LIST/MACH files, etc.). We need that to figure out which code
pattern needs NaT safety checking. As the message said, no
compiler-generated code should take a NaT fault (including Macro-32). You
should only be able to get a NaT fault if you write in Itanium assembly and
aim at your foot.

GEM itself does not use the NaT feature of Itanium, but the C++ compiler
does (which is how NaTs show up from time to time).

I can explain in more detail if folks are interested (and you have a strong
gag reflex).

John


Bob Gezelter

17 Nov 2009, 19:34:59
On Nov 17, 6:59 pm, "John Reagan" <johnrrea...@earthlink.net> wrote:
> "Jim Duff" <spam.t...@127.0.0.1> wrote in message

John,

With all due respect, don't you mean "do NOT have a strong gag
reflex" (I presume you meant "strong stomach").

- Bob ("Strong Stomach") Gezelter, http://www.rlgsc.com

Jeremy Begg

17 Nov 2009, 22:34:53
Hi John,

> I can explain in more detail if folks are interested (and you have a strong
> gag reflex).

I've parked a bucket by my desk, let's hear it!

Curlsman

17 Nov 2009, 23:41:56

Somehow I'm expecting something less than Mr. Creosote from Monty
Python's Meaning of Life...
but I'm prepared to be disappointed.

http://en.wikipedia.org/wiki/Mr_Creosote

John Reagan

18 Nov 2009, 11:04:52
OK children sit down by the campfire and I'll tell you a scary story about
NaTs!

[holding a flashlight under my chin to illuminate my face...]

There are two NaT-related stories to tell.

1)

The integer registers on Itanium are actually 65 bits wide: 64 bits of data
and a special bit called the NaT bit (NaT is short for Not a Thing). NaTs
are like their silent NaN counterparts in the IEEE floating world. If you
have a NaT, you can add to it, subtract from it, use it like any register
operand, etc., and the NaT just propagates along. You only get into trouble
when you try to store that register to memory. If the register's NaT bit is
set, then you get a fault saying the register really has no value to store.

The two normal ways to clear a NaT bit on a register are to move a literal
value into the register or to load a value from memory into the register. I
won't go into how NaTs are set other than saying it is a part of the
Itanium speculative load feature, where you can try to load a value and if
the memory location doesn't exist, you get a NaT instead of an ACCVIO.
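
A rough way to picture the rules so far is a toy model in Python. This is only a sketch: the names (Reg, speculative_load, store, and so on) are invented for illustration, memory is just a dict, and none of it is real Itanium semantics beyond the behavior described above.

```python
from dataclasses import dataclass

MASK64 = (1 << 64) - 1


class AccessViolation(Exception):
    """Stand-in for an ACCVIO on an ordinary load (hypothetical name)."""


class NaTConsumptionFault(Exception):
    """Raised when a register with its NaT bit set is stored to memory."""


@dataclass(frozen=True)
class Reg:
    value: int = 0      # the 64 data bits
    nat: bool = False   # the 65th bit


def mov_literal(literal):
    # Moving a literal into a register leaves the NaT bit clear.
    return Reg(literal & MASK64, False)


def load(memory, addr):
    # Ordinary load: a bad address faults, a good one gives NaT clear.
    if addr not in memory:
        raise AccessViolation(hex(addr))
    return Reg(memory[addr] & MASK64, False)


def speculative_load(memory, addr):
    # Speculative load: a bad address gives a NaT instead of a fault.
    if addr not in memory:
        return Reg(0, True)
    return Reg(memory[addr] & MASK64, False)


def add(a, b):
    # Arithmetic just propagates the NaT bit of either operand.
    return Reg((a.value + b.value) & MASK64, a.nat or b.nat)


def store(memory, addr, reg):
    # Storing a NaT register is where the trouble starts.
    if reg.nat:
        raise NaTConsumptionFault(hex(addr))
    memory[addr] = reg.value
```

With this model, speculative_load on a missing address returns a register whose nat flag is set, add carries that flag along, and store raises NaTConsumptionFault only when the flagged register finally reaches memory.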

The GEM code generator often wants to write a register in pieces for things
like small structs, etc. that were allocated to a register. So if GEM
writes to the bottom longword of a register with a 'dep' instruction and
then writes to the top longword with another 'dep' instruction, you'd think
that we've now written all 64 bits, right? Nope. The 'dep' instruction
just propagates the NaT bit. Inserting a value into a portion of a NaT
still gives you a NaT. The GEM optimization that turns such writes into
multiple deposits into a register is pretty deep inside the flow analyzer.
As we found them (via bug reports), we added some extra code that tries to
clear the register first to clear out the NaT. It isn't as easy as it sounds
since the multiple deposits could be very far apart. We didn't want to
always clear all registers either since most of the time things are fine.
Why slow down everybody for the rare code patterns?

And to expose the bugs, the register had to start as a NaT. How would that
happen? Well, at some point recently, you must have executed something
written in C++ which does use speculative loads and it left a NaT in a
register. That flowed into the COBOL routine (even in newly created stacked
registers which often start as garbage from stuff on the register backing
store), GEM did multiple deposits into the register and then tried to write
that to memory.

This is the bug that the COBOL application must have found another occurrence
of.
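
To make the 'dep' problem concrete, here is a small Python sketch. It is a simplified model, not GEM output: Reg, dep, and build_pair are made-up names, and the deposited source is a plain integer rather than a second register.

```python
from dataclasses import dataclass

MASK64 = (1 << 64) - 1


@dataclass(frozen=True)
class Reg:
    value: int = 0      # the 64 data bits
    nat: bool = False   # the NaT bit


def mov_literal(literal):
    # Moving a literal into a register leaves the NaT bit clear.
    return Reg(literal & MASK64, False)


def dep(target, source_value, pos, length):
    # Deposit `length` bits of source_value into target at bit `pos`.
    # The target's NaT bit carries straight through to the result.
    field_mask = ((1 << length) - 1) << pos
    kept = target.value & ~field_mask & MASK64
    inserted = (source_value << pos) & field_mask
    return Reg(kept | inserted, target.nat)


def build_pair(reg, low, high, clear_first):
    # Fill both longwords of a register with two deposits.
    # clear_first models the extra "clear the register" fix.
    if clear_first:
        reg = mov_literal(0)
    reg = dep(reg, low, 0, 32)
    reg = dep(reg, high, 32, 32)
    return reg
```

Starting from a stale register such as Reg(0xDEADBEEF, True), build_pair with clear_first=False overwrites all 64 data bits yet still returns a register with nat set. With clear_first=True the same two deposits give a clean register.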


2)

While Macro-32 doesn't use the code-generator part of GEM, we also found a
NaT-related problem.

Unlike code in a high-level language which would never blindly store a
register to the stack "just for fun", it happens in Macro-32 code all the
time. I call them courtesy saves. You've all seen them. A Macro-32
routine that is about to use some register (like near a MOVCx for instance
or needing a quick scratch register on some rare code path) does a PUSHL of
register(s) to the stack, uses them for whatever, and POPLs them back. The
Macro-32 routine doesn't know if the registers had anything meaningful in
them or not, but did the push/pop "just in case".

For some courtesy saves (PUSHLs near the top of the routine), the compiler
can recognize them as register saves and actually moves them into the
routine prologue (turning them into 64-bit saves that use stacked registers
on Itanium or the memory stack on Alpha). However, for PUSHLs farther down in the
program (especially on branches of flow paths), the compiler thinks you
might actually want to push that value on the stack for perhaps a future
CALLS or perhaps you are building some data structure like a descriptor. So
we generate a 'st4' to push that register onto the memory stack. And if
that register contains a NaT? Yep, fault. Sucks in user mode. REALLY
SUCKS in kernel mode. :)

So what's the poor little compiler to do? As one of the many pieces in the
flow graph we build, we now look for paths from the start of the routine to
register pushes to the stack (not just any store to memory) which didn't
store into the register first (or have it listed as an INPUT register,
etc.). For those registers which might be in a courtesy save, we generate
extra code in the routine prologue to check if it is a NaT and, if it is,
shove a -1 into the register, clearing the NaT. The courtesy save will now
save/restore a -1 but it was prepared to save/restore garbage anyway. We
have more work to do in the epilogue since if we found a NaT in R4-R7 (the
preserved register set), we have to put the NaT back since some C++ code
earlier in the call chain might still expect the NaT to be in place (unless
the register is marked as OUTPUT or SCRATCH of course). And it gets really
nasty when routines branch between each other. Any epilogue/exit-sequence
doesn't know for sure which registers to put NaTs back into. There are
bitvectors created by prologues for such cases. The epilogues load that
bitvector into the predicate set and then do a bunch of predicated
instructions to restore the saved NaTs into the right registers.
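
The prologue/epilogue dance can be sketched the same way in Python. Everything here is an illustrative assumption: the function names, the dict of registers, and the plain integer standing in for the bitvector. The real compiler loads that bitvector into the predicate registers and uses predicated instructions rather than a loop.

```python
from dataclasses import dataclass

MASK64 = (1 << 64) - 1
PRESERVED = ("R4", "R5", "R6", "R7")  # the preserved register set


class NaTConsumptionFault(Exception):
    """Raised when a NaT register is pushed to the memory stack."""


@dataclass
class Reg:
    value: int = 0
    nat: bool = False


def prologue(regs, maybe_saved):
    # Scrub NaTs from registers that might be courtesy-saved.
    # Returns a bitvector recording which preserved registers held one.
    nat_bits = 0
    for i, name in enumerate(maybe_saved):
        if regs[name].nat:
            if name in PRESERVED:
                nat_bits |= 1 << i
            regs[name] = Reg(MASK64, False)  # shove in a -1
    return nat_bits


def epilogue(regs, maybe_saved, nat_bits):
    # Put the NaTs back where the prologue found them.
    for i, name in enumerate(maybe_saved):
        if nat_bits & (1 << i):
            regs[name] = Reg(regs[name].value, True)


def pushl(stack, reg):
    # Model of the 'st4' generated for a PUSHL.
    if reg.nat:
        raise NaTConsumptionFault("PUSHL of a NaT register")
    stack.append(reg.value & 0xFFFFFFFF)


def popl(stack, regs, name):
    # A load from memory always yields a register with NaT clear.
    regs[name] = Reg(stack.pop(), False)
```

In this model a routine that calls prologue first can pushl and popl R4 without faulting even if R4 arrived as a NaT, and epilogue then hands the caller its NaT back.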

And every year at Halloween the NaT-creature comes back to haunt misbehaving
old-farts like us and eats our brains. Boooooo!!!!!!!

Curlsman

18 Nov 2009, 16:41:31
OK, so it was more like Alien, and the NaT is the tongue that darts
out to open a skull.

But Thanks!


Sean

On Nov 18, 8:04 am, "John Reagan" <johnrrea...@earthlink.net> wrote:
> OK children sit down by the campfire and I'll tell you a scary story about
> NaTs!
>
> [holding a flashlight under my chin to illuminate my face...]
>
> There are two NaT-related stories to tell.
>

> <snip>

Jim Duff

19 Nov 2009, 14:39:39

John,

Thanks for confirming what I expected. As soon as I have a valid
traceback, listings, and maps, I'll forward the info through the
appropriate HP channels.

Jim.
--
www.eight-cubed.com

Hein RMS van den Heuvel

19 Nov 2009, 19:27:42
On Nov 18, 10:04 am, "John Reagan" <johnrrea...@earthlink.net> wrote:
> OK children sit down by the campfire and I'll tell you a scary story about
> NaTs!
>
> [holding a flashlight under my chin to illuminate my face...]
:
<snip>

Thank you John! Great stuff.
Gem (sic) replies like this make it all worth my while to wade
through the daily crud too often posted here!

Cheers,
Hein

JF Mezei

19 Nov 2009, 20:46:38
John Reagan wrote:

> The integer registers on Itanium are actually 65 bits wide.


Can we sue Intel for false advertising then ? :-)

> Itanium speculative load feature

And some HP employees had the gall to criticize all the speculation in
comp.os.vms at the same time they were designing Itanium with tons of
speculation in and around it !

Perhaps they should have added an instruction that clears all
speculative bits in all registers, and this could be called before and
after calling any subroutine.

John Reagan

19 Nov 2009, 21:04:02

"JF Mezei" <jfmezei...@vaxination.ca> wrote in message
news:00fef009$0$23230$c3e...@news.astraweb.com...

>
> Perhaps they should have added an instruction that clears all
> speculative bits in all registers, and this could be called before and
> after calling any subroutine.
>

Well then you'd have no preserved state registers across a routine call.

And there are instructions to load/save the entire set of NaT bits. They
live in their own application register with special instructions to access
them. They need to be saved/restored at context swaps, interrupts, etc. The
problem is that they are pretty slow instructions since they interrupt the
pipeline/parallel execution nature of the chip.

Same issue with the setf.sig/getf.sig instructions which move integer
registers to/from the 64-bit mantissa of the floating registers. They are
used quite a bit, but they are slow since it forces the integer unit on the
chip to start talking to the floating unit on the chip. They have different
number of pipeline stages.

John


IanMiller

20 Nov 2009, 04:37:31
On 20 Nov, 02:04, "John Reagan" <johnrrea...@earthlink.net> wrote:
> "JF Mezei" <jfmezei.spam...@vaxination.ca> wrote in message


All of which demonstrates things are even weirder than they first
appear, even for the Itanium architecture which is very strange in
places.

John Reagan

20 Nov 2009, 12:59:56

"IanMiller" <gx...@uk2.net> wrote in message
news:86368d83-374f-4e2c...@j4g2000yqe.googlegroups.com...

Like what? Having the ability to attempt to prefetch memory and hoist
operations out of a loop on the rare chance that the memory doesn't exist
but if it did, you save lots of cycles? Both control and data speculation
on Itanium are among the few things that really do make a potential
difference.

Like what? Slowing the machine down when having two internal functional
units talk to each other? The Alpha FTOI and ITOF instructions are just as
bad.

John


Jeremy Begg

22 Nov 2009, 22:05:14
to John Reagan
Hi John,

John Reagan wrote:
> OK children sit down by the campfire and I'll tell you a scary story about
> NaTs!
>
> [holding a flashlight under my chin to illuminate my face...]

Your explanation makes more sense than the few NaT references I could find in
the Itanium Architecture Manuals. (The ones Sue made us all take home a
couple of years ago :-)

But maybe I just didn't read enough. Interesting stuff. I just hope the
C++ people have a good reason for wanting to use features that cause so much
grief!


Thanks for the gory details.


Jeremy Begg
