I hate FP math...

Michal Necasek

Nov 15, 2003, 9:14:13 PM

Since the release of Open Watcom 1.2 is looming ahead, I've
been running a bunch of regression tests, fixing what I can.
Sadly, when running math library tests, I've run into severe
problems that I probably cannot fix.

I have a very strong suspicion that "recent" OS/2 kernels are
seriously broken WRT floating point math. Some OS/2 API calls
have a tendency to trash the FP control word. This was also
observed by the Mozilla folks.

Here's a little test proggy:

----------------------------------------------------
#include <stdio.h>
#include <math.h>
#include <float.h>

int main( void )
{
    int     cw;
    double  d;
    char    buf[512];

    cw = _control87( PC_24, MCW_PC );   /* Set FPU CW to atypical value */
    cw = _control87( 0, 0 );            /* Read back the FPU CW */
    sprintf( buf, "cw = %#X", cw );
    cw = _control87( 0, 0 );            /* Read/print again to ensure */
    puts( buf );                        /* sprintf() didn't mess with it */
    sprintf( buf, "cw = %#X", cw );
    puts( buf );

    cw = _control87( 0, 0 );            /* See what FPU CW looks like */
    printf( "cw = %#X\n", cw );         /* after we printed it */
    d = atan2( 1.0, DBL_MAX );          /* This may cause underflow xcpt */
    cw = _control87( 0, 0 );            /* Let's see what FPU CW is now */
    printf( "cw = %#X\n", cw );
    printf( "%g\n", d );                /* Print the atan2 result */
    return 0;
}
----------------------------------------------------

When I ran this program on my machine, I observed the following
behaviour:

- In PM sessions, the FP CW gets trashed (set to 0x37F)

- In FS sessions, the FP CW gets trashed too (but set to 0x362)

- Under debugger control, everything works just fine (I hate that!)

- IBM VAC++ 3.08, Open Watcom and BCOS2 2.0 all exhibit this
behaviour (their default FP CW is different, but all get trashed,
and their respective debuggers mask this problem)

- Borland and Watcom produced executables crash in FS because of
this (because they expect FP underflow exceptions to be masked,
which is not the case after the FP CW gets messed up)

- VAC++ produced executables "only" print different results for
the atan2() call in FS/PM sessions.

- Problem did not exist in Warp Connect

- Problem _did_ exist in WSeB GA

Could be interesting if someone tested this on Warp 4 GA and/or
various Warp 4 FP levels.

Michal

dinkmeister

Nov 15, 2003, 9:20:37 PM

suggestion: also try posting to comp.os.os2.bugs to get Scott's attention?

On Sun, 16 Nov 2003 02:14:13 GMT, Michal Necasek wrote:

: Since the release of Open Watcom 1.2 is looming ahead, I've

Ilya Zakharevich

Nov 15, 2003, 9:41:54 PM

[A complimentary Cc of this posting was NOT [per weedlist] sent to
Michal Necasek
<mic...@prodigy.net>], who wrote in article <V3Btb.21079$R92....@newssvr33.news.prodigy.com>:

>
> Since the release of Open Watcom 1.2 is looming ahead, I've
> been running a bunch of regression tests, fixing what I can.
> Sadly, when running math library tests, I've run into severe
> problems that I probably cannot fix.
>
> I have a very strong suspicion that "recent" OS/2 kernels are
> seriously broken WRT floating point math. Some OS/2 API calls
> have a tendency to trash the FP control word. This was also
> observed by the Mozilla folks.

I found (and put workaround into Perl, otherwise overflow exceptions
are frequent) the following scenarios of FP flags changing:

a) Some not-fully understood situations related to PM APIs
(dependent on a video driver and/or PM hooks?);

b) Running INIT/TERM code of certain DLLs; in particular:

b1) depending on used DLLs, FP flags at the start of the
application can be not what is expected; Note that presence
of HOOK DLLs changes the list of DLLs loaded at start time;

b2) DosLoadModule() can change flags;

b3) Creation/destruction of a message queue (un)loads some other
set of HOOK DLLs, thus also may affect the flags.

c) (No workaround possible/easy): some HOOKs are run during a screen
write in a VIO session; they also have a chance [;-)] to modify
the flags; E.g., at least one of the versions of GAMESRVR.DLL (of
Dive) is known to do this.

Hope this helps,
Ilya

Michal Necasek

Nov 15, 2003, 10:04:45 PM

Ilya Zakharevich wrote:

> a) Some not-fully understood situations related to PM APIs
> (dependent on a video driver and/or PM hooks?);
>

This is the real issue.

It just occurred to me that I should test a command line boot... my
WSeB machine does NOT trash the FP CW in command line boot.

Replacing SDD with GENGRADD I can see that the problem still exists
in PM sessions but not in FS - I have a hunch that the latest SNAP
might maybe "fix" this too... just maybe.

So anyway the kernel itself is apparently not the culprit, although
I'd still like to know why running under a debugger should affect
things. This makes it somewhat difficult for me to pinpoint the
problem.

> b) Running INIT/TERM code of certain DLLs; in particular:
>

This is somewhat unavoidable but not likely a show stopper. DLLs
that don't use FP math should not need to touch the FP CW ever.
For those that do it might be trickier, but still the INIT/TERM
code does not get run exactly often. I hate FP math even more now.

Michal

William L. Hartzell

Nov 15, 2003, 10:42:09 PM

Sir:

Michal Necasek wrote:
> Ilya Zakharevich wrote:
>> a) Some not-fully understood situations related to PM APIs
>> (dependent on a video driver and/or PM hooks?);
> This is the real issue.
> It just occurred to me that I should test a command line boot... my
> WSeB machine does NOT trash the FP CW in command line boot.
> Replacing SDD with GENGRADD I can see that the problem still exists
> in PM sessions but not in FS - I have a hunch that the latest SNAP
> might maybe "fix" this too... just maybe.
> So anyway the kernel itself is apparently not the culprit, although
> I'd still like to know why running under a debugger should affect
> things. This makes it somewhat difficult for me to pinpoint the
> problem.
>> b) Running INIT/TERM code of certain DLLs; in particular:
> This is somewhat unavoidable but not likely a show stopper. DLLs
> that don't use FP math should not need to touch the FP CW ever.
> For those that do it might be trickier, but still the INIT/TERM
> code does not get run exactly often. I hate FP math even more now.

Would or could this be related to the use of the floating point
registers to do matrix integer math, i.e. 3DNow?

Bill
Thanks a Million!

Marty

Nov 15, 2003, 10:41:45 PM

Michal Necasek wrote:
> Ilya Zakharevich wrote:
>
>> a) Some not-fully understood situations related to PM APIs
>> (dependent on a video driver and/or PM hooks?);
>>
> This is the real issue.
>
> It just occurred to me that I should test a command line boot... my
> WSeB machine does NOT trash the FP CW in command line boot.
>
> Replacing SDD with GENGRADD I can see that the problem still exists
> in PM sessions but not in FS - I have a hunch that the latest SNAP
> might maybe "fix" this too... just maybe.
>
> So anyway the kernel itself is apparently not the culprit,

Shouldn't the kernel be protecting your application and even driver code
from this? Isn't that part of maintaining the proper ring/context?

You can probably work around it in your code, but should you have to??

Ilya Zakharevich

Nov 16, 2003, 2:39:13 AM

[A complimentary Cc of this posting was NOT [per weedlist] sent to
Michal Necasek

<mic...@prodigy.net>], who wrote in article <hPBtb.21084$K82....@newssvr33.news.prodigy.com>:

> > a) Some not-fully understood situations related to PM APIs
> > (dependent on a video driver and/or PM hooks?);

> This is the real issue.

Well, until proven otherwise, I expect that it is a corollary of "b"
and hook DLLs. Try it under BOOTOS2-created PM session.

> So anyway the kernel itself is apparently not the culprit, although
> I'd still like to know why running under a debugger should affect
> things. This makes it somewhat difficult for me to pinpoint the
> problem.

Could the debugger somehow "hijack" PM initialization, so the HOOK
DLLs are run in the context of the debugger?

> > b) Running INIT/TERM code of certain DLLs; in particular:

> This is somewhat unavoidable but not likely a show stopper.

My experience is otherwise. *All* things I *could* trace lead to that.

> DLLs that don't use FP math should not need to touch the FP CW ever.

Michal, you are writing compilers, how can you be so naive? Of course
it is not the application code which trashes FP; I think it is CRTL
initialization code of these DLLs.

> For those that do it might be trickier, but still the INIT/TERM
> code does not get run exactly often.

True. It looks like some broken CRTL trashes FP flags in *every* entry
point (or those exported from the DLL?).

Hope this helps,
Ilya

Michal Necasek

Nov 16, 2003, 5:24:08 AM

Ilya Zakharevich wrote:

>> This is the real issue.
>
> Well, until proven otherwise, I expect that it is a corollary of "b"
> and hook DLLs. Try it under BOOTOS2-created PM session.
>

The WSeB system I was testing on is a pretty clean install... and
I've made sure that the FP CW corruption is there even with plain
old VGA driver. I suspect it's some part of PM itself that's
doing it.

> Could the debugger somehow "hijack" PM initialization, so the HOOK
> DLLs are run in the context of the debugger?
>

Then the hooks couldn't work, could they? I think this might well
be some obscure bug in the DosDebug API...

> My experience is otherwise. *All* things I *could* trace lead to that.
>

In this case, it's a call to printf() that does it. I really don't
think that loads any DLLs, not even very indirectly. It'll end up
doing a DosWrite to stdout, which in turn will do who knows what
to display the output in a VIO window...

>>DLLs that don't use FP math should not need to touch the FP CW ever.
>
> Michal, you are writing compilers, how can you be so naive?
>

I'm not writing compilers, just maintaining one ;-) Speaking for
(Open) Watcom, it will never touch the FPU if you aren't using any
FP math. This holds true for DLLs as well, AFAICT. There's just
no math code linked in.

If you do use FP math in a DLL however... all bets are off. The
runtime _will_ set the CW to a default value; moreover if any math
exceptions are triggered, the handler is likely to mess with the
control word too.

I think I've just convinced myself that using FP math in a DLL
that's intended to be called by "any" client is probably more
trouble than it's worth.

Michal

Michal Necasek

Nov 16, 2003, 5:25:35 AM

William L. Hartzell wrote:

> Would or could this be related to the use of the floating point
> registers to do matrix integer math, i.e. 3DNow?
>

No. One of the CPUs I was testing on is an ol' PPro, no MMX
or 3DNow in sight. That's not to say MMX apps can't be affected
by this problem.

Michal

Michal Necasek

Nov 16, 2003, 5:27:11 AM

Marty wrote:

> Shouldn't the kernel be protecting your application and even driver code
> from this?
>

Maybe the kernel does! I would expect that the problem is entirely
with Ring 3 code. There's an awful lot of stuff that runs in R3 ;-)

> You can probably work around it in your code, but should you have to??
>

Certainly not.

Michal

Meinolf Sondermann

Nov 16, 2003, 6:33:08 AM

Michal Necasek wrote:
> Ilya Zakharevich wrote:
>
>> a) Some not-fully understood situations related to PM APIs
>> (dependent on a video driver and/or PM hooks?);
>>
> This is the real issue.
>
> It just occurred to me that I should test a command line boot... my
> WSeB machine does NOT trash the FP CW in command line boot.

You should test if it is the PM or the WPS. Run your tests with
RUNWORKPLACE set to some other exec.

>
> Replacing SDD with GENGRADD I can see that the problem still exists
> in PM sessions but not in FS - I have a hunch that the latest SNAP
> might maybe "fix" this too... just maybe.

Does the problem exist with a non-GRADD video driver?

[...]

--
Bye/2
Meinolf

Scott

Nov 16, 2003, 10:37:24 AM

1. The kernel does do a save and restore of FP regs on every context switch.
2. I don't know about the debugger, per se -- that would take some effort.
3. The problem you are having is a variation of a typical library (in this
   case PM) usage issue. The code that's changing your FP state is, in fact,
   running as part of your thread. I agree that it is a really crummy
   situation, but, in fact, FP on x86 has always been poorly done, and was
   made worse by some interesting decisions back in the DOS days.
   There is little or nothing we can do at this point.

I think that fairly recent PMMERGEs might have a SET GRENOFLOAT=TRUE feature
that reduces FP usage in the graphics engine. Otherwise, you're pretty much SOL.
-Scott

eric w

Nov 16, 2003, 12:29:25 PM

On Sun, 16 Nov 2003 15:37:24 UTC, "Scott" <sgarf...@sbcglobal.bogus.net>
wrote:

> 1. The kernel does do a save and restore of FP regs on every context switch

they suspect this is a GCC compiler bug according to bugzilla.

the problem arose when MOZILLA switched from VAC++ to GCC .

http://bugzilla.mozilla.org/show_bug.cgi?id=224487

...eric

Al Savage

Nov 16, 2003, 12:46:45 PM

On Sun, 16 Nov 2003 17:29:25 UTC, "eric w" <er...@nospam.net> wrote:

> On Sun, 16 Nov 2003 15:37:24 UTC, "Scott" <sgarf...@sbcglobal.bogus.net>
> wrote:
>
> > 1. The kernel does do a save and restore of FP regs on every context switch
>
> they suspect this is a GCC compiler bug according to bugzilla.
>
> the problem arose when MOZILLA switched from VAC++ to GCC .

So, by implication, there is a similar FP Control Word-trashing bug in
the Open Watcom project's compiler (which is what Michal is working
with)?

--
Regards,
Al S.

eric w

Nov 16, 2003, 1:45:22 PM

perhaps something on his system has been compiled with the gcc compiler.

Ilya Zakharevich

Nov 16, 2003, 1:57:39 PM

[A complimentary Cc of this posting was NOT [per weedlist] sent to
Michal Necasek

<mic...@prodigy.net>], who wrote in article <cfItb.30132$Jg5....@newssvr31.news.prodigy.com>:

> > My experience is otherwise. *All* things I *could* trace lead to that.
> >
> In this case, it's a call to printf() that does it. I really don't
> think that loads any DLLs, not even very indirectly. It'll end up
> doing a DosWrite to stdout, which in turn will do who knows what
> to display the output in a VIO window...

This was exactly the situation in which I first debugged FP flag corruption.
Read my initial reply: VIO window writes trigger some HOOK DLLs (when
I debugged it, it was GAMESRVR in UserIni[PM_ED_HOOKS]).

[It probably (?) is not INIT/TERM, but still an operation of a HOOK DLL.]

> > Michal, you are writing compilers, how can you be so naive?

> I'm not writing compilers, just maintaining one ;-)

Then you must know how many unobvious things can go wrong...

> Speaking for (Open) Watcom, it will never touch the FPU if you
> aren't using any FP math. This holds true for DLLs as well,
> AFAICT. There's just no math code linked in.

Good to hear. Apparently, IBM uses some *other* compiler. ;-)

> If you do use FP math in a DLL however... all bets are off. The
> runtime _will_ set the CW to a default value; moreover if any math
> exceptions are triggered, the handler is likely to mess with the
> control word too.

AFAIK, EMX will never mess with FP flags (only an explicit
_control87() from the application can do it).

> I think I've just convinced myself that using FP math in a DLL
> that's intended to be called by "any" client is probably more
> trouble than it's worth.

Only if the compiler is broken. (But the DLL should document the
expected values of FP flags.)

Hope this helps,
Ilya

Michal Necasek

Nov 16, 2003, 2:17:31 PM

eric w wrote:

> they suspect this is a GCC compiler bug according to bugzilla.
>

No, they do not. There is a bug in gcc that prevents the
suggested workaround from functioning properly, but it's not gcc
code that's trashing the FP CW.

> the problem arose when MOZILLA switched from VAC++ to GCC .
>

I can easily demonstrate that VAC++ compiled programs have the
exact same problem.

Michal

eric w

Nov 16, 2003, 2:23:21 PM

ok.

ain't debugging compilers fun!

takes me back many years...

...eric

Michal Necasek

Nov 16, 2003, 2:32:17 PM

Ilya Zakharevich wrote:

> [It probably (?) is not INIT/TERM, but still an operation of a HOOK DLL.]
>

Aha! ;-)

>> I'm not writing compilers, just maintaining one ;-)
>
> Then you must know how many unobvious things can go wrong...
>

Oh yes. Writing DLLs is tricky at the best of times. So's FP
math. Combine the two and it's a serious headache.

> Good to hear. Apparently, IBM uses some *other* compiler. ;-)
>

Most likely! ;-)

> AFAIK, EMX will never mess with FP flags (only an explicit
> _control87() from the application can do it).
>

Are you sure? How can that work? I don't see how EMX could ever
deliver consistent math results without setting the FP CW to a known
value.

>> I think I've just convinced myself that using FP math in a DLL
>>that's intended to be called by "any" client is probably more
>>trouble than it's worth.
>
> Only if the compiler is broken.
>

I think your definition of "broken" is "it's not EMX" :-P

Michal

Michal Necasek

Nov 16, 2003, 2:48:12 PM

Scott wrote:

> 2. I don't know about the debugger, per se -- that would take some effort.
>

Do you have any idea why it might happen? This issue really bothers me
(in a way more than the actual corruption). I just have no idea how the
debugger could do it.

> 3. The problem you are having is a variation of a typical library (in this
> case PM) usage issue. The code that's changing your FP state is, in fact,
> running as part of your thread.
>

That's certainly what it looks like.

> There is little or nothing we can do at this point.
>

There might be if I could track down what exactly is causing the
corruption...

But I concur that FP math is pretty screwed up on x86.

> I think that fairly recent PMMERGEs might have a SET GRENOFLOAT=TRUE feature
> that reduces FP usage in the graphics engine.
>

Thanks, I think I'll try that. What does "fairly recent" mean BTW?

Michal

Al Savage

Nov 16, 2003, 3:21:55 PM

On Sun, 16 Nov 2003 18:45:22 UTC, "eric w" <er...@nospam.net> wrote:

> > So, by implication, there is a similar FP Control Word-trashing bug in
> > the Open Watcom project's compiler (which is what Michal is working
> > with)?
>
> perhaps something on his system has been compiled with the gcc compiler.

???
Michal's testcase code is what he's using to generate the failure, and
he's not using GCC to compile it, he's using Open Watcom (I think).

Or, do you mean that there is some other, running app (compiled by GCC),
that is causing his testcase to present FPCW corruption?

--
Regards,
Al S.

eric w

Nov 16, 2003, 3:40:02 PM

my 1st thoughts were he was using gcc to generate the watcom compiler
(horrors);

OR perhaps SNAP used GCC to generate the versions we are running;

but it sounds like we have some VERY talented people all looking into this bug
& I aver it will be solved by the end of the week!

cheers...
...eric

Michal Necasek

Nov 16, 2003, 4:22:50 PM

eric w wrote:

> my 1st thoughts were he was using gcc to generate the watcom compiler
> (horrors);
>

It should be possible (OW can be built with gcc on Linux), but what'd
be the point?

Anyway I repeat, apps built with VAC++ have the same problem. The
trick is that:

- VAC++ apps appear to have a default math exception handler that
masks the problem; Watcom, Borland and gcc apps do not handle FP
exceptions when they aren't expecting them. NB: VAC++ apps will still
deliver inconsistent math results.

- The code trashing the FP CW was likely built with VAC++, hence the
trashed FP CW is a lot like what VAC++ apps would normally use anyway.

> OR perhaps SNAP used GCC to generate the versions we are running;
>

It didn't, because gcc is not very suitable for low level OS/2
development (SNAP for OS/2 is built almost completely with VAC++ 3.08,
except for all the actual driver code which is built with Watcom).

At any rate the corruption was occurring even when SNAP wasn't being
used at all, with plain VGA driver.

Michal

Paul Ratcliffe

Nov 16, 2003, 5:43:08 PM

On Sun, 16 Nov 2003 02:14:13 GMT, Michal Necasek <mic...@prodigy.net> wrote:

> I have a very strong suspicion that "recent" OS/2 kernels are
> seriously broken WRT floating point math. Some OS/2 API calls
> have a tendency to trash the FP control word. This was also
> observed by the Mozilla folks.

FWIW, here are my results with the .EXE built using VAC++ 3.08:

On Warp 4 FP15, Matrox 2.21.063 drivers, 14.096e kernel, Full screen:
cw = 0X62
cw = 0X62
cw = 0X62
cw = 0X62
0

On Warp 4 FP15, Matrox 2.21.063 drivers, 14.096e kernel, windowed and
on eCS 1.1 no fixes, SDD, 14.093 kernel, windowed:
cw = 0X62
cw = 0X62
cw = 0X37F
cw = 0X37F
5.56268e-309

On eCS 1.1 no fixes, SDD, 14.093 kernel, Full screen:
cw = 0X62
cw = 0X62
cw = 0X362
cw = 0X362
0
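
For reading the values above, here is a minimal sketch of how a raw x87 control word splits into fields. It assumes the numbers printed by the test program follow the hardware layout (exception masks in bits 0-5, precision control in bits 8-9, rounding control in bits 10-11); whether every compiler's _control87() returns the raw word is an assumption, and the names decode_cw, underflow_masked and print_cw are made up for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper type: the fields of a raw x87 control word. */
struct x87_cw {
    unsigned masked;     /* bits 0-5: 1 = exception masked (IM DM ZM OM UM PM) */
    unsigned precision;  /* bits 8-9: 0 = 24-bit, 2 = 53-bit, 3 = 64-bit */
    unsigned rounding;   /* bits 10-11: 0 = nearest, 1 = down, 2 = up, 3 = chop */
};

/* Split a control word into its fields (assumes the hardware layout). */
struct x87_cw decode_cw( unsigned cw )
{
    struct x87_cw f;

    f.masked    = cw & 0x3Fu;
    f.precision = ( cw >> 8 ) & 0x3u;
    f.rounding  = ( cw >> 10 ) & 0x3u;
    return f;
}

/* Nonzero if the underflow exception (bit 4) is masked. */
int underflow_masked( unsigned cw )
{
    return ( cw & 0x10u ) != 0;
}

/* Print one control word in decoded form. */
void print_cw( unsigned cw )
{
    static const char *names[6] = { "IM", "DM", "ZM", "OM", "UM", "PM" };
    struct x87_cw f = decode_cw( cw );
    int i;

    printf( "cw = %#X: PC=%u RC=%u masked:", cw, f.precision, f.rounding );
    for( i = 0; i < 6; ++i ) {
        if( f.masked & ( 1u << i ) ) {
            printf( " %s", names[i] );
        }
    }
    printf( "\n" );
}
```

Under that assumption, 0x37F would mean all six exceptions masked with 64-bit precision, while 0x362 and 0x62 would leave invalid, zero-divide, overflow and underflow unmasked. That would be consistent with executables faulting on an underflow they expected to be masked.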

Michal Necasek

Nov 16, 2003, 10:07:41 PM

Paul Ratcliffe wrote:

> FWIW, here are my results with the .EXE built using VAC++ 3.08:
>
> On Warp 4 FP15, Matrox 2.21.063 drivers, 14.096e kernel, Full screen:
> cw = 0X62
> cw = 0X62
> cw = 0X62
> cw = 0X62
> 0
>

> On eCS 1.1 no fixes, SDD, 14.093 kernel, Full screen:
> cw = 0X62
> cw = 0X62
> cw = 0X362
> cw = 0X362
> 0
>

I think this is A Clue. I just don't know what it means yet.

What happens is that SDD drags in a bunch of PM DLLs into FS processes.
This was never really intended, it just happened to work. Until recently,
when it started causing deadlocks after VIDEOPMI got updated... to cut
the story short, the latest version of SNAP does not do this anymore, and
I suspect (and hope) that the FP CW corruption won't be there in FS. Need
to check this.

I think I'll play with KDB for a bit tomorrow - that is, if I figure
out how the heck I can examine FP registers in KDB! Maybe ICAT can do
it, hmm...

Michal

Ilya Zakharevich

Nov 16, 2003, 10:25:53 PM

[A complimentary Cc of this posting was NOT [per weedlist] sent to
Michal Necasek

<mic...@prodigy.net>], who wrote in article <5hQtb.21258$Da7....@newssvr33.news.prodigy.com>:

> > AFAIK, EMX will never mess with FP flags (only an explicit
> > _control87() from the application can do it).

> Are you sure? How can that work? I don't see how EMX could ever
> deliver consistent math results without setting the FP CW to a known
> value.

It is set to a known value by a startup code. Then it is the
responsibility of the application. If *the application* does not
change flags, the flag stays the same.

Personally, I do not see any "trickyness" related to DLLs and/or FP
math. DLLs are just executables using somebody else's stack; since
the structure of the stack does not matter 99.999% of the time, this
is irrelevant for most purposes.

Unless you do *very deep* FP stuff (e.g., you need to run your
calculation twice with different FP flags - to check the stability of
the algorithm), you just do not muck with FP flags, period.

And no function should have FP flag different on return() (comparing
to the entry) - with the obvious exception of the function the purpose
of which is to change FP flags.

> >> I think I've just convinced myself that using FP math in a DLL
> >>that's intended to be called by "any" client is probably more
> >>trouble than it's worth.

> > Only if the compiler is broken.

> I think your definition of "broken" is "it's not EMX" :-P

EMX is unsupported. While this continues, it is as broken as the rest
(or more). This may be easily changed - but apparently we do not have
enough interested people...

Hope this helps,
Ilya

Ilya Zakharevich

Nov 16, 2003, 10:31:56 PM

[A complimentary Cc of this posting was NOT [per weedlist] sent to
Michal Necasek

<mic...@prodigy.net>], who wrote in article <f3Qtb.21251$u47....@newssvr33.news.prodigy.com>:

> > they suspect this is a GCC compiler bug according to bugzilla.

> No, they do not. There is a bug in gcc that prevents the
> suggested workaround from functioning properly

Could you provide a pointer, please? I mostly use infinite-precision
calculators, for which FP does not matter much, but I would like to
analyse this anyway...

> > the problem arose when MOZILLA switched from VAC++ to GCC .

> I can easily demonstrate that VAC++ compiled programs have the
> exact same problem.

I would suspect that VAC++ may have its own trashing of FP flags -
which might have masked the trashing from other components (e.g., OS/2
"CORE"). Since GCC is (AFAIK) FP-flags transparent, any change done
by the other components is *immediately* visible.

Ilya

Ilya Zakharevich

Nov 16, 2003, 10:34:47 PM

[A complimentary Cc of this posting was NOT [per weedlist] sent to
Michal Necasek

<mic...@prodigy.net>], who wrote in article <0wQtb.21259$gg7....@newssvr33.news.prodigy.com>:

> But I concur that FP math is pretty screwed up on x86.

Again, could you be more detailed here? AFAIK, x86 (can be made)/is
sufficiently close to IEEE for most purposes...

Thanks,
Ilya

eric w

Nov 16, 2003, 10:51:28 PM

On Mon, 17 Nov 2003 03:34:47 UTC, Ilya Zakharevich <nospam...@ilyaz.org>
wrote:

i think he refers to Intel's implementation at the hardware level & how the
machine instructions interact.

eric w

Nov 16, 2003, 10:53:16 PM

On Mon, 17 Nov 2003 03:31:56 UTC, Ilya Zakharevich <nospam...@ilyaz.org>
wrote:

> > > they suspect this is a GCC compiler bug according to bugzilla.

>
> > No, they do not. There is a bug in gcc that prevents the
> > suggested workaround from functioning properly
>
> Could you provide a pointer, please? I mostly use infinite-precision
> calculators, for which FP does not matter much, but I would like to
> analyse this anyway...

http://bugzilla.mozilla.org/show_bug.cgi?id=224487

Michal Necasek

Nov 17, 2003, 12:56:39 AM

Ilya Zakharevich wrote:

> Again, could you be more detailed here? AFAIK, x86 (can be made)/is
> sufficiently close to IEEE for most purposes...
>

I'm not talking about the FPU itself - more like the practical
implementation in a complex multitasking environment. This is probably
for the most part a throwback to the fact that the FPU was initially
a separate chip. It shows.

Michal

Michal Necasek

Nov 17, 2003, 1:07:36 AM

Ilya Zakharevich wrote:

> It is set to a known value by a startup code. Then it is the
> responsibility of the application. If *the application* does not
> change flags, the flag stays the same.
>

Oh, OK, so EMX _does_ "mess" with the FP CW. Just like the other
compilers do - set it to a known default during program startup. The
problem is that different compilers have different defaults, and
this naturally also explains why running init code of foreign DLLs
tends to trash the FP CW.

> Personally, I do not see any "trickyness" related to DLLs and/or FP
> math. DLLs are just executables using somebody else's stack; since
> the structure of the stack does not matter 99.999% of the time, this
> is irrelevant for most purposes.
>

That is very seriously untrue. DLLs do not just use someone else's
stack; they also use someone else's memory, file handles and just about
every other resource. Including the FPU state.

> Unless you do *very deep* FP stuff (e.g., you need to run your
> calculation twice with different FP flags - to check the stability of
> the algorithm), you just do not muck with FP flags, period.
>

Yes, that's certainly true. That's why we're having this discussion
in the first place - apps don't expect to have the FP CW changed on them
behind their back, because they're not resetting it all the time!

> And no function should have FP flag different on return() (comparing
> to the entry) - with the obvious exception of the function the purpose
> of which is to change FP flags.
>

I completely agree. Obviously this is not true for some (as yet unknown)
function(s)/API(s).

Practical question: is there some trace utility that would allow me to
track the FPU state across OS/2 API calls? Seeing as debuggers (normally
my favourite tool) are useless in this particular instance...

> EMX is unsupported. While this continues, it is as broken as the rest
> (or more). This may be easily changed - but apparently we do not have
> enough interested people...
>

Heh, don't tell me about it :-)

Michal

Michal Necasek

Nov 17, 2003, 2:21:39 AM

Michal Necasek wrote:

> I think this is A Clue. I just don't know what it means yet.
>

I think I have Another Clue. In desperation, I tried to run
OS2TRACE on my test proggy. Only I don't think OS2TRACE can capture
the FPU state, so it doesn't help me much in this case.

Except... the OS2TRACE-instrumented executable suddenly doesn't get
its FP CW trashed! What's up with that?!?

Even stranger, it doesn't matter that the APIs that I suspect
of trashing the FP CW (DosWrite) aren't traced at all. What the
heck could OS2TRACE be doing to affect this?

Curiouser and curiouser...

eric w

Nov 17, 2003, 2:47:00 AM

almost sounds like interrupt timing dependencies.

has anyone tried running on different chips; like athlon vs. pentium???

...eric

Ilya Zakharevich

Nov 17, 2003, 5:38:13 AM

[A complimentary Cc of this posting was NOT [per weedlist] sent to
Michal Necasek

<mic...@prodigy.net>], who wrote in article <IAZtb.21613$lr1....@newssvr33.news.prodigy.com>:

> > It is set to a known value by a startup code. Then it is the
> > responsibility of the application. If *the application* does not
> > change flags, the flag stays the same.

> Oh, OK, so EMX _does_ "mess" with the FP CW.

Nope.

> Just like the other
> compilers do - set it to a known default during program startup. The
> problem is that different compilers have different defaults, and
> this naturally also explains why running init code of foreign DLLs
> tends to trash the FP CW.

AFAIK, EMX DLL startup code is not doing anything with the FP flags.
So it is not "messing" with flags at all: only the startup code of EMX
*executables* changes anything. Checking... Actually, right now I
cannot find any code which *calls* _control87()... I'm pretty sure I
saw some when I was debugging these problems (last time 3 years ago,
when TCPIP32.DLL turned out to trash the flags on load), but today I
can't...

> That is very seriously untrue. DLLs do not just use someone else's
> stack; they also use someone else's memory, file handles and just about
> every other resource. Including the FPU state.

This is not different from having the code statically loaded; so what
is your point? I singled out the stack since it is mapped
"essentially differently" with different compilers...

> > And no function should have FP flag different on return() (comparing
> > to the entry) - with the obvious exception of the function the purpose
> > of which is to change FP flags.

> I completely agree. Obviously this is not true for some (as yet unknown)
> function(s)/API(s).

I think the only thing we may hope for is to get the list of these
functions, and put wrappers in c.lib. Or maybe Watcom with its smart
calling-convention handlers can generate wrappers for "calling
convention which trashes FP flags" automatically?

However, even in this best scenario I have no idea which entity can
generate such a list. :-(
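
A wrapper of the kind suggested above might look roughly like this. It is only a sketch: cw_read() and cw_write() are hypothetical stand-ins backed by a plain variable so that the code compiles anywhere, and with a real compiler they would presumably map onto _control87(); foreign_call() is an invented example of a routine that leaves the control word changed.

```c
/*
 * Stand-ins so the sketch builds anywhere: a plain variable plays the
 * role of the FPU control word.  With a real compiler these two would
 * presumably be thin wrappers around _control87() (an assumption).
 */
static unsigned fake_cw = 0x37F;

unsigned cw_read( void )         { return fake_cw; }
void     cw_write( unsigned cw ) { fake_cw = cw; }

/* Invented example of a routine that returns with a different CW. */
void foreign_call( void )
{
    cw_write( 0x362 );
}

/*
 * Call fn() and restore the caller's control word if fn() changed it.
 * Returns nonzero when a restore was needed.
 */
int call_preserving_cw( void (*fn)( void ) )
{
    unsigned saved = cw_read();

    fn();
    if( cw_read() != saved ) {
        cw_write( saved );
        return 1;
    }
    return 0;
}
```

The wrapper itself is trivial; the hard part remains knowing which functions need one.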

> Practical question: is there some trace utility that would allow me to
> track the FPU state across OS/2 API calls? Seeing as debuggers (normally
> my favourite tool) are useless in this particular instance...

I do not remember any problem debugging this with sd386.

BTW, can't you modify the calling-convention part of Watcom to
checkpoint FP flags on entry and return, and complain on a discrepancy?

> > EMX is unsupported. While this continues, it is as broken as the rest
> > (or more). This may be easily changed - but apparently we do not have
> > enough interested people...

> Heh, don't tell me about it :-)

This was not directed to you, you have enough problems in your pocket
anyway ;-). But I still hope that if I make reminders every couple of
months, enough people will realize that it is going to be a time-saver
for them too to fix problems globally, and not per-application...

[And *several* people are required to get peer-review for QA purposes.]

Yours,
Ilya
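
The checkpoint idea from this post (record the FP control word before a
call, compare it afterwards, complain on a discrepancy) could look roughly
like the sketch below. It is only an illustration under stated assumptions:
the control word is simulated by an ordinary variable so the code runs
anywhere, and read_cw(), trashing_call(), clean_call() and checked_call()
are hypothetical names, not part of any compiler runtime. On OS/2,
read_cw() would presumably be a thin wrapper around _control87( 0, 0 ).

```c
#include <stdio.h>

/* Simulated FPU control word (assumption: stands in for the real one). */
static unsigned sim_cw = 0x62;

/* Hypothetical reader; on OS/2 this would wrap _control87( 0, 0 ). */
unsigned read_cw(void) { return sim_cw; }

/* Hypothetical misbehaving API: leaves the control word at 0x37F. */
void trashing_call(void) { sim_cw = 0x37F; }

/* Hypothetical well-behaved API: does not touch the control word. */
void clean_call(void) { }

/* Call fn and compare the control word before and after.
   Returns 1 and prints a complaint on a discrepancy, 0 otherwise. */
int checked_call(const char *name, void (*fn)(void))
{
    unsigned before = read_cw();
    unsigned after;

    fn();
    after = read_cw();
    if (before != after) {
        fprintf(stderr, "%s: FP CW changed %#X -> %#X\n", name, before, after);
        return 1;
    }
    return 0;
}
```

Wrapping each suspect API call in checked_call() would name the call that
changes the control word, though not what happens inside it.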

Ilya Zakharevich

Nov 17, 2003, 5:45:46 AM

[A complimentary Cc of this posting was sent to
eric w
<er...@nospam.net>], who wrote in article <vSrfmdoFuNkL-p...@nospam.nospam.net>:

> > > No, they do not. There is a bug in gcc that prevents the
> > > suggested workaround from functioning properly

> http://bugzilla.mozilla.org/show_bug.cgi?id=224487

- Does not optimize jsnum.c. Due to an optimizer bug in GCC, the _control87()
call was getting reordered after the floating point calls, resulting in a
crash.

Is this actually a bug? I do not remember which function calls may be
reordered; the common sense says that 'none' - but I think that it
were the floating point calls which were reordered before
_control87(), not _control87() after them ;-).

Do I need to change

unsigned fpflag = _control87(0,0);

to

volatile unsigned fpflag = _control87(0,0);

in all of my code??? Should this help?

Ilya

Marc L. Cohen

Nov 17, 2003, 11:41:53 AM

Michal Necasek wrote:
> Ilya Zakharevich wrote:
>
>> a) Some not-fully understood situations related to PM APIs
>> (dependent on a video driver and/or PM hooks?);
>>
> This is the real issue.
>
> It just occurred to me that I should test a command line boot... my
> WSeB machine does NOT trash the FP CW in command line boot.
>
> Replacing SDD with GENGRADD I can see that the problem still exists
> in PM sessions but not in FS - I have a hunch that the latest SNAP
> might maybe "fix" this too... just maybe.
>
> So anyway the kernel itself is apparently not the culprit, although
> I'd still like to know why running under a debugger should affect
> things. This makes it somewhat difficult for me to pinpoint the
> problem.
>

The graphics subsystem has been using floating point math for several
years now. I don't know that it was implemented to save/restore the
state on entry/exit. Unfortunately, at the moment I don't seem to have
access to the code trees, so I can't check.

>> b) Running INIT/TERM code of certain DLLs; in particular:
>>
> This is somewhat unavoidable but not likely a show stopper. DLLs
> that don't use FP math should not need to touch the FP CW ever.
> For those that do it might be trickier, but still the INIT/TERM
> code does not get run exactly often. I hate FP math even more now.
>
>
> Michal
>

Michal Necasek

Nov 17, 2003, 12:30:39 PM

Ilya Zakharevich wrote:

> AFAIK, EMX DLL startup code is not doing anything with the FP flags.
>

If that is true then how can EMX DLLs deliver consistent math results?

>> That is very seriously untrue. DLLs do not just use someone else's
>>stack; they also use someone else's memory, file handles and just about
>>every other resource. Including the FPU state.
>
> This is not different from having the code statically loaded; so what
> is your point?
>

The point is that if you write DLL that _other people_ will use, you
have to be very, very careful. A DLL that only your own app uses is
indeed much the same as statically linked code. But as soon as you let
other people load your DLL, suddenly a whole lot of assumptions may
be invalid. You have to play nice.

> I singled out the stack since it is mapped
> "essentially differently" with different compilers...
>

In what sense?

> I think the only thing we may hope for is to get the list of these
> functions, and put wrappers in c.lib. Or maybe Watcom with its smart
> calling-convention handlers can autogenerate wrappers for "calling
> convention which trashes FP flags" automatically?
>

I see no existing provisions for that - it is expected that whoever
changes the FPU state knows what they're doing ;-)

> BTW, can't you modify the calling-convention part of Watcom to
> checkpoint FP flags on entry and return, and complain on a discrepancy?
>

I could, but it'd take me a while to figure out where the problem is.
Besides, it'll tell me that it's DosWrite that's doing it - and I know
that already. I need something that will give me more precise
information.

>> Heh, don't tell me about it :-)
>
> This was not directed to you, you have enough problems in your pocket
> anyway ;-)
>

I know you didn't direct it to me, I just know the situation too well :-)

FYI I ran my test program under sd386 and as I expected, it behaved
just like the 3 other debuggers I tried before - the problem didn't show
up. I really wish I knew why.

Michal
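
The "play nice" point in this post can be made concrete for the FP control
word: a library routine that needs particular FPU settings can save the
caller's value, install its own, and put the caller's value back before
returning. The sketch below is an assumption-laden illustration, not code
from any of the runtimes discussed: the control word is simulated by a
plain variable, and read_cw(), write_cw(), LIB_CW and lib_half() are
hypothetical names. On OS/2 the two helpers would presumably be built on
_control87().

```c
/* Simulated FPU control word (assumption: stands in for the real one). */
static unsigned sim_cw = 0x62;

/* Hypothetical accessors; on OS/2 these would wrap _control87(). */
unsigned read_cw(void)         { return sim_cw; }
void     write_cw(unsigned cw) { sim_cw = cw; }

/* Control word this hypothetical library wants while it computes. */
#define LIB_CW 0x37Fu

/* Library entry point: the caller's control word is the same on return
   as it was on entry, whatever the library used in between. */
double lib_half(double x)
{
    unsigned saved = read_cw();   /* remember the caller's settings */
    double   result;

    write_cw(LIB_CW);             /* switch to the library's settings */
    result = x * 0.5;             /* the actual FP work */
    write_cw(saved);              /* restore before returning */
    return result;
}
```

The same pattern would apply to DLL INIT/TERM code, which is where several
of the problems in this thread appear to originate.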

Daniela Engert

Nov 17, 2003, 12:33:20 PM

Michal Necasek wrote:

> I completely agree. Obviously this is not true for some (as yet unknown)
> function(s)/API(s).
>
> Practical question: is there some trace utility that would allow me to
> track the FPU state across OS/2 API calls? Seeing as debuggers (normally
> my favourite tool) are useless in this particular instance...

OS/2 has a built-in trace facility which does exactly what you are
asking for (the mysterious files in \OS2\SYSTEM\TRACE are precompiled
trace hooks). For information about that study the Debugging Handbooks
(part of the Redbook series). You may also use GoldenCode's Trace
utilities which can do the same remotely and have a nice UI.

Ciao,
Dani

eric w

Nov 17, 2003, 12:48:32 PM

On Mon, 17 Nov 2003 17:33:20 UTC, Daniela Engert <dani%ngr...@nospam.de>
wrote:

> OS/2 has a built-in trace facility which does exactly what you are
> asking for (the mysterious files in \OS2\SYSTEM\TRACE are precompiled
> trace hooks). For information about that study the Debugging Handbooks
> (part of the Redbook series).

amazingly these are still available online (hardcopy costs a few $$$, are much
easier to work with though).

> http://publib-b.boulder.ibm.com/cgi-bin/searchsite.cgi?query=os/2+debugging

...eric

Ted Edwards

Nov 17, 2003, 12:57:56 PM

Sorry, I'm coming in on this in the middle but:

> I have a very strong suspicion that "recent" OS/2 kernels
> are seriously broken WRT floating point math.

I suggest you look elsewhere since I seriously doubt this. I have been
a heavy user of APL since 1967, IBM's APL2 for DOS since the mid
'80s and APL2 for OS/2 for about ten years now and have never seen any
evidence of this.

BTW, most of my APL2 stuff is pretty heavy duty math.

My C has a 15 year accumulation of rust, but it looks like you are
messing about with Arctan 1:

16{fmt}(180{div}{circ}1){times}{neg}3{circ}1
45.0000000000000000

Ted

Marc L. Cohen

Nov 17, 2003, 12:48:11 PM

Marc L. Cohen wrote:

> The graphics subsystem has been using floating point math for several
> years now. I don't know that it was implemented to save/restore the
> state on entry/exit. Unfortunately, at the moment I don't seem to have
> access to the code trees, so I can't check.
>

Well, I can't find anywhere where the GRE is modifying
(saving/restoring) the FP CW.

eric w

Nov 17, 2003, 1:32:21 PM

is your os/2 system up to date (fixpaks, kernels, video drivers) ???

...eric

Scott G.

Nov 17, 2003, 2:35:51 PM

On Mon, 17 Nov 2003 18:33:20 +0100, Daniela Engert wrote:

>> Practical question: is there some trace utility that would allow me to
>> track the FPU state across OS/2 API calls? Seeing as debuggers (normally
>> my favourite tool) are useless in this particular instance...
>OS/2 has a built-in trace facility which does exactly what you are asking for

Actually, I don't think there's any way in the trace code to dump out the
floating point registers, etc.
OS/2 RAS tools are sadly deficient in the FP area.

Ilya Zakharevich

Nov 17, 2003, 2:46:12 PM

[A complimentary Cc of this posting was NOT [per weedlist] sent to
Michal Necasek

<mic...@prodigy.net>], who wrote in article <3B7ub.21782$IW2....@newssvr33.news.prodigy.com>:

> > AFAIK, EMX DLL startup code is not doing anything with the FP flags.

> If that is true then how can EMX DLLs deliver consistent math results?

Sorry, I have no idea what you are talking about. What is "consistent
math results"? [I would consider a program which ignores "my" calls
to _control87() broken.]

> >> That is very seriously untrue. DLLs do not just use someone else's
> >>stack; they also use someone else's memory, file handles and just about
> >>every other resource. Including the FPU state.

> > This is not different from having the code statically loaded; so what
> > is your point?

> The point is that if you write DLL that _other people_ will use, you
> have to be very, very careful. A DLL that only your own app uses is
> indeed much the same as statically linked code. But as soon as you let
> other people load your DLL, suddenly a whole lot of assumptions may
> be invalid. You have to play nice.

I repeat: I see no difference with statically loaded code. Can you be
more specific?

> > I singled out the stack since it is mapped
> > "essentially differently" with different compilers...

> In what sense?

GUARD pages, for one. Having stack moved between calls, for another
one (not compiler-specific - if you do not call REXX a compiler... ;-).
The default stack size (64K vs 8M) for the third...

Yours,
Ilya

James J. Weinkam

Nov 17, 2003, 7:19:23 PM

I think Ted is running W4 FP15, but I am running MCP1 C004 with the kernel
dated 2003/08/21 and I get the same result as Ted did.

On the other hand, when APL/2 for OS/2 is running there are at least three
active components: The shared variable processor, the interpreter, and the
session manager. The application is multi threaded and I don't think the
interpreter (where all the floating point math takes place) does any IO. As I
understand it, IO is done either in the session manager or in various APs
which communicate through shared variables and run in separate threads. Since
previous posters have said that the problem only occurs when operations such
as DosWrite are executed as part of the same thread, I think APL might be
immune to the problem. Perhaps David Liebtag or Nancy Wheeler could comment
on this if they are reading this thread.

Michal Necasek

Nov 17, 2003, 7:45:30 PM

Ilya Zakharevich wrote:

>> If that is true then how can EMX DLLs deliver consistent math results?
>
> Sorry, I have no idea what you are talking about. What is "consistent
> math results"? [I would consider a program which ignores "my" calls
> to _control87() broken.]
>

What I mean is, if someone changes - or not - the FP CW on you, and
changes the FPU precision, rounding etc., then you can't get
consistent results.

> I repeat: I see no difference with statically loaded code. Can you be
> more specific?
>

We seem to be talking about different things. From the OS
perspective, sure, code in a DLL runs the same way code in an EXE
does.

The difference that I can see is from the perspective of the DLL
author, and especially so if it is a DLL that will be loaded and used
by several apps at the same time. You can't just willy nilly allocate
memory, open files etc. in the DLL, because those resources belong
to the executable.

Another issue (which you mention) is stack size - you don't know how
much there will be, and using up gobs of stack space in a DLL is
likely to cause trouble.

> >> In what sense?
>
> GUARD pages, for one.
>

Guard pages only apply to secondary threads AFAIK? Anyway yes, if
the stack is sparse, you have to be careful. Does emx gcc handle
automatic stack growing like the IBM or Watcom compilers do?

> The default stack size (64K vs 8M) for the third...
>

But that is controlled by the process that loads the DLL?? It could
be even (much) less or even more...

Michal

Michal Necasek

Nov 17, 2003, 7:51:08 PM

Daniela Engert wrote:

> OS/2 has a built-in trace facility which does exactly what you are
> asking for (the mysterious files in \OS2\SYSTEM\TRACE are precompiled
> trace hooks). For information about that study the Debugging Handbooks
> (part of the Redbook series).
>

Dani, I am not a _complete_ idiot. I have used the trace in the past
with good success (it saved my ass a few times). I do have the
Debugging Handbook and I searched through it before I asked the
question.

I shall be more specific this time: how do I capture the FPU control
word with TRACE? My search of available documentation came up with
nothing.

Speaking of which, how the heck do I display the FPU registers
from KDB? Or is the answer "you don't"?

Michal

Ilya Zakharevich

Nov 17, 2003, 8:11:49 PM

[A complimentary Cc of this posting was NOT [per weedlist] sent to
Michal Necasek

<Mic...@scitechsoft.com>], who wrote in article <KYdub.22063$7k4....@newssvr33.news.prodigy.com>:
> > Sorry, I have no idea what you are talking about. What is "consistent
> > math results"? [I would consider a program which ignores "my" calls
> > to _control87() broken.]

> What I mean is, if someone changes - or not - the FP CW on you, and
> changes the FPU precision, rounding etc., then you can't get
> consistent results.

And I ask again: what is "consistent"? (FP flags are *designed* to be
changeable.)

What I meant is: there are two ways to do things:

a) set flags to a certain value before doing any FP operation;

b) run FP math in the current environment set on FP hardware.

With "a" the program ignores user-written _control87() calls; which
IMO is insanity. With "b" the results depend on what the other parts
of the program are doing; apparently, you consider this as a bad
thing. Well, I do not.

> > I repeat: I see no difference with statically loaded code. Can you be
> > more specific?

> The difference that I can see is from the perspective of the DLL

> author, and especially so if it is a DLL that will be loaded and used
> by several apps at the same time. You can't just willy nilly allocate
> memory, open files etc. in the DLL, because those resources belong
> to the executable.

... which is the same as with any statically linked library... Do you
discuss GLOBALINIT SHARED SINGLE DLLs or what? What I discussed was
INITINSTANCE NONSHARED MULTIPLE stuff; using any other combination
indeed requires more thought...

> Another issue (which you mention) is stack size - you don't know how
> much there will be, and using up gobs of stack space in a DLL is
> likely to cause trouble.

... which is the same as with any statically linked library...

> > GUARD pages, for one.

> Guard pages only apply to secondary threads AFAIK?

Depends on the compiler, I presume...

> the stack is sparse, you have to be careful. Does emx gcc handle
> automatic stack growing like the IBM or Watcom compilers do?

EMX's stack is fully committed. Does not matter much with
overcommitment required to load OS/2.

> > The default stack size (64K vs 8M) for the third...

> But that is controlled by the process that loads the DLL?? It could
> be even (much) less or even more...

Of course. What I describe are typical defaults (one for 'gcc -Zomf',
another for 'gcc').

Hope this helps,
Ilya

Michal Necasek

Nov 17, 2003, 8:18:29 PM

Ilya Zakharevich wrote:

> And I ask again: what is "consistent"?
>

Always delivering the exact same results. Not "0" in some cases
and "5.56268e-309" in others.

> What I meant is: there are two ways to do things:
>
> a) set flags to a certain value before doing any FP operation;
>
> b) run FP math in the current environment set on FP hardware.
>
> With "a" the program ignores user-written _control87() calls; which
> IMO is insanity. With "b" the results depend on what the other parts
> of the program are doing; apparently, you consider this as a bad
> thing. Well, I do not.
>

No, you misunderstand me. All I am saying is that if user written
(or C runtime provided) code sets the FPU control word to certain
value, and then someone else changes it unexpectedly, there's going
to be trouble.

> ... which is the same as with any statically linked library... Do you
> discuss GLOBALINIT SHARED SINGLE DLLs or what? What I discussed was
> INITINSTANCE NONSHARED MULTIPLE stuff; using any other combination
> indeed requires more thought...
>

Yes, I was thinking of DLLs with shared data. Also DLLs called by
apps built with other compilers etc. It's not entirely trivial.

>> Guard pages only apply to secondary threads AFAIK?
>
> Depends on the compiler, I presume...
>

I don't think so?

> EMX's stack is fully committed.
>

Even for secondary threads? The main thread, sure.

> Of course. What I describe are typical defaults (one for 'gcc -Zomf',
> another for 'gcc').
>

Ah. Thanks for the explanation.

Michal

Steven Levine

Nov 17, 2003, 8:30:15 PM

In <7G_tb.25441$q21...@newssvr32.news.prodigy.com>, on 11/17/2003

at 07:21 AM, Michal Necasek <mic...@prodigy.net> said:

> Except... the OS2TRACE-instrumented executable suddenly doesn't get its
>FP CW trashed! What's up with that?!?

It's possible Dave's code saves and restores the FP environment in his
DLLs.

Regards,

Steven

--
--------------------------------------------------------------------------------------------
Steven Levine <ste...@earthlink.bogus.net> MR2/ICE 2.40 #10183
Warp4/FP15/14.093c_W4 www.scoug.com irc.webbnet.info irc.fyrelizard.org #scoug (Wed 7pm PST)
--------------------------------------------------------------------------------------------

Peter Moylan

Nov 17, 2003, 8:38:30 PM

Michal Necasek <mic...@prodigy.net> wrote:

> FYI I ran my test program under sd386 and as I expected, it behaved
>just like the 3 other debuggers I tried before - the problem didn't show
>up. I really wish I knew why.

Explanation 1: perhaps your debuggers are saving/restoring some state
for you, thereby cancelling out the problem.

Explanation 2: generally you can't even use a debugger sensibly unless
you have some optimisations turned off. (A perennial headache in my
own debugging is the variable that turns out to be invisible because
it's never stored to memory, just kept in temporary registers; or the
parameter that's never passed because the compiler knows it's not
going to be used.) Do your symptoms change depending on what compiler
optimisations are being done?

--
Peter Moylan Peter....@newcastle.edu.au
http://eepjm.newcastle.edu.au (OS/2 and eCS information and software)

William L. Hartzell

Nov 17, 2003, 8:57:11 PM

Sir:

Marc L. Cohen wrote:
> Marc L. Cohen wrote:
>
>> The graphics subsystem has been using floating point math for several
>> years now. I don't know that it was implemented to save/restore the
>> state on entry/exit. Unfortunately, at the moment I don't seem to have
>> access to the code trees, so I can't check.
>
> Well, I can't find anywhere where the GRE is modifying
> (saving/restoring) the FP CW.

Remember back a few years when the WPS was reporting file sizes in
fractional units, i.e. 2345.3453 bytes? Maybe the fix there was to mask
or change the CW? Even before that, there was a similar bug in the VIO
code.

Bill
Thanks a Million!

eric w

Nov 17, 2003, 10:11:00 PM

On Tue, 18 Nov 2003 01:18:29 UTC, Michal Necasek <Mic...@scitechsoft.com>
wrote:

> Always delivering the exact same results. Not "0" in some cases
> and "5.56268e-309" in others.
>

i thought when dealing with a floating point representation of zero you need to
be plus or minus a certain tolerance.

5 e-309 certainly sounds like it is pretty damned close!

...eric
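
For what it is worth, 5.56268e-309 is not just "close to zero": it lies
below DBL_MIN, the smallest normalized double, so it is a denormal
(subnormal) value. That is why producing it involves the underflow
exception whose mask bit is at stake in this thread. The sketch below is a
rough illustration assuming IEEE 754 doubles with gradual underflow;
is_subnormal() and tiny_quotient() are hypothetical helper names, and
1/DBL_MAX is used as a stand-in for atan2( 1.0, DBL_MAX ) because atan(t)
is approximately t for very small t.

```c
#include <float.h>

/* Returns 1 if x is nonzero but smaller in magnitude than the smallest
   normalized double, i.e. a denormal/subnormal value; 0 otherwise. */
int is_subnormal(double x)
{
    double mag = (x < 0.0) ? -x : x;
    return mag > 0.0 && mag < DBL_MIN;
}

/* 1/DBL_MAX, computed at run time; approximates atan2( 1.0, DBL_MAX ). */
double tiny_quotient(void)
{
    volatile double big = DBL_MAX;   /* volatile: keep the division at run time */
    return 1.0 / big;
}
```

Whether the hardware delivers the denormal, flushes it to zero, or raises
an exception depends on the FPU settings in effect, which is consistent
with the two different outputs reported in this thread.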

eric w

Nov 17, 2003, 10:14:23 PM

On Tue, 18 Nov 2003 00:45:30 UTC, Michal Necasek <Mic...@scitechsoft.com>
wrote:

> The difference that I can see is from the perspective of the DLL

> author, and especially so if it is a DLL that will be loaded and used
> by several apps at the same time. You can't just willy nilly allocate
> memory, open files etc. in the DLL, because those resources belong
> to the executable.
>

you are talking about reentrant code here.

i wonder if some non-reentrant code snuck in somewhere (would be an easy
mistake by not setting proper compiler switches).

...eric

Michal Necasek

Nov 17, 2003, 11:19:34 PM

Peter Moylan wrote:

> Explanation 1: perhaps your debuggers are saving/restoring some state
> for you, thereby cancelling out the problem.
>

But how would they do it? When I run the app full speed under
a debugger, the debugger shouldn't really be doing anything...

> Do your symptoms change depending on what compiler
> optimisations are being done?
>

Not at all.

Michal

Ilya Zakharevich

Nov 18, 2003, 1:03:13 AM

[A complimentary Cc of this posting was NOT [per weedlist] sent to
Michal Necasek

<Mic...@scitechsoft.com>], who wrote in article <Freub.22069$js4....@newssvr33.news.prodigy.com>:
> > And I ask again: what is "consistent"?

> Always delivering the exact same results. Not "0" in some cases
> and "5.56268e-309" in others.

[This is what I expected from what you wrote before, but] The
behaviour you want is not acceptable. The results *must* depend on
the user choices.

> No, you misunderstand me. All I am saying is that if user written
> (or C runtime provided) code sets the FPU control word to certain
> value, and then someone else changes it unexpectedly, there's going
> to be trouble.

Since "unexpected" changes should not happen, I do not see a point in
this argument.

> > ... which is the same as with any statically linked library... Do you
> > discuss GLOBALINIT SHARED SINGLE DLLs or what? What I discussed was
> > INITINSTANCE NONSHARED MULTIPLE stuff; using any other combination
> > indeed requires more thought...

> Yes, I was thinking of DLLs with shared data.

Yes. *This* can't be made trivial by a compiler choice. This
requires intelligence - there are many choices, and they should be
made by users.

> Also DLLs called by apps built with other compilers etc.

This is trivial - there should be no difference with static libraries
(of course, one should not expect that the *CRTL data* [such as
environ etc] is going to be shared between compilation environments).

> > EMX's stack is fully committed.

> Even for secondary threads?

Well, those created with DosBeginThread2(). ;-) There is nothing done
with the stack in _beginthread() - it uses the OS's supplied one; I do
not know anything about the "structure" of the stack in this case.

Yours,
Ilya

Michal Necasek

Nov 18, 2003, 1:19:48 AM

Update:

I installed the latest version of SNAP (2.2.4?) on my testbox.
This had no effect in PM, but did change the behaviour of FS sessions.
The FP CW trashing is still present, but now I can see it under
a debugger!

To my surprise, I discovered that the first DosWrite call does
in fact load a number of DLLs. BVHVGA, BVHSVGA, VIDEPMI, SDDPMI
and a few others. I am almost certain that the culprit is the VAC++
runtime that SNAP uses. Which is good news - because I can fix that!
Note: I could not find any instance where SDDPMI would use the FPU.
I think the VAC++ runtime is just being (too) proactive.

In PM however, there's more to it because the corruption occurs
without SNAP anywhere in sight, and the value that the FP CW is set
to is different than in FS. Also in PM, there are no DLLs loaded
(that I could see) on the first DosWrite, unlike in FS.

I think I'm seeing at least two, and probably more, similar but
unrelated problems.

Stefan Knodt

Nov 18, 2003, 4:55:21 AM

Compiled with VAC 3.08

On Sun, 16 Nov 2003 02:14:13 UTC, Michal Necasek <mic...@prodigy.net>
wrote:

> - Problem did not exist in Warp Connect

See below

> Could be interesting if someone tested this on Warp 4 GA and/or
> various Warp 4 FP levels.

WC ver 8.264 486 ATI Mach32 Driver
==============================
XTERM in Xfree86 3.3.6 :
cw=0x62
cw=0x62
cw=0x62
cw=0x62
0

PM:
cw=0x62
cw=0x62
cw=0x37F
cw=0x37F
5.56268e-309

FS:
cw=0x62
cw=0x62
cw=0x37F
cw=0x37F
5.56268e-309

WC ver 8.266 Pentium Pro Matrox Driver
==============================
PM:
cw=0x62
cw=0x62
cw=0x362
cw=0x362
0

FS:
cw=0x62
cw=0x62
cw=0x62
cw=0x62
0

MCP1 Dual PII-ODP Matrox Driver
===========================
PM:
cw=0x62
cw=0x62
cw=0x37F
cw=0x37F
5.56268e-309

FS :
cw=0x62
cw=0x62
cw=0x62
cw=0x62
0

W4 FP15 PII-ODP SNAP
====================
PM:
cw=0x62
cw=0x62
cw=0x37F
cw=0x37F
5.56268e-309

FS:
PM:
cw=0x62
cw=0x62
cw=0x362
cw=0x362
0

MCP1 PII TP600E SNAP
====================
XTERM:
FS :
cw=0x62
cw=0x62
cw=0x62
cw=0x62
0

PM:
cw = 0X62
cw = 0X62
cw = 0X37F
cw = 0X37F
5.56268e-309

FS:
PM:
cw=0x62
cw=0x62
cw=0x362
cw=0x362
0

--

Ted Edwards

Nov 18, 2003, 12:52:00 PM

eric w wrote:

> is your os/2 system up to date (fixpaks, kernels, video drivers) ???

It was as of a bit more than a year ago. Using the SNAP video drivers.
Not sure what that has to do with sloppy arithmetic. Certainly graphics
uses FP but heavy calc bound stuff shouldn't be affected by the graphics
drivers until output time.

But we can get even farther from the arithmetic by:

45-(180{div}{circ}1){times}{neg}3{circ}1
0

Ted

Veit Kannegieser

Nov 18, 2003, 2:07:21 PM

Michal Necasek wrote:

> > Explanation 1: perhaps your debuggers are saving/restoring some state
> > for you, thereby cancelling out the problem.
> >
> But how would they do it? When I run the app full speed under
> a debugger, the debugger shouldn't really be doing anything...

From my reading of the Virtual Pascal debugger, OS/2 sends exceptions
if dlls are loaded. Maybe there is some special handling in the
code that calls the dll startup code...

Veit Kannegieser

Michal Necasek

Nov 18, 2003, 6:33:58 PM

Stefan Knodt wrote:

> WC ver 8.264 486 ATI Mach32 Driver

> WC ver 8.266 Pentium Pro Matrox Driver
>

What fixpack levels are these?

Ilya Zakharevich

Nov 18, 2003, 6:55:14 PM

[A complimentary Cc of this posting was NOT [per weedlist] sent to
Michal Necasek

<mic...@prodigy.net>], who wrote in article <8Siub.25875$oK4....@newssvr32.news.prodigy.com>:

> To my surprise, I discovered that the first DosWrite call does
> in fact load a number of DLLs. BVHVGA, BVHSVGA, VIDEPMI, SDDPMI
> and a few others. I am almost certain that the culprit is the VAC++
> runtime that SNAP uses. Which is good news - because I can fix that!

Could you please report the details when you find it, so other
compilers may try to avoid such problems?

Yours,
Ilya

Michal Necasek

Nov 19, 2003, 12:30:37 AM

Ilya Zakharevich wrote:

> Could you please report the details when you find it, so other
> compilers may try to avoid such problems?
>

So far I've determined that loading the VAC++ 3.0 runtime,
CPPOM30.DLL, sets the FP CW to 0x362. By extension, this happens
when loading any DLL directly or indirectly linking to the
VAC++ runtime. Loading the VAC++ 3.6 runtime does not appear to
set the FP CW to any specific value, but it might change
the FP CW in some way.

The Watcom runtime DLL sets the FP CW to 0x127F. I'll see if
there's any way around it. Not that the Watcom runtime DLL is
exactly widely used. The statically linked runtime is significantly
more compact than the competition (I could post some file sizes
here) ;-)

I realized that 0x37F is the default FP CW state after FINIT
instruction. That might (or might not) mean something.

I can't find out what bit 5 (0x40) of the FP CW means, if
anything.

Setting GRENOFLOAT does not appear to have any effect.

Michal
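
The three values mentioned in this post (0x362, 0x127F, 0x37F) can be
compared bit by bit. The sketch below assumes the printed values are raw
x87 hardware control words, which may not hold for every compiler's
_control87(); the CW_MASK_* names, underflow_masked() and cw_diff() are
hypothetical. In the x87 control word, bits 0 to 5 are exception masks,
and a set bit means the exception is masked.

```c
/* x87 control word exception-mask bits (1 = exception is masked). */
enum {
    CW_MASK_INVALID   = 0x01,
    CW_MASK_DENORMAL  = 0x02,
    CW_MASK_ZERODIV   = 0x04,
    CW_MASK_OVERFLOW  = 0x08,
    CW_MASK_UNDERFLOW = 0x10,
    CW_MASK_PRECISION = 0x20
};

/* Returns 1 if underflow exceptions are masked in cw, 0 if they trap. */
int underflow_masked(unsigned cw)
{
    return (cw & CW_MASK_UNDERFLOW) != 0;
}

/* Returns the bits in which two control words differ. */
unsigned cw_diff(unsigned a, unsigned b)
{
    return a ^ b;
}
```

Under that assumption, 0x362 leaves underflow unmasked while 0x37F and
0x127F mask it, and 0x362 differs from 0x37F only in the invalid,
zero-divide, overflow and underflow mask bits (0x1D). That would fit the
reported crashes in programs that expect underflow to be masked.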

Stefan Knodt

Nov 19, 2003, 5:51:10 AM

Some more tests on a
MCP1 PII TP600E SNAP
in PM command window

[H:\ibmcpp\working\fp]fp

cw = 0X62
cw = 0X62
cw = 0X37F
cw = 0X37F
5.56268e-309

[H:\ibmcpp\working\fp]fp | more

cw = 0X62
cw = 0X62

cw = 0X62
cw = 0X62
0

[H:\ibmcpp\working\fp]fp > con

cw = 0X62
cw = 0X62
cw = 0X37F
cw = 0X37F
5.56268e-309

Maybe the problem is in the con device.
--

Michal Necasek

Nov 19, 2003, 12:33:43 PM

Stefan Knodt wrote:

> Maybe the problem is in the con device.
>

No, but the problem occurs when writing output to screen
(either in FS or in PM). So it makes sense that redirecting
the output will avoid FPCW corruption. Thanks for noticing!

Michal

eric w

Nov 19, 2003, 1:14:14 PM

On Wed, 19 Nov 2003 05:30:37 UTC, Michal Necasek <mic...@prodigy.net> wrote:

> I can't find out what bit 5 (0x40) of the FP CW means, if
> anything.
>
>

isn't hex 40 bit 6 (with first bit, bit 0)???

according to a 1996 intel hardware reference (ppro)

bit 5 = precision exception mask
bit 6 = reserved

...eric

eric w

Nov 19, 2003, 1:39:03 PM

same in 1999 edition.

James J. Weinkam

Nov 19, 2003, 2:57:55 PM

Michal Necasek wrote:
>
> I can't find out what bit 5 (0x40) of the FP CW means, if
> anything.
>

Control Word Format:

Exception Masks:

0 Invalid operation
1 Denormalized operand
2 Zero divide
3 Overflow
4 Underflow
5 Precision

Reserved:

6- 7

Control:

8- 9 Precision: 00=24, 01=reserved, 10=53, 11=64
10-11 Rounding: 00=nearest or even, 01=down, 10=up, 11=chop
12 Infinity (ignored; always affine)

Reserved:

13-15

Note that the default value 037F sets bit 6 and is equivalent in function to 033f.

For complete details see

http://www.intel.com/design/intarch/techinfo/Pentium/fpu.htm#2327
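
The field layout in the table above can be turned into a small decoder,
which also shows why 037F and 033F behave the same: they differ only in
reserved bit 6. This is a sketch under the assumption that the value being
decoded is a raw x87 control word; CW_FUNCTIONAL, cw_precision_bits(),
cw_rounding() and cw_equivalent() are hypothetical names.

```c
/* Bits with a defined function per the table: exception masks (0-5),
   precision control (8-9) and rounding control (10-11).  Bits 6-7, 12
   and 13-15 are reserved or ignored and are left out. */
#define CW_FUNCTIONAL 0x0F3Fu

/* Returns the significand precision in bits (24, 53 or 64),
   or 0 for the reserved encoding 01. */
int cw_precision_bits(unsigned cw)
{
    switch ((cw >> 8) & 3u) {
    case 0:  return 24;
    case 2:  return 53;
    case 3:  return 64;
    default: return 0;
    }
}

/* Returns the rounding field: 0=nearest, 1=down, 2=up, 3=chop. */
int cw_rounding(unsigned cw)
{
    return (int)((cw >> 10) & 3u);
}

/* Returns 1 if two control words agree in every functional bit. */
int cw_equivalent(unsigned a, unsigned b)
{
    return ((a ^ b) & CW_FUNCTIONAL) == 0;
}
```

With these helpers, cw_equivalent( 0x37F, 0x33F ) holds, while 0x62 decodes
to 24-bit precision, 0x127F to 53-bit and 0x37F to 64-bit.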

Michal Necasek

Nov 19, 2003, 3:19:56 PM

James J. Weinkam wrote:

>> I can't find out what bit 5 (0x40) of the FP CW means, if
>> anything.
>

My mistake - I meant bit 6 (0x40).

> Reserved:
>
> 6- 7
>
Exactly. So why is bit 6 set?

BTW bit 7 used to be interrupt control on the 8087.

> Note that the default value 037F sets bit 6 and is equivalent in
> function to 033f.
>

That's what I don't understand...

Michal

James J. Weinkam

Nov 19, 2003, 4:44:31 PM

Michal Necasek wrote:
>
> BTW bit 7 used to be interrupt control on the 8087.
>

Aha!

>> Note that the default value 037F sets bit 6 and is equivalent in
>> function to 033f.
>>
> That's what I don't understand...

Probably for backwards compatibility, just like they continue to accept and
then ignore the infinity control bit.

Stefan Knodt

Nov 19, 2003, 5:34:46 PM

On Tue, 18 Nov 2003 23:33:58 UTC, Michal Necasek
<Mic...@scitechsoft.com> wrote:

> Stefan Knodt wrote:
>
> > WC ver 8.264 486 ATI Mach32 Driver

XRGW040

> > WC ver 8.266 Pentium Pro Matrox Driver
> >

XRGW042

> What fixpack levels are these?
>

--

David T. Johnson

Jan 14, 2004, 1:41:21 AM

Michal Necasek wrote:

> Since the release of Open Watcom 1.2 is looming ahead, I've
> been running a bunch of regression tests, fixing what I can.
> Sadly, when running math library tests, I've run into severe
> problems that I probably cannot fix.
>
> I have a very strong suspicion that "recent" OS/2 kernels are
> seriously broken WRT floating point math. Some OS/2 API calls
> have a tendency to trash the FP control word. This was also
> observed by the Mozilla folks.
>
> Here's a little test proggy:
>
> ----------------------------------------------------
> #include <stdio.h>
> #include <math.h>
> #include <float.h>
>
> void main( void )
> {
> int cw;
> double d;
> char buf[512];
>
> cw = _control87( PC_24, MCW_PC ); /* Set FPU CW to atypical value */
> cw = _control87( 0, 0 ); /* Read back the FPU CW */
> sprintf( buf, "cw = %#X", cw );
> cw = _control87( 0, 0 ); /* Read/print again to ensure */
> puts( buf ); /* sprintf() didn't mess with it */
> sprintf( buf, "cw = %#X", cw );
> puts( buf );
>
> cw = _control87( 0, 0 ); /* See what FPU CW looks like */
> printf( "cw = %#X\n", cw ); /* after we printed it */
> d = atan2( 1.0, DBL_MAX ); /* This may cause underflow xcpt */
> cw = _control87( 0, 0 ); /* Let's see what FPU CW is now */
> printf( "cw = %#X\n", cw );
> printf( "%g\n", d ); /* Print the atan2 result */
> }
> ----------------------------------------------------
>
> When I ran this program on my machine, I observed the following
> behaviour:
>
> - In PM sessions, the FP CW gets trashed (set to 0x37F)
>
> - In FS sessions, the FP CW gets trashed too (but set to 0x362)
>
> - Under debugger control, everything works just fine (I hate that!)
>
> - IBM VAC++ 3.08, Open Watcom and BCOS2 2.0 all exhibit this
> behaviour (their default FP CW is different, but all get trashed,
> and their respective debuggers mask this problem)
>
> - Borland and Watcom produced executables crash in FS because of
> this (because they expect FP underflow exceptions to be masked,
> which is not the case after the FP CW gets messed up)
>
> - VAC++ produced executables "only" print different results for
> the atan2() call in FS/PM sessions.
>
> - Problem did not exist in Warp Connect
>
> - Problem _did_ exist in WSeB GA
>
> Could be interesting if someone tested this on Warp 4 GA and/or
> various Warp 4 FP levels.

I tested OS/2 floating point math and my system hardware with "Prime95"
running on OS/2 v4.52 fixpack 3 and did not find any problems. Prime95
not only tests the floating math but also tests the general operation of
the CPU and the repeatability of the memory hardware by causing the
machine to use a large amount of memory to repetitively perform
extensive calculations for which the exact result is known. If the
final calculated result for each calculation does not agree exactly with
the known answer, Prime95 reports a failure.

I ran Prime95 for OS/2 for 4 hours on my system and did not fail a
single test.

You can download Prime95 for OS/2 here:

http://www.mersenne.org/freesoft.htm

Posted with OS/2 Warp 4.52 and IBM Web Browser v2.0.2

William L. Hartzell

Jan 14, 2004, 6:42:15 PM

Sir:

David T. Johnson wrote:
<snip>

>
> I tested OS/2 floating point math and my system hardware with "Prime95"
> running on OS/2 v4.52 fixpack 3 and did not find any problems. Prime95
> not only tests the floating math but also tests the general operation of
> the CPU and the repeatability of the memory hardware by causing the
> machine to use a large amount of memory to repetitively perform
> extensive calculations for which the exact result is known. If the
> final calculated result for each calculation does not agree exactly with
> the known answer, Prime95 reports a failure. I ran Prime95 for OS/2 for
> 4 hours on my system and did not fail a single test. You can download
> Prime95 for OS/2 here:
>
> http://www.mersenne.org/freesoft.htm
>
>

Thanks for the pointer to this program. However, I don't see where this
program exercises the APIs that have a problem with floating point math.
--
Bill
Thanks a Million!
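
To make the distinction between "the arithmetic is correct" and "the control word survived an API call" concrete, here is a minimal sketch that decodes the two values reported at the top of the thread (0x37F in PM sessions, 0x362 in full-screen sessions). It assumes those numbers are raw x87 control words, which 0x37F (the FPU's power-on default) suggests. The macro and function names are made up for illustration and are not taken from any compiler's float.h.

```c
#include <stdio.h>

/* x87 FPU control word layout (standard Intel bit assignments).
   A set mask bit means the exception is masked, i.e. it does NOT trap. */
#define CW_MASK_INVALID    0x0001u
#define CW_MASK_DENORMAL   0x0002u
#define CW_MASK_ZERODIV    0x0004u
#define CW_MASK_OVERFLOW   0x0008u
#define CW_MASK_UNDERFLOW  0x0010u
#define CW_MASK_PRECISION  0x0020u
#define CW_PC_FIELD        0x0300u   /* precision control, bits 8-9 */

/* Hypothetical helper: nonzero if underflow exceptions are masked. */
static int underflow_is_masked(unsigned cw)
{
    return (cw & CW_MASK_UNDERFLOW) != 0;
}

/* Hypothetical helper: significand width selected by the PC field,
   or 0 for the reserved encoding. */
static int precision_bits(unsigned cw)
{
    switch ((cw & CW_PC_FIELD) >> 8) {
    case 0:  return 24;   /* single precision  */
    case 2:  return 53;   /* double precision  */
    case 3:  return 64;   /* extended precision */
    default: return 0;    /* reserved encoding */
    }
}

/* Hypothetical helper: print a one-line summary of a control word. */
static void describe_cw(unsigned cw)
{
    printf("cw = %#X: underflow %s, %d-bit precision\n",
           cw,
           underflow_is_masked(cw) ? "masked" : "UNMASKED (will trap)",
           precision_bits(cw));
}
```

Passing 0x37F and 0x362 to describe_cw() shows the difference: 0x37F has every exception mask bit set, while 0x362 leaves the underflow mask bit clear, which fits the full-screen crashes described in the original post. A test that only checks computed results against known answers would not notice either value unless one of its own operations happened to raise an unmasked exception.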

David T. Johnson

Jan 15, 2004, 12:55:17 PM

William L. Hartzell wrote:

> Sir:
>
> David T. Johnson wrote:
> <snip>
>
>> I tested OS/2 floating point math and my system hardware with
>> "Prime95" running on OS/2 v4.52 fixpack 3 and did not find any
>> problems. Prime95 not only tests the floating math but also tests the
>> general operation of the CPU and the repeatability of the memory
>> hardware by causing the machine to use a large amount of memory to
>> repetitively perform extensive calculations for which the exact result
>> is known. If the final calculated result for each calculation does
>> not agree exactly with the known answer, Prime95 reports a failure. I
>> ran Prime95 for OS/2 for 4 hours on my system and did not fail a
>> single test. You can download Prime95 for OS/2 here:
>>
>> http://www.mersenne.org/freesoft.htm
>
> Thanks for the pointer to this program. However, I don't see where this
> program exercises the APIs that have a problem with floating point math.

The program does extensive floating point calculations. Is there a
reason why you think that it is not using the floating point APIs?

Posted with OS/2 Warp 4.52 and IBM Web Browser v2.0.2