float to int causes truncation of any fractional part.
Then shouldn't:
#include <stdio.h>
int main(void)
{
float a = 1.0000;
int b = a;
printf("%d %d\n", a, b);
return 0;
}
give 1 1 as the output instead of 0 and garbage?
Correct.
> Then shouldn't:
>
> #include <stdio.h>
>
> int main(void)
> {
> float a = 1.0000;
> int b = a;
> printf("%d %d\n", a, b);
> return 0;
> }
>
> give 1 1 as the output instead of 0 and garbage?
The C compiler does not look at the contents of the first argument
to determine type information. All it has is the declaration of the
function. In your case
int printf(const char *format, ...);
There are special rules for how arguments are converted (the specific
term is "promoted") when passed as part of "...". In your case
the float is converted to a double. Function printf then interprets
the bits of a double as bits of an int and outputs garbage.
To test truncation you have to specify the intended conversion
manually. For example like this
int c = b;
printf("%d %d\n", a, c);
or like this:
printf("%d %d\n", a, (int)b);
--
I still get garbage with either of the modifications.
Of course you get garbage. Why do you expect anything else?
Please review in your documentation which specifier is used
with floating point data within printf.
You see now?
I think he meant (int)a, ie. convert the float a to the int value compatible
with the first %d format.
The advice was wrong (I think Alexander misread the types of a and b).
You just need the right format specifier for each argument to printf:
printf("%g %d\n", a, b);
(or %f if you prefer). Slightly confusingly, floats are converted to
double when they appear as an argument to a function like printf so %g
and %f are for printing argument expressions of type double.
--
Ben.
No, because I don't want output to be formatted as floating point
data, I want integer values for both a and b. I don't understand why
formatting a with %d works but casting a to an integer b doesn't work
looking at the output for b.
> On pg 45 of K&R, the authors write that:
>
> float to int causes truncation of any fractional part.
Right.
>
> Then shouldn't:
>
> #include <stdio.h>
>
> int main(void)
> {
> float a = 1.0000;
> int b = a;
> printf("%d %d\n", a, b);
> return 0;
> }
>
> give 1 1 as the output instead of 0 and garbage?
No. Here's what's going on in int b = a:
The = operator takes the value of its right operand, which is 1.0F
(float), and stores this value in its left operand. Since the left
operand is an int, a conversion is performed on the value yielded by
the left operand, to coerce that value into the proper type. So b
gets an int value - but a remains a float. The conversion in no way
affects the value stored in a, or the type of a.
The printf doesn't do any conversions into int type, so it's not
surprising you get garbage when you treat a non-int as if it were an
int. If you are surprised, it may be because you think of %d as a
sort of type conversion utility. It isn't.
--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within
As Richard has said, printf does not convert its arguments, it simply
interprets them. Interpreting the bits of a floating point number as
an integer is not the same as converting the number to int. This is
the operational description of what is wrong with your program.
There is another way to look at it. Passing anything but an int as
the argument to printf's %d format is "undefined behaviour"[1] -- all
bets are off. Anything could happen. This is a better way to think
about C if you need to write robust and portable code. Understanding
what actually happens on some system or other can be interesting, but
it can lead you astray if you don't know what is assured by the C
language and what is not.
For example, I've used a system where you could inspect the
representation of a double like this:
printf("%04x%04x\n", 3.14);
but the fact that it worked was an accident and it might have stopped
working simply by using a different version of the same compiler, or
even different compile-time options with the same version. However,
just switching to:
printf("%a\n", 3.14);
is usually much more helpful and portable to all C99 implementations.
If you need to inspect the representation in more detail, you can do
so with a character pointer. For example:
double f = 3.14;
int i;
unsigned char *rep = (void *)&f;
for (i = 0; i < sizeof f; i++)
printf("%02x", rep[i]);
does in a much more portable way what the double %04x hack tries to
do.
[1] Lots of things get converted to int when passed to printf so you
need to know about these automatic conversions to be sure of what
works and how. A good C text will discuss this at some length.
--
Ben.
Maybe it's just because I just got out of bed, but I don't see how b
gets an int value, but remains a float. Does this mean that b is both
an int and a float at the same time? Or is one type just a subset of
another type?
No, it's actually simpler than you're making it out to be.
a is of type float, and it can only hold a float value.
b is of type int, and it can only hold an int value.
In the declaration
int b = a;
the float value that's stored in a is *converted* from float to int,
yielding the int value 1. That int value is then stored in b.
The following statement:
printf("%d %d\n", a, b);
fails because the %d format requires an int argument; by passing
the value of a, which is a float value, the call lies to printf.
(The failure of the printf call can take any form. The behavior is
undefined, so anything can happen. In practice, the most likely
result is that some garbage value is printed, but it could crash
the program.)
--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
<snip>
> Maybe it's just because I just got out of bed, but I don't see how b
> gets an int value, but remains a float.
b gets an int value because it's an int.
a remains a float because it's a float.
Part of the confusion may be caused by the stupid stupid name choices.
float thisisafloat = 1.000;
int thisisanint = thisisafloat;
thisisafloat is a float, with the value 1.0F. In this program, it is
never given any other value, so it retains the value 1.0F, which is a
float value.
thisisanint is an int, so you can't store fractional parts in it, so
when you try to store a float value in it, the fractional part is
lost /from the value you are trying to store in thisisanint/. The
value stored in thisisafloat does not change.
Imagine that each object shouts out its value over and over again. The
thisisafloat object is yelling "one point nought, one point nought,
one point nought, one point nought, one point nought, one point
nought, one point nought, ..."
The thisisanint object hears the whole value, "one point nought", but
because it's an integer type it can't store anything at or after the
point, so the value it gets is one, and it starts yelling "one, one,
one, one, one..."
Meantime, back at thisisafloat, ***nothing has changed***: "one point
nought, one point nought, one point nought, one point nought..."
The copying of a value from one object to another and the possible
conversion of that value to another type on the way, has *no effect*
on the type of, or the value stored in, the original object.
<snip>
Re-read. He said that 'a' remains a float; he wasn't talking about 'b'.
Why does b print some garbage value? I mean, the type for b is an int
and the second %d in printf() is also an int. Is this because he tried
to pass float as an int in the first arg to printf()?
Given that a is a float and b is an int, the behavior of
printf("%d %d\n", a, b);
is undefined. That doesn't mean the implementation can print anything
it likes for a, but must still get b and everything else right. Once
undefined behavior has occurred, all bets are off; the standard no
longer says *anything* about how the program behaves.
As far as the standard is concerned, that's all that there is to be
said about it. But examining the behavior on some particular system
can be useful as an *example* of how undefined behavior can manifest
itself.
Here's the original program, with the variable names changed for
clarity:
#include <stdio.h>
int main(void)
{
float f = 1.0000;
int i = f;
printf("%d %d\n", f, i); /* undefined behavior */
return 0;
}
On my system, I get a compile-time warning:
c.c:7: warning: format `%d' expects type `int', but argument 2 has type `double'
The compiler isn't required to warn about this, but it's kind enough
to do so anyway.
The run-time output is:
0 1072693248
Here's what I think is happening.
On the call to printf, the float value of f is promoted (implicitly
converted) to double. The int value of i is passed as an int. On my
system, float is 4 bytes, double is 8 bytes, and int is 4 bytes.
Arguments to printf are (I think) passed consecutively on the stack;
printf then uses the <stdarg.h> mechanism, or something very much like
it, to extract the argument values as directed by the format string.
The call passes 12 bytes of data, 8 for the double value (promoted from
float) and 4 for the int value. The "%d %d\n" format instructs printf
to extract 8 bytes of data, 4 for each of two expected int arguments.
The double representation of 1.0, when interpreted as a pair of 32-bit
ints, happens to look like (0, 1072693248). (Note that 1072693248 is
0x3ff00000 in hex. Consult the IEEE floating-point standard to see if
this matches; I haven't checked it myself, but it's certainly
plausible.) The passed int value of i is quietly ignored, because
it's beyond the 8 bytes of data specified by the format string.
When I modify the program as follows:
#include <stdio.h>
int main(void)
{
float f = 1.0000;
int i = 42;
printf("%d %d %d\n", f, i); /* undefined behavior */
return 0;
}
I get even more warnings:
c2.c:7: warning: format `%d' expects type `int', but argument 2 has type `double'
c2.c:7: warning: too few arguments for format
and the following run-time output:
0 1072693248 42
I passed 12 bytes of data (a double and an int), and the format
string specifies 12 bytes of data (3 ints). So, at run time,
everything is, in some strange sense, consistent, and we see the
actual value of i.
Do not, I repeat, *DO NOT*, use techniques like this for real-world
code. I wrote this purely for the purpose of exploring the internal
workings of printf on my system. It could work entirely differently
on a different system. For example, doubles and ints might be
passed as arguments using different mechanisms.
If you really want to examine the internal contents of a
floating-point object, there are ways to do that; the safest is to
alias it with an array of unsigned char.
The format string in a call to printf should *always* match the
(promoted) arguments. If it doesn't, you might or might not get
a warning from your compiler, and the actual behavior could be
almost literally anything. The behavior I've shown on my system
is relatively benign, and it still took several paragraphs just to
describe it.
Understanding this kind of system-specific behavior can be very
useful for debugging. For example, if you see an output value
of 1072693248 where you expected a small int value, it's likely
that something other than an integer is somehow being interpreted
as if it were an integer. If a pointer value, printed with
"%p", looks like "0x00000003", it's probably a small integer being
misinterpreted as a pointer value.
But the point of such analysis is to fix the problem; it's almost
never a good idea to take advantage of it.
Here's another example of undefined behavior:
#include <stdio.h>
int main(void)
{
double f = 0.0;
long n = f;
printf("%d %d\n", f, n); /* undefined behavior */
return 0;
}
Again, I've lied to printf about the arguments I'm passing to it, and
any behavior is permitted. But here's the output I get on my system:
0 0
Types double and long just happen to be the same size, and 0.0 and 0L
just happen to have the same representation, all-bits-zero. So the
output *looks* correct even though it's garbage and even though it
might be different on a different system.
This is the worst way that undefined behavior can manifest itself.
There's a serious bug in the program, but it's very difficult to
detect. It will probably show itself at the worst possible time, when
the values of f and n have been changed, or when the program is run on
the intended target system rather than on my test system. Fortunately
gcc warns about this:
c3.c:7: warning: format `%d' expects type `int', but argument 2 has type `double'
c3.c:7: warning: format `%d' expects type `int', but argument 3 has type `long int'
but there are plenty of similar errors for which it can't or won't
issue a warning.
The output of this program on my machine is:
0 1072693248
If float is 32 bits, IEEE 754 specifies 1 sign bit, 8 exponent bits, and
23 fraction bits.
The value 1.0000 should be:
sign = 0
exponent = 127 (0 + bias, where the bias is 2^(8-1) - 1 = 127)
fraction = 0 (+1)
Which, if interpreted as an integer would be 1065353216 (3F800000h). The
output of the above program is 1072693248 (3FF00000h), which, if I
haven't made an error in my arithmetic, is the floating-point value 1.875.
Can anyone explain this discrepancy? Where did the extra fraction bits
(0.875) come from?
--
-Ted
First off, the behavior is undefined; the language itself has nothing
to say about the output you're seeing.
The particular results you're seeing on your system, with whatever
compiler and options you're using and the current phase of the moon,
are likely to have something to do with the fact that float is
promoted to double when passed to a variadic function as an argument
corresponding to the "..." in the function's prototype.
But really, the program is just wrong. You can probably learn
something from its behavior, but in most cases your time is better
spent fixing the code.
My English isn't as good as yours, but did you by any chance possibly
mean to say "a conversion is performed on the value yielded by the
_RIGHT_ operand" or "a conversion is performed on the value yielded
_TO_ the left operand" [emphasis mine]? This might explain Chad's
confusion. Or perhaps it's my comprehension skills that are lacking.
- Anand
> On Oct 10, 6:09 am, Richard Heathfield <r...@see.sig.invalid> wrote:
<snip>
>> The = operator takes the value of its right operand, which is 1.0F
>> (float), and stores this value in its left operand. Since the left
>> operand is an int, a conversion is performed on the value yielded
>> by the left operand,
^^^^
[...]
> did you by any chance
> possibly mean to say "a conversion is performed on the value yielded
> by the _RIGHT_ operand"
Er, yes. Darn. Thanks for the correction.
Thank you - that last sentence has completed my understanding of the
original problem :)
The float is probably converted to double, so it is 64 bits. A 64-bit
double, I believe, uses an 11-bit exponent. The stored mantissa is 0
because there is an implied 1 in there.
--
Bartc
>> The output of this program on my machine is:
>>
>> 0 1072693248
>>
>> If float is 32 bits, IEEE 754 specifies 1 sign bit, 8 exponent bits, and
>> 23 fraction bits.
>>
>> The value 1.0000 should be:
>>
>> sign = 0
>> exponent = 127 (0 + 2^(8-1) bias)
>> fraction = 0 (+1)
>>
>> Which, if interpreted as an integer would be 1065353216 (3F800000h). The
>> output of the above program is 1072693248 (3FF00000h), which, if I
>> haven't made an error in my arithmetic, is the floating-point value
>> 1.875.
>>
>> Can anyone explain this discrepancy? Where did the extra fraction bits
>> (0.875) come from?
>
> First off, the behavior is undefined; the language itself has nothing
> to say about the output you're seeing.
>
> The particular results you're seeing on your system, with whatever
> compiler and options you're using and the current phase of the moon,
> are likely to have something to do with the fact that float is
> promoted to double
I didn't see this bit when I gave my own reply. I'd only got as far as the
phase of the moon..
Sometimes it's useful to give a concrete example of why some behaviours
happen, as you did in your excellent reply to Chad earlier in the thread.
The Standard only goes on about Undefined Behaviour so much, because /it
doesn't know what machine the code is being run on/, so it can't really
comment.
On the other hand, a programmer trying to debug a piece of code usually does
know, and knowing exactly why it's not working can be useful to learn for
next time.
--
bartc
> > First off, the behavior is undefined; the language itself has nothing
> > to say about the output you're seeing.
>
> > The particular results you're seeing on your system, with whatever
> > compiler and options you're using and the current phase of the moon,
> > are likely to have something to do with the fact that float is
> > promoted to double
>
> I didn't see this bit when I gave my own reply. I'd only got as far as the
> phase of the moon..
>
> Sometimes it's useful to give a concrete example of why some behaviours
> happen, as you did in your excellent reply to Chad earlier in the thread.
yes, I was contemplating doing one but Keith's was far better.
Sometimes we need to know what is going on under the hood.
Particularly
when we are learning. I'm less interested now in exactly what the bits
are doing when UB is invoked but I used to be interested.
> The Standard only goes on about Undefined Behaviour so much, because /it
> doesn't know what machine the code is being run on/, so it can't really
> comment.
>
> On the other hand, a programmer trying to debug a piece of code usually does
> know, and knowing exactly why it's not working can be useful to learn for
> next time.
not sure about that. Once you've spotted the UB you've found the bug
and you don't need to study the implementation's actual behaviour.
You just remove the UB!
00111111 10000000 00000000 00000000
Exp = 127 (1)
00000001
Man = .10000000 00000000 00000000
1.00000000e+00
The top row is the raw float. In hex 0x3f800000.
--
Joe Wright
"If you rob Peter to pay Paul you can depend on the support of Paul."
> In <fMYzm.46574$ze1....@news-server.bigpond.net.au>, Albert wrote:
>
>> On pg 45 of K&R, the authors write that:
>>
>> float to int causes truncation of any fractional part.
>
> Right.
>
>>
>> Then shouldn't:
>>
>> #include <stdio.h>
>>
>> int main(void)
>> {
>> float a = 1.0000;
>> int b = a;
>> printf("%d %d\n", a, b);
>> return 0;
>> }
>>
>> give 1 1 as the output instead of 0 and garbage?
>
> No. Here's what's going on in int b = a:
>
> The = operator takes the value of its right operand, which is 1.0F
> (float), and stores this value in its left operand. Since the left
> operand is an int, a conversion is performed on the value yielded by
> the [right] operand, to coerce that value into the proper type. [snip]
A minor point -- assignment _always_ performs a conversion,
whether the types of the two sides are the same or different.
>> The = operator takes the value of its right operand, which is 1.0F
>> (float), and stores this value in its left operand. Since the left
>> operand is an int, a conversion is performed on the value yielded by
>> the [right] operand, to coerce that value into the proper type. [snip]
>
> A minor point -- assignment _always_ performs a conversion,
> whether the types of the two sides are the same or different.
So what conversion is performed when assigning an int value to an int
destination of the same width?
--
Bartc
http://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf
6.5.16.1
6.3
this reminds me of the maths people who tell you a quadratic equation
always has two roots (solutions). And I say "but there's only one
answer in such-and-such a case" "ah yes in that case the two roots are
actually identical"
6.5.16.1p2 is pretty clear about this: "the value of the right operand
is converted to the type of the assignment expression" - there's nothing
conditional about that statement.
> So what conversion is performed when assigning an int value to an int
> destination of the same width?
int=>int, an identity conversion. It's covered by 6.3p2: "Conversion of
an operand value to a compatible type causes no change to the value or
the representation." 'int' is compatible with 'int'.
A minor point - what assignment?
You're right, of course - this is initialization rather than assignment.
However, his comment could be taken as simply a response to your
comments about the "= operator", without reference to the fact that the
code you were talking about did not contain an "= operator".
Luckily for both of you, "the same type constraints and conversions as
for simple assignment apply," (6.7.8p11)
In a case where performance is critical, is there a way to be sure that
the compiler does not perform a conversion for same-type
assignment? ...or would that be an implementation detail?
--
-Ted
Presumably a quality-of-implementation issue. In practice, I doubt there
have been any compilers, ever, which generated code to perform such
"conversions".
-s
--
Copyright 2009, all wrongs reversed. Peter Seebach / usenet...@seebs.net
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
The standard requires that such a conversion must change neither the
value nor the representation, so it can be fully implemented by not
generating any code whatsoever. It's not clear to me that it's even
meaningful to talk about not performing a no-op conversion.
Is there any context where this subtlety (viz., "conversion" is
performed even when the types of the operands on either side of = is
the same) is important? If so, is it important to the implementor or
even to the programmer? How/Why?
I think that essentially 100% of the importance of this subtlety lies in
the simplification it allows in the standard's description of what is
required to happen.
For the most part such conversions don't change anything
and so generate no additional code. But see also my
other replies.
> On 2009-10-12, Ted DeLoggio <tdel...@gmail.com> wrote:
>> In a case where performance is critical, is there a way to be sure that
>> the compiler does not perform a conversion for same-type
>> assignment? ...or would that be an implementation detail?
>
> Presumably a quality-of-implementation issue. In practice, I doubt there
> have been any compilers, ever, which generated code to perform such
> "conversions".
Except in the case of floating point types, when such
conversions actually can make a difference. This is notable
because some well-known compilers (gcc is the example
I'm thinking of) sometimes get this wrong.
Notwithstanding the assurances of 6.3p2, a same-type conversion
actually can result in a different value when the types involved
are floating-point types. I neglected to mention this section
earlier, let me correct that now -- 6.3.1.5p2. Also applies to
the complex types.
Is this the old hassle with intermediate representations?
Yes there is, when floating-point types are involved.
Conversions in such cases are required to discard extra
precision and range (see 6.3.1.5p2). For example, in
double a, b, c;
...
a = b + c;
the plus operation can be computed in greater precision than
(double), but upon being assigned the value must be squeezed
back into a (double) again. For developers, this can matter
when deciding when to simplify expressions. For example:
/* 1 */
t0 = b * c;
t1 = d * e;
a = t0 + t1;
/* 2 */
a = b * c + d * e;
There's a good chance the result in /*1*/ will be
different from the result in /*2*/.
For implementors, it's important to remember to follow the
requirements, since it can be tempting not to for reasons of
performance and/or optimization. I believe gcc gets this
wrong in some cases, notably on the x86, where the processor
instruction set makes it pretty inconvenient (or so I've
heard) to do what the Standard requires.
What James Kuyper said (thanks James!). I was in fact
responding to the mentions of '= operator' and 'operand'
(both left and right), which seems to be talking about
assignment.
> On 2009-10-12, Tim Rentsch <t...@alumni.caltech.edu> wrote:
>> Except in the case of floating point types, when such
>> conversions actually can make a difference. Notable
>> because some well-known compilers (gcc is the example
>> I'm thinking of) sometimes get this wrong.
>
> Is this the old hassle with intermediate representations?
Assuming I understand your question correctly, the
answer is yes. C requires that intermediate results
be converted to the precision of the type involved
upon assignment. I believe this requirement is
there partly (mostly?) to conform to rules set for
IEEE 754 floating point. (Disclaimer: I know very
little about IEEE 754; my comment here is based
on some long-ago traded emails with the committee
chairman for IEEE 754.)
Okay, I'm going to take the bait here. How can the plus operation be
computed with greater precision than double?
(Example)
Some floating point hardware works internally using 80-bits, when the
precision of double is 64-bits, which can lead to inconsistencies when
intermediate 80-bit results are written to memory as 64-bits then loaded
again, compared with keeping the intermediate values in the registers.
--
Bartc
Do you mean how can it happen, or when will it ever make
a difference? The answer for how it can happen is,
no matter what the range and precision are for (double)
(or (long double), for that matter), the implementation
is allowed to use greater range and precision for the
results of operations. So plus could be carried out
with 1024 bits of precision, say, or with more exponent
bits to give a greater range (or both). Extra bits
may be relevant because floating-point numbers might
be in different ranges (ie, have different exponents).
As to when will it ever make a difference, for this
simple example I think it depends on rounding modes.
Obviously for more complicated expressions, eg
a = b + c + d + e + f + g;
some extra precision could make a difference due to
carries when adding some small numbers and some bigger
ones. Extra range could also matter when adding
some positive numbers and some negative ones,
protecting against overflows in intermediate results.
I'm sure there must be other examples, and probably
better ones, but the ones here are just the first
ones that popped into my head.
I was going to say that the expression b + c has type (double), but after
looking in the standard for confirmation of this, I'm confused:
6.3.1.8 Usual arithmetic conversions
"Unless explicitly stated otherwise, the common real type is also
the corresponding real type of the result"
[so the result of b + c would have type double -- MK]
but I'm confused by paragraph 2 and its footnote, which say
"The values of floating operands and of the results of floating
expressions may be represented in greater precision and range
than that required by the type; the types are not changed thereby.
52)"
and "52) The cast and assignment operators are still required to perform
their specified conversions as described in 6.3.1.4 and 6.3.1.5."
What's meant by this? If "the types are not changed thereby", does this
mean that (b + c) has type double, or not? And if the type is not changed,
what conversion would be necessary to do the assignment to a?
Furthermore, if the result of a floating expression can be "represented
in greater precision and range" than that required, what does this say
about sizeof(b + c)? What can we predict about the value of the expression
sizeof(b + c) == sizeof(double)
in conforming implementations? Can a strictly conforming program rely on
this having the value 1?
Or is this "greater range and precision" clause merely giving
implementations permission to represent intermediate results in ways
that could give different results for more complicated floating
expressions, e.g. potentially giving different results for
giving different results for
((double)(b + c)) - ((double)(e * f))
vs.
(b + c) - (e * f)
where b, c, e, and f are all doubles?
--
Morris Keesan -- mke...@post.harvard.edu
I guess I meant to ask how could it happen. I couldn't figure out how
to phrase the question because my grammar isn't that strong. What can
I say. I should have probably paid more attention in my High School
English classes.
"long double". Or possibly an internal representation which doesn't map
onto any native type.
Think, say, of a 68000-series chip with an FPU, early on. You could
have a "double" type which was 64 bits, and a "long double" type which was
80 bits -- and only have hardware support for the 80-bit type. Solution?
Do absolutely all calculations in 80-bit, then truncate when you had to store
the value.
This led to a possible problem: What if a helpful optimizer didn't bother
truncating the value before using it again? Well, then, you got "wrong"
results -- and yes, in floating point, "more precision than we expected"
can be "wrong".
Right.
> but I'm confused by paragraph 2 and its footnote, which say
>
> "The values of floating operands and of the results of floating
> expressions may be represented in greater precision and range
> than that required by the type; the types are not changed thereby. 52)"
> and "52) The cast and assignment operators are still required to perform
> their specified conversions as described in 6.3.1.4 and 6.3.1.5."
>
> What's meant by this? If "the types are not changed thereby", does this
> mean that (b + c) has type double, or not? And if the type is not changed,
> what conversion would be necessary to do the assignment to a?
It means, even though the value is represented in greater range and
precision (than (double), for this case), the type is still (double).
The conversion for assignment to 'a' is 'a = (double) (b+c)'.
I know it seems weird that converting an expression to the same
type as the expression can change its value, but that's the rule.
> Furthermore, if the result of a floating expression can be "represented
> in greater precision and range" than that required, what does this say
> about sizeof(b + c)? What can we predict about the value of the expression
>
> sizeof(b + c) == sizeof(double)
>
> in conforming implementations? Can a strictly conforming program rely on
> this having the value 1?
The type of (b+c) is still double, even if the result value is
represented with greater range or precision. The sizeof
comparison you wrote is indeed always 1 (assuming b and c are
doubles).
> Or is this "greater range and precision" clause merely giving
> implementations
> permission to represent intermediate results in ways that could give
> different results for more complicated floating expressions, e.g.
> potentially
> giving different results for
>
> ((double)(b + c)) - ((double)(e * f))
> vs.
> (b + c) - (e * f)
>
> where b, c, e, and f are all doubles?
Yes, the point is to give implementations more freedom for
intermediate results, and there is a good chance that these two
expressions will have different values, because casting to
(double) forces any extra range and/or precision of the two
intermediate values (that are operands to '-') to be discarded.
Agreed.
> but I'm confused by paragraph 2 and its footnote, which say
>
> "The values of floating operands and of the results of floating
> expressions may be represented in greater precision and range
> than that required by the type; the types are not changed
> thereby. 52)"
> and "52) The cast and assignment operators are still required to perform
> their specified conversions as described in 6.3.1.4 and 6.3.1.5."
>
> What's meant by this? If "the types are not changed thereby", does this
> mean that (b + c) has type double, or not? And if the type is not changed,
> what conversion would be necessary to do the assignment to a?
It's of type double but may be represented by something with greater
precision than double. Until stuck in an actual double.
> Furthermore, if the result of a floating expression can be "represented
> in greater precision and range" than that required, what does this say
> about sizeof(b + c)? What can we predict about the value of the expression
>
> sizeof(b + c) == sizeof(double)
>
> in conforming implementations? Can a strictly conforming program rely on
> this having the value 1?
I think the std. requires some clarification on that. The lack of
distinction between expressions and types leaves the above unclear.
If the description had stated that all expressions are first treated
as their type, then you'd be mapped back to double type, and the
confusion would disappear.
> Or is this "greater range and precision" clause merely giving
> implementations
> permission to represent intermediate results in ways that could give
> different results for more complicated floating expressions,
> e.g. potentially
> giving different results for
>
> ((double)(b + c)) - ((double)(e * f))
> vs.
> (b + c) - (e * f)
>
> where b, c, e, and f are all doubles?
I have several archs here where i'd expect to trivially be able to
come up with values for b, c, e, and f which would yield different
values for those two expressions. The x86-based ones, if using the
FPU, because of higher precision intermediates, and the others (POWER,
Arm) because of fused exact multiply-add instructions. Anything which
has catastrophic cancellation should work.
Phil
--
Any true emperor never needs to wear clothes. -- Devany on r.a.s.f1
But don't forget that "exactly the precision we expected" can be
"wrong" for even simpler reasons. That's why if you want numeric
work, you get someone skilled in the field, who can manage the
various wrongs appropriately.
> "Morris Keesan" <mke...@post.harvard.edu> writes:
[snip]
The type of the expression (b+c) is double, as already noted:
"the types are not changed thereby." The sizeof operator works
on types: "The size is determined from the type of the operand."
(6.5.3.4p2). The Standard doesn't leave any wiggle room: the
two types are the same so their sizes are the same; the result
is well-defined and must be equal to 1.
Note also that with a fused multiply-add the expression
a * b + c * d == c * d + a * b
is not necessarily true.
--
dik t. winter, cwi, science park 123, 1098 xg amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
Yup, the standard's off the hook, that clause is unambiguous.
I can imagine that
a * b + c * d == a * b + c * d
is not necessarily true either.
I've seen Apple's (POWER) gcc get confused with register allocations
before, and wouldn't put it past it to change the order of evaluation
of sub-expressions if it got confused.