If the implementation supports negative zeros, they shall be
generated only by:
— the &, |, ^, ~, <<, and >> operators with arguments that produce
such a value;
— the +, -, *, /, and % operators where one argument is a negative
zero and the result is zero;
— compound assignment operators based on the above cases.
Is this list exclusive? Or can an implementation-defined result
override it?
For example, suppose a sign-magnitude implementation with negative
zero support where char is signed and has 8 bits; on this
implementation, a negative zero has the value of 128 (== 0x80) when
accessed with unsigned char. Then, for the following assignment:
char c;
c = 0x80;
the standard specifies the result is i-d. Now can c have a negative
zero since the result is i-d, or should it have any other value than a
negative zero because of the list cited above?
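To make the assumed representation concrete, here is a minimal sketch that models the hypothetical 8-bit sign-magnitude char in an unsigned byte. The function names are made up for illustration, and the model only mirrors the assumptions stated above; it says nothing about what a real implementation does.

```c
#include <stdbool.h>

/* Hypothetical model: an 8-bit sign-magnitude char held in an unsigned
   byte. Bit 7 is the sign, bits 0..6 are the magnitude. */
int sm8_value(unsigned char bits)
{
    int magnitude = bits & 0x7F;
    return (bits & 0x80) ? -magnitude : magnitude;
}

/* The pattern 0x80 (sign bit set, magnitude 0) is the negative zero;
   0x00 is the ordinary (positive) zero. Both have the value 0. */
bool sm8_is_negative_zero(unsigned char bits)
{
    return bits == 0x80;
}
```

In this model 0x80 and 0x00 compare equal as values but differ as bytes, which is the distinction the question turns on.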
If it must be something other than a negative zero, the most reasonable
choice for the value of c seems to be 0 (positive zero). Unfortunately,
converting 0x80 to 0x00 when storing into c raises another, related
problem:
char *s = "abc\x80" "def";
printf("%d\n", (int)strlen(s)); /* 3 or 7? */
I know strlen() is guaranteed to access the array passed to it as
unsigned char; I am not talking about the accesses performed in strlen()
but about initializing the storage for the string literal. 6.4.5p5 says:
The multibyte character sequence is then used to initialize an
array of static storage duration and length just sufficient to
contain the sequence. For character string literals, the array
elements have type char, and are initialized with the individual
bytes of the multibyte character sequence;
which implies that the initialization for s is done as if:
static char __noname[] =
{ 'a', 'b', 'c', '\x80', 'd', 'e', 'f', 0 };
char *s = __noname;
so we have __noname[3] = 0x80. Or should 6.4.5p5 be taken to mean
something like:
char *__noname;
... prepare static storage for __noname ...
unsigned char __literal[] =
{ 'a', 'b', 'c', '\x80', 'd', 'e', 'f', 0 };
memcpy(__noname, __literal, sizeof(__literal));
char *s = (char *)__noname;
since it says "bytes" of the mb character sequence? If so, what should
happen when the code changes to
char s[] = "abc\x80" "def";
printf("%d\n", (int)strlen(s));
where we have s[3] = 0x80 again?
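The two readings can be compared on any host with a small simulation. This is only a sketch under the assumptions above: normalizing_store(), init_by_assignment() and init_by_bytes() are hypothetical names, and the "normalizing store" stands in for an implementation that turns the 0x80 pattern into 0x00 when a value is stored into a char.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the sign-magnitude target: storing into a char
   turns the negative-zero pattern 0x80 into the plain zero 0x00. */
unsigned char normalizing_store(unsigned char bits)
{
    return bits == 0x80 ? 0x00 : bits;
}

/* Initialize dst element by element, as if by assignment to char. */
void init_by_assignment(unsigned char *dst, const unsigned char *src,
                        size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = normalizing_store(src[i]);
}

/* Initialize dst by copying the bytes unchanged, as if by memcpy(). */
void init_by_bytes(unsigned char *dst, const unsigned char *src, size_t n)
{
    memcpy(dst, src, n);
}
```

Fed the bytes 'a', 'b', 'c', 0x80, 'd', 'e', 'f', 0, the element-wise version leaves a terminator at index 3, so strlen() reports 3, while the byte-wise version reports 7.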
Shouldn't the list be modified to include an i-d result?
--
Jun, Wooong (woong.jun at gmail.com)
[... some assumptions for code below snipped ...]
>
> char c;
> c = 0x80;
>
> the standard specifies the result is i-d.
>
Anticipating a typical answer I may get ;-), yes, the standard also
says an i-d signal can occur, which makes the code run in an
unpredictable way.
My point, however, is that the standard *precludes* an implementation
from storing a negative zero into c if the list cited in my previous
post is taken as an exclusive one.
To me, the phrase "only by" seems to pretty clearly mark it as an
exclusive list. Undefined behavior might override that exclusivity,
since the standard explicitly does not constrain the behavior in any way
when it is undefined, but implementation-defined behavior should not
violate that requirement.
James Kuyper wrote:
> To me, the phrase "only by" seems to pretty clearly mark it as an
> exclusive list.
Indeed. I would think that in particular the omission of the
'=' assignment operator would indicate that negative zero is
supported, but only insofar as the hardware produces it as
the result of an arithmetic operation. The way to preserve a
negative zero can then only be done with care, such as using
assignments of the form 'a = +b' instead of the usual 'a = b'
(which we can assume is a normalizing assignment). Or by
using memcpy(), of course.
(On related but off-topic note, it appears that these rules
parallel those for IEEE floating-point arithmetic, but I don't
recall offhand if assignment is normalizing or not.)
-drt
If 'a = b' normalizes the value, why wouldn't 'a = +b' do so?
--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Agreed. And with that exclusive list, and valid assumptions I made in
the first post, I concluded that one might get a weird result nobody
would expect.
This problem depends on how 6.4.5p5, which handles string literals, is
interpreted. Because of the phrase "individual bytes" used there, I
think it should be read as mapping the array produced by a string
literal onto storage whose bytes consist of the contents of the
literal. That is, the array is initialized as if by this code:
char *__noname;
... prepare static storage for __noname ...
unsigned char __literal[] =
{ 'a', 'b', 'c', '\x80', 'd', 'e', 'f', 0 };
memcpy(__noname, __literal, sizeof(__literal));
char *s = (char *)__noname;
My first question is, is this interpretation of 6.4.5p5 right? Or
should it be interpreted to mean something like:
static char __noname[] =
{ 'a', 'b', 'c', '\x80', 'd', 'e', 'f', 0 };
char *s = __noname;
? I surmise the former is intended because the problem gets even
worse if the latter is really the intended one.
Now consider this code:
char s[] = "abc\x80" "def"; // char s[] = { 'a', 'b', 'c', ... };
char *t = "abc\x80" "def";
assert(strcmp(s, t) == 0);
Can this assertion fail? The standard seems to say that it must fail
on the implementation I assumed.
Another code:
char *s = "abc\x80" "def";
char *t = "abc\x80" "def";
t[3] = t[3];
assert(strcmp(s, t) == 0);
How about this code? The standard seems to say that this assertion
also must fail since t[3] is not allowed to generate a negative zero.
Even if the latter would not surprise a programmer who is familiar with
negative zeros, the former surely would, I believe.
AFAICT, the list should include at least:
- by an implementation-defined result of conversions
and possibly:
- by certain kinds of type punning
for the list to be truly exclusive, just as the cases enumerated in
6.5.9p6 (Equality operators) in C99 include:
or one is a pointer to one past the end of one array object and
the other is a pointer to the start of a different array object
...
for the "iff."
--
Jun, Woong (woong.jun at gmail.com)
It can. The difference is that the standard requires 'a = b' to
normalize.
Where is that requirement stated?
Oops, I was wrong. Nothing prevents '=' from normalizing a negative
zero or from retaining it. The same goes for '= +'.
David R. Tribble may have meant that if -0 is produced by the +
operator, a later operator is likely to preserve the sign of 0, since
producing -0 was allowed to begin with.
Of course, the only reliable way to carry -0 is to use memcpy(), that
is, copying via unsigned char.
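A minimal sketch of that byte-wise approach follows. The helper names are hypothetical; the point is only that memcpy() and memcmp() work on the object representation, so no value conversion is involved that could normalize anything.

```c
#include <string.h>

/* Copy the object representation of *src into *dst byte by byte.
   No value is read or converted, so the bit pattern arrives unchanged. */
void copy_int_representation(int *dst, const int *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* Compare object representations rather than values. On an
   implementation with negative zeros, +0 and -0 would compare equal
   with == but differ here. */
int same_int_representation(const int *a, const int *b)
{
    return memcmp(a, b, sizeof *a) == 0;
}
```

On a two's complement host there is no negative zero to observe, so this only demonstrates the mechanism, not the effect.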
> Indeed. I would think that in particular the omission of the
> '=' assignment operator would indicate that negative zero is
> supported, but only insofar as the hardware produces it as
> the result of an arithmetic operation. The way to preserve a
> negative zero can then only be done with care, such as using
> assignments of the form 'a = +b' instead of the usual 'a = b'
> (which we can assume is a normalizing assignment). Or by
> using memcpy(), of course.
This may be unclear wording in the C Standard, but I would think that
a plain assignment operator doesn't "generate" anything. It just
assigns whatever is assigned, and if I assign a negative zero, then so
be it.
The list of operators that may "generate" a negative zero includes the
+= operator. So after executing a += b; a might be a negative zero
under the right circumstances. But we know that a+=b; is the same as a
= a + b; except that the lvalue a is evaluated only once, so if a +=
b; can result in "a" being a negative zero, then so can a = a + b; .
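The rule for the arithmetic operators can be sketched as a model of binary + on the assumed 8-bit sign-magnitude representation. Everything here is hypothetical (the sm8_ names, the choice to actually produce -0 whenever the list permits it); a real implementation is allowed, not required, to behave this way.

```c
#include <stdbool.h>

/* Hypothetical 8-bit sign-magnitude model: bit 7 is the sign. */
int sm8_value(unsigned char bits)
{
    int magnitude = bits & 0x7F;
    return (bits & 0x80) ? -magnitude : magnitude;
}

/* Encode v in [-127, 127]; zero always encodes as positive zero. */
unsigned char sm8_encode(int v)
{
    return v < 0 ? (unsigned char)(0x80 | -v) : (unsigned char)v;
}

/* Model of binary '+'. Returns false if the sum is out of range.
   A negative zero (0x80) is produced only when one operand is itself
   a negative zero and the result is zero, as the list requires. */
bool sm8_add(unsigned char a, unsigned char b, unsigned char *result)
{
    int sum = sm8_value(a) + sm8_value(b);
    if (sum < -127 || sum > 127)
        return false;
    if (sum == 0 && (a == 0x80 || b == 0x80))
        *result = 0x80;            /* permitted, not required */
    else
        *result = sm8_encode(sum); /* e.g. 1 + (-1) gives plain 0 */
    return true;
}
```

In this model a += b and a = a + b both go through sm8_add(), so they can only differ in what the subsequent store does with the result.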
Strange that -0 apparently cannot generate a negative zero.
I'm not sure I see the point on restricting which operators can generate
a negative zero on implementations which allow one. I can see the point
of making plain 0 positive (for use when calling memset), but other than
that when do you need to guarantee that an operator will yield a
positive zero regardless of the operands?
Allowing it would be a very special case since you don't want to allow
``-a'' to generate a negative zero if a happens to be zero.
--
Larry Jones
If I was being raised in a better environment, I wouldn't
do things like that. -- Calvin
Keith Thompson wrote:
>> If 'a = b' normalizes the value, why wouldn't 'a = +b' do so?
>
Jun Woong writes:
> It can. The difference is that the standard requires 'a = b' to normalize.
Keith Thompson wrote:
>> Where is that requirement stated?
>
Jun Woong writes:
>> Oops, I was wrong. Anything does not prevent '=' from normalizing a
>> negative zero nor from retaining it. The same goes for '= +'.
>>
>> David R. Tribble might try to mean that if -0 is produced by the +
>> operator, the operator used later is likely to preserve the sign of
>> 0 since it is allowed to produce -0 from the start.
>
Yes.
The list of operators allowed to preserve negative zero (6.2.6.2p3,
quoted previously) does not include simple assignment '=', so we
can assume that it is required to normalize -0 into +0 (plain zero).
I was probably incorrect about concluding that unary '+' is sign
preserving, however, since the unary operators do not appear
in the list.
It would seem that if binary '+' is sign preserving, then 'j + 0'
would preserve the sign of j if it was -0; however, 'k = j + 0'
would not assign -0 to k because of the '=' rule above.
So it would appear that the only guaranteed way to preserve
the sign of negative zero would be by using memcpy().
-drt
No, as is clear from the sentence following the list:
It is unspecified ... whether a negative zero becomes a normal
zero when stored in an object.
> I was probably incorrect about concluding that unary '+' is sign
> preserving, however, since the unary operators do not appear
> in the list.
It's not clear whether the "+" and "-" operators in the list are
intended to be read as just the binary operators or as both the unary
and binary operators. My guess is they're intended to include both.
> It would seem that if binary '+' is sign preserving, then 'j + 0'
> would preserve the sign of j if it was -0;
It's *allowed* to preserve the sign, it's not required to (that's the
elided part of the quote above).
--
Larry Jones
I like Mom to be impressed when I fulfill the least of my obligations.
-- Calvin
You said in the other branch of this thread:
> Allowing it would be a very special case since you don't want to allow
> ``-a'' to generate a negative zero if a happens to be zero.
where "it" refers to an expression applying the unary - operator to 0.
What's the truth?
Oh, the truth is that I misunderstood what you said. ;-)
You said creation of a negative zero in one place, and its
preservation in the other. Sorry.
A clearer way to say this would be to add the simple assignment
operator to the list, since the list already covers both creation
and preservation of a negative zero; I also misread it because I
missed the sentence following the list.
A question I'd like to ask here is:
Is "when stored in an object" intended to include other cases than
using the simple assignment operator?
I guess not; in that case, is there any other reason not to add it to
the list?
> Jun Woong <wo...@icu.ac.kr> wrote:
>
> [... some assumptions for code below snipped ...]
>
>>
>> char c;
>> c = 0x80;
>>
>> the standard specifies the result is i-d.
>>
>
> Anticipating a typical answer I may get ;-), yes, the standard also
> says an i-d signal can occur, which makes the code run in an
> unpredictable way.
No, it runs in a predictable way, given in the documentation
for implementation-defined behavior. The behavior is
implementation-dependent, but it is not unpredictable.
> My point, however, is that the standard *precludes* an implementation
> from storing a negative zero into c if the list cited in my previous
> post is taken as an exclusive one.
No, it doesn't. 7.14p4: "The complete set of signals, /their semantics/,
and their default handling, is implementation defined;" [emphasis added].
The signal handler could legally generate a negative zero, and its
semantics could be to return that value as the value of the conversion
that raised the signal.
> The standard enumerates cases where generation of negative zeros is
> allowed in 6.2.6.2p3:
>
> If the implementation supports negative zeros, they shall be
> generated only by:
> * the &, |, ^, ~, <<, and >> operators with arguments that produce
> such a value;
> * the +, -, *, /, and % operators where one argument is a negative
> zero and the result is zero;
> * compound assignment operators based on the above cases.
>
> Is this list exclusive? Or can an implementation-defined result
> override it?
>
> For example, suppose a sign-magnitude implementation with negative
> zero support where char is signed and has 8 bits; on this
> implementation, a negative zero has the value of 128 (== 0x80) when
> accessed with unsigned char. Then, for the following assignment:
>
> char c;
> c = 0x80;
>
> the standard specifies the result is i-d. Now can c have a negative
> zero since the result is i-d, or should it have any other value than a
> negative zero because of the list cited above?
Of course c can get a value of negative zero. See 6.3.1.3p3 and 7.14p4.
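For code that wants to stay clear of the implementation-defined result or signal in 6.3.1.3p3 altogether, one option is to check the range before converting. The sketch below uses hypothetical helper names and only the limits from <limits.h>.

```c
#include <limits.h>
#include <stdbool.h>

/* True if v is representable in char, in which case the conversion
   simply preserves the value (6.3.1.3p1). */
bool fits_in_char(int v)
{
    return v >= CHAR_MIN && v <= CHAR_MAX;
}

/* Store v into *dst only when the conversion is value-preserving;
   otherwise leave *dst untouched and report failure. */
bool store_checked(char *dst, int v)
{
    if (!fits_in_char(v))
        return false;
    *dst = (char)v;
    return true;
}
```

On the implementation assumed in this thread, store_checked(&c, 0x80) would fail because 0x80 exceeds CHAR_MAX, so the question of what c receives never arises; where char is unsigned, 0x80 fits and is stored as is.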
> Jun Woong wrote:
>> The standard enumerates cases where generation of negative zeros is
>> allowed in 6.2.6.2p3:
>>
>> If the implementation supports negative zeros, they shall be
>> generated only by:
>> * the &, |, ^, ~, <<, and >> operators with arguments that produce
>> such a value;
>> * the +, -, *, /, and % operators where one argument is a negative
>> zero and the result is zero;
>> * compound assignment operators based on the above cases.
>>
>> Is this list exclusive? Or can an implementation-defined result
>> override it?
>
> To me, the phrase "only by" seems to pretty clearly mark it as an
> exclusive list. Undefined behavior might override that exclusivity,
> since the standard explicitly does not constrain the behavior in any
> way when it is undefined, but implementation-defined behavior should
> not violate that requirement.
Yielding a value isn't the same as generating it.