#include <stddef.h>
wchar_t foo = L'ab';
can an implementation declare the value of L'ab' to be essentially
L'a' * (WCHAR_MAX + 1) + L'b', and thus reject the above as a violation
of constraint 6.4.4p2 which requires the value to be in range of a wchar_t?
If so, I assume 'ab' could also be rejected for the same reason, if
the implementation declared multicharacter non-wide constants to have
some out-of-range value such as INT_MAX + 1.
Neil.
Not only can it, but actually that's what I would expect.
> If so, I assume 'ab' could also be rejected for the same reason, if
> the implementation declared multicharacter non-wide constants to have some
> out-of-range value such as INT_MAX + 1.
There is a difference there, in that 'ab' has type int, not type char,
and it is specified as packing char representations into that int datum
(in an implementation-specific way). This is a legacy issue and the
language is pretty much stuck with it.
The standard merely says (C99 6.4.4.4p10):
    The value of an integer character constant containing more than
    one character (e.g., 'ab'), or containing a character or escape
    sequence that does not map to a single-byte execution character,
    is implementation-defined.
It says nothing about "packing char representations into that int
datum".
It makes sense to do so, and if you say that's the intent I'll take
your word for it, but the standard doesn't actually say so.
--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
> Not only can it, but actually that's what I would expect.
Thanks.
>> If so, I assume 'ab' could also be rejected for the same reason, if
>> the implementation declared multicharacter non-wide constants to have some
>> out-of-range value such as INT_MAX + 1.
>
> There is a difference there, in that 'ab' has type int, not type char,
> and it is specified as packing char representations into that int datum
> (in an implementation-specific way). This is a legacy issue and the
> language is pretty much stuck with it.
But there seems to be nothing preventing an implementation from
declaring multichar constants out of range (say 1 million) and rejecting
them outright. Of course, I doubt real implementations would do this.
I was interested because I was wondering what can reliably be tested by
a test suite, and it seems the answer is "nothing".
Neil.
So far as I recall, they cannot do that. The value gets to be
defined by the implementor, but there seems to be a strong
implication that the value has to be representable.
Of course, that includes 'abcd'=='a' or some other simple mapping.
Keith was right to note that the packing aspect is not an actual
requirement, just common practice.
> I was interested because I was wondering what can reliably be tested by
> a test suite, and it seems the answer is "nothing".
That was true even in pre-Standard C, where 'ab' on one ASCII
host might have the same value as 'ba' on another ASCII host.
>> But there seems to be nothing preventing an implementation from
>> declaring multichar constants out of range (say 1 million) and rejecting
>> them outright. Of course, I doubt real implementations would do this.
>
> So far as I recall, they cannot do that. The value gets to be
> defined by the implementor, but there seems to be a strong
> implication that the value has to be representable.
The standard should be explicit about that then. If 'ab' must be
given a representable value then you should surely require the same
for L'ab'; the wording in the standard is identical for both cases.
>> I was interested because I was wondering what can reliably be tested by
>> a test suite, and it seems the answer is "nothing".
>
> That was true even in pre-Standard C, where 'ab' on one ASCII
> host might have the same value as 'ba' on another ASCII host.
My point was that to me it seemed you couldn't even use such a
multichar constant in a test as the implementation is entitled to
reject it.
That may not be the intent, but you cannot deduce that from the
standard's wording IMO.
Neil.
From the C99+TC1+TC2 May 2005 draft (6.4.4.4p10):
    The value of an integer character constant containing more than
    one character (e.g., 'ab'), or containing a character or escape
    sequence that does not map to a single-byte execution character,
    is implementation-defined.
For my class project, I had the lexer add a random number to a hash of
the character sequence; I’m pretty sure that was conforming.
--Joel
A program that merely uses implementation-defined behavior is *meant*
to be a correct program. If its output does not depend on that
behavior, it qualifies as a strictly conforming program, which a
conforming implementation is required to accept; to be precise, a
conforming implementation is not allowed to reject it just because it
uses a multi-character constant.
--
Jun, Woong (woong at icu.ac.kr)
Samsung Electronics Co., Ltd.
``All opinions expressed are mine, and do not represent
the official opinions of any organization.''