In article <boj49j...@mid.dfncis.de>,
And it does this to fill a much-needed gap. The correct way to print
-1 converted to an unsigned short is to use plain %u format and cast
the arg to unsigned short yourself.
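
A minimal sketch of that approach follows. The helper name format_as_ushort
is made up for illustration. It adds a second cast to unsigned int, on the
assumption that one wants the promoted argument to have exactly the type
that %u expects rather than relying on int and unsigned being
interchangeable for representable values.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* format_as_ushort is a hypothetical helper name, not from the post.
 * It writes v, converted to unsigned short, into buf as decimal text.
 * For v == -1 the text is the decimal value of USHRT_MAX. */
static int format_as_ushort(char *buf, size_t size, int v)
{
    assert(buf != NULL && size > 0);
    /* Inner cast: modular conversion to unsigned short, defined for any int.
     * Outer cast: an unsigned short argument normally promotes to int in a
     * variadic call, so convert it explicitly to unsigned int for %u. */
    return snprintf(buf, size, "%u", (unsigned int)(unsigned short)v);
}
```

Calling format_as_ushort(buf, sizeof buf, -1) leaves the decimal text of
USHRT_MAX in buf, with no length modifier involved.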
I used to think that the behaviour is defined in all cases (to be that
of taking the (promoted) int or unsigned arg and casting it, with an
implementation-defined result in some cases). Now I am unsure what
the standard requires if there is a type mismatch even if the result
is representable.
I used to think that the behaviour is clearer for scanf() -- that
scanf() fills a much-needed gap for most arg types, because its
behaviour is undefined if the result is not representable so it
is unusable on input that hasn't been verified by other means.
However, its wording specifies representability of _converted_
results, and converted results are always representable (by the
definition of result). From n869.txt (n1570.pdf is no different):
% [#10] Except in the case of a % specifier, the input item
% (or, in the case of a %n directive, the count of input
% characters) is converted to a type appropriate to the
% conversion specifier. If the input item is not a matching
"appropriate" seems to have its non-technical meaning. Therefore, it
is the C type literally matching the conversion specifier.
% sequence, the execution of the directive fails: this
% condition is a matching failure. Unless assignment
% suppression was indicated by a *, the result of the
% conversion is placed in the object pointed to by the first
% argument following the format argument that has not already
% received a conversion result. If this object does not have
% an appropriate type, or if the result of the conversion
% cannot be represented in the object, the behavior is
% undefined.
This wording makes no sense. "Conversion" can only be read as being
to the "appropriate" type, not some infinite-precision non-string
type capable of representing any string. Some conversions to the
"appropriate" type (mainly ones involving floating point) have no result
since conversion gives undefined behaviour on overflow. But if there
is a result, then it is representable.
Floating point types give the additional problem that the infinite-precision
result might count as unrepresentable merely because it is not exactly
representable, so the specification cannot be fixed by saying that the
conversion is to an intermediate infinite-precision type.
%
% [#11] The length modifiers and their meanings are:
%
% hh Specifies that a following d, i, o, u, x, X, or
% n conversion specifier applies to an argument
% with type pointer to signed char or unsigned
% char.
This seems to say that scanf("%hhu", &u_char_var) on input -1 _is_
defined, since the result of the conversion of -1 to unsigned char
is just UCHAR_MAX and that is surely representable.
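
Whichever reading is right, code that does not want to depend on it can
parse with strtoul() and check the range itself. This is only a sketch:
parse_uchar is a made-up helper, and rejecting a leading minus sign is a
policy assumption of the example, not something the standard requires.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* parse_uchar is a hypothetical helper, not from the post or the standard.
 * Returns 1 and stores the value on success, 0 on any rejected input. */
static int parse_uchar(const char *s, unsigned char *out)
{
    char *end;
    unsigned long v;

    assert(s != NULL && out != NULL);
    while (isspace((unsigned char)*s))
        s++;
    /* Policy assumption: reject a minus sign, since strtoul() would
     * accept it and negate the value in unsigned long. */
    if (*s == '-')
        return 0;
    errno = 0;
    v = strtoul(s, &end, 10);
    if (end == s || *end != '\0')       /* no digits, or trailing junk */
        return 0;
    if (errno == ERANGE || v > UCHAR_MAX)
        return 0;                       /* does not fit in unsigned char */
    *out = (unsigned char)v;            /* value already range-checked */
    return 1;
}
```

With this, input -1 is an explicit error instead of a question about what
"the result of the conversion" means.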
Related case for floating point:
- ("%f", &float_var) on input 0.1 is defined, since the result of
conversion of 0.1 to float is 0.1F. Decimal 0.1 might be
unrepresentable as a binary float, but the behaviour is defined
since the converted value is surely representable in the converted
type.
- ("%f", &float_var) on input 1e6666666 is defined, at least in
implementations that support infinities, since the result of conversion
of 1e6666666 to float is INFINITY. 1e6666666 is unrepresentable as a
float, but the behaviour is defined, as above.
This is clearly wrong. The behaviour should be undefined on overflow,
but defined as the result of conversion when the infinite-precision
value is not exactly representable, except possibly when the conversion
underflows.
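
For contrast, the distinction wanted here (inexact is fine, overflow is an
error) can be made with strtof(), whose overflow behaviour is specified.
The following is a sketch only; parse_float_checked and its return codes
are invented for the example.

```c
#include <assert.h>
#include <errno.h>
#include <math.h>
#include <stdlib.h>

/* parse_float_checked is a hypothetical helper name.
 * Returns 0 on success, 1 on overflow, -1 if no number was parsed. */
static int parse_float_checked(const char *s, float *out)
{
    char *end;
    float v;

    assert(s != NULL && out != NULL);
    errno = 0;
    v = strtof(s, &end);
    if (end == s)
        return -1;                      /* no conversion was performed */
    *out = v;
    /* Overflow is reported as +/-HUGE_VALF with errno set to ERANGE. */
    if (errno == ERANGE && (v == HUGE_VALF || v == -HUGE_VALF))
        return 1;
    /* An input such as 0.1 that is merely inexact in binary lands here. */
    return 0;
}
```

Input 0.1 gives status 0 and a float near 0.1; input 1e6666666 gives
status 1 and HUGE_VALF.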
I used to think that the behaviour is clearer for the atol() family --
that it fills a much-needed gap by giving undefined behaviour like
sscanf(). The specification of these functions doesn't mention
conversion before representability:
% [#1] The functions atof, atoi, atol, and atoll need not
% affect the value of the integer expression errno on an
% error. If the value of the result cannot be represented,
% the behavior is undefined.
However, it is unclear what "the value of the result" means. What is
the difference between this and "the result"? I think "the result"
literally means just the result of conversion to the function return
type and the point about representability makes no sense, as above.
If it doesn't mean that, then it must mean an infinite-precision
intermediate result, but that makes no sense since it gives undefined
behaviour for results that are not exactly representable.
These bugs are all missing for the strtol() family. This family doesn't
fill gaps with undefined behaviour, and the wording of its specification
makes sense. For example, for the floating point subset:
% [#10] The functions return the converted value, if any. If
% no conversion could be performed, zero is returned. If the
% correct value is outside the range of representable values,
By saying "correct value" instead of "converted value", it doesn't
require conversion before considering representability. "correct"
is underspecified, but everyone knows what it means.
% plus or minus HUGE_VAL, HUGE_VALF, or HUGE_VALL is returned
% (according to the return type and sign of the value), and
% the value of the macro ERANGE is stored in errno. If the
% result underflows (7.12.1), the functions return a value
% whose magnitude is no greater than the smallest normalized
% positive number in the return type; whether errno acquires
% the value ERANGE is implementation-defined.
Even the behaviour on underflow is specified (in more detail than
rounding for non-underflowing cases).
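
The cases in the quoted paragraph can be told apart roughly as follows.
This is a sketch: the enum and parse_double are made-up names, and since
errno on underflow is implementation-defined, an underflow to exactly zero
is only detected where the implementation sets ERANGE.

```c
#include <assert.h>
#include <errno.h>
#include <float.h>
#include <math.h>
#include <stdlib.h>

/* Hypothetical status codes for the example. */
enum parse_status { PARSE_OK, PARSE_NONE, PARSE_OVERFLOW, PARSE_TINY };

/* parse_double is a hypothetical helper name. */
static enum parse_status parse_double(const char *s, double *out)
{
    char *end;
    double v, mag;

    assert(s != NULL && out != NULL);
    errno = 0;
    v = strtod(s, &end);
    if (end == s)
        return PARSE_NONE;              /* no conversion; zero was returned */
    *out = v;
    if (errno == ERANGE && (v == HUGE_VAL || v == -HUGE_VAL))
        return PARSE_OVERFLOW;          /* correct value out of range */
    mag = v < 0 ? -v : v;
    /* A nonzero result below DBL_MIN is subnormal, so it underflowed.
     * Any remaining ERANGE also indicates underflow, but whether errno
     * is set in that case is implementation-defined. */
    if ((v != 0 && mag < DBL_MIN) || errno == ERANGE)
        return PARSE_TINY;
    return PARSE_OK;
}
```

For example, input 1e99999 yields PARSE_OVERFLOW with HUGE_VAL stored, and
input 1e-99999 stores a value no larger in magnitude than DBL_MIN.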
Bruce