
C99 printf formats


Adrian Sandor

Mar 10, 2004, 11:32:51 AM
I'm working on a C99 standard library implementation for mingw.
I would like to know what is the *correct* output of this program:

#include <stdio.h>

int main()
{
printf("%hd\n", 0xf4f5f6f7);
printf("%hhd\n", 0xf4f5f6f7);
printf("%zd\n", 0xf4f5f6f7);
}

notes: sizeof(int)==4, sizeof(short int)==2, ptrdiff_t is int, size_t
is unsigned int

thanks

Adrian

Kevin Bracey

Mar 10, 2004, 12:20:33 PM
In message <8ecf8e03.04031...@posting.google.com> you wrote:

> I'm working on a C99 standard library implementation for mingw.
> I would like to know what is the *correct* output of this program:
>

> notes: sizeof(int)==4, sizeof(short int)==2, ptrdiff_t is int, size_t
> is unsigned int

Well, also assuming CHAR_BIT=8, no padding etc etc... This means that
the constant 0xf4f5f6f7 has type unsigned int.

> #include <stdio.h>
>
> int main()
> {
> printf("%hd\n", 0xf4f5f6f7);

Technically undefined behaviour, as the conversion of 0xf4f5f6f7 to short int
falls out of range. But assuming you do 2's complement modulo arithmetic when
converting to signed values, the correct output is "-2313\n".

The phrasing in the standard is woolly enough to argue that it's implicitly
undefined behaviour if you supply a "real" int instead of a promoted short,
regardless of value. I'm not sure that's the intent, as why in that case
would the standard need to say that the input value is converted to short
int?

> printf("%hhd\n", 0xf4f5f6f7);

Same argument as above, leading to "-9\n".

> printf("%zd\n", 0xf4f5f6f7);
> }

"%zd" is a little odd, but legal, and means the argument is an int in your
case. So it's the same as just "%d". The behaviour is undefined because the
value 0xf4f5f6f7 isn't representable in an int. A normal 2's complement
implementation would output "-185207049\n".

Actually, come to think of it, the first two cases are undefined for the same
reason as this case, as they're expecting an int argument too. But anyway, if
you're not implementing this for the DeathStation 9000, those are the outputs
you want.

--
Kevin Bracey, Principal Software Engineer
Tematic Ltd Tel: +44 (0) 1223 503464
182-190 Newmarket Road Fax: +44 (0) 1223 503458
Cambridge, CB5 8HE, United Kingdom WWW: http://www.tematic.com/

Adrian Sandor

Mar 11, 2004, 12:47:10 AM
Thank you for your reply

Kevin Bracey wrote:
> Well, also assuming CHAR_BIT=8, no padding etc etc... This means that
> the constant 0xf4f5f6f7 has type unsigned int.

yes, of course

> "%zd" is a little odd, but legal, and means the argument is an int in your
> case. So it's the same as just "%d". The behaviour is undefined because the
> value 0xf4f5f6f7 isn't representable in an int.

Well, at http://www.dinkumware.com/manuals/reader.aspx?b=c/&h=lib_prin.html#Print%20Conversion%20Specifiers
(note: you may have to access it twice because the first time it
displays the "limited access notice") it says that %zd takes an
argument of type size_t and then converts it to a ptrdiff_t (so it
should take my number as unsigned int and convert it to int).
Is the behaviour still undefined when the value is in the range of
size_t but out of the range of ptrdiff_t? Or is that information
wrong?
(BTW, if anybody from Dinkumware is reading this, why do %hh@, %j@,
%ll@, %t@ and %z@, with @ in {u, x, X}, have default base==8?)

> Actually, come to think of it, the first two cases are undefined for the same
> reason as this case, as they're expecting an int argument too.

Oh, then suppose I was passing an int with the same binary
representation on 4 bytes :) Then I guess they would be undefined for
the initial reasons.

> But anyway, if you're not implementing this for the DeathStation 9000,
> those are the outputs you want.

That's what I thought too, but I saw implementations that displayed
the int value for %h and %hh. Well, if the behaviour is undefined,
then I guess they're not wrong. I may have to modify my conformity
tests.

Adrian

Dan Pop

Mar 11, 2004, 7:38:13 AM
In <8ecf8e03.04031...@posting.google.com> adi...@yahoo.com (Adrian Sandor) writes:

>Well, at http://www.dinkumware.com/manuals/reader.aspx?b=c/&h=lib_prin.html#Print%20Conversion%20Specifiers

There is NO substitute for the C99 standard for an implementor.

>(note: you may have to access it twice because the first time it
>displays the "limited access notice") it says that %zd takes an
>argument of type size_t and then converts it to a ptrdiff_t (so it
>should take my number as unsigned int and convert it to int).
>Is the behaviour still undefined when the value is in the range of
>size_t but out of the range of ptrdiff_t? Or is that information
>wrong?

It's dead wrong. From the C99 standard:

z Specifies that a following d, i, o, u, x, or X
conversion specifier applies to a size_t or the
corresponding signed integer type argument; or that
a following n conversion specifier applies to a
pointer to a signed integer type corresponding to
size_t argument.

t Specifies that a following d, i, o, u, x, or X
conversion specifier applies to a ptrdiff_t or
the corresponding unsigned integer type argument;
or that a following n conversion specifier applies
to a pointer to a ptrdiff_t argument.

ptrdiff_t has absolutely no connection to the z modifier, period.

>> But anyway, if you're not implementing this for the DeathStation 9000,
>> those are the outputs you want.
>
>That's what I thought too, but I saw implementations that displayed
>the int value for %h and %hh. Well, if the behaviour is undefined,
>then I guess they're not wrong. I may have to modify my conformity
>tests.

They are not wrong, because the code invokes undefined behaviour, but the
intent of the standard is clear: printf is supposed to convert back to the
target type:

hh Specifies that a following d, i, o, u, x, or X conversion
specifier applies to a signed char or unsigned
char argument (the argument will have been promoted
according to the integer promotions, but its value
^^^^^^^^^^^^^
shall be converted to signed char or unsigned char
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
before printing); or that a following n conversion
^^^^^^^^^^^^^^^
specifier applies to a pointer to a signed char argument.

h Specifies that a following d, i, o, u, x, or X
conversion specifier applies to a short int or unsigned
short int argument (the argument will have been promoted
according to the integer promotions, but its value shall
^^^^^^^^^^^^^^^^^^^
be converted to short int or unsigned short int before
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
printing); or that a following n conversion specifier
^^^^^^^^
applies to a pointer to a short int argument.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Dan...@ifh.de

P.J. Plauger

Mar 11, 2004, 8:27:03 AM
"Adrian Sandor" <adi...@yahoo.com> wrote in message
news:8ecf8e03.04031...@posting.google.com...

> Well, at
> http://www.dinkumware.com/manuals/reader.aspx?b=c/&h=lib_prin.html#Print%20Conversion%20Specifiers
> (note: you may have to access it twice because the first time it
> displays the "limited access notice") it says that %zd takes an
> argument of type size_t and then converts it to a ptrdiff_t (so it
> should take my number as unsigned int and convert it to int).
> Is the behaviour still undefined when the value is in the range of
> size_t but out of the range of ptrdiff_t? Or is that information
> wrong?
> (BTW, if anybody from Dinkumware is reading this, why do %hh@, %j@,
> %ll@, %t@ and %z@, with @ in {u, x, X}, have default base==8?)

I guess because that correction hasn't percolated out to our web
site yet. Thanks for pointing out the gaffe.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com


P.J. Plauger

Mar 11, 2004, 8:46:36 AM
"Dan Pop" <Dan...@cern.ch> wrote in message
news:c2pmjl$87e$1...@sunnews.cern.ch...

Dead wrong? Oh, I think that's a bit harsh, though I do confess to
having taken a shortcut for the purposes of simpler explication.
The z modifier, when applied to signed displays, converts a size_t
argument to "a signed integer type corresponding to a size_t argument."
ptrdiff_t is "the signed integer type of the result of subtracting
two pointers." One can make a case for the latter being no bigger
than the former, but it's hard to make a case that a conforming
implementation can make the latter smaller than the former.

I'm sure the good folks who haunt this forum can contrive an
implementation, and a nominally conforming program that runs on it,
where the effect of our description *might* fall short. It would
have to be due to an overflow that differs between the signed type
chosen for ptrdiff_t and the signed type "corresponding to" (the
same size as?) size_t. But to say that ptrdiff_t has "absolutely no
connection to the z modifier, period" is, well, dead wrong.

Bjorn Reese

Mar 11, 2004, 11:42:52 AM
On Thu, 11 Mar 2004 12:38:13 +0000, Dan Pop wrote:

> z Specifies that a following d, i, o, u, x, or X
> conversion specifier applies to a size_t or the
> corresponding signed integer type argument; or that
> a following n conversion specifier applies to a
> pointer to a signed integer type corresponding to
> size_t argument.

How does one determine "the corresponding signed integer type"?

--
mail1dotstofanetdotdk

James Kuyper

Mar 11, 2004, 12:10:51 PM

Whichever type the implementor chooses size_t to be a typedef for, it's
required to have a corresponding signed integer type (6.2.5p6). If it's
a standard type, the corresponding signed type is formed simply by
removing the 'unsigned' portion of the type. There are a couple of
exceptions: the corresponding signed type for 'unsigned char', is
'signed char', and _Bool is an unsigned type with no corresponding
standard signed type. Both of those cases are rather unlikely (but not
non-conforming) choices for size_t. :-)

size_t could also be an extended unsigned type, in which case you'll
have to look at the implementation's documentation to determine the
corresponding signed type. It's required to exist, and is
implementation-specific, but it's not labelled as
implementation-defined. Therefore, that's not one of the things an
implementation is required to document, but any decent implementation
should do so anyway.

P.J. Plauger

Mar 11, 2004, 12:17:17 PM
"Bjorn Reese" <bre...@see.signature> wrote in message
news:pan.2004.03.11....@see.signature...

Yup. Presumably, the folks who provide the implementation know
what this type is. But us poor third-party vendors are not so
lucky. That's why I chose ptrdiff_t as both a) something most
programmers can relate to, and b) a type that almost certainly
does what's intended.

Kevin Bracey

Mar 11, 2004, 12:47:30 PM
In message <pan.2004.03.11....@see.signature>
"Bjorn Reese" <bre...@see.signature> wrote:

In a portable fashion as a user of the library? Haven't the foggiest :)
That's why it's of dubious value. I suppose having that statement does mean
that printing a size_t with %zd is legal, as long as the size_t is in the
range of the corresponding signed type. But again, who's to say what the
range of that signed type is? Probably means you can print 0..SIZE_MAX/2 with
"%zd" in normal implementations though.

Jeremy Yallop

Mar 11, 2004, 2:08:01 PM
James Kuyper wrote:
> There are a couple of exceptions: the corresponding signed type for
> 'unsigned char', is 'signed char', and _Bool is an unsigned type
> with no corresponding standard signed type. Both of those cases are
> rather unlikely (but not non-conforming) choices for size_t. :-)

_Bool is a non-conforming choice for size_t, since SIZE_MAX must be at
least 65535.

Jeremy.

James Kuyper

Mar 11, 2004, 2:26:45 PM

The standard only requires that _Bool must be capable of storing 0 and
1; it doesn't prohibit an implementation from defining it as being able
to represent larger values. :-)

Adrian Sandor

Mar 11, 2004, 3:38:25 PM
Dan Pop wrote:

> There is NO substitute for the C99 standard for an implementor.

> [...]

Ok, take it easy :)
I understand this (and it's becoming even clearer now), but the C99
standard is a substitute for $18 (or more), and I'm not currently
willing (or even able(?) regarding online payment) to buy it.
I referred to the next best thing (called n869), and I found the same
things you quoted, but they seem ambiguous to me. E.g. for z, it's not
clear when to use size_t and when to use the "corresponding signed
integer type", and what kind of animal is that.
That's why I wanted to find some alternative source, written by people
who understood what the standard intended to say, and that explains
the details more clearly. The Dinkumware reference seemed to be a good
choice; unfortunately I found it was incorrect.

Anyway, the things are clear now *if* I use my common sense. We could
still go on with a sterile discussion about the phrasing of the
standard, ambiguities and interpretations, extreme conforming
implementations, defined/undefined behaviour disputes, etc. but IMO
it's not worth it.

Thanks everyone for the help!

Adrian

Clark Cox

Mar 11, 2004, 4:52:48 PM
In article <4050BD75...@saicmodis.com>,
James Kuyper <kuy...@saicmodis.com> wrote:

Of course, how you'd get that value into the _Bool in the first place
escapes me:

6.3.1.2
1 When any scalar value is converted to _Bool, the result is 0 if the
value compares equal to 0; otherwise, the result is 1.

Douglas A. Gwyn

Mar 11, 2004, 4:46:18 PM
Bjorn Reese wrote:
> How does one determine "the corresponding signed integer type"?

You can't do it automatically+portably, but what is meant is
the correspondence between signed and unsigned integer types
explained when integer types are first introduced.

Dan Pop

Mar 12, 2004, 9:41:12 AM

>Dan Pop wrote:
>
>> There is NO substitute for the C99 standard for an implementor.
>> [...]
>
>Ok, take it easy :)
>I understand this (and it's becoming even clearer now), but the C99
>standard is a substitute for $18 (or more), and I'm not currently
>willing (or even able(?) regarding online payment) to buy it.
>I referred to the next best thing (called n869), and I found the same

Yes, N869 is the best substitute for the C99 standard, if the $18 are
ruling out the real thing.

>things you quoted, but they seem ambiguous to me. E.g. for z, it's not
>clear when to use size_t and when to use the "corresponding signed
>integer type", and what kind of animal is that.

It's quite clear: if the modified conversion specifier expects a signed
int, you have to use the mythical signed integer type corresponding to
size_t; if it expects an unsigned int, you use size_t itself.

So, %zd and %zi expect the signed type, while %zo, %zu, %zx and %zX
expect size_t itself.

As to figuring out the type expected by %zd, there is NO portable way.
The library is supposed to be implemented by people with inside
information about the compiler, which rules out truly portable
implementations. You have (at least) 3 options:

1. Assume it's a standard integer type and use the preprocessor to find
a suitable one (with <limits.h> for the *_MAX limits and <stdint.h> for
SIZE_MAX):

#if UCHAR_MAX == SIZE_MAX
typedef signed char s_size_t;
#elif USHRT_MAX == SIZE_MAX
typedef signed short s_size_t;
#elif UINT_MAX == SIZE_MAX
typedef int s_size_t;
...
#else
#error "Cannot identify a type corresponding to size_t"
#endif

No sane implementation is likely to use unsigned char or unsigned short
for size_t, so you may omit the first two alternatives, unless you
want to be anally correct.

If SIZE_MAX corresponds to two or more unsigned integer types, e.g.
unsigned int and unsigned long, it is reasonable to assume that both
types are handled identically by the compiler, so it doesn't really
matter if you define s_size_t as int, while the implementation defines
size_t as unsigned long.

2. The Dinkumware heuristic: use ptrdiff_t as the most likely candidate.

3. On POSIX platforms, you may include <sys/types.h> and use ssize_t
as the most likely candidate.



>Anyway, the things are clear now *if* I use my common sense. We could

You *have* to use it, when dealing with the C standard.

James Kuyper

Mar 12, 2004, 9:57:16 AM
Clark Cox wrote:
>
> In article <4050BD75...@saicmodis.com>,
> James Kuyper <kuy...@saicmodis.com> wrote:
...

> > The standard only requires that _Bool must be capable of storing 0 and
> > 1; it doesn't prohibit an implementation from defining it as being able
> > to represent larger values. :-)
>
> Of course, how you'd get that value into the _Bool in the first place
> escapes me:
>
> 6.3.1.2
> 1 When any scalar value is converted to _Bool, the result is 0 if the
> value compares equal to 0; otherwise, the result is 1.

Agreed. I'm not sure why the standard doesn't just say that the only
values that a _Bool can represent are 0 and 1.

Dan Pop

Mar 12, 2004, 9:57:41 AM
In <05_3c.36610$rW6....@nwrddc03.gnilink.net> "P.J. Plauger" <p...@dinkumware.com> writes:

>"Dan Pop" <Dan...@cern.ch> wrote in message
>news:c2pmjl$87e$1...@sunnews.cern.ch...
>
>> In <8ecf8e03.04031...@posting.google.com> adi...@yahoo.com
>(Adrian Sandor) writes:
>>
>> >Well, at
>http://www.dinkumware.com/manuals/reader.aspx?b=c/&h=lib_prin.html#Print%20Conversion%20Specifiers
>>
>> There is NO substitute for the C99 standard for an implementor.
>>
>> >(note: you may have to access it twice because the first time it
>> >displays the "limited access notice") it says that %zd takes an
>> >argument of type size_t and then converts it to a ptrdiff_t (so it
>> >should take my number as unsigned int and convert it to int).
>> >Is the behaviour still undefined when the value is in the range of
>> >size_t but out of the range of ptrdiff_t? Or is that information
>> >wrong?
>>
>> It's dead wrong. From the C99 standard:
>>
>> z Specifies that a following d, i, o, u, x, or X
>> conversion specifier applies to a size_t or the

^^^^^^^^^^^^^^^^^^^^^^^^^^


>> corresponding signed integer type argument; or that

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^


>> a following n conversion specifier applies to a
>> pointer to a signed integer type corresponding to
>> size_t argument.
>>
>> t Specifies that a following d, i, o, u, x, or X
>> conversion specifier applies to a ptrdiff_t or
>> the corresponding unsigned integer type argument;
>> or that a following n conversion specifier applies
>> to a pointer to a ptrdiff_t argument.
>>
>> ptrdiff_t has absolutely no connection to the z modifier, period.
>
>Dead wrong? Oh, I think that's a bit harsh, though I do confess to
>having taken a shortcut for the purposes of simpler explication.
>The z modifier, when applied to signed displays, converts a size_t
>argument to "a signed integer type corresponding to a size_t argument."

Wrong! It does NOT expect a size_t argument in such a case, it
expects its corresponding signed type (so passing negative values of the
right type does NOT invoke undefined behaviour). Because of this,
%zd and %zi are useless in programs that don't make assumptions about
the type of size_t.

>ptrdiff_t is "the signed integer type of the result of subtracting
>two pointers." One can make a case for the latter being no bigger
>than the former, but it's hard to make a case that a conforming
>implementation can make the latter smaller than the former.

Chapter and verse, please. If the standard wanted to support your view,
there was nothing preventing it from stating it in a normative paragraph,
was it?

>I'm sure the good folks who haunt this forum can contrive an
>implementation, and a nominally conforming program that runs on it,
>where the effect of our description *might* fall short. It would
>have to be due to an overflow that differs between the signed type
>chosen for ptrdiff_t and the signed type "corresponding to" (the
>same size as?) size_t. But to say that ptrdiff_t has "absolutely no
>connection to the z modifier, period" is, well, dead wrong.

Show us the text of the standard establishing such a connection.
If the standard wanted to do it, it would have done it.

From a pragmatic point of view, I strongly agree that ptrdiff_t should be
the signed "brother" of size_t and I see no good reason for not having it
*explicitly* stated by the standard. But, this being comp.std.c and the
standard being what it is, there is no way to establish a logical
connection between the two types, based on the text of the standard,
period.

Eric Backus

Mar 12, 2004, 2:34:02 PM
"James Kuyper" <kuy...@saicmodis.com> wrote in message
news:4051CFCC...@saicmodis.com...

Maybe so that implementations don't do the equivalent of (value & 0x1),
which would, for example, map 2 to 0. After all, that's the sort of
reduction that is typically done for other conversions of one integer type
to a smaller integer type.

--
Eric Backus

James Kuyper

Mar 13, 2004, 10:14:12 AM
"Eric Backus" <eric_...@alum.mit.edu> wrote in message news:<10791200...@cswreg.cos.agilent.com>...

> "James Kuyper" <kuy...@saicmodis.com> wrote in message
> news:4051CFCC...@saicmodis.com...
> > Clark Cox wrote:
...

> > > 6.3.1.2
> > > 1 When any scalar value is converted to _Bool, the result is 0 if the
> > > value compares equal to 0; otherwise, the result is 1.
> >
> > Agreed. I'm not sure why the standard doesn't just say that the only
> > values that a _Bool can represent are 0 and 1.
>
> Maybe so that implementations don't do the equivalent of (value & 0x1),
> which would, for example, map 2 to 0. After all, that's the sort of
> reduction that is typically done for other conversions of one integer type
> to a smaller integer type.

That's an invalid way of converting a non-_Bool value to _Bool,
regardless of the representation of _Bool. It violates 6.3.1.2. It
can't be a reason for allowing an implementation to define _Bool as
having a representation of 2.

Dan Pop

Mar 15, 2004, 9:15:40 AM

You're missing the point. The value got into the _Bool object without
being converted to _Bool first (object initialised with garbage or
aliased via a pointer to unsigned char). Now, if the standard guaranteed
that a _Bool can represent *only* 0 and 1, the compiler would have to
generate code to convert any bit pattern to 0 or 1 and the previous
poster gave one example of such mapping.

If *only* 0 and 1 can be represented, then there is no place for undefined
behaviour in my example above, just as there is no place for undefined
behaviour in the case of an unsigned char: even uninitialised, you're
still guaranteed to find a value in the range 0..UCHAR_MAX inside.

Leaving place for other values opens the door to trap representations
and undefined behaviour, so the implementation no longer has to perform
any mapping when accessing the value of a _Bool: a badly initialised
_Bool need not evaluate to 0 or 1 *only*.

Wojtek Lerch

Mar 15, 2004, 2:33:58 PM
Dan...@cern.ch (Dan Pop) wrote in message news:<c34dqc$759$1...@sunnews.cern.ch>...

> In <8b42afac.04031...@posting.google.com> kuy...@wizard.net (James Kuyper) writes:
>
> >"Eric Backus" <eric_...@alum.mit.edu> wrote in message news:<10791200...@cswreg.cos.agilent.com>...
> >> "James Kuyper" <kuy...@saicmodis.com> wrote in message
> >> news:4051CFCC...@saicmodis.com...
> >> > Agreed. I'm not sure why the standard doesn't just say that the only
> >> > values that a _Bool can represent are 0 and 1.
> >>
> >> Maybe so that implementations don't do the equivalent of (value & 0x1),
> >> which would, for example, map 2 to 0. After all, that's the sort of
> >> reduction that is typically done for other conversions of one integer type
> >> to a smaller integer type.
> >
> >That's an invalid way of converting a non-_Bool value to _Bool,
> >regardless of the representation of _Bool. It violates 6.3.1.2. It
> >can't be a reason for allowing an implementation to define _Bool as
> >having a representation of 2.
>
> You're missing the point. The value got into the _Bool object without
> being converted to _Bool first (object initialised with garbage or
> aliased via a pointer to unsigned char). Now, if the standard guaranteed
> that a _Bool can represent *only* 0 and 1, the compiler would have to
> generate code to convert any bit pattern to 0 or 1 and the previous
> poster gave one example of such mapping.

If I understand the previous poster correctly, he was talking about
conversions from other integer types to _Bool, not about extracting
the value from a _Bool representation.

Even if the standard guaranteed that the only values _Bool can
represent are 0 and 1, reading from a _Bool containing a trap
representation could still produce the value 2 or 10000 (or reboot
your machine). Without a guarantee that _Bool doesn't have trap
representations, a promise that a _Bool can't represent a value
different from 0 and 1 wouldn't be any more useful than the existing
promise that converting to _Bool can't produce any value different
than 0 and 1.



> If *only* 0 and 1 can be represented, then there is no place for undefined
> behaviour in my example above, just as there is no place for undefined
> behaviour in the case of an unsigned char: even uninitialised, you're
> still guaranteed to find a value in the range 0..UCHAR_MAX inside.

Unsigned char is not a good analogy: it doesn't have trap
representations because the standard specifically says so, not because
it can only represent the values in the range 0 to UCHAR_MAX.
Unsigned int can only represent values in the range 0 to UINT_MAX, but
that doesn't mean that it can't have trap representations, does it?



> Leaving place for other values opens the door to trap representations
> and undefined behaviour, so the implementation no longer has to perform
> any mapping when accessing the value of a _Bool: a badly initialised
> _Bool need not evaluate to 0 or 1 *only*.

No, leaving place for other values has nothing to do with trap
representations.

James Kuyper

Mar 15, 2004, 3:10:16 PM
Wojtek Lerch wrote:
>
> Dan...@cern.ch (Dan Pop) wrote in message news:<c34dqc$759$1...@sunnews.cern.ch>...
> > In <8b42afac.04031...@posting.google.com> kuy...@wizard.net (James Kuyper) writes:
> >
> > >"Eric Backus" <eric_...@alum.mit.edu> wrote in message news:<10791200...@cswreg.cos.agilent.com>...
> > >> "James Kuyper" <kuy...@saicmodis.com> wrote in message
...

> Even if the standard guaranteed that the only values _Bool can
> represent are 0 and 1, reading from a _Bool containing a trap
> representation could still produce the value 2 or 10000 (or reboot
> your machine). Without a guarantee that _Bool doesn't have trap
> representations, a promise that a _Bool can't represent a value
> different from 0 and 1 wouldn't be any more useful than the existing
> promise that converting to _Bool can't produce any value different
> than 0 and 1.

I agree - in the sense that it would not change the actual range of
allowed consequences for a given piece of code. However, I think it
would still be useful, by reason of being clearer than the current
wording. _Bool is clearly not intended to store values other than 0 or 1
- the standard would be better if it said so explicitly.

The standard could say something like: "_Bool is an integer type with 1
value bit, no sign bit, and no trap representations." However, I want to
make it quite clear that I think this would be over-specification, and
unacceptably inefficient on some platforms for some purposes.

--
James Kuyper
MODIS Level 1 Lead
Science Data Support Team
(301) 352-2150

Antoine Leca

Mar 17, 2004, 11:00:24 AM
Hi folks,

[ about %zd ]

En c2sj55$dou$2...@sunnews.cern.ch, Dan Pop va escriure:


> It does NOT expect a size_t argument in such a case, it
> expects its corresponding signed type (so passing negative values of
> the right type does NOT invoke undefined behaviour). Because of this,
> %zd and %zi are useless in programs that don't make assumptions about
> the type of size_t.

What about printing the result of mbrtowc() when you are sure the value you
passed for n is below 32767?

Note that I did not think very long about the eventual problems of double
conversions, particularly 1's-complement and signed-magnitude.


>> ptrdiff_t is "the signed integer type of the result of subtracting
>> two pointers." One can make a case for the latter being no bigger
>> than the former, but it's hard to make a case that a conforming
>> implementation can make the latter smaller than the former.

Let's see if I can: consider 16-bit x86, with 32-bit (segmented) pointers,
without adjustment (_not_ "huge" pointers). For ease of implementation,
ptrdiff_t is usually signed 16-bit. Now, in order to support some extensions
to the standard, one makes size_t 32 bits.

Is it conforming? (all other things being otherwise correct) [at least to
C89, I did not remember if C99 requires ±65535 for ptrdiff_t range]


Antoine


Dan Pop

Mar 17, 2004, 1:08:46 PM
In <405883a6$0$18757$636a...@news.free.fr> "Antoine Leca" <ro...@localhost.gov> writes:

>Hi folks,
>
>[ about %zd ]
>
>En c2sj55$dou$2...@sunnews.cern.ch, Dan Pop va escriure:
>> It does NOT expect a size_t argument in such a case, it
>> expects its corresponding signed type (so passing negative values of
>> the right type does NOT invoke undefined behaviour). Because of this,
>> %zd and %zi are useless in programs that don't make assumptions about
>> the type of size_t.
>
>What about printing the result of mbrtowc() when you are sure the value you
>passed for n is below 32767?

mbrtowc() can return values like (size_t)(-2) and (size_t)(-1) which may
be out of the range of the mythical type expected by %zd.

Furthermore, there is no consensus that passing the type with the wrong
signedness is acceptable to printf, if the value is in range. The
normative text of the standard dictates undefined behaviour and only
a non-normative footnote suggests otherwise.

Kevin Bracey

Mar 17, 2004, 1:59:32 PM
In message <c3a47e$77r$1...@sunnews.cern.ch>
Dan...@cern.ch (Dan Pop) wrote:

> Furthermore, there is no consensus that passing the type with the wrong
> signedness is acceptable to printf, if the value is in range. The
> normative text of the standard dictates undefined behaviour and only a
> non-normative footnote suggests otherwise.

Is there consensus on anything in comp.*.c? I'm sure you can find someone to
disagree with any argument.

This really should be covered by the more general rules on variadic argument
passing, rather than anything to do with printf specifically.

The only problem (as far as I can see) is that printf() may not be
implemented in C, so might not use va_arg, whose description contains the
clause stating compatibility of corresponding unsigned and signed integer
types.

So in theory, it would seem that printf() could use a different calling
convention to a user variadic function, and that calling convention might not
support the same signed/unsigned compatibility as va_arg and unprototyped
functions.

In practice, this just isn't going to happen, of course. Gives the language
lawyers something to talk about though, which is always a good thing.

Antoine Leca

Mar 17, 2004, 2:17:39 PM
Dan,

Thanks for your detailed explanations.

En c3a47e$77r$1...@sunnews.cern.ch, Dan Pop va escriure:


> In <405883a6$0$18757$636a...@news.free.fr> "Antoine Leca"
> <ro...@localhost.gov> writes:
>
>> [ about %zd ]
>>
>> En c2sj55$dou$2...@sunnews.cern.ch, Dan Pop va escriure:
>>> It does NOT expect a size_t argument in such a case, it
>>> expects its corresponding signed type (so passing negative values of
>>> the right type does NOT invoke undefined behaviour). Because of
>>> this, %zd and %zi are useless in programs that don't make
>>> assumptions about the type of size_t.
>>
>> What about printing the result of mbrtowc() when you are sure the
>> value you passed for n is below 32767?
>
> mbrtowc() can return values like (size_t)(-2) and (size_t)(-1) which

Exactly my idea (printing them in a useful way, i.e. "-1" rather than some
unreadable large constant).

> may be out of the range of the mythical type expected by %zd.

Well, after a bit more thought, I cannot think of a result that might be
out of range, but you are correct there might be problems, as they might
result in trap representations.

Why mythical? Isn't it required by 6.2.5p6? (Given that size_t cannot be
_Bool).


> Furthermore, there is no consensus that passing the type with the
> wrong signedness is acceptable to printf, if the value is in range.
> The normative text of the standard dictates undefined behaviour
> and only a non-normative footnote suggests otherwise.

Granted, I missed that point (particularly that 6.3.1.3p3 does not apply for
vararg arguments, since 7.15.1.1p2 clearly "preempts" it and _requires_
values representable in both types, that is, between 0 and SignedType_MAX).

Where is that footnote you are talking about?


Antoine


James Kuyper

Mar 17, 2004, 2:21:31 PM
Kevin Bracey wrote:
>
> In message <c3a47e$77r$1...@sunnews.cern.ch>
> Dan...@cern.ch (Dan Pop) wrote:
>
> > Furthermore, there is no consensus that passing the type with the wrong
> > signedness is acceptable to printf, if the value is in range. The
> > normative text of the standard dictates undefined behaviour and only a
> > non-normative footnote suggests otherwise.
>
> Is there consensus on anything in comp.*.c? I'm sure you can find someone to
> disagree with any argument.
>
> This really should be covered by the more general rules on variadic argument
> passing, rather than anything to do with printf specifically.

Argument passing by variadic functions, is certainly part of the issue.
However, there's also section 7.19.6.1p9, which says "If any argument is
not the correct type for the corresponding conversion specification, the
behavior is undefined."

Now, printf() is implicitly required to use an argument passing
algorithm that works just like va_arg(), when the type is correct.
However, 7.19.6.1p9 means that when the type is incorrect, it can use
any method it likes, including methods that distinguish corresponding
signed and unsigned types, even when the value is within the range of
both types. It's also allowed to distinguish other types that are merely
compatible with each other, rather than "correct".

Antoine Leca

Mar 17, 2004, 2:52:22 PM
En 4058A53B...@saicmodis.com, James Kuyper va escriure:

> Argument passing by variadic functions, is certainly part of the
> issue. However, there's also section 7.19.6.1p9, which says "If any
> argument is not the correct type for the corresponding conversion
> specification, the behavior is undefined."

So using %zn is not strictly conforming.
A bad thing, since it appeared to me that using a size_t argument to fetch
indices into the output stream/string would have been a nice feature. :-(


Antoine


Wojtek Lerch

Mar 18, 2004, 12:14:46 PM
Kevin Bracey <kevin....@tematic.com> wrote in message news:<eff137914...@tematic.com>...

> The only problem (as far as I can see) is that printf() may not be
> implemented in C, so might not use va_arg, whose description contains the
> clause stating compatibility of corresponding unsigned and signed integer
> types.

Even if printf() is implemented in C, it might not be using va_arg().
For instance, on some machines there might be a way of fetching the
arguments that's much more efficient than using va_arg() but makes
assumptions that va_arg() cannot generally make.

> So in theory, it would seem that printf() could use a different calling
> convention to a user variadic function, and that calling convention might not
> support the same signed/unsigned compatibility as va_arg and unprototyped
> functions.

No, printf() must use the same calling convention, at least in the
sense that it must be possible to call printf indirectly using a
pointer variable of the appropriate type. But depending on whether
the signed/unsigned compatibility is implemented by the opcodes in the
calling function that pass the arguments to the variadic function, or
by the code that the va_arg() macro expands to in the variadic
function, details of the exact types of all the arguments may be
available to any variadic function. The fact that there's no
*portable* way to find those details doesn't mean that printf() is not
*allowed* to know how to find them, does it?...

Imagine a silly machine that has three stacks: one for signed
integers, one for unsigned integers, and one for pointers. When you
call a variadic function, the compiler passes a hidden argument to it
that points to a table describing the exact types of all the
arguments, and va_arg() uses that table to figure out which stack to
take the next argument from. Isn't it conceivable that printf() could
sometimes ignore the table and fetch the next argument based on the
format string instead?

Dan Pop

Mar 18, 2004, 1:05:08 PM
In <4058a448$0$18767$636a...@news.free.fr> "Antoine Leca" <ro...@localhost.gov> writes:

>Dan,
>
>Thanks for your detailed explanations.
>
>En c3a47e$77r$1...@sunnews.cern.ch, Dan Pop va escriure:
>> In <405883a6$0$18757$636a...@news.free.fr> "Antoine Leca"
>> <ro...@localhost.gov> writes:
>>
>>> [ about %zd ]
>>>
>>> En c2sj55$dou$2...@sunnews.cern.ch, Dan Pop va escriure:
>>>> It does NOT expect a size_t argument in such a case, it
>>>> expects its corresponding signed type (so passing negative values of
>>>> the right type does NOT invoke undefined behaviour). Because of
>>>> this, %zd and %zi are useless in programs that don't make
>>>> assumptions about the type of size_t.
>>>
>>> What about printing the result of mbrtowc() when you are sure the
>>> value you passed for n is below 32767?
>>
>> mbrtowc() can return values like (size_t)(-2) and (size_t)(-1) which
>
>Exactly my idea (printing them in a useful way, i.e. "-1" rather than some
>unreadable large constant.)
>
>> may be out of the range of the mythical type expected by %zd.
>
>Well, after a bit more of thought, I cannot think of a result that might be
>out of range, but you are correct there might be problems, as they might
>result in trap representations.
>
>Why mythical? Isn't it required by 6.2.5p6? (Given that size_t cannot be
>_Bool).

It is required, but if size_t is an extended type, we have no idea about
what its corresponding signed type might be called.

>> Furthermore, there is no consensus that passing the type with the
>> wrong signedness is acceptable to printf, if the value is in range.
>> The normative text of the standard dictates undefined behaviour
>> and only a non-normative footnote suggests otherwise.
>
>Granted, I missed that point (particularly that 6.3.1.3p3 does not apply for
>vararg arguments, since 7.15.1.1p2 clearly "preempts" it and _requires_
>values representable in both types, that is, between 0 and SignedType_MAX).
>
>Where is that footnote you are talking about?

9 The range of nonnegative values of a signed integer type is a
subrange of the corresponding unsigned integer type, and the
representation of the same value in each type is the same.31)
____________________

31) The same representation and alignment requirements are meant
to imply interchangeability as arguments to functions,
return values from functions, and members of unions.

So, the issue is something like printf("%x\n", 1).

The fprintf specification says:

o,u,x,X The unsigned int argument is converted to ...
^^^^^^^^^^^^
and

If any argument is not the correct type for the
corresponding conversion specification, the behavior is undefined.

The correct type for %x is unsigned int and nothing else, therefore my
second quote indicates undefined behaviour without any if's and but's.

7.15.1.1p2 doesn't apply at all, because there is no requirement that
printf is implemented using the va_arg macro or that it should behave as
if it were.

The *only* part of the standard that could save us from undefined
behaviour is footnote 31: if values having the same representation can
be used as arguments for functions expecting the "mirror" type, then 1
doesn't have the wrong type for %x.

But a footnote cannot trump a normative paragraph, so...

BTW, the very same issue applies to this simple program:

int foo();

int main()
{
return foo(1);
}

int foo(unsigned x)
{
return x - x;
}

1 will be passed as a signed int argument (in the absence of a
prototype declaration, only the default argument promotions are
performed), but foo expects an unsigned int argument. The type mismatch
results in undefined behaviour.

Since non-prototype declarations are a fossil in the language, it is
printf and friends that are the real concern.

Consider a processor with I registers used for signed arithmetic
operations and U registers used for unsigned arithmetic operations
(overflow on I registers traps, while overflow on U registers has the C
semantics for unsigned arithmetic). The natural argument passing
convention for such a processor would be int arguments in I registers
and unsigned int arguments in U registers. And there is NO normative
text in the standard ruling out the usage of this convention for
anything but functions using the va_arg macro (which use floating point
registers for both signed and unsigned integers, in our implementation),
period.

Douglas A. Gwyn

Mar 18, 2004, 9:06:37 PM
Wojtek Lerch wrote:
> Even if printf() is implemented in C, it might not be using va_arg().

This argument has arisen before, without satisfactory resolution.
Since there is *no meaning* ascribed to variadic arguments by the
C standard *except* in the context of <stdarg.h> facilities, one
could argue to the contrary that however printf() collects its
argument values, it must be exactly "as if" <stdarg.h> is used.

Wojtek Lerch

Mar 19, 2004, 12:28:33 AM
"Douglas A. Gwyn" <DAG...@null.net> wrote in message
news:u-Wdnc-X2_s...@comcast.com...

> Wojtek Lerch wrote:
> > Even if printf() is implemented in C, it might not be using va_arg().
>
> This argument has arisen before, without satisfactory resolution.

Right: everybody else failed to convince you. ;-)

> Since there is *no meaning* ascribed to variadic arguments by the

What are you talking about? An argument, variadic or otherwise, is an
expression in the comma-separated list within the parentheses of a function
call expression. The definition of "argument" (3.3) and the description of
the function call operator (6.5.2.2, in particular p3) are very clear and
simple.

> C standard *except* in the context of <stdarg.h> facilities, one
> could argue to the contrary that however printf() collects its
> argument values, it must be exactly "as if" <stdarg.h> is used.

The standard describes the output of printf() in terms of the "arguments",
and in several cases refers to the type that the arguments are converted to
by default argument promotions.

The value of va_arg() is also described in terms of the "arguments", and
also refers to the fact that default argument promotions are performed on
them.

Where in the standard is a hint that when the description of printf(), but
not the description of va_arg(), talks about "arguments", it does not refer
to what 3.3 defines as "arguments", but instead it refers to what va_arg()
would return if it were called with the appropriate type as its argument?


James Kuyper

Mar 19, 2004, 12:00:38 PM

Agreed: whenever the behavior is defined, it is reasonable to expect that
it should be "as if" using <stdarg.h>, though I'm less convinced than
you are that this can clearly be deduced from the actual text of the
standard. However, in the case where the type is not correct for the
format string, the behavior of printf() is explicitly undefined, and "as
if" guarantees no longer apply.

Douglas A. Gwyn

Mar 20, 2004, 3:45:07 AM
Wojtek Lerch wrote:
> What are you talking about?

We were talking about characteristics of the mechanism used by the
implementation to permit the called function to pick up the variadic
argument values. There are numerous ways to pass function arguments (a
topic that used to be covered in computer science classes). There are
*no* semantics defined in the standard for this particular case apart
from the <stdarg.h> mechanism. Therefore it is not only reasonable to
assume that that mechanism (or one indistinguishable from it) is also
what is used when the variadic function "printf" is called, it is
unreasonable to assume anything else.

Douglas A. Gwyn

Mar 20, 2004, 3:49:13 AM
James Kuyper wrote:
> ... I'm less convinced than you are that this can clearly

> be deduced from the actual text of the standard.

I didn't say that, did I? I said that there is no reason to assume
anything different. We didn't insert additional "clear" wording on this
specific point, because we thought it would be clear enough as it is.
Note that there is a similar assumption for non-variadic library
functions; they pick up *their* argument values "as if" by the mechanism
described for user-written functions.

David Adrien Tanguay

Mar 21, 2004, 2:25:57 AM

Can you do something like:

extern int myfunc( const char*, ... ); // uses stdarg.h to get its args
int (*fp)( const char*, ... );
fp = printf;
fp( "hello %s\n", "world" );
fp = myfunc;
fp( "hello %s\n", "world" );

?
If so, then printf would have to behave as if it used stdarg.
--
David Tanguay http://www.sentex.ca/~datanguayh/
Kitchener, Ontario, Canada [43.24N 80.29W]

Douglas A. Gwyn

Mar 21, 2004, 4:48:10 PM
David Adrien Tanguay wrote:
> Can you do something like:
> extern int myfunc( const char*, ... ); // uses stdarg.h to get its args
> int (*fp)( const char*, ... );
> fp = printf;
> fp( "hello %s\n", "world" );
> fp = myfunc;
> fp( "hello %s\n", "world" );
> ?

Sure.

> If so, then printf would have to behave as if it used stdarg.

It could still, for example, add one to each numerical argument value in
the process of "picking it up", or remove the high-order bits, etc. (on
the presumption that it is not bound by the standard-specified manner of
picking up such argument values). However, that is not a reasonable
thing to assume.

David Adrien Tanguay

Mar 22, 2004, 12:29:15 AM

Wouldn't that give it the wrong value? printf("%d", 2) would print 3.

Charles Sanders

Mar 22, 2004, 2:52:14 AM

David Adrien Tanguay wrote:

> Can you do something like:
>
> extern int myfunc( const char*, ... ); // uses stdarg.h to

> // get its args


> int (*fp)( const char*, ... );
> fp = printf;
> fp( "hello %s\n", "world" );
> fp = myfunc;
> fp( "hello %s\n", "world" );
>
> ?
> If so, then printf would have to behave as if it used stdarg.

Wouldn't it be possible for a standard conforming
compiler to have two (or more) versions of printf, with
different calling conventions (and obviously different names
as far as the linker was concerned), and where it sees
printf( .....
substitute a call to an "internal" version of printf with a
more efficient calling convention than va_arg, and where it sees
fp = printf;
or any other initialisation of a function pointer with printf,
assign the address of a version of printf that obeys the
(assumed to be) less efficient va_arg calling conventions.

Going even further, would it not also be possible for
a compiler to partially inline direct calls to printf with a
known format string by checking the supplied types and the
format string and substituting direct calls to the underlying
formatting functions used by printf.

For example, if printf implementation called an internal
function __fmt_int_dec() to format a signed integer as decimal,
it should be able to implement printf("%07d", i) as a call
to __fmt_int_dec() with suitable arguments and a call to puts()
or some internal equivalent.


Charles

James Kuyper

Mar 22, 2004, 9:26:22 AM
"Douglas A. Gwyn" wrote:
> David Adrien Tanguay wrote:
> > If so, then printf would have to behave as if it used stdarg.
>
> It could still, for example, add one to each numerical argument value in
> the process of "picking it up", or remove the high-order bits, etc. (on
> the presumption that it is not bound by the standard-specified manner of
> picking up such argument values). However, that is not a reasonable
> thing to assume.

None of those cases has explicitly undefined behavior, so printf() is
obligated to operate as if it used <stdarg.h>; there's nothing in the
standard which allows it to add 1 or drop bits. The case where the type
is incorrect has explicitly undefined behavior, which removes all
obligations. Therefore printf() can use something which must produce the
same results as <stdarg.h> if the type is correct, but not necessarily
when it's incorrect.

Dan Pop

Mar 22, 2004, 9:26:00 AM

Wrong conclusion. The implementation may pass the arguments, in this
case, using both the <stdarg.h> convention *and* the printf and friends
convention (because fp is compatible with printf). This would keep
both myfunc and printf happy, although each uses a different argument
passing scheme (e.g. memory for <stdarg.h>, registers for printf
and friends).

Antoine Leca

Mar 22, 2004, 1:41:41 PM
En c3cock$8r$1...@sunnews.cern.ch, Dan Pop va escriure:
>>> [the mythical type expected by %zd.]

>>
>> Why mythical? Isn't it required by 6.2.5p6? (Given that size_t
>> cannot be _Bool).
>
> It is required, but if size_t is an extended type, we have no idea
> about what its corresponding signed type might be called.

Well, even if it is not... (i.e. how can you distinguish int and long int if
they are both 32 bits?)


[UB with mismatch signess of types]


> BTW, the very same issue applies to this simple program:
>
> int foo();
>
> int main()
> {
> return foo(1);
> }
>
> int foo(unsigned x)
> {
> return x - x;
> }
>
> 1 will be passed as a signed int argument (in the absence of a
> prototype declaration, only the default argument promotions are
> performed), but foo expects an unsigned int argument. The type
> mismatch results in undefined behaviour.

Because of 6.7.5.3p15, second sentence:

" If one type has a parameter type list and the other type is specified by a
function declarator that is not part of a function definition and that
contains an empty identifier list, the parameter list shall not have an
ellipsis terminator and the type of each parameter shall be compatible with
the type that results from the application of the default argument
promotions. "

Or do you refer to 6.5.2.2p6 (elided)
" If the expression that denotes the called function has a type that does
not include a prototype, the integer promotions are performed on each
argument, [...] If the function is defined with a type that includes a
prototype, and [...] the types of the arguments after promotion are not
compatible with the types of the parameters, the behavior is undefined.
[...] "

Or do you have another reading?

(By the way, the second would allow int foo(x) unsigned x; { /* same */ },
but 6.7.5.3 does not.)


> Since non-prototype declarations are a fossil in the language, it is
> printf and friends that are the real concern.

Continuing your example, it is funny (to me) that 7.15.1.1 allows

#include <stdarg.h>

int foo(int, ...);

int main()
{
return foo(1, 1);
}

int foo(int a, ...)
{ va_list ap;
unsigned x;

va_start(ap, a);
x = va_arg(ap, unsigned);
va_end(ap);
return x - x;
}

Or do I miss something else?


Antoine


Wojtek Lerch

Mar 22, 2004, 2:06:50 PM
Douglas A. Gwyn <DAG...@null.net> wrote:
> We were talking about characteristics of the mechanism used by the
> implementation to permit the called function to pick up the variadic
> argument values. There are numerous ways to pass function arguments (a
> topic that used to be covered in computer science classes). There are
> *no* semantics defined in the standard for this particular case apart
> from the <stdarg.h> mechanism.

Not true: the semantics of fprintf(), fscanf() and their friends (except
for the "v" variants) are defined without referring to the <stdarg.h>
mechanism. You could remove <stdarg.h> and all the references to it
(along with the "v" variants of printf and scanf) from the standard
without touching the descriptions of fprintf() and friends. Why is it
unreasonable to think that that wouldn't change the *meaning* of the
text that describes fprintf()?

The <stdarg.h> interface is just a lowest common denominator that the
standard forces all implementations to have to allow portable programs
to define their own variadic functions. There's no reason why an
implementation couldn't have its own, more powerful mechanism, and
implement both printf() and va_arg() on top of it. Just because
va_arg() is defined in the standard, it doesn't mean that all the
standard variadic functions must behave as if they couldn't possibly
know anything that va_arg() can't tell them.

> Therefore it is not only reasonable to
> assume that that mechanism (or one indistinguishable from it) is also
> what is used when the variadic function "printf" is called, it is
> unreasonable to assume anything else.

On the contrary. The rules of how printf() and friends fetch their
arguments *could* have been described by referring to va_arg(). That
would have both guaranteed and made it obvious that printf() must be
consistent with va_arg(); and I imagine it might also have made the
description shorter and simpler.

Instead, the authors of the standard decided to describe the two
interfaces independently of each other, even though it meant repeating
very similar words several times.  How could it be unreasonable to
assume that that decision was made because the requirements for the two
interfaces were meant to be independent of each other, too?

lawrenc...@ugsplm.com

Mar 22, 2004, 4:05:10 PM
Wojtek Lerch <wojt...@yahoo.ca> wrote:
>
> Instead, the authors of the standard decided to describe the two
> interfaces independently of each other, even though it meant repeating
> very similar words several times.  How could it be unreasonable to
> assume that that decision was made because the requirements for the two
> interfaces were meant to be independent of each other, too?

Because that ignores the historical fact that printf was standardized
long before varargs/stdargs was. It's far more likely that the printf
description was just never rewritten after stdargs was adopted. In
fact, it's my recollection that the phrase "correct type" in the printf
decscription was intended to be *less* restrictive than "compatible
type", not more restrictive.

-Larry Jones

Years from now when I'm successful and happy, ...and he's in
prison... I hope I'm not too mature to gloat. -- Calvin

Wojtek Lerch

Mar 22, 2004, 4:42:10 PM
lawrenc...@ugsplm.com wrote:
> Wojtek Lerch <wojt...@yahoo.ca> wrote:
>
>>Instead, the authors of the standard decided to describe the two
>>interfaces independently of each other, even though it meant repeating
>>very similar words several times.  How could it be unreasonable to
>>assume that that decision was made because the requirements for the two
>>interfaces were meant to be independent of each other, too?
>
>
> Because that ignores the historical fact that printf was standardized
> long before varargs/stdargs was. It's far more likely that the printf
> description was just never rewritten after stdargs was adopted. In
> fact, it's my recollection that the phrase "correct type" in the printf
> description was intended to be *less* restrictive than "compatible
> type", not more restrictive.

Are you saying that this kind of knowledge about the history of the C
standard is necessary to be able to judge what interpretation of the
latest version of the text is reasonable?

OK, allow me to rephrase:

Is it unreasonable to assume that the reason why the committee didn't
bother to put any references to va_arg() in the description of fprintf()
was that they didn't think it was important to make it clear that all
the limitations of va_arg() also apply to fprintf()?

Dan Pop

Mar 23, 2004, 9:38:54 AM
In <405f335c$0$300$626a...@news.free.fr> "Antoine Leca" <ro...@localhost.gov> writes:

>En c3cock$8r$1...@sunnews.cern.ch, Dan Pop va escriure:
>>>> [the mythical type expected by %zd.]
>>>
>>> Why mythical? Isn't it required by 6.2.5p6? (Given that size_t
>>> cannot be _Bool).
>>
>> It is required, but if size_t is an extended type, we have no idea
>> about what its corresponding signed type might be called.
>
>Well, even if it is not... (i.e. how can you distinguish int and long int if
>they are both 32 bits?)

Then, it is very likely that the implementation handles both types
identically internally and you don't need to make the distinction. So,
if the implementation makes size_t unsigned long, but you use int as its
corresponding signed type, nothing is going to break.

I had the second paragraph in mind. Nothing in the normative text of the
standard overrides it for my example.

>Continuing your example, it is funny (to me) that 7.15.1.1 allows
>
> #include <stdarg.h>
>
> int foo(int, ...);
>
> int main()
> {
> return foo(1, 1);
> }
>
> int foo(int a, ...)
> { va_list ap;
> unsigned x;
>
> va_start(ap, a);
> x = va_arg(ap, unsigned);
> va_end(ap);
> return x - x;
> }
>
>Or do I miss something else?

This is explicitly allowed by the standard. And this is the strongest
argument of the camp that claims that printf("%u", 1) is well defined,
according to the standard (see the post with an example involving
pointers to variadic functions). Unfortunately, it is not good enough.

It is hard to understand the committee's reluctance to add a few words to
the normative text of the standard, to put an end to this very old
controversy... I can't see any harm in mentioning in the introduction of
clause 7 that all the variadic functions from the standard library behave
as if they used va_arg to access their variable parameter lists. And
inserting footnote 31 inside the normative text would trivially fix my
example above.

Wojtek Lerch

Mar 23, 2004, 10:00:35 AM
Dan Pop wrote:
> In <405f335c$0$300$626a...@news.free.fr> "Antoine Leca" <ro...@localhost.gov> writes:
>>Well, even if it is not... (i.e. how can you distinguish int and long int if
>>they are both 32 bits?)
>
> Then, it is very likely that the implementation handles both types
> identically internally and you don't need to make the distinction. So,
> if the implementation makes size_t unsigned long, but you use int as its
> corresponding signed type, nothing is going to break.

Of course, it's also theoretically possible that even though int and
long have the same size, width, and range, they may have different
representations. For instance, one could be little endian and the other
big endian. Very unlikely in practice, but allowed by the standard.

Antoine Leca

Mar 23, 2004, 10:51:48 AM
En c3pi5u$3ch$1...@sunnews.cern.ch, Dan Pop va escriure:

> In <405f335c$0$300$626a...@news.free.fr> "Antoine Leca"
> <ro...@localhost.gov> writes:
>
>> En c3cock$8r$1...@sunnews.cern.ch, Dan Pop va escriure:
>>>>> [the mythical type expected by %zd.]
>>>>
>>>> Why mythical? Isn't it required by 6.2.5p6? (Given that size_t
>>>> cannot be _Bool).
>>>
>>> It is required, but if size_t is an extended type, we have no idea
>>> about what its corresponding signed type might be called.
>>
>> Well, even if it is not... (i.e. how can you distinguish int and
>> long int if they are both 32 bits?)
>
> Then, it is very likely that the implementation handles both types
> identically internally and you don't need to make the distinction.

Sorry if I am being picky here. I feel I am missing something in your reasoning.
I do not understand why you are making a distinction here between extended
and basic integer types.
Either two types are compatible, or they are not. The fact that they may have
the same range, representation, endianness, etc. should not have any
relevance (or does it?)

(I agree that in reality it should work. But this is no argument. You
know perfectly well that the code we have been discussing since day 1 also
works, if only because anyone who releases a library where it does not work
will receive plenty of complaints ;-). But that kind of argument is not
acceptable here, obviously.)


> It is hard to understand the committee's reluctance to add a few
> words to the normative text of the standard, to put an end to
> this very old controversy... I can't see any harm in mentioning
> in the introduction of clause 7 that all the variadic functions
> from the standard library behave as if they used va_arg to access
> their variable parameter lists.

Since this is more or less required when seen from the caller's point of view
(because of the call through a pointer), this should be an implementer's
problem. But I cannot figure out what the (supposed or alleged) problem is.

OTOH, I do not see any DR related to this point, nor can I spot any C9X
public comments (through ANSI) on it. Looks like a bit of bureaucracy may be
in order.


Antoine


Dan Pop

Mar 23, 2004, 11:56:26 AM
In <40605d0b$0$284$636a...@news.free.fr> "Antoine Leca" <ro...@localhost.gov> writes:

>En c3pi5u$3ch$1...@sunnews.cern.ch, Dan Pop va escriure:
>> In <405f335c$0$300$626a...@news.free.fr> "Antoine Leca"
>> <ro...@localhost.gov> writes:
>>
>>> En c3cock$8r$1...@sunnews.cern.ch, Dan Pop va escriure:
>>>>>> [the mythical type expected by %zd.]
>>>>>
>>>>> Why mythical? Isn't it required by 6.2.5p6? (Given that size_t
>>>>> cannot be _Bool).
>>>>
>>>> It is required, but if size_t is an extended type, we have no idea
>>>> about what its corresponding signed type might be called.
>>>
>>> Well, even if it is not... (i.e. how can you distinguish int and
>>> long int if they are both 32 bits?)
>>
>> Then, it is very likely that the implementation handles both types
>> identically internally and you don't need to make the distinction.
>
>Sorry if I am being picky here. I feel I am missing something in your reasoning.
>I do not understand why you are making a distinction here between extended
>and basic integer types.
>Either two types are compatible, or they are not. The fact that they may have
>the same range, representation, endianness, etc. should not have any
>relevance (or does it?)

Yes, you're missing the point. If size_t is a basic integer type, you
can use the preprocessor to figure out which type it is (modulo the same
size issue already discussed) and this allows you to obtain its
corresponding signed type:

#if SIZE_MAX == ULLONG_MAX
typedef long long isize_t;
#elif SIZE_MAX == ULONG_MAX
typedef long isize_t;
#elif SIZE_MAX == UINT_MAX
typedef int isize_t;
...
#else
#error "size_t is an extended integer type.\n"
#endif

Can you see now why I make a distinction?

>(I agree that in reality it should work. But this is no argument. You

It depends on your actual purpose. If you really need isize_t, my scheme
above will provide a working definition (even if the standard doesn't
guarantee that), as long as size_t is one of the standard integer types.

>know perfectly well that the code we have been discussing since day 1 also
>works, if only because anyone who releases a library where it does not work
>will receive plenty of complaints ;-). But that kind of argument is not
>acceptable here, obviously.)

The point with printf is that, if the standard is supposed to reflect
existing practice, it should NOT make printf("%u", 1) undefined behaviour.

>> It is hard to understand the committee's reluctance to add a few
>> words to the normative text of the standard, to put an end to
>> this very old controversy... I can't see any harm in mentioning
>> in the introduction of clause 7 that all the variadic functions
>> from the standard library behave as if they used va_arg to access
>> their variable parameter lists.
>
>Since this is more or less required when seen from the caller's point of view
>(because of the call through a pointer), this should be an implementer
>problem. But I cannot figure out what the (supposed or alleged) problem is.
>
>OTOH, I do not see any DR related to this point, nor can I spot any C9X
>public comments (through ANSI) on it. Looks like a bit of bureaucracy may be
>in order.

According to Doug, it wouldn't help: the committee has already discussed
the issue and decided that the wording of the standard is perfect, as far
as this issue is concerned.

Wojtek Lerch

unread,
Mar 23, 2004, 2:03:00 PM3/23/04
to
Antoine Leca wrote:
> In c3pi5u$3ch$1...@sunnews.cern.ch, Dan Pop wrote:
>>It is hard to understand the committee's reluctance to add a few
>>words to the normative text of the standard, to put an end to
>>this very old controversy... I can't see any harm in mentioning
>>in the introduction of clause 7 that all the variadic functions
>>from the standard library behave as if they used va_arg to access
>>their variable parameter lists.
>
> Since this is more or less required when seen from the caller's point of view
> (because of the call through a pointer), this should be an implementer
> problem. But I cannot figure out what the (supposed or alleged) problem is.

The call through a pointer doesn't change much. Imagine an
implementation that compiles user code to interpreted bytecode, but has
all the standard library functions implemented as native code. The
calling conventions for calling a bytecode function and a native
function are completely different, and function pointers always point to
bytecode. For any standard function that the program takes the address
of, the compiler generates a bytecode version of it that just converts
its arguments to the native calling convention and calls the native code.

Now imagine that on this implementation, the va_arg() macro stringizes
its second argument and passes the string to a native function that
parses the string to figure out the type to return to. Standard
variadic functions don't need to use such an inefficient method because
the native calling convention gives them full information about the
types of the arguments (which is easy because the set of types you can
pass to the standard variadic functions is very limited).

Is there a reason why such an implementation couldn't be conforming?

Douglas A. Gwyn

unread,
Mar 23, 2004, 5:49:51 PM3/23/04
to
Dan Pop wrote:
> According to Doug, it wouldn't help: the committee has already discussed
> the issue and decided that the wording of the standard is perfect, as far
> as this issue is concerned.

That of course is not what I said.
When you have to resort to misrepresentation and sarcasm then one
naturally assumes you don't have *good* arguments to offer.

Douglas A. Gwyn

unread,
Mar 23, 2004, 5:53:28 PM3/23/04
to
Wojtek Lerch wrote:
> Of course, it's also theoretically possible that even though int and
> long have the same size, width, and range, they may have different
> representations. For instance, one could be little endian and the other
> big endian. Very unlikely in practice, but allowed by the standard.

More likely would be the types differing in padding; for example, if both
int and long are doublewords on a machine that has only signed word
arithmetic, for speed reasons int might have padding in the middle to
skip over the low word's sign bit; whereas to accommodate expectations
made by many programs, type long might make use of every bit, even
though the generated code is slower for arithmetic operations.

Wojtek Lerch

unread,
Mar 23, 2004, 9:19:07 PM3/23/04
to
"Douglas A. Gwyn" <DAG...@null.net> wrote in message
news:RoGdnQMq_OJ...@comcast.com...

Then they would have different widths, and you would be able to tell whether
size_t is unsigned int or unsigned long by comparing SIZE_MAX to UINT_MAX
and ULONG_MAX, no?


Wojtek Lerch

unread,
Mar 23, 2004, 10:41:28 PM3/23/04
to
"Douglas A. Gwyn" <DAG...@null.net> wrote in message
news:CN6dnfAA6PA...@comcast.com...

> Note that there is a similar assumption for non-variadic library
> functions; they pick up *their* argument values "as if" by the mechanism
> described for user-written functions.

The function *somehow* picks up the *correct* values, and then performs the
action the standard says the function performs, using those values. Is that
*all* you meant by "as if"?

The standard describes what the *correct* value is: it's the value of the
expression that was the operand of the function call operator, converted to
the appropriate type (in the case of a non-variadic function declared with a
prototype, it's the type of the corresponding parameter). The standard says
nothing about the *mechanism* that functions use to figure out the correct
value, other than by describing the behaviour in terms of that value (for
standard library functions), and by saying that that value is assigned to
parameters before the function's body is executed (for regular C functions).
And by specifying the behaviour of va_start() and friends (for functions
that use va_start()).

Standard library functions implemented as macros are a bit of a can of
worms. The standard is rather vague about how closely such a macro must
mimic the semantics of a real function call. For instance, a footnote warns
that it doesn't need to contain all the sequence points that a function call
would have. But since the normative text doesn't mention that difference
between the semantics of a function call and a library function implemented
as a macro, how can we tell what *other* differences are allowed that the
normative text doesn't mention? Do macros really have to pick up their
argument values exactly "as if" they were user-written functions, or are the
requirements somewhat relaxed here, too?

All that the standard says about how such a macro evaluates its arguments is
that it "evaluates each of its arguments exactly once, fully protected by
parentheses where necessary, so it is generally safe to use arbitrary
expressions as arguments" (7.1.4p1). But is it safe to assume that all the
conversions that would be performed by a function call are performed by the
macro, too? Is it OK to pass a void pointer to a macro that expects a FILE
pointer? Is it safe to assume that a macro will generate a diagnostic where
a function call would?


Antoine Leca

unread,
Mar 24, 2004, 5:51:52 AM3/24/04
to
In c3pq7q$q2k$1...@sunnews.cern.ch, Dan Pop wrote:

> Yes, you're missing the point. If size_t is a basic integer type, you
> can use the preprocessor to figure out which type it is (modulo the
> same size issue already discussed) and this allows you to obtain its
> corresponding signed type:

Granted, I missed this.

Thanks for taking the time to explain it.


Antoine


Dan Pop

unread,
Mar 24, 2004, 6:35:21 AM3/24/04
to
In <Ap-dnflJwod...@comcast.com> "Douglas A. Gwyn" <DAG...@null.net> writes:

>Dan Pop wrote:
>> According to Doug, it wouldn't help: the committee has already discussed
>> the issue and decided that the wording of the standard is perfect, as far
>> as this issue is concerned.
>
>That of course is not what I said.

Then, what *exactly* did you say?

>When you have to resort to misrepresentation and sarcasm then one
>naturally assumes you don't have *good* arguments to offer.

I've offered plenty of good arguments in this very thread. But the
arrogance with which you handle this issue begs for sarcasm, so don't
be surprised if you get what you're asking for.

lawrenc...@ugsplm.com

unread,
Mar 24, 2004, 10:13:01 AM3/24/04
to
Wojtek Lerch <Wojt...@yahoo.ca> wrote:
>
> OK, allow me to rephrase:
>
> Is it unreasonable to assume that the reason why the committee didn't
> bother to put any references to va_arg() in the description of fprintf()
> was because they didn't think it was important to make it clear that all
> the limitations of va_arg() also apply to fprintf()?

No, that's not unreasonable (although I'm not sure that "limitations" is
really the right term, perhaps "characteristics" would be better). But
it immediately prompts the question: *Why* didn't they think it was
important? Was it because they thought it was intuitively obvious, or
because it isn't true? I'm asserting it was the former; you seem to
believe it was the latter.

-Larry Jones

Monopoly is more fun when you make your own Chance cards. -- Calvin

Antoine Leca

unread,
Mar 24, 2004, 10:23:23 AM3/24/04
to
In rph8j1-...@jones.homeip.net, lawrenc...@ugsplm.com wrote:

> But it immediately prompts the question: *Why* didn't they think it
> was important? Was it because they thought it was intuitively obvious,
<snip>

Intuitive obviousness which, obviously (sorry), would be a bad reason to
reject a better wording suggested in a DR.


Antoine


Wojtek Lerch

unread,
Mar 24, 2004, 3:44:26 PM3/24/04
to
lawrenc...@ugsplm.com wrote in message news:<rph8j1-...@jones.homeip.net>...

> Wojtek Lerch <Wojt...@yahoo.ca> wrote:
> > Is it unreasonable to assume that the reason why the committee didn't
> > bother to put any references to va_arg() in the description of fprintf()
> > was because they didn't think it was important to make it clear that all
> > the limitations of va_arg() also apply to fprintf()?
>
> No, that's not unreasonable (although I'm not sure that "limitations" is
> really the right term, perhaps "characteristics" would be better). But

Of course, sorry... ;-)

> it immediately prompts the question: *Why* didn't they think it was
> important? Was it because they thought it was intuitively obvious, or
> because it isn't true? I'm asserting it was the former; you seem to
> believe it was the latter.

I'd say it's intuitively obvious that it would make sense to define
printf() in terms of va_arg(), but if you look at the text of the
standard, it's not so obvious why it wasn't written that way. And
then you notice (or are told) that even though the text defines
printf() by repeating chunks from the descriptions of va_arg(), it has
a few spots where the chunks are missing some words that would make
printf() completely consistent with va_arg(), and it's not entirely
obvious whether it's that way by mistake or on purpose. And even if
you assume that it's a mistake, it's still not quite obvious whether a
requirement that's missing from the text by mistake applies or not.

Imagine that printf(), fprintf(), and sprintf() have their own
separate copies of the text that describes the details of the format
strings, and that those copies are not completely identical, but each
is missing one or two little details that the other two do specify.
Even though it's intuitively obvious that the three functions should
parse their format strings identically, does this intuition override
the simple fact that the normative text defines their semantics
differently?

Douglas A. Gwyn

unread,
Mar 25, 2004, 1:56:33 PM3/25/04
to
Antoine Leca wrote:
> Intuitive obviousness which, obviously (sorry), would be a bad reason to ban
> a better wording suggested inside a DR.

A DR is not justified if there is no real problem to be solved.
What precisely is supposed to be the problem? Perhaps, what
printf displays when passed an out-of-range value for the type
indicated by the format specifier? If so, why do we need an
answer to that question? It isn't a practice we want to
encourage.

Wojtek Lerch

unread,
Mar 25, 2004, 5:39:10 PM3/25/04
to
James Kuyper wrote:
> Now, printf() is implicitly required to use an argument passing
> algorithm that works just like va_arg(), when the type is correct.

No, printf() is required to use an argument passing algorithm that
produces the same *value* as va_arg() would, simply because the
behaviour of printf() and va_arg() is described in very similar terms --
namely, in terms of the value of the argument (i.e. the promoted value
of the corresponding operand of the function call operator). Or, where
the standard says so, the same value converted to a slightly different
type (signed vs. unsigned or void* vs char*).

But *how* those two algorithms *work* is outside of the scope of the
standard. The standard doesn't even define any terminology to talk
about how they work, or whether they differ.

Dan Pop

unread,
Mar 26, 2004, 6:12:09 AM3/26/04
to
In <pIudnVTwcIT...@comcast.com> "Douglas A. Gwyn" <DAG...@null.net> writes:

>Antoine Leca wrote:
>> Intuitive obviousness which, obviously (sorry), would be a bad reason to ban
>> a better wording suggested inside a DR.
>
>A DR is not justified if there is no real problem to be solved.
>What precisely is supposed to be the problem?

You must be really dense if you still haven't figured it out: the
intent is to have printf("%u", 1) well defined, but far too many people
read the standard as invoking undefined behaviour. Ditto for foo(1),
when foo() expects an unsigned int parameter, but there is no prototype
for foo() in scope.

James Kuyper

unread,
Mar 26, 2004, 11:05:02 AM3/26/04
to
Wojtek Lerch <Wojt...@yahoo.ca> wrote in message news:<c3vn2g$2c9ih7$1...@ID-229447.news.uni-berlin.de>...

> James Kuyper wrote:
> > Now, printf() is implicitly required to use an argument passing
> > algorithm that works just like va_arg(), when the type is correct.
>
> No, printf() is required to use an argument passing algorithm that
> produces the same *value* as va_arg() would, simply because the

That's what I meant by "works just like". Perhaps it would have been
clearer if I'd said "is functionally equivalent to"? I had no
intention of implying that the actual implementation had to be the
same.

Wojtek Lerch

unread,
Mar 26, 2004, 11:17:03 AM3/26/04
to
James Kuyper wrote:
> Wojtek Lerch <Wojt...@yahoo.ca> wrote in message news:<c3vn2g$2c9ih7$1...@ID-229447.news.uni-berlin.de>...
>>James Kuyper wrote:
>>>Now, printf() is implicitly required to use an argument passing
>>>algorithm that works just like va_arg(), when the type is correct.
>>
>>No, printf() is required to use an argument passing algorithm that
>>produces the same *value* as va_arg() would [...]

>
> That's what I meant by "works just like". Perhaps it would have been
> clearer if I'd said "is functionally equivalent to"? I had no
> intention of implying that the actual implementation had to be the
> same.

But when the type is correct, do you think the *reason* they must
produce the same value is merely because all the descriptions of all the
possible arguments to printf() happen to talk about the same value that
the description of va_arg() does, or is it because there's something
special about va_arg() that makes some details of its specification
apply to printf(), too, even if the description of printf() doesn't
mention that?
