
printf %1$s

David R Tribble

Jan 6, 1999
> C9X mandates changes to printf() already, in the form of %ll and %m
> and such, ...

POSIX supports printf/scanf formatting string specifier prefixes of
the form '<digits>$', so that arguments can be given in a fixed order
while the format string can refer to them in an arbitrary order. For
example, these two calls print their arguments in different orders,
but the arguments are passed to the function in the same order:

#define ERR_1_E042 "file %1$s, line %2$u: error: %3$s\n"
#define ERR_2_E042 "Error %3$s, line %2$u of %1$s\n"

printf(ERR_1_E042, file, line, msg);
printf(ERR_2_E042, file, line, msg);

This is extremely useful for international programming, where some
languages prefer a different order of subject and verb; the '$'
specifier allows the printf/scanf calls to be immune to argument
rearrangement. It's irritating when a given system doesn't support
it (e.g., Microsoft VC++).

Modifying the above example slightly shows the utility of '$':
const char * fmt;

fmt = getErrFormat(errNum, getErrCatalog(langNum));
printf(fmt, file, line, msg);
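
A minimal sketch of the catalog idea above, assuming a C library that implements the `%n$` extension (glibc does; the post notes that Microsoft VC++ did not). The names `err_catalog`, `err_format` and `format_error` are hypothetical stand-ins for getErrFormat()/getErrCatalog(), and the two-entry table is invented for illustration:

```c
#include <stdio.h>

/* Hypothetical two-entry catalog: same arguments, different word order. */
static const char *const err_catalog[] = {
    "file %1$s, line %2$u: error: %3$s\n",
    "Error %3$s, line %2$u of %1$s\n",
};

/* Return the format for a language index (0 or 1), defaulting to entry 0. */
static const char *err_format(unsigned lang)
{
    return err_catalog[lang < 2 ? lang : 0];
}

/* Format one message; the call site never changes with the language. */
static int format_error(char *buf, size_t size, unsigned lang,
                        const char *file, unsigned line, const char *msg)
{
    return snprintf(buf, size, err_format(lang), file, line, msg);
}
```

The arguments are always passed as (file, line, msg); only the string selected from the catalog decides the order in which they appear.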

Question: Was the printf/scanf '$' format spec ever considered by
the ISO/IEC committee as a possible enhancement for C9X?
(See C9X draft: 7.19.6.1, 7.19.6.2.)

-- David R. Tribble, dtri...@technologist.com --

Mark Carmichael

Jan 6, 1999
In article <3693FBE4...@technologist.com>, David R Tribble
<dtri...@technologist.com> wrote:

> > C9X mandates changes to printf() already, in the form of %ll and %m
> > and such, ...
>
> POSIX supports printf/scanf formatting string specifier prefixes of
> the form '<digits>$', so that arguments can be given in a fixed order
> while the format string can refer to them in an arbitrary order. For
> example, these two calls print their arguments in different orders,
> but the arguments are passed to the function in the same order:
>
> #define ERR_1_E042 "file %1$s, line %2$u: error: %3$s\n"
> #define ERR_2_E042 "Error %3$s, line %2$u of %1$s\n"
>
> printf(ERR_1_E042, file, line, msg);
> printf(ERR_2_E042, file, line, msg);
>
> This is extremely useful for international programming, where some
> languages prefer a different order of subject and verb; the '$'
> specifier allows the printf/scanf calls to be immune to argument
> rearrangement. It's irritating when a given system doesn't support
> it (e.g., Microsoft VC++).

One drawback of internationalizing using this POSIX facility is the
danger of format specifiers being altered by well-meaning translators
who are asked to edit printf() formatting strings. In the course of
re-ordering, a typo may be introduced that results in a crash, or more
subtle error. International editions aren't always subjected to the same
rigorous testing as original-language versions...

This is not a disparagement of the POSIX extension; however, those looking
for highly maintainable localization methodologies may want to look for a
solution that separates format specification from ordering. UI translators
should only be able to introduce errors in translation, not in product
functionality.
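
One way to read "separates format specification from ordering" is sketched below. It is an assumption, not something described in the post: the program formats every value to a string first, and the translatable text contains only `{1}`..`{9}` placeholders, so a translator can reorder arguments but cannot change a type. The name `expand_template` and the `{N}` syntax are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/*
 * Expand "{1}".."{9}" placeholders in tmpl using pre-formatted strings.
 * Translators can reorder or drop placeholders but cannot change a type,
 * because every argument is already a string. An unknown or out-of-range
 * placeholder is copied through literally rather than read as an argument.
 * Returns the untruncated length, like snprintf.
 */
static size_t expand_template(char *out, size_t size, const char *tmpl,
                              const char *const args[], size_t nargs)
{
    size_t len = 0;

    for (const char *p = tmpl; *p != '\0'; ) {
        const char *piece = p;      /* default: copy one literal character */
        size_t piece_len = 1;

        if (p[0] == '{' && p[1] >= '1' && p[1] <= '9' && p[2] == '}' &&
            (size_t)(p[1] - '0') <= nargs) {
            piece = args[p[1] - '1'];
            piece_len = strlen(piece);
            p += 3;
        } else {
            p += 1;
        }
        for (size_t i = 0; i < piece_len; i++, len++) {
            if (len + 1 < size)
                out[len] = piece[i];
        }
    }
    if (size > 0)
        out[len < size ? len : size - 1] = '\0';
    return len;
}
```

A typo such as `{4}` with three arguments then shows up as literal text in the message instead of reading past the argument list.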

--
Mark Carmichael "My phone bill, my opinions."

Douglas A. Gwyn

Jan 7, 1999
David R Tribble wrote:
> Question: Was the printf/scanf '$' format spec ever considered by
> the ISO/IEC committee as a possible enhancement for C9X?

The US-dollar-sign character is not in the required character set.
(If we added it, another trigraph would be needed for it.)

Peter Curran

Jan 7, 1999
On Wed, 06 Jan 1999 18:12:20 -0600, David R Tribble
<dtri...@technologist.com> wrote:

>> C9X mandates changes to printf() already, in the form of %ll and %m
>> and such, ...
>
>POSIX supports printf/scanf formatting string specifier prefixes of
>the form '<digits>$', so that arguments can be given in a fixed order
>while the format string can refer to them in an arbitrary order. For
>example, these two calls print their arguments in different orders,
>but the arguments are passed to the function in the same order:
>
> #define ERR_1_E042 "file %1$s, line %2$u: error: %3$s\n"
> #define ERR_2_E042 "Error %3$s, line %2$u of %1$s\n"
>
> printf(ERR_1_E042, file, line, msg);
> printf(ERR_2_E042, file, line, msg);

How would you deal with something like

printf("line %2$u: error: %3$s\n", file, line, msg);

?

In the past, I have solved this problem by using my own variant of
printf (which only accepted char * parameters, to handle the above
problem); I found it was such a small part of the problem of
internationalization that it was hardly worth mentioning.

A solution would be nice, but this approach appears to me to be
unimplementable without a lot more work, and at the very least would
add a lot of overhead to printf &c. It would be necessary to
pre-process the format string, to determine the types of all the
arguments, before starting any conversion.

--
Peter Curran pcu...@acm.gov (chg gov=>org)

Geoff Clare

Jan 7, 1999
David R Tribble <dtri...@technologist.com> writes:

>POSIX supports printf/scanf formatting string specifier prefixes of
>the form '<digits>$', so that arguments can be given in a fixed order
>while the format string can refer to them in an arbitrary order.

Minor nit-pick: this feature is not required by POSIX. It's marked as an
extension to POSIX in the Open Group specs (UNIX98, UNIX95, XPG4, etc.).
--
Geoff Clare <g...@root.co.uk>
UniSoft Limited, London, England.

David R Tribble

Jan 7, 1999
Peter Curran wrote:
> How would you deal with something like
> printf("line %2$u: error: %3$s\n", file, line, msg);
> ?

I would expect a reasonable implementation to work as expected,
i.e., it would ignore the 'file' argument entirely.

Ah, but now that's the problem, isn't it? How does printf()
know how big the 1st arg is if it doesn't have a format spec for
'%1$' ?

I don't know the answer. (On some machines, this wouldn't be
a problem because all stack arguments are equally sized.)
A reasonable thing to do is to produce an error (and perhaps
print out a message such as "mismatched %$ arguments").

David R Tribble

Jan 7, 1999
David R Tribble wrote:
> Question: Was the printf/scanf '$' format spec ever considered by
> the ISO/IEC committee as a possible enhancement for C9X?

Douglas A. Gwyn wrote:
> The US-dollar-sign character is not in the required character set.
> (If we added it, another trigraph would be needed for it.)

Of course, how US-centric of me.

So how about some other character, such as '/'?

E.g.:
printf("file %1/s, line %2/u: error: %3/s\n", file, line, msg);

Was something like this ever proposed by anyone to the committee?

Peter Curran

Jan 7, 1999
On Thu, 07 Jan 1999 14:54:10 -0600, David R Tribble
<dtri...@technologist.com> wrote:

>Peter Curran wrote:
>> How would you deal with something like
>> printf("line %2$u: error: %3$s\n", file, line, msg);
>> ?
>
>I would expect a reasonable implementation to work as expected,
>i.e., it would ignore the 'file' argument entirely.
>
>Ah, but now that's the problem, isn't it? How does printf()
>know how big the 1st arg is if it doesn't have a format spec for
>'%1$' ?

Yes, that is the problem I was referring to.

>I don't know the answer. (On some machines, this wouldn't be
>a problem because all stack arguments are equally sized.)
>A reasonable thing to do is to produce an error (and perhaps
>print out a message such as "mismatched %$ arguments").

I don't know of many machines for which float, double, long long,
long, int and pointer are all the same size. Certainly the standard
cannot make any such assumption. I don't see any "mismatches" in the
example I gave.

Even if this problem is resolved (e.g. require that, if the third
argument is referenced numerically, the first and second must be as
well), the overhead of pre-processing the format string and recording
the types of the parameters would be, IMHO, unacceptable. (This
pre-processing is necessary because, for example, to locate the third
parameter, it is necessary to first know the types (and hence sizes)
of the first two.)
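
The pre-processing pass described here might look roughly like the sketch below. It is a simplification under stated assumptions: the name `scan_positional`, the limit `MAX_ARGS` and the small set of accepted conversions are all hypothetical, and length modifiers and `*` widths are left out. It records one conversion character per argument number and rejects a format that leaves a gap, which is exactly the case where the size of a skipped argument would be unknown.

```c
#include <ctype.h>
#include <string.h>

#define MAX_ARGS 9   /* hypothetical limit for this sketch */

/*
 * First pass over a format that uses "%n$" everywhere: record, for each
 * argument number, the conversion character that tells us its type.
 * Returns the highest argument number used, or -1 if the format is
 * malformed, mixes styles, leaves a gap, or uses one argument with two
 * different conversions.
 */
static int scan_positional(const char *fmt, char conv[MAX_ARGS])
{
    int highest = 0;

    memset(conv, 0, MAX_ARGS);
    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p != '%')
            continue;
        p++;
        if (*p == '%')
            continue;               /* literal percent sign */

        /* Argument number: one or more digits followed by '$'. */
        int n = 0;
        while (isdigit((unsigned char)*p) && n <= MAX_ARGS)
            n = n * 10 + (*p++ - '0');
        if (*p != '$' || n < 1 || n > MAX_ARGS)
            return -1;
        p++;

        /* Skip flags, width and precision; stop at the conversion. */
        while (*p != '\0' && strchr("-+ #0123456789.", *p) != NULL)
            p++;
        if (*p == '\0' || strchr("diouxXcsp", *p) == NULL)
            return -1;
        if (conv[n - 1] != 0 && conv[n - 1] != *p)
            return -1;              /* same argument, conflicting types */
        conv[n - 1] = *p;
        if (n > highest)
            highest = n;
    }
    for (int i = 0; i < highest; i++)
        if (conv[i] == 0)
            return -1;              /* gap: type of argument i+1 unknown */
    return highest;
}
```

With this sketch, "line %2$u: error: %3$s\n" is rejected because nothing describes argument 1, while "Error %3$s, line %2$u of %1$s\n" yields the table s, u, s.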

David R Tribble

Jan 7, 1999
Peter Curran wrote:
>
> David R Tribble <dtri...@technologist.com> wrote:
>
> > Peter Curran wrote:
> >> How would you deal with something like
> >> printf("line %2$u: error: %3$s\n", file, line, msg);
> >> ?
> >
> > I would expect a reasonable implementation to work as expected,
> > i.e., it would ignore the 'file' argument entirely.
> >
> > Ah, but now that's the problem, isn't it? How does printf()
> > know how big the 1st arg is if it doesn't have a format spec for
> > '%1$' ?
>
> Yes, that is the problem I was referring to.
>
> > I don't know the answer.

From a standards viewpoint, it could be deemed "undefined behavior"
if the arg number specifiers are incorrect or missing. From an
implementation viewpoint, printf() could simply refuse to even try
to pull args off the stack, and instead simply print the format
string as is (without replacing the '%' specifiers); this would at
least show up at runtime, indicating that something went wrong.

> > (On some machines, this wouldn't be
> > a problem because all stack arguments are equally sized.)
> > A reasonable thing to do is to produce an error (and perhaps
> > print out a message such as "mismatched %$ arguments").
>
> I don't know of many machines for which float, double, long long,
> long, int and pointer are all the same size. Certainly the standard
> cannot make any such assumption. I don't see any "mismatches" in the
> example I gave.

"Mismatch" was a bad choice of words; "Missing %$ specifier" would
have been better.

> Even if this problem is resolved (e.g. require that, if the third
> argument is referenced numerically, the first and second must be as
> well), the overhead of pre-processing the format string and recording
> the types of the parameters would be, IMHO, unacceptable. (This
> pre-processing is necessary because, for example, to locate the third
> parameter, it is necessary to first know the types (and hence sizes)
> of the first two.)

And yet this feature was considered useful enough, and apparently
easy enough, to actually be implemented on many POSIX systems.

===
In the interests of making such a thing standard, though, perhaps
we should give some thought about how to make it easier to
implement. Perhaps we mandate that:

1. If any '%' format specifier uses an argument number prefix,
then all of the format specifiers in the format string must
do so.

2. If an arg number specifier is present, an arg number specifier
must also be present for each arg number less than it. A format
string not meeting this requirement is ill-formed and results in
undefined behavior.

3. Argument number prefixes must immediately follow the '%' format
character and must precede all other modifiers for that formatting
specification. (We also don't want to place any limit on the arg
number, i.e., allow one *or more* digits preceding the '/'.)

4. If the format string uses format specifiers with argument number
prefixes, then the entire format string must begin with the
sequence "%/". (This lets printf() know early on in the string
processing that it has to do argument number processing; if this
sequence is absent, it can evaluate the format string
left-to-right without needing to pre-scan it.) A format
string that does not meet this requirement is ill-formed and
results in undefined behavior.

Thus my previous example becomes:

printf("%/%3/s in file %1/s, line %2/u\n", file, line, msg);

Paul Eggert

Jan 7, 1999
"Douglas A. Gwyn" <DAG...@null.net> writes:

>> Question: Was the printf/scanf '$' format spec ever considered by
>> the ISO/IEC committee as a possible enhancement for C9X?

>The US-dollar-sign character is not in the required character set.
>(If we added it, another trigraph would be needed for it.)

Why would a trigraph be needed, now that we have UCNs? Can't one use
"\u0024" instead? That is, we could require that dollar sign be in the
execution character set even if it's not in the source character set.

If "\u0024" is too ugly or error-prone for your taste, we could add a
new escape (e.g. "\d" or "\s") for dollar sign. The point is that
it needn't be a source character to suffice for printf/scanf `$' format.

The majority of users wouldn't need to mess with UCNs or some other
escape, of course; we're only talking about the minority that doesn't
have dollar signs on its keyboards.
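
A small sketch of the UCN suggestion, assuming a compiler that accepts C99 universal character names in string literals and an execution character set in which U+0024 is the dollar sign. The names `swapped_fmt` and `format_swapped` are hypothetical, and the example also assumes a library with the `%n$` extension:

```c
#include <stdio.h>

/*
 * "\u0024" is the universal character name for the dollar sign, so this
 * format is meant to be the same string as "%2$s, %1$s" without a '$'
 * appearing in the source text.
 */
static const char swapped_fmt[] = "%2\u0024s, %1\u0024s";

/* Print the second string first, then the first. */
static int format_swapped(char *buf, size_t size,
                          const char *first, const char *second)
{
    return snprintf(buf, size, swapped_fmt, first, second);
}
```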

Steven Correll

Jan 8, 1999
In article <3694b48a...@news.pathcom.com>,
Peter Curran <pcu...@acm.gov> wrote:
>How would you deal with something like
>
> printf("line %2$u: error: %3$s\n", file, line, msg);
>
>?
>
>In the past, I have solved this problem by using my own variant of
>printf (which only accepted char * parameters, to handle the above
>problem); I found it was such a small part of the problem of
>internationalization that it was hardly worth mentioning.
>
>A solution would be nice, but this approach appears to me to be
>unimplementable without a lot more work, and at the very least would
>add a lot of overhead to printf &c. It would be necessary to
>pre-process the format string, to determine the types of all the
>arguments, before starting any conversion.

Every implementation I have seen requires that if the format string uses
'$' to specify the nth argument and n > 1, it must use '$' to specify the
n-1'th argument as well.

Apparently the programmers for Solaris, HP-UX 32 and 64 bits, IBM AIX,
and DEC OSF1 figured out how to implement this with acceptable overhead,
because all of their compilers support it. This is an X/Open standard;
there's more implementation experience backing up this invention than some
which were accepted by C9X.

Like you, one of my coworkers reimplemented "printf" to solve this
problem, but (perhaps because we support some additional data types)
ours has been buggy, and runs slower than the vendor-provided versions
which allow '$' on all data types (I don't know whether the vendors
take advantage of the fact that they control the compiler, the runtime
library, and the parameter-passing conventions, but that's one reason
you might want to put a construct like this into the language rather
than asking users to implement it themselves behind the compiler's
back). Ironically, the extensive multibyte/wide-character support in C
has been useless to us, because we chose to use Unicode/UTF-8 internally,
but this extension to "printf" would have been a Godsend--except Microsoft
didn't support it.
--
Steven Correll == 1931 Palm Ave, San Mateo, CA 94403 == s...@netcom.com

Peter Curran

Jan 8, 1999
On Thu, 07 Jan 1999 18:54:20 -0600, David R Tribble
<dtri...@technologist.com> wrote:

>Peter Curran wrote:

>"Mismatch" was a bad choice of words; "Missing %$ specifier" would
>have been better.

My point was that my example seemed typical of the kinds of things one
would want to do with this kind of facility - pass a common set of
parameters in some situations, for example, and select which ones are
appropriate in each case. From an expected-usage point of view, there
was nothing missing at all in my example. Without this kind of
capability, it would often be necessary to re-implement the facility,
in the way I described earlier, I think. (I can think of other ways
to achieve it though - for example, yet another flag, to indicate that
the specified argument is not to be converted. By the time this is
all done, we will probably be into long-long-class ugliness.)

<snip>

Peter Curran

Jan 8, 1999
On Fri, 8 Jan 1999 01:58:02 GMT, s...@netcom.com (Steven Correll)
wrote:

<snip>


>Ironically, the extensive multibyte/wide-character support in C
>has been useless to us, because we chose to use Unicode/UTF-8 internally,
>but this extension to "printf" would have been a Godsend--except Microsoft
>didn't support it.

This was really my point - the difficulties of internationalization
are far more complex than this. A feature like this might be a bit of
a convenience - but "Godsend" sounds like a farfetched description.

Paul Eggert

Jan 9, 1999
ka...@gabi-soft.fr (J. Kanze) writes:

>egg...@twinsun.com (Paul Eggert) writes:

>|> The majority of users wouldn't need to mess with UCNs or some other
>|> escape, of course; we're only talking about the minority that doesn't
>|> have dollar signs on its keyboards.

>What "minority"? Europe has a larger population than the US, and
>European keyboards don't have $ signs.

It's a pretty safe bet that there are more computer keyboards in the US
than in Europe, and that there are proportionally more keyboards-with-$
in Europe than keyboards-without-$ in the US.

And if we limit ourselves to the users that we're talking about
(namely, C programmers), the bet becomes safer still.

J. Kanze

Jan 10, 1999
egg...@twinsun.com (Paul Eggert) writes:

|> The majority of users wouldn't need to mess with UCNs or some other
|> escape, of course; we're only talking about the minority that doesn't
|> have dollar signs on its keyboards.

What "minority"? Europe has a larger population than the US, and
European keyboards don't have $ signs.

--
James Kanze +33 (0)1 39 23 84 71 mailto: ka...@gabi-soft.fr
GABI Software, 22 rue Jacques-Lemercier, 78000 Versailles, France
Conseils en informatique orientée objet --
-- Beratung in objektorientierter Datenverarbeitung

J. Kanze

Jan 10, 1999
David R Tribble <dtri...@technologist.com> writes:

[Concerning the POSIX positional argument extension to printf...]


|> > Even if this problem is resolved (e.g. require that, if the third
|> > argument is referenced numerically, the first and second must be as
|> > well), the overhead of pre-processing the format string and recording
|> > the types of the parameters would be, IMHO, unacceptable. (This
|> > pre-processing is necessary because, for example, to locate the third
|> > parameter, it is necessary to first know the types (and hence sizes)
|> > of the first two.)
|>
|> And yet this feature was considered useful enough, and apparently
|> easy enough, to actually be implemented on many POSIX systems.

I've implemented it, and it isn't difficult. You do have to scan the
format string twice. Big deal, compared to the rest of what is going
on.

|> ===
|> In the interests of making such a thing standard, though, perhaps
|> we should give some thought about how to make it easier to
|> implement. Perhaps we mandate that:
|>
|> 1. If any '%' format specifier uses an argument number prefix,
|> then all of the format specifiers in the format string must
|> do so.

I believe that this is the case in Posix.

|> 2. If an arg number specifier is present, an arg number specifier
|> must also be present for each arg number less than it. A format
|> string not meeting this requirement is ill-formed and results in
|> undefined behavior.

Ditto.

|> 3. Argument number prefixes must immediately follow the '%' format
|> character and must precede all other modifiers for that formatting
|> specification. (We also don't want to place any limit on the arg
|> number, i.e., allow one *or more* digits preceding the '/'.)

I'm not sure, but I think that Posix restricts it to one digit. This
makes a significant difference in the implementation -- with a finite
maximum number of args, you can keep the results from the first scan on
the stack, otherwise, you need dynamic memory, which is a lot more
expensive. (My own implementation used dynamic memory anyway, since it
fully expanded all of the arguments, in order, before starting the
second scan, which did the actual output. In practice, however, I've
used schemes in other contexts where stack based memory was used up to a
certain maximum, before converting to heap.)
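
The stack-based variant can be sketched as follows. This is not the poster's implementation: `MAX_ARGS`, `union arg_value`, `fetch_args` and `fetch` are hypothetical names, and the conversion table is assumed to come from an earlier scan of the format string. The point is that with a fixed maximum the fetched arguments fit in a small automatic array, and that va_arg has to walk the arguments in order with the right type for each one.

```c
#include <stdarg.h>

#define MAX_ARGS 9   /* hypothetical fixed limit, so no heap is needed */

/* One fetched argument; which member is valid depends on its conversion. */
union arg_value {
    int          i;   /* 'd', 'i', 'c' */
    unsigned     u;   /* 'u', 'o', 'x', 'X' */
    const char  *s;   /* 's' */
    void        *p;   /* 'p' */
};

/*
 * Walk the variable arguments strictly in order 1..count, using the
 * conversion recorded for each position by an earlier scan of the format.
 * Returns 0 on success, -1 on an unknown conversion or too many arguments.
 */
static int fetch_args(union arg_value out[MAX_ARGS], const char *conv,
                      int count, va_list ap)
{
    if (count < 0 || count > MAX_ARGS)
        return -1;
    for (int i = 0; i < count; i++) {
        switch (conv[i]) {
        case 'd': case 'i': case 'c':
            out[i].i = va_arg(ap, int);
            break;
        case 'u': case 'o': case 'x': case 'X':
            out[i].u = va_arg(ap, unsigned);
            break;
        case 's':
            out[i].s = va_arg(ap, const char *);
            break;
        case 'p':
            out[i].p = va_arg(ap, void *);
            break;
        default:
            return -1;
        }
    }
    return 0;
}

/* Variadic wrapper: the table lives in the caller's automatic storage. */
static int fetch(union arg_value out[MAX_ARGS], const char *conv,
                 int count, ...)
{
    va_list ap;
    va_start(ap, count);
    int rc = fetch_args(out, conv, count, ap);
    va_end(ap);
    return rc;
}
```

Once the table is filled, the second scan can emit the arguments in whatever order the format string asks for.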

|> 4. If the format string uses format specifiers with argument number
|> prefixes, then the entire format string must begin with the
|> sequence "%/". (This lets printf() know early on in the string
|> processing that it has to do argument number processing; if this
|> sequence is absent, it can evaluate the format string
|> left-to-right without needing to pre-scan it.) A format
|> string that does not meet this requirement is ill-formed and
|> results in undefined behavior.

That's not worth the bother.

Let's not forget that we are formatting, which is a relatively expensive
operation. The time necessary to scan the string once before starting
is negligible.

J. Kanze

Jan 10, 1999
pcu...@acm.gov (Peter Curran) writes:

|> On Thu, 07 Jan 1999 18:54:20 -0600, David R Tribble
|> <dtri...@technologist.com> wrote:
|>
|> >Peter Curran wrote:
|>
|> >"Mismatch" was a bad choice of words; "Missing %$ specifier" would
|> >have been better.
|>
|> My point was that my example seemed typical of the kinds of things one
|> would want to do with this kind of facility - pass a common set of
|> parameters in some situations, for example, and select which ones are
|> appropriate in each case.

If that's what you want to do, you are using the wrong facility. The
typical use is that you are outputting an error message in several
different languages, and the order of the arguments is different for the
different languages.

I have regularly used this facility in the past. When I shifted to C++,
I developed something similar for the C++ iostreams, simply because I
cannot conceive of programming without it.

I normally use the POSIX gettext facility to get the format string. One
nice feature of this is that it uses a default string as the key. So it
is easy to "wrap" it to check that the use of format specifiers in the
returned (translated) string is compatible with that in the default
string.
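
The kind of wrapper mentioned here could be sketched like this. It is an illustration only, not the poster's code: `arg_signature`, `formats_compatible` and `checked_format` are hypothetical names, the parser accepts only "%1$" to "%9$" without length modifiers, and the policy (the translation must use exactly the same arguments with the same conversions, otherwise fall back to the default string) is an assumption. In use, the second argument would be the string returned by gettext() for the default string.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

#define MAX_ARGS 9   /* hypothetical limit for this sketch */

/*
 * Record the conversion character used for each "%n$" argument in fmt,
 * with '-' marking an unused position. Flags, width and precision are
 * skipped; "%%" is ignored. Returns false for anything this simplified
 * parser does not understand, including non-positional specifiers.
 */
static bool arg_signature(const char *fmt, char sig[MAX_ARGS + 1])
{
    memset(sig, '-', MAX_ARGS);
    sig[MAX_ARGS] = '\0';
    for (const char *p = strchr(fmt, '%'); p != NULL; p = strchr(p + 1, '%')) {
        p++;
        if (*p == '%')
            continue;
        if (!isdigit((unsigned char)*p) || *p == '0' || p[1] != '$')
            return false;           /* only "%1$".."%9$" are accepted */
        int n = *p - '0';
        p += 2;
        while (*p != '\0' && strchr("-+ #0123456789.", *p) != NULL)
            p++;
        if (!isalpha((unsigned char)*p))
            return false;
        if (sig[n - 1] != '-' && sig[n - 1] != *p)
            return false;           /* one argument, two conversions */
        sig[n - 1] = *p;
    }
    return true;
}

/* True if both strings use the same arguments with the same conversions. */
static bool formats_compatible(const char *def, const char *translated)
{
    char want[MAX_ARGS + 1], got[MAX_ARGS + 1];

    if (!arg_signature(def, want) || !arg_signature(translated, got))
        return false;
    return strcmp(want, got) == 0;
}

/* Use the translation only if it is compatible with the default string. */
static const char *checked_format(const char *def, const char *translated)
{
    return formats_compatible(def, translated) ? translated : def;
}
```

A translation that turns `%2$u` into `%2$s`, or drops an argument, is then replaced by the untranslated default instead of reaching printf().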

Francis Glassborow

Jan 10, 1999
In article <m3g19k6...@gabi-soft.fr>, J. Kanze <ka...@gabi-soft.fr>
writes

>What "minority"? Europe has a larger population than the US, and
>European keyboards don't have $ signs.

The UK is part of Europe and our keyboards have always had $ signs.

Francis Glassborow Chair of Association of C & C++ Users
64 Southfield Rd
Oxford OX4 1PA +44(0)1865 246490
All opinions are mine and do not represent those of any organisation

Chris Hills

Jan 10, 1999
In article <779jeb$5vc$1...@shade.twinsun.com>, Paul Eggert
<egg...@twinsun.com> writes
>ka...@gabi-soft.fr (J. Kanze) writes:

>
>>egg...@twinsun.com (Paul Eggert) writes:
>
>>What "minority"? Europe has a larger population than the US, and
>>European keyboards don't have $ signs.
>
>It's a pretty safe bet that there are more computer keyboards in the US
>than in Europe,
No it is not a safe bet.

>and that there are proportionally more keyboards-with-$
>in Europe than keyboards-without-$ in the US.

That is true. The $ is used as a currency symbol in countries other than
the US.

/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
\/\/\/\/\/\ Chris Hills Staffs /\/\/\/\/\/
/\/\/\/\/\/\/\/\/\ England /\/\/\/\/\/\/\/\
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/

James Kuyper

Jan 10, 1999
Paul Eggert wrote:
>
> ka...@gabi-soft.fr (J. Kanze) writes:
>
> >egg...@twinsun.com (Paul Eggert) writes:
>
> >|> The majority of users wouldn't need to mess with UCNs or some other
> >|> escape, of course; we're only talking about the minority that doesn't
> >|> have dollar signs on its keyboards.
>
> >What "minority"? Europe has a larger population than the US, and
> >European keyboards don't have $ signs.
>
> It's a pretty safe bet that there are more computer keyboards in the US
> than in Europe, and that there are proportionally more keyboards-with-$
> in Europe than keyboards-without-$ in the US.
>
> And if we limit ourselves to the users that we're talking about
> (namely, C programmers), the bet becomes safer still.

Implementations without "$" are more than common enough that they should
be catered to, regardless of whether or not they are in the majority.

Paul Eggert

Jan 10, 1999
James Kuyper <kuy...@wizard.net> writes:

>Implementations without "$" are more than common enough that they should
>be catered to, regardless of whether or not they are in the majority.

The C standard can cater to minority implementations by defining '\s'
(or '\d' or whatever you like) to stand for the format character in
question. In the US, '\s' would stand for '$'; in Denmark it could
stand for the Euro symbol, if that's what they prefer.

Switching to '/' (or to any character other than '$') would be a break
with existing practice, and it's simply not worth it.

Let me put it this way: when the ISO standardized the Unix shell, what
did they do about its use of '$'? Did they change it to '/'?

Clive D.W. Feather

Jan 11, 1999
In article <3693FBE4...@technologist.com>, David R Tribble
<dtri...@technologist.com> writes

>POSIX supports printf/scanf formatting string specifier prefixes of
>the form '<digits>$', so that arguments can be given in a fixed order
>while the format string can refer to them in an arbitrary order.
[...]

>Question: Was the printf/scanf '$' format spec ever considered by
>the ISO/IEC committee as a possible enhancement for C9X?

I've never seen a formal proposal. I *vaguely* remember someone raising
it and meeting a wall of disinterest.

--
Clive D.W. Feather | Director of | Work: <cl...@demon.net>
Tel: +44 181 371 1138 | Software Development | Home: <cl...@davros.org>
Fax: +44 181 371 1037 | Demon Internet Ltd. | Web: <http://www.davros.org>
Written on my laptop; please observe the Reply-To address

Clive D.W. Feather

Jan 11, 1999
In article <3695573C...@technologist.com>, David R Tribble
<dtri...@technologist.com> writes

>And yet this feature was considered useful enough, and apparently
>easy enough, to actually be implemented on many POSIX systems.

It's in the X/Open standards.

> 1. If any '%' format specifier uses an argument number prefix,
> then all of the format specifiers in the format string must
> do so.

Required by X/Open.

> 2. If an arg number specifier is present, an arg number specifier
> must also be present for each arg number less than it. A format
> string not meeting this requirement is ill-formed and results in
> undefined behavior.

From memory X/Open is stronger: if there are N arguments after the
format there must be exactly N format specifiers and each must have a
different number between 1 and N.

> 3. Argument number prefixes must immediately follow the '%' format
> character and must precede all other modifiers for that formatting
> specification.

X/Open says that conversion specifiers begin "%" or "%1$", "%2$", etc.
In other words, they're a prefix, not a modifier.

> 4. If the format string uses format specifiers with argument number
> prefixes, then the entire format string must begin with the
> sequence "%/".

As others have said, there's no point.

Yves Arrouye

Jan 11, 1999
>In article <3693FBE4...@technologist.com>, David R Tribble
><dtri...@technologist.com> writes

>>POSIX supports printf/scanf formatting string specifier prefixes of
>>the form '<digits>$', so that arguments can be given in a fixed order
>>while the format string can refer to them in an arbitrary order.
>[...]
>>Question: Was the printf/scanf '$' format spec ever considered by
>>the ISO/IEC committee as a possible enhancement for C9X?
>
>I've never seen a formal proposal. I *vaguely* remember someone raising
>it and meeting a wall of disinterest.

It's good as a simple way for localizing messages, though there are enough
other ways to do it without having to add to printf's formats.

Yves.

Geoff Clare

Jan 12, 1999
"Clive D.W. Feather" <cl...@on-the-train.demon.co.uk> writes:

>> 2. If an arg number specifier is present, an arg number specifier
>> must also be present for each arg number less than it. A format
>> string not meeting this requirement is ill-formed and results in
>> undefined behavior.

>From memory X/Open is stronger: if there are N arguments after the
>format there must be exactly N format specifiers and each must have a
>different number between 1 and N.

Perhaps you should have checked the on-line UNIX98 spec. instead of
trusting your memory. It says:

"When numbered argument specifications are used, specifying the Nth
argument requires that all the leading arguments, from the first to
the (N-1)th, are specified in the format string."

and also:

"In format strings containing the %n$ form of conversion
specifications, numbered arguments in the argument list can be
referenced from the format string as many times as required."

Your statement also does not take account of arguments used to specify
field width and precision values.

For example, I believe the following is valid usage:

printf("%1$*4$s %2$*4$s %3$*4$s\n", val1, val2, val3, fieldwidth);
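
A runnable version of that call, written with snprintf() so the result can be inspected. It assumes a library that implements the numbered-argument forms, including `*4$` for the field width; the function name `format_columns` is hypothetical:

```c
#include <stdio.h>

/*
 * Print three strings right-aligned in columns of the same width.
 * Arguments 1..3 are the values; argument 4 is referenced three times
 * as the field width via "*4$".
 */
static int format_columns(char *buf, size_t size, const char *val1,
                          const char *val2, const char *val3, int fieldwidth)
{
    return snprintf(buf, size, "%1$*4$s %2$*4$s %3$*4$s\n",
                    val1, val2, val3, fieldwidth);
}
```

The fourth argument is used three times here, which is the "as many times as required" wording quoted from the UNIX98 text.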

David R Tribble

Jan 13, 1999
Incorporating the latest days' worth of comments, I have revised
my rules for the '$'-printf suggestion. (These rules look pretty
much the same as the POSIX rules, except that they use '/' instead
of '$'.)

1. If any '%' format specifier uses an argument number prefix,
then all of the format specifiers in the format string must
do so.

2. If an arg number specifier is present, an arg number specifier
must also be present for each arg number less than it. A format
string not meeting this requirement is ill-formed and results in
undefined behavior.

3. Argument number prefixes must immediately follow the '%' format
character and must precede all other modifiers for that formatting
specification. (We also don't want to place any limit on the arg
number, i.e., allow one *or more* digits preceding the '/'.)

4. Argument number prefixes may contain one or more digits.
However, there is a maximum number of positional arguments that
the implementation is capable of handling. Specifying a
positional argument number greater than this maximum results in
undefined behavior. This maximum value shall be represented by
the preprocessor macro named 'MAX_PRINTF_ARGS', defined in the
<stdio.h> header file.

5. A format string containing positional argument formatting
prefixes may reference any positional argument more than once,
and in any sequence.

(I removed the requirement for a leading "%/" in the format string.
It occurred to me that you don't need to double-scan the format
string; you simply scan from left to right, formatting as you go,
until you encounter the first "%n/" prefix; at that point, you
know you have a string with "%/" prefixes, so you double-scan and
apply positional formatting semantics at that point. If you've
already processed non-positional "%" specifiers, you return an
error, since the format string is ill-formed.)

My example is now:
printf("%2/u: %3/s in file %1/s, line %2/u\n", file, line, msg);
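
The detection step in the parenthetical above can be sketched as a single left-to-right classification. The names `fmt_style` and `classify_format` are hypothetical, and the separator is passed in so the same sketch covers both the proposed '/' and the X/Open '$'. A string that mixes plain and numbered specifiers is reported as ill-formed, matching rule 1.

```c
#include <ctype.h>

enum fmt_style { FMT_PLAIN, FMT_POSITIONAL, FMT_MIXED };

/*
 * Classify a format string in one left-to-right pass. A specifier counts
 * as positional when '%' is followed by digits and then sep. "%%" is a
 * literal and counts as neither. Mixing both kinds in one string is
 * reported as FMT_MIXED (ill-formed).
 */
static enum fmt_style classify_format(const char *fmt, char sep)
{
    int plain = 0, positional = 0;

    for (const char *p = fmt; *p != '\0'; p++) {
        if (*p != '%')
            continue;
        p++;
        if (*p == '\0')
            break;                  /* stray '%' at the end of the string */
        if (*p == '%')
            continue;               /* literal percent sign */

        /* Look ahead for "digits + sep" without moving p. */
        const char *q = p;
        while (isdigit((unsigned char)*q))
            q++;
        if (q > p && *q == sep)
            positional++;
        else
            plain++;
    }
    if (plain > 0 && positional > 0)
        return FMT_MIXED;
    return positional > 0 ? FMT_POSITIONAL : FMT_PLAIN;
}
```

A real printf() would stop at the first numbered prefix and switch to the two-scan path; this sketch only shows how the two kinds of string can be told apart.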

I don't have a preference for either "%1/" or "%1$". I admit that
since POSIX uses '$' it would probably be a good idea to lean in
that direction, despite the complaints from the Europeans.

BTW, what character is at the key position that is '$' on US
keyboards but is something else on European keyboards? (It's
usually shift-4 in the US.) I assume it's pound-sterling (£) in
the UK (which would be a reasonable substitute for '$').

Clive D.W. Feather

Jan 14, 1999
In article <F5GCI...@root.co.uk>, Geoff Clare <g...@root.co.uk> writes

>>From memory X/Open is stronger: if there are N arguments after the
>>format there must be exactly N format specifiers and each must have a
>>different number between 1 and N.
>
>Perhaps you should have checked the on-line UNIX98 spec. instead of
>trusting your memory.

Perhaps I should have said "was", since I'm thinking of early 1990s.

>It says:
> "When numbered argument specifications are used, specifying the Nth
> argument requires that all the leading arguments, from the first to
> the (N-1)th, are specified in the format string."

I think that's been weakened since the stuff I remember.

>and also:
> "In format strings containing the %n$ form of conversion
> specifications, numbered arguments in the argument list can be
> referenced from the format string as many times as required."

and that definitely has.

>Your statement also does not take account of arguments used to specify
>field width and precision values.

I'd forgotten them. But they're "format specifiers" within the meaning
of what I was saying (and probably nowhere else).

Clive D.W. Feather

Jan 14, 1999, 3:00:00 AM
In article <369D462C...@technologist.com>, David R Tribble
<dtri...@technologist.com> writes

> 5. A format string containing positional argument formatting
> prefixes may reference any positional argument more than once,
> and in any sequence.

You need to add a consistency requirement.

You also haven't addressed modifiers (unless I overlooked it).

>BTW, what character is at the key position that is '$' on US
>keyboards but is something else on European keyboards? (It's
>usually shift-4 in the US.) I assume it's pound-sterling (£) in
>the UK (which would be a reasonable substitute for '$').

On my machine it's '$' (dollar); '£' (pound sterling) is shift-3. Most
implementations I've come across that understand the pound sign use it
as equivalent to # in both shell and C. In fact, isn't $ in the ISO-646
invariant subset ?

Larry Jones

Jan 14, 1999, 3:00:00 AM
Clive D.W. Feather wrote:
>
> In fact, isn't $ in the ISO-646 invariant subset ?

Not unless ISO 646 has changed since the 1983 edition -- it allows 0x24
to be either a dollar sign or a generic currency sign (just like it
allows 0x23 to be either a number sign or a pound (sterling) sign).

-Larry Jones

OK, what's the NEXT amendment say? I know it's in here someplace. --
Calvin

David R Tribble

Jan 14, 1999, 3:00:00 AM
> David R Tribble <dtri...@technologist.com> writes
> > 5. A format string containing positional argument formatting
> > prefixes may reference any positional argument more than once,
> > and in any sequence.

Clive D.W. Feather wrote:
> You need to add a consistency requirement.

Yes. And probably other legalese as well.

> You also haven't addressed modifiers (unless I overlooked it).

Oops. I take the position that "%*.*s" requires three args in
succession, so by the same reasoning, "%3/*.*s" should also require
three args in succession (treated as a single "positional argument
group").

Therefore, the following is legal:
printf("file %2/.*s in dir %1/.*s\n", dirname, 80, fname, 14);
             ^----^        ^----^     ^---------^  ^-------^
             2 args        2 args     1st group    2nd group

Someone else mentioned the ability to skip args. I don't see
the utility of this (you can't do it with current printf format
strings), but perhaps "%2//" would work? (Or just "%2/0.0s"?)

Concerning '$' versus '/': Right now I lean towards
'$'/currency-sign, which would work under ISO-646.

Geoff Clare

Jan 15, 1999, 3:00:00 AM
"Clive D.W. Feather" <cl...@on-the-train.demon.co.uk> writes:

>Perhaps I should have said "was", since I'm thinking of early 1990s.

>>It [UNIX98] says:
>> "When numbered argument specifications are used, specifying the Nth
>> argument requires that all the leading arguments, from the first to
>> the (N-1)th, are specified in the format string."

>I think that's been weakened since the stuff I remember.

The text was identical in XPG3 (1988).

>>and also:
>> "In format strings containing the %n$ form of conversion
>> specifications, numbered arguments in the argument list can be
>> referenced from the format string as many times as required."

>and that definitely has.

That text first appeared in XPG4 (1992). My guess is it was added in
order to clarify whether this usage is allowed, rather than to change
the spec., since it isn't mentioned in the change history in XPG4.

Christian Bau

Jan 15, 1999, 3:00:00 AM
In article <369E7A4F...@technologist.com>, David R Tribble
<dtri...@technologist.com> wrote (discussing extensions of printf)

> Someone else mentioned the ability to skip args. I don't see
> the utility of this (you can't do it with current printf format
> strings), but perhaps "%2//" would work? (Or just "%2/0.0s"?)

At the very least, the format string must contain information about the
type of the argument to be skipped. If you want to ignore the first
argument after the format string, at the very least you have to know
whether I passed an int, a long, a double or a pointer.

Clive D.W. Feather

Jan 16, 1999, 3:00:00 AM
In article <369E7A4F...@technologist.com>, David R Tribble
<dtri...@technologist.com> writes

>> You also haven't addressed modifiers (unless I overlooked it).
>Oops. I take the position that "%*.*s" requires three args in
>succession, so by the same reasoning, "%3/*.*s" should also require
>three args in succession (treated as a single "positional argument
>group").

On the other hand, it can be useful to pick up the width for several
descriptors from the same argument. And X/Open appears to do it that way
(by the way, why are you designing something different to X/Open ?).

>Someone else mentioned the ability to skip args. I don't see
>the utility of this (you can't do it with current printf format
>strings), but perhaps "%2//" would work? (Or just "%2/0.0s"?)

It's useful when internationalizing strings where some languages need
more information than others. For example, the negative in French
contains two words, one on either side of the verb ("ne ... pas", "ne
... plus", "ne ... jamais") while English has one ("not", "no more",
"never").

Peter Curran

Jan 17, 1999, 3:00:00 AM
On Sat, 16 Jan 1999 12:54:52 +0000, "Clive D.W. Feather"
<cl...@on-the-train.demon.co.uk> wrote:

>In article <369E7A4F...@technologist.com>, David R Tribble
><dtri...@technologist.com> writes

<snip>


>>Someone else mentioned the ability to skip args. I don't see
>>the utility of this (you can't do it with current printf format
>>strings), but perhaps "%2//" would work? (Or just "%2/0.0s"?)
>
>It's useful when internationalizing strings where some languages need
>more information than others. For example, the negative in French
>contains two words, one on either side of the verb ("ne ... pas", "ne
>... plus", "ne ... jamais") while English has one ("not", "no more",
>"never").

To give another, unilingual, example, an application could provide
several levels of verbosity - perhaps "silent", "expert", "regular",
"novice" and "debug". A possible implementation is: the application
code calls a printf-like function, in which the first parameter is a
message code. The message generator uses the message code to select
an appropriate format string, based in part on the current verbosity
level, and passes it, along with the data, on to vprintf. The format
string is designed to select the appropriate pieces of information
from that provided by the application.
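The message-generator design described above might look roughly like the following sketch. Everything here is assumed for illustration: the names report, select_format and the two verbosity levels are hypothetical, the argument list is fixed as (file, line, msg), vsnprintf stands in for vprintf so the output can be checked, and the numbered '$' formats rely on the X/Open extension being available. The terse format uses only the leading arguments, because the UNIX98 text quoted earlier in the thread requires every argument before the highest one referenced to be specified.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

enum verbosity { V_EXPERT, V_NOVICE };

/* Hypothetical catalog lookup keyed on message code and verbosity. */
const char *select_format(int code, enum verbosity level)
{
    (void)code;   /* a real catalog would index on the message code too */
    return level == V_EXPERT ? "%1$s:%2$d\n"
                             : "Problem in %1$s at line %2$d: %3$s\n";
}

/* printf-like entry point: the application always passes the same data,
   and the selected format picks what to show. */
int report(char *buf, size_t size, enum verbosity level, int code, ...)
{
    va_list ap;
    int n;

    assert(buf != NULL);
    va_start(ap, code);
    n = vsnprintf(buf, size, select_format(code, level), ap);
    va_end(ap);
    return n;
}

/* Returns 1 if both verbosity levels produce the expected text. */
int report_demo(void)
{
    char terse[64], chatty[64];

    report(terse, sizeof terse, V_EXPERT, 42, "main.c", 7, "syntax error");
    report(chatty, sizeof chatty, V_NOVICE, 42, "main.c", 7, "syntax error");
    return strcmp(terse, "main.c:7\n") == 0
        && strcmp(chatty, "Problem in main.c at line 7: syntax error\n") == 0;
}
```

A format that wanted only the third argument could not be written this way without some form of skip syntax, which is the limitation under discussion.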

Note that
- the information used at the different verbosity levels need not
  form subsets.
- the application programmer is probably not the right person to
  be deciding which information is appropriate for each level.
  That should be left to a UI specialist.
- the information for each level is likely to change in subsequent
  releases, as the result of user feedback.

All this indicates that simplistic solutions, such as having the
application programmer order the information by verbosity level, are
unsatisfactory.

This example is simplistic, but the point is that the
parameter-ordering concept is not adequate to deal with the real-world
problems of data selection, and IMHO is not a good candidate for the
standard. It just doesn't solve a big enough problem.

David R Tribble

Jan 20, 1999, 3:00:00 AM
"Clive D.W. Feather" wrote:
>
> David R Tribble <dtri...@technologist.com> writes
> >> You also haven't addressed modifiers (unless I overlooked it).
>> Oops. I take the position that "%*.*s" requires three args in
>> succession, so by the same reasoning, "%3/*.*s" should also require
>> three args in succession (treated as a single "positional argument
>> group").
>
> On the other hand, it can be useful to pick up the width for several
> descriptors from the same argument. And X/Open appears to do it that
> way.

That would, in my opinion, impose unreasonable restrictions on
the format specifiers; I'd like to be able to specify separate,
independent widths/precisions for different formatted items;
printf already lets me do that, why remove this valuable capability?

> (by the way, why are [you] designing something different to X/Open?).

I'm not. In the early discussions of this thread, it appeared that
a) The POSIX spec was not sufficiently precise (for things like
skipped args, width/precision args, max arg number, etc.)
b) The whole '$' is/isn't ISO-646 controversy.

Since then, these issues have been answered, and what we're left
with is practically identical to POSIX (XPG4). Which I'm all in
favor of.

>> Someone else mentioned the ability to skip args. I don't see
>> the utility of this (you can't do it with current printf format
>> strings), but perhaps "%2//" would work? (Or just "%2/0.0s"?)
>
> It's useful when internationalizing strings where some languages need
> more information than others. For example, the negative in French
> contains two words, one on either side of the verb ("ne ... pas", "ne
> ... plus", "ne ... jamais") while English has one ("not", "no more",
> "never").

An alternate way to handle "ne ... pas" versus "not ..." is shown
by an example:

const char * fmt069 = "%s%s%s";

if (lang == LANG_ENGLISH)
    printf(fmt069, "not ", item, "");
else if (lang == LANG_FRENCH)
    printf(fmt069, "ne ", item, " pas");
else
    ...

You'll notice that the format string is the same for both English
and French (and for the other languages as well). I argue that
missing args, therefore, is more a question of the args themselves
rather than the formatting specs.

christ...@isltd.insignia.com (Christian Bau) wrote:
>> Someone else mentioned the ability to skip args. I don't see
>> the utility of this (you can't do it with current printf format
>> strings), but perhaps "%2//" would work? (Or just "%2/0.0s"?)
>
> At the very least, the format string must contain information about
> the type of the argument to be skipped. If you want to ignore the
> first argument after the format string, at the very least you have to
> know whether I passed an int, a long, a double or a pointer.

Would these work?:
%0.0d - int
%0.0ld - long int
%0.0lf - float/double
%0.0Lf - long double
%0.0s - string (char *)
%0.0p - pointer (void *)

If not, then we will need another way of specifying skipped args
(assuming we agree that this is a useful thing, which I don't at
present). How about:

%0$d - int
%0$s - string
%0$p - pointer
etc.

This also has the advantage of being able to skip the width and
precision args associated with a positional arg group:

%0$*.*s - string with width & prec (3 args)
etc.

Clive D.W. Feather

Jan 21, 1999, 3:00:00 AM
In article <36A6335D...@technologist.com>, David R Tribble
<dtri...@technologist.com> writes
>> David R Tribble <dtri...@technologist.com> writes
>> >> You also haven't addressed modifiers (unless I overlooked it).
>>> Oops. I take the position that "%*.*s" requires three args in
>>> succession, so by the same reasoning, "%3/*.*s" should also require
>>> three args in succession (treated as a single "positional argument
>>> group").
>>
>> On the other hand, it can be useful to pick up the width for several
>> descriptors from the same argument. And X/Open appears to do it that
>> way.
>
>That would, in my opinion, impose unreasonable restrictions on
>the format specifiers;

Why ?

If I understood Geoff Clare correctly, X/Open lets you say:

"%1$*3$.*4$d %2$*3$.*5$f"

where the arguments are:
1: value for %d
2: value for %f
3: width for both conversions
4: precision for %d
5: precision for %f

How do you do that with your proposal ? You would require two blocks of
3 arguments for this.
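The format string above can be exercised directly on a system that implements the X/Open numbered-argument extension (assumed here; it is not guaranteed by ISO C). In this sketch the function name shared_width_demo and the buffer sizes are arbitrary choices for illustration. It compares the numbered form, where argument 3 supplies the width for both conversions, with the classic form, which has to pass the width twice.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the numbered format gives the same text as the classic
   "%*.*d %*.*f" call, which must repeat the shared width. */
int shared_width_demo(int ival, double dval, int width, int iprec, int fprec)
{
    char numbered[64], classic[64];
    /* Arguments: 1 = int value, 2 = double value, 3 = shared width,
       4 = precision for %d, 5 = precision for %f. */
    const char *fmt = "%1$*3$.*4$d %2$*3$.*5$f";

    assert(width >= 0 && iprec >= 0 && fprec >= 0);
    snprintf(numbered, sizeof numbered, fmt, ival, dval, width, iprec, fprec);
    snprintf(classic, sizeof classic, "%*.*d %*.*f",
             width, iprec, ival, width, fprec, dval);
    puts(numbered);
    return strcmp(numbered, classic) == 0;
}
```

For example, shared_width_demo(42, 3.14159, 8, 3, 2) formats both values in a field of width 8 while passing that width only once.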

>I'd like to be able to specify separate,
>independent widths/precisions for different formatted items;
>printf already lets me do that, why remove this valuable capability?

I think you've misunderstood that.

>>> Someone else mentioned the ability to skip args. I don't see
>>> the utility of this

[...]


>> It's useful when internationalizing strings where some languages need
>> more information than others.

And for other purposes.

>An alternate way to handle "ne ... pas" versus "not ..." is shown
>by an example:

[...]

But that's not the way internationalization is normally done.

>Would these work?:
> %0.0d - int
> %0.0ld - long int

No, because those already have a meaning (they convert 0 to the empty
string, but otherwise are the same as %d).
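The existing meaning of "%0.0d" can be checked with a few lines of C: a precision of zero makes a zero value produce no characters, while any other value prints as with %d. So the argument is still consumed and, when nonzero, still printed. The function name zero_precision_demo is just a label chosen for this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if "%0.0d" prints nothing for 0 but prints 42 normally. */
int zero_precision_demo(void)
{
    char zero[16], nonzero[16];
    /* Held in a variable because some compilers warn that the '0' flag
       is ignored when a precision is given. */
    const char *fmt = "%0.0d";

    assert(fmt != NULL);
    snprintf(zero, sizeof zero, fmt, 0);
    snprintf(nonzero, sizeof nonzero, fmt, 42);
    return zero[0] == '\0' && strcmp(nonzero, "42") == 0;
}
```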

>If not, then we will need another way of specifying skipped args
>(assuming we agree that this is a useful thing, which I don't at
>present). How about:
>
> %0$d - int

No, that says use the argument at position 0.

You need syntax that says "position N is type T but unused". This is a
logical extension of a possible feature for normal printf - do not
output the value of this argument. You would do this with a new flag,
say "!".

So normal printf would use:

%!d - ignore an int argument
%!f - ignore a double argument

and X/Open would use:

%4$!d - ignore argument 4, which is an int
%7$!f - ignore argument 7, which is a double

>This also has the advantage of being able to skip the width and
>precision args associated with a positional arg group:

Not needed, since you just put 3 skip codes:

%4$!f%5$!d%6$!d

or you could allow the following to work:

%4$!*5$.*6$f

(both expect double, int, int in that order).

Francis Glassborow

Jan 21, 1999, 3:00:00 AM
In article <0krEteMw...@romana.davros.org>, Clive D.W. Feather
<cl...@on-the-train.demon.co.uk> writes

>>An alternate way to handle "ne ... pas" versus "not ..." is shown
>>by an example:
>[...]
>
>But that's not the way internationalization is normally done.

Anyway it assumes that there are no other mechanisms in other languages
to generate negatives. At this level I18N is very close to impossible.
If you know much of the syntax/semantics of languages such as Chinese
(any dialect will do) or Arabic, Punjabi etc. you may just get an
inkling as to the level of problem.

Even English can raise problems (some dialects use double negatives for
emphasis, while others have them cancel:)

David R Tribble

Jan 21, 1999, 3:00:00 AM
David R Tribble <dtri...@technologist.com> writes
>> I take the position that "%*.*s" requires three args in
>> succession, so by the same reasoning, "%3/*.*s" should also
>> require three args in succession (treated as a single "positional
>> argument group").

Clive:
>> On the other hand, it can be useful to pick up the width for several
>> descriptors from the same argument. And X/Open appears to do it that
>> way.

David:
>> That would, in my opinion, impose unreasonable restrictions on
>> the format specifiers;

Clive D.W. Feather wrote:
> Why ?
> If I understood Geoff Clare correctly, X/Open lets you say:
>
> "%1$*3$.*4$d %2$*3$.*5$f"
>
> where the arguments are:
> 1: value for %d
> 2: value for %f
> 3: width for both conversions
> 4: precision for %d
> 5: precision for %f
>
> How do you do that with your proposal ? You would require two blocks
> of 3 arguments for this.

I misunderstood. Now that I see that positional arg number prefixes
can apply to all of the components of the format spec (the item, the
item's width, and the item's precision), I see the beauty and
usefulness of the POSIX definition.

We would probably have to add some restrictions stating that
if any "n$" prefixes are used, they must also be used on any and all
width and precision specifiers.

>> Would these work?:
>> %0.0d - int
>> %0.0ld - long int
>
> No, because those already have a meaning (they convert 0 to the empty
> string, but otherwise are the same as %d).

I kind of suspected that.

>> If not, then we will need another way of specifying skipped args
>> (assuming we agree that this is a useful thing, which I don't at
>> present). How about:
>> %0$d - int
>
> No, that says use the argument at position 0.

Oops.

> You need syntax that says "position N is type T but unused". This is a
> logical extension of a possible feature for normal printf - do not
> output the value of this argument.
> You would do this with a new flag, say "!".

How about simply prefixing the position number with a '-' to indicate
that it is to be skipped? That way we don't need to invent (use up)
yet another special formatting character.

Thus your example of:
%4$!f%5$!d%6$!d
or
%4$!*5$.*6$f

Becomes:
%-4$d%-5$d%-6$f
or
%-4$*.-5$*-6$f

to skip an expected int, int, and double, in that order.

(If we don't like '-', we could use '~' or '!' instead.)

(Don't quote me on the exact POSIX-based lexicon; I admit that
it appears to be somewhat complicated. Which of these is correct?:
%1$*3$.4$f
%1$*.3$*4$f
%1$*3$.*4$f
I lean towards the 2nd one.)

David R Tribble

Jan 21, 1999, 3:00:00 AM
Francis Glassborow wrote:
> Clive D.W. Feather <cl...@on-the-train.demon.co.uk> writes
>>> An alternate way to handle "ne ... pas" versus "not ..." is shown
>>> by an example:
>>[...]
>>
>> But that's not the way internationalization is normally done.
>
> Anyway it assumes that there are no other mechanisms in other
> languages to generate negatives. At this level I18N is very close to
> impossible. If you know much of the syntax/semantics of languages
> such as Chinese (any dialect will do) or Arabic, Punjabi etc. you may
> just get an inkling as to the level of problem.
>
> Even English can raise problems (some dialects use double negatives
> for emphasis, while others have them cancel:)

Are you sayin there ain't no way, no how, to do it?

-- David R. Tribble, dtri...@technologist.com --

milli = 10**-3
micro = 10**-6
nano = 10**-9
pico = 10**-12
femto = 10**-15
atto = 10**-18
aintno = 10**-100

Douglas A. Gwyn

Jan 22, 1999, 3:00:00 AM
David R Tribble wrote:
> Are you sayin there ain't no way, no how, to do it?

Not even, oddly enough.

Geoffrey KEATING

Jan 22, 1999, 3:00:00 AM
Francis Glassborow <fra...@robinton.demon.co.uk> writes:

> In article <0krEteMw...@romana.davros.org>, Clive D.W. Feather
> <cl...@on-the-train.demon.co.uk> writes
> >>An alternate way to handle "ne ... pas" versus "not ..." is shown
> >>by an example:
> >[...]
> >
> >But that's not the way internationalization is normally done.
>
> Anyway it assumes that there are no other mechanisms in other languages
> to generate negatives. At this level I18N is very close to impossible.
> If you know much of the syntax/semantics of languages such as Chinese
> (any dialect will do) or Arabic, Punjabi etc. you may just get an
> inkling as to the level of problem.

I think the preferred way to do this sort of thing is to write

printf(found ? _("`%s' was found in `%s'.")
             : _("`%s' was not found in `%s'."),
       item, location);

where _ is a macro that does internationalisation. That is, you don't
use tricks with grammar, you go for whole sentences. However
(dragging the discussion vaguely back on topic), you still need a way
to cope with the translator deciding that

In `%2$s', `%1$s' was found.

is a better way (or perhaps the only way) to order the above sentence
in some language.
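The whole-sentence approach combined with a reordered translation can be sketched as follows. This is an assumption-laden illustration, not how any particular catalog library works: translate is a hypothetical stand-in for the `_` macro's lookup, its reorder flag simulates switching to a language that wants the other word order, and the numbered format depends on the X/Open extension being present.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a message-catalog lookup: returns the
   reordered sentence for one known message, else the original text. */
const char *translate(const char *msgid, int reorder)
{
    assert(msgid != NULL);
    if (reorder && strcmp(msgid, "`%s' was found in `%s'.") == 0)
        return "In `%2$s', `%1$s' was found.";
    return msgid;
}

/* The call site never changes: it always passes (item, location). */
int found_message(char *buf, size_t size, int reorder,
                  const char *item, const char *location)
{
    const char *fmt = translate("`%s' was found in `%s'.", reorder);
    return snprintf(buf, size, fmt, item, location);
}
```

With reorder set to 0 the call produces "`key' was found in `table'." and with reorder set to 1 it produces "In `table', `key' was found.", from the same argument list.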

--
Geoff Keating <Geoff....@anu.edu.au>

Geoff Clare

Jan 22, 1999, 3:00:00 AM
David R Tribble <dtri...@technologist.com> writes:

>Clive D.W. Feather wrote:
>> If I understood Geoff Clare correctly, X/Open lets you say:
>>
>> "%1$*3$.*4$d %2$*3$.*5$f"

>I misunderstood. Now that I see that positional arg number prefixes
>can apply to all of the components of the format spec (the item, the
>item's width, and the item's precision), I see the beauty and
>usefulness of the POSIX definition.

>We would probably have to add some restrictions stating that
>if any "n$" prefixes are used, they must also be used on any and all
>width and precision specifiers.

Yes. The X/Open specs say "The results of mixing numbered and unnumbered
argument specifications in a format string are undefined."

(And please stop referring to this as a POSIX feature. It is not required
by POSIX - only by the X/Open specs, which are supersets of POSIX.)

>(Don't quote me on the exact POSIX-based lexicon; I admit that
>it appears to be somewhat complicated. Which of these is correct?:
> %1$*3$.4$f
> %1$*.3$*4$f
> %1$*3$.*4$f
>I lean towards the 2nd one.)

The 3rd one is correct (Clive got it right in his example). To specify
argument numbers you simply replace each % with %n$ (except in %%, of
course) and each * with *n$.
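The rewriting rule just stated can be turned into a small mechanical converter. The sketch below is a hypothetical helper (number_format is an invented name) that assumes a well-formed, unnumbered input format and hands out argument numbers in the order plain printf would consume them: each '*' first, then the value. It is meant to illustrate the rule, not to be a complete format-string parser.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Rewrite an unnumbered format into numbered form: "%" becomes "%n$"
   and "*" becomes "*n$". Returns 0 on success, -1 if out is too small. */
int number_format(const char *in, char *out, size_t size)
{
    size_t used = 0;
    int next = 1;       /* next argument number to hand out */
    int in_spec = 0;    /* inside the flags/width/precision of a spec */

    assert(in != NULL && out != NULL);
    for (const char *p = in; *p != '\0'; p++) {
        char piece[32];

        if (*p == '%' && p[1] == '%') {
            snprintf(piece, sizeof piece, "%%%%");  /* "%%" stays as is */
            p++;
        } else if (*p == '%') {
            /* The value follows any '*' arguments of this spec. */
            int stars = 0;
            for (const char *q = p + 1;
                 *q != '\0' && strchr("-+ #0123456789.*", *q) != NULL; q++)
                stars += (*q == '*');
            snprintf(piece, sizeof piece, "%%%d$", next + stars);
            in_spec = 1;
        } else if (in_spec && *p == '*') {
            snprintf(piece, sizeof piece, "*%d$", next++);
        } else {
            if (in_spec && strchr("-+ #0123456789.", *p) == NULL) {
                in_spec = 0;    /* reached length modifier or conversion */
                next++;         /* account for the value argument */
            }
            snprintf(piece, sizeof piece, "%c", *p);
        }

        size_t len = strlen(piece);
        if (used + len >= size)
            return -1;
        memcpy(out + used, piece, len);
        used += len;
    }
    if (size == 0)
        return -1;
    out[used] = '\0';
    return 0;
}
```

Applied to "%*.*f" this yields "%3$*1$.*2$f", which has the same shape as the third candidate in the question above: a number after the '%' and a number after each '*'.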
