
va_list, va_start, va_end very nice ... but how to copy all vars as one parameter ?


Douwe

Aug 23, 2003, 1:52:27 PM
I am trying to build my own version of printf which just passes all
arguments to the original printf. As long as I stick to the
single-argument version everything is fine. But there is also a
version which uses "..." as the last parameter. How can I pass those
arguments to the original printf?

void myprintf(char *txt, ...)
{
    printf(txt, ???????);
}

Hopefully this time I'm in the right group :)

Richard Heathfield

Aug 23, 2003, 2:26:26 PM
Douwe wrote:

> Hopefully this time I'm in the right group :)

Yes, you are.

#include <stdarg.h>
#include <stdio.h>

void myprintf(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);   /* must name the last fixed parameter */

    /* you may do some va_arg stuff here if you wish */

    vprintf(fmt, ap);
    va_end(ap);
}


--
Richard Heathfield : bin...@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

Douwe

Aug 23, 2003, 6:07:19 PM
Richard Heathfield <dont...@address.co.uk.invalid> wrote in message news:<bi8bki$o59$1...@titan.btinternet.com>...

> Douwe wrote:
>
> > I am trying to build my own version of printf which just passes all
> > arguments to the original printf. As long as I stick to the
> > single-argument version everything is fine. But there is also a
> > version which uses "..." as the last parameter. How can I pass those
> > arguments to the original printf?
> >
> > void myprintf(char *txt, ...)
> > {
> >     printf(txt, ???????);
> > }
> >
> > hopefully this time i'm in the right group :)
>
> Yes, you are.
>
> #include <stdarg.h>
>
> void myprintf(const char *fmt, ...)
> {
> va_list ap;
> va_start(ap, fmt);
>
> /* you may do some va_arg stuff here if you wish */
>
> vprintf(fmt, ap);
> va_end(ap);
> }

Thank you Richard,

this is what I was searching for. I didn't know the function vprintf
yet (I probably missed it among the tons of available functions). I
assume that if this function didn't exist, forwarding the "..."
arguments to printf would be impossible. This also means that I would
have to write a vmyprintf to allow overriding of my own version.

Kevin Easton

Aug 24, 2003, 7:29:19 AM
Douwe <do...@parkserver.net> wrote:
> Richard Heathfield <dont...@address.co.uk.invalid> wrote in message news:<bi8bki$o59$1...@titan.btinternet.com>...
>> Douwe wrote:
>>
>> > I am trying to build my own version of printf which just passes all
>> > arguments to the original printf. As long as I stick to the
>> > single-argument version everything is fine. But there is also a
>> > version which uses "..." as the last parameter. How can I pass those
>> > arguments to the original printf?
>> >
>> > void myprintf(char *txt, ...)
>> > {
>> >     printf(txt, ???????);
>> > }
>> >
>> > hopefully this time i'm in the right group :)
>>
>> Yes, you are.
>>
>> #include <stdarg.h>
>>
>> void myprintf(const char *fmt, ...)
>> {
>> va_list ap;
>> va_start(ap, fmt);
>>
>> /* you may do some va_arg stuff here if you wish */

It's worth noting that the scope for changing the argument list here is
rather limited - you can't reorder it, remove arguments at arbitrary
points or add arguments. Perhaps one day this will change.

>> vprintf(fmt, ap);
>> va_end(ap);
>> }
>
> Thank you Richard,
>
> this is what I was searching for. I didn't know the function vprintf
> yet (I probably missed it among the tons of available functions). I
> assume that if this function didn't exist, forwarding the "..."
> arguments to printf would be impossible. This also means that I would
> have to write a vmyprintf to allow overriding of my own version.

Yes, when you're writing a printf-like function, the best practice is to
write the logic of the function in a vprintf-like function, with a small
wrapper function with the actual "..." variable argument list that just
calls the vprintf-like function.
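
A minimal sketch of that arrangement, assuming the hypothetical names
vmyprintf and myprintf (only vprintf and the <stdarg.h> macros come from
the standard library). The va_list version holds the logic, and the
"..." version does nothing except start, forward and end:

```c
#include <stdarg.h>
#include <stdio.h>

/* Core routine (hypothetical name): takes a va_list, so any other
   variadic function can forward its arguments here. */
int vmyprintf(const char *fmt, va_list ap)
{
    /* custom logic (prefixes, logging, ...) would go here */
    return vprintf(fmt, ap);
}

/* Thin wrapper (hypothetical name): owns the "..." and only forwards. */
int myprintf(const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);      /* fmt is the last fixed parameter */
    n = vmyprintf(fmt, ap);
    va_end(ap);
    return n;               /* characters written, or negative on error */
}
```

A further wrapper, say one that adds a timestamp, could then call
vmyprintf directly instead of trying to re-pass its own "...".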

- Kevin.

Paul Hsieh

Aug 24, 2003, 2:33:22 PM
Kevin Easton <kevin@-nospam-pcug.org.au> wrote:
> >> [...]

> >> void myprintf(const char *fmt, ...)
> >> {
> >> va_list ap;
> >> va_start(ap, fmt);
> >>
> >> /* you may do some va_arg stuff here if you wish */
>
> It's worth noting that the scope for changing the argument list here is
> rather limited - you can't reorder it, remove arguments at arbitrary
> points or add arguments. Perhaps one day this will change.

Worse yet ...

> >> vprintf(fmt, ap);

... the second argument of vprintf is not decorated with a "const"!
This means that once you pass ap to vprintf, or vsprintf, etc, you
cannot use ap usefully anymore, and va_next, or va_end is the only
thing you can really do to it.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Chris Torek

Aug 30, 2003, 6:31:04 PM
In article <796f488f.03082...@posting.google.com>

Paul Hsieh <q...@pobox.com> writes:
>... the second argument of vprintf is not decorated with a "const"!
>This means that once you pass ap to vprintf, or vsprintf, etc, you
>cannot use ap usefully anymore, and va_next, or va_end is the only
>thing you can really do to it.

(I have no idea what "va_next" is meant to refer to.)

This is indeed the case -- and there is a good reason for it.

While the C standard does not dictate any particular method by
which a system is to implement "va_list"s and the operations on
them, there are in fact two common techniques -- and they behave
in quite opposite ways.

The first and most obvious technique is the one used on typical
stack-based function-parameter mechanisms like those found in most
x86 C compilers. Here, any function call -- including calls to
functions like printf() -- simply pushes each argument, in this
case in reverse order, on "the" (there is only one) stack:

printf(fmt, a1, a2, a3);

turns into:

push a3
push a2
push a1
push fmt
call printf

and each "push" writes the value into the (single) stack and
decrements the stack pointer. This means that the fixed arguments
-- here, just one, the format "fmt" -- are immediately followed in
memory by the varying arguments:

raw memory
address       item
-----------   ----
0xabcd0128:   a3
0xabcd0124:   a2
0xabcd0120:   a1
0xabcd011c:   fmt

The <stdarg.h> implementation on such a system merely needs to
calculate one pointer, "address of a1", which fits in some ordinary
machine data pointer type (on the x86, excluding x86-64 anyway,
any 32-bit entity such as "char *" or "void *" will serve). Thus,
if stdarg.h is actually a readable file, some systems will even
contain something like:

typedef void *va_list;

(although gcc 3.x finally uses a magic compiler builtin, for
reasons that have more to do with the second common mechanism).

Naturally, if you pass the value of an ordinary "char *" or
"void *" pointer to some subsidiary function like vprintf(),
this passes a copy of the value of that pointer. The original
pointer is still valid, so:

vprintf(fmt, ap);
vprintf(fmt, ap);

simply prints the same thing twice.

The second technique is becoming more common today, because as CPUs
get faster, memory has not been keeping up. We now have 3 and 4
gigahertz CPUs backed by 133 to 200 megahertz RAM. Even with wide
datapaths and enormous caches, the CPUs tend to spend (or waste)
a lot of time just waiting for memory. So, why not keep parameters
inside the CPU? If you have a lot of registers, a call like:

printf(fmt, a1, a2, a3);

does not have to write the parameters to memory. Instead, the
compiler can do this:

move a3, %r19
move a2, %r18
move a1, %r17
move fmt, %r16
call printf

Since there are anywhere from 32 to 256 CPU registers, it is easy
to dedicate six or eight or so to parameters. If a function really
needs 4167 parameters, the extra ones can go in memory, but many
function calls will not need memory at all, and can go quite a bit
faster (typical speedups here are from 2 to 200 times faster,
depending on too many factors to describe). But now printf() has
a problem: how can we access the varying parameters if they are
in (non-addressable) registers instead of (addressable) memory?
What if there are more than the six or eight register parameters
so that some wind up in memory? Worse yet, what if floating-point
parameters go in floating-point registers that are separate from
integer registers?

There are a number of possible answers, but one that is used today
is to have functions like printf(), that take varying parameters,
dump those parameters into memory. Now they *are* addressable and
the old techniques work. But -- what memory? The memory regions
into which the integer (and, if needed, floating-point) registers
are to be written might be separate from the memory regions for
"overflow" parameters (those beyond the first six or eight). One
solution, again in use today, is to make the va_list type name a
structure. The structure tells where each group of parameters
lives:

struct __va_info {
    int _ni;      /* number of integer-registers left */
    int *_ip;     /* memory holding int-register values */
    int _nf;      /* number of fp-registers left */
    float *_fp;   /* memory holding fp-register values */
    void *_rest;  /* "overflow" stuff, e.g., on stack */
};

Now the type va_list becomes an alias for an array of one of these
structures:

typedef struct __va_info va_list[1];

The va_arg() macro (or built-in) uses the type to decide whether
the parameter would normally be in an integer register or floating-point
register:

if (the type would be floating) {
    n = ap->_nf;
    ptr = (void *)ap->_fp;
} else {
    n = ap->_ni;
    ptr = (void *)ap->_ip;
}

and then tests whether all of those registers are "used up", in which
case the argument must be in the overflow area, or whether it is in
the specified area:

if (n < n_needed)
    ptr = ap->_rest;

The desired value is then at *ptr, but the appropriate pointer must
of course be incremented, so the "final" version of the code is:

if (the type would be floating) {
    n = ap->_nf;
    if (n >= n_needed) {
        ptr = (void *)ap->_fp;
        ap->_nf -= n_needed;
        ap->_fp += n_needed;
    } else {
        ptr = ap->_rest;
        ap->_rest = (float *)ap->_rest + n_needed;
    }
} else {
    n = ap->_ni;
    if (n >= n_needed) {
        ptr = (void *)ap->_ip;
        ap->_ni -= n_needed;
        ap->_ip += n_needed;
    } else {
        ptr = ap->_rest;
        ap->_rest = (int *)ap->_rest + n_needed;
    }
}
/* and now *(type *)ptr gives the value */

(In fact, the "final" version is usually even more complicated,
due to register-pairing issues, and whether aggregate types are
passed by value or by indirection, and other such complications.
Of course, the va_list type, and the code to extract arguments,
depend on the compiler's argument layout -- which is why this is
best done by a compiler built-in. Only the compiler really knows
where it will put the arguments, so the compiler is the only one
that can be sure how to retrieve them later.)

In any case, the key point here is that va_list is now an alias
for "array (of size 1) of struct" with modifiable contents. A
call that passes a va_list parameter, e.g., to vprintf(), passes
the address of that array's first and only element:

va_list ap;         /* i.e., array 1 of struct __va_info */

va_start(ap, fmt);  /* fills in the struct */
vprintf(fmt, ap);   /* passes &ap[0] because ap is an array */

When vprintf() returns, the elements of the structure have been
modified -- ap[0]._ni and ap[0]._nf and ap[0]._ip and so on all
now contain the "after printing" values. If you call vprintf()
again:

vprintf(fmt, ap);

you will NOT get the same output for a "%d" format, for instance,
because ap[0]._ni and ap[0]._ip no longer give information about
parameter "a1".
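
The effect can be reproduced without any real va_list. The following is
a toy model only: toy_info, toy_list, toy_consume and toy_second_pass
are hypothetical names, and the "arguments" are plain ints in an array.
It shows that a parameter declared with an array-of-one-struct type is
really a pointer, so the callee drains the caller's state:

```c
/* Toy stand-in for a struct-based va_list (hypothetical names). */
struct toy_info {
    const int *next;   /* next unread argument */
    int left;          /* how many arguments remain */
};
typedef struct toy_info toy_list[1];

/* Plays the role of vprintf(): reads every argument, returns their sum.
   The parameter type decays to "struct toy_info *". */
int toy_consume(toy_list ap)
{
    int sum = 0;

    while (ap->left > 0) {
        sum += *ap->next++;   /* read one argument and advance */
        ap->left--;
    }
    return sum;
}

/* Consumes the same list twice and returns the second result. */
int toy_second_pass(const int *args, int count)
{
    toy_list ap;

    ap[0].next = args;
    ap[0].left = count;
    (void)toy_consume(ap);    /* first pass modifies the caller's ap[0] */
    return toy_consume(ap);   /* second pass finds nothing left */
}
```

In this model the second pass simply sees an empty list; with a real
va_list the result of the second call is not something to rely on at
all.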

Thus, we can repeat Paul Hsieh's conclusion, using only this knowledge
about actual implementations:

>This means that once you pass ap to vprintf, or vsprintf, etc, you
>cannot use ap usefully anymore, and [...] va_end is the only
>thing you can really do to it.

You *must* va_end the va_list object (because the standard says so
-- other than the pre-Standard-C Pyramid implementation, in which
va_start() contained an open brace and va_end() closed that brace,
I have never seen an implementation that actually does anything
with va_end()), and attempting to re-use it instead will do different
things on different implementations.
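
For code that genuinely needs to walk the arguments twice, C99 added
the va_copy macro to <stdarg.h>. A sketch, assuming a C99 compiler;
format_alloc is a hypothetical helper that measures the output with one
va_list and formats with a copy taken beforehand:

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'd formatted string, or NULL. */
char *format_alloc(const char *fmt, ...)
{
    va_list ap, ap2;
    char *buf = NULL;
    int len;

    va_start(ap, fmt);
    va_copy(ap2, ap);                    /* snapshot before ap is consumed */

    len = vsnprintf(NULL, 0, fmt, ap);   /* first pass: measure only */
    if (len >= 0 && (buf = malloc((size_t)len + 1)) != NULL)
        vsnprintf(buf, (size_t)len + 1, fmt, ap2);   /* second pass */

    va_end(ap2);                         /* each list gets its own va_end */
    va_end(ap);
    return buf;                          /* caller must free() the result */
}
```

The copy has to be made before the original is handed to a v* function,
since afterwards the original's state is indeterminate.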

Incidentally, we might note (as a last item) that passing varying
arguments in registers generally *slows down* the second type of
implementation (slightly). A hybrid system in which fixed arguments
go in registers and *any* varying arguments *always* go into memory
immediately is perhaps the best. This would allow the same simple
va_start and va_arg mechanisms one finds on x86 systems. It has
only one drawback: broken C code will misbehave.

In particular, consider the following program:

/* BUG: missing #include <stdio.h> */
int main(void) {
    printf("%s %s\n", "hello", "world");
    return 0;
}

In a hypothetical "parameters go in registers, except for the
varying arguments to functions like printf" system, this call to
printf() effectively declares printf() using an old-style ("K&R
C") declaration:

int printf();

which tells the compiler "printf has unknown, but fixed, parameters".
Since we have now lied to the compiler, it puts the two "%s"
arguments in registers, instead of memory. When printf() goes to
fetch the parameters from memory, they will not be there.

(There are tricks to get around this. For instance, the compiler
could decorate external function names, similar to C++-style "name
mangling", so that the program simply fails to link: kr$printf
would not match v1$printf, where kr$ means "K&R style" while "v1$"
means "varying arguments after one fixed argument". The linker
would allow a symbol like "kr$zorg" to match "aw$zorg", where "aw$"
stands for "ANSI C declaration, all parameters match their widened
types". For whatever reason, systems like this are anywhere from
rare to nonexistent.)
--
In-Real-Life: Chris Torek, Wind River Systems (BSD engineering)
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://67.40.109.61/torek/index.html (for the moment)
Reading email is like searching for food in the garbage, thanks to spammers.

Paul Hsieh

Aug 30, 2003, 10:10:54 PM
Chris Torek <nos...@elf.eng.bsdi.com> wrote:
> Paul Hsieh <q...@pobox.com> writes:
> >... the second argument of vprintf is not decorated with a "const"!
> >This means that once you pass ap to vprintf, or vsprintf, etc, you
> >cannot use ap usefully anymore, and va_next, or va_end is the only
> >thing you can really do to it.
>
> (I have no idea what "va_next" is meant to refer to.)

... va_arg ... (you know you might be a c.l.c regular when ...)



> This is indeed the case -- and there is a good reason for it.

For certain definitions of "good" ...



> While the C standard does not dictate any particular method by
> which a system is to implement "va_list"s and the operations on
> them, there are in fact two common techniques -- and they behave
> in quite opposite ways.
>
> The first and most obvious technique is the one used on typical
> stack-based function-parameter mechanisms like those found in most
> x86 C compilers.

This is only one mode found in most (especially older) x86 C
compilers. And yes it leads to a trivial va_* implementation, but
this is a thoroughly obsolete mode that dates back to the mid 80s.
The default calling convention for most modern x86 compilers stores
initial parameters into registers before pushing them onto the stack.

> The second technique is becoming more common today, because as CPUs
> get faster, memory has not been keeping up. We now have 3 and 4
> gigahertz CPUs backed by 133 to 200 megahertz RAM. Even with wide
> datapaths and enormous caches, the CPUs tend to spend (or waste)
> a lot of time just waiting for memory.

The tip of the x86 stack is *ALWAYS* in the L1 cache. This is an issue
only for the earliest parameters in very deep call chains, or if you
completely saturate the L1 cache in your inner loops. The reason for
moving to register-based parameter passing (which happened in the x86
world quite some time ago, among the leading compiler vendors) is that
the paired push and pop instructions are redundant, non-parallelisable,
and just generally slower than passing via registers. While this
overhead usually doesn't have a significant impact on well-written
code, that does nothing for poorly written code, which the compilers
can't do anything about.

> [...] So, why not keep parameters
> inside the CPU? If you have a lot of registers, a call like:
>
> printf(fmt, a1, a2, a3);
>
> does not have to write the parameters to memory. Instead, the
> compiler can do this:
>
> move a3, %r19
> move a2, %r18
> move a1, %r17
> move fmt, %r16
> call printf
>
> Since there are anywhere from 32 to 256 CPU registers, it is easy
> to dedicate six or eight or so to parameters.

For those who are counting, you actually only need 4 registers to do
the above. x86s have more than 4 registers.

> [ ... poor solution that doesn't work for "double" parameters snipped ... ]

Linked lists are a very poor data structure that in general should be
avoided like the plague. They are slow, and modifying them is not
thread safe. I have no doubt that the solution you give is one
possible solution that has been used, but it's obviously slow,
complicated and problematic.

struct __va_info {
    union {
        long long iParm;  /* If this makes sense; obviously other */
        double fParm;     /* constructions for other stack */
        void * aParm;     /* architectures might be necessary */
    } _parms[__PARMS_MAX_PARMS];
    int _nParms;
    void *_restOfStack;
};

typedef struct {
    struct __va_info inf;
    int idx;
} va_list;

So va_start(ap, firstParm) should store the parameters directly into
__va_info according to the implemented calling convention (paying
deference to what the prototype looks like), then set the idx entry to
0. One can then copy a va_list using the "=" operator directly, and
this would allow a "const" decorator to be trivially applied.
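
The copy-by-assignment property can be illustrated with a toy model.
This is not a real va_list implementation: toy_valist, toy_sum and
toy_sum_twice are hypothetical names, and the model stores only ints.
Because the whole state is a plain struct passed by value, the callee
advances its own copy and the caller's index is untouched:

```c
/* Toy model only (hypothetical names): ints only, at most 8 arguments. */
#define TOY_MAX_PARMS 8

typedef struct {
    int parms[TOY_MAX_PARMS];   /* captured argument values */
    int nparms;                 /* how many were captured */
    int idx;                    /* next argument to read */
} toy_valist;

/* Receives its own copy, so advancing idx cannot affect the caller. */
int toy_sum(toy_valist ap)
{
    int sum = 0;

    while (ap.idx < ap.nparms)
        sum += ap.parms[ap.idx++];
    return sum;
}

/* Sums the same list twice; both passes see every argument. */
int toy_sum_twice(const int *args, int count)
{
    toy_valist ap = { {0}, 0, 0 };
    toy_valist saved;
    int i;

    for (i = 0; i < count && i < TOY_MAX_PARMS; i++)
        ap.parms[ap.nparms++] = args[i];

    saved = ap;                           /* plain "=" copies the state */
    return toy_sum(ap) + toy_sum(saved);
}
```

The trade-off in this model is that every call copies the whole
parameter array, whereas the array-of-one-struct layout passes a single
pointer.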

> (In fact, the "final" version is usually even more complicated,
> due to register-pairing issues, and whether aggregate types are
> passed by value or by indirection, and other such complications.
> Of course, the va_list type, and the code to extract arguments,
> depend on the compiler's argument layout -- which is why this is
> best done by a compiler built-in. Only the compiler really knows
> where it will put the arguments, so the compiler is the only one
> that can be sure how to retrieve them later.)

Right -- but whatever mechanism, the mapping which exists at compile
time can be mapped to a mechanism which can be performed at run time
as well. But regardless, the indexing mechanisms can all be top-level
variables, meaning that the va_list can and should be copyable,
meaning that making the parameter const should not be a problem.

> In particular, consider the following program:
>
> /* BUG: missing #include <stdio.h> */
> int main(void) {
> printf("%s %s\n", "hello", "world");
> return 0;
> }
>
> In a hypothetical "parameters go in registers, except for the
> varying arguments to functions like printf" system, this call to
> printf() effectively declares printf() using an old-style ("K&R
> C"):
>
> int printf();

Yes, but this should only be an issue if the default parameter mapping
and coercion does not match the parameter mechanisms specified in the
prototype. For example, if the real parameters were any of the three:
long long, int, double, then their representations will all be
different in typical x86 C compilers.

This is going to be an issue no matter what -- if you mismatch your
parameters and prototypes, then things will go wrong anyway. The va_*
implementation shouldn't impact this one way or another, other than
being as susceptible to improper use as other parameter mismatch
problems.

> (There are tricks to get around this. For instance, the compiler
> could decorate external function names, similar to C++-style "name
> mangling", so that the program simply fails to link: kr$printf
> would not match v1$printf, where kr$ means "K&R style" while "v1$"
> means "varying arguments after one fixed argument". The linker
> would allow a symbol like "kr$zorg" to match "aw$zorg", where "aw$"
> stands for "ANSI C declaration, all parameters match their widened
> types". For whatever reason, systems like this are anywhere from
> rare to nonexistent.)

This is not a real solution. Currently, good C compilers just warn
about non-prototyped calls and mismatched prototypes.
