
Problem with "va_list" & variable arguments in 64-bit programs on SLES 10 SP1


Chuck Chopp

Sep 12, 2008, 9:08:25 PM
I have some code that is being built on the following:

Windows Server 2003, both 32-bit & 64-bit editions
Windows Vista, both 32-bit & 64-bit editions
Windows Server 2008, both 32-bit & 64-bit editions

Build tools: Visual Studio 2008.

SLES 10 SP1, both 32-bit & 64 bit editions

Build tools: gcc v4.1.2


This code has a function that takes a variable # of arguments, and it
serves as a wrapper around various functions in the printf() family,
namely the "v" variants, which directly accept a "va_list" argument.
The code uses va_start() and va_end() to initialize and release an
instance of "va_list", which gets passed in to vsnprintf() [Linux],
vswprintf() [Linux & Windows] and vsnprintf_s() [Windows].

The code compiles & links successfully for both 32-bit & 64-bit targets
on Windows, and it executes properly, too. It also compiles, links &
executes properly when producing a 32-bit binary on SLES 10 SP1.

Although it compiles & links successfully when producing a 64-bit binary
on SLES 10 SP1, it does not execute properly. The various printf()
functions that are being called all fail to produce proper output.

After simultaneously debugging both the 32-bit & 64-bit builds of the
program, and also comparing with what's happening on Windows, I've
obtained the following information:

On Windows, for both 32-bit & 64-bit targets, "va_list" is nothing more
than a "char *" pointer that points to a contiguous block of memory that
contains 32-bit values [on 32-bit build] and 64-bit values [on 64-bit
build]. Examining memory at that address value stored in an initialized
instance of "va_list" proves that this is true.

On Linux, for the 32-bit build of the program, "va_list" is also a "char
*", and it behaves the same as on Windows. Examining memory at that
address value stored in an initialized instance of "va_list" proves that
this is true.

On Linux, for the 64-bit build of the program, sizeof(va_list) returns a
value of 24, and examining an initialized instance of "va_list" in the
debugger [gdb] shows that it is a structured data type, with members
named "gp_offset", "fp_offset", "overflow_arg_area" and "reg_save_area".

For both the 32-bit & 64-bit builds of the program on Linux, the
"stdarg.h" header has a typedef that makes "va_list" equivalent to
"__gnuc_va_list", and another typedef that makes "__gnuc_va_list"
equivalent to "__builtin_va_list", which must be defined internally by
gcc itself, rather than being present in any header files.
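
For reference, the member names seen in gdb match the va_list element described by the System V x86-64 ABI. The sketch below mirrors that layout in a hypothetical struct (SysVVaListElem and print_va_list_sizes are illustrative names, not anything from gcc or glibc) and prints its size next to sizeof(va_list). On a 64-bit Linux build both would be expected to come out as 24; a 32-bit x86 build would be expected to report a pointer-sized va_list instead.

```cpp
#include <cstdarg>
#include <cstdio>

// Illustrative mirror of the x86-64 System V va_list element.
// The real type is compiler-internal (__builtin_va_list); this struct
// only copies the documented member layout for comparison.
struct SysVVaListElem {
    unsigned int gp_offset;          // offset of next general-purpose register slot
    unsigned int fp_offset;          // offset of next floating-point register slot
    void        *overflow_arg_area;  // next argument passed on the stack
    void        *reg_save_area;      // where the register arguments were saved
};

// Prints both sizes so a -m32 build and a -m64 build can be compared.
void print_va_list_sizes()
{
    std::printf("sizeof(va_list)        = %lu\n",
                static_cast<unsigned long>(sizeof(va_list)));
    std::printf("sizeof(SysVVaListElem) = %lu\n",
                static_cast<unsigned long>(sizeof(SysVVaListElem)));
}
```

Because the 64-bit type is an array of one such element, it cannot be treated as a plain "char *" the way the 32-bit one can.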


My best guess at this point is that gcc is defining "__builtin_va_list"
for the 64-bit build of the program such that it differs from the
definition of "va_list" used by the glibc implementations of the
various printf() functions.

Google Groups searches of the Usenet archives haven't been turning up
anything useful, and neither has searching the gcc wiki & FAQ.


Has anybody encountered this problem, before? Any info about it would
be helpful, especially any work around or fixes or links to any web
pages that confirm what I think is going wrong.


TIA,

Chuck

Sam

Sep 12, 2008, 9:42:33 PM
Chuck Chopp writes:

> My best guess at this point is that gcc is defining "__builtin_va_list"
> for the 64-bit build of the program such that it differs from the
> definition of "va_list" used by the the glibc implementations the
> various printf() functions.
>
> Google Groups searches of the Usenet archives haven't been turning up
> anything useful, and neither has searching the gcc wiki & FAQ.

Your best guess will, unfortunately, remain only a guess, unless you post
some actual example that demonstrates your question.

What printf uses or doesn't use is completely irrelevant. There's nothing in
any standard that requires printf to use stdarg.h-defined facilities to
process its arguments.

Furthermore, I do not see any C++ language-related content, for
comp.lang.c++, here.

Chuck Chopp

Sep 13, 2008, 1:21:35 PM

A code sample follows that can be used to demonstrate the problem.

Regarding "v" variants of the printf family of functions, their function
signatures are declared as taking a final argument of type "va_list".

Unless the function prototyping contract is being violated, they have to
use the same "va_list" definition.


From the man pages for the printf functions in question:

NAME
       wprintf, fwprintf, swprintf, vwprintf, vfwprintf, vswprintf -
       formatted wide character output conversion

SYNOPSIS
       #include <stdio.h>
       #include <wchar.h>

       int wprintf(const wchar_t *format, ...);
       int fwprintf(FILE *stream, const wchar_t *format, ...);
       int swprintf(wchar_t *wcs, size_t maxlen,
                    const wchar_t *format, ...);

       #include <stdarg.h>

       int vwprintf(const wchar_t *format, va_list args);
       int vfwprintf(FILE *stream, const wchar_t *format, va_list args);
       int vswprintf(wchar_t *wcs, size_t maxlen,
                     const wchar_t *format, va_list args);

Double checking the header files shows that for vswprintf(), the
argument named "args" is actually declared as type _G_va_list, and that
"va_list" is typedef'd to be "_G_va_list". In turn, stdio.h has a
typedef for _G_va_list which equates it to the type __gnuc_va_list,
which, in turn, is typedef'd to equate to the type __builtin_va_list.
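
A quick way to confirm that chain of typedefs at compile time, rather than by reading headers, is a type-identity check. This is only a sketch: SameType and va_list_is_builtin are hypothetical names introduced here, and __builtin_va_list is a compiler-internal name (gcc, and compilers that imitate it), so the snippet is not expected to build elsewhere.

```cpp
#include <cstdarg>

// Minimal C++98-style type identity test (hypothetical helper).
template <typename A, typename B>
struct SameType { static const bool value = false; };

template <typename A>
struct SameType<A, A> { static const bool value = true; };

// True when the va_list seen by user code is the compiler's builtin type,
// which is also the type the glibc prototypes resolve to.
bool va_list_is_builtin()
{
    return SameType<va_list, __builtin_va_list>::value;
}
```

If this returns true for both the -m32 and the -m64 build, the caller and the library agree on the type in each build, even though the type itself differs between the two.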

Given that I'm using the same version of gcc on the same O.S. [SLES 10
SP1 x86_64], and that I'm using the same source code to build both
32-bit & 64-bit programs, my expectation is that the underlying type for
"va_list" [__gnuc_va_list a.k.a __builtin_va_list] would be the same
when simply toggling between building for 32-bit or building for 64-bit.

That expectation proves to be a reasonable assumption on Windows when
building with Visual C/C++ v9.0, but with gcc v4.1.2 on SLES 10 SP1
x86_64, the assumption proves invalid. That's what I'm trying to track
down and identify... why "va_list" is a 24 byte structured data type
when "-m64" is specified in the compiler flags, and why it is a "char *"
when "-m32" is specified in the compiler flags, with all else remaining
constant.


As for whether it applies to comp.lang.c++, I suppose that depends...
the source code is in a .cpp module, it's being built by g++ via gcc and
it makes use of the STL string/wstring classes. Yes, the C-RTL
functions in question are the printf family of functions, but variable
argument support is a feature of both C and C++. Given that gnu.g++ was
dead and had few, if any, relevant messages [lots of spam in it on my
news feed], I thought that gnu.gcc & gnu.gcc.help were more appropriate
for the gcc aspect of this problem.


The relevant "CFLAGS" from the makefile used to do the build:

CFLAGS = -std=c++98 -pthread -Wall -Weffc++ -g3 -O0

With either "-m32" or "-m64" added to specify whether to build a 32-bit
program or a 64-bit program.

Code Sample:


#include <cstdio>
#include <string>
#include <cstdarg>
#include <cctype>
#include <cwchar>
#include <cwctype>

using std::wstring;

wstring MyPrintf(const wchar_t *pwszFormat, ...);

size_t xvAllocPrintfW(wchar_t **ppwszRetBuf, const wchar_t *pwszFormat,
                      va_list v1);


wstring MyPrintf(const wchar_t *pwszFormat, ...)
{
    wchar_t *pwszBuf = NULL;
    size_t nCharsWritten = 0;
    va_list v1;
    wstring sResult;

    // Set up our variable argument list (a pointer/structure of some
    // sort).

    va_start(v1, pwszFormat);

    nCharsWritten = xvAllocPrintfW(&pwszBuf, pwszFormat, v1);

    va_end(v1);

    if (NULL != pwszBuf)
    {
        sResult = pwszBuf;
        delete [] pwszBuf;
    }
    else
    {
        sResult.clear();
    }

    return sResult;
}


size_t xvAllocPrintfW(wchar_t **ppwszRetBuf, const wchar_t *pwszFormat,
                      va_list v1)
{
    int Result = 0;
    size_t nCharsWritten = 0;
    wchar_t *pwszOutBuf = NULL;

    // Validate our input parameters...

    if (NULL == ppwszRetBuf)
        return 0;

    if (NULL == pwszFormat)
        return 0;

    // Figure out how much memory is required to store the string...

    nCharsWritten = xvNullPrintfW(pwszFormat, v1);

    // Allocate the output buffer, including room for the NULL terminator.

    pwszOutBuf = new wchar_t[nCharsWritten + 1];

    if (NULL == pwszOutBuf)
    {
        if (NULL != *ppwszRetBuf)
        {
            delete [] *ppwszRetBuf;
            *ppwszRetBuf = NULL;
        }
        return 0;
    }

    // Create the output string.

    Result = vswprintf(pwszOutBuf, nCharsWritten + 1, pwszFormat, v1);

    // Free up the old return buffer if necessary.

    if (NULL != *ppwszRetBuf)
    {
        delete [] *ppwszRetBuf;
        *ppwszRetBuf = NULL;
    }

    // Point the return buffer to the newly allocated output buffer.

    *ppwszRetBuf = pwszOutBuf;

    return nCharsWritten;
}

So, some sample mainline code that calls MyPrintf(), as follows:

int MyInt = 127;
wstring MyString;

MyString = MyPrintf(L"%d", MyInt);


Results in MyString containing the string L"127" on Windows for both
32-bit & 64-bit builds of the program, and also for 32-bit builds on
Linux. However, the 64-bit build produces garbage output, most likely
due to what I think is a discrepancy in the definition of "va_list" that
is occurring between the code generated by gcc and the definition of
"va_list" that the underlying vswprintf() function is using.

I'm thinking that either gcc is doing something wrong here, or else the
64-bit version of glibc was built with a different definition for
"va_list" that is not compatible. What I need to determine is whether
or not it's gcc that is at fault, or if the 64-bit build of C-RTL
[glibc?] is at fault, or if there's something that can be done with a
macro definition to alter how "va_list" is getting defined at compile time.

Mikael Pettersson

Sep 13, 2008, 2:46:15 PM
In article <EESyk.407$wr1...@newsfe02.iad>,

Chuck Chopp <Chuck...@rtfmcsi.com> wrote:
>size_t xvAllocPrintfW(wchar_t **ppwszRetBuf, const wchar_t *pwszFormat,
>va_list v1)
>{
> int Result = 0;
> size_t nCharsWritten = 0;
> wchar_t *pwszOutBuf = NULL;
>
> // Validate our input parameters...
>
> if (NULL == ppwszRetBuf)
> return 0;
>
> if (NULL == pwszFormat)
> return 0;
>
> // Figure out how much memory is required to store the string...
>
> nCharsWritten = xvNullPrintfW(pwszFormat,v1);
>
> // Allocate the output buffer, including room for the NULL terminator.
>
> pwszOutBuf = new wchar_t[nCharsWritten + 1];
>
> if (NULL == pwszOutBuf)
> {
> if (NULL != *ppwszRetBuf)
> {
> delete [] *ppwszRetBuf;
> *ppwszRetBuf = NULL;
> }
> return 0;
> }
>
> // Create the output string.
>
> Result = vswprintf(pwszOutBuf, nCharsWritten + 1, pwszFormat, v1);

Here's your bug. You're reusing the va_list v1 which already has
been used above, without an intervening va_end/va_start pair.
You're not allowed to do that.

I don't have chapter and verse from relevant C standards available
right now, but this is something _I_ asked about in comp.lang.c
many many years ago, and (I think) Doug Gwyn set me straight.
--
Mikael Pettersson (mi...@it.uu.se)
Uppsala University, Department of Information Technology, Computing Science Division
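
A minimal sketch of the usual remedy for the reuse described above: give each traversal its own va_list by copying it with va_copy before the first pass. Two assumptions to flag: va_copy is standard in C99 and C++11 but not in the -std=c++98 mode used in this thread, and the sketch uses the narrow-character vsnprintf() for the sizing pass in place of the thread's xvNullPrintfW(), whose implementation was not posted. The names vformat and format are hypothetical.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Formats into a std::string, walking the argument list twice.
// Each pass gets its own va_list, so neither sees a consumed list.
std::string vformat(const char *fmt, va_list ap)
{
    assert(fmt != NULL);

    // Pass 1: measure the output using a copy of the list.
    va_list ap_copy;
    va_copy(ap_copy, ap);
    int needed = std::vsnprintf(NULL, 0, fmt, ap_copy);
    va_end(ap_copy);

    if (needed < 0)
        return std::string();

    // Pass 2: produce the output using the original, still untouched list.
    std::vector<char> buf(static_cast<std::size_t>(needed) + 1);
    std::vsnprintf(&buf[0], buf.size(), fmt, ap);
    return std::string(&buf[0], static_cast<std::size_t>(needed));
}

// Variadic front end, analogous in shape to MyPrintf() in the post.
std::string format(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    std::string result = vformat(fmt, ap);
    va_end(ap);
    return result;
}
```

With this arrangement a call such as format("%d", 127) no longer depends on how a given platform represents va_list.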

Chuck Chopp

Sep 13, 2008, 4:38:05 PM


Thanks... I'll take a closer look at that aspect of the problem and see
what I can turn up.

It's interesting that it works as desired in 3 of the 4 scenarios I'm
running it under, but fails in the fourth. Either there's a grey area in
which similar implementations give similar results, or else it's been a
total streak of luck that it ever worked in the first place under any of
the scenarios.

nbal...@gmail.com

Oct 21, 2008, 9:48:07 PM
Dear Chuck,

I wanted to let you know that I came across the same problem today. I
am compiling kannel (a wap/SMS gateway) on Solaris 10.5 (64 bits). The
symptoms are exactly as you described them. On 32bits it compiles as
strings and on 64bits as va_list (not exactly structure but similar).

Kannel works fine on 32 bits but it core dumps in 64bits after having
built with no problems or warnings. Seems that argument type is
important. It more or less behaves with integers but core dumps on
strings (char *) so far.

I don't have any answers yet and I was wondering if you have come
across anything useful since your post.

Myself I'll give it a go one more day, and if I don't come up with
anything, I will switch all arguments to arrays and be done with it.

BTW Kannel is straight C, and it starts and ends all va_lists
promptly.

Regards,
Nikos

nbal...@gmail.com

Oct 22, 2008, 3:27:18 PM
Dear Chuck,

I hope that this is in time to help you; if not, it may help some
other unfortunate soul with the same problem. I was able to solve it
for myself. It really has to do with libc and the definitions in the
relevant include files. You are correct in that it differs from system
to system in the 64bit case, I imagine to make it more efficient?

The whole issue is to pass your arguments in a compatible form with
your libc from which your kernel was built. By reference or by value.
System related, not gcc. An example from kannel's gwlib/octstr.c, with
a change from me. The important thing is to determine in which of the
2 #ifdef cases you belong. Try each one:

/*
 * Unfortunately some platforms base va_list on an array type
 * which makes passing of the &args a bit tricky.
 */
#if (defined(__linux__) && (defined(__powerpc__) || defined(__s390__) || \
     defined(__x86_64))) || \
    (defined(__FreeBSD__) && defined(__amd64__))
#define VARGS(x)   (x)
#define VALPARM(y) va_list y
#define VALST(z)   (z)
#else
#define VARGS(x)   (&x)
#define VALPARM(y) va_list *y
#define VALST(z)   (*z)
#endif
.
.
.
Octstr *octstr_format(const char *fmt, ...)
{
    Octstr *os;
    va_list args;

    va_start(args, fmt);
    os = octstr_format_valist(fmt, VARGS(args));   // NIKOS !!!
    va_end(args);
    return os;
}

static void convert(Octstr *os, struct format *format, const char **fmt,
                    VALPARM(args))
{
    Octstr *new;
    char *s, *pad;
    long n;
    unsigned long u;
    char tmpfmt[1024];
    char tmpbuf[1024];
    char c;
    void *p;

    new = NULL;
    switch (**fmt)
    {
    case 'c':
        c = va_arg(VALST(args), int);
        new = octstr_create_from_data(&c, 1);
        break;
.
.
.

BR,
Nikos
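
A possible alternative to choosing between the two #ifdef branches, sketched under assumptions: where va_list is an array type, a va_list function parameter decays to a pointer, so taking its address no longer yields a "va_list *". Copying the parameter into a local with va_copy (C99/C++11) gives an object of true va_list type whose address can be passed on any platform. The function names below are hypothetical and are not taken from kannel.

```cpp
#include <cstdarg>

// Hypothetical helper: pulls the next int through a pointer to va_list,
// so the caller's list advances with each call.
static int next_int(va_list *ap)
{
    return va_arg(*ap, int);
}

// Receives a va_list by value. The local copy has genuine va_list type
// even on platforms where the parameter itself has decayed to a pointer.
static int sum_valist(int count, va_list args)
{
    va_list local;
    va_copy(local, args);

    int total = 0;
    for (int i = 0; i < count; ++i)
        total += next_int(&local);

    va_end(local);
    return total;
}

// Variadic front end: sums `count` int arguments.
int sum_ints(int count, ...)
{
    va_list args;
    va_start(args, count);
    int total = sum_valist(count, args);
    va_end(args);
    return total;
}
```

The trade-off is one extra va_copy/va_end pair per call in exchange for a single code path on every platform.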
