Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Writing to std::string buffer valid?

53 views
Skip to first unread message

Marcel Mueller

unread,
May 9, 2015, 6:26:01 AM5/9/15
to
Is the code below valid and portable?

#include <string>
#include <cstdarg>
#include <cstdio>
using namespace std;

string vstringf(const char* format, va_list va)
{ int count = vsnprintf(NULL, 0, format, va);
string ret;
ret.resize(count+1);
vsnprintf(&ret[0], count+1, format, va);
ret.resize(count);
return ret;
}

AFAIK since C++11 it is guaranteed that the content of std::string is
continuous memory. Furthermore operator[] returns a writable reference
to char. From that point of view it should be OK, isn't it?
The only hack I need is the count+1 because it is not allowed to write
the additional \0 byte after the end of the string.

But someone else reported that the above code produced corrupted strings
on his MacOS. So the question is who's wrong. The compiler or the code?


Marcel

Luca Risolia

unread,
May 9, 2015, 7:12:05 AM5/9/15
to
Il 09/05/2015 12:25, Marcel Mueller ha scritto:
> string vstringf(const char* format, va_list va)
> { int count = vsnprintf(NULL, 0, format, va);
> string ret;
> ret.resize(count+1);
> vsnprintf(&ret[0], count+1, format, va);
> ret.resize(count);
> return ret;
> }

> But someone else reported that the above code produced corrupted strings
> on his MacOS. So the question is who's wrong. The compiler or the code?

It should be safe in C++11, but it might be a problem in C++98, where
COW string optimizations are allowed.

Marcel Mueller

unread,
May 9, 2015, 7:19:35 AM5/9/15
to
The code does not compile without C++11 because of several other
language features anyway. First of all I used .emplace excessively. (Do
we need the old functions anymore except for compatibility?)

But I see no dependency to COW. Using operator[] requires the string to
be unique anyway.


Marcel

Öö Tiib

unread,
May 9, 2015, 7:24:33 AM5/9/15
to
On Saturday, 9 May 2015 13:26:01 UTC+3, Marcel Mueller wrote:
> Is the code below valid and portable?
>
> #include <string>
> #include <cstdarg>
> #include <cstdio>
> using namespace std;
>
> string vstringf(const char* format, va_list va)
> { int count = vsnprintf(NULL, 0, format, va);
> string ret;
> ret.resize(count+1);
> vsnprintf(&ret[0], count+1, format, va);
> ret.resize(count);
> return ret;
> }
>
> AFAIK since C++11 it is guaranteed that the content of std::string is
> continuous memory. Furthermore operator[] returns a writable reference
> to char. From that point of view it should be OK, isn't it?
> The only hack I need is the count+1 because it is not allowed to write
> the additional \0 byte after the end of the string.

It is (or at least seems to me) valid code by standard but compilers
do not always follow standard to letter.

> But someone else reported that the above code produced corrupted strings
> on his MacOS. So the question is who's wrong. The compiler or the code?

Most probably (about 95% of time) the defect is where the call is made
to function with ellipsis parameter and where the programmer-entered list
does not match with programmer-entered format.

Compilers often do warn when argument list does not match with contents
of format with C standard library functions that have ellipsis parameter.

Messing with 'va_list' and 'char const*' format does not let compilers
to do that. So even if the code is correct ... my impression is
that ... just avoid doing it. ;) Especially when neither of you nor that
"someone else" is capable of taking and debugging actual standard library
code and seeing what really goes wrong there.

Marcel Mueller

unread,
May 10, 2015, 3:18:26 AM5/10/15
to
On 09.05.15 13.24, Öö Tiib wrote:
>> But someone else reported that the above code produced corrupted strings
>> on his MacOS. So the question is who's wrong. The compiler or the code?
>
> Most probably (about 95% of time) the defect is where the call is made
> to function with ellipsis parameter and where the programmer-entered list
> does not match with programmer-entered format.

gcc checks for that with a reasonable hit rate.
Furthermore using a static buffer fixes the problem for him. This
wouldn't fix and parameter type issues.

> Compilers often do warn when argument list does not match with contents
> of format with C standard library functions that have ellipsis parameter.
>
> Messing with 'va_list' and 'char const*' format does not let compilers
> to do that.

There is a pragma for gcc to enforce that whereever ... is used this
way. (And I never used ... for anything else.)

> So even if the code is correct ... my impression is
> that ... just avoid doing it. ;)

Well, this is the old discussion about printf. Without using external
libraries (and dependencies) there is no reasonable alternative for
error message handling. Adding an additional library based on variadic
templates is likely to cause more problems than printf format strings.
If the messages code does not fit within one line the code becomes
unreadable. And invoking a stringstream each time is really annoying.

> Especially when neither of you nor that
> "someone else" is capable of taking and debugging actual standard library
> code and seeing what really goes wrong there.

He tries to dig it. It seems that invoking vsnprintf twice with the same
va raises the problem.


Marcel

Öö Tiib

unread,
May 10, 2015, 4:44:21 AM5/10/15
to
On Sunday, 10 May 2015 10:18:26 UTC+3, Marcel Mueller wrote:
> On 09.05.15 13.24, Öö Tiib wrote:
> >> But someone else reported that the above code produced corrupted strings
> >> on his MacOS. So the question is who's wrong. The compiler or the code?
> >
> > Most probably (about 95% of time) the defect is where the call is made
> > to function with ellipsis parameter and where the programmer-entered list
> > does not match with programmer-entered format.
>
> gcc checks for that with a reasonable hit rate.
> Furthermore using a static buffer fixes the problem for him. This
> wouldn't fix and parameter type issues.

Is the issue caused by calling vsnprintf twice or caused by usage
of '&str[0]' for buffer?

> > Compilers often do warn when argument list does not match with contents
> > of format with C standard library functions that have ellipsis parameter.
> >
> > Messing with 'va_list' and 'char const*' format does not let compilers
> > to do that.
>
> There is a pragma for gcc to enforce that whereever ... is used this
> way. (And I never used ... for anything else.)

I didn't know of such pragma existing. What pragma it is?

> > So even if the code is correct ... my impression is
> > that ... just avoid doing it. ;)
>
> Well, this is the old discussion about printf. Without using external
> libraries (and dependencies) there is no reasonable alternative for
> error message handling. Adding an additional library based on variadic
> templates is likely to cause more problems than printf format strings.
> If the messages code does not fit within one line the code becomes
> unreadable. And invoking a stringstream each time is really annoying.

The library that uses variadic templates is usually couple of header
files ... but indeed, anything can cause problems.

> > Especially when neither of you nor that
> > "someone else" is capable of taking and debugging actual standard library
> > code and seeing what really goes wrong there.
>
> He tries to dig it. It seems that invoking vsnprintf twice with the same
> va raises the problem.

So something of the 'int count = vsnprintf(NULL, 0, format, va);' corrupts
'va' for next call? If he uses fixed buffer in next call (that ignores
the 'count') then does he get garbage in fixed buffer as well?

Alain Ketterlin

unread,
May 10, 2015, 5:08:08 AM5/10/15
to
Marcel Mueller <news.5...@spamgourmet.org> writes:

> He tries to dig it. It seems that invoking vsnprintf twice with the
> same va raises the problem.

You can't reuse an instance of va_list. This is fairly clear in the
manpage (on Linux). Sorry, I'm lazy to look it up in the standards.

| The va_arg() macro expands to an expression that has the type and value
| of the next argument in the call. The argument ap is the va_list ap
| initialized by va_start(). Each call to va_arg() modifies ap so that
| the next call returns the next argument. [...]
|
| If ap is passed to a function that uses va_arg(ap,type) then the value
| of ap is undefined after the return of that function.

So in the code you gave earlier:

> string vstringf(const char* format, va_list va)
> { int count = vsnprintf(NULL, 0, format, va);
> string ret;
> ret.resize(count+1);
> vsnprintf(&ret[0], count+1, format, va);
> ret.resize(count);
> return ret;
> }

the first call to vsnprint kills "va" (by repeatedly calling va_arg),
and the second call to vsnprintf gets garbage. You need to use va_copy()
to be conform, even on systems where it seems to work.

-- Alain.

Marcel Mueller

unread,
May 10, 2015, 5:23:56 AM5/10/15
to
On 10.05.15 10.44, Öö Tiib wrote:
>> Furthermore using a static buffer fixes the problem for him. This
>> wouldn't fix and parameter type issues.
>
> Is the issue caused by calling vsnprintf twice or caused by usage
> of '&str[0]' for buffer?

It is not fully proven, but it seems that the first one is the problem.

>> There is a pragma for gcc to enforce that whereever ... is used this
>> way. (And I never used ... for anything else.)
>
> I didn't know of such pragma existing. What pragma it is?

#ifdef __GNUC__
#define PRINTFATTR(i) __attribute__((format(printf, i, i+1)))
#else
#define PRINTFATTR(i)
#endif

string vstringf(const char* format, va_list va);
string stringf(const char* format, ...) PRINTFATTR(1);


> The library that uses variadic templates is usually couple of header
> files ... but indeed, anything can cause problems.

Especially cross platform. Another guy already complained about C++11
because Debian Wheezy (gcc 4.7) does not support it. (Interestingly gcc
4.7 of OS/2 supports it - however.)


>> He tries to dig it. It seems that invoking vsnprintf twice with the same
>> va raises the problem.
>
> So something of the 'int count = vsnprintf(NULL, 0, format, va);' corrupts
> 'va' for next call?

It seems so. No Idea what is going on. I can't test it since it works on
all of my platforms including Raspberry Pi (ARM hf) and the prehistoric
OS/2. OK, all of them use a gcc flavor.

> If he uses fixed buffer in next call (that ignores
> the 'count') then does he get garbage in fixed buffer as well?

We did not test that so far. The fixed buffer does not use NULL but
passes sizeof(buffer) on the first call instead.


Marcel

Marcel Mueller

unread,
May 10, 2015, 5:31:10 AM5/10/15
to
On 10.05.15 11.07, Alain Ketterlin wrote:
> Marcel Mueller <news.5...@spamgourmet.org> writes:
>
>> He tries to dig it. It seems that invoking vsnprintf twice with the
>> same va raises the problem.
>
> You can't reuse an instance of va_list. This is fairly clear in the
> manpage (on Linux). Sorry, I'm lazy to look it up in the standards.

Well, but I do not reuse va_list. I pass it to vsnprintf, by value,
that's it. How could this affect the va instance?

> | If ap is passed to a function that uses va_arg(ap,type) then the value
> | of ap is undefined after the return of that function.

Hmm, this would invalidate all vXprintf implementations on stackoverflow.
It would never be possible to estimate the size first.

> the first call to vsnprint kills "va" (by repeatedly calling va_arg),
> and the second call to vsnprintf gets garbage. You need to use va_copy()
> to be conform, even on systems where it seems to work.

va_copy? Never heard of. Seems to be new, at least in C++11.
We will check that, thanks.


Marcel

Alain Ketterlin

unread,
May 10, 2015, 6:06:45 AM5/10/15
to
Marcel Mueller <news.5...@spamgourmet.org> writes:

> On 10.05.15 11.07, Alain Ketterlin wrote:
>> Marcel Mueller <news.5...@spamgourmet.org> writes:
>>
>>> He tries to dig it. It seems that invoking vsnprintf twice with the
>>> same va raises the problem.
>>
>> You can't reuse an instance of va_list. This is fairly clear in the
>> manpage (on Linux). Sorry, I'm lazy to look it up in the standards.
>
> Well, but I do not reuse va_list. I pass it to vsnprintf, by value,
> that's it. How could this affect the va instance?

These things are no object, you pass it by value but it's probably a
pointer. And nothing says you can just duplicate by assignment.

>> | If ap is passed to a function that uses va_arg(ap,type) then the value
>> | of ap is undefined after the return of that function.
>
> Hmm, this would invalidate all vXprintf implementations on stackoverflow.

Someone should ask the authors to read the manual...

> It would never be possible to estimate the size first.

That's what va_copy() is for.

>> the first call to vsnprint kills "va" (by repeatedly calling va_arg),
>> and the second call to vsnprintf gets garbage. You need to use va_copy()
>> to be conform, even on systems where it seems to work.
>
> va_copy? Never heard of. Seems to be new, at least in C++11.
> We will check that, thanks.

It looks like it appeared in C99. It is therefore part of C++11. From
the Linux man page:

| CONFORMING TO
| The va_start(), va_arg(), and va_end() macros conform to C89. C99
| defines the va_copy() macro.

-- Alain.

Ben Bacarisse

unread,
May 10, 2015, 6:14:04 AM5/10/15
to
Marcel Mueller <news.5...@spamgourmet.org> writes:

> On 10.05.15 11.07, Alain Ketterlin wrote:
>> Marcel Mueller <news.5...@spamgourmet.org> writes:
>>
>>> He tries to dig it. It seems that invoking vsnprintf twice with the
>>> same va raises the problem.
>>
>> You can't reuse an instance of va_list. This is fairly clear in the
>> manpage (on Linux). Sorry, I'm lazy to look it up in the standards.
>
> Well, but I do not reuse va_list. I pass it to vsnprintf, by value,
> that's it. How could this affect the va instance?

va might be a pointer, or a structure containing one or more pointers,
or it might be an array which will be converted to a pointer...

>> | If ap is passed to a function that uses va_arg(ap,type) then the value
>> | of ap is undefined after the return of that function.
>
> Hmm, this would invalidate all vXprintf implementations on stackoverflow.
> It would never be possible to estimate the size first.

Before va_copy, some implementations permitted assignment of va_list
objects. The language standard never guaranteed that this would copy
the state, but it worked on some implementations.

>> the first call to vsnprint kills "va" (by repeatedly calling va_arg),
>> and the second call to vsnprintf gets garbage. You need to use va_copy()
>> to be conform, even on systems where it seems to work.
>
> va_copy? Never heard of. Seems to be new, at least in C++11.
> We will check that, thanks.

It was in C99, but not in C++2003. It is in C++11.

--
Ben.

Öö Tiib

unread,
May 10, 2015, 7:55:15 AM5/10/15
to
On Sunday, 10 May 2015 12:23:56 UTC+3, Marcel Mueller wrote:
> On 10.05.15 10.44, Öö Tiib wrote:
>
> > The library that uses variadic templates is usually couple of header
> > files ... but indeed, anything can cause problems.
>
> Especially cross platform. Another guy already complained about C++11
> because Debian Wheezy (gcc 4.7) does not support it. (Interestingly gcc
> 4.7 of OS/2 supports it - however.)

Some too loose variadic template code that compiled with gcc 4.6 stopped
compiling with gcc 4.7. It was AFAIK partly because of fixing an issue
of parameter packs in standard and conforming with the fixes:
http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1399

I would still first ask from "another guy" a demo about more mundane
stuff like not missing '-std=c++11' and the like.

Marcel Mueller

unread,
May 10, 2015, 1:54:49 PM5/10/15
to
On 10.05.15 12.06, Alain Ketterlin wrote:
>> It would never be possible to estimate the size first.
>
> That's what va_copy() is for.

Seems so. I just got the confirmation that the code now works (with
va_copy). Thx again.


Marcel

Scott Lurndal

unread,
May 10, 2015, 6:34:43 PM5/10/15
to
Marcel Mueller <news.5...@spamgourmet.org> writes:
you're not allowed to use a va_list more than once. Use va_copy to
make a copy and use each copy once only.

/**
* Store a formatted output message in the log file. Optionally
* prefix the message with the current date and time, if the date option
* has been set.
*
* @param fmt A printf-style format string
* @param ap The arguments corresponding to the format string.
*/
void
c_mux_logger::do_log(const char *fmt, va_list ap)
{
for(size_t i=0; i < ml_logger_count; i++) {
va_list aq;

va_copy(aq, ap);
ml_loggers[i]->do_log(fmt, aq);
va_end(aq);
}
}

Richard

unread,
May 11, 2015, 1:52:33 PM5/11/15
to
[Please do not mail me a copy of your followup]

Marcel Mueller <news.5...@spamgourmet.org> spake the secret code
<miknb7$mun$1...@gwaiyur.mb-net.net> thusly:

>Is the code below valid and portable?

No. Use std::vector<char> if you want a writable buffer of characters.
--
"The Direct3D Graphics Pipeline" free book <http://tinyurl.com/d3d-pipeline>
The Computer Graphics Museum <http://computergraphicsmuseum.org>
The Terminals Wiki <http://terminals.classiccmp.org>
Legalize Adulthood! (my blog) <http://legalizeadulthood.wordpress.com>
0 new messages