Add basic_string::resize_uninitialized (or a similar mechanism)

aml...@gmail.com

Nov 18, 2013, 3:41:59 PM
to std-pr...@isocpp.org
There are many mechanisms around that create basic_string objects that contain some form of formatted data (e.g. ostringstream, to_string).  Unfortunately, there is currently no way to obtain a basic_string of nonzero size and interesting contents without either composing the contents externally and copying or writing to the string twice.

The former can be achieved by composing the contents of the string as an array or a sequence and using the basic_string constructors.  The latter can be achieved by using resize() and operator[] to create a blank string and fill it in.

For applications where a reasonable approximation of the final size of the string is known in advance (or where users are willing to over-allocate), it's tempting to try to optimize by resizing the string and composing in place.  This wastes time because basic_string::resize() always fully initializes the string.

It would help if basic_string had a method like resize_uninitialized that changed the size of the underlying array but did *not* construct the elements (with similar semantics to declaring an uninitialized local variable of type char).  This would only be allowed for sufficiently trivial character types.

(std::vector has much the same issue.)

Thoughts?

Ville Voutilainen

Nov 18, 2013, 3:43:29 PM
to std-pr...@isocpp.org
On 18 November 2013 22:41, <aml...@gmail.com> wrote:
> It would help if basic_string had a method like resize_uninitialized that
> changed the size of the underlying array but did *not* construct the elements
> (with similar semantics to declaring an uninitialized local variable of type
> char). This would only be allowed for sufficiently trivial character types.
>
> (std::vector has much the same issue.)
>
> Thoughts?

There are multiple previous threads proposing such facilities for vector.

The answer is no.

aml...@gmail.com

Nov 18, 2013, 3:51:31 PM
to std-pr...@isocpp.org

Do you have a link?  The closest I can come up with is:

https://groups.google.com/a/isocpp.org/d/topic/std-proposals/5BnNHEr07QM/discussion

which is rather different.
 

Ville Voutilainen

Nov 18, 2013, 4:05:43 PM
to std-pr...@isocpp.org
What's the difference?

aml...@gmail.com

Nov 18, 2013, 4:10:43 PM
to std-pr...@isocpp.org


That thread is proposing an unchecked or somehow optimized variant of push_back.  I'm proposing a way to change the *size* (not the capacity) of a basic_string in constant time.  With resize_uninitialized, the string would be in a completely valid state after resize_uninitialized; the contents would just be arbitrary.  (On platforms that have trap values for characters, the implementation might still have to initialize the data.  I'm not sure that such a platform exists, though.  In any case, there's precedent for similar semantics to what I'm proposing: malloc and get_temporary_buffer.)

Ville Voutilainen

Nov 18, 2013, 4:18:58 PM
to std-pr...@isocpp.org
On 18 November 2013 23:10, <aml...@gmail.com> wrote:
>> What's the difference?
> That thread is proposing an unchecked or somehow optimized variant of
> push_back. I'm proposing a way to change the *size* (not the capacity) of a
> basic_string in constant time. With resize_uninitialized, the string would
> be in a completely valid state after resize_uninitialized; the contents would
> just be arbitrary. (On platforms that have trap values for characters, the
> implementation might still have to initialize the data. I'm not sure that
> such a platform exists, though. In any case, there's precedent for similar
> semantics to what I'm proposing: malloc and get_temporary_buffer.)


Yes? The thread proposes almost exactly that:
https://groups.google.com/a/isocpp.org/d/msg/std-proposals/5BnNHEr07QM/3-6IBGwMWSQJ

Billy O'Neal

Nov 18, 2013, 4:21:59 PM
to std-proposals
Plenty of variants of a similar design were discussed there, too. In fact, the exact name discussed was "uninitialized_resize".

Billy O'Neal
Malware Response Instructor - BleepingComputer.com


--
 
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposal...@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposals/.

aml...@gmail.com

Nov 18, 2013, 4:22:50 PM
to std-pr...@isocpp.org


That seems to have been a short aside in the thread, and it didn't mention restricting it to trivial types.  There's certainly a problem with having a vector that contains unconstructed values of non-trivial type.  On the other hand, there's probably nothing wrong with arbitrary values of trivial types.

Bengt Gustafsson

Nov 18, 2013, 5:49:27 PM
to std-pr...@isocpp.org, aml...@gmail.com
The aside of the previous thread ended (I think) with my suggestion of a new method vector::resize_default_constructed(). This method would call the default constructor just like new T; does, not the value constructor like new T() does. As these two new calls only differ for trivial types the new resize method would be safe for any type (as safe as having an array of that type at least). This thinking would of course carry over to basic_string<T> as well.

The reasoning behind allowing calls to resize_default_constructed for any T is to avoid having to specialize template code using vector<T> depending on if the method is available or not for the T at hand.
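
For readers less familiar with the distinction being drawn here, the following is a minimal sketch contrasting default-initialization (`new T;`) with value-initialization (`new T()`). The names `Pod`, `value_initialized_int`, and the other helpers are hypothetical and exist only for illustration; nothing here is part of any proposal. Note that the sketch never reads a default-initialized trivial object before writing to it, since doing so would be undefined behavior.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical trivial aggregate used only for illustration.
struct Pod {
    int a;
    char b;
};

// Value-initialization: `new T()` zero-initializes trivial types.
inline int value_initialized_int() {
    std::unique_ptr<int> p(new int());
    return *p;  // well-defined: always 0
}

inline Pod value_initialized_pod() {
    std::unique_ptr<Pod> p(new Pod());
    return *p;  // both members are zero
}

// Default-initialization: `new T;` leaves trivial types indeterminate,
// so the object must be written before it is read.
inline int default_initialized_then_assigned(int v) {
    std::unique_ptr<int> p(new int);
    *p = v;  // write first; reading before this line would be undefined
    return *p;
}

// For a class with a user-provided default constructor, both forms
// simply run that constructor, so the results are identical.
inline bool same_for_nontrivial() {
    std::unique_ptr<std::string> d(new std::string);
    std::unique_ptr<std::string> v(new std::string());
    return d->empty() && v->empty();
}

// Small usage example tying the helpers together.
inline void initialization_demo() {
    assert(value_initialized_int() == 0);
    assert(same_for_nontrivial());
}
```

The point of the sketch is that the two forms only behave differently for types like `int` or `Pod`, which is why a default-initializing resize would be no less safe than a plain array of the same type.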

When it comes to the quite common use case of composing a string using several operator+() concatenations I have toyed with the idea of letting the operator+() return a helper object which refers to its lhs and rhs and only when all of the concatenations have been done and the result is somehow used as a string does the references get resolved, the total character count tallied, a suitably sized malloc block allocated and finally the characters copied into it. I have not attempted to implement such a thing, however, as I think there would be countless corner cases to take care of. One of them would be auto x = string1 + string2; where there is nothing to "drive" the actual concatenation before x is used, at which time the strings may have changed value. The suggested operator auto() in another thread would fix this particular issue, but there are probably many worse ones involving functions being called that change values of strings also partaking in the concatenation. Has anyone tried anything in this direction at home? 
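
As a rough illustration of the deferred-concatenation idea, here is a small sketch under stated assumptions: the `Concat` type and the `lazy` function are hypothetical names (a real design would presumably overload operator+), only std::string operands are supported, and the result must be materialized within the same full expression. It is not an implementation anyone in this thread has proposed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical lazy-concatenation node: it holds references to both
// operands and defers all copying until a std::string is requested.
template <class L, class R>
struct Concat {
    const L& lhs;
    const R& rhs;

    // Total character count of the whole expression tree.
    std::size_t size() const { return lhs.size() + rhs.size(); }

    // Copy the characters of both operands, left to right, into `out`.
    void append_to(std::string& out) const {
        append_part(lhs, out);
        append_part(rhs, out);
    }

    // Materialize: tally the total length, allocate once, then copy.
    std::string str() const {
        std::string out;
        out.reserve(size());
        append_to(out);
        return out;
    }

private:
    static void append_part(const std::string& s, std::string& out) { out += s; }

    template <class A, class B>
    static void append_part(const Concat<A, B>& c, std::string& out) {
        c.append_to(out);
    }
};

// Hypothetical free function standing in for operator+.
inline Concat<std::string, std::string> lazy(const std::string& a,
                                             const std::string& b) {
    return {a, b};
}

template <class A, class B>
Concat<Concat<A, B>, std::string> lazy(const Concat<A, B>& a,
                                       const std::string& b) {
    return {a, b};
}
```

A call such as `lazy(lazy(a, b), c).str()` performs a single reservation for the whole result. Writing `auto x = lazy(lazy(a, b), c);` instead leaves `x` holding a reference to a destroyed temporary, which is exactly the kind of corner case described above.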

Billy O'Neal

Nov 18, 2013, 5:51:08 PM
to std-proposals
It sounds like what you want should be a specialized type, not built in to basic_string.

Billy O'Neal
Malware Response Instructor - BleepingComputer.com



aml...@gmail.com

Nov 18, 2013, 5:54:10 PM
to std-pr...@isocpp.org, aml...@gmail.com
On Monday, November 18, 2013 2:49:27 PM UTC-8, Bengt Gustafsson wrote:
The aside of the previous thread ended (I think) with my suggestion of a new method vector::resize_default_constructed(). This method would call the default constructor just like new T; does, not the value constructor like new T() does. As these two new calls only differ for trivial types the new resize method would be safe for any type (as safe as having an array of that type at least). This thinking would of course carry over to basic_string<T> as well.

The reasoning behind allowing calls to resize_default_constructed for any T is to avoid having to specialize template code using vector<T> depending on if the method is available or not for the T at hand.

When it comes to the quite common use case of composing a string using several operator+() concatenations I have toyed with the idea of letting the operator+() return a helper object which refers to its lhs and rhs and only when all of the concatenations have been done and the result is somehow used as a string does the references get resolved, the total character count tallied, a suitably sized malloc block allocated and finally the characters copied into it. I have not attempted to implement such a thing, however, as I think there would be countless corner cases to take care of. One of them would be auto x = string1 + string2; where there is nothing to "drive" the actual concatenation before x is used, at which time the strings may have changed value. The suggested operator auto() in another thread would fix this particular issue, but there are probably many worse ones involving functions being called that change values of strings also partaking in the concatenation. Has anyone tried anything in this direction at home? 

There are plenty of expression trees and such out there (mainly for things like linear algebra).  The tricky case is when you want to concatenate things that aren't strings into strings.

--Andy

aml...@gmail.com

Nov 18, 2013, 5:56:19 PM
to std-pr...@isocpp.org
On Monday, November 18, 2013 2:51:08 PM UTC-8, Billy O'Neal wrote:
It sounds like what you want should be a specialized type, not built in to basic_string.


If you want that type to efficiently convert (or move-convert) to basic_string, though, then you're stuck -- basic_string doesn't offer the necessary facilities.  (That's the whole point of this thread.  I *have* a specialized type for composing strings.  There's just no good way to get a basic_string out of it.)

--Andy

Billy O'Neal

Nov 18, 2013, 6:00:54 PM
to std-proposals
Except everything you propose can be implemented in terms of .reserve() + .insert()/.append(), so you do in fact have the necessary facilities. Sure, those may be difficult to use in comparison, but they have the advantage that 1. they don't break basic_string's invariants, and 2. they are already in the standard and you can use them today.
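
A minimal sketch of what the reserve() + append() route might look like for formatted output follows. The function names `append_double_via_buffer` and `join_doubles`, the 64-byte buffer, and the 16-bytes-per-value reservation guess are all assumptions made for illustration. The characters are written twice here (once by snprintf into the stack buffer, once by append into the string), which is the copy being debated in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Format a double into a stack buffer, then append the bytes to `out`.
inline void append_double_via_buffer(std::string& out, double value) {
    char buf[64];  // assumed large enough for "%.6f" of typical values
    int n = std::snprintf(buf, sizeof buf, "%.6f", value);
    assert(n >= 0);  // snprintf reports encoding errors as negative values
    std::size_t len = static_cast<std::size_t>(n);
    if (len >= sizeof buf) {
        len = sizeof buf - 1;  // output was truncated to fit the buffer
    }
    out.append(buf, len);  // second write: copy from the stack into the string
}

// Join several doubles with commas, reserving an estimated size up front.
inline std::string join_doubles(const double* values, std::size_t count) {
    std::string out;
    out.reserve(count * 16);  // rough guess; append still grows the string if needed
    for (std::size_t i = 0; i < count; ++i) {
        if (i != 0) {
            out.push_back(',');
        }
        append_double_via_buffer(out, values[i]);
    }
    return out;
}
```

With the reservation in place, the appends normally do not reallocate, and size() never exposes characters that have not been written.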

If someone can show profiling data for a real world application where any kind of conversion like this mattered, then perhaps it would make sense to consider an application where a basic_string could "capture" a buffer already allocated by a user. But this uninitialized_resize thing, IMHO, induces far too many caveats.

It seems like for most of those cases where you would see a significant perf win, you'd be better off avoiding the conversion back into basic_string in the first place.

Billy O'Neal
Malware Response Instructor - BleepingComputer.com



aml...@gmail.com

Nov 18, 2013, 6:23:11 PM
to std-pr...@isocpp.org
On Monday, November 18, 2013 3:00:54 PM UTC-8, Billy O'Neal wrote:
Except everything you propose can be implemented in terms of .reserve() + .insert()/.append(), so you do in fact have the necessary facilities. Sure, those may be difficult to use in comparison, but they have the advantage that 1. they don't break basic_string's invariants, and 2. they are already in the standard and you can use them today.


I admit I haven't read every post on the other thread, but how is this supposed to work now?  Suppose I want to do something as simple as:

string s;
s.reserve(BIG_ENOUGH);
char buf[BIG_ENOUGH];
sprintf(buf, "%.6f", something);

I can copy the data from buf to s using insert, or I could rewrite sprintf to stick each character in using push_back (and hope that the optimizer is really, really good), but neither of those is a real solution.  I want to be able to do:

auto orig_size = s.size();
s.resize_default_initialized(orig_size + BIG_ENOUGH);  /* or whatever it's called */
s.resize(orig_size + sprintf(&s[orig_size], "%.6f", something));  /* snprintf in real code */

This avoids copying and avoids rewriting sprintf.
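
For comparison, the closest equivalent available with the existing interface can be sketched as below. The function name `append_double_in_place` and the 64-byte bound are assumptions for illustration. The first resize() zero-fills the new tail, and snprintf then overwrites part of it, so the tail is written twice; that redundant fill is the cost the hypothetical resize_default_initialized is meant to remove.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Append "%.6f"-formatted text to `s` by formatting directly into the string.
inline void append_double_in_place(std::string& s, double value) {
    const std::size_t max_len = 64;  // assumed bound, including the terminator
    const std::size_t orig_size = s.size();

    s.resize(orig_size + max_len);   // first write: fills the new tail with '\0'

    // Second write: snprintf stores at most max_len bytes (including the
    // terminating '\0'), all of which lie inside the resized string.
    int n = std::snprintf(&s[orig_size], max_len, "%.6f", value);
    assert(n >= 0);
    std::size_t written = static_cast<std::size_t>(n);
    if (written >= max_len) {
        written = max_len - 1;       // output was truncated
    }

    s.resize(orig_size + written);   // shrink to the characters actually produced
}
```

No intermediate buffer or copy is needed, but every byte of the scratch region is still initialized before it is overwritten.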


If someone can show profiling data for a real world application where any kind of conversion like this mattered, then perhaps it would make sense to consider an application where a basic_string could "capture" a buffer already allocated by a user. But this uninitialized_resize thing, IMHO, induces far too many caveats.

What are the caveats to having a way to resize a string or a vector such that the newly-added elements are default-initialized?
 

It seems like for most of those cases where you would see a significant perf win, you'd be better off avoiding the conversion back into basic_string in the first place.

I would argue that the main reason that basic_string is useful is that it's a string type that's widely used.  A lot of APIs accept basic_string as input (and, with move semantics, the ability to std::move a string into some library is a big win).  You can't solve a deficiency in creating basic_strings by creating a new type.
 
--Andy

Billy O'Neal

Nov 18, 2013, 6:53:25 PM
to std-proposals
>string s;
>s.reserve(BIG_ENOUGH);
>char buf[BIG_ENOUGH];
>sprintf(buf, "%.6f", something);
>I can copy the data from buf to s using insert, or I could rewrite sprintf to stick each character in using push_back (and hope that the optimizer is really, really good), but neither of those is a real solution.
Someone who cares about formatting performance isn't going to be using sprintf; they're going to use a formatting function designed for their destination type.

In the specific case of sprintf, I suspect the cost of parsing the format string is going to be far greater than copying the couple of characters from a stack buffer into a heap buffer.

> What are the caveats to having a way to resize a string or a vector such that the newly-added elements are default-initialized?
You expose garbage-init data to the user, the observation of which results in undefined behavior. And you do it in such a way that existing code, not prepared to deal with this case, can be broken.

>I would argue that the main reason that basic_string is useful is that it's a string type that's widely used.  A lot of APIs accept basic_string as input (and, with move semantics, the ability to std::move a string into some library is a big win).  You can't solve a deficiency in creating basic_strings by creating a new type.
I have yet to see a library where the translation into or out of the library for a case like this resulted in significant performance differences. (I've seen it for the innards of a library, but for the innards of a library you can use your own type)

Billy O'Neal
Malware Response Instructor - BleepingComputer.com



Nevin Liber

Nov 18, 2013, 7:04:21 PM
to std-pr...@isocpp.org
On 18 November 2013 17:23, <aml...@gmail.com> wrote:

but how is this supposed to work now?  Suppose I want to do something as simple as:

string s;
s.reserve(BIG_ENOUGH);
char buf[BIG_ENOUGH];
sprintf(buf, "%.6f", something);

I can copy the data from buf to s using insert, or I could rewrite sprintf to stick each character in using push_back (and hope that the optimizer is really, really good), but neither of those is a real solution. 

Or you could just use s.resize(BIG_ENOUGH) instead of s.reserve(BIG_ENOUGH).  Do you have any benchmarks to show that the zero filling is a performance bottleneck?

I want to be able to do:

auto orig_size = s.size();
s.resize_default_initialized(orig_size + BIG_ENOUGH);  /* or whatever it's called */
s.resize(orig_size + sprintf(&s[orig_size], "%.6f", something));  /* snprintf in real code */

This avoids copying and avoids rewriting sprintf.

If you really need this level of performance (which seems doubtful if you are using sprintf), use a basic_string with a different allocator that does default initialization.  Problem solved.
 

If someone can show profiling data for a real world application where any kind of conversion like this mattered, then perhaps it would make sense to consider an application where a basic_string could "capture" a buffer already allocated by a user. But this uninitialized_resize thing, IMHO, induces far too many caveats.

What are the caveats to having a way to resize a string or a vector such that the newly-added elements are default-initialized?

It's a large change to the standard library without any evidence it makes any bit of difference in 99% of real world code.
 
 I would argue that the main reason that basic_string is useful is that it's a string type that's widely used.  A lot of APIs accept basic_string as input (and, with move semantics, the ability to std::move a string into some library is a big win).  You can't solve a deficiency in creating basic_strings by creating a new type.

As long as they accept basic_string (as you allege) instead of std::string, this problem is easily solvable with a different allocator.
--
 Nevin ":-)" Liber  <mailto:ne...@eviloverlord.com>  (847) 691-1404

aml...@gmail.com

Nov 18, 2013, 7:15:05 PM
to std-pr...@isocpp.org
On Monday, November 18, 2013 4:04:21 PM UTC-8, Nevin ":-)" Liber wrote:
On 18 November 2013 17:23, <aml...@gmail.com> wrote:

but how is this supposed to work now?  Suppose I want to do something as simple as:

string s;
s.reserve(BIG_ENOUGH);
char buf[BIG_ENOUGH];
sprintf(buf, "%.6f", something);

I can copy the data from buf to s using insert, or I could rewrite sprintf to stick each character in using push_back (and hope that the optimizer is really, really good), but neither of those is a real solution. 

Or you could just use s.resize(BIG_ENOUGH) instead of s.reserve(BIG_ENOUGH).  Do you have any benchmarks to show that the zero filling is a performance bottleneck?

I'm sure I can come up with one.  I'm equally sure that someone will tell me that my benchmark is too artificial or is otherwise irrelevant.  If STL doesn't care about performance on this level, then why is reserve() there?
 

I want to be able to do:

auto orig_size = s.size();
s.resize_default_initialized(orig_size + BIG_ENOUGH);  /* or whatever it's called */
s.resize(orig_size + sprintf(&s[orig_size], "%.6f", something));  /* snprintf in real code */

This avoids copying and avoids rewriting sprintf.

If you really need this level of performance (which seems doubtful if you are using sprintf), use a basic_string with a different allocator that does default initialization.  Problem solved.

I could have pasted in a super-highly-optimized number-to-string converter,  but that would have just cluttered the thread.

(std::ostream is already unusably slow for a lot of applications.  std::string is nowhere near as bad, but there's a factor of two available here for long strings.)

As long as they accept basic_string (as you allege) instead of std::string, this problem is easily solvable with a different allocator.


I thought the conclusion from the other thread was that basic_strings with different allocators don't interoperate.  Realistically, libraries accept std::string, not arbitrary instantiations of basic_string.

--Andy

Billy O'Neal

Nov 18, 2013, 7:16:01 PM
to std-proposals
> I'm sure I can come up with one.  I'm equally sure that someone will tell me that my benchmark is too artificial or is otherwise irrelevant.  If STL doesn't care about performance on this level, then why is reserve() there?

Reserve is about optimizing memory allocations, not object initialization.

Billy O'Neal
Malware Response Instructor - BleepingComputer.com



aml...@gmail.com

Nov 18, 2013, 7:19:20 PM
to std-pr...@isocpp.org
On Monday, November 18, 2013 3:53:25 PM UTC-8, Billy O'Neal wrote:

> What are the caveats to having a way to resize a string or a vector such that the newly-added elements are default-initialized?
You expose garbage-init data to the user, the observation of which results in undefined behavior. And you do it in such a way that existing code, not prepared to deal with this case, can be broken.


Note that I proposed this for *basic_string*, which is almost always instantiated for char or wchar_t.  It would be straightforward to specify the new resize function as filling the newly-available space with indeterminate but well-defined values.  IOW reading would be explicitly allowed.  (This might result in odd interactions with custom allocators.)

There's already new char[len], which has *exactly* the semantics I want.  I don't see why the string class needs to hold users' hands more tightly than operator new.

--Andy

Billy O'Neal

Nov 18, 2013, 7:25:55 PM
to std-proposals
>I don't see why the string class needs to hold users' hands more tightly than operator new.

Because the built in container classes are generally designed to prevent the user from breaking their invariants. A piece of code can get passed a std::basic_string and assume, without any other information, that the data inside is not garbage. That is a useful invariant to people passing around std::strings, particularly over an ability which is only useful in specialized text processing domains (which shouldn't be passing around strings anyway).

Billy O'Neal
Malware Response Instructor - BleepingComputer.com



aml...@gmail.com

Nov 18, 2013, 7:28:38 PM
to std-pr...@isocpp.org
On Monday, November 18, 2013 4:16:01 PM UTC-8, Billy O'Neal wrote:
> I'm sure I can come up with one.  I'm equally sure that someone will tell me that my benchmark is too artificial or is otherwise irrelevant.  If STL doesn't care about performance on this level, then why is reserve() there?

Reserve is about optimizing memory allocations, not object initialization.

This is, IMO, completely ridiculous.  It's benchmark time:

#include <chrono>
#include <cstring>
#include <iostream>
#include <string>

// Time `iters` calls of f and print the mean cost per call in microseconds.
template<typename Func>
void benchmark(const char *desc, Func &&f)
{
    auto start = std::chrono::steady_clock::now();
    constexpr int iters = 10000;
    for (int i = 0; i < iters; i++)
        f();
    auto end = std::chrono::steady_clock::now();

    std::cout << desc << ": " << std::chrono::duration_cast<std::chrono::duration<double>>(end-start).count() * 1e6 / iters << " µs/iter\n";
}

int main()
{
    // Allocation only.
    benchmark("new + delete", [] { delete[] new char[1024]; });
    // Allocation plus an explicit zero fill.
    benchmark("new + memset + delete", [] {
        auto array = new char[1024];
        std::memset(array, 0, 1024);
        delete[] array;
    });
    // Allocation plus the fill that resize() performs.
    benchmark("string", [] { std::string s; s.resize(1024); });
}

prints (using tcmalloc):

new + delete: 0.0118636 µs/iter
new + memset + delete: 0.0311982 µs/iter
string: 0.0514658 µs/iter

With a 1KiB string, initialization takes over three times as long as allocation.


Re: containers protecting users from garbage values, if I write a function that returns an std::string that contains garbage (which is easy to do with or without fancy resize functions), then the caller will receive garbage.  This isn't STL's fault.
 

Billy O'Neal

Nov 18, 2013, 7:34:34 PM
to std-proposals
1. Your example allocates once. The intent is to avoid use cases where a string allocates 1, then allocates 2, then allocates 4, then allocates 8, then allocates 16 .... Nothing in either of your examples shows a case that reserve was intended to speed up.
2. You are probably benchmarking on a platform which does lazy commit, so you actually don't pay for much of the allocation costs until you actually write to the memory in question.
3. Considering there is no legal way to get to any of the memory between size() and capacity() I can't see how this is in any way an optimization for initializing the array.
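
To make the first point concrete, here is a small sketch that counts how often capacity() changes while a string grows one character at a time, with and without a prior reserve(). The function name `count_reallocations` is hypothetical, and a change in capacity() is used as a stand-in for a reallocation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Count how many times capacity() changes while appending n characters
// one at a time. Each change corresponds to the string acquiring a new,
// larger buffer.
inline std::size_t count_reallocations(std::size_t n, bool reserve_first) {
    std::string s;
    if (reserve_first) {
        s.reserve(n);  // one allocation up front
    }
    std::size_t changes = 0;
    std::size_t cap = s.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        s.push_back('x');
        if (s.capacity() != cap) {
            ++changes;
            cap = s.capacity();
        }
    }
    // The string holds n characters either way; only the growth pattern differs.
    assert(s.size() == n);
    return changes;
}
```

With reserve_first set, the count is zero for any n; without it, the exact count depends on the implementation's growth policy, so no particular number should be assumed.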

Billy O'Neal
Malware Response Instructor - BleepingComputer.com


aml...@gmail.com

Nov 18, 2013, 7:41:43 PM
to std-pr...@isocpp.org
On Monday, November 18, 2013 4:34:34 PM UTC-8, Billy O'Neal wrote:
1. Your example allocates once. The intent is to avoid use cases where a string allocates 1, then allocates 2, then allocates 4, then allocates 8, then allocates 16 .... Nothing in either of your examples shows a case that reserve was intended to speed up.
2. You are probably benchmarking on a platform which does lazy commit, so you actually don't pay for much of the allocation costs until you actually write to the memory in question.

Nope.  This is my production allocator.  It allocates (at least) 2MB of real, mlocked backing store every time it needs more.  That memory is real.  It just gets reused (as it should), so it's fast.  (tcmalloc is a really nice piece of software.)

Regardless, my point stands: the cost of allocation is not wildly higher than the cost of zero-filling.  This is also on a Sandy Bridge machine with a monstrous cache -- I suspect that lots of embedded systems will have much lower available cache/memory bandwidth for initialization, whereas allocation may have a similar cost (it's dominated by latency).
 
3. Considering there is no legal way to get to any of the memory between size() and capacity() I can't see how this is in any way an optimization for initializing the array.

Huh?  Every caller of resize is filling the string with a bunch of copies of the same character.  I suspect that almost none of them actually want that character there, so they'll overwrite it, so they're paying this cost.
 
--Andy

Billy O'Neal

Nov 18, 2013, 7:43:36 PM
to std-proposals
> Huh?  Every caller of resize is filling the string with a bunch of copies of the same character.  I suspect that almost none of them actually want that character there, so they'll overwrite it, so they're paying this cost.

You said "reserve", not "resize". Resize always initializes the memory in question. Resize does not create space between size() and capacity().
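
The difference between the two calls can be checked directly. In this sketch the helper names are hypothetical; it relies only on documented std::string behavior: reserve() changes capacity() but not size(), while resize() changes size() and fills the new characters with `char()`, that is, '\0'.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// reserve() only affects capacity; the string stays empty.
inline bool reserve_leaves_size_alone(std::size_t n) {
    std::string s;
    s.reserve(n);
    return s.size() == 0 && s.capacity() >= n;
}

// resize() grows size() and value-initializes every new character.
inline bool resize_zero_fills(std::size_t n) {
    std::string s;
    s.resize(n);
    return s.size() == n &&
           std::all_of(s.begin(), s.end(), [](char c) { return c == '\0'; });
}

// Small usage example.
inline void reserve_vs_resize_demo() {
    assert(reserve_leaves_size_alone(16));
    assert(resize_zero_fills(16));
}
```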

Billy O'Neal
Malware Response Instructor - BleepingComputer.com



Jeffrey Yasskin

Nov 18, 2013, 7:57:16 PM
to std-pr...@isocpp.org
On Mon, Nov 18, 2013 at 4:04 PM, Nevin Liber <ne...@eviloverlord.com> wrote:
>> I want to be able to do:
>>
>> auto orig_size = s.size();
>> s.resize_default_initialized(orig_size + BIG_ENOUGH); /* or whatever it's
>> called */
>> s.resize(orig_size + sprintf(&s[orig_size], "%.6f", something)); /*
>> snprintf in real code */
>>
>> This avoids copying and avoids rewriting sprintf.
>
>
> If you really need this level of performance (which seems doubtful if you
> are using sprintf), use a basic_string with a different allocator that does
> default initialization. Problem solved.

I'd welcome a proposal to add this allocator to the standard. We keep
saying it's possible in order to ward off "foo_uninitialized"
proposals, but we've never really proven it works. If it does work,
having it in the standard might reduce the number of
"foo_uninitialized" proposals we need to ward off.

Jeffrey

aml...@gmail.com

Nov 18, 2013, 8:21:22 PM
to std-pr...@isocpp.org

Even better than having it ward off proposals: it might actually work.

That being said, I doubt it will work: the move constructors and move assignment operators aren't templates.

--Andy

Nevin Liber

Nov 18, 2013, 8:45:46 PM
to std-pr...@isocpp.org

On 18 November 2013 18:57, Jeffrey Yasskin <jyas...@google.com> wrote:
I'd welcome a proposal to add this allocator to the standard. We keep
saying it's possible in order to ward off "foo_uninitialized"
proposals, but we've never really proven it works.

Actually I'm already working on it in my spare time; I have an implementation and am writing up the proposal. I'll make more announcements when I'm further along.

Nevin Liber

Nov 18, 2013, 8:48:02 PM
to std-pr...@isocpp.org
On 18 November 2013 19:21, <aml...@gmail.com> wrote:
That being said, I doubt it will work: the move constructors and move assignment operators aren't templates.

Could you elaborate?  Move constructors and assignment operators aren't templates pretty much by definition.  I fail to see the relevancy, let alone an issue.

Andrew Lutomirski

Nov 18, 2013, 8:51:08 PM
to std-pr...@isocpp.org


On Nov 18, 2013 5:48 PM, "Nevin Liber" <ne...@eviloverlord.com> wrote:
>
> On 18 November 2013 19:21, <aml...@gmail.com> wrote:
>>
>> That being said, I doubt it will work: the move constructors and move assignment operators aren't templates.
>
>
> Could you elaborate?  Move constructors and assignment operators aren't templates pretty much by definition.  I fail to see the relevancy, let alone an issue.
>

How can a fancy allocator be used to create an std::string?

> --
>  Nevin ":-)" Liber  <mailto:ne...@eviloverlord.com>  (847) 691-1404
>


Billy O'Neal

Nov 18, 2013, 8:54:21 PM
to std-proposals
It wouldn't be a std::string. It would be a std::basic_string<char, no_init_allocator<char>>.
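
One commonly seen shape for such an allocator is sketched below. This is an assumption-laden illustration, not the allocator being proposed in this thread: the name `default_init_allocator` and the helper `make_scratch_buffer` are hypothetical, and the sketch is exercised with std::vector only. As far as I know, basic_string::resize(n) is specified as resize(n, charT()), so a standard basic_string may never consult the allocator's construct for its characters, and the adapter on its own might not remove the fill there.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <utility>
#include <vector>

// Allocator adapter: a no-argument construct() default-initializes the
// element instead of value-initializing it. Everything else is forwarded
// to the underlying allocator A.
template <class T, class A = std::allocator<T>>
class default_init_allocator : public A {
    using traits = std::allocator_traits<A>;

public:
    template <class U>
    struct rebind {
        using other =
            default_init_allocator<U, typename traits::template rebind_alloc<U>>;
    };

    using A::A;  // inherit the underlying allocator's constructors

    // Called by containers for elements created without arguments,
    // for example by vector::resize(n).
    template <class U>
    void construct(U* ptr) noexcept(
        std::is_nothrow_default_constructible<U>::value) {
        ::new (static_cast<void*>(ptr)) U;  // default-initialization
    }

    // All other constructions behave as usual.
    template <class U, class... Args>
    void construct(U* ptr, Args&&... args) {
        traits::construct(static_cast<A&>(*this), ptr,
                          std::forward<Args>(args)...);
    }
};

// Hypothetical helper: a char buffer of size n whose contents are
// indeterminate until written.
inline std::vector<char, default_init_allocator<char>>
make_scratch_buffer(std::size_t n) {
    std::vector<char, default_init_allocator<char>> v;
    v.resize(n);  // elements are default-initialized, so no zero fill is required
    assert(v.size() == n);
    return v;
}
```

The resulting container is a different type from std::vector<char> (and, by analogy, from std::string), which is the interoperability concern raised elsewhere in this thread.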

Billy O'Neal
Malware Response Instructor - BleepingComputer.com



Nevin Liber

Nov 18, 2013, 9:00:34 PM
to std-pr...@isocpp.org
On 18 November 2013 19:51, Andrew Lutomirski <an...@luto.us> wrote:

How can a fancy allocator be used to create an std::string?

Oh; I'm not trying to solve that problem.  In general, it is unsolvable w/o a type-erased allocator (these are coming along in N3525, but obviously not to the current std::string).

If others wish to make a far more complicated and less likely to pass proposal to cover the .0001% of cases where they have high performance / low latency needs not to 0-fill a buffer but still are required to use std::string and vector<char> to accomplish it, they are more than welcome to implement it, write it up and come to a few meetings to champion it.

Thiago Macieira

Nov 19, 2013, 12:38:12 AM
to std-pr...@isocpp.org
On Monday, 18 November 2013 16:25:55, Billy O'Neal wrote:
> Because the built in container classes are generally designed to prevent
> the user from breaking their invariants. A piece of code can get passed a
> std::basic_string and assume, without any other information, that the data
> inside is not garbage. That is a useful invariant to people passing around
> std::strings, particularly over an ability which is only useful in
> specialized text processing domains (which shouldn't be passing around
> strings anyway).

Garbage is in the eye of the beholder.

string s;
s.resize(10);

Might be garbage if the receiver isn't expecting a string full of nulls.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358

Ion Gaztañaga

Nov 19, 2013, 3:25:59 AM
to std-pr...@isocpp.org
On 19/11/2013 2:45, Nevin Liber wrote:
>
> On 18 November 2013 18:57, Jeffrey Yasskin <jyas...@google.com
> <mailto:jyas...@google.com>> wrote:
>
> I'd welcome a proposal to add this allocator to the standard. We keep
> saying it's possible in order to ward off "foo_uninitialized"
> proposals, but we've never really proven it works.
>
>
> Actually I'm already working on it in my spare time; I have an
> implementation and am writing up the proposal. I'll make more
> announcements when I'm further along.

Just to start doing some measurements after reading "proposal to add
vector::push_back_()" thread, I added (silently) in Boost.Container the
following extensions:

vector::vector(size_type, default_init_t);
void vector::resize(size_type, default_init_t);

Those functions are available in recently released Boost 1.55 only for
internal testing. The documentation is wrong (it says stored objects are
value initialized instead of "default initialized"), but it will be
correctly documented and tested for Boost 1.56. I also plan to add it to
basic_string, since the best way to know whether it's useful is to have a
working implementation and real-world measurements.

The implementation is simple: boost::container::allocator_traits is
extended to detect whether

a.construct(boost::container::default_init_t)
[where 'a' is the container allocator]

is callable. If not, the default implementation from allocator_traits is
used:

::new((void*)p) T;
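
For readers who want to try the same idea with the standard containers, here is a rough sketch of an allocator adaptor that default-initializes on a no-argument construct(). This is not Boost.Container's code: the names default_init_allocator and make_filled are assumptions for illustration, and the adaptor follows a pattern commonly used for this purpose rather than anything the standard provides.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <utility>
#include <vector>

// Hypothetical adaptor: wraps allocator A and default-initializes
// elements when construct() is called with no constructor arguments.
template <class T, class A = std::allocator<T>>
class default_init_allocator : public A {
    using traits = std::allocator_traits<A>;
public:
    template <class U>
    struct rebind {
        using other =
            default_init_allocator<U, typename traits::template rebind_alloc<U>>;
    };

    using A::A;  // inherit the wrapped allocator's constructors

    // No-argument construct: "new U", not "new U()".
    template <class U>
    void construct(U* p) {
        ::new (static_cast<void*>(p)) U;
    }

    // Every other construct call is forwarded to the wrapped allocator.
    template <class U, class... Args>
    void construct(U* p, Args&&... args) {
        traits::construct(static_cast<A&>(*this), p, std::forward<Args>(args)...);
    }
};

// Resizes without zero-filling, then writes every element before any read.
std::vector<char, default_init_allocator<char>> make_filled(std::size_t n, char c) {
    std::vector<char, default_init_allocator<char>> v;
    v.resize(n);               // elements hold indeterminate values here
    for (char& x : v) x = c;   // write first; reading earlier is not allowed
    assert(v.size() == n);
    return v;
}
```

Whether this actually saves time depends on the compiler and library, so it needs measuring rather than assuming.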

Hope this helps,

Ion

Andrew Lutomirski

Nov 19, 2013, 2:10:10 PM11/19/13
to std-pr...@isocpp.org
On Tue, Nov 19, 2013 at 12:25 AM, Ion Gaztañaga <igazt...@gmail.com> wrote:
> On 19/11/2013 2:45, Nevin Liber wrote:
>>
>>
>> On 18 November 2013 18:57, Jeffrey Yasskin <jyas...@google.com
>> <mailto:jyas...@google.com>> wrote:
>>
>> I'd welcome a proposal to add this allocator to the standard. We keep
>> saying it's possible in order to ward off "foo_uninitialized"
>> proposals, but we've never really proven it works.
>>
>>
>> Actually I'm already working on it in my spare time; I have an
>> implementation and am writing up the proposal. I'll make more
>> announcements when I'm further along.
>
>
> Just to start doing some measurements after reading "proposal to add
> vector::push_back_()" thread, I added (silently) in Boost.Container the
> following extensions:
>
> vector::vector(size_type, default_init_t);
> void vector::resize(size_type, default_init_t);
>
> Those functions are available in recently released Boost 1.55 only for
> internal testing. The documentation is wrong (it says stored objects are
> value initialized instead of "default initialized"), but it will be
> correctly documented and tested for Boost 1.56. I plan to add it also to
> basic_string as the best way to know if it's useful is to have a working
> implementation and real-world measurements.

That sounds useful.

Random thoughts:

1. The standard talks about "default-initialized" and
"value-initialized". I wonder if any non-standard-reading programmers
would understand those terms. I would certainly think of char() (i.e.
0) as the "default" char, but that's not at all what it means. Would
something like "weak_init_t" be better?

2. I wonder how far this can be pushed. Would it make sense for char
x{default_init_t()} to result in an uninitialized char? Should new
char(default_init_t()) result in a pointer to an uninitialized char?
Should a struct like "struct Foo { char x; };" imply the creation of
Foo::Foo(default_init_t) (assuming that Foo has no user-defined
constructors or destructors and all of its members are either scalar
or have default_init_t constructors)? I think that there's value in
having an operation "initialize to a state that's safe to copy and
destruct but otherwise minimize the extent of initialization".
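
To illustrate the distinction in point 1 with today's language rules, here is a small sketch using the Foo struct from point 2 (the function names are made up for illustration). Value-initialization zeroes the member, while default-initialization leaves it indeterminate, so the second function writes the member before reading it.

```cpp
#include <cassert>

struct Foo { char x; };   // the struct from point 2

// Value-initialization: "Foo()" zero-initializes x.
char value_initialized_x() {
    Foo f = Foo();
    assert(f.x == 0);     // guaranteed by value-initialization
    return f.x;
}

// Default-initialization: "new Foo" performs no initialization of x.
char default_initialized_then_set(char c) {
    Foo* p = new Foo;     // p->x holds an indeterminate value here
    p->x = c;             // writing is fine; it must not be read before this
    char result = p->x;
    delete p;
    return result;
}
```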

--Andy

>
> The implementation is simple, boost::container::allocator_traits is extended
> to detect if
>
> a.construct(boost::container::default_init_t)
> [where 'a' is the container allocator]
>
> is callable. If not, the default implementation from allocator_traits is
> used:
>
> ::new((void*)p) T;
>
> Hope this helps,
>
> Ion
>

vadim.pet...@gmail.com

Nov 24, 2013, 6:00:24 AM11/24/13
to std-pr...@isocpp.org
Is basic_string required to use its allocator to construct/destruct elements?
As I understand the standard, it is not (allocator is used only for allocating/deallocating memory), i.e. allocator is not enough to achieve the desired effect for basic_string (but it is enough for e.g. vector).
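
One way to check this on a given implementation is an instrumented allocator that counts construct() calls. The sketch below is only that: counting_allocator, g_construct_calls and the two helper functions are hypothetical names. For vector the count should match the number of elements; for basic_string the result is implementation-specific, and a library that fills characters through char_traits may well report zero.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <string>
#include <utility>
#include <vector>

// Hypothetical instrumentation: number of construct() calls seen so far.
static std::size_t g_construct_calls = 0;

template <class T>
struct counting_allocator {
    using value_type = T;

    counting_allocator() = default;
    template <class U>
    counting_allocator(const counting_allocator<U>&) {}

    T* allocate(std::size_t n) { return std::allocator<T>().allocate(n); }
    void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }

    // Counts the call, then constructs the object as usual.
    template <class U, class... Args>
    void construct(U* p, Args&&... args) {
        ++g_construct_calls;
        ::new (static_cast<void*>(p)) U(std::forward<Args>(args)...);
    }
};

template <class T, class U>
bool operator==(const counting_allocator<T>&, const counting_allocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const counting_allocator<T>&, const counting_allocator<U>&) { return false; }

// vector is required to construct its elements through the allocator.
std::size_t vector_construct_calls(std::size_t n) {
    g_construct_calls = 0;
    std::vector<char, counting_allocator<char>> v(n);
    assert(v.size() == n);
    return g_construct_calls;
}

// For basic_string the count depends on the library implementation.
std::size_t string_construct_calls(std::size_t n) {
    g_construct_calls = 0;
    std::basic_string<char, std::char_traits<char>, counting_allocator<char>> s(n, 'x');
    assert(s.size() == n);
    return g_construct_calls;
}
```

If string_construct_calls reports fewer calls than characters, a custom allocator alone cannot control how that library initializes string contents.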

>>::new((void*)p) T
BTW, I tested this approach with Visual C++ 2012 and, sadly, such a construct in an allocator isn't eliminated or optimized away for primitive types, so it gives no performance gain compared to value initialization, ::new((void*)p) T().
How do other compilers behave in this situation?