Views on a small_vector<T, N>?

1,521 views
Skip to first unread message

Tristan Brindle

unread,
Jan 4, 2016, 2:41:44 AM1/4/16
to ISO C++ Standard - Future Proposals

(tl;dr: We should have a vector which does the "small string optimisation")

Overview

It is sometimes the case that one needs to store items of a type T, where the number of items will vary dynamically but it known that it will (usually) be small. For example, consider an implementation of a sudoku solver: the number of values which could possibly occupy a single square will vary as the puzzle is solved, but in no case can it ever be more than 9. Or perhaps we have an algorithm which we know will usually require temporary storage for 5 items, but in the worst case will require us to store 99.

The standard library has two containers which could be used for such purposes:
  • std::array<T, N>
  • std::vector<T>
Unfortunately, neither is perfectly suited to the task, for reasons given below. What would be ideal is a hybrid container with the semantics of a vector, but which preallocates a certain number of Ts on the stack, only after which (if more elements are added) does it touch the heap -- in essence, the small string optimisation for std::string, but applied to a vector. Let's call such a container a small_vector<T, N>.

Why not std::array?

A std::array is a simple wrapper around a raw array. Its semantics are quite different from those of a vector; it cannot grow, and its size is fixed at compile time. There are many places where std::array is the right choicel, but equally many places where it cannot be used.

Why not std::vector?

Unlike std::array, std::vector has the semantics we want -- that is, an array with dynamic size. The downside is that in order to do this is needs to use dynamic allocation. In some scenarios (for example creating and destroying millions of tiny vectors) this can be extremely expensive. Where the number of items will (usually) be small, we would much prefer to use only the stack and not touch malloc().

Okay, so why not std::vector with a stack allocator?

With stateful allocators introduced C++11, it might be possible to store some Ts inside the allocator itself, and overflow to a fallback allocator once the local storage is exhausted. This would seem ideal, but unfortunately there are two problems:
  • It's not clear that this is permitted within the allocator rules as they stand. I'm happy to defer to anyone with better knowledge, but my understanding is that copies of allocators must compare equal, and this does not seem to be possible if storage is embedded in the allocator itself. Notably, Howard Hinnant's stack allocator uses a separate arena which the user must provide.
  • Embedding the preallocated array in the vector itself means that the storage can overlap (i.e. in a union) with the standard three-pointer implementation of a heap-allocated vector -- as is the case with the small string optimisation for std::string. On the contrary, with a stack allocator the array cannot overlap and will we always be using 3 words more than necessary.
The latter point is particularly important as the speed of move operations is be directly proportional to the sizeof the vector; using preallocated storage will unavoidably make moves slower. We certainly don't want to make the situation worse by making a vector 24 bytes larger than it needs to be. 

Implementations
  
This is by no means a new idea. There are four implementations that I'm aware of, all from major projects:
So there is clearly "a market" for this optimisation. Would there be any interest in proposing such a container for standardisation?

Feedback much appreciated,


Tristan




Thiago Macieira

unread,
Jan 4, 2016, 6:56:30 AM1/4/16
to std-pr...@isocpp.org
On Sunday 03 January 2016 23:41:44 Tristan Brindle wrote:
> This is by no means a new idea. There are four implementations that I'm
> aware of, all from major projects:
>
> - SmallVec from LLVM
> <http://llvm.org/docs/ProgrammersManual.html#llvm-adt-smallvector-h>
> - folly::small_vector from Facebook
> <https://github.com/facebook/folly/blob/master/folly/docs/small_vector.md
> > - boost::container::small_vector
> <http://www.boost.org/doc/libs/1_60_0/doc/html/boost/container/small_vect
> or.html> - fixed_vector from EASTL
> <https://github.com/paulhodge/EASTL/blob/community/include/EASTL/fixed_ve
> ctor.h> (with template param bEnableOverflow = true)

You can add QVarLengthArray to that list too.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358

Tony V E

unread,
Jan 4, 2016, 9:12:01 AM1/4/16
to ISO C++ Standard - Future Proposals
On your list of things to consider, note that vector has non-throwing move and swap, whereas a smallvec won't (unless T is non throwing). 

Doesn't mean I don't want it, but it can be an important difference.

Sent from my BlackBerry portable Babbage Device
From: Tristan Brindle
Sent: Monday, January 4, 2016 2:41 AM
To: ISO C++ Standard - Future Proposals
Subject: [std-proposals] Views on a small_vector<T, N>?

--

---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposal...@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-proposals/.

Alisdair Meredith

unread,
Jan 4, 2016, 9:18:32 AM1/4/16
to std-pr...@isocpp.org
Your analysis of allocator support (within the current standard) is
correct.  Stack-allocators are typically handed a reference to some
stack-based buffer, rather than owning their own internal buffer, due
to the copies-must-compare-equal requirement.  This works well
for a single container, but does not cover the case of containers-of-
containers, where you may want millions of small containers.

I think this would be a great proposal for a TS, especially as it has
been re-implemented many times.  One of the key issues is
reframing the container requirements documentation to allow for
containers with different iterator/reference stability rules, although
this is already done for ‘basic_string’.

I am not yet convinced that such a special-purpose container belongs
in the main standard, the concern being how many more special-purpose
containers would then follow the precedent.  I think that is a worthwhile
question to pursue though, either as part of this as a proposal, or as a
separate follow-up.

AlisdairM

Alisdair Meredith

unread,
Jan 4, 2016, 9:23:09 AM1/4/16
to std-pr...@isocpp.org
This is why I would strongly oppose allowing the ‘small object’ optimization
for ‘std::vector’.  I think it is much more interesting as a distinct type though,
where users can pick the container optimized to match their needs.

One interesting question that occurs to me is whether we should want
‘vector’ and ‘small_vector’ to have the same iterator type?  That would make
it easier to change containers in (function) implementations without
impacting ABIs, as the iterator type (more typically used in interfaces) remains
the same.  That would also deal with ‘template bloat” issues.

On the other hand, there are enough folks that seem to prefer that ‘array’,
‘basic_string’, and ‘vector’ all have distinct iterator types, even though a simple
pointer would be a valid implementation in each case.

AlisdairM

Tony V E

unread,
Jan 4, 2016, 1:07:38 PM1/4/16
to ISO C++ Standard - Future Proposals
Also, for nonthrowing or trivial T, could a small buffer optimization inside std::vector<T> be conforming? (as a constant such as 16 elements is still constant time?).

Sent from my BlackBerry portable Babbage Device
From: Tony V E
Sent: Monday, January 4, 2016 9:11 AM
To: Tristan Brindle; ISO C++ Standard - Future Proposals
Subject: Re: [std-proposals] Views on a small_vector<T, N>?

Nevin Liber

unread,
Jan 4, 2016, 1:28:21 PM1/4/16
to std-pr...@isocpp.org
On 4 January 2016 at 01:41, Tristan Brindle <tcbr...@gmail.com> wrote:
Would there be any interest in proposing such a container for standardisation?

Yes (as a separate container and not trying to adapt vector to support it).  LEWG said as much in Kansas when I brought it up, but I have not yet had time to write up a proposal.  (If someone else wishes to do it, go for it!)
--
 Nevin ":-)" Liber  <mailto:ne...@cplusplusguy.com+1-847-691-1404

Agustín K-ballo Bergé

unread,
Jan 4, 2016, 1:29:12 PM1/4/16
to std-pr...@isocpp.org
On 1/4/2016 3:07 PM, Tony V E wrote:
> Also, for nonthrowing or trivial T, could a small buffer optimization
> inside std::vector<T> be conforming? (as a constant such as 16 elements
> is still constant time?).

No, because swapping `std::vector<T>` shall not invalidate any
references, pointers, or iterators.

> Sent from my BlackBerry portable Babbage Device
> *From: *Tony V E
> *Sent: *Monday, January 4, 2016 9:11 AM
> *To: *Tristan Brindle; ISO C++ Standard - Future Proposals
> *Subject: *Re: [std-proposals] Views on a small_vector<T, N>?
>
>
> On your list of things to consider, note that vector has non-throwing
> move and swap, whereas a smallvec won't (unless T is non throwing).
>
> Doesn't mean I don't want it, but it can be an important difference.

Regards,
--
Agustín K-ballo Bergé.-
http://talesofcpp.fusionfenix.com

Tony V E

unread,
Jan 4, 2016, 1:38:18 PM1/4/16
to Agustín K-ballo Bergé
Ah, right. I'm not a fan of that feature generally, thus I forgot it :-)

Sent from my BlackBerry portable Babbage Device
  Original Message  
From: Agustín K-ballo Bergé
Sent: Monday, January 4, 2016 1:29 PM
To: std-pr...@isocpp.org
Reply To: std-pr...@isocpp.org
Subject: Re: [std-proposals] Views on a small_vector<T, N>?

Marcelo Zimbres

unread,
Jan 4, 2016, 2:33:13 PM1/4/16
to std-pr...@isocpp.org
2016-01-04 12:18 GMT-02:00 Alisdair Meredith <alis...@me.com>:
> Your analysis of allocator support (within the current standard) is
> correct. Stack-allocators are typically handed a reference to some
> stack-based buffer, rather than owning their own internal buffer, due
> to the copies-must-compare-equal requirement. This works well
> for a single container, but does not cover the case of containers-of-
> containers, where you may want millions of small containers.

Can't scoped allocators cover the container of containers case?
For example:

using inner_alloc_type = my::allocator<int>;
using inner_vector_type = std::vector<int, inner_alloc_type>;
using outer_alloc_type =
std::scoped_allocator_adaptor< std::allocator<inner_vector_type>
, inner_alloc_type>;

// All small inner-allocations go on this buffer.
std::array<char, 2000> buffer = {{}};

inner_alloc_type alloc1(buffer);
std::allocator<inner_vector_type> alloc2;

outer_alloc_type alloc(alloc2, alloc1);

std::list<inner_vector_type, outer_alloc_type> t1(alloc);

t1.push_back({{5, 3, 7, 20}});
t1.push_back({{1, 44, 22, 8}});

Then the implementation of my::allocator<int>::allocate can decide
what goes on the stack buffer and what goes on the heap:

...
my::allocator<int>::allocate(size_type n)
{
if (n < stack_max_size)
return find_space_on_the_stack_buffer();
return find_space_on_the_heap();
}

This has the advantage that the shared stack buffer can have a smaller size
than if I had individual buffers for each small_vector, since in this case, it
is very unlikely that all containers have reached their "stack limit" at the
same time.

Marcelo

Christopher Jefferson

unread,
Jan 5, 2016, 8:37:54 AM1/5/16
to std-pr...@isocpp.org
On 4 January 2016 at 18:07, Tony V E <tvan...@gmail.com> wrote:
Also, for nonthrowing or trivial T, could a small buffer optimization inside std::vector<T> be conforming? (as a constant such as 16 elements is still constant time?).

Nope, because swap has to preserve iterators, and you would then have pointers into another vector's internal state, which could go out of scope.

Chris

Alisdair Meredith

unread,
Jan 5, 2016, 8:57:48 AM1/5/16
to std-pr...@isocpp.org
> On Jan 4, 2016, at 2:33 PM, Marcelo Zimbres <mzim...@gmail.com> wrote:
>
> Can't scoped allocators cover the container of containers case?


This is fine for the simple case, where you can put a simple limit on the
amount of stack space to the used by all elements in the container, but
the original problem is slightly mis-stated.

The intent is that for a small vector, memory allocation comes out of space
in the vector object itself, rather than necessitating a trip to the heap. When
we have a container with millions of small vectors, that saving is notable. In
this case, we do not really mean ‘allocate on the stack’ but ‘allocate from
internal state’, and that is where allocators can no longer help, but the
optimization must be built into the data structure itself.

AlisdairM


Andrew Tomazos

unread,
Jan 5, 2016, 9:06:30 AM1/5/16
to std-pr...@isocpp.org
On Mon, Jan 4, 2016 at 3:18 PM, Alisdair Meredith <alis...@me.com> wrote:
I am not yet convinced that such a special-purpose container belongs
in the main standard, the concern being how many more special-purpose
containers would then follow the precedent.  I think that is a worthwhile
question to pursue though, either as part of this as a proposal, or as a
separate follow-up.

I'm with Alisdair on this one.  There are many many different data structures with all sorts of different interfaces and performance profiles.  The standard containers do and should only cover a handful of the most common general-purpose ones.

I'd be much more interested in proposals that would make it easier and safer to implement data structures like small_vector yourself.  Implementing a data structure to the level of sophistication of say std::vector (with all the iterators and allocators and member types and specializations of various things and so on) is way more work than it could be if we provided better tools/primtivites/building-blocks to assist.

Ville Voutilainen

unread,
Jan 5, 2016, 9:22:19 AM1/5/16
to ISO C++ Standard - Future Proposals
Even so, there's fairly good amounts of evidence that a small-vector
is a commonly
used container, so concerns about that somehow opening the door for too many
special-purpose containers, for whatever values of "too many" and
"special-purpose",
seem completely theoretical, whereas the need for a small-vector is
seemingly not so theoretical.
Building blocks aren't that useful, because the real benefit from a
small-vector comes
when it can be used as a portable "vocabulary type", rather than
having a plethora of small-vectors
built from said building blocks. Sure, it's probably much more
"special-purpose" than
std::vector, but there is some evidence that it's not so
"special-purpose" that there wouldn't
exist a very good amount of use cases to provide it in the standard library.

Andrew Tomazos

unread,
Jan 5, 2016, 10:00:23 AM1/5/16
to std-pr...@isocpp.org
I concede that the amount of existing practice is compelling.  I do not agree that small_vector is a vocabulary type.  It is just a different flavor of std::vector with slightly different (not even asymptotic) time-space tradeoffs.  I find it hard to imagine it crossing library boundaries.  I'm under the impression it is used mainly as an internal performance optimization, maybe even prematurely/unsuccesfully in some cases.  The heap allocator usually does a pretty damn good job with small blocks anyway.

Ville Voutilainen

unread,
Jan 5, 2016, 10:10:16 AM1/5/16
to ISO C++ Standard - Future Proposals
On 5 January 2016 at 17:00, Andrew Tomazos <andrew...@gmail.com> wrote:
> I concede that the amount of existing practice is compelling. I do not
> agree that small_vector is a vocabulary type. It is just a different flavor
> of std::vector with slightly different (not even asymptotic) time-space
> tradeoffs. I find it hard to imagine it crossing library boundaries. I'm

Those tradeoffs, and their repercussions, are apparently sufficiently
different that
many people think it shouldn't and can't be a "flavor of std::vector".

> under the impression it is used mainly as an internal performance
> optimization, maybe even prematurely/unsuccesfully in some cases. The heap
> allocator usually does a pretty damn good job with small blocks anyway.

Whether it crosses library boundaries or not doesn't make or not make
it a vocabulary
type. It's a type that tends to cross other boundaries that have
different programmers
and different teams on the ends of those boundaries, so practically it
helps them
decide what type to choose if one is readily available, rather than
just an ephemeral
idea out of which a concrete type can be built if certain building
blocks are combined
just the right way.

Matthew Woehlke

unread,
Jan 5, 2016, 10:35:07 AM1/5/16
to std-pr...@isocpp.org
On 2016-01-05 10:00, Andrew Tomazos wrote:
> I concede that the amount of existing practice is compelling. I do not
> agree that small_vector is a vocabulary type. It is just a different
> flavor of std::vector with slightly different (not even asymptotic)
> time-space tradeoffs.

...and other "gotchas" like swap invalidating iterators.

> I'm under the impression it is used mainly as an internal performance
> optimization, maybe even prematurely/unsuccesfully in some cases.
> The heap allocator usually does a pretty damn good job with small
> blocks anyway.

No matter how perfect the heap allocator may be, it can't escape cache
non-locality...

--
Matthew

Nicol Bolas

unread,
Jan 5, 2016, 10:36:26 AM1/5/16
to ISO C++ Standard - Future Proposals
Details for any eventual proposal:

1) Make sure that `small_vector` can be moved into/outof regular `vector`. Obviously, they must use compatible allocator types. Also obviously, if the incoming `vector` is smaller than the small size, then no memory allocation takes place and the old `vector` doesn't necessarily lose its memory. It would then look like a copy operation, and it would only be noexcept if the copy for `T` were noexcept.

2) Since the number of "small" elements is stated up-front, there should be a function to ask "are you heap allocated yet?", rather than asking for the capacity and comparing it to the small element count.

Nevin Liber

unread,
Jan 5, 2016, 10:45:16 AM1/5/16
to std-pr...@isocpp.org
On 5 January 2016 at 09:36, Nicol Bolas <jmck...@gmail.com> wrote:
Details for any eventual proposal:

1) Make sure that `small_vector` can be moved into/outof regular `vector`. Obviously, they must use compatible allocator types. Also obviously, if the incoming `vector` is smaller than the small size, then no memory allocation takes place and the old `vector` doesn't necessarily lose its memory. It would then look like a copy operation, and it would only be noexcept if the copy for `T` were noexcept.

What is the motivation?  This seems like a lot of complexity for an incredibly minor use case.
 
2) Since the number of "small" elements is stated up-front, there should be a function to ask "are you heap allocated yet?", rather than asking for the capacity and comparing it to the small element count.

While I don't see much motivation for this as well, it is fairly easy to provide.
__ 
 Nevin ":-)" Liber  <mailto:ne...@eviloverlord.com>  +1-847-691-1404

Nicol Bolas

unread,
Jan 5, 2016, 10:48:01 AM1/5/16
to ISO C++ Standard - Future Proposals

Let's assume for a moment that this is true. Let's take it for granted that `small_vector` doesn't qualify as a "vocabulary type", that it's just a "different flavor of std::vector".

Two things. First, we have library fundamentals TS's. Why can't we stick such types in there?

It seems to me that the library fundamentals TS is the perfect place for container types that aren't "vocabulary types" yet are in strong demand by the community. They wouldn't be a core part of the standard, so they wouldn't interfere with the notion that the standard library should only have vocabulary types.

Second... there are lots of things being standardized that are not "vocabulary types". Networking isn't vocabulary. Filesystem isn't vocabulary. There are strong pushes to get these, not just into TS's, but into the core C++ standard.

So it seems to me that the "vocabulary types" ship has long since sailed.

Ville Voutilainen

unread,
Jan 5, 2016, 10:54:48 AM1/5/16
to ISO C++ Standard - Future Proposals
On 5 January 2016 at 17:48, Nicol Bolas <jmck...@gmail.com> wrote:
> Let's assume for a moment that this is true. Let's take it for granted that
> `small_vector` doesn't qualify as a "vocabulary type", that it's just a
> "different flavor of std::vector".
> Two things. First, we have library fundamentals TS's. Why can't we stick
> such types in there?

So as to keep such types there without ever moving them to the main standard?
Yes, technically we could do that.

> It seems to me that the library fundamentals TS is the perfect place for
> container types that aren't "vocabulary types" yet are in strong demand by
> the community. They wouldn't be a core part of the standard, so they
> wouldn't interfere with the notion that the standard library should only
> have vocabulary types.

Or if we have problems with the word "fundamentals", we could well come up
with a separate "Batteries Included TS". :)

> Second... there are lots of things being standardized that are not
> "vocabulary types". Networking isn't vocabulary. Filesystem isn't
> vocabulary. There are strong pushes to get these, not just into TS's, but
> into the core C++ standard.
> So it seems to me that the "vocabulary types" ship has long since sailed.


That may well be, but it still requires committee agreement to add types into
any TS, and some are going to argue that they don't agree on the applicability
of whatever type to be broad enough to ship such types in any standard document,
TS or IS. ;)

Nicol Bolas

unread,
Jan 5, 2016, 10:59:39 AM1/5/16
to ISO C++ Standard - Future Proposals


On Tuesday, January 5, 2016 at 10:45:16 AM UTC-5, Nevin ":-)" Liber wrote:
On 5 January 2016 at 09:36, Nicol Bolas <jmck...@gmail.com> wrote:
Details for any eventual proposal:

1) Make sure that `small_vector` can be moved into/outof regular `vector`. Obviously, they must use compatible allocator types. Also obviously, if the incoming `vector` is smaller than the small size, then no memory allocation takes place and the old `vector` doesn't necessarily lose its memory. It would then look like a copy operation, and it would only be noexcept if the copy for `T` were noexcept.

What is the motivation?  This seems like a lot of complexity for an incredibly minor use case.

If an API takes a vector, and I use a small_vector (or vice-versa), I want to be able to transfer that information with the fastest performance possible. After all, `small_vector` exists for performance reasons. We wouldn't want to do things to make code slower.

This should also extend to different `small_vector` types that have different small element counts.

Nevin Liber

unread,
Jan 5, 2016, 11:16:09 AM1/5/16
to std-pr...@isocpp.org
On 5 January 2016 at 09:59, Nicol Bolas <jmck...@gmail.com> wrote:
If an API takes a vector, and I use a small_vector (or vice-versa), I want to be able to transfer that information with the fastest performance possible. After all, `small_vector` exists for performance reasons. We wouldn't want to do things to make code slower.

While there may be pathological cases that can be micro-optimized, I'm not seeing why I just wouldn't use vector in this case.

Do you also suggest specializing small_vector<bool> for this type exchangeability as well?
--

Tristan Brindle

unread,
Jan 6, 2016, 2:30:21 AM1/6/16
to ISO C++ Standard - Future Proposals
Let's view a `vector<T>` as being the special case `small_vector<T, 0>`, i.e. zero elements preallocated*. Then imagine we had a move operation

     small_vector<T, N> -> small_vector<T, M>

The only time this will actually be a move and not a copy is if the size of the source vector is greater than both N and M. In every other case, you will either be copying from the embedded storage of the source, copying into the embedded storage of the target, or both.

I would suggest that if you are overflowing the preallocated storage often enough that this single move case makes a difference to performance, then you shouldn't be using a `small_vector` at all.


* FWIW, I would advocate prohibiting `small_vector<T, 0>` for precisely this reason.


Tristan


 

Andrew Tomazos

unread,
Jan 6, 2016, 7:48:34 AM1/6/16
to std-pr...@isocpp.org
On Wed, Jan 6, 2016 at 8:30 AM, Tristan Brindle <tcbr...@gmail.com> wrote:


On Wednesday, 6 January 2016 04:59:39 UTC+13, Nicol Bolas wrote:


On Tuesday, January 5, 2016 at 10:45:16 AM UTC-5, Nevin ":-)" Liber wrote:
On 5 January 2016 at 09:36, Nicol Bolas <jmck...@gmail.com> wrote:
Details for any eventual proposal:

1) Make sure that `small_vector` can be moved into/outof regular `vector`. Obviously, they must use compatible allocator types. Also obviously, if the incoming `vector` is smaller than the small size, then no memory allocation takes place and the old `vector` doesn't necessarily lose its memory. It would then look like a copy operation, and it would only be noexcept if the copy for `T` were noexcept.

What is the motivation?  This seems like a lot of complexity for an incredibly minor use case.

If an API takes a vector, and I use a small_vector (or vice-versa), I want to be able to transfer that information with the fastest performance possible. After all, `small_vector` exists for performance reasons. We wouldn't want to do things to make code slower.

This should also extend to different `small_vector` types that have different small element counts.

Let's view a `vector<T>` as being the special case `small_vector<T, 0>`, i.e. zero elements preallocated*. Then imagine we had a move operation

I'm not sure if the idea was discussed previously, but the above statement admits an idea - another option, instead of providing a separate small_vector class template - would be to add another template parameter to std::vector.  That being "size_t num_preallocated_elements = 0".  As shown defaulting to 0.  The existing std::vector requirements/constraints that are incompatible with preallocated elements would then be adjusted to only apply when num_preallocated_elements is 0.

Not sure if this is a good or bad idea quite yet - but may be worth exploring.

Nevin ":-)" Liber

unread,
Jan 6, 2016, 10:12:13 AM1/6/16
to std-pr...@isocpp.org


> On Jan 6, 2016, at 6:48 AM, Andrew Tomazos <andrew...@gmail.com> wrote:
>
> Not sure if this is a good or bad idea quite yet - but may be worth exploring.

Adding a third parameter at the end is an ABI breakage and hinders usability, as most people rarely have to specify allocators.

Adding it as the second parameter breaks source for anyone using allocators or passing vectors generically.

The only benefit is reusing the name. Doesn't seem worth it to me.

Other than these trivially obvious concerns, what else makes this worth exploring?
--
Nevin ":-)" Liber <mailto:ne...@eviloverlord.com> +1-312-623-5420

Nevin ":-)" Liber

unread,
Jan 6, 2016, 10:14:01 AM1/6/16
to std-pr...@isocpp.org
>
> On Jan 6, 2016, at 1:30 AM, Tristan Brindle <tcbr...@gmail.com> wrote:
>
> FWIW, I would advocate prohibiting `small_vector<T, 0>` for precisely this reason.

I disagree. I'd love to use this to replace vector, at a minimum because it doesn't have a bool specialization.

--
Nevin ":-)" Liber <mailto:ne...@eviloverlord.com> +1-312-623-5420

Tristan Brindle

unread,
Jan 6, 2016, 10:53:55 AM1/6/16
to ISO C++ Standard - Future Proposals


On Thursday, 7 January 2016 04:14:01 UTC+13, Nevin ":-)" Liber wrote:

I disagree. I'd love to use this to replace vector, at a minimum because it doesn't have a bool specialization.


I think phrase "replace vector" might be enough to cause some committee members to have kittens, which is why I'd advocate forbidding the zero case, at least to start with. The restriction can always be relaxed later if there's good reason.

Nevin Liber

unread,
Jan 6, 2016, 11:05:04 AM1/6/16
to std-pr...@isocpp.org
And by not putting it in we harm its use in generic programming.  We added array<T, 0>; I see no reason not to add small_vector<T, 0, A>.

I'm not seeing any actual technical concerns; just FUD.
--
 Nevin ":-)" Liber  <mailto:ne...@eviloverlord.com>  +1-847-691-1404

Andrew Tomazos

unread,
Jan 6, 2016, 11:05:25 AM1/6/16
to std-pr...@isocpp.org
On Wed, Jan 6, 2016 at 4:12 PM, Nevin ":-)" Liber <nli...@gmail.com> wrote:


> On Jan 6, 2016, at 6:48 AM, Andrew Tomazos <andrew...@gmail.com> wrote:
>
> Not sure if this is a good or bad idea quite yet - but may be worth exploring.

Adding a third parameter at the end is an ABI breakage and hinders usability, as most people rarely have to specify allocators.

Adding it as the second parameter breaks source for anyone using allocators or passing vectors generically.

The only benefit is reusing the name. Doesn't seem worth it to me.

You are underestimating the advantages I think.  It's not just "reusing the name", its reusing the entire template.  It's conceptually simpler, easier to specify, easier to implement, its a single class template definition (the branch if (num_preallocated_elements > 0) would be optimized out in the zero case).  The interfaces and implementations of operations that dispatch on two std::vectors of different num_preallocated_elements (comparisons, conversions, etc, etc) would be cleaner.

I don't however have a good answer currently for your ABI and source compatibility concerns.

Other than these trivially obvious concerns, what else makes this worth exploring?
--
 Nevin ":-)" Liber <mailto:ne...@eviloverlord.com+1-312-623-5420

Andrew Tomazos

unread,
Jan 6, 2016, 11:18:13 AM1/6/16
to std-pr...@isocpp.org
On Wed, Jan 6, 2016 at 5:05 PM, Andrew Tomazos <andrew...@gmail.com> wrote:
On Wed, Jan 6, 2016 at 4:12 PM, Nevin ":-)" Liber <nli...@gmail.com> wrote:


> On Jan 6, 2016, at 6:48 AM, Andrew Tomazos <andrew...@gmail.com> wrote:
>
> Not sure if this is a good or bad idea quite yet - but may be worth exploring.

Adding a third parameter at the end is an ABI breakage and hinders usability, as most people rarely have to specify allocators.

Adding it as the second parameter breaks source for anyone using allocators or passing vectors generically.

The only benefit is reusing the name. Doesn't seem worth it to me.

You are underestimating the advantages I think.  It's not just "reusing the name", its reusing the entire template.  It's conceptually simpler, easier to specify, easier to implement, its a single class template definition (the branch if (num_preallocated_elements > 0) would be optimized out in the zero case).  The interfaces and implementations of operations that dispatch on two std::vectors of different num_preallocated_elements (comparisons, conversions, etc, etc) would be cleaner.

I don't however have a good answer currently for your ABI and source compatibility concerns.
 
An idea that addresses source compatibility maybe to pick a new name (vector was always a silly name anyway, its not a mathematical vector, valarray should have really been called vector) and then make vector an alias template:

template<

    class T,

    size_t NumPreallocatedElements = 0,
    class Allocator = std::allocator<T>

>
class sequence { ... };

template<

    class T,
    class Allocator = std::allocator<T>

> using vector = sequence<T,0,Allocator>;

If you don't like "sequence", an alternative name is "vararray".

Tristan Brindle

unread,
Jan 6, 2016, 11:46:14 AM1/6/16
to std-pr...@isocpp.org

> On 7/01/2016, at 5:04 AM, Nevin Liber <ne...@eviloverlord.com> wrote:
>
> And by not putting it in we harm its use in generic programming. We added array<T, 0>; I see no reason not to add small_vector<T, 0, A>.
>

That’s a good point. I hadn’t considered the generic programming aspect. The symmetry with `array` is appealing.


> I'm not seeing any actual technical concerns; just FUD.

Sure. The technical concern is that `small_vector<T, 0>` winds up being identical to `vector<T>`, except that the two cannot be interchanged. Having two ways to say the same thing is never good, particularly for a type as fundamental and widely used as vector.

On the other hand, perhaps we could frame the discussion differently. The “small vector” being discussed here is really a generalisation of vector, similar to how `tuple` is a generalisation of `pair` — both of which still exist, despite the latter being technically redundant.

Looking at it this way, `small_vector` is actually a poor name choice. Perhaps a better name would be `varray`, or variable array. If we default the second template argument, then `varray` becomes a drop-in replacement for `vector`, except without the `vector<bool>` legacy. It also evokes `dynarray`, and covers many of the same use-cases (i.e. vector semantics on the stack), but requires no core language changes.

Hmmm...



--

Tristan





Andrew Tomazos

unread,
Jan 6, 2016, 11:51:42 AM1/6/16
to std-pr...@isocpp.org
I've mostly arrived at the same conclusion, but I think "vararray" (short for "variable-length array") is a better name than "varray".  Abbreviating to a single character is no longer in vogue.

Hmmm...



--

Tristan

Thiago Macieira

unread,
Jan 6, 2016, 12:59:58 PM1/6/16
to std-pr...@isocpp.org
On Wednesday 06 January 2016 13:48:32 Andrew Tomazos wrote:
> would be to add another template parameter to
> std::vector.

Breaks binary compatibility. Please don't.
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358

Thiago Macieira

unread,
Jan 6, 2016, 1:01:15 PM1/6/16
to std-pr...@isocpp.org
On Wednesday 06 January 2016 17:05:23 Andrew Tomazos wrote:
> You are underestimating the advantages I think. It's not just "reusing the
> name", its reusing the entire template. It's conceptually simpler, easier
> to specify, easier to implement, its a single class template definition
> (the branch if (num_preallocated_elements > 0) would be optimized out in
> the zero case). The interfaces and implementations of operations that
> dispatch on two std::vectors of different num_preallocated_elements
> (comparisons, conversions, etc, etc) would be cleaner.

Can't you do that by:

template <typename T, typename Allocator>
class vector : public small_vector<T, 0, Allocator>
{
using small_vector;
};

Nevin Liber

unread,
Jan 6, 2016, 1:36:24 PM1/6/16
to std-pr...@isocpp.org
On 6 January 2016 at 10:46, Tristan Brindle <tcbr...@gmail.com> wrote:

Looking at it this way, `small_vector` is actually a poor name choice.

I'd much rather we not use the word "vector" in the name for the initial discussion, as it limits thinking on design choices.  No one says it has to have exactly the same interface (no more, no less) as vector.

For instance, it might make sense to have a bool emplace_back_if_space(...) member that returns false if the container is already at its capacity.  We may or may not want to add such a thing to std::vector.
 
Perhaps a better name would be `varray`, or variable array.

I would say no.  In C++, arrays refer to things that have exactly N elements (well, maybe not valarray; I don't know, as I've never had cause to use it).

I also think that bike shedding the name at this point is the least of our worries.  (Note:  while I'm against it, calling it vector is not just a bike shedding issue.)

Nevin Liber

unread,
Jan 6, 2016, 1:38:57 PM1/6/16
to std-pr...@isocpp.org
On 6 January 2016 at 12:01, Thiago Macieira <thi...@macieira.org> wrote:
Can't you do that by:

template <typename T, typename Allocator>
class vector : public small_vector<T, 0, Allocator>
{
        using small_vector;
};

You still have to forward non-compiler-generated constructors and assignment operators.

Is this something you would want the standard to mandate, or is it just an implementation detail?
--
 Nevin ":-)" Liber  <mailto:ne...@eviloverlord.com>  +1-847-691-1404

Nicol Bolas

unread,
Jan 6, 2016, 1:51:27 PM1/6/16
to ISO C++ Standard - Future Proposals


On Wednesday, January 6, 2016 at 1:38:57 PM UTC-5, Nevin ":-)" Liber wrote:
On 6 January 2016 at 12:01, Thiago Macieira <thi...@macieira.org> wrote:
Can't you do that by:

template <typename T, typename Allocator>
class vector : public small_vector<T, 0, Allocator>
{
        using small_vector;
};

You still have to forward non-compiler-generated constructors and assignment operators.

Is this something you would want the standard to mandate, or is it just an implementation detail?


Actually, it'd be better to mandate the reverse as a specialization:

template<typename T, typename Allocator>
class small_vector<T, 0, Allocator> : public vector<T, Allocator>
{
 
using vector;

 
//Add interfaces specific to small_vector.
}


Andrew Tomazos

unread,
Jan 6, 2016, 2:08:56 PM1/6/16
to std-pr...@isocpp.org
On Wed, Jan 6, 2016 at 7:35 PM, Nevin Liber <ne...@eviloverlord.com> wrote:
On 6 January 2016 at 10:46, Tristan Brindle <tcbr...@gmail.com> wrote:

Looking at it this way, `small_vector` is actually a poor name choice.

I'd much rather we not use the word "vector" in the name for the initial discussion, as it limits thinking on design choices.  No one says it has to have exactly the same interface (no more, no less) as vector.
 
The advantage of having exactly the same interface (or even being exactly the same type) is the same as why we didn't standardize two different flavors of, for example, variant.  Having multiple different, mostly similar, but slightly different versions of things adds complexity to the language for little gain.

For instance, it might make sense to have a bool emplace_back_if_space(...) member that returns false if the container is already at its capacity.  We may or may not want to add such a thing to std::vector.
 
Perhaps a better name would be `varray`, or variable array.

I would say no.  In C++, arrays refer to things that have exactly N elements (well, maybe not valarray; I don't know, as I've never had cause to use it).

I also think that bike shedding the name at this point is the least of our worries.  (Note:  while I'm against it, calling it vector is not just a bike shedding issue.)
--
 Nevin ":-)" Liber  <mailto:ne...@eviloverlord.com>  +1-847-691-1404

--

Andrew Tomazos

unread,
Jan 6, 2016, 2:28:59 PM1/6/16
to std-pr...@isocpp.org
On Wed, Jan 6, 2016 at 7:01 PM, Thiago Macieira <thi...@macieira.org> wrote:
On Wednesday 06 January 2016 17:05:23 Andrew Tomazos wrote:
> You are underestimating the advantages I think.  It's not just "reusing the
> name", its reusing the entire template.  It's conceptually simpler, easier
> to specify, easier to implement, its a single class template definition
> (the branch if (num_preallocated_elements > 0) would be optimized out in
> the zero case).  The interfaces and implementations of operations that
> dispatch on two std::vectors of different num_preallocated_elements
> (comparisons, conversions, etc, etc) would be cleaner.

Can't you do that by:

template <typename T, typename Allocator>
class vector : public small_vector<T, 0, Allocator>
{
        using small_vector;
};


I think  you may be onto something here:

template<
    class T,
    size_t NumPreallocatedElements = 0,
    class Allocator = std::allocator<T>
>
class vararray { ... };

template <typename T, typename Allocator>
class vector : public vararray<T, 0, Allocator>
{
        using vararray;
        // plus necessary forwards
};

This maintains source and binary compatibility.  I wonder if this works:

   template<size_t N>
   void f(const std::vararray<int, N>& v);

   std::vector<int> v;

   f(v);

Is the deduction going to consider base classes?  Can't quite remember right now.

Ion Gaztañaga

unread,
Jan 6, 2016, 3:57:59 PM1/6/16
to std-pr...@isocpp.org, alis...@me.com
On 05/01/2016 14:57, Alisdair Meredith wrote:
>> On Jan 4, 2016, at 2:33 PM, Marcelo Zimbres <mzim...@gmail.com> wrote:
>>
>> Can't scoped allocators cover the container of containers case?
>
>
> This is fine for the simple case, where you can put a simple limit on the
> amount of stack space to the used by all elements in the container, but
> the original problem is slightly mis-stated.
>
> The intent is that for a small vector, memory allocation comes out of space
> in the vector object itself, rather than necessitating a trip to the heap. When
> we have a container with millions of small vectors, that saving is notable. In
> this case, we do not really mean ‘allocate on the stack’ but ‘allocate from
> internal state’, and that is where allocators can no longer help, but the
> optimization must be built into the data structure itself.

Alidair, in case this helps,

In order to implement small_vector for Boost, I've tried to reuse all
vector code to improve maintenance. The implementation detail (which is
not ready to be a proposal at all), is to declare a new allocator attribute:

is_partially_propagable

which means that memory allocated by one allocator might not be
propagated as a general rule, but it must be queried storage per
storage. Currently it is done by:

bool storage_is_unpropagable(pointer p);

(maybe is not a good idea to have a negative query, this might change in
future versions)

Also, in a "partially propagable" allocator, one allocator compares
equal to another one if *storage that is propagable*
(!storage_is_unpropagable()) can be deallocated by another allocator.

boost::container::vector handles those partially propagable allocators
and small_vector is just a wrapper a vector with a partially propagable
allocator.

This scheme might not be scalable to other classes like node-based
allocators (one would need to query "storage_is_unpropagable" in all
nodes, so the performance might suffer a bit) but at least for single
storage holders like vector it's quite useful.

Best,

Ion

Nevin Liber

unread,
Jan 6, 2016, 6:20:00 PM1/6/16
to std-pr...@isocpp.org
On 6 January 2016 at 13:08, Andrew Tomazos <andrew...@gmail.com> wrote:
On Wed, Jan 6, 2016 at 7:35 PM, Nevin Liber <ne...@eviloverlord.com> wrote:
On 6 January 2016 at 10:46, Tristan Brindle <tcbr...@gmail.com> wrote:

Looking at it this way, `small_vector` is actually a poor name choice.

I'd much rather we not use the word "vector" in the name for the initial discussion, as it limits thinking on design choices.  No one says it has to have exactly the same interface (no more, no less) as vector.
 
The advantage of having exactly the same interface (or even being exactly the same type) is the same as why we didn't standardize two different flavors of, for example, variant.  Having multiple different, mostly similar, but slightly different versions of things adds complexity to the language for little gain.

Breaking backwards compatibility is a huge loss, especially since we don't need to, because unlike minor breakages to vector in the past (say between C++98 and C++11 for move semantics), existing code will not benefit.  If this were a world where we didn't already have vector for almost two decades, I might think differently.

If we stick with vector, we are saddled with the bool specialization wart, we cannot conceptify the interface, etc.

This differs from the variant case in that (a) we were starting from scratch and (b) the differences were in expert-only usage.

Combining it with vector without guidance from both LEWG and LWG saying that they want them tied together as well has exactly how they want them tied together seems a bit premature.

AFAIK, none of the current implementations have an interface that has std::vector anywhere in it, so it is pure invention that there are enough practical use cases for this to be worthwhile.

Nicol Bolas

unread,
Jan 7, 2016, 12:24:02 AM1/7/16
to ISO C++ Standard - Future Proposals
On Wednesday, January 6, 2016 at 2:08:56 PM UTC-5, Andrew Tomazos wrote:
On Wed, Jan 6, 2016 at 7:35 PM, Nevin Liber <ne...@eviloverlord.com> wrote:
On 6 January 2016 at 10:46, Tristan Brindle <tcbr...@gmail.com> wrote:

Looking at it this way, `small_vector` is actually a poor name choice.

I'd much rather we not use the word "vector" in the name for the initial discussion, as it limits thinking on design choices.  No one says it has to have exactly the same interface (no more, no less) as vector.
 
The advantage of having exactly the same interface (or even being exactly the same type) is the same as why we didn't standardize two different flavors of, for example, variant.  Having multiple different, mostly similar, but slightly different versions of things adds complexity to the language for little gain.

And yet we do.

What is std::forward_list if not purely an optimization over std::list? What does std::string provide over std::vector<char>? It offers a few minor differences: guaranteed null-termination, implicit construction from T arrays and pointers, and the possibility for small-array optimization. Is not most of the interface for std::unordered_map equivalent to that of std::map?

Having "multiple different, mostly similar, but slightly different versions of things" is a ship that has already sailed in the standard library.

Tristan Brindle

unread,
Jan 7, 2016, 6:02:53 AM1/7/16
to std-pr...@isocpp.org
On Thu, Jan 7, 2016 at 7:35 AM, Nevin Liber <ne...@eviloverlord.com> wrote:
I'd much rather we not use the word "vector" in the name for the initial discussion, as it limits thinking on design choices.  No one says it has to have exactly the same interface (no more, no less) as vector.
 
Perhaps a better name would be `varray`, or variable array.

I would say no.  In C++, arrays refer to things that have exactly N elements (well, maybe not valarray; I don't know, as I've never had cause to use it).

Well, if you don't want to call it a vector and you don't want to call it an array, then we're rapidly running out of commonly-used terms for a linear data structure. I'm pretty sure "list" is the wrong one to go for :-)

I also think that bike shedding the name at this point is the least of our worries.  (Note:  while I'm against it, calling it vector is not just a bike shedding issue.)

I agree that bike-shedding on the name at this point is unhelpful, but if we’re going to talk about it then we need to call it something. So I suggest that purely for the purposes of this discussion, we follow Qt’s lead and refer to “a growable contiguous collection with zero or more elements preallocated” as a "variable-length array" or vararray, as suggested by Andrew.


Moving on, I think everyone can agree that all of the following are true:

i. We cannot change the interface of vector in any backwards-incompatible way. This includes removing the vector<bool> specialisation, at least in any reasonably short time frame — it would require an alternative to be available in the IS, plus at least one cycle of deprecation.

ii. Declaring vector<T> to be an alias for vararray<T, 0> today would require vararray<bool, 0> to be specialised, due to (i).

iii. We cannot allow vararray<T, 0> to be a separate type today, and at some point in the future declare vector to be an alias for it. This would break any code written in the intervening time which overloads on the two types.
 


This leads me to think that we have five alternatives:

        A. We do not consider vararray until such time as the vector<bool> specialisation is removed.

        B. We introduce vararray but forbid the zero case. We make no attempt to reconcile this with vector.

        C. We introduce vararray but forbid the zero case. Years pass, and when/if vector<bool> is removed, we remove the zero prohibition and redefine vector as an alias.

        D. We introduce vararray and permit the zero case. We hold our noses and define vararray<bool, 0> to be specialised (at least temporarily) due to (ii), and redeclare vector to be an alias.

        E. We introduce vararray and permit the zero case with no bool specialisation. Due to (iii), vector<T> and vararray<T, 0> must remain separate types forever.

(I believe these options are exhaustive, but if there are any other realistic cases I haven't considered then please do suggest them.)

Let’s have a look at each one in more detail.

Option A

The least interesting option, and one that really requires no further discussion. If we go for this, our time would be better spent trying to standardise boost::dynamic_bitset, as that's a likely requirement for the vector<bool> wart to be removed.

Option B

This was my initial suggestion, and one that I think is relatively self-contained and uncontroversial. It is effectively where the various small_vector implementations stand now. The advantage is that since it is unrelated to vector, there is no potential confusion for users as to which one should be used -- we tell people to use vector almost all the time, unless they have very particular needs that vararray (probably better named small_vector in this case) can help with. The downside is that as pointed out above, barring N=0 is inelegant and may be a pain for generic programming. Notably, this restriction was removed from array between TR1 and C++11.

Option C

Basically the same a Option B, except that we have the ability to "unlock" the zero case for generic programming at some point in the future. The cost of this is that vararray's API must be fixed to be no narrower than vector's from the word go, whereas with Option B we have some leeway to propose a different API (but see below for a discussion about that).

Option D

With this option we remove any potential confusion between vararray and vector; vector is just another name for a vararray which always uses the heap. The downside, of course, is that we are still lumbered with the bool specialisation, which we'd still have to get rid of at some point down the line. But on the other hand, it leaves us no worse off than we are at the moment, and does not artificially prohibit the N=0 case. Whether we default the second template parameter to 0 and encourage people to start saying "vararray<int> v = { 1, 2, 3 };" is an open question. This option would require vararray's API to be a superset of vector's, due to (i).

Option E

This option gives us the most freedom, but is easily the most controversial. If we allow vararray and vector to be completely independent, and permit the N=0 case, then we have two containers that offer identical semantics, except that vararray<bool> behaves properly. This being the case, we'd have to make the choice of telling people which one they should prefer, and it makes much more sense to say "always use vararray" than it does to say "use a vector unless you're dealing with bools or need stack-reservation, in which case use vector". Thus vector becomes redundant.

We have the choice in this case of whether vararray offers an identical API to vector. If it does, then it becomes a drop-in replacement. We can promote vararray as the modern way to use a variable-length array -- free from the vector<bool> legacy, and with optional stack-reservation when you need it.

On the other hand, in theory we have the freedom to design a new, "superior" API for vararray if we wish. In theory that is, in practice we don't actually have much freedom at all, since we would of course want vararray to be a SequenceContainer. By the time most of those requirements are in place, you basically have the vector API anyway. In any case, we'd want to be consistent with list, deque, and developers' existing familiarity with vector, which again limits any choices we could make. It's a matter for debate of course, but unlike std::string and its hundreds of member functions, my opinion is that the vector API doesn't really have any particular warts that need removing. There is much to be gained from keeping vararray's API the same as vector's, and little to be lost; conversely, there is little to be gained and much to be lost if we are gratuitously incompatible.

Option E -- replacing std::vector as the default container! -- might seem at first glance to be unrealistic, perhaps even a good suggestion for an April Fool. At first I dismissed it out of hand, but having given it some more thought I'm warming to the idea. With the arrival of concepts and ranges, and the talk of "STL2", this is a unique opportunity to do something so radical. A vararray, as proposed, is a strict superset of vector. It is never worse and sometimes better. We can easily maintain API compatibility. The hundreds of tutorials and millions of lines of code that use vector wouldn't suddenly be wrong, any more than the millions of lines of code that use pair are wrong now that tuple exists. There is an appetite in the C++ community at the moment to embrace the new; if we are ever going to do something like this, now is the time.

-------

Phew! I think that covers just about every eventuality. If I've made any mistakes or left anything out please let me know. Otherwise, any thoughts on the pros and cons of the various options would be great.

Tristan


Nicol Bolas

unread,
Jan 7, 2016, 9:41:53 AM1/7/16
to ISO C++ Standard - Future Proposals
On Thursday, January 7, 2016 at 6:02:53 AM UTC-5, Tristan Brindle wrote:
Option E -- replacing std::vector as the default container! -- might seem at first glance to be unrealistic, perhaps even a good suggestion for an April Fool. At first I dismissed it out of hand, but having given it some more thought I'm warming to the idea. With the arrival of concepts and ranges, and the talk of "STL2", this is a unique opportunity to do something so radical. A vararray, as proposed, is a strict superset of vector. It is never worse and sometimes better. We can easily maintain API compatibility. The hundreds of tutorials and millions of lines of code that use vector wouldn't suddenly be wrong, any more than the millions of lines of code that use pair are wrong now that tuple exists. There is an appetite in the C++ community at the moment to embrace the new; if we are ever going to do something like this, now is the time.

I do not believe that now is the time at all.

Ranges makes modifications to the iterator paradigm. Since it makes such sweeping changes to iterators and creates new algorithms (and new concepts like ranges and adapters), it brings with it the possibility of "STL2", which is good. But here's the thing: until we actually get all of that stuff finished, we're not in a position to start specifying how containers should function.

Also, if you're going to start saying "hey, new container paradigm!", then there are going to be a lot of people who want in on that conversation. For example, I'd love to see a new vector where you can extract the memory from it as a free-standing object, and pass it around elsewhere (along with the allocator, of course). It could be given to a C-style API. Or more to the point, a C-style API could provide such data, which could be used as the initial storage for the vector (again, along with the allocator).

And if you're doing this for vector, you'd have to do it for string. Both what I just suggested and explicit sizing for small-strings.

Some people will want a vector that can have an explicit limit to how big it gets. Oh and since we're talking about changing the container paradigm, why not throw away allocators and use a better allocation interface? Maybe one that understands the concept of "type erasure". There are going to be API suggestions for the associative containers. There are going to be changes suggested for list and deque. Someone will probably suggest a blanket ban on any throwing-move container implementation for the new containers (I wouldn't say no to that one). And so on.

The discussion you're wanting to start is endless. Everyone and their mother will come out and add their own pet feature.

If you want to start having that discussion, fine. And maybe in preparation of the completion of the Ranges TS (probably version 2), it might be appropriate to start collecting ideas now. But just be prepared for where it leads.

Andrew Tomazos

unread,
Jan 7, 2016, 10:39:24 AM1/7/16
to std-pr...@isocpp.org
On Thu, Jan 7, 2016 at 3:41 PM, Nicol Bolas <jmck...@gmail.com> wrote:
Ranges makes modifications to the iterator paradigm. Since it makes such sweeping changes to iterators and creates new algorithms (and new concepts like ranges and adapters), it brings with it the possibility of "STL2", which is good. But here's the thing: until we actually get all of that stuff finished, we're not in a position to start specifying how containers should function.

That's the standard dilemma between waiting for the slow big vague new thing or making fast incremental concrete changes.  The best approach is to propose both, and let the committee gate/merge things as they will.
 
I'd love to see a new vector where you can extract the memory from it as a free-standing object, and pass it around elsewhere (along with the allocator, of course).  It could be given to a C-style API. Or more to the point, a C-style API could provide such data, which could be used as the initial storage for the vector (again, along with the allocator).

Why can't that be proposed as an extension to vector today?  I don't see the relevance.  It seems orthogonal to everything discussed, and a compatible extension to either/both vector and vararray.
 
And if you're doing this for vector, you'd have to do it for string. Both what I just suggested and explicit sizing for small-strings.

Is there an existing practice of small_string?  Isn't there already a "small string optimization" possible and used in std::string?  std::string is mainly targetted at text, whereas vector/vararray are targeted at contiguous homogenous sequences of an arbitrary type.  They are intentionally distinct.

Some people will want a vector that can have an explicit limit to how big it gets.

This could be a feature of vararray, it just depends how much demand there is.  The tradeoff is between additional complexity versus extra functionality.  Whether it is worth it depends on how much it would be used.
 
Oh and since we're talking about changing the container paradigm, why not throw away allocators and use a better allocation interface? Maybe one that understands the concept of "type erasure". There are going to be API suggestions for the associative containers. There are going to be changes suggested for list and deque. Someone will probably suggest a blanket ban on any throwing-move container implementation for the new containers (I wouldn't say no to that one). And so on.

This is some sort of slippery slope fallacy.  Noone is going to reject vararray because they think we should wait for an allocators v2.  Very few people use anything but the default allocators anyway - virtually noone cares.

The discussion you're wanting to start is endless. Everyone and their mother will come out and add their own pet feature.

I don't agree.  A proposal of vararray should not be disccouraged by this thought.  All proposals get pulled in all directions.  It doesn't mean it won't be successful.
 
If you want to start having that discussion, fine. And maybe in preparation of the completion of the Ranges TS (probably version 2), it might be appropriate to start collecting ideas now. But just be prepared for where it leads.
 
I think the right thing to do would be to write a proposal of vararray for LEWG JAX, carefully considering and exploring all the options and making a recommendation for a course of action - and see what they have to say.

Ville Voutilainen

unread,
Jan 7, 2016, 10:56:01 AM1/7/16
to ISO C++ Standard - Future Proposals
On 7 January 2016 at 17:39, Andrew Tomazos <andrew...@gmail.com> wrote:
>> Oh and since we're talking about changing the container paradigm, why not
>> throw away allocators and use a better allocation interface? Maybe one that
>> understands the concept of "type erasure". There are going to be API
>> suggestions for the associative containers. There are going to be changes
>> suggested for list and deque. Someone will probably suggest a blanket ban on
>> any throwing-move container implementation for the new containers (I
>> wouldn't say no to that one). And so on.
> This is some sort of slippery slope fallacy. Noone is going to reject
> vararray because they think we should wait for an allocators v2. Very few
> people use anything but the default allocators anyway - virtually noone
> cares.

There's no shortage of vocal proponents of allocators in WG21, so I don't know
what you mean by "virtually noone cares". The high-performance crowd cares,
embedded programmers care, there's a whole host of audiences who care.

Andrew Tomazos

unread,
Jan 7, 2016, 11:15:05 AM1/7/16
to std-pr...@isocpp.org
What I mean is (and admittedly I don't have exact numbers) that I claim 99.99%+ of instantiations of the standard container templates (like std::vector) have a defaulted allocator argument.  Interpret that as you will.  While certainly we won't break the existing allocator infrastructure, I expect the voters will not give some new version of allocators "right of way" over more mainstream proposals and use cases.

Nicol Bolas

unread,
Jan 7, 2016, 2:21:19 PM1/7/16
to ISO C++ Standard - Future Proposals
On Thursday, January 7, 2016 at 10:39:24 AM UTC-5, Andrew Tomazos wrote:
On Thu, Jan 7, 2016 at 3:41 PM, Nicol Bolas <jmck...@gmail.com> wrote:
Ranges makes modifications to the iterator paradigm. Since it makes such sweeping changes to iterators and creates new algorithms (and new concepts like ranges and adapters), it brings with it the possibility of "STL2", which is good. But here's the thing: until we actually get all of that stuff finished, we're not in a position to start specifying how containers should function.

That's the standard dilemma between waiting for the slow big vague new thing or making fast incremental concrete changes.  The best approach is to propose both, and let the committee gate/merge things as they will.

No, I can't say that I agree with that. The committee isn't good at merging things. Or have you looked at std::string's interface lately?

I'd love to see a new vector where you can extract the memory from it as a free-standing object, and pass it around elsewhere (along with the allocator, of course).  It could be given to a C-style API. Or more to the point, a C-style API could provide such data, which could be used as the initial storage for the vector (again, along with the allocator).

Why can't that be proposed as an extension to vector today?  I don't see the relevance.  It seems orthogonal to everything discussed, and a compatible extension to either/both vector and vararray.

It certainly could be. But if you're going to revamp the allocation mechanism (which has to be a breaking change if it's going to be useful), then whatever type you use has to use the new allocation mechanism.

Which means you'd need two of them: one for STL1-style allocators and one for STL2-style allocators.

And if you're doing this for vector, you'd have to do it for string. Both what I just suggested and explicit sizing for small-strings.

Is there an existing practice of small_string?  Isn't there already a "small string optimization" possible and used in std::string?  std::string is mainly targetted at text, whereas vector/vararray are targeted at contiguous homogenous sequences of an arbitrary type.  They are intentionally distinct.

Yes, but different `std::string` implementations have different small sizes. I personally like Microsoft's approach, where the small size is 16 characters, no matter how big the character is. As I understand it, libc++'s approach is to pick the size such that `std::string` is never bigger than 3 pointers. So their `std::string<char>` provides a small size of 12 in 32-bits and 24 in 64-bits. But `std::string<char16_t>` only gives you 6 in 32-bits, which is not exactly a useful small size.

Members of SG14 and people here have expressed interest in having the ability to directly specify the number of characters. So if we're going to have `small_vector`, we should have a corresponding `small_string`, with `std::string` being an implementation-defined selection from. That one at least shouldn't cause ABI issues.

Some people will want a vector that can have an explicit limit to how big it gets.

This could be a feature of vararray, it just depends how much demand there is.  The tradeoff is between additional complexity versus extra functionality.  Whether it is worth it depends on how much it would be used.

There's lots of demand. Just talk to the SG14 guys. Go look at EASTL. The same kind of people who care about `small_vector` are the same kind who would care about a dynarray class.

Oh and since we're talking about changing the container paradigm, why not throw away allocators and use a better allocation interface? Maybe one that understands the concept of "type erasure". There are going to be API suggestions for the associative containers. There are going to be changes suggested for list and deque. Someone will probably suggest a blanket ban on any throwing-move container implementation for the new containers (I wouldn't say no to that one). And so on.

This is some sort of slippery slope fallacy.  Noone is going to reject vararray because they think we should wait for an allocators v2.  Very few people use anything but the default allocators anyway - virtually noone cares.

If you want to believe that, I can't stop you. But you may want to take a look around at other C++ programmers outside of your usual programming circles.

The discussion you're wanting to start is endless. Everyone and their mother will come out and add their own pet feature.

I don't agree.  A proposal of vararray should not be disccouraged by this thought.  All proposals get pulled in all directions.  It doesn't mean it won't be successful.

Please remember the context of my statements. Namely, I was replying to his "option E", which was to propose this as replacement to `vector` as part of some STL2 effort.

My point is that his talk about ranges and STL2 containers is woefully premature. Let `small_vector` work within the existing paradigm, without trying to replace or improve `vector`.

Patrice Roy

unread,
Jan 7, 2016, 8:41:11 PM1/7/16
to std-pr...@isocpp.org
I'd be one of those who would react to throwing away allocators unless there's a very, very good replacement. The SG14 crowd uses this a lot (they might complain about the way things work, but in principle they are heavy users of allocators). Let's not forget that there are many kinds of C++ users out there, and that there are reasons for them being C++ users :)

Tristan Brindle

unread,
Jan 7, 2016, 9:27:30 PM1/7/16
to std-pr...@isocpp.org

> On 8/01/2016, at 3:41 AM, Nicol Bolas <jmck...@gmail.com> wrote:
> Also, if you're going to start saying "hey, new container paradigm!", then there are going to be a lot of people who want in on that conversation.

Personally, I absolutely don’t want a new container *paradigm*. What I want is a new container that’s identical to vector, except that it can optionally use some storage in the object itself to avoid the allocator in certain situations. That was what I suggested in the first place.

But consequences of that are that either

* You need to forbid people from creating small_vector/vararray with zero elements stack-reserved, or

* You need to make vararray<T, 0> and vector<T> the same type, which implies a specialisation for vararray<bool, 0>, or

* If you do neither of these, you need to accept that people will start using vararray<T, 0> as a “better vector”, simply because they can use it generically without worrying about the bool case.

We — or rather LEWG, most likely — need to decide between one of these three cases.

My e-mail that you quoted was intended to suggest that the last option isn’t actually as terrible as it might seem, but for it to work we’d need to "get out in front” and promote it this way from the start — having a replacement vector sneak in by the back door is the worst of all possibilities.

I certainly wasn’t looking to open a can of worms about new container models. I don’t think there is any realistic prospect of that — and even if we were going to do such a thing, it would be wise to get a few years experience with ranges and concepts first, to see where (if anywhere) the current container models are deficient, and to see what grows in the community.



Tristan

Alisdair Meredith

unread,
Jan 8, 2016, 12:07:33 AM1/8/16
to std-pr...@isocpp.org
Note that if you make vector an alias template for even an identical implementation and interface (complete with bool support) you still break users who specialize vector for their own user-defined types, as you cannot specialize an alias template.

While I am interested in the idea of an optimized vector-like type, I would be very concerned at trying to make it part of our current vector - coupling thing for an apparent similarity that is not fundamental, tends to lead to overly constrained, more complicated solutions.

I would be more open to the idea in the context of "STL 2.0", where we might take a broader look at the design space of our containers in general.

Sent from my iPhone

Alisdair Meredith

unread,
Jan 8, 2016, 12:12:14 AM1/8/16
to std-pr...@isocpp.org
If you never use specialized allocators, that is easy to imagine.  If you do use customized allocator, you might well be familiar with code bases where the figure is the other way around.  It is exceedingly rare for me to see a container using std::allocator as it allocator parameter - perhaps not 99.99% rare, as that is a surprisingly high barrier, but probably over the 99% case.

I do not claim to be the common case, but I think this is common where custom allocators are used.  If you need them, you really need them.  Otherwise, they are a solution looking for a problem.  It is no accident that the library allows you to painlessly ignore the feature unless you actually need it.

Sent from my iPhone

Thiago Macieira

unread,
Jan 8, 2016, 3:15:11 AM1/8/16
to std-pr...@isocpp.org
On Friday 08 January 2016 00:07:26 Alisdair Meredith wrote:
> users who specialize vector for their own user-defined types, as you cannot
> specialize an alias template.

Is that even allowed by the standard?

Nevin ":-)" Liber

unread,
Jan 8, 2016, 9:08:31 AM1/8/16
to std-pr...@isocpp.org


> On Jan 8, 2016, at 2:15 AM, Thiago Macieira <thi...@macieira.org> wrote:
>
> Is that even allowed by the standard?

Yes. If not specifically forbidden, users can specialize it.

Personally, I think we should list what can be specialize and forbid the rest by default, but that ship sailed a long time ago for the current library.

Nicol Bolas

unread,
Jan 8, 2016, 10:18:42 AM1/8/16
to ISO C++ Standard - Future Proposals
On Thursday, January 7, 2016 at 9:27:30 PM UTC-5, Tristan Brindle wrote:

> On 8/01/2016, at 3:41 AM, Nicol Bolas <jmck...@gmail.com> wrote:
> Also, if you're going to start saying "hey, new container paradigm!", then there are going to be a lot of people who want in on that conversation.

Personally, I absolutely don’t want a new container *paradigm*. What I want is a new container that’s identical to vector, except that it can optionally use some storage in the object itself to avoid the allocator in certain situations. That was what I suggested in the first place.

But consequences of that are that either

    * You need to forbid people from creating small_vector/vararray with zero elements stack-reserved, or

    * You need to make vararray<T, 0> and vector<T> the same type, which implies a specialisation for vararray<bool, 0>, or
 
    * If you do neither of these, you need to accept that people will start using vararray<T, 0> as a “better vector”, simply because they can use it generically without worrying about the bool case.

On the one hand, the conceptual differences between `small_vector<T, 0>` and `small_vector<T, 1>` are minimal. Which means if you forbid the 0 case, or if you make the 0 case an alias of `vector` with its bool specialization intact, people will just use `small_vector<T, 1>` to work around it.

So `small_vector` is going to compete with `vector`, no matter what you do.

However, there are substantive differences between `small_vector<T, 0>` and `small_vector<T, 1>` Move behavior/iterator invalidation is a good example. `small_vector<T, 0>` would function in every way exactly like `vector`, while `small_vector<T, 1>` would function like a `small_vector<T, 20>`. Or more to the point, like `string`. No matter how small the size is, as long as it isn't zero, you can't guarantee the same move behavior.

Given these facts, we need to ask ourselves these questions:

1) Do we want to create a type that is intended to be special case, but will compete with a fundamental library type beyond that special case, just because of an idiotic decision made 20+ years ago?

2) If we do want that, do we want to encourage such competition by allowing the new type to completely subsume all uses of the old?

After all, people are only supposed to use `small_vector` to be an optimization. But the moment people brought up the fact that `small_vector` wouldn't have a `bool` specialization, I started mentally doing a find/replace for `vector` to `small_vector` in my code.

The correct answer to question 2 seems, to me, to be "yes". If there's going to be competition, then the new type ought to win. It does nobody any good to see template code using `small_vector<T, 1>` just to avoid the `T = bool` case for `vector`. So `small_vector<T, 0>` should be in every way equivalent to `vector`, and it should be interoperable with `vector` (movement from one to the other). With the exception of `vector<bool>` of course; `small_vector` shouldn't have that. If we're going to allow `small_vector` to compete with `vector` outside of its particular optimization domain, then just replace it.

So the big question is really #1: do we want to add a type, originally proposed solely for a narrow case of optimization, that subsumes an existing type which is widely used (not to mention promoted) in lots of C++?

I don't have an answer to that. My abject hatred for the `vector<bool>` specialization (and my continued annoyance with the fact that nobody's put forth any effort to fix it) makes me want to say "Hell yes!" At the same time, I despise the idea of replacing a perfectly good class over some idiotic experimental decision from 20+ years. I also hate the idea of using a type outside of its intended purpose; `small_vector` is supposed to be for a narrow optimization.

Plus, if there's going to be an eventual STL2 that includes containers, we don't want to have to create a third dynamic array type. That's just too confusing.

To me, the only viable solutions are:

1) Design `small_vector` so that it completely takes over `vector` in all situations (except for people who like `vector<bool>`.

2) We don't have `small_vector` at all.

Nevin Liber

unread,
Jan 8, 2016, 10:49:01 AM1/8/16
to std-pr...@isocpp.org
On 8 January 2016 at 09:18, Nicol Bolas <jmck...@gmail.com> wrote:
I don't have an answer to that. My abject hatred for the `vector<bool>` specialization (and my continued annoyance with the fact that nobody's put forth any effort to fix it)

Since the problem with vector<bool> is that it is a good class with a bad name, the only remotely palatable fix is to come up with a new class that has a superset of the functionality of vector<bool>, give people some compelling reason to move to it, then deprecate vector<bool> for a couple of releases of the standard.

And if we try that, someone is bound to come along and say something like:

I despise the idea of replacing a perfectly good class.  `vector<bool>` is supposed to be for a narrow optimization.

Plus, if there's going to be an eventual STL2 that includes containers, we don't want to have to create a third dynamic bitset type. That's just too confusing.

:-)

Tony V E

unread,
Jan 8, 2016, 11:06:46 AM1/8/16
to Standard Proposals
On Fri, Jan 8, 2016 at 10:18 AM, Nicol Bolas <jmck...@gmail.com> wrote:
On Thursday, January 7, 2016 at 9:27:30 PM UTC-5, Tristan Brindle wrote:

> On 8/01/2016, at 3:41 AM, Nicol Bolas <jmck...@gmail.com> wrote:
> Also, if you're going to start saying "hey, new container paradigm!", then there are going to be a lot of people who want in on that conversation.

Personally, I absolutely don’t want a new container *paradigm*. What I want is a new container that’s identical to vector, except that it can optionally use some storage in the object itself to avoid the allocator in certain situations. That was what I suggested in the first place.

But consequences of that are that either

    * You need to forbid people from creating small_vector/vararray with zero elements stack-reserved, or

    * You need to make vararray<T, 0> and vector<T> the same type, which implies a specialisation for vararray<bool, 0>, or
 
    * If you do neither of these, you need to accept that people will start using vararray<T, 0> as a “better vector”, simply because they can use it generically without worrying about the bool case.

On the one hand, the conceptual differences between `small_vector<T, 0>` and `small_vector<T, 1>` are minimal. Which means if you forbid the 0 case, or if you make the 0 case an alias of `vector` with its bool specialization intact, people will just use `small_vector<T, 1>` to work around it.

So `small_vector` is going to compete with `vector`, no matter what you do.

However, there are substantive differences between `small_vector<T, 0>` and `small_vector<T, 1>` Move behavior/iterator invalidation is a good example. `small_vector<T, 0>` would function in every way exactly like `vector`, while `small_vector<T, 1>` would function like a `small_vector<T, 20>`. Or more to the point, like `string`. No matter how small the size is, as long as it isn't zero, you can't guarantee the same move behavior.

small_vector<T,0> only behaves like vector<T> if we specify that behaviour.
ie if we specify that iterators are not invalidated, etc.
The typical implementation may allow for that, but I don't think we should specify it.
ie let's not specialize the specification for the 0 case. Allow small_vector<T,0>, but don't specify its behaviour (nor T's requirements) differently that small_vector<T,1>.

Let's design small vector targeting usage like small_vector<T,20>, not to replace vector.  Let small_vector<T,0> fall out naturally, without special casing it.
 

Given these facts, we need to ask ourselves these questions:

1) Do we want to create a type that is intended to be special case, but will compete with a fundamental library type beyond that special case, just because of an idiotic decision made 20+ years ago?

2) If we do want that, do we want to encourage such competition by allowing the new type to completely subsume all uses of the old?

no.
 

After all, people are only supposed to use `small_vector` to be an optimization. But the moment people brought up the fact that `small_vector` wouldn't have a `bool` specialization, I started mentally doing a find/replace for `vector` to `small_vector` in my code.


Let's fix vector<bool> separately. First, introduce a vector_bool class (or whatever you want to call it), then deprecate vector<bool> specialization, then remove the specialization in STL2.

Or remove vector<> altogether, replacing it with clone_ptr<T[]>.  I find it odd that vector<> acts like a value type, but we expose its pointer-based implementation details (via iterators not invalidating, move semantics, etc).  If we expose its pointer details, maybe its name should reflect that.

 
The correct answer to question 2 seems, to me, to be "yes". If there's going to be competition, then the new type ought to win. It does nobody any good to see template code using `small_vector<T, 1>` just to avoid the `T = bool` case for `vector`. So `small_vector<T, 0>` should be in every way equivalent to `vector`, and it should be interoperable with `vector` (movement from one to the other). With the exception of `vector<bool>` of course; `small_vector` shouldn't have that. If we're going to allow `small_vector` to compete with `vector` outside of its particular optimization domain, then just replace it.


small_vector<T,1> has different properties than vector<T>.  It is _mostly_ the same, but one handles move-only T (or unmovable even) better than the other, etc.
 
So the big question is really #1: do we want to add a type, originally proposed solely for a narrow case of optimization, that subsumes an existing type which is widely used (not to mention promoted) in lots of C++?

I don't have an answer to that. My abject hatred for the `vector<bool>` specialization (and my continued annoyance with the fact that nobody's put forth any effort to fix it) makes me want to say "Hell yes!" At the same time, I despise the idea of replacing a perfectly good class over some idiotic experimental decision from 20+ years. I also hate the idea of using a type outside of its intended purpose; `small_vector` is supposed to be for a narrow optimization.



While on the topic, just curious - I understand the idea behind the vector<bool> "optimization", but what part of the spec _mandates_ the optimization?

 
Plus, if there's going to be an eventual STL2 that includes containers, we don't want to have to create a third dynamic array type. That's just too confusing.

To me, the only viable solutions are:

1) Design `small_vector` so that it completely takes over `vector` in all situations (except for people who like `vector<bool>`.
 
2) We don't have `small_vector` at all.


3) design small_vector to be a sbo_vector.  Ignore whether it replaces vector.

Tony

Alisdair Meredith

unread,
Jan 8, 2016, 1:37:11 PM1/8/16
to std-pr...@isocpp.org
The vector<bool> optimization is permitted, but not mandated.  However, the interface, complete with a specific proxy type, IS mandated, and that is sufficient to cause problems, regardless of defining a meaning for 'data()'.

Sent from my iPhone

Nicol Bolas

unread,
Jan 8, 2016, 5:32:05 PM1/8/16
to ISO C++ Standard - Future Proposals
On Friday, January 8, 2016 at 11:06:46 AM UTC-5, Tony V E wrote:
On Fri, Jan 8, 2016 at 10:18 AM, Nicol Bolas <jmck...@gmail.com> wrote:
On Thursday, January 7, 2016 at 9:27:30 PM UTC-5, Tristan Brindle wrote:

> On 8/01/2016, at 3:41 AM, Nicol Bolas <jmck...@gmail.com> wrote:
> Also, if you're going to start saying "hey, new container paradigm!", then there are going to be a lot of people who want in on that conversation.

Personally, I absolutely don’t want a new container *paradigm*. What I want is a new container that’s identical to vector, except that it can optionally use some storage in the object itself to avoid the allocator in certain situations. That was what I suggested in the first place.

But consequences of that are that either

    * You need to forbid people from creating small_vector/vararray with zero elements stack-reserved, or

    * You need to make vararray<T, 0> and vector<T> the same type, which implies a specialisation for vararray<bool, 0>, or
 
    * If you do neither of these, you need to accept that people will start using vararray<T, 0> as a “better vector”, simply because they can use it generically without worrying about the bool case.

On the one hand, the conceptual differences between `small_vector<T, 0>` and `small_vector<T, 1>` are minimal. Which means if you forbid the 0 case, or if you make the 0 case an alias of `vector` with its bool specialization intact, people will just use `small_vector<T, 1>` to work around it.

So `small_vector` is going to compete with `vector`, no matter what you do.

However, there are substantive differences between `small_vector<T, 0>` and `small_vector<T, 1>` Move behavior/iterator invalidation is a good example. `small_vector<T, 0>` would function in every way exactly like `vector`, while `small_vector<T, 1>` would function like a `small_vector<T, 20>`. Or more to the point, like `string`. No matter how small the size is, as long as it isn't zero, you can't guarantee the same move behavior.

small_vector<T,0> only behaves like vector<T> if we specify that behaviour.
ie if we specify that iterators are not invalidated, etc.
The typical implementation may allow for that, but I don't think we should specify it. 
ie let's not specialize the specification for the 0 case. Allow small_vector<T,0>, but don't specify its behaviour (nor T's requirements) differently that small_vector<T,1>.

So what you're saying is that, if I have a `small_vector<T, 4>` which contains 12 elements, and I move it to another `small_vector<T, 4>`, you're telling me that the specification of `small_vector` should not allow me to keep the iterators?

`string` doesn't let iterators work because the number of characters in the small buffer is implementation defined and not user-accessible. And of course because SSO is not required, so the user has no way to develop code where they know it will work. But with `small_vector`:

small_vector<int, 4> v(12, 2220);
small_vector
<int, 4> v2(std::move(v));

We know that `v` allocated memory for (at least) 12 values. We know that `v2` received the memory from `v` directly; it didn't copy into its own buffer. And so forth.

There is no reason that iterators should be invalidated with this operation. Similarly:

void func(small_vector<int, 4> &&v)
{
 
if(v.size() > v.buffer_size())
    v
.reserve(v.buffer_size() * 1.5);

  small_vector
<int, 4> v2(std::move(v));
}

This would not be the first instance of such a runtime-defined invalidation system. After all, `vector` permits you to retain iterators on insertion, so long as you insert at the end and so long as your insertions didn't blow the capacity. This `small_vector` movement is the same thing: iterators work, conditioned on certain runtime-defined facts.

This requirement would not place an undue burden on implementations, since it's exactly what they have to do now with `vector`. It wouldn't require them to do anything special to support it. It would just work.

Of course, once you permit this, `small_vector<T, 0>` starts behaving almost exactly like `vector<T>`. So by simply doing the right thing for `small_vector<T, N>` we're suddenly rewriting `vector<T>`.

Oops...

Let's design small vector targeting usage like small_vector<T,20>, not to replace vector.  Let small_vector<T,0> fall out naturally, without special casing it.

And if the most "natural" design for `small_vector<T, 0>` turns it into `vector<T>` without the bool specialization?
 
The correct answer to question 2 seems, to me, to be "yes". If there's going to be competition, then the new type ought to win. It does nobody any good to see template code using `small_vector<T, 1>` just to avoid the `T = bool` case for `vector`. So `small_vector<T, 0>` should be in every way equivalent to `vector`, and it should be interoperable with `vector` (movement from one to the other). With the exception of `vector<bool>` of course; `small_vector` shouldn't have that. If we're going to allow `small_vector` to compete with `vector` outside of its particular optimization domain, then just replace it.


small_vector<T,1> has different properties than vector<T>.  It is _mostly_ the same, but one handles move-only T (or unmovable even) better than the other, etc.

std::deque<T> has far more different properties from vector<T>, but I've seen C++ books that recommend using it when `T` might need to be bool. Why would such books not simply switch their suggestion to `std::small_vector<T, 0>`? And after a while, I could see them simply switching their recommendation to use `small_vector<T, 0>`, with `vector` being for specialized cases (maybe even just `vector<bool>`.

Plus, if there's going to be an eventual STL2 that includes containers, we don't want to have to create a third dynamic array type. That's just too confusing.

To me, the only viable solutions are:

1) Design `small_vector` so that it completely takes over `vector` in all situations (except for people who like `vector<bool>`.
 
2) We don't have `small_vector` at all.


3) design small_vector to be a sbo_vector.  Ignore whether it replaces vector.

So it is OK if it complete takes over `vector` by accident?

I don't believe that the effects of `small_vector` on `vector` should be ignored or glossed over. If for no other reason than that I doubt the standards committee will ignore it. I think they're going to notice when you've added a type that's identical to something they already have that's widely used.

At the very least, any proposal for small_vector it needs to acknowledge the replacement issue.

Nevin Liber

unread,
Jan 9, 2016, 5:13:40 AM1/9/16
to std-pr...@isocpp.org
On 8 January 2016 at 16:32, Nicol Bolas <jmck...@gmail.com> wrote:
So it is OK if it complete takes over `vector` by accident?

Sure.  If a better type wins, so be it.


First you complain that the committee never fixes the vector<bool> wart.

This will fix that wart.  All we have to do afterwards is deprecate std::vector except for the bool specialization. (Some might say that was my secret plan all along :-))

For some reason, you no longer want it fixed.  You would rather wait decades (C++95?) for the mythical STL2 beast (after all, you told us we shouldn't even start on it until there has been much user experience with concepts and ranges, and I'm sure by that time we'll find another excuse why we should endlessly delay it), which will, of course, have exactly the same problem of competing types.  Your great-great-great grandchildren will probably bring up the same objection.

There just isn't pleasing some people...

:-)


Tongue and cheek aside, boost::container::small_vector allows 0.  I suspect (but haven't looked) that the other existing ones do also.  The whole point of standardizing this is because people keep inventing this type; if the committee goes and creates a type that is less useful than the ones in existing practice, we will have failed and wasted both committee time and implementor time.

I consider not allowing 0 to be as bad a wart as specializing vector for bool.  I also consider under specifying things like iterator invalidation to be ridiculous, as people are going to discover and use it anyway.  For instance, people discovered and used the fact that vectors had contiguous data long before the standard mandated it (IIRC C++98 did not while C++03 did).  And you already have my negative opinion on combining vector and small_vector.
--
 Nevin ":-)" Liber  <mailto:ne...@cplusplusguy.com+1-847-691-1404

Nicol Bolas

unread,
Jan 9, 2016, 9:48:07 AM1/9/16
to ISO C++ Standard - Future Proposals, ne...@cplusplusguy.com
On Saturday, January 9, 2016 at 5:13:40 AM UTC-5, Nevin Liber wrote:
On 8 January 2016 at 16:32, Nicol Bolas <jmck...@gmail.com> wrote:
So it is OK if it complete takes over `vector` by accident?

Sure.  If a better type wins, so be it.


First you complain that the committee never fixes the vector<bool> wart.

This will fix that wart.  All we have to do afterwards is deprecate std::vector except for the bool specialization. (Some might say that was my secret plan all along :-))

For some reason, you no longer want it fixed.

I didn't say that. I'm saying that, if `small_vector` progresses forward as it is, it will replace `vector`. And that fact needs to be recognized and discussed for its implications.

Not to mention, as others have suggested, if `small_vector` is going to completely replace `vector`, if `small_vector`, is really just vector 2.0... maybe it should have a more appropriate name. I hate to get all bikesheddy here, but that's not an unreasonable point to make. So again, the issue of `vector` replacement is important.

Tongue and cheek aside, boost::container::small_vector allows 0.  I suspect (but haven't looked) that the other existing ones do also.  The whole point of standardizing this is because people keep inventing this type; if the committee goes and creates a type that is less useful than the ones in existing practice, we will have failed and wasted both committee time and implementor time.

Actually, that brings up an interesting point. Boost's documentation for small_vector is decidedly lacking. As in... there isn't any. So it's not clear whether `boost::small_vector` invalidates iterators on movement or not.

Granted, that doesn't mean that we ought to always invalidate iterators, for previously covered reasons. But it's hard to use existing practice as a justification when existing practice is not well-specified.

I consider not allowing 0 to be as bad a wart as specializing vector for bool.

There's no workaround for `vector<bool>`. If you need an actual contiguous, growing array of actual `bool` types, you're SOL unless you write your own.

Whereas `small_vector<T, 1>` is almost as good as `small_vector<T, 0>`. Now, I'm not saying we should disallow 0; that' would be silly and accomplish nothing, if for no other reason than the ease of the aforementioned workaround.

My point is simply that a problem which can be worked around is never as bad as a problem which cannot.

Christopher Jefferson

unread,
Jan 9, 2016, 9:49:27 AM1/9/16
to std-pr...@isocpp.org
On 8 January 2016 at 22:32, Nicol Bolas <jmck...@gmail.com> wrote:
> On Friday, January 8, 2016 at 11:06:46 AM UTC-5, Tony V E wrote:

>
> So what you're saying is that, if I have a `small_vector<T, 4>` which
> contains 12 elements, and I move it to another `small_vector<T, 4>`, you're
> telling me that the specification of `small_vector` should not allow me to
> keep the iterators?
>

Here is one reason to forbid keeping the iterators.

In my implementation of small_vector, sizeof(small_vector<char, 3>) ==
sizeof(small_vector<char, 8>), because of padding. Giving the space is
already used, small_vector<char, 3> stores 8 elements before it
allocates from the heap.

Of course, I could (I haven't bothered) add a member telling users
when heap allocation will (or has) occurred.

Chris

David Krauss

unread,
Jan 9, 2016, 10:50:32 AM1/9/16
to std-pr...@isocpp.org

On 2016–01–09, at 10:48 PM, Nicol Bolas <jmck...@gmail.com> wrote:

I didn't say that. I'm saying that, if `small_vector` progresses forward as it is, it will replace `vector`. And that fact needs to be recognized and discussed for its implications.

small_vector is more complex (prone to bloat) and less capable (no iterator transfers, at least not reliably). Some folks used to say that deque would displace vector, and it has better iterator stability.

Note that a typical layout of std::vector is three machine-words: {begin, end, capacity_end}. When small_vector storage is local, begin and capacity_end become redundant. Implementations will likely want to recycle at least one of those words for the storage block. Whether or not the user cares about bloat, implementation commonality (such as class inheritance) may involve performance trade-offs in both small_vector and vector.

Also, just a note, one use-case would be forbidding heap allocation and letting the local capacity be an upper bound on the size. This could be done by a custom allocator which always throws or asserts on allocation. Unless there’s a better means to the end, the standard library should provide such an allocator.

Just a bikeshed suggestion: “small” describes the precondition to optimization, not the class. Typically sizeof(small_vector<T>) > sizeof(vector<T>). Perhaps consider local_vector.

Tony V E

unread,
Jan 9, 2016, 12:30:32 PM1/9/16
to Christopher Jefferson
Also, once a small vector allocates, then gets small again, does it free the allocation and go back to local memory? I would guess not. So checking the number of elements doesn't tell you if iterators will be maintained. I guess you need to check capacity or we offer an explicit function?

 I imagine code trying to work with maybe-valid-if-this-or-that terators‎ is going to be convoluted and error-prone. 


Sent from my BlackBerry portable Babbage Device
  Original Message  
From: Christopher Jefferson
Sent: Saturday, January 9, 2016 9:49 AM
To: std-pr...@isocpp.org
Reply To: std-pr...@isocpp.org
Subject: Re: [std-proposals] Views on a small_vector<T, N>?

Nicol Bolas

unread,
Jan 9, 2016, 4:43:46 PM1/9/16
to ISO C++ Standard - Future Proposals
On Saturday, January 9, 2016 at 12:30:32 PM UTC-5, Tony V E wrote:
Also, once a small vector allocates, then gets small again, does it free the allocation and go back to local memory? I would guess not. So checking the number of elements doesn't tell you if iterators will be maintained. I guess you need to check capacity or we offer an explicit function?

It should be no different from doing this:

small_vector<int, 6> v;
v
.reserve(12);

This should be required to allocate memory.

`vector` is not allowed to reallocate just because you reduce the size. The only operations that are allowed to reallocate are insertions, explicit `reserve`, and explicit `shrink_to_fit` calls. The same should be true of `small_vector`.

A `small_vector` is only small when its capacity is less than the requested size.

 I imagine code trying to work with maybe-valid-if-this-or-that terators‎ is going to be convoluted and error-prone.

To be honest, it's no more convoluted than `vector'`s already convoluted rules for invalidation. As long as you can put a check in the code that expresses when invalidation will and won't happen, it's fine.

Nicol Bolas

unread,
Jan 9, 2016, 4:53:14 PM1/9/16
to ISO C++ Standard - Future Proposals
On Saturday, January 9, 2016 at 9:49:27 AM UTC-5, Chris Jefferson wrote:
On 8 January 2016 at 22:32, Nicol Bolas <jmck...@gmail.com> wrote:
> On Friday, January 8, 2016 at 11:06:46 AM UTC-5, Tony V E wrote:

>
> So what you're saying is that, if I have a `small_vector<T, 4>` which
> contains 12 elements, and I move it to another `small_vector<T, 4>`, you're
> telling me that the specification of `small_vector` should not allow me to
> keep the iterators?
>

Here is one reason to forbid keeping the iterators.

In my implementation of small_vector, sizeof(small_vector<char, 3>) ==
sizeof(small_vector<char, 8>), because of padding. Giving the space is
already used, small_vector<char, 3> stores 8 elements before it
allocates from the heap.

I would never use such a small_vector.

If I ask for a `small_vector<T, 3>`, then I mean for it to allocate memory once the capacity goes past 4. Regardless of what T is.

We tolerate this behavior from std::string because it's not required to use small allocations and we aren't allowed to specify them. That's not the case for small_vector. It should do exactly what it is told to do.

The small_vector buffer size is not a recommendation; it's a requirement. If you want your kind of behavior, we can have a metafunction that computes a minimum small element count, based on the size of `T` and the size of `small_vector`. Perhaps we could have the default size calculate such an optimal, implementation-dependent size.

But some of us want the type to do exactly what we tell it to do. Personally, I've had enough of implementations being able to weasel around `reserve` behavior and such, and I hope N4524 (enforced capacity requirements) gets in for exactly this reason.

Nicol Bolas

unread,
Jan 9, 2016, 5:05:10 PM1/9/16
to ISO C++ Standard - Future Proposals
On Saturday, January 9, 2016 at 10:50:32 AM UTC-5, David Krauss wrote:
On 2016–01–09, at 10:48 PM, Nicol Bolas <jmck...@gmail.com> wrote:

I didn't say that. I'm saying that, if `small_vector` progresses forward as it is, it will replace `vector`. And that fact needs to be recognized and discussed for its implications.

small_vector is more complex (prone to bloat) and less capable (no iterator transfers, at least not reliably).

When people talk about `small_vector` replacing `vector`, it would be `small_vector<T, 0>` that does the replacing. Which by all rights ought to be functionally identical to `vector`, since it has no small buffer size.

Indeed, if `small_vector<T, 0>` is allowed, it's going to have to, at some level, do some specialization of the implementation. After all, you can't have zero-sized arrays in C++, so the usual method of having a memory buffer that is one thing or the other doesn't work. So by specializing it a bit more, all implementation overhead goes away, and `small_vector<T, 0>` is just `vector<T>` with a different name and a few extraneous functions.
 
Also, just a note, one use-case would be forbidding heap allocation and letting the local capacity be an upper bound on the size.

No, that's a different class. A useful one to be sure, but different.

`small_vector` is for when I know that I will mostly live within a certain reasonable boundary, but sometimes will blow past it. It is purely an optimization of `vector` for certain specific use cases.

What you're talking about is Boost's `static_vector`, which is for when I know I won't exceed a certain limit, and I want that object to live entirely on the stack. I think `static_vector` is, while useful, a much more narrow use case.

Ross Smith

unread,
Jan 10, 2016, 2:57:55 PM1/10/16
to std-pr...@isocpp.org
On 2016-01-08 00:02, Tristan Brindle wrote:
>
> ii. Declaring vector<T>to be an alias for vararray<T, 0>today would
> require vararray<bool, 0>to be specialised, due to (i).

You can get around that:

template <typename T, size_t N> class vararray;
class bitarray; // = current vector<bool>

template <typename T, size_t N> class vector_helper {
using type = vararray<T, N>;
};

template <> class vector_helper<bool, 0> {
using type = bitarray;
};

template <typename T, size_t N> using vector
= typename vector_helper<T, N>::type;


Ross Smith


Agustín K-ballo Bergé

unread,
Jan 10, 2016, 3:14:43 PM1/10/16
to std-pr...@isocpp.org
On 1/10/2016 4:57 PM, Ross Smith wrote:
> On 2016-01-08 00:02, Tristan Brindle wrote:
>>
>> ii. Declaring vector<T>to be an alias for vararray<T, 0>today would
>> require vararray<bool, 0>to be specialised, due to (i).
>
> You can get around that:

Not transparently, no.

> template <typename T, size_t N> class vararray;
> class bitarray; // = current vector<bool>
>
> template <typename T, size_t N> class vector_helper {
> using type = vararray<T, N>;
> };
>
> template <> class vector_helper<bool, 0> {
> using type = bitarray;
> };

Those nested `type`s are private.

> template <typename T, size_t N> using vector
> = typename vector_helper<T, N>::type;

This alias being dependent means you'll lose deductive powers. You'd
probably still want the following snippet to work:

template <typename T, std::size_t N>
void foo(vector<T, N> const& vs) {}

vector<int, 42> vs;
foo(vs); // error: couldn't infer template argument

Regards,
--
Agustín K-ballo Bergé.-
http://talesofcpp.fusionfenix.com
Reply all
Reply to author
Forward
0 new messages