Add all useful array types to the standard library


Matthew Fioravante

Mar 18, 2017, 3:13:22 PM
to ISO C++ Standard - Future Proposals
The array is the most fundamental and important data structure in computer science. If a problem can be efficiently solved with arrays
there is not likely to be a more optimal solution on modern hardware.

The C++ standard library provides only 2 array types: the static, fixed size std::array<T,N> and the dynamic, growable std::vector<T,A>.

These are a good foundation, but there are a lot of other array variations.
In performance critical code, I've found all of the variations below to be very useful.

I'd really like to see these in the standard and am considering writing a paper. Many of them are already being discussed elsewhere or have active proposals. Here is the rough outline of the idea. A real paper would need a lot more detail and examples of course.

The different types of arrays you can create are split across a few dimensions. All of them have trade-offs.

One dimension is the storage policy:

static - The size is determined at compile time. When you can set compile time upper bounds on storage, you can construct tightly packed
data structures with no pointer indirections. That's about the best you could hope for in terms of cache efficiency. The limitation of course
is the fixed size. What do you do if the application wants to use more? Silently truncate? Throw an error?

dynamic - The size is determined at runtime and allocated on the heap (or maybe on the stack via magic tricks like alloca()). This is of course more
flexible for all the obvious reasons, and the array object can be safely created locally on the stack without worrying about using too much memory.
Dynamic arrays can also be moved efficiently in O(1) by swapping pointers, whereas with a static array you're forced to move all of the
individual elements in O(N), one after another. The downsides are the costs of allocation and deallocation. Also, if you have structures with lots
of different vectors, you risk cache misses as you do random access memory hops when accessing the different buffers.

sbo - The small buffer optimization, where you have some small amount of static memory and fall back to dynamic allocation when the
capacity exceeds that amount. This is a great compromise, especially if you have a system where rare outliers in memory use need to
be handled and it's acceptable to pay a penalty for them. It's also not hard to add some instrumentation and tune the static sizes after
the fact, so that for the 80% side of the 80/20 rule your application runs super fast and doesn't allocate memory.

Another is whether the storage is fixed or growable.

Growable Storage - This is std::vector. It is the most convenient and default option. It always works, and the memory is laid out
linearly so normal access is super fast. Growable means that in theory your application can run out of memory and crash. This is not
such a bad thing if the limits are sufficiently high and you are not concerned about, or are otherwise protected from, malicious denial of
service attacks. It also leaves room for hardware memory upgrades to improve the capabilities of the
application. Growable arrays require separate concepts of "size" and "capacity", which means you need to store more
data in the array object to keep track of this information. The automatic resizing, while amortized, is still expensive when it happens, and
if you have a real-time application that can't handle random slowdowns like this then automatically growable arrays may be off the table.

Fixed Storage - Fixed size storage is less flexible but offers some performance benefits. Mostly, you don't need to keep track of a
capacity, and you can also choose to encode the size at compile time (std::array). Since the size is fixed, dynamic memory allocation can be
done once, and exactly for the space requested.

Fixed size arrays can also store non-copyable/non-movable elements, whereas growable arrays require the objects can be relocated or copied.

One big limitation fixed storage has right now is that as soon as you construct the array itself, all of the elements have to be constructed too.
If your T's are movable, it's easy to just default construct the array, then loop through, carefully construct the real data on the side, and move it in.

If T's default constructor is complex and not inline, this default-then-move approach can be less efficient. It's also impossible if T is not movable or copyable.

For non-movable types, fixed arrays make it difficult to pass different constructor arguments, or even call different constructors for each individual element.
I once rigged up a technique for doing this using lambdas and iterators, but it wasn't pretty.


Finally, the last dimension is ownership.

Owns the data - Pretty much all array types actually own and manage the underlying buffer.

Does not own the data - This is array_view, which I'll describe more later.

So given all of these dimensions, how can we combine them to create useful types? Here are the ones I've used:

Note that template arguments are:
T - the type of the data
N - a compile time size
A - an allocator

vector<T,A> - Dynamic and Growable

The standard bread and butter of C++. Any implementation will require 3 data members for the data pointer, size, and capacity. It can be implemented using a pointer and 2 ints, or 3 pointers. On 64 bit platforms, std::vector will likely be 24 bytes. If you know you will never need to store more than 4 billion elements (an extremely reasonable assumption), you can shave it down to 16 bytes using a T* and 2 uint32_t's. Unfortunately, std::vector doesn't provide a way for us to switch on this optimization.
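
For illustration, this is roughly the compact layout being described (a hypothetical sketch assuming 8-byte pointers; std::vector itself offers no such knob):

#include <cstdint>

// Hypothetical 16-byte layout: a pointer plus two 32-bit counters instead of three 8-byte members.
template <typename T>
struct compact_vector_storage {
    T*            data;      // 8 bytes on a typical 64-bit platform
    std::uint32_t size;      // 4 bytes, caps the element count at ~4 billion
    std::uint32_t capacity;  // 4 bytes
};
static_assert(sizeof(compact_vector_storage<int>) == 16,
              "assumes 8-byte pointers and no extra padding");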

There's also the choice of the growth factor, with 2 being the common naive one and 1.5 providing better underlying memory reuse.

Finally, there's also the discussion of realloc() that keeps popping up, and whether or not a standard version of realloc() would make the vector resize operation more efficient.

static_vector<T,N> - Static capacity, growable


This data structure has the exact same interface as std::vector. The difference is that the capacity is set at compile time and not stored within the object. If you try to insert an element once the capacity is full the container will throw an exception instead of allocating.

This one is less common. You get the cache benefits of a compile time capacity, which means you can embed this object directly in other data structures without pointer indirections. It's essentially a std::array<T,N>, with an extra counter embedded inside to keep track of how many objects are "active".

The resulting size of this data structure will be sizeof(T) * N + the size of a 64 bit or 32 bit size counter, with additional padding as needed.
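
A minimal sketch of what such a type could look like (the name and interface are assumed; a real implementation also needs a destructor that destroys the active elements, emplace_back, iterators, and so on):

#include <cstddef>
#include <new>
#include <stdexcept>

template <typename T, std::size_t N>
class static_vector {
    alignas(T) unsigned char storage_[N * sizeof(T)]; // raw storage, elements constructed on demand
    std::size_t size_ = 0;                            // how many objects are "active"
public:
    void push_back(const T& v) {
        if (size_ == N)
            throw std::length_error("static_vector capacity exceeded"); // full: throw instead of allocating
        ::new (storage_ + size_ * sizeof(T)) T(v);    // placement-new into the raw buffer
        ++size_;
    }
    T& operator[](std::size_t i) { return *std::launder(reinterpret_cast<T*>(storage_ + i * sizeof(T))); }
    std::size_t size() const noexcept { return size_; }
    static constexpr std::size_t capacity() noexcept { return N; }
};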

One alternative to this is using an array<T,N> and having some state of T represent whether or not the object is active.

A very common example is an array<T*,N> where the index of the first null pointer indicates the "size". This is actually faster to iterate in a for loop than static_vector<T,N> since you can do only a single null check. You don't need to store a counter, but you may need to store a dummy sentinel at the end to terminate the loop. The downside is that the size() operation becomes O(N) instead of O(1).
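
For example, the iteration over such a sentinel terminated array looks roughly like this (a sketch; it assumes the caller guarantees that a trailing null entry is always present):

#include <array>
#include <cstddef>

template <typename T, std::size_t N, typename F>
void for_each_active(const std::array<T*, N>& slots, F&& fn) {
    // Relies on the invariant that at least one entry is the null sentinel.
    for (T* const* p = slots.data(); *p != nullptr; ++p)  // one null check per element, no stored counter
        fn(**p);
}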

For other cases, you may need to store an additional bool to say whether each element is "alive" or not. In terms of memory usage, you trade off the cost of a single counter against storing additional "alive" state in your T objects. Because of padding, storing an additional flag can actually be very expensive in terms of memory use and cache efficiency. For non-POD complex types, it also requires that all of them are "constructed" all the time, which may not be ideal.


small_vector<T,N,A> - Sbo capacity, growable.


There have been a lot of other threads talking about small_vector and examples of its use in the wild, including LLVM, Google, Facebook, game engines, and more, so I'm not going to go into huge detail here. As of C++11, std::string implementations already typically use the sbo storage policy.

small_vector is a probabilistic bet. If most of the time your maximum size is <= N, you have the potential to gain large performance benefits. In particular a vector<small_vector<T,N>> will end up being a single linear buffer. If you end up exceeding N most of the time, that sbo space just becomes wasted memory and a simple vector would be faster.

Any small_vector implementation must allow for N to be configurable by the user. Unlike a string, vector has a nearly infinite set of use cases and performance critical applications will need to tune it carefully.
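
The member layout is typically along these lines (a rough sketch of the general approach, not any particular library's implementation):

#include <cstddef>
#include <memory>

template <typename T, std::size_t N, typename A = std::allocator<T>>
class small_vector {
    T* data_;                                         // points at inline_ while the contents fit, at heap memory after the first spill
    std::size_t size_;
    std::size_t capacity_;
    alignas(T) unsigned char inline_[N * sizeof(T)];  // the small buffer; no allocation happens while it suffices
    // allocator, growth, and spill-to-heap logic omitted
};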

array<T,N> - Static and Fixed (also T[N])


The fixed size statically allocated array. You get all of the benefits of the memory layout along with the limitations of fixed storage mentioned before.

The size is of course sizeof(T) * N.

Not much to say here, we've all been using this for decades.


dyn_array<T,A> - Dynamic and Fixed

Not to be confused with the earlier std::dynarray attempt at creating a magic array type with stack allocation.
Of all of the new types mentioned here, after small_vector, this is the most important one missing from the standard.

Essentially, this is like a std::vector without the insert, push_back, and erase methods. Not having these capabilities seems like a limitation, but it's also a strength.

* The underlying memory can be allocated exactly, and only once, during construction. vector implementations may speculatively allocate more, betting that you will call push_back() later. Unless you use reserve(), inserting a bunch of things into a vector will result in multiple memory allocations and deallocations, slowing down your application and fragmenting your heap.
* Smaller, more cache friendly footprint. You only need a single pointer and a size, so on 64bit systems, sizeof(dyn_array<T,A>) == 16. On 32 bit systems, the size is only 8.
* Works with non-movable types, with the construction caveats I mentioned before.
* You can store const T, and still be able to move the array but not copy it.
* Application correctness: If the use case demands an array that cannot be resized, dyn_array provides exactly that. It becomes impossible for users to break this contract.

Compared to std::array, the advantages are:
* Determine the size at runtime
* O(1) move operations
* No worry about blowing out stack space with large allocations.

Finally, unlike array, dyn_array still has the ability to be resized at runtime. It's just that, unlike vector, it has to be done manually by the user. In my implementations of this, I've just allowed copy and/or move operations to change the size. You could also argue that including the resize() method still makes sense.

auto a = dyn_array<int>(5, 0); // Initialize to {0, 0, 0, 0, 0}
a = dyn_array<int>(2, 1);      // Change to {1, 1}

Finally, if you need some flexibility, you can also create your data with a std::vector and then move it into a dyn_array. If these types use the same allocator, we should allow moves between them:

// Do setup
std::vector<T> temp;
temp.emplace_back(stuff...);
temp.emplace_back(more complicated stuff...);

// All done, now do permanent storage.
auto permanent = dyn_array<T>(std::move(temp));
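
For concreteness, here is a minimal sketch of the core of such a type (simplified: this version requires default-constructible T and omits allocator support; a real implementation would use raw storage and placement-new so that non-movable and non-default-constructible element types work too):

#include <cstddef>
#include <memory>

template <typename T>
class dyn_array {
    std::unique_ptr<T[]> data_;   // single owning pointer
    std::size_t size_ = 0;        // fixed at construction, so sizeof(dyn_array) is 16 on typical 64-bit platforms
public:
    dyn_array() = default;
    dyn_array(std::size_t n, const T& value) : data_(new T[n]), size_(n) {
        for (std::size_t i = 0; i < n; ++i) data_[i] = value;  // exact, one-time allocation
    }
    dyn_array(dyn_array&&) noexcept = default;             // O(1) move: transfers the pointer and size
    dyn_array& operator=(dyn_array&&) noexcept = default;

    T*          data() noexcept { return data_.get(); }
    std::size_t size() const noexcept { return size_; }
    T&          operator[](std::size_t i) { return data_[i]; }
};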


array_view<T> - Non-owning view

There have already been several proposals for this. Personally I love array_view and use it all the time.

The best part is that it can be used with all of the above array types. This allows you to write generic functions which accept any kind of array. Like string_view, it allows generic code without the complexity and compile time overhead of using templates. Bjarne Stroustrup's GSL library has this and they call it "span<T>".

The standard 1-dimensional array_view requires only a begin and end pointer.
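
In its simplest form it is just that pointer pair, which is already enough for the generic-function use case (a sketch with assumed names; the actual proposals add constructors from containers, subviews, bounds-checked access, etc.):

#include <cstddef>

template <typename T>
struct array_view {                   // non-owning [begin, end) pointer pair
    T* first = nullptr;
    T* last  = nullptr;
    T*          begin() const noexcept { return first; }
    T*          end()   const noexcept { return last; }
    std::size_t size()  const noexcept { return static_cast<std::size_t>(last - first); }
};

// One non-template function that accepts vector, array, dyn_array, C arrays, ...
// e.g. sum({v.data(), v.data() + v.size()}) for a std::vector<double> v.
double sum(array_view<const double> xs) {
    double total = 0;
    for (double x : xs) total += x;
    return total;
}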

There's also discussion of how to handle multiple dimensions: I can have a flat buffer of memory and construct a multi-dimensional view over it.
While multi-dimensional views can be great for numeric code, I find that the single dimensional array_view is an absolutely essential tool. Again, the 80/20 principle.

For every dimension you add, you need to include additional state to store it, either in memory at runtime or at compile time. I'm leaving this subject alone for now as the various array_view proposals are trying to tackle it.

static_array_view<T,N> - Non-owning view with static size.


This one seems a bit odd and obscure. Essentially this object stores only a single pointer and has a compile time configured size.

Why would you possibly want to use this thing? Personally I invented this one time because there was some generic code I wanted to optimize for single element case and multi-element case.

The generic code was written for array_view<T> and looped over it several times doing all kinds of nasty business logic. This code was absolutely performance critical. It was extremely common that we would pass in an array_view with size 1, with some scenarios never using the multi-element capability at all.

Writing 2 versions of the code was a maintenance nightmare I didn't want to deal with. Instead of writing all of this code twice, once with loops and again without them, I simply templated it.

The multi-element case instantiated the template with array_view<T>; the single-element case with static_array_view<T,1>. In the static case, the compiler was smart enough to remove all of the loops and compile the code down as if I had written the single-element version by hand.
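
In outline, the trick looks like this (hypothetical names and a trivial stand-in for the business logic; the point is that one shared template instantiates down to straight-line code for the N == 1 view):

#include <cstddef>

template <typename T, std::size_t N>
struct static_array_view {            // a single pointer, size fixed at compile time
    T* p;
    T* begin() const noexcept { return p; }
    T* end()   const noexcept { return p + N; }
};

template <typename View>
int process(View items) {             // the shared "business logic", written once
    int result = 0;
    for (auto& item : items)          // with static_array_view<const int, 1> the compiler removes the loop entirely
        result += item;
    return result;
}

// Multi-element callers instantiate process with array_view<const int>;
// single-element callers pass static_array_view<const int, 1>{&x}.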


None of these array types are impossible to write on your own, but it's tedious and complicated. It takes a lot of time to write all of the access methods, and then you must unit test them hard, as it's easy to have a bug in one that your application happens to not exercise often, or ever.

Would you use any of these types in your code?

inkwizyt...@gmail.com

Mar 18, 2017, 5:07:54 PM
to ISO C++ Standard - Future Proposals


On Saturday, March 18, 2017 at 8:13:22 PM UTC+1, Matthew Fioravante wrote:
The array is the most fundamental and important data structure in computer science. If a problem can be efficiently solved with arrays
there is not likely to be a more optimal solution on modern hardware.

The C++ standard library provides only 2 array types. A static and fixed size std::array<T,N> and a dynamic and growable std::vector<T,N>.

These are a good foundation, but there are a lot of different array variations.
In performance critical code, I've found all of these to be very useful.

I'd really like to see these in the standard and am considering writing a paper. Many of them are already being discussed elsewhere or have active proposals. Here is the rough outline of the idea. A real paper would need a lot more detail and examples of course.

The different types of arrays you can create are split across a few dimensions. All of them have trade-offs.

 
Nicol Bolas suggested adding `omi_vector`, which would be configurable by type traits to change its behavior as you see fit.

Matthew Fioravante

Mar 18, 2017, 5:20:32 PM
to ISO C++ Standard - Future Proposals


On Saturday, March 18, 2017 at 4:07:54 PM UTC-5, Marcin Jaczewski wrote:
 Nicol Bolas suggest adding `omi_vector` that will be configurable by type traits to change behavior as you see fit.

That's a great idea. Especially if it also lets me adjust the size counters to be uint32_t.

In that scenario, there are a few more configurations to consider.

* Instead of storing the capacity at runtime, make it a function of the size. One example I used once to save space was making the capacity always the next power of 2 greater than or equal to the size. Checking whether you need to resize then requires only a fast is-power-of-2 check (see the sketch after this list).
* Sentinel arrays. Like the example I mentioned with std::array<T*,N>, make size() an O(N) function of the elements. For the most efficient iteration, optionally require a permanent sentinel element at the end of the array that cannot be overwritten.
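
As a sketch, the check that replaces a stored capacity under the power-of-2 policy is just this (assuming that policy; the empty case still needs separate handling):

#include <cstdint>

constexpr bool is_pow2(std::uint64_t x) noexcept {
    return x != 0 && (x & (x - 1)) == 0;   // exactly one bit set
}
// With capacity defined as the next power of 2 >= size, the buffer is exactly full
// whenever size is itself a power of 2, so that is the only case where push_back must regrow.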

Magnus Fromreide

Mar 19, 2017, 3:05:06 PM
to ISO C++ Standard - Future Proposals
On Sat, Mar 18, 2017 at 12:13:22PM -0700, Matthew Fioravante wrote:
> The array is the most fundamental and important data structure in computer
> science. If a problem can be efficiently solved with arrays
> there is not likely to be a more optimal solution on modern hardware.
>
> The C++ standard library provides only 2 array types. A static and fixed
> size std::array<T,N> and a dynamic and growable std::vector<T,N>.
>
> These are a good foundation, but there are a lot of different array
> variations.
> In performance critical code, I've found all of these to be very useful.
>
> I'd really like to see these in the standard and am considering writing a
> paper. Many of them are already being discussed elsewhere or have active
> proposals. Here is the rough outline of the idea. A real paper would need a
> lot more detail and examples of course.
>
> The different types of arrays you can create are split across a few
> dimensions. All of them have trade-offs.
>
> Another is whether the storage is fixed or growable.
>
> Growable Storage - This is std::vector. It is the most convenient and
> default option. It always works and the memory is laid out
> linearly so normal access is super fast. Growable means in theory your
> application can possibly run out of memory and crash. This is not
> such a bad thing if the limits are sufficiently high and you are not
> concerned about or are otherwise protected from malicious denial of
> service attacks. It also leaves room for hardware memory upgrades to
> improve the capabilities of the
> application. Growable arrays require that you have different concepts of
> "size" and "capacity", which means you need to store more
> data in the array object to keep track of this information. The automatic
> resizing while amortized is still expensive when it happens, and
> if you have a real-time application that can't handle random slowdowns like
> this then automatically growable arrays may be off the table.
>
> Fixed Storage - Fixed size storage is less flexible but offers some
> performance benefits. Mostly in that you don't need to keep track of a
> capacity, you can also choose to encode the size at compile time
> (std::array). Since the size is fixed, dynamic memory allocation can be
> done once and done exactly for the space requested.
>
> Fixed size arrays can also store non-copyable/non-movable elements, whereas
> growable arrays require the objects can be relocated or copied.

This is actually false: you can store non-movable objects in a growable
array, you just can't resize it while it is non-empty.

This means you have to use "reserve" prior to putting objects into it, but you
can still use it dynamically.

/MF

Thomas Köppe

Mar 20, 2017, 4:35:12 PM
to ISO C++ Standard - Future Proposals
You may also like to consider existing proposals and LEWG's response to them:
  • P0210r0, a light-weight, compact dynamic array, by yours truly. Reaction: meh, too specific, not necessary.
  • P0274r0, clump, a vector-like container with embedded storage, by Nevin. LEWG liked this one quite a bit I believe, though we haven't seen it since.

3dw...@verizon.net

Apr 14, 2017, 12:34:22 AM
to ISO C++ Standard - Future Proposals
I've often thought that a capacity constructor for vector would be nice (probably with a capacity_t tag or something). This would take the place of one of your vectors, I think. Plus one for the shock vector!

3dw...@verizon.net

Apr 14, 2017, 12:35:51 AM
to ISO C++ Standard - Future Proposals
I mean short or sbo vector not shock vector.

Tony V E

Apr 14, 2017, 7:15:58 AM
to ISO C++ Standard - Future Proposals
I don't know what a shock vector is, but I want one.

Sent from my BlackBerry portable Babbage Device
  Original Message  
From: 3dw...@verizon.net
Sent: Friday, April 14, 2017 12:35 AM
To: ISO C++ Standard - Future Proposals
Reply To: std-pr...@isocpp.org
Subject: [std-proposals] Add all useful array types to the standard library

I mean short or sbo vector not shock vector.


Thiago Macieira

Apr 14, 2017, 12:48:47 PM
to std-pr...@isocpp.org
On Friday, April 14, 2017, at 04:15:54 PDT, Tony V E wrote:
> I don't know what a shock vector is, but I want one.

A shock vector is what happens to a vector when you reply to mailing lists
with a phone.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center

Bengt Gustafsson

Apr 15, 2017, 7:33:43 PM
to ISO C++ Standard - Future Proposals
 
As a variant it would be possible to use different allocators to achieve these different results.

This could be done using these two allocators:

     FixedAllocator<T, N> // Has a fixed block to allocate from, part of its own size.
     SooAllocator<T, N, A> // Has N elements inside and can also allocate from the subordinate allocator A which defaults to the default allocator

To get all the variants we then use std::vector with one of these, and to get the dyn_array behaviour we also need a template class which allocates one block from its allocator at construction time; let's call it DynArray for now:

DynArray<T, std::allocator<T>>        // Always allocates a heap block
DynArray<T, FixedAllocator<T, N>>     // Consumes N elements worth of memory and allows up to N as the size constructor parameter.
DynArray<T, SooAllocator<T, N>>       // Allocates on the heap if the requirement is larger than N
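
A rough sketch of how the first of these could look (hypothetical; a real allocator also needs the usual rebind/equality machinery, and this one only supports a single live allocation at a time):

#include <cstddef>
#include <new>

template <typename T, std::size_t N>
struct FixedAllocator {
    using value_type = T;
    alignas(T) unsigned char block[N * sizeof(T)];   // the fixed block, part of the allocator's own size

    T* allocate(std::size_t n) {
        if (n > N) throw std::bad_alloc{};           // no fallback: more than N elements is an error
        return reinterpret_cast<T*>(block);          // hands out the embedded block
    }
    void deallocate(T*, std::size_t) noexcept {}     // nothing to release
};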

This is sort of elegant as it allows composing the resizable behaviour (DynArray or std::vector) freely with the different allocators, but there are drawbacks:

- The types are more cumbersome to write out.
- For performance/size reasons DynArray and std::vector may be better off specialized for the different allocators. This reduces the terseness of the code.

These drawbacks together may well cause the addition of type aliases for "good" combinations, which takes us back to the same old hairy naming issue, and if each of these corresponds to a specialized implementation, what has actually been gained?

So my conclusion is that while it looked nice for a while, this may not be a better idea after all. But now that I have written it down I can just as well send it; maybe someone else gets a better idea out of this...

Bengt


Nicol Bolas

Apr 15, 2017, 10:33:32 PM
to ISO C++ Standard - Future Proposals
On Saturday, April 15, 2017 at 7:33:43 PM UTC-4, Bengt Gustafsson wrote:
 
As a variant it would be possible to use different allocators to achieve these different results.

This could be done using these two allocators:

No, it could not.

You can get some of the effect of it. But without partial specializations of `vector` that explicitly have the desired properties, it becomes a big kludge.

`FixedAllocator` cannot get rid of two now worthless pointers inside `vector`. Neither can `SooAllocator` somehow create a union between the pointers and the fixed block of elements.

Furthermore, `small_vector` and `static_vector` have fundamental behavioral differences from `vector` which cannot merely be a matter of allocators. Moving a `static_vector` into another `static_vector` will always invoke `T`'s move constructor; for `vector`, it never will, no matter the allocator. `small_vector` has different iterator invalidation behavior.

The current specification of `vector` doesn't allow these things. So if you're going to force these things, and complicate the specification, it would be much better to use a true parameterization of `vector`'s behavior, rather than an allocator-based hack.

Bengt Gustafsson

Apr 17, 2017, 11:18:57 AM
to ISO C++ Standard - Future Proposals


Den söndag 16 april 2017 kl. 04:33:32 UTC+2 skrev Nicol Bolas:
On Saturday, April 15, 2017 at 7:33:43 PM UTC-4, Bengt Gustafsson wrote:
 
As a variant it would be possible to use different allocators to achieve these different results.

This could be done using these two allocators:

No, it could not.

You can get some of the effect of it. But without partial specializations of `vector` that explicitly have the desired properties, it becomes a big kludge.

`FixedAllocator` cannot get rid of two now worthless pointers inside `vector`. Neither can `SooAllocator` somehow create a union between the pointers and the fixed block of elements.
Well, I did write that the actual implementation of vector would probably be specialized for the different allocators as a QoI exercise.
 

Furthermore, `small_vector` and `static_vector` has fundamental behavioral differences from `vector` which cannot merely be a matter of allocators. Moving a `static_vector` into another `static_vector` will always invoke `T`'s move constructor; for `vector`, it never will, no matter the allocator. `small_vector` has different iterator invalidation behavior.
Yes, this would be a somewhat valid concern, but is it so much harder to know that a vector with a certain allocator will invalidate its iterators at move assignment than that a container with a slightly different name (or with template parameters of other sorts than allocators) would?

I would be reluctant to rely on iterators outliving the move assignment of their vector, but of course there is code out there that does. If such code is templated on the container to use, both solutions carry the same risk. If such code is templated on the allocator but uses a vector of the supplied allocator type, then the allocator based solution would maybe fail silently while the new-class solution would not be applicable at all. This is a very small corner.
 

The current specification of `vector` doesn't allow these things. So if you're going to force these things, and complicate the specification, it would be much better to use a true parameterization of `vector`'s behavior, rather than an allocator-based hack.
But unfortunately we can't get such true parameterisation unless we start afresh with a new container name. I was basically just exploring the possibility of avoiding having to have two names for the same basic type of container where one is "better" and the other is "simpler".

As for your omi_vector, I guess it is similar to how Eigen handles its configurable matrices. This is powerful but rather hard to set up with the 6 template parameters, so there are quite a few alias templates for common cases. It would be good if you could post the specification of this class, as Google could not find it for me.
 

Matthew Fioravante

Apr 17, 2017, 11:58:37 AM
to ISO C++ Standard - Future Proposals


On Sunday, March 19, 2017 at 2:05:06 PM UTC-5, Magnus Fromreide wrote:
This is actuallly false, you can store non-movable objects in a growable 
array, you just can't resize it while it is non-empty. 

This means you have to use "reserve" prior to putting objects into it, but you 
can still use it dynamically. 

/MF 

That doesn't play well with concepts. If I have a resizable vector, then at compile time I might want to check that T is movable when I call push_back. This is a compile time check which cannot know whether or not sufficient size has been reserved ahead of time.

If the use case is a vector that is pre-allocated and will never attempt to do a move of T, then use a different array type which is designed to do exactly that.
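
The kind of compile-time requirement meant here would look roughly like this (written in C++20 concepts syntax purely for illustration; growable_vector is a hypothetical stand-in):

#include <concepts>

template <typename T>
class growable_vector {
public:
    // push_back may have to reallocate and relocate the existing elements,
    // so it is only available when T can be move-constructed:
    void push_back(const T& value) requires std::move_constructible<T>;
};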

On Saturday, April 15, 2017 at 6:33:43 PM UTC-5, Bengt Gustafsson wrote:
 
As a variant it would be possible to use different allocators to achieve these different results.

This could be done using these two allocators:

     FixedAllocator<T, N> // Has a fixed block to allocate from, part of its own size.
     SooAllocator<T, N, A> // Has N elements inside and can also allocate from the subordinate allocator A which defaults to the default allocator

For reasons already mentioned, allocators are a dead end. You need to actually modify the vector itself: what data members it has, what methods it exposes to the client, what concepts it's compatible with, what the invariants / contracts are, etc.
I believe Nicol's omni_vec idea is the way to go here. Yes, this also means you could redundantly reconstruct vector and array as variants of omni_vec. That's not a bad thing.

I don't think we want to have 6 or 7 template arguments.

I would make omni_vec only take 2 arguments, T and an omni_vec_traits which describes the desired behavior.

It would look something like this:
#include <cstddef>
#include <memory>

enum class ov_storage { fixed, dynamic }; // storage policy selector

template <typename T, typename Traits> class omni_vec;

template <typename T, typename Allocator = std::allocator<T>>
struct ov_vector_traits {
    using allocator_t = Allocator;
    using size_type = std::size_t;
    static constexpr ov_storage storage_policy = ov_storage::dynamic; // Could be inferred from the presence of the allocator_t typedef.
    static constexpr bool is_growable = true;
};

// Behaves exactly the same way as vector.
template <typename T, typename Allocator = std::allocator<T>>
using vector2 = omni_vec<T, ov_vector_traits<T, Allocator>>;

template <typename T, std::size_t N>
struct ov_array_traits {
    static constexpr ov_storage storage_policy = ov_storage::fixed; // Maybe implied by size being a compile time constant.
    static constexpr std::size_t size = N;
};

// Same as array.
template <typename T, std::size_t N>
using array2 = omni_vec<T, ov_array_traits<T, N>>;


The idea here is that omni_vec is not a template you would normally use directly. Instead, it's a toolbox you configure with traits to construct the kind of vector you want. Once you've set up the traits, you create a typedef and use that in your code. Obviously, the standard could provide a set of typedefs for common scenarios. STL 2.0, if that happens, can just define vector and array to be typedefs of omni_vec. vector<bool> sticks around for backwards compatibility, and likewise omni_vec<bool,Traits> does the right thing.

The traits approach offers several advantages:
- You don't have to remember what all N ordered template arguments mean. It's like using a struct vs a tuple.
- You don't need to specify everything. Only specify what's needed for your behavior. Easy things like vector, array, dyn_array, etc. only need a few simple tokens specified. Hard things like a vector which stores its capacity and size in the memory buffer and uses uint32_t for size_type need more complex configuration.
- It's extensible in the future. We can add many omni_vec configuration knobs later without breaking ABI. Adding more template arguments is ABI breaking.

Nicol Bolas

Apr 17, 2017, 12:00:38 PM
to ISO C++ Standard - Future Proposals
On Monday, April 17, 2017 at 11:18:57 AM UTC-4, Bengt Gustafsson wrote:
Den söndag 16 april 2017 kl. 04:33:32 UTC+2 skrev Nicol Bolas:
On Saturday, April 15, 2017 at 7:33:43 PM UTC-4, Bengt Gustafsson wrote:
 
As a variant it would be possible to use different allocators to achieve these different results.

This could be done using these two allocators:

No, it could not.

You can get some of the effect of it. But without partial specializations of `vector` that explicitly have the desired properties, it becomes a big kludge.

`FixedAllocator` cannot get rid of two now worthless pointers inside `vector`. Neither can `SooAllocator` somehow create a union between the pointers and the fixed block of elements.
Well, I did write that the actual implementatation of vector would probably be specialized for the different allocators as a QoI excercise.
 
Furthermore, `small_vector` and `static_vector` has fundamental behavioral differences from `vector` which cannot merely be a matter of allocators. Moving a `static_vector` into another `static_vector` will always invoke `T`'s move constructor; for `vector`, it never will, no matter the allocator. `small_vector` has different iterator invalidation behavior.
Yes, this would be a somewhat valid concern, but is it so much harder to know that a vector with a certain allocator will invalidate its iterators at move assignment than that a container with a slighly different name (or template parameters of other sorts than allocators) would?

It's not about one being particularly "much harder to know". It's about being a kludge rather than how it should be implemented. Allocators should not control fundamental aspects of the container's behavior like that. And the only advantage you gain by this is not having to create a new class name, even though you have to create new template specializations for it.

I would be reluctant to rely on iterators outliving the move assignment of their vector, but of course there is code out there that does. If such code is templated on the container to use both solutions are still having the same risk level. If such code is templated on the allocator but uses a vector of the supplied allocator type then the allocator based solution would maybe fail silently while the new-class solution would not be applicable at all. This is a very small corner.

I didn't say anything about iterators outliving move assignment. I was talking about the fundamental aspects of `vector`'s move behavior. For example, `vector`'s move constructor and move assignment operator are `noexcept`, regardless of the properties of `T` (its exception property only depends on the allocator's move behavior). That cannot happen with a static vector or a small-sized vector. The `noexcept` status of these operations must be based on the `noexcept` status of the move constructor of `T`.


The current specification of `vector` doesn't allow these things. So if you're going to force these things, and complicate the specification, it would be much better to use a true parameterization of `vector`'s behavior, rather than an allocator-based hack.
But unfortunately we can't get such true parameterisation unless we start afresh with a new container name. I was basically just exploring the possibility of avoiding having to have two names for the same basic type of container where one is "better" and the other is "simpler".

I don't see the difference between `vector<T, ParameterPretendingToBeAllocator<N, ActualAllocator>>` and `parameter_vector<T, Parameter<N>, ActualAllocator>`. Especially since the specification has to define `vector` specializations for these allocators anyway. And since implementers will have to use completely different implementations for those specializations.

Shoving this parameter in the wrong place, all for the sake of a name? I just don't see the point. Especially when odds are good that people are going to make aliases like `static_vector` and `small_vector` when using them in actual code.

The point of unifying all of these dimensions within one class is to be able to have a foundation for expansion (adding new behaviors) as well as allow users to mix-and-match functionality in useful ways. If you're only going to support these two variations, I'd much rather they just be separate classes.
 
As for your omi_vector, I guess it is similar to how Eigen handles its configurable matrices. This is powerful but rather hard to set up the 6 template parameters so there are quite a few alias templates for common cases. It would be good if you could post the specification of this class, as google could not find it for me. 

The `omni_vector` was just an idea, not to the level of a specification or implementation. I talked about it primarily to get input on what dimensions of `vector` behavior people were interested in having.

Matthew Fioravante

Apr 17, 2017, 12:51:15 PM
to ISO C++ Standard - Future Proposals
As for configuration options, here's all I can think of at the moment:
  • Capacity
    • None (==size, fixed size array types)
    • Static: Compile time constant
    • Dynamic: Stored in container
    • Dynamic: Stored in allocated buffer
    • Dynamic: f(size) - not stored at all, computed from the size. Example: the next power of 2 >= the size
      • Should also allow an overridable fast computation of size >= capacity in push_back(). For example, ispow2(size) is faster than size == ceil_pow2(size).
  • Size
    • Static: Compile time constant (requires no capacity)
    • Dynamic: Stored in container
    • Dynamic: Stored in allocated buffer
    • Dynamic: f(T) - not stored at all, unique sentinel element at the end of the buffer. (e.g. nan, nullptr, etc..)
      • size() and push_back() become O(N)
      • Debug mode checks to assert that a sentinel element is never inadvertently added by the user.
  • Growable Size
    • True: Requires T to be movable. enables push_back(), insert(), erase(),  etc...
      • If capacity is dynamic, default construction should be guaranteed to do no allocations.
    • False: Requires capacity to be "none"
      • All non-move constructors always do an exact allocation of size() T objects.
      • Allows non-movable T, and const T
  • Growable Capacity
    • True: (requires dynamic capacity and dynamic size)
      • Calling push_back() or insert() when full reallocates.
    • False:
      • Calling push_back() or insert() when full throws exception.
  • Growth Factor
    • If present, requires Growable Capacity == true. Can be size_type or std::ratio<size_type,size_type>. Default can either be something agreed upon or implementation defined. ratio<3,2> can be used for the popular 1.5 growth factor.
  • size_type
    • Integral, can be signed or unsigned
    • If not specified, defaults to decltype(size) if size a compile time constant, or size_t otherwise.
    • UB if you try to create an omni_vec with size() >= numeric_limits<size_type>::max()
  • allocator_t
    • Required for dynamic capacity.
    • Where possible, any omni_vec<T,TraitsA>, omni_vec<T,TraitsB> should be able to efficiently swap/move underlying buffers with each other if they use the same allocator.
      • This would have to be disallowed in the case where capacity() is computed as f(size), as we could not guarantee anything about the capacity of some other buffer moved into this. An optional cast could be added for this narrow use case which does a runtime check on the capacity and throws if it's not == f(size).
    • omni_vec<T,Traits> should be able to swap/move underlying buffers with vector<T,A> if they use the same allocator.
  • Small buffer optimization:
    • Disabled - normal behavior
    • Some size N > 0 - Requires dynamic capacity.
      • move operations of container now require movable T
  • Bit Storage and proxy iterators
    • Some special T which makes the array behave like a bitset / vector<bool>. Then we get the underlying storage for bitset and dynamic_bitset basically for free, and they can use all of the omni_vec storage policy variants.

array_view and static_array_view are pretty different. It can be argued that they belong in their own separate array_view / span library.

Nicol Bolas

Apr 17, 2017, 1:47:01 PM
to ISO C++ Standard - Future Proposals
On Monday, April 17, 2017 at 12:51:15 PM UTC-4, Matthew Fioravante wrote:
As for configuration options, here's all I can think of at the moment:

There is a very great deal of overlap in that list. As well as contradictory dimensions: you can't have a static capacity stored in the object with dynamic resizing that's allocated. The parameterization system should be designed to avoid such contradictory options, rather than simply declaring them illegal.

Also, there are several options of... dubious merit. Control over `size_type`, for example. And having a sentinel value rather than a genuine size; that's just begging for someone to break the container. Even `basic_string`, a type designed specifically for such a use case, doesn't actually use the NUL character to determine the string's size.

Similarly, no form of `vector` should support "Bit Storage and proxy iterators". `vector<T>` is supposed to mean "contiguous storage of `T`", and every form of `vector` ought to provide that.

Matthew Fioravante

Apr 17, 2017, 2:28:11 PM
to ISO C++ Standard - Future Proposals


On Monday, April 17, 2017 at 12:47:01 PM UTC-5, Nicol Bolas wrote:
On Monday, April 17, 2017 at 12:51:15 PM UTC-4, Matthew Fioravante wrote:
As for configuration options, here's all I can think of at the moment:

There is a very great deal of overlap in that list. As well as contradictory dimensions: you can't have a static capacity stored in the object with dynamic resizing that's allocated.

Sure, this is an off the cuff list I produced in 2 minutes. Certainly there are various illegal combinations, contradictions, and possibly other options I've left out. But hopefully it gives a starting point for fleshing out a real interface.

For example, I just realized that growth_factor should actually just be a function object instead of a size or ratio. Sometimes you really want an additive linear growth factor.
 
The parameterization system should be designed to avoid such contradictory options, rather than simply declaring them illegal.

At a minimum, I would suggest using static_asserts in omni_vec to detect and complain about all illegal combinations with helpful diagnostics. Obviously, if the interface is designed well enough to lead you to avoid mistakes, all the better.
 

Also, there are several options of... dubious merit. Control over `size_type`, for example.

This is not dubious at all. I can give you 2 solid use cases, both of which I've done in the past.

On 64 bit platforms, a vector using size_t is 24 bytes, whereas a vector using uint32_t is only 16 bytes. Those 8 bytes can be valuable if you have several of these packed tightly in a single cache line in hot memory. For a lot of use cases, we'll never need more than 4 billion elements in a vector.

Signed size_type also might let me finally start using gcc's -Wconversion without getting a ton of false positives. I don't use -Wconversion because all of the required casts around STL containers can easily cause bugs of their own. Signed size() might also enable optimizations in some cases because the compiler can assume signed arithmetic sourced from a call to size() can never overflow.
 
And having a sentinel value rather than a genuine size; that's just begging for someone to break the container. Even `basic_string`, a type designed specifically for such a use case, doesn't actually use the NUL character to determine the string's size.

I've used the sentinel in production code as well. Again, I don't need to store a size variable anywhere, saving valuable cache space. The most common use case for this is when you have a list of pointers to callbacks to be iterated through. None of the pointers are null except the last one, which is the null sentinel.

This makes size() and push_back() slow, but in my use case it was a situation where I set everything up on startup and then need to iterate the list over and over again in the hotpath. I don't give a damn about a slightly slower setup (not even measurable in real tests) when it buys me the fastest possible iteration loop.

This optimizes the hell out of iteration. After optimization it's essentially this:
for (T** p = v.data(); *p != nullptr; ++p) {
  doSomething(*p);
}

The vector itself only stores a single pointer and is only 8 bytes in size.

Capacity can either be a function of size (size rounded up to a multiple of N, the next power of 2, etc.) or a static compile time upper bound.

Now I agree that this is easy to use wrong in the general case. But again, omni_vec is a toolbox that doesn't always have to stand on its own in the wild. I can create a sentinel based omni_vec inside and then further wrap it in a safer interface.

I think the decision for string was more about 2 things: (1) allowing for strings with embedded nulls (mandatory for binary string data) and (2) making size() O(1). For string data, this is the right trade-off because in the context of string, those are 2 very important properties to have. For some other generic situations, they may not be necessary.

One might also want to make a C style null terminated string type which doesn't store the size, and they could easily start out by using omni_vec<char,Traits> to define the storage policy and build a semantic string wrapper on-top.
 
Similarly, no form of `vector` should support "Bit Storage and proxy iterators". `vector<T>` is supposed to mean "contiguous storage of `T`", and every form of `vector` ought to provide that.

I think my original idea of using some kind of special type T to enable this is wrong, as it's too similar to vector<bool>. Certainly though, it could make sense to add a traits parameter to enable bitset behavior, with a static_assert() that T is bool.

Iterating over bits is fundamentally different than iterating over T. I agree that it is a bit uncomfortable how it crosses the turf of vector and bitset. However, if we added some facility to omni_vec, it means we get all the storage foundations for all possible bitsets for free.

Bitsets operate exactly like vectors and arrays. The only difference is that bits need special handling because they can't be represented with a type T directly.

Another option could be an omni_bitset, which uses the exact same traits as omni_vec but has bitset semantics.

I guess the idea I'm envisioning here is a completely flexible array-like storage backend for any possible kind of array like thing you would want. Simple inheritance/composition can be used to add additional semantic layers as needed (strings, bitsets, circular buffers, etc..).

Writing your own vector like type is a huge pain. Especially getting all of the move semantics and exception safety right. Having a flexible omni_vec tool would provide all of that foundation for us and let us create all kinds of really interesting and immediately useful data structures.

Nicol Bolas

Apr 17, 2017, 4:05:00 PM
to ISO C++ Standard - Future Proposals
On Monday, April 17, 2017 at 2:28:11 PM UTC-4, Matthew Fioravante wrote:
On Monday, April 17, 2017 at 12:47:01 PM UTC-5, Nicol Bolas wrote:
On Monday, April 17, 2017 at 12:51:15 PM UTC-4, Matthew Fioravante wrote:
As for configuration options, here's all I can think of at the moment:

There is a very great deal of overlap in that list. As well as contradictory dimensions: you can't have a static capacity stored in the object with dynamic resizing that's allocated.

Sure, this is an off the cuff list I produced in 2 minutes. Certainly there are various illegal combinations, contradictions, and possibly other options I've left out. But hopefully it gives a starting point for fleshing out a real interface.

For example, I just realized that growth_factor should actually just be a function object instead of a size or ratio. Sometimes you really want an additive linear growth factor.
 
The parameterization system should be designed to avoid such contradictory options, rather than simply declaring them illegal.

At a minimum, I would suggest using static_asserts in omni_vec to detect and complain about all illegal combinations with helpful diagnostics. Obviously if the interface is designed well to lead you avoid mistakes all the better.
 

Also, there are several options of... dubious merit. Control over `size_type`, for example.

This is not dubious at all. I can give you 2 solid use cases, both of which I've done in the past.

On 64 bit platforms, a vector using size_t is 24 bytes, whereas vector using uint32_t is only 16 bytes. Those 8 bytes can be valuable if you have several of these packed tightly in a single cache line in hot memory. For a lot of use cases, we'll never need more than 4 billions elements in a vector.

You assume that `size_type` is actually stored in the `vector` to contain the size. Many `vector` implementations simply contain 3 pointers, computing `size` and `capacity` as needed.

So controlling `size_type` will not help you on such implementations.

Signed size_type also might let me finally start using gcc's -Wconversion without getting a ton of false positives. I don't use -Wconversion because all of the required casts around STL containers can easily cause bugs of their own. Signed size() might also enable optimizations in some cases because the compiler can assume signed arithmetic sourced from a call to size() can never overflow.

And what exactly would control over `size_type` do for that problem in general? Because it won't change the fact that every other container uses an unsigned `size_type`. Nor will it change the fact that `span<T>` will use an unsigned `size_type`. Nor will it change the fact that the unparameterized `vector` will use an unsigned `size_type`. Nor will it change the fact that someone else can pass you a `vector` that uses an unsigned `size_type`.

Also, `container::size_type` is required by the standard to be an unsigned integer, big enough to store any non-negative value of `container::difference_type`. If you have a problem with the way containers/ranges return their sizes, then that's a problem that needs to be handled at the specification level, not in a specific container class.

And having a sentinel value rather than a genuine size; that's just begging for someone to break the container. Even `basic_string`, a type designed specifically for such a use case, doesn't actually use the NUL character to determine the string's size.

I've used the sentinel in production code as well.

I never said that nobody uses such things in production code. I said that it's not something the standard library should support. It's just too fragile of a container for the standard library.

Again I don't need to store a size variable anywhere, saving valuable cache space. The most common use case for this is when you have a list of pointers to callbacks to be iterated through. All of the pointers except the last one are null.

This makes size() and push_back() slow, but in my use case it was a situation where I set everything up on startup and then need to iterate the list over and over again in the hotpath. I don't give a damn about a bit slower slow setup (not even measurable in real tests), when it buys me the fastest possible iteration loop.

This optimizes the hell out of iteration. After optimization its essentially this:
for(T**p = v.data(); *p != nullptr; ++p) {
  doSomething(*p);
}


Which you can do right now by manually putting a NULL pointer at the end of the list. This sounds like something best handled by some kind of iterator/range adapter, not by changing the container itself. Indeed, by making it an adapter rather than part of the class, it can now be used for lots of tasks, the equivalent of doing a loop until you reach the value that `find_if` would return.

The vector itself only stores a single pointer and is only 8 bytes in size.

So you create this incredibly fragile container... just to save 16 bytes.

Capacity can either be a function of size (size % N, next power of 2, etc..) or a static compile time upper bound.

Um, if the capacity is a function of the size, then the capacity must change whenever the size does. Which makes the capacity functionally useless; it will have to reallocate on any insertion or removal.

Now I agree that this is easy to use wrong in the general case. But again, omni_vec is a toolbox that doesn't always have to stand on its own in the wild. I can create a sentinel based omni_vec inside and then further wrap it in a safer interface.

I think the decision for string was more about 2 things: (1) allowing for strings with embedded nulls (mandatory for binary string data) and (2) making size() O(1). For string data, this is the right trade-off because in the context of string, those are 2 very important properties to have. For some other generic situations, they may not be necessary.

One might also want to make a C style null terminated string type which doesn't store the size, and they could easily start out by using omni_vec<char,Traits> to define the storage policy and build a semantic string wrapper on-top.
 
Similarly, no form of `vector` should support "Bit Storage and proxy iterators". `vector<T>` is supposed to mean "contiguous storage of `T`", and every form of `vector` ought to provide that.

I think my original idea of using some kind of type T to enable this is wrong, as its similar to vector<bool>. Certaintly though it could make sense to add a traits parameter to enable bitset behavior, with a static_assert() that T is bool.

Iterating over bits is fundamentally different than iterating over T. I agree that it is a bit uncomfortable how it crosses the turf of vector a bitset. However, if we added some facility to omni_vec, it means we get all the storage foundations for all possible bitsets for free.

Bitsets operate exactly like vectors and arrays.

No, they do not. A `vector<T>` is a contiguous container of `T`s, which can be iterated over and accessed by pointers. A bitset isn't that. So no, they do not "operate exactly" like `vector`.

You can use a `vector` to create a bitset, but that's different from a bitset actually being a specialization of `vector`. I could even imagine a new `bitset` class that takes the same parameters as `omni_vector`, forwarding them to an internal `vector` implementation.
 
The only difference is that bits need special handling because they can't be represented with a type T directly.

Another option could be an omni_bitset, which uses the exact same traits as omni_vec but has bitset semantics.

I guess the idea I'm envisioning here is a completely flexible array-like storage backend for any possible kind of array like thing you would want. Simple inheritance/composition can be used to add additional semantic layers as needed (strings, bitsets, circular buffers, etc..).

You can use `vector` in an implementation of bitset as it currently stands; you don't need to shove bitset functionality into `vector` to do that. You'll just have to write a bit more code.

And that "bit more code" is not the hard code of implementing `vector`.

Matthew Fioravante

Apr 17, 2017, 5:16:44 PM
to ISO C++ Standard - Future Proposals


On Monday, April 17, 2017 at 3:05:00 PM UTC-5, Nicol Bolas wrote:
You assume that `size_type` is actually stored in the `vector` to contain the size. Many `vector` implementations simply contain 3 pointers, computing `size` and `capacity` as needed.

So controlling `size_type` will not help you on such implementations.

I would suggest that for omni_vec, if sizeof(size_type) < sizeof(ptrdiff_t), a space saving integer counter implementation must be used instead of pointers. Otherwise, yes, the option is completely pointless.
 

Signed size_type also might let me finally start using gcc's -Wconversion without getting a ton of false positives. I don't use -Wconversion because all of the required casts around STL containers can easily cause bugs of their own. Signed size() might also enable optimizations in some cases because the compiler can assume signed arithmetic sourced from a call to size() can never overflow.

And what exactly would control over `size_type` do for that problem in general? Because it won't change the fact that every other container uses an unsigned `size_type`. Nor will it change the fact that `span<T>` will use an unsigned `size_type`. Nor will it change the fact that the unparameterized `vector` will use an unsigned `size_type`. Nor will it change the fact that someone else can pass you a `vector` that uses an unsigned `size_type`.

While it wouldn't fix the problem in the general case (we're already screwed for that), it would fix the majority of use cases. First, at least for me, I already use vector or something vector-like 90% of the time. Second, most of the time when I do need to work with the size() of a container, that container is a vector, most commonly in an index based for loop. While size() is sometimes useful for unordered_map and other things, I find myself querying it far less often in those cases than I do with arrays, and even less often needing to compare it with signed values.
 

Also, `container::size_type` is required by the standard to be an unsigned integer, big enough to store any non-negative value of `container::difference_type`. If you have a problem with the way containers/ranges return their sizes, then that's a problem that needs to be handled at the specification level, not in a specific container class.

I remember one of the C++ talks (I think it was a Microsoft conference?) where some of the C++ committee members actually admitted publicly that using unsigned types for container sizes was a mistake. I don't remember which talk this was.

But you are right that the question of signed vs. unsigned size() is a bigger issue than what we're talking about here. Again, the ship has probably sailed on this unless we do a full STL 2.0 rewrite.

 

And having a sentinel value rather than a genuine size; that's just begging for someone to break the container. Even `basic_string`, a type designed specifically for such a use case, doesn't actually use the NUL character to determine the string's size.

I've used the sentinel in production code as well.

I never said that nobody uses such things in production code. I said that it's not something the standard library should support. It's just too fragile of a container for the standard library.

Why does everything in the standard library need to be an idiot proof novice friendly all out complete solution? I'd really like to see more expert level building block style tools we can use to quickly construct our own types and avoid the boilerplate, drudgery, testing, and bugs coming from rewriting the basics over and over again.

This is not the only unsafe container with loose semantics I've had to write before. There's nothing wrong with a slightly dangerous interface that's almost always intended to be wrapped in a safer, more well-defined, application-specific interface that uses it in slightly different ways depending on context.

std::mutex and std::atomic are good examples of sharp tools most people would recommend only experts use and carefully abstract away from the general application code whenever possible.

 

Again I don't need to store a size variable anywhere, saving valuable cache space. The most common use case for this is when you have a list of pointers to callbacks to be iterated through. All of the pointers are non-null except the last one, which is the null sentinel.

This makes size() and push_back() slow, but in my use case it was a situation where I set everything up on startup and then need to iterate the list over and over again in the hot path. I don't give a damn about a slightly slower setup (not even measurable in real tests) when it buys me the fastest possible iteration loop.

This optimizes the hell out of iteration. After optimization it's essentially this:
for(T**p = v.data(); *p != nullptr; ++p) {
  doSomething(*p);
}


Which you can do right now by manually putting a NULL pointer at the end of the list. This sounds like something best handled by some kind of iterator/range adapter, not by changing the container itself. Indeed, by making it an adapter rather than part of the class, it can now be used for lots of tasks, the equivalent of doing a loop until you reach the value that `find_if` would return.

Adding wrappers on top doesn't let you optimize out data and logic inside the underlying data structure that your limited interface no longer needs. The only thing wrappers can do is improve correctness by constricting the interface. They can't provide optimization.

I'm still wasting memory storing the size and capacity. Also, in my loop I need to add an additional check for empty(); otherwise dereferencing the first element will crash.

I could try to remember at runtime to always allocate one sentinel. That's an invariant now that has to be guaranteed at runtime, instead of at compile time with sentinel_vector. Also, sentinel_vector can store one copy of the sentinel object as a static const data member. Then all empty sentinel_vectors can avoid allocating at all and their sentinel_nodes all share the same address, improving cache locality and reducing heap fragmentation if you have a lot of empty sentinel_vectors.
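To be concrete, here is a bare-bones sketch of what I mean (sentinel_vector is my own invention, not proposal text; copy/move, incremental growth, and error handling are all omitted):

#include <cstddef>
#include <new>

template <class T>                        // T is assumed to be a raw pointer type
class sentinel_vector {
    static inline T empty_sentinel = T{}; // one shared null sentinel for every empty vector
    T* data_ = &empty_sentinel;

public:
    T* data() const { return data_; }

    std::size_t size() const {            // O(n): walk until the null sentinel
        std::size_t n = 0;
        for (T* p = data_; *p != T{}; ++p) ++n;
        return n;
    }

    void assign(const T* first, std::size_t n) {
        release();
        data_ = static_cast<T*>(::operator new((n + 1) * sizeof(T)));
        for (std::size_t i = 0; i < n; ++i) ::new (data_ + i) T(first[i]);
        ::new (data_ + n) T{};            // write the terminating sentinel
    }

    ~sentinel_vector() { release(); }     // the whole object is a single pointer

private:
    void release() {
        if (data_ != &empty_sentinel)     // empty vectors never allocated anything
            ::operator delete(data_);
    }
};

The hot loop stays exactly the for(T** p = v.data(); *p != nullptr; ++p) form shown earlier, and sizeof(sentinel_vector<T>) is one pointer.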
 

The vector itself only stores a single pointer and is only 8 bytes in size.

So you create this incredibly fragile container... just to save 16 bytes.

Yes, and it resulted in a measurable improvement in application runtime. 16 bytes is already 1/4 of a cache line on Intel.
 

Capacity can either be a function of size (size rounded up to a multiple of N, the next power of 2, etc.) or a static compile-time upper bound.

Um, if the capacity is a function of the size, then the capacity must change whenever the size does. Which makes the capacity functionally useless; it will have to reallocate on any insertion or removal.

Not exactly. If capacity is ceil_pow2(size()), the capacity stays constant until your size hits the next power-of-2 boundary. For push_back(), just check whether the current size is a power of 2; if it is, reallocate up to the next boundary.

The modulus example is like a linear growth factor. For example, if N is 5, you'll reallocate when size is 5, 10, 15, etc.
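In code the bookkeeping amounts to just this (illustration only, no container around it):

#include <cstddef>

// capacity == smallest power of two >= size (and at least 1)
constexpr std::size_t ceil_pow2(std::size_t n) {
    std::size_t c = 1;
    while (c < n) c *= 2;
    return c;
}

// capacity == size rounded up to the next multiple of N ("linear growth")
template <std::size_t N>
constexpr std::size_t ceil_multiple(std::size_t n) {
    return ((n + N - 1) / N) * N;
}

// push_back reallocates exactly when the computed capacity would change
constexpr bool must_grow_pow2(std::size_t old_size) {
    return ceil_pow2(old_size) < ceil_pow2(old_size + 1);
}

static_assert(must_grow_pow2(4), "4 -> 5 crosses the 4/8 boundary");
static_assert(!must_grow_pow2(5), "5 -> 6 still fits in capacity 8");
static_assert(ceil_multiple<5>(11) == 15, "N == 5: reallocate at 5, 10, 15, ...");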

These options are not appropriate if you want insertion/removal to be fast. Often, however, we just do setup once and then need to iterate (read) over and over again in the hot path. Any kind of simulation use case where you set everything up and then run the simulation has this general paradigm. Sometimes we only need to push_back() and never do any removal until we tear down the whole system at shutdown.

Finally, the other example is static_vector<T,N>, where the capacity is a compile-time constant. Essentially now you've got a fixed upper bound on the number of things you can store, and instead of storing and checking a size, you have a sentinel.

Admittedly for many uses, storing the size counter inside the allocated memory buffer could be a good alternative to using a sentinel. You still get the space savings in the container. Empty vectors again could point to a static 0 size_type somewhere to avoid allocations, and now you avoid all of that messy fragility of the sentinel.
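A rough sketch of that layout (all names invented; allocation, element construction, and alignment for over-aligned T are glossed over):

#include <cstddef>

template <class T>
class counted_buffer_vector {
    struct header { std::size_t size; };        // the T elements follow the header in the same allocation

    static inline const header empty_header{0}; // every empty vector points here, no allocation needed
    const header* hdr_ = &empty_header;

    const T* elems() const { return reinterpret_cast<const T*>(hdr_ + 1); }

public:
    std::size_t size()  const { return hdr_->size; }  // O(1), no sentinel fragility
    const T*    begin() const { return elems(); }
    const T*    end()   const { return elems() + hdr_->size; }
    // mutating operations allocate sizeof(header) + n * sizeof(T) bytes in one
    // block and construct the elements right after the header (omitted here).
};

The container object is still just one pointer, and the only invariant a mutating API has to maintain is keeping the header's size in sync, which is much easier to get right than "exactly one sentinel, always at the end".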


 

Now I agree that this is easy to use wrong in the general case. But again, omni_vec is a toolbox that doesn't always have to stand on its own in the wild. I can create a sentinel-based omni_vec inside and then further wrap it in a safer interface.

I think the decision for string was more about 2 things: (1) allowing for strings with embedded nulls (mandatory for binary string data) and (2) making size() O(1). For string data, this is the right trade-off because in the context of string, those are 2 very important properties to have. For some other generic situations, they may not be necessary.

One might also want to make a C-style null-terminated string type that doesn't store the size; they could easily start out by using omni_vec<char,Traits> to define the storage policy and build a semantic string wrapper on top.
 
Similarly, no form of `vector` should support "Bit Storage and proxy iterators". `vector<T>` is supposed to mean "contiguous storage of `T`", and every form of `vector` ought to provide that.

I think my original idea of using some kind of type T to enable this is wrong, as it's similar to vector<bool>. Certainly, though, it could make sense to add a traits parameter to enable bitset behavior, with a static_assert() that T is bool.

Iterating over bits is fundamentally different from iterating over T. I agree that it is a bit uncomfortable how it crosses the turf between vector and bitset. However, if we added some facility to omni_vec, it means we get the storage foundations for all possible bitsets for free.

Bitsets operate exactly like vectors and arrays.

No, they do not. A `vector<T>` is a contiguous container of `T`s, which can be iterated over and accessed by pointers. A bitset isn't that. So no, they do not "operate exactly" like `vector`.

You can use a `vector` to create a bitset, but that's different from a bitset actually being a specialization of `vector`. I could even imagine a new `bitset` class that takes the same parameters as `omni_vector`, forwarding them to an internal `vector` implementation.
 
The only difference is that bits need special handling because they can't be represented with a type T directly.

Another option could be an omni_bitset, which uses the exact same traits as omni_vec but has bitset semantics.

I guess the idea I'm envisioning here is a completely flexible array-like storage backend for any possible kind of array-like thing you would want. Simple inheritance/composition can be used to add additional semantic layers as needed (strings, bitsets, circular buffers, etc.).

You can use `vector` in an implementation of bitset as it currently stands; you don't need to shove bitset functionality into `vector` to do that. You'll just have to write a bit more code.

And that "bit more code" is not the hard code of implementing `vector`.

Fair enough.

Anyway, after thinking about it more, if we wanted to talk about bitset, I think omni_bitset<Traits> would be the way to go. It can be implemented on top of omni_vec as you suggested, with all the bitset-related proxy hackery kept cleanly separated.

Some configuration options just make no sense for bitset (example: sentinels) and should be static_asserted away.

Also, bitset has some other configuration parameters you might want to tweak with traits.

Specifically, controlling the word size and alignment used to construct it. Right now std::bitset<32> on gcc uses 64 bits. This makes it space-inefficient at best and unusable at worst. A good example is a struct that is supposed to overlay binary protocols such as network packet headers. If I could control the size and alignment of my bitset, I could overlay it directly over binary data without making copies. Then I could manipulate my data using the nice, already-tested bitset interface instead of hacking my own logical ops.
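Something along these lines is what I have in mind; packed_bitset here is purely hypothetical (std::bitset exposes no such knobs), and a real wire-format overlay would of course also have to pin down bit and byte order:

#include <cstddef>
#include <cstdint>

template <std::size_t N, class Word = std::uint8_t>   // Word controls storage size and alignment
struct packed_bitset {
    static constexpr std::size_t bits_per_word = 8 * sizeof(Word);
    Word storage[(N + bits_per_word - 1) / bits_per_word];

    bool test(std::size_t i) const {
        return (storage[i / bits_per_word] >> (i % bits_per_word)) & 1u;
    }
    void set(std::size_t i, bool v = true) {
        Word mask = Word(Word{1} << (i % bits_per_word));
        Word& w = storage[i / bits_per_word];
        w = v ? Word(w | mask) : Word(w & ~mask);
    }
};

// With Word = uint8_t the flags field is exactly 2 bytes and byte-aligned,
// so the struct can sit directly on top of a 4-byte header from the wire.
struct packet_header {
    std::uint16_t     length;
    packed_bitset<16> flags;
};
static_assert(sizeof(packet_header) == 4, "no padding; overlays the wire format");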

I tried to propose something like this for bitset before and it failed to go through. If we had an omni_bitset, these fine controls would fall in nicely and it would be better than trying to add them to std::bitset and breaking ABI compatibility.

inkwizyt...@gmail.com

unread,
Apr 17, 2017, 6:33:00 PM4/17/17
to ISO C++ Standard - Future Proposals


On Monday, April 17, 2017 at 10:05:00 PM UTC+2, Nicol Bolas wrote:
On Monday, April 17, 2017 at 2:28:11 PM UTC-4, Matthew Fioravante wrote:
I've used the sentinel in production code as well.

I never said that nobody uses such things in production code. I said that it's not something the standard library should support. It's just too fragile of a container for the standard library.

Again I don't need to store a size variable anywhere, saving valuable cache space. The most common use case for this is when you have a list of pointers to callbacks to be iterated through. All of the pointers are non-null except the last one, which is the null sentinel.

This makes size() and push_back() slow, but in my use case it was a situation where I set everything up on startup and then need to iterate the list over and over again in the hot path. I don't give a damn about a slightly slower setup (not even measurable in real tests) when it buys me the fastest possible iteration loop.

This optimizes the hell out of iteration. After optimization it's essentially this:
for(T**p = v.data(); *p != nullptr; ++p) {
  doSomething(*p);
}


Which you can do right now by manually putting a NULL pointer at the end of the list. This sounds like something best handled by some kind of iterator/range adapter, not by changing the container itself. Indeed, by making it an adapter rather than part of the class, it can now be used for lots of tasks, the equivalent of doing a loop until you reach the value that `find_if` would return.

The vector itself only stores a single pointer and is only 8 bytes in size.

So you create this incredibly fragile container... just to save 16 bytes.

I do not see how that would be more fragile than the normal `begin`/`end` combo. The trick would be to use the iterator/sentinel pair from the Ranges TS (where `begin` and `end` return different types).
Adding the sentinel could be done automatically too.
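For example, something like this (just a sketch; callback below is a stand-in name):

// A null-terminated range in the Ranges TS style: begin() and end() return
// different types, and the "end" is a tag compared against the terminator.
// (Needs C++17 range-for or the Ranges TS machinery.)
struct null_sentinel {};

template <class T>
struct null_terminated_range {
    T* first;

    struct iterator {
        T* p;
        T& operator*() const { return *p; }
        iterator& operator++() { ++p; return *this; }
        bool operator!=(null_sentinel) const { return *p != T{}; }
    };

    iterator      begin() const { return {first}; }
    null_sentinel end()   const { return {}; }
};

// e.g. for (auto* cb : null_terminated_range<callback*>{v.data()}) cb();
// where callback* is whatever function-pointer type the list actually stores.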

Matthew Fioravante

unread,
Apr 17, 2017, 6:39:09 PM4/17/17
to ISO C++ Standard - Future Proposals
The fragility comes not from iteration but from mutation. Take, for example, sentinel_vector<std::unique_ptr<T>> with a nullptr sentinel. There's basically nothing stopping me from inserting a nullptr somewhere in the middle, completely screwing up the size calculation and leaking the trailing pointers when the sentinel_vector is destroyed.

Nicol Bolas

unread,
Apr 17, 2017, 11:05:33 PM4/17/17
to ISO C++ Standard - Future Proposals
On Monday, April 17, 2017 at 5:16:44 PM UTC-4, Matthew Fioravante wrote:
On Monday, April 17, 2017 at 3:05:00 PM UTC-5, Nicol Bolas wrote:
You assume that `size_type` is actually stored in the `vector` to contain the size. Many `vector` implementations simply contain 3 pointers, computing `size` and `capacity` as needed.

So controlling `size_type` will not help you on such implementations.

I would suggest that for omni_vec, if sizeof(size_type) < sizeof(ptrdiff_t), a space-saving integer counter implementation must be used instead of pointers. Otherwise, yes, the option is completely pointless.
 

Signed size_type also might let me finally start using gcc's -Wconversion without getting a ton of false positives. I don't use -Wconversion because all of the required casts around STL containers can easily cause bugs of their own. Signed size() might also enable optimizations in some cases because the compiler can assume signed arithmetic sourced from a call to size() can never overflow.

And what exactly would control over `size_type` do for that problem in general? Because it won't change the fact that every other container uses an unsigned `size_type`. Nor will it change the fact that `span<T>` will use an unsigned `size_type`. Nor will it change the fact that the unparameterized `vector` will use an unsigned `size_type`. Nor will it change the fact that someone else can pass you a `vector` that uses an unsigned `size_type`.

While it wouldn't fix the problem in the general case (we're already screwed for that), it would fix the majority of use cases. First, at least for me, I already use vector or something vector-like 90% of the time. Second, most of the time when I do need to work with the size() of a container, that container is a vector, most commonly in an index-based for loop. While size() is sometimes useful for unordered_map and other things, I find myself querying it far less often in those cases than I do with arrays, and certainly even less often needing to compare it with signed values.
 

Also, `container::size_type` is required by the standard to be an unsigned integer, big enough to store any non-negative value of `container::difference_type`. If you have a problem with the way containers/ranges return their sizes, then that's a problem that needs to be handled at the specification level, not in a specific container class.

I remember one of the C++ talks (I think it was a Microsoft conference?) where some of the C++ committee members publicly admitted that using unsigned types for container sizes was a mistake. I don't remember which talk it was.

But you are right that the question of signed vs. unsigned size() is a bigger issue than what we're talking about here. Again, the ship has probably sailed on this unless we do a full STL 2.0 rewrite.

 

And having a sentinel value rather than a genuine size; that's just begging for someone to break the container. Even `basic_string`, a type designed specifically for such a use case, doesn't actually use the NUL character to determine the string's size.

I've used the sentinel in production code as well.

I never said that nobody uses such things in production code. I said that it's not something the standard library should support. It's just too fragile of a container for the standard library.

Why does everything in the standard library need to be an idiot proof novice friendly all out complete solution?

There's a lot of space between "not easily broken" and "idiot proof novice friendly all out complete solution". Consider `vector` itself. Invalidating iterators/pointers in a `vector` is a source of bugs for some people who aren't familiar with the quirks of that type. Yet we still have `vector`, and we recommend its use essentially for all sequence container tasks (and for some non-sequence-container tasks).

It's not idiot proof. But it's not easy to break either.
 
I'd really like to see more expert level building block style tools we can use to quickly construct our own types and avoid the boilerplate, drudgery, testing, and bugs coming from rewriting the basics over and over again.

This type is not for all C++ experts. It's for C++ experts who really care about 16 bytes, who care so much that they're willing to use a very fiddly type to achieve their goals (not to mention the capacity issues, essentially allocating more memory than is needed). So you're not talking about what every C++ expert would use; you're talking about a very specific subset of C++ experts.

This sort of thing can go on and on and on. Everyone has some need, some tool that they use which might be useful to someone else. At some point, you have to decide that something is too small of a corner case for the standard library. At some point, you have to tell people to go implement it themselves.

I can see broad utility for a lot of these options: small storage, fixed capacity, the combination of the two, and others. These tools are useful to more than just C++ experts who work in very specific, high-performance code. It's a lot harder to show that sentinel vectors have such broad utility.

This is not the only unsafe container with loose semantics I've had to write before. There's nothing wrong with a slightly dangerous interface that's almost always intended to be wrapped in a safer, more well-defined, application-specific interface that uses it in slightly different ways depending on context.

std::mutex and std::atomic are good examples of sharp tools most people would recommend only experts use and carefully abstract away from the general application code whenever possible.

This is not a question of "let's not give people easily broken low-level tools". It's "when giving people exceedingly fragile tools, let's examine the cost/benefit of doing so". If you don't standardize low-level thread synchronization tools, then users will be unable to effectively use threading without using platform-dependent tools and/or other libraries.

So on the one hand, you have the large burden on the standard to define its behavior (which includes the various interactions with other parameters), on implementers to implement it (which again includes those interactions), and on users to use it without breaking it. And all of this effort benefits only a small group of users.

Again I don't need to store a size variable anywhere, saving valuable cache space. The most common use case for this is when you have a list of pointers to callbacks to be iterated through. All of the pointers are non-null except the last one, which is the null sentinel.

This makes size() and push_back() slow, but in my use case it was a situation where I set everything up on startup and then need to iterate the list over and over again in the hot path. I don't give a damn about a slightly slower setup (not even measurable in real tests) when it buys me the fastest possible iteration loop.

This optimizes the hell out of iteration. After optimization it's essentially this:
for(T**p = v.data(); *p != nullptr; ++p) {
  doSomething(*p);
}


Which you can do right now by manually putting a NULL pointer at the end of the list. This sounds like something best handled by some kind of iterator/range adapter, not by changing the container itself. Indeed, by making it an adapter rather than part of the class, it can now be used for lots of tasks, the equivalent of doing a loop until you reach the value that `find_if` would return.

Adding wrappers on top doesn't let you optimize out data and logic inside the underlying data structure that your limited interface no longer needs. The only thing wrappers can do is improve correctness by constricting the interface. They can't provide optimization.

Your example was of iterating over such a range. The optimization in question was about the mechanism of iteration ("This optimizes the hell out of iteration"), not of the size of the container. So the adapter does provide the optimization you were asking for. As well as providing that same optimization elsewhere too, such as with `span`, `array`, `string_view`, and other types.

----

I think you mentioned something in one of your posts about the difficulty of correctly implementing `vector`-style types. It occurs to me that we could add utilities that make such implementations easier to write. For example, a function like `std::uninitialized_copy`, but unlike that function, this one would perform rollbacks on the newly constructed memory on exceptions. It would also use the allocator's `construct` function. There would be functions for shifting elements up/down, but with proper exception support built in and vector's specific copy/move behavior based on the properties of `T`. And so forth.

Once all of the tools are available, then writing a `vector` type would be a lot easier. What we lack is a comprehensive enumeration of the tools that we need.
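For example, the first of those tools could look something like this (name invented; just the shape of the utility, not proposed wording):

#include <memory>

template <class Alloc, class InputIt, class FwdIt>
FwdIt uninitialized_copy_with_rollback(Alloc& alloc, InputIt first, InputIt last, FwdIt d_first) {
    using traits = std::allocator_traits<Alloc>;
    FwdIt current = d_first;
    try {
        for (; first != last; ++first, ++current)
            traits::construct(alloc, std::addressof(*current), *first);  // construct through the allocator
        return current;
    } catch (...) {
        for (; d_first != current; ++d_first)          // destroy whatever was already constructed
            traits::destroy(alloc, std::addressof(*d_first));
        throw;                                         // the destination buffer is back to raw storage
    }
}

With a handful of these (construct-with-rollback, destroy-a-range, shift-up/shift-down with the right move-if-noexcept behavior), the genuinely hard parts of writing a vector variant mostly disappear.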

Magnus Fromreide

unread,
Apr 18, 2017, 2:03:21 AM4/18/17
to ISO C++ Standard - Future Proposals
On Mon, Apr 17, 2017 at 08:05:32PM -0700, Nicol Bolas wrote:
> On Monday, April 17, 2017 at 5:16:44 PM UTC-4, Matthew Fioravante wrote:
> >
> > On Monday, April 17, 2017 at 3:05:00 PM UTC-5, Nicol Bolas wrote:
> >>
> >> I never said that nobody uses such things in production code. I said that
> >> it's not something the standard library should support. It's just too
> >> fragile of a container for the standard library.
> >>
> >
> > Why does everything in the standard library need to be an idiot proof
> > novice friendly all out complete solution?
> >
>
> There's a lot of space between "not easily broken" and "idiot proof novice
> friendly all out complete solution". Consider `vector` itself. Invalidating
> iterators/pointers in a `vector` is a source of bugs for some people who
> aren't familiar with the quirks of that type. Yet we still have `vector`,
> and we recommend its use essentially for all sequence container tasks (and
> for some non-sequence-container tasks).
>
> It's not idiot proof. But it's not easy to break either.

I would claim that there is some validity in this complaint. Consider the
glaring absence of stl::tree (the red-black tree implementation class
underlying stl::set and stl::map, stl refers to SGI STL) from the std
library.

If I remember correctly there was a proposal to add it to one of the
later versions of the standard but it got turned down.

/MF

Magnus Fromreide

unread,
Apr 18, 2017, 2:12:17 AM4/18/17
to std-pr...@isocpp.org
On Mon, Apr 17, 2017 at 11:28:11AM -0700, Matthew Fioravante wrote:
>
>
> On Monday, April 17, 2017 at 12:47:01 PM UTC-5, Nicol Bolas wrote:
> >
> > On Monday, April 17, 2017 at 12:51:15 PM UTC-4, Matthew Fioravante wrote:
> >>
>
> > Similarly, no form of `vector` should support "Bit Storage and proxy
> > iterators". `vector<T>` is supposed to mean "contiguous storage of `T`",
> > and every form of `vector` ought to provide that.
> >
>
> I think my original idea of using some kind of type T to enable this is
> wrong, as its similar to vector<bool>. Certaintly though it could make
> sense to add a traits parameter to enable bitset behavior, with a
> static_assert() that T is bool.
>
> Iterating over bits is fundamentally different than iterating over T. I
> agree that it is a bit uncomfortable how it crosses the turf of vector a
> bitset. However, if we added some facility to omni_vec, it means we get all
> the storage foundations for all possible bitsets for free.
>
> Bitsets operate exactly like vectors and arrays. The only difference is
> that bits need special handling because they can't be represented with a
> type T directly.

But what about omni_vec<nybble>? Why should bits be more equal than nybbles?

(A nybble is a 4-bit number)

/MF

Magnus Fromreide

unread,
Apr 18, 2017, 2:21:31 AM4/18/17
to ISO C++ Standard - Future Proposals
On Mon, Apr 17, 2017 at 09:51:14AM -0700, Matthew Fioravante wrote:
> As for configuration options, here's all I can think of at the moment:

I think you forgot some options:

1. Capacity is decided upon construction but is treated as constant
afterwards.
2. Capacity is constant when the omni_vec is non-empty.
3. The growth direction is reversed; this implies the existence of *_front modifiers and the absence of *_back ones, so this isn't really a container at all, but it is still useful in some cases.

Given your argument that anything which might be useful at some point should be added, these should go in too, but I would happily agree that at least option 3 should be a separate type.

/MF

inkwizyt...@gmail.com

unread,
Apr 18, 2017, 7:18:15 AM4/18/17
to ISO C++ Standard - Future Proposals

Right, probably the only way to prevent it is to ban non-const access to elements outside of the vector API. `data` would always return a const pointer to an array of `T`, but iterators would return proxies that check whether you are writing proper values. But this would create `vector<bool>` v2 and make things more complicated than they are worth.
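Roughly like this (names invented), which is also exactly why it starts to smell like vector<bool> again:

#include <cassert>

// A write-checking proxy: reads pass through, but writing the sentinel value
// is rejected so user code can't silently truncate the container.
// T is assumed to be a copyable, pointer-like element type.
template <class T>
class checked_ref {
    T* p_;
public:
    explicit checked_ref(T* p) : p_(p) {}
    operator T() const { return *p_; }
    checked_ref& operator=(const T& v) {
        assert(v != T{} && "writing the sentinel value through the API is forbidden");
        *p_ = v;
        return *this;
    }
};
// The container's iterator would return checked_ref<T> from operator*,
// much like vector<bool>'s reference proxy.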

Mathias Gaunard

unread,
Apr 18, 2017, 8:46:31 AM4/18/17
to std-pr...@isocpp.org
I think the first one was better when it was std::dynarray.
I didn't follow the whole story and don't remember why the dynarray proposal was killed just before C++14 was released.

For the second one (be it static_vector or small_vector), SG14 folks are also interested in it, so if anyone wants to submit a new proposal, they could use their backing.

Matthew Fioravante

unread,
May 15, 2017, 11:00:53 AM5/15/17
to ISO C++ Standard - Future Proposals
A blog post that just showed up on the ISO C++ RSS feed today:


And this paragraph about bitset is relevant here (emphasis mine):

The only drawback of using bitset is that it requires compile time N constant. Also, bitset is implementation specific, so we’re not sure how the memory is laid out internally. I would reject this version from the final production code, but might be good for comparisons.

This is an issue I raised about bitset a few years ago, and it failed to pass in the committee. We really need more low-level building blocks in C++ so that expert users can construct the tools they need more easily.