[N4359] vector::release() - Missing symmetric array ownership acquisition operations.


Klaim - Joël Lamotte

Feb 14, 2015, 10:17:00 AM
to std-pr...@isocpp.org
While reading N4359 it occurred to me that while a release() operation
would be useful for the same reasons that unique_ptr's release() operation
is useful, vector does not have a way to acquire ownership
of a manually provided array of data.

That is, if an array is released from a vector, it cannot be inserted in another vector
without copying it.

It seems to me that this is a missed opportunity in cases where
you get a dynamically allocated array from one API and you want
to manage it as a vector in your code. That is, you want to keep the
growing behaviour of vector (otherwise you would use a unique_ptr with an array)
but you also want to avoid an array copy and delete when getting an array from
a C API, for example.
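
To make the trade-off concrete, here is a minimal sketch of the two options available today. The function `c_api_make_ints` is a hypothetical stand-in for a C API that returns a malloc'ed array; `adopt_by_copy`, `adopt_without_copy`, `free_deleter` and `c_array` are illustrative names, not anything proposed in N4359.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>
#include <vector>

// Hypothetical stand-in for a C API that hands out a malloc'ed array.
inline int* c_api_make_ints(std::size_t n) {
    int* p = static_cast<int*>(std::malloc(n * sizeof(int)));
    assert(p != nullptr || n == 0);
    for (std::size_t i = 0; i < n; ++i) p[i] = static_cast<int>(i);
    return p;
}

// Option 1: keep vector's growth behaviour, but pay for a copy and a free.
inline std::vector<int> adopt_by_copy(int* p, std::size_t n) {
    std::vector<int> v(p, p + n);  // copies every element
    std::free(p);                  // the original buffer is discarded
    return v;
}

// Option 2: no copy, but the result cannot grow like a vector.
struct free_deleter {
    void operator()(int* p) const noexcept { std::free(p); }
};
using c_array = std::unique_ptr<int[], free_deleter>;

inline c_array adopt_without_copy(int* p) { return c_array(p); }
```

Neither option gives both properties at once, which is the gap the post describes.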

Ownership acquisition would be implemented as a constructor (maybe tagged by a parameter to help clarity)
and a member function.
I exclude assignment operator overloads because there is a need to provide the size
and maybe a (polymorphic?) allocator to the vector.

Of course this ownership acquisition would be less safe than the current copy behaviour
but it would be useful for high performance applications in the contexts described above.


Any thoughts on this?

Dale Weiler

Feb 14, 2015, 1:04:54 PM
to std-pr...@isocpp.org
Having looked over this proposal I'm actually wondering if release should return a pair of pointer and size.
I think this would be much better (and safer) since the size will often be required and it will prevent strange
interface contracts; like grabbing the size before the call to release, as the call invalidates the vector contents
(including size.)

So a minor modification I'd suggest


pair<T*, size_type> release() noexcept;

At least this way one could use this as such

vector<int> stuff;
getStuff(stuff);

auto p = stuff.release();
some_c_function(p.first, p.second); // some_c_function(int *, size_t)

Perhaps however it would be better to use a tuple of pointer, size and allocator

tuple<T*, size_type, allocator_type> release() noexcept;

Then at least all the relevant information about the release can be grabbed all at once when releasing the vector

vector<A> stuff;
getStuff(stuff);

auto p = stuff.release();

doStuffWith(std::get<0>(p), std::get<1>(p)); // doStuff(A *, size_t)

for (size_t i = 0; i < std::get<1>(p); i++) std::get<0>(p)[i].~A();

std::get<2>(p).deallocate(std::get<0>(p), std::get<1>(p));

However come to think of it, it may just be wiser (in general) to add something like
vector_view that contains all this information

template <typename T, typename A = allocator<T>>
struct vector_view {
  vector_view(vector<T, A> &&vec)
    : data_(vec.data_) // vector<T, A> will have to friend vector_view
    , size_(vec.size())
    , allocator_(vec.get_allocator()) {
  }

  typedef typename vector<T, A>::value_type value_type;
  typedef typename vector<T, A>::allocator_type allocator_type;
  typedef typename vector<T, A>::reference reference;
  typedef typename vector<T, A>::const_reference const_reference;
  typedef typename vector<T, A>::pointer pointer;
  typedef typename vector<T, A>::const_pointer const_pointer;
  typedef typename vector<T, A>::iterator iterator;
  typedef typename vector<T, A>::const_iterator const_iterator;
  typedef typename vector<T, A>::size_type size_type;

  // Implements begin, end, size, empty, operator[], at, front, back, data, get_allocator
  iterator begin() { return data_; }
  const_iterator begin() const { return data_; }
  iterator end() { return data_ + size_; }
  const_iterator end() const { return data_ + size_; }

  size_type size() const { return size_; }
  bool empty() const { return size_ == 0; }

  reference operator[](size_type n) { return data_[n]; }
  const_reference operator[](size_type n) const { return data_[n]; }
  reference at(size_type n) { if (n >= size_) throw std::out_of_range("vector_view::at"); return data_[n]; }
  const_reference at(size_type n) const { if (n >= size_) throw std::out_of_range("vector_view::at"); return data_[n]; }

  reference front() { return *data_; }
  const_reference front() const { return *data_; }
  reference back() { return data_[size_ - 1]; }
  const_reference back() const { return data_[size_ - 1]; }

  pointer data() noexcept { return data_; }
  const_pointer data() const noexcept { return data_; }

  allocator_type get_allocator() const { return allocator_; }

private:
  value_type *data_;
  size_type size_;
  allocator_type allocator_; // stored by value: vector::get_allocator() returns a copy
};

Then vector<T, A> could implement release as outputting one of these

vector_view<T, A> release() noexcept;

Then one could write the previous example as such

auto view = vec.release();
doStuffWith(&view[0], view.size());

for (auto &it : view) it.~A();
view.get_allocator().deallocate(view.data(), view.size());

Could even add a utility function to vector_view that does the deallocation for you,
perhaps in the nature of allocator<T> we could call it deallocate.

template <typename T, typename A = allocator<T>>
struct vector_view {
  ...
  void deallocate() noexcept {
    for (auto &it : *this) {
      it.~value_type();
    }
    allocator_.deallocate(data_, size_);
  }
  ...
};


Then the full example becomes

vector<A> stuff;
getStuff(stuff);

auto view = stuff.release();

doStuffWith(&view[0] /* or view.data() */, view.size()); // doStuff(A *, size_t)

view.deallocate();


Nevin Liber

Feb 14, 2015, 1:28:45 PM
to std-pr...@isocpp.org
On 14 February 2015 at 12:04, Dale Weiler <weile...@gmail.com> wrote:
> Having looked over this proposal I'm actually wondering if release should return a pair of pointer and size.
> I think this would be much better (and safer) since the size will often be required and it will prevent strange
> interface contracts; like grabbing the size before the call to release, as the call invalidates the vector contents
> (including size.)

You need to return three things:

1.  Pointer to the data.
2.  Number of elements.
3.  Allocator.
 

> So a minor modification I'd suggest
>
> pair<T*, size_type> release() noexcept;

I much prefer returning things with names, to make things readable.  It should return a dedicated class/struct instance.
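
As an illustration of the readability argument, here is a minimal sketch of a dedicated result type with named members. Everything in it is hypothetical: `released_array`, `make_released_ints` and `sum_and_dispose` are made-up names, and since std::vector has no release() today the producer simply allocates through std::allocator so the sketch can run.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical result type with named members instead of pair/tuple slots.
template <class T, class Alloc = std::allocator<T>>
struct released_array {
    T* data;            // first element
    std::size_t size;   // number of constructed elements
    Alloc allocator;    // allocator that must free the block
};

// Stand-in producer so the sketch is runnable: builds the ints 0..n-1.
inline released_array<int> make_released_ints(std::size_t n) {
    using traits = std::allocator_traits<std::allocator<int>>;
    std::allocator<int> alloc;
    int* p = traits::allocate(alloc, n);
    for (std::size_t i = 0; i < n; ++i)
        traits::construct(alloc, p + i, static_cast<int>(i));
    return {p, n, alloc};
}

// Consumer: r.data and r.size read better than std::get<0>(r) and std::get<1>(r).
inline long sum_and_dispose(released_array<int> r) {
    using traits = std::allocator_traits<std::allocator<int>>;
    long total = 0;
    for (std::size_t i = 0; i < r.size; ++i) total += r.data[i];
    for (std::size_t i = r.size; i > 0; --i)
        traits::destroy(r.allocator, r.data + (i - 1));
    traits::deallocate(r.allocator, r.data, r.size);  // in this sketch size == capacity
    return total;
}
```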

Note:  do not consider this message to be an endorsement by me of this proposal; I'm still thinking about it.
--
 Nevin ":-)" Liber  <mailto:ne...@eviloverlord.com>  (847) 691-1404

Dale Weiler

Feb 14, 2015, 1:32:26 PM
to std-pr...@isocpp.org
@Nevin Liber
Expand my full post. I ended up designing such a construct (but Google Groups decides to truncate it).

--

---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposal...@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposals/.

Klaim - Joël Lamotte

Feb 14, 2015, 2:09:32 PM
to std-pr...@isocpp.org

A type owning a dynamically allocated array is not a "view"...

Thiago Macieira

Feb 14, 2015, 2:10:09 PM
to std-pr...@isocpp.org
On Saturday 14 February 2015 12:28:01 Nevin Liber wrote:
> You need to return three things:
>
> 1. Pointer to the data.
> 2. Number of elements.
> 3. Allocator.

Make it 4:

1. Pointer to the data
2. Number of elements
3. Pointer to be passed to the allocator to be freed
4. Allocator

std::vector may allocate a block of data that is bigger than the actual
array and it may store some book-keeping information at the beginning of such
a block. Example:

template <typename T, typename Allocator>
struct vector_data
{
    size_t count;
    size_t capacity;
    Allocator allocator;
    T array[0]; // compiler extension support
};

template <class T, class Allocator = allocator<T> >
class vector
{
    vector_data<T, Allocator> data;
    [...]
};
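
To show why the pointer to free can differ from the pointer to the data, here is a minimal runnable sketch of such a header-in-front layout using malloc. It is not how any particular std::vector is implemented; `block_header`, `elements_offset`, `allocate_block`, `block_start` and `header_of` are hypothetical names, and the sketch assumes element types whose alignment malloc can satisfy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical layout: bookkeeping lives in front of the elements.
struct block_header {
    std::size_t count;
    std::size_t capacity;
};

// Offset of the first element, rounded up to the alignment of T.
template <class T>
constexpr std::size_t elements_offset() {
    return (sizeof(block_header) + alignof(T) - 1) / alignof(T) * alignof(T);
}

// Allocates header + storage for `capacity` elements; returns the element pointer.
template <class T>
T* allocate_block(std::size_t capacity) {
    void* raw = std::malloc(elements_offset<T>() + capacity * sizeof(T));
    if (!raw) throw std::bad_alloc();
    new (raw) block_header{0, capacity};
    return reinterpret_cast<T*>(static_cast<char*>(raw) + elements_offset<T>());
}

// The pointer that must go back to the allocator is NOT the element pointer.
template <class T>
void* block_start(T* elements) {
    return reinterpret_cast<char*>(elements) - elements_offset<T>();
}

template <class T>
block_header* header_of(T* elements) {
    return static_cast<block_header*>(block_start(elements));
}
```

A caller who only received the element pointer and passed it to free() would corrupt the heap, which is the reason for the extra item in the list.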

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358

Howard Hinnant

Feb 14, 2015, 2:15:38 PM
to std-pr...@isocpp.org
This is decent anecdotal evidence that the proposal places unreasonably error prone responsibilities on the caller of release(). I’ve yet to see a correct handling of the vector’s buffer, either in the paper, or here, even if I assume that the allocator is std::allocator, much less an arbitrary user-supplied allocator.

By the time you correctly dispose of this buffer, you’ve gone a long way towards re-implementing std::vector.

(update) Just saw Thiago’s post! :-) That’s the closest yet! :-)

Howard

Miro Knejp

Feb 14, 2015, 2:27:03 PM
to std-pr...@isocpp.org

On 14.02.2015 at 20:10, Thiago Macieira wrote:
> On Saturday 14 February 2015 12:28:01 Nevin Liber wrote:
>> You need to return three things:
>>
>> 1. Pointer to the data.
>> 2. Number of elements.
>> 3. Allocator.
> Make it 4:
>
> 1. Pointer to the data
> 2. Number of elements
> 3. Pointer to be passed to the allocator to be freed
> 4. Allocator
5. Total capacity of the allocated block for further shenanigans

Howard Hinnant

Feb 14, 2015, 2:38:37 PM
to std-pr...@isocpp.org
<nod>

And you can’t traffic in T* as the proposal says. You must deallocate vector<T>::pointer which may not be a T*. And you can’t call ~T() directly. And to be consistent, you probably shouldn’t call allocator.deallocate() directly either, though I can see a rationale that says there’s no way to get into trouble if you do (a very subtle point). And when calling allocator_traits<Allocator>::destroy, you must correctly convert vector<T>::pointer to a T*. And please don’t do any of this in a way that could compromise basic exception safety.
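
A minimal sketch of a disposal routine that tries to follow these rules: it takes the allocator's own pointer type, destroys through allocator_traits, converts to a raw pointer with std::addressof, and deallocates the full capacity. The names `destroy_and_deallocate` and `make_strings` are hypothetical, and this is one reading of the rules above rather than a vetted implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Destroys `size` elements in reverse order and returns the block of
// `capacity` elements to the allocator, going through allocator_traits only.
template <class Alloc>
void destroy_and_deallocate(
    Alloc& alloc,
    typename std::allocator_traits<Alloc>::pointer p,
    typename std::allocator_traits<Alloc>::size_type size,
    typename std::allocator_traits<Alloc>::size_type capacity) noexcept {
    using traits = std::allocator_traits<Alloc>;
    assert(size <= capacity);
    for (auto i = size; i > 0; --i)
        // addressof(*ptr) turns a possibly fancy pointer into a raw T*.
        traits::destroy(alloc, std::addressof(*(p + (i - 1))));
    traits::deallocate(alloc, p, capacity);
}

// Demo helper (hypothetical): builds `size` strings in a block of `capacity`.
inline std::string* make_strings(std::allocator<std::string>& alloc,
                                 std::size_t size, std::size_t capacity) {
    using traits = std::allocator_traits<std::allocator<std::string>>;
    std::string* p = traits::allocate(alloc, capacity);
    std::size_t built = 0;
    try {
        for (; built < size; ++built)
            traits::construct(alloc, p + built, "element");
    } catch (...) {
        // Basic exception safety: undo exactly what was built so far.
        destroy_and_deallocate(alloc, p, built, capacity);
        throw;
    }
    return p;
}
```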

Howard

Howard Hinnant

Feb 14, 2015, 4:49:44 PM
to std-pr...@isocpp.org
It occurs to me that one way to rescue this proposal is to follow the spirit of:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3586.pdf

1. Introduce a nested type vector<T, A>::node_ptr.

2. node_ptr is an alias for unique_ptr<T[], allocator_deleter<T, A>>.

3. allocator_deleter<T, A> looks something like:

template <class T, class Allocator>
class allocator_deleter
{
public:
    using value_type = T;
    using allocator_type = Allocator;
    using traits = std::allocator_traits<allocator_type>;
    using pointer = typename traits::pointer;
    using size_type = typename traits::size_type;
private:
    std::tuple<allocator_type, size_type, size_type> data_;
public:
    allocator_deleter(allocator_type const& a, size_type size,
                      size_type capacity) noexcept
        : data_(a, size, capacity)
        {}

    void operator()(pointer p) noexcept
    {
        for (pointer e = p + std::get<1>(data_); p != e;)
            traits::destroy(std::get<0>(data_), std::addressof(*--e));
        traits::deallocate(std::get<0>(data_), p, std::get<2>(data_));
    }

    allocator_type get_allocator() const noexcept {return std::get<0>(data_);}
    size_type size() const noexcept {return std::get<1>(data_);}
    size_type capacity() const noexcept {return std::get<2>(data_);}

    static_assert((std::is_same<value_type, typename allocator_type::value_type>::value),
                  "Invalid allocator::value_type");
};

3a. If allocator_deleter needs to do anything fancier than this to handle details such as those Thiago alludes to, it is up to the author of vector to put those details into allocator_deleter.

3b. allocator_deleter is not necessarily a user-accessible name. Clients should reach for vector<T, A>::node_ptr::deleter_type if they need a name for this type.

4. Introduce to vector<T, A>:

template <class T, class A>
typename vector<T, A>::node_ptr
release();

5. Introduce an explicit vector<T, A> constructor:

template <class T, class A>
vector<T, A>::vector(node_ptr&& np);

Example use:

int
main()
{
    std::vector<X> v;
    for (int i = 0; i < 3; ++i)
        v.push_back(X(i));
    auto up = v.release();
    for (auto i = 0; i < up.get_deleter().size(); ++i)
        std::cout << up[i] << '\n';
}
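
Since vector::release() does not exist, the example above cannot be compiled as written. The following sketch emulates the shape of the result so the unique_ptr side can be tried out. It is an approximation under stated assumptions: `array_deleter`, `owned_array` and `release_by_copy` are hypothetical names, the deleter is a condensed variant fixed to std::allocator by default, and `release_by_copy` copies the elements, which is exactly what a real release() would avoid.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Condensed deleter: remembers allocator, element count and block capacity.
template <class T, class Alloc = std::allocator<T>>
class array_deleter {
    using traits = std::allocator_traits<Alloc>;
    Alloc alloc_;
    std::size_t size_ = 0;
    std::size_t capacity_ = 0;
public:
    using pointer = typename traits::pointer;
    array_deleter() = default;
    array_deleter(const Alloc& a, std::size_t size, std::size_t capacity)
        : alloc_(a), size_(size), capacity_(capacity) {}
    void operator()(pointer p) noexcept {
        for (std::size_t i = size_; i > 0; --i)
            traits::destroy(alloc_, std::addressof(*(p + (i - 1))));
        traits::deallocate(alloc_, p, capacity_);
    }
    std::size_t size() const noexcept { return size_; }
    std::size_t capacity() const noexcept { return capacity_; }
};

template <class T>
using owned_array = std::unique_ptr<T[], array_deleter<T>>;

// Stand-in for the proposed release(): it COPIES, because std::vector
// offers no way to give up its buffer.
template <class T>
owned_array<T> release_by_copy(const std::vector<T>& v) {
    using traits = std::allocator_traits<std::allocator<T>>;
    std::allocator<T> alloc;
    const std::size_t cap = v.capacity();
    T* p = traits::allocate(alloc, cap);
    std::size_t built = 0;
    try {
        for (; built < v.size(); ++built)
            traits::construct(alloc, p + built, v[built]);
    } catch (...) {
        array_deleter<T>(alloc, built, cap)(p);  // clean up the partial block
        throw;
    }
    return owned_array<T>(p, array_deleter<T>(alloc, v.size(), cap));
}
```

The caller never names the size or capacity when disposing of the buffer; the deleter carries them, which is the error-proneness reduction the post is after.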

Open questions (at least for me).

1. This was fun to create. But does it have sufficient use cases to justify standardization? Combined with a custom allocator built on malloc/free it *might* enable the memory ownership transfer with legacy code capability alluded to in N4359.

2. Is the API of allocator_deleter sufficiently general to serve the use cases it needs to? One possibility would be to add another pointer to its API to allow the first element to begin at a non-zero offset with respect to the allocated pointer. This *might* allow ownership transfer between vector and user-defined containers such as a “sliding buffer” (untested).

3. Is the error-proneness sufficiently reduced? I’m too biased to answer that question myself.

At any rate, I think this at least addresses the technical problems with the proposal. Please feel free to take this and run with it. At this point I’m not planning on proposing it.

Howard

inkwizyt...@gmail.com

Feb 15, 2015, 9:05:06 AM
to std-pr...@isocpp.org
I think your approach is best; it will allow handling raw vectors with a very low chance of error. It will be exactly as error-prone as the current `unique_ptr` release with a custom deleter.
Overall, in some simple cases this approach allows simple syntax: `f(vec.release().release())`. I think this is a feature, because you need to type more to get a dangerous effect.

Another thing is that some people wanted raw/uninitialized elements in vector:
https://groups.google.com/a/isocpp.org/forum/?fromgroups#!searchin/std-proposals/vector/std-proposals/d6WYvULWZCo/zuEo_CXCBpoJ
https://groups.google.com/a/isocpp.org/forum/?fromgroups=#!searchin/std-proposals/push_back_/std-proposals/5BnNHEr07QM/i8lb7fqKSWkJ

This proposal will allow handling them without breaking vector invariants.

If `std::string` gets a similar proposal, then transferring data between them will be easy (if it's your version):
std::string s = std::string{ vec.release() };

gmis...@gmail.com

Feb 17, 2015, 8:37:58 AM
to std-pr...@isocpp.org
I think the concept behind this proposal has a whole heap (groan) of merit.

I was recently doing something that made me want the OP's proposal already.

That sent me on the lookout for something appropriate and I saw others before me felt the same, see this:
Notice the Detach method. I think the demand for memory interop is high.

vector is a starting point, but that begs the question of string too, as someone else already pointed out.

But more new / malloc  free / delete interoperability would be something worth exploring deeply in my opinion.
Now that C++ has move semantics, maybe a C++ renew ability to match realloc makes sense too.
And generally more ability to gain more precise control over memory would be good if it is reasonably possible.

I think C style interfaces will not lessen in value, just gain more C++ on the inside of them as well as outside, so if C++ can support that better without totally compromising itself, then I think that would be a good thing for both languages. It seems the analysis needs to go beyond just vector.

Maybe a C++ memory transfer / compatibility Study Group is appropriate here?

Jeremy Maitin-Shepard

Feb 19, 2015, 11:59:27 AM
to std-pr...@isocpp.org
On Saturday, February 14, 2015 at 1:49:44 PM UTC-8, Howard Hinnant wrote:

> It occurs to me that one way to rescue this proposal is to follow the spirit of:
>
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3586.pdf
>
> 1.  Introduce a nested type vector<T, A>::node_ptr.
>
> 2.  node_ptr is an alias for unique_ptr<T[], allocator_deleter<T, A>>.
>
> 3.  allocator_deleter<T, A> looks something like:
> [snip]

I think we have to consider what the underlying goal is.  One thing that a lot of people would like to do is use std::vector as a generic buffer interface.  Release without attach doesn't seem very useful.  I am having trouble understanding in what circumstance this "node_ptr" would actually be useful, rather than just using std::vector itself.  A shared_ptr representing the contents of a vector is already possible by moving the vector into the deleter.
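
A minimal sketch of the shared_ptr idea. Note that it is a variation on the trick described: rather than literally moving the vector into a deleter, it moves the vector into a shared control block and uses the aliasing constructor of std::shared_ptr, which has the same effect of keeping the buffer alive for as long as any owner exists. `share_contents` is a hypothetical name.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Returns a shared_ptr to the first element. The vector itself is owned by
// the control block, so its buffer is freed when the last owner goes away.
template <class T>
std::shared_ptr<T> share_contents(std::vector<T> v) {
    auto holder = std::make_shared<std::vector<T>>(std::move(v));
    return std::shared_ptr<T>(holder, holder->data());  // aliasing constructor
}
```

The size is not carried along here; a caller that needs it has to record it before handing the vector over.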

The real problem, I think, is that without standardizing exactly how std::vector uses the allocator, release and attach don't make a lot of sense.  If we standardize that std::vector does the obvious thing with the allocator, and doesn't do any tricks like putting a header at the start of the allocated memory, then it starts to be a very powerful thing.  The default allocator would work for buffers allocated with new [], a malloc/free allocator could be defined for memory obtained via malloc, and std::vector would be the generic buffer interface.  We just need to add a few more methods to allow users to directly manipulate the size without invoking any constructors/destructors.  I don't think there is any advantage in having a separate type to hold the released data, as it would really be equivalent to std::vector itself.
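
For reference, a malloc/free allocator of the kind mentioned can already be written today. The sketch below is a minimal one, with `malloc_allocator` as an illustrative name; it assumes element types whose alignment malloc can satisfy. What is missing is not the allocator but any guarantee about how std::vector lays out the block it requests from it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <limits>
#include <new>
#include <vector>

// Minimal allocator that obtains memory with malloc and returns it with free.
template <class T>
struct malloc_allocator {
    using value_type = T;

    malloc_allocator() noexcept = default;
    template <class U>
    malloc_allocator(const malloc_allocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        if (n == 0) return nullptr;
        if (n > std::numeric_limits<std::size_t>::max() / sizeof(T))
            throw std::bad_alloc();  // n * sizeof(T) would overflow
        void* p = std::malloc(n * sizeof(T));
        if (!p) throw std::bad_alloc();
        return static_cast<T*>(p);
    }
    void deallocate(T* p, std::size_t) noexcept { std::free(p); }
};

// Stateless: any two instances can free each other's memory.
template <class T, class U>
bool operator==(const malloc_allocator<T>&, const malloc_allocator<U>&) noexcept {
    return true;
}
template <class T, class U>
bool operator!=(const malloc_allocator<T>&, const malloc_allocator<U>&) noexcept {
    return false;
}
```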

There is the question of whether any implementation does anything fancy with the allocator that would conflict with these new requirements.  Hopefully not.  Potentially this would be a good time to finally blow away vector<bool>.

Alternatively, to avoid ABI breaks, a new class std::buffer or something could be defined with these requirements, and std::vector could be deprecated.

Ville Voutilainen

Feb 19, 2015, 12:26:07 PM
to std-pr...@isocpp.org
On 19 February 2015 at 18:59, Jeremy Maitin-Shepard <jer...@jeremyms.com> wrote:
> There is the question of whether any implementation does anything fancy with
> the allocator that would conflict with these new requirements. Hopefully
> not. Potentially this would be a good time to finally blow away
> vector<bool>.

Well, if there is a superior replacement, such "blowing away" might be
considered..

> Alternatively, to avoid ABI breaks, a new class std::buffer or something
> could be defined with these requirements, and std::vector could be
> deprecated.

..but there are some things that don't make sense to deprecate even if you
come up with a superior replacement. If you want to extend vector with new
capabilities, the next version of the standard library might perhaps be a good
place for such a thing, *if* we end up not making it fully compatible.

gmis...@gmail.com

Feb 19, 2015, 11:05:00 PM
to std-pr...@isocpp.org

> I think we have to consider what the underlying goal is.  One thing that a lot of people would like to do is use std::vector as a generic buffer interface.  Release without attach doesn't seem very useful.

Yes that's what I was getting at in my earlier post - vector having attach and detach buffer functions like the Buffer class I linked to. Plus full interchange between new malloc realloc and free and delete.

It would be useful to make string have the same facilities, but if I'm thinking clearly here, the small string optimization might make that interface problematic. You would logically think that any time you had a string you could detach the buffer and use it as a char[] array, but if the buffer was in the SSO area you would effectively be asking for an allocation to detach it, and that might be a surprising point of failure. You could give up SSO but I don't know if people would or should vote for that. Or maybe there is another workable interface, but otherwise it could get a little fiddly. I don't know how viable any of this is in reality. Maybe supporting this is too limiting for the types. But it seems a worthwhile discussion to have.
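
The SSO concern can be observed directly. The sketch below checks whether a string's character data lives inside the string object itself; `stored_inline` is a hypothetical helper. Whether short strings are stored inline, and up to what length, depends on the standard library implementation, so the result for short strings is deliberately not stated here.

```cpp
#include <cassert>
#include <functional>
#include <string>

// True when the character data lives inside the string object itself,
// which is what a small-string-optimised implementation does for short strings.
inline bool stored_inline(const std::string& s) {
    const char* obj_begin = reinterpret_cast<const char*>(&s);
    const char* obj_end = obj_begin + sizeof(std::string);
    std::less<const char*> before;  // total order even for unrelated pointers
    return !before(s.data(), obj_begin) && before(s.data(), obj_end);
}
```

A string for which this returns true has no heap buffer to hand over, so a detach operation on it would have to allocate first.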

Ville talks about a new STL. Would people want this ability from a new STL if it were possible? I believe I would, but I don't know what the costs of getting it are, i.e. what will get more difficult or what I will lose elsewhere if I get this. If that cost turns out to be more than I expected, maybe I wouldn't want this feature after all. But until I know for sure, it seems appealing.

Jeremy Maitin-Shepard

Feb 19, 2015, 11:12:01 PM
to std-pr...@isocpp.org
On Thu, Feb 19, 2015 at 9:26 AM, Ville Voutilainen <ville.vo...@gmail.com> wrote:
> On 19 February 2015 at 18:59, Jeremy Maitin-Shepard <jer...@jeremyms.com> wrote:
>> There is the question of whether any implementation does anything fancy with
>> the allocator that would conflict with these new requirements.  Hopefully
>> not.  Potentially this would be a good time to finally blow away
>> vector<bool>.
>
> Well, if there is a superior replacement, such "blowing away" might be
> considered..

It hardly needs to be repeated, but the main problem with vector<bool> is its name.  Changing it to bitvector would at least solve much of the problem.  I'm sure there is a lot that could be improved about its interface, but just providing a way to reliably *avoid* it in generic code would be a good start.
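
One workaround people use today to avoid the specialization in generic code is to substitute a wrapper element type. A minimal sketch follows; `plain_bool` and `contiguous_vector` are hypothetical names, not anything proposed in this thread.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// Hypothetical wrapper: one real bool per element, addressable like any other T.
struct plain_bool {
    bool value;
    plain_bool(bool v = false) : value(v) {}
    operator bool() const { return value; }
};

// Hypothetical alias: picks std::vector<plain_bool> when T is bool.
template <class T>
using contiguous_vector = std::vector<
    typename std::conditional<std::is_same<T, bool>::value, plain_bool, T>::type>;

static_assert(!std::is_same<std::vector<bool>::reference, bool&>::value,
              "vector<bool> hands out proxy references");
static_assert(std::is_same<contiguous_vector<bool>::reference, plain_bool&>::value,
              "the alias hands out ordinary references");
```

With the alias, data() is available and yields a pointer to real elements, which is the property a release/attach style interface would need.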

Jeremy Maitin-Shepard

Feb 19, 2015, 11:18:00 PM
to std-pr...@isocpp.org
I agree that supporting it for std::string would be very useful (and more generally allowing inexpensive moving between string and vector).  In the case that SSO is used, an allocation would be required, but that wouldn't necessarily be a problem, since I imagine that most of the time when release() is called, a heap buffer really is required, and if there isn't already one, it would have to be allocated.  The main issue here would be the same as with vector: the precise way in which string uses the allocator would have to be defined by the standard, and it is possible that this would mean an ABI break for some implementations.



jerry.l...@gmail.com

Apr 27, 2015, 5:33:22 AM
to std-pr...@isocpp.org
I think we should be wiser and accept that calling release will definitely lose some information.
This imperfection should be introduced to solve a problem that we could not solve before. In many situations I need a release operation, so I access the private fields of a vector to achieve that, but then I lose portability.

I would like to see the problem go away somehow, so that we can transfer the ownership of the internal array.
I would also prefer that there were no such thing as release. If we have to have a release, it should be simple, and not dangerous in most cases.


Marc

Apr 27, 2015, 7:12:16 AM
to std-pr...@isocpp.org
On Saturday, 14 February 2015 at 20:10:09 UTC+1, Thiago Macieira wrote:
> std::vector may allocate a block of data that is bigger than the actual
> array and it may store some book-keeping information at the beginning of such
> a block. Example:
>
> template <typename T, typename Allocator>
> struct vector_data
> {
>         size_t count;
>         size_t capacity;
>         Allocator allocator;
>         T array[0];                // compiler extension support
> };
>
> template <class T, class Allocator = allocator<T> >
> class vector
> {
>         vector_data<T, Allocator> data;
>         [...]
> };

(data is probably a pointer to vector_data)
Not directly related to this thread, but do you know why none of the common implementations of std::vector chose such a strategy? Mozilla has nsTArray but restricted to POD types, std::string was like this in libstdc++, but I don't see any real vector. In some applications, having a lot of empty vectors can be very convenient, and if it only takes one pointer it is actually efficient. So I am wondering if the choice was made for a small speed-up when computing the size of a vector, or if there are implementation difficulties (possibly due to allocators? Or because the most efficient way to handle the empty vector is pointing to a singleton object and Windows still has issues with that across DLLs?).