Sometimes when using vectors in low-level embedded land it is nice to know where the underlying buffer of a vector is before any data exists in the vector. As it stands it is not possible to do this, even if first calling reserve.
....
This will allow people to grab a pointer to the underlying buffer of a vector if and only if they have reserved space first. I suspect many people incorrectly expect one of the above methods I have listed to work as intended (though they do not).
vector<int> v;
v.reserve(10);
memcpy(v.data(), ..., 10 * sizeof(int));
v.resize(10);
vector<int> v(10, std::default_init);
memcpy(v.data(), ..., 10 * sizeof(int));
On Friday, July 21, 2017 at 10:40:25 PM UTC-4, stevem...@gmail.com wrote:
> Sometimes when using vectors in low-level embedded land it is nice to know where the underlying buffer of a vector is before any data exists in the vector. As it stands it is not possible to do this, even if first calling reserve.
> ....
> This will allow people to grab a pointer to the underlying buffer of a vector if and only if they have reserved space first. I suspect many people incorrectly expect one of the above methods I have listed to work as intended (though they do not).
To what end?
Assume now that I want to use the vector to hold assembly instructions. The encodings of various instructions are relative to their location in memory, so in order to encode them properly one must know where they are about to live inside the vector.
OK, let's say that we allow you to get a pointer to that buffer. What are you going to do with it? Nothing that's legal in C++, that's for sure.
I really do want to get a pointer to that buffer, and for good reason. If I can get a pointer to that buffer then I can use std::uninitialized_copy to fill it in at my leisure. There really are cases where I need to know the memory address that my buffer lives at before I put anything into it. Adding default-initialization to all standard types would be cool, but I hardly think that's the easiest solution to this issue. There are also cases where I don't want to pay for this default initialization: if you give me a way to fill that buffer myself and then tell vector about its change in size(), then there's no reason to pay for default initialization.
This is all well and good from a puritan standpoint. But at the end of the day people use the STL for real world things and from embedded land I can tell you that I need to know where my buffer is.
If adding default initialization to all the types actually happens then I'm totally for it. I'm weighing the probability of this actually happening. It seems like something like that would come around in C++30, while this could be done tomorrow.
I'm more than aware this is currently not safe. That is WHY I made this thread: so that vectors can morph into a thing that can finally replace C-style arrays. As it is, I am forced to use C-style arrays, memcpy into them, and then construct a vector from that. STL vectors already do 99% of anything a C-style array can do; if we add just a tiny bit more, then C-style arrays will finally have no real use case. I don't care AT ALL how that happens; just let me memcpy into a vector after calling reserve and I'll be a happy man.
> I'm more than aware this is currently not safe. That is WHY I made this thread: so that vectors can morph into a thing that can finally replace C-style arrays. As it is, I am forced to use C-style arrays, memcpy into them, and then construct a vector from that.
Please explain why it is difficult to write a class that just manages one heap allocation and a length?
Call resize(), not reserve(). The reserve call may be ignored, if for any
reason the container does not wish to comply.
Re-implementing my custom vector class that lets me memcpy into it is totally a possibility, but I'd be re-implementing exactly what the STL does, plus making .data() return a valid pointer. There's no reason to do that and it'd be a horrible waste of time. I could totally use a C-style array wrapped in a unique_ptr, but what happens when I want to add more instructions to my array? Oh, I have to resize, copy, and do all that logic.
On Saturday, July 22, 2017 at 12:20:57 AM UTC-4, Nevin ":-)" Liber wrote:
> On Fri, Jul 21, 2017 at 11:14 PM, Nevin Liber <ne...@eviloverlord.com> wrote:
> > Please explain why it is difficult to write a class that just manages one heap allocation and a length?
> I'm sorry; that is over-engineering. You were not proposing that vector keep track of the length.
> Simpler question: why isn't unique_ptr<T[]> the solution to your problem?
> --
> Nevin ":-)" Liber <mailto:ne...@eviloverlord.com> +1-847-691-1404
Re-implementing my custom vector class that lets me memcpy into it is totally a possibility, but I'd be re-implementing exactly what the STL does, plus making .data() return a valid pointer. There's no reason to do that and it'd be a horrible waste of time.
That's true for std::vector and the Standard Library containers, but it is not
a general rule for all containers.
Reserving space is a hint that you're going
to need at least that amount. The keyword is "hint".
Besides, is there any wording that resize() with the same size on a
space-reserved container must NOT relocate?
What happens if you shrink the
container: is it allowed to relocate?
> Sometimes when using vectors in low-level embedded land it is nice to know where the underlying buffer of a vector is before any data exists in the vector. As it stands it is not possible to do this, even if first calling reserve.
> ....
> This will allow people to grab a pointer to the underlying buffer of a vector if and only if they have reserved space first. I suspect many people incorrectly expect one of the above methods I have listed to work as intended (though they do not).
To what end?
OK, let's say that we allow you to get a pointer to that buffer. What are you going to do with it? Nothing that's legal in C++, that's for sure.
After all, the standard specifically states that the return from `data` is:
> A pointer such that [data(), data() + size()) is a valid range.
Since `size` is zero, that will be an empty range.
But let's pretend that we allowed you to access up to `capacity` rather than `size`. What good is that? First, there are no `T`'s past the end of the sequence. So pointer arithmetic in that region is dubious, and you can't just write to objects that don't exist.
int id = 13;
std::vector<std::byte> v;
v.reserve(1024);
message_buffer(id, v.data());
//some time after
v.clear();
v.push_back(1);
v.push_back(2);
v.push_back(3);
v.push_back(0); //or `memcpy` of some struct to `v.data()` with correct size
message_notyfy(id);
Maybe it's time to revive the std::dynarray proposal?
On Monday, July 24, 2017 at 6:49:48 PM UTC-4, Arthur O'Dwyer wrote:
Your thread started off great — let's have v.data() do the intuitive thing for reserved vectors! [...]
However, I do think that what you originally asked for is useful in obscure cases, and doesn't cost us anything to standardize:
std::vector<char> v;
v.reserve(10);
char *p = v.data(); // currently UB; you proposed making this OK
v.resize(5);
assert(p == v.data()); // assert that the vector should not have reallocated
I have an example implementation here. (The example implementation is not surprising and matches all existing implementations I'm aware of; what you originally asked for was basically just a change in the wording to match what vendors already do.)
OK, so... what exactly would you change the wording to? The current wording is:
> A pointer such that `[data(), data() + size())` is a valid range. For a non-empty vector, `data() == addressof(front())`.
If the `vector` is empty, what does the pointer point to? What are you guaranteeing about that pointer? You can't guarantee that it points to a valid range, since there isn't one. And you can't just say that it points to something that will be a valid range, since... what exactly does that even mean?
So what is it pointing to? The internal allocation, cast to a `T*`?
Yes, exactly; that's what it points to.
The only question is, how do we express that real-world requirement opaquely and obscurely enough to satisfy the Committee?
But we avoid saying so explicitly, and thus preserve the sacred mystery of std::vector against those philistines who would callously reduce it to a simple dynamically allocated array. :)
Honestly, what was the possible point of that phrasing?
I see one possible change in behavior: currently it is permissible for an implementation given a call to reserve() on an empty vector to defer any allocation to the first insert (and then allocate the full amount required by capacity()); this would no longer be possible (since data() is noexcept).
On Tue, Jul 25, 2017 at 6:27 PM, Edward Catmur <e...@catmur.co.uk> wrote:
> I see one possible change in behavior: currently it is permissible for an implementation given a call to reserve() on an empty vector to defer any allocation to the first insert (and then allocate the full amount required by capacity()); this would no longer be possible (since data() is noexcept).
How exactly is that possible, given that the postcondition for reserve(n) is capacity() >= n?
If you defer the allocation, how can vector guarantee the allocation succeeds, especially since the allocation takes place in the allocator and not by vector itself?
I think it best to require that when capacity() > 0 this pointer must be a "real", non-null pointer that points to the same place where the first element will be placed (i.e. to the buffer). This would mean that allocations can no longer be deferred, but I don't see how that's an issue; I'd rather expect it to be a perk. Someone using reserve likely expects that to be the point where the allocation occurs.
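For anyone who wants to see when their own implementation allocates, here is a minimal sketch. It assumes a hypothetical counting allocator (`CountingAllocator`, `ReserveObservation` and `observe_reserve` are names made up for this illustration, not anything from the standard or from this thread). It only reports what one particular standard library does; it says nothing about what the wording guarantees, which is the open question here.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical helper for this sketch: a minimal allocator that counts
// how many times allocate() is called for element type T.
template <class T>
struct CountingAllocator {
    using value_type = T;

    // Function-local static so the counter works without C++17 inline variables.
    static std::size_t& allocations() {
        static std::size_t count = 0;
        return count;
    }

    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++allocations();
        return std::allocator<T>{}.allocate(n);
    }
    void deallocate(T* p, std::size_t n) noexcept {
        std::allocator<T>{}.deallocate(p, n);
    }
};

template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept { return false; }

struct ReserveObservation {
    std::size_t capacity_after_reserve;
    std::size_t allocations_in_reserve;
    std::size_t allocations_in_inserts;
};

// Reserves n elements on an empty vector, then inserts n elements, and
// records how many allocate() calls happened in each phase.
inline ReserveObservation observe_reserve(std::size_t n) {
    using Alloc = CountingAllocator<int>;
    Alloc::allocations() = 0;

    std::vector<int, Alloc> v;
    v.reserve(n);
    assert(v.capacity() >= n);  // the postcondition of reserve()

    ReserveObservation r{};
    r.capacity_after_reserve = v.capacity();
    r.allocations_in_reserve = Alloc::allocations();

    for (std::size_t i = 0; i < n; ++i) v.push_back(static_cast<int>(i));
    r.allocations_in_inserts = Alloc::allocations() - r.allocations_in_reserve;
    return r;
}
```

Calling `observe_reserve(16)` and inspecting the two allocation counts shows whether the allocation happened inside reserve() or was pushed back to the first insert on the library under test.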
On Wed, Jul 26, 2017 at 10:19 AM, 'Edward Catmur' via ISO C++ Standard - Future Proposals <std-pr...@isocpp.org> wrote:
On Wed, Jul 26, 2017 at 4:30 AM, Nicol Bolas <jmck...@gmail.com> wrote:
On Tuesday, July 25, 2017 at 10:59:34 PM UTC-4, Nevin ":-)" Liber wrote:
On Tue, Jul 25, 2017 at 6:27 PM, Edward Catmur <e...@catmur.co.uk> wrote:
I see one possible change in behavior: currently it is permissible for an implementation given a call to reserve() on an empty vector to defer any allocation to the first insert (and then allocate the full amount required by capacity()); this would no longer be possible (since data() is noexcept).
How exactly is that possible, given that the postcondition for reserve(n) is capacity() >= n?
Simple: you return the new capacity, but you didn't allocate any of the memory behind it yet.
If you defer the allocation, how can vector guarantee the allocation succeeds, especially since the allocation takes place in the allocator and not by vector itself?
That's the sticking point. If there is unused capacity in a `vector`, if `size()` < `capacity()`, then a vector is not allowed to fail due to allocation errors. So it's unclear how an implementation could implement this requirement and still not allocate upon `reserve` > `capacity`.
Am I missing something? [vector.modifiers] just says that in case no reallocation occurs on push_back() all pointers, iterators and references to elements remain valid; an empty vector has no elements, so this is trivially true. Is this somewhere within the overall library, container or sequence container requirements?
Under the wording for "reserve" itself, N4659 says: "No reallocation shall take place during insertions that happen after a call to reserve() until the time when an insertion would make the size of the vector greater than the value of capacity()."
The Standard never exactly defines what it means by "reallocation" AFAIK,
--
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposals+unsubscribe@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.
To view this discussion on the web visit https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/45d228c5-6b66-4426-a620-bcfc33f79c94%40isocpp.org.
So it's unclear whether reserve() is allowed to postpone the actual allocation?
I rely on it to do the actual allocation and defer it to later
vector<int> v(10, std::default_init);
memcpy(v.data(), ..., 10 * sizeof(int));
There are exactly 3 different things being discussed in this thread:
1) Allowing .data() to return a pointer that can be used properly for pointer arithmetic, but have any dereference of it be undefined. I.e. it points to the buffer, but that's it; if you use it, that's U.B.
2) Default-init the vector instead of value-init. This could be a 'real' solution to the next case (#3).
3) Using the .data() pointer to allow direct filling of the buffer, useful in cases where a C API is exposed. But it has been mentioned that the vector can't know of someone modifying it externally, so resize and the like will overwrite changes you make, and letting people do this is probably bad. (The solution was therefore to make dereferencing .data()'s pointer past .size() be U.B.)
I initially proposed #1; after further discussion, I suggested we add #3 for performance reasons for C APIs. There should be two separate solutions in my mind for these.
This thread is getting horribly off topic by bouncing between all three of these. Does anyone at least disagree with changing the wording of .data() to allow it to be valid for pointer arithmatic, as long as the vector has a capacity > 0?
On quarta-feira, 26 de julho de 2017 15:50:21 PDT Nicol Bolas wrote:
> The point that's unclear is (oddly enough) whether "reallocation" is *just*
> about the behavior of iterators/references/pointers, or if it is also about
> actually allocating memory and all of the side-effects thereof which are
> not just iterators/references/pointers. The point is that the standard
> doesn't *explicitly* state that "no reallocation" means no allocating
> memory.
I don't think the allocation is the issue. The issue is the lifetime begin of
the objects in the vector, even if trivial.
On quarta-feira, 26 de julho de 2017 19:31:30 PDT Nicol Bolas wrote:
> What do you mean by "valid for pointer arithmatic[sic]"? Pointer arithmetic
> in C++ is (at present) only defined for arrays of actual live objects,
> which `vector` creates when you insert elements.
You can do pointer arithmetic on the pointer returned by malloc(), operator
new() and the allocators up to the size you allocated (plus one). Initialising
the objects in that storage is not required.
In fact, arithmetic on those pointers is a requirement to start the lifetime
of the objects in the first place.
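A minimal sketch of that point, assuming a plain `std::allocator<int>` stands in for whatever the vector uses internally (the function name `construct_in_raw_storage` is made up for this illustration). The address `storage + i` is computed before any `int` lives there, and that address is exactly what placement new needs in order to start the object's lifetime.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// Allocates storage for n ints, constructs them in place one by one,
// and returns their sum.
inline long construct_in_raw_storage(std::size_t n) {
    assert(n > 0);
    std::allocator<int> alloc;
    int* storage = alloc.allocate(n);  // storage only; no int objects live yet

    // storage + i is computed before any int exists at that address;
    // placement new then starts the lifetime of an int there.
    for (std::size_t i = 0; i < n; ++i) {
        ::new (static_cast<void*>(storage + i)) int(static_cast<int>(i));
    }

    long sum = 0;
    for (std::size_t i = 0; i < n; ++i) sum += storage[i];  // objects are live now

    // int is trivially destructible, so no explicit destructor calls are needed.
    alloc.deallocate(storage, n);
    return sum;
}
```

The reads only happen after construction, which matches the distinction drawn above: computing addresses inside the allocation is fine, dereferencing them before an object exists is not.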
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
As Nicol states, problem 1) Using the value of an empty `vector`'s pointer as some kind of marker, a pointer to hold until it's filled in later.
There are two immediate cases I can think of:
1) The functions makeJmpInst and makeCallInst need to be given a pointer to where the instruction will be in memory. Their encoding depends upon their memory location, i.e. you cannot put them into the vector without first knowing where they will be put.
std::vector<uint8_t> buf;
buf.reserve(32);
uint8_t* raw_buf = buf.data();
buf.push_back(makeJmpInst(raw_buf).bytes());
buf.push_back(makeCallInst(raw_buf).bytes());
2) gives you a way to check if the buffer was re-allocated
vector<int> vec1 = ...
auto ptr = vec1.data();
vec1.push_back(0); //Cause reallocation. `ptr` is invalid.
vector<int> vec2 = ... //Just so happens to have the `ptr` value.
vec1 = std::move(vec2);
ptr == vec1.data(); //Reallocation happened, but we can't tell.
Where is it stated that a pointer has to point to a valid object? Can it not just point to memory?
A call to reserve is absolutely going to put an array of bytes into the underlying vector. What am I misunderstanding that disallows you from casting this to a T* and letting me do pointer arithmetic?
If it's an issue with T*, just give me a char*.
As Nicol states problem 2) Filling in the contents of a `vector` without having to initialize those contents twice.
...
That's two real-world cases why something needs to change. I'm not a standardese guy; I don't know all the necessary nuances to get this done properly. I simply ask that others help guide me to a solution to these.
2) gives you a way to check if the buffer was re-allocated
std::vector<uint8_t> buf;
buf.reserve(32);
uint8_t* raw_buf = buf.data();
buf.insert(buf.end(), {1, 2, 3}); //push_back takes a single element, so append the three bytes with insert
if (buf.data() != raw_buf)
    error("oh no we got moved on insert");
On Wednesday, July 26, 2017 at 10:10:26 PM UTC-4, stevem...@gmail.com wrote:
> There are exactly 3 different things being discussed in this thread:
> 1) Allowing .data() to return a pointer that can be used properly for pointer arithmetic, but have any dereference of it be undefined. I.e. it points to the buffer, but that's it; if you use it, that's U.B.
> 2) Default-init the vector instead of value-init. This could be a 'real' solution to the next case (#3).
> 3) Using the .data() pointer to allow direct filling of the buffer, useful in cases where a C API is exposed. But it has been mentioned that the vector can't know of someone modifying it externally, so resize and the like will overwrite changes you make, and letting people do this is probably bad. (The solution was therefore to make dereferencing .data()'s pointer past .size() be U.B.)
> I initially proposed #1; after further discussion, I suggested we add #3 for performance reasons for C APIs. There should be two separate solutions in my mind for these.
There have been only two problems presented here:
1: Using the value of an empty `vector`'s pointer as some kind of marker, a pointer to hold until it's filled in later.
2: Filling in the contents of a `vector` without having to initialize those contents twice.
These two problems have absolutely nothing to do with one another. As such, they require separate solutions.
Also, I have no idea how "make dereferencing .data()'s pointer past .size() be U.B" is a solution to anything. The whole point of this thread is to define behavior, not to make new undefined behavior.
> This thread is getting horribly off topic by bouncing between all three of these. Does anyone at least disagree with changing the wording of .data() to allow it to be valid for pointer arithmatic, as long as the vector has a capacity > 0?
What do you mean by "valid for pointer arithmatic[sic]"? Pointer arithmetic in C++ is (at present) only defined for arrays of actual live objects, which `vector` creates when you insert elements. Well, the return value of `data()` for an empty `vector` would (at best) be just a memory allocation; it's not an array of live objects. So you can't do pointer arithmetic on it.
Incorrect. "Indirection through an invalid pointer value and passing
an invalid pointer value to a deallocation function have undefined
behavior. Any other use of an invalid pointer value has
implementation-defined behavior."
That was the point of this entire discussion: you can use the .data() pointer
*because* you know it's valid. You can't dereference it yet because the data
there has not begun its lifetime, though.
Conclusion: you cannot portably determine if the old pointer is the same as
the new one because you can't use the old pointer in the first place.
(But everyone does it)
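One way to sidestep the question entirely is to never look at the old pointer and compare capacity() before and after the insertion instead. This is only a sketch for the push_back case, and `push_back_reallocated` is a hypothetical helper name, not something proposed in this thread. It relies on push_back reallocating only when the new size would exceed the old capacity, in which case the capacity necessarily grows.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: appends value and reports whether the vector
// had to reallocate, without touching any previously obtained pointer.
template <class T>
bool push_back_reallocated(std::vector<T>& v, const T& value) {
    const std::size_t old_capacity = v.capacity();
    v.push_back(value);
    // A reallocation in push_back only happens when size would exceed the
    // old capacity, so afterwards capacity() must be larger than before.
    return v.capacity() != old_capacity;
}
```

This does not cover the move-assignment scenario shown earlier in the thread, where a different vector's buffer is adopted; it only answers "did this push_back move my elements?".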
As Nicol states, problem 1) Using the value of an empty `vector`'s pointer as some kind of marker, a pointer to hold until it's filled in later.
There's two immediate cases I can think of [...]
[...] gives you a way to check if the buffer was re-allocated
std::vector<uint8_t> buf;
buf.reserve(32);
uint8_t* raw_buf = buf.data();
buf.insert(buf.end(), {1, 2, 3}); //push_back takes a single element, so append the three bytes with insert
if (buf.data() != raw_buf)
    error("oh no we got moved on insert");
Where is it stated that a pointer has to point to a valid object, can it not just point to memory.
As Nicol states, problem 2) Filling in the contents of a `vector` without having to initialize those contents twice.
Imagine any C API ever that looks like this:
int fillSomeBuffer(T* buf); //returns the number of T's actually put into buf
As it is, you have to do this to not invoke U.B. or overwrite elements:
std::vector<uint8_t> buf;
buf.reserve(32);
uint8_t cBuf[32];
int count = fillSomeBuffer(cBuf);
buf.insert(buf.end(), cBuf, cBuf + count); //insert needs a position; append at the end
Ew, look at how horrible that is. Surely we can do better.
However, if you really really want to avoid all-bits-zero-constructing the elements of a container, that's what the C++ allocator model is for.
This is the second time I've given you this link. Please click on it.
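The link itself did not survive in this copy of the thread, so the following is not necessarily what it pointed to. It is a sketch of one commonly circulated way to use the allocator model for this: an allocator adaptor whose no-argument construct() default-initializes instead of value-initializing. The names `default_init_allocator`, `raw_byte_vector`, `fill_some_buffer` and `read_via_c_api` are assumptions for the illustration, and `fill_some_buffer` is a stand-in for a real C API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Allocator adaptor: construct(p) with no arguments default-initializes,
// so trivial types such as unsigned char are left uninitialized rather
// than zeroed. Every other construct() call is forwarded to the base.
template <class T, class A = std::allocator<T>>
class default_init_allocator : public A {
    using traits = std::allocator_traits<A>;

public:
    template <class U>
    struct rebind {
        using other =
            default_init_allocator<U, typename traits::template rebind_alloc<U>>;
    };

    using A::A;

    template <class U>
    void construct(U* p) noexcept(std::is_nothrow_default_constructible<U>::value) {
        ::new (static_cast<void*>(p)) U;  // default-init, not value-init
    }
    template <class U, class... Args>
    void construct(U* p, Args&&... args) {
        traits::construct(static_cast<A&>(*this), p, std::forward<Args>(args)...);
    }
};

using raw_byte_vector =
    std::vector<unsigned char, default_init_allocator<unsigned char>>;

// Hypothetical stand-in for a C API: writes into buf, returns the count written.
inline int fill_some_buffer(unsigned char* buf) {
    const unsigned char bytes[] = {1, 2, 3};
    std::memcpy(buf, bytes, sizeof bytes);
    return 3;
}

inline raw_byte_vector read_via_c_api() {
    raw_byte_vector buf(32);  // 32 live elements, but not zero-filled
    const int count = fill_some_buffer(buf.data());
    assert(count >= 0 && static_cast<std::size_t>(count) <= buf.size());
    buf.resize(static_cast<std::size_t>(count));  // keep only what was written
    return buf;
}
```

The elements exist from the vector's point of view, so writing through data() is ordinary code; the cost is that the vector's type changes, since the allocator is part of it.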
On Thursday, 27 July 2017 09:29:20 PDT Hyman Rosen wrote:
> On Thu, Jul 27, 2017 at 12:16 PM, Thiago Macieira <thi...@macieira.org>
>
> wrote:
> > Conclusion: you cannot portably determine if the old pointer is the same
> > as
> > the new one because you can't use the old pointer in the first place.
> > (But everyone does it)
>
> You could use memcmp to compare the pointers byte by byte.
Bitwise comparison success is not necessary for equality. Now think of real-
mode x86, where you have 32 bits in a FAR pointer, but only 20 of which
determine the actual address in the low megabyte of RAM.
I'm not sure we can even say bitwise comparison success is sufficient: is it
possible that two pointers are bitwise equal but not really equal?
> FYI, no, if the buffer did get reallocated then this "check" would have
> undefined behavior.
This is, again, incorrect, so you might as well all stop repeating that claim.
On Thu, Jul 27, 2017 at 1:22 PM, Ville Voutilainen <ville.vo...@gmail.com> wrote:
> > FYI, no, if the buffer did get reallocated then this "check" would have
> > undefined behavior.
> This is, again, incorrect, so you might as well all stop repeating that claim.
It was undefined behavior in C++03. There are still plenty of people using C++03 compilers (my own large employer included).
Now it's not undefined but it can cause a "system-generated runtime fault". That sounds like a distinction without a difference.