Hi.
I'm currently trying to understand a few ... interesting ... observations I have been making wrt. the C++ Standard and using char arrays as raw storage.
Essentially, as far as I can tell (have been told), the current C++ Standard only allows using a char array as raw storage (see also std::aligned_storage) when objects are put into this via placement new, even for e.g. int or other trivial(*) types.
See: http://stackoverflow.com/questions/41624685/is-placement-new-legally-required-for-putting-an-int-into-a-char-array or related questions where I'm told I'm expected to do the following:

alignas(int) char buf[sizeof(int)];

void f() {
  // turn the memory into an int: (??) from the POV of the abstract machine!
  ::new (buf) int; // is this strictly required? (aside: it's obviously a no-op)

  // access storage:
  *((int*)buf) = 42; // for this discussion, just assume the cast itself yields the correct pointer value
}
Now, I'm *not* asking whether the current C++ Standard requires - or not - the noop placement new for this code to be defined.
What I would be interested in is whether this has been discussed in the committee (CWG?) in the last very few years
and whether there is any agreement if omitting the placement new (for trivial type) should be allowed or if Standard C++ should absolutely require the placement new.
Simple links to any paper(s) discussing this would already be appreciated; the only reference I found was P0137R1, and that's more about clarifying the current wording, AFAICT.
Thanks.
- Martin
p.s.: (*) is "trivial type" the correct term?
p.p.s.: My personal impression on the matter is that requiring the placement new for trivial types (like int, ...) is rather insane and the amount of real world code compiled with C++ compilers
that would be broken should any C++ compiler/optimizer ever manage to actually treat this as UB is quite huge. 'Course I may be totally off here. Just take this as a disclaimer :-)
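For readers who want the pattern from the question in a form that can be compiled and called, here is a minimal sketch. The function name `store_and_load` is made up for illustration; the only difference from the snippet in the post is that it uses the pointer returned by placement new, which sidesteps the question of what the cast yields.

```cpp
#include <new>

// Raw storage with the size and alignment of an int, as in the question.
alignas(int) char buf[sizeof(int)];

// Hypothetical helper: creates an int in buf, writes to it, reads it back.
int store_and_load(int value) {
    // Begins the lifetime of an int object inside buf (no initialization).
    int* p = ::new (static_cast<void*>(buf)) int;
    *p = value;   // access goes through the pointer to the new object
    return *p;
}
```

Calling `store_and_load(42)` stores and returns 42 without relying on any cast of `buf`.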
I cannot tell you if any discussion has been had. However, ...
... So the only change has been essentially a defect fix that makes unions actually work, in accord with the standard.
In at least 12 years of standardization, the committee has made no substantive change to the causes of bringing an object into being. While this is not conclusive, the fact that C++17 did put a fix into this section means that they have looked at it and talked about it at some point. So I would suggest that, if there was discussion about it, it did not progress beyond discussion.
-snip-
p.p.s.: My personal impression on the matter is that requiring the placement new for trivial types (like int, ...) is rather insane and the amount of real world code compiled with C++ compilers
that would be broken should any C++ compiler/optimizer ever manage to actually treat this as UB is quite huge. 'Course I may be totally off here. Just take this as a disclaimer :-)
Compilers do treat it as UB. UB doesn't mean "crash"; UB can still do what you want.
The point of the UB designation is to allow implementations to be reasonably fast. If you reinterpret cast a pointer to a different type, the compiler doesn't have to check to see if that object really exists there; it will simply trust your cast and pretend that there is an object there.
On Monday, January 16, 2017 at 2:08:39 PM UTC-5, Martin Ba wrote:
Thanks a lot for that wrap up!
What I meant by "treating it as UB" was in the same vein as, e.g., signed integer overflow. Compilers today generate code that no longer works if the source relied on signed integer overflow, although older optimizers didn't "break" anything.
In the same vein, I'm sure we can imagine several transformations that break code that has no "placement new" (from my OP) and that used to work (and still does).
Such as?
Assuming a lack of signed integer overflow means that the compiler doesn't have to insert code to check for integer overflow. The UB designation allows correct code (code without overflows) to execute at maximum performance. Any degrading of incorrect code is merely a consequence of making correct code as fast as possible.
Let's say that you have a function that returns a `T*`. The fastest code generated which uses this return value is code which assumes that `T*` points to a live, valid object of type `T`. To do anything else makes correct code slower. Even if you inlined that function or could otherwise be certain that the `T*` was not valid, that simply means UB happens. Do you think compiler writers are going to detect such circumstances and make the code fail in some way?
Can you give an example of these "several transformations"? How would they speed up correct code?
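To make the signed-overflow comparison concrete, here is a small sketch (function names are invented for illustration). Because overflow of `int` is undefined, an optimizer is allowed to assume `x + 1` never wraps; it is my assumption, not something stated in this thread, that mainstream compilers commonly reduce the first function to a constant `true` at higher optimization levels.

```cpp
#include <limits>

// True for every x where x + 1 does not overflow. Since signed overflow is
// undefined, a compiler may assume it never happens and fold this to `true`.
bool next_is_greater(int x) {
    return x + 1 > x;
}

// A variant that is defined for every input: the one overflowing case is
// rejected before the addition is performed.
bool next_is_greater_checked(int x) {
    if (x == std::numeric_limits<int>::max())
        return false;
    return x + 1 > x;
}
```

Code that relied on the first form wrapping around at `INT_MAX` is the kind that "used to work" and later stopped working.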
-fdelete-null-pointer-checks (see e.g. http://stackoverflow.com/questions/23153445/can-branches-with-undefined-behavior-be-assumed-unreachable-and-optimized-as-dea): the compiler sees a branch that definitely invokes UB and optimizes away both the branch and the check that guards it.
It should also be noted that, well, we can trace this rule back at least 12 years. Compilers haven't done anything to break such code yet.
Optimizations like delete-null-pointer-checks happened because compiler writers saw legal optimization opportunities that break some code. So the fact that neither you nor I can see a reason today is not much consolation to me :-)
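A minimal sketch of the kind of code that null-pointer-check deletion affects (the function name is hypothetical). The dereference comes before the test, so an optimizer may reason that `p` cannot be null at the test and drop the branch.

```cpp
// Hypothetical example: the check comes too late to be meaningful.
int read_then_check(int* p) {
    int v = *p;          // undefined behaviour if p is null
    if (p == nullptr)    // a compiler may assume this is false and remove it
        return -1;
    return v;
}
```

For a valid pointer the function simply returns the pointed-to value; what happens for a null pointer is not something the Standard defines, with or without the optimization.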
On 01/15/2017 09:56 PM, Martin Ba wrote:
> Hi.
>
> I'm currently trying to understand a few ... interesting ... observations I have been making wrt. the C++ Standard and using char arrays as raw storage.
>
> Essentially, as far as I can tell (have been told), the current C++ Standard only allows using a char array as raw storage (see also std::aligned_storage) when objects are put into this via placement new, even for e.g. int or other trivial(*) types.
>
> See: http://stackoverflow.com/questions/41624685/is-placement-new-legally-required-for-putting-an-int-into-a-char-array or related questions where I'm told I'm expected to do the following:
>
> alignas(int) char buf[sizeof(int)];
>
> void f() {
> // turn the memory into an int: (??) from the POV of the abstract machine!
> ::new (buf) int; // is this strictly required? (aside: it's obviously a no-op)
>
> // access storage:
> *((int*)buf) = 42; // for this discussion, just assume the cast itself yields the correct pointer value
> }
> What I would be interested in is whether this has been discussed in the committee (CWG?) in the last very few years
> and whether there is any agreement if omitting the placement new (for trivial type) should be allowed or if Standard C++ should absolutely require the placement new.
I believe I can say that CWG agrees that the words now in C++17 correctly
reflect the intent that you need the placement new in the case above.
If you believe that intent is misguided, feel free to propose a change.
All the change I can propose is that CWG considers some way to make this work. (As it does in practice anyway *today*.) What I have gleaned so far from P0137R1 is that the problem at the moment is that the definition of objects (in the memory location sense) doesn't allow this, and that it's pretty complex and hard to come up with something that does allow it without restricting other things.
I'm sure compiler writers will explain to you how that substantially
pessimizes their code generation.
> p.s.: (*) is "trivial type" the correct term?
>
> p.p.s.: My personal impression on the matter is that requiring the placement new for trivial types (like int, ...) is rather insane and the amount of real world code compiled with C++ compilers
> that would be broken should any C++ compiler/optimizer ever manage to actually treat this as UB is quite huge. 'Course I may be totally off here. Just take this as a disclaimer :-)
Some compilers might make special allowances for their particular user
community, precisely out of concerns you cited. That doesn't make your
code any better.
On Monday, January 16, 2017 at 10:11:19 PM UTC+1, Jens Maurer wrote:
I believe I can say that CWG agrees that the words now in C++17 correctly
reflect the intent that you need the placement new in the case above.
If this is really the intent, then this needs to be more clearly communicated and, I feel, rationalized. (Maybe it already has? That's what the OP was actually about.)
If you believe that intent is misguided, feel free to propose a change.
Yes, I very much feel the intent is misguided. For two reasons:
- This intent declares totally reasonable legacy code to be UB. At least I consider it reasonable to *not* have to place a no-op placement new in straightforward buffer-backed code for trivial types.
- Since C doesn't have placement new, any C code that uses a char buffer to back any other typed data is automatically UB in C++. Another unnecessary incompatibility.
All the change I can propose is that CWG considers some way to make this work. (As it does in practice anyway *today*.) What I have gleaned so far from P0137R1 is that the problem at the moment is that the definition of objects (in the memory location sense) doesn't allow this, and that it's pretty complex and hard to come up with something that does allow it without restricting other things.
I'm sure compiler writers will explain to you how that substantially
pessimizes their code generation.
For this specific case, I do hope not. I'm braced for anything.
> p.s.: (*) is "trivial type" the correct term?
>
> p.p.s.: My personal impression on the matter is that requiring the placement new for trivial types (like int, ...) is rather insane and the amount of real world code compiled with C++ compilers
> that would be broken should any C++ compiler/optimizer ever manage to actually treat this as UB is quite huge. 'Course I may be totally off here. Just take this as a disclaimer :-)
Some compilers might make special allowances for their particular user
community, precisely out of concerns you cited. That doesn't make your
code any better.
As far as the C++ Standard goes, I'm not so much concerned with "better" but with not allowing future compilers to break reasonable legacy code.
*When* using char arrays (or malloc'ed memory) as backing store for trivial types, I fully assume most (non generic) existing code to *not* employ placement new, simply because it's the straightforward thing to (not) do and the placement new would be a no-op and all compilers up to today seem to generate working code.
I think, here, the C++ Standard should take into account this "existing practice". (Yeah, I know the same arguments were/are raised wrt. signed integer overflow or the null-pointer-check elimination, but I at least feel those cases, while possibly problematic in quite a few situations, are historically more clear cut. And at least both affect C and C++ code the same.)
cheers.
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussion+unsubscribe@isocpp.org.
To post to this group, send email to std-dis...@isocpp.org.
Visit this group at https://groups.google.com/a/isocpp.org/group/std-discussion/.
So, how would you suggest the object model change to accommodate such a thing? What is the syntax that causes a piece of storage that contains all objects to contain just one?
Personally? I say let it go. C++ programmers have managed to survive this being UB since at least 2004. We're teaching C++ programmers nowadays to avoid pointless casting; the average C++ programmer today is far more likely to employ placement-new than to do casts and assume it was constructed.
I'd rather the committee spend time shoring up the object model for genuine C++ purposes, like making it possible for `vector` to be implemented without UB.
--
So where would you be allocating these different objects?
--
I very much disagree.
C++ can use placement new, true. But C cannot, and many programs need to compile and run as both.
Furthermore, I don't know of any reasonable way a compiler could exploit this to produce better code. Strict aliasing doesn't apply, since char pointers can alias anything. More importantly, you make it impossible to take an aligned char array (say, one filled in by an I/O operation) and cast it to an array of (say) int without an O(n) copy and a 2x memory overhead! That is anything BUT fast. C++ should NOT impose such overheads.
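For reference, the copy-based alternative being objected to might look like the sketch below. The name `ints_from_bytes` and its signature are assumptions made for illustration; the point is only that the bytes are copied into separately allocated `int` objects, which is where the O(n) time and the second buffer come from.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: copies `count` ints out of a raw byte buffer
// (e.g. one filled by an I/O call) into freshly created int objects.
std::vector<int> ints_from_bytes(const unsigned char* bytes, std::size_t count) {
    std::vector<int> out(count);                       // second allocation
    if (count != 0)
        std::memcpy(out.data(), bytes, count * sizeof(int));  // O(n) copy
    return out;
}
```

This never reads the buffer through an `int*`, so it does not depend on whether `int` objects exist in the byte array.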
> Of course you have to explain
> to the implementation that each raw memory location corresponds to an
> int, and *you* imposed that burden on yourself--not the language.
The point I'm making is that the requirement to explain this is imposed
by the standard, while there is no technical reason to do that.
It seems to me that this entire discussion can be paraphrased as, "is c++ compatible with c or not?" [compatibility being defined as PODs legally created by one are valid in the other].
The answer appears, at least insofar as the standard is concerned, "no, or at best, implementation defined".
I think it would be fair to say that the vast majority of c++'s user base would hope that the answer is, "yes".
I think it's therefore reasonable that the standard should give PODs special treatment, allowing them to have been created (default-initialised) simply by allocating properly aligned storage of sufficient size.
This would then allow c++ to interoperate with c (in both directions) both legally and de-facto.
--
> Yes, I very much feel the intent is misguided. For two reasons:
>
> * This intent declares UB totally reasonable legacy code.
Even legacy code should have used "memcpy" here.
switch (*reinterpret_cast<uint16_t const*>(buf)) {
case MsgA::value:
  handle(*reinterpret_cast<MsgA const*>(buf));
  break;
case MsgB::value:
  handle(*reinterpret_cast<MsgB const*>(buf));
  break;
// ...
}

uint16_t msgType;
memcpy(&msgType, buf, sizeof(msgType));
switch (msgType) {
case MsgA::value: {
  MsgA msg;
  memcpy(&msg, buf, sizeof(msg));
  handle(msg);
  break;
}
case MsgB::value: {
  MsgB msg;
  memcpy(&msg, buf, sizeof(msg));
  handle(msg);
  break;
}
// ...
}
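The repeated declare-then-memcpy steps above can be folded into one small helper. This is only a sketch: `load_as` and `read_message_type` are names I made up, and the helper assumes `T` is trivially copyable and default constructible.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper: reads a T out of raw bytes by copying, so no T object
// has to exist in the buffer itself.
template <class T>
T load_as(const void* src) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "load_as requires a trivially copyable type");
    T out;
    std::memcpy(&out, src, sizeof(T));
    return out;
}

// Example use: read the 16-bit message tag at the start of a buffer.
std::uint16_t read_message_type(const unsigned char* buf) {
    return load_as<std::uint16_t>(buf);
}
```

With such a helper each case in the switch becomes a single call like `handle(load_as<MsgA>(buf))`, at the cost of one copy per message.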
template <class T, class U>
T* start_lifetime_of_object_without_any_initialization_cast(U*);
> It would also create a dysfunctional C++ object model. A piece of storage would have to have every object that could fit into it all at the same time.
> You're basically saying that any piece of memory should be able to be treated as a union of all appropriate types.
> As I understand it, that's not even the C object model; it manifests objects in arbitrary memory by you writing to them.
OK, I think it's reasonable to require an intrinsic to have been written to before it is an 'object'. That makes sense as it allows the use of sentinels etc. Presumably this is why C mandates it this way.
So I'll modify my argument while continuing on the theme: What's sauce for the goose ought to be sauce for the supposedly compatible gander.
I argue (and I don't think I am alone) that for intrinsics and PODs solely thereof, elements that have been written to, either by C or C++, ought to have been constructed. If they have been written to through a cast of a correctly aligned memory pointer, they ought to 'exist' in that actual memory [subject to as-if-compatible optimisations, of course]. This is what we expect in C, and it is arguably what we would be reasonable to expect in C++.
I accept that this would require a differentiation in handling between PODs and non-POD structs in the standard. I think that's reasonable:
When constructors, destructors, copy, move ops are non-trivial, we expect 'object-like' behaviour. When they are trivial (particularly when they are NOPs) we expect memory-like behaviour.
Again, this is the de-facto reality on which the code base of every c++ program that calls a C library depends. Why not codify a de-facto reality in order to legitimise it?
auto alloc = malloc(sizeof(int) * n);
get_ints(alloc, n, ...);
auto ints = reinterpret_cast<int*>(alloc);
ints[5]; //UB
--
You already have that. It's called using placement `new` with default initialization. If `T` is trivially default constructible, then `::new(p) T` will begin the lifetime of `T` with no initialization.
> You already have that. It's called using placement `new` with default initialization. If `T` is trivially default constructible, then `::new(p) T` will begin the lifetime of `T` with no initialization.
What if T isn't trivially default constructible?
What if it is, but my compiler decides to "default-initialize" some fundamental types with fixed values in debug mode?
On Tuesday, January 17, 2017 at 12:22:24 PM UTC-5, barry....@gmail.com wrote:
> What if T isn't trivially default constructible?
If a type is not trivially default constructible, then the writer of that type has explicitly decided that the type cannot take on arbitrary values. Therefore, it can only take on a specific set of values, defined by the constructors of that type. It can still be trivially copyable, but that requires you to start from a valid instance of that type, as created by one of its constructors.
Therefore, whatever construct you want to have that adopts the data in existing storage cannot apply to non-trivially default constructible types.
> What if it is, but my compiler decides to "default-initialize" some fundamental types with fixed values in debug mode?
... That's a fair point.
I cannot see any reasonable argument that pointer arithmetic should not be allowed to work on consecutive objects.
int* p = get_ints_from_c(); *(++p);
should absolutely be defined behaviour in c++ provided there is actually some memory at std::addressof(*p) + sizeof(p); there is no conceivable reason why it should not.
Note that I am asserting *should absolutely*, a very strong statement. This is because we absolutely cannot move away from c. There are no c++ operating systems. Therefore all useful libraries are written with C interfaces. Thousands of c++ wrapper libraries exist to turn those C interfaces back into c++. We don't do that because we want to. We do that because C++ is not suitable for creating portable object libraries, having no modules or common ABI.
By all means let's talk about moving forward, after we have modules, defined ABIs, an agreed-upon means of transmitting exceptions and so on.
Until then, the entire foundation of our C++ universe is C. To try to pretend otherwise is a fallacy.
OK, it's "difficult" to marry the c++ abstract machine model with the C memory model. So what? That doesn't mean that it should not be done.
Clearly the definition of the C++ abstract machine needs to be revised, or made more granular. Difficulty does not come into it.
Should bitwise copies of non-trivial objects be allowed? Of course not.
We can express the reason why not as a high level "because of memory model concerns" or we can be truthful: "because the pointers will be wrong and your double delete will crash the program".
On Tuesday, January 17, 2017, at 18:09:17 PST, Andrey Semashev wrote:
> My gut feeling is that the C++ object model has to allow a POD object to
> automatically begin its lifetime whenever it is modified in the raw
> storage. This is similar to how an active member of a union begins its
> lifetime on the first modification.
A POD object's lifetime should begin when storage for it is provided and end
when storage is freed. In that sense, it happens before the modification
through a pointer. In fact, you could say it happens sometime inside malloc().
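The union behaviour that the quoted suggestion compares against can be sketched as follows. The type and function names are invented for illustration, and the comment reflects my reading of the C++17 union rules rather than anything stated in this thread.

```cpp
// Hypothetical two-member union of trivial types.
union IntOrFloat {
    int i;
    float f;
};

// Switches the active member by plain assignment: no placement new is
// written, yet (as I read the C++17 rules) the assignment to u.f begins
// the lifetime of the float member.
float switch_to_float(IntOrFloat& u, float value) {
    u.f = value;
    return u.f;
}
```

The proposal in the quote would extend a similar "first write starts the lifetime" rule from union members to POD objects in raw storage.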
On Tuesday, January 17, 2017, at 13:20:10 PST, Demi Obenour wrote:
> THIS. C++'s main selling point is C compatibility. Much code is REQUIRED
> to compile as BOTH C and C++. Without changes.
Yes, but no.
Yes, I agree with your agreeing with Richard.
But no, there's not a lot of code that needs to compile as both C and C++.
That's limited to a few (static) inline functions in headers. It may be that
they're used extremely often, especially if they come from the standard C
library itself or from POSIX or a relevant standard like the ancillary socket
data payloads defined by RFC 3542 -- CMSG_DATA and CMSG_NXTHDR are *ugly*.
Richard's point, which I agree with, is that C++ needs to interoperate with C
libraries and vice-versa. That does not imply compiling a lot of code as
either.
--
On Tuesday, January 17, 2017 at 1:10:25 PM UTC-5, Nicol Bolas wrote:
> Therefore, whatever construct you want to have that adopts the data in existing storage cannot apply to non-trivially default constructible types.
Not necessarily. struct X { const int a; }; isn't trivially default constructible but nothing else you wrote above applies to it.
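The properties of that counterexample can be checked mechanically with the standard type traits. This is just a sketch of the checks I would expect to hold for the `X` given above; it makes no claim about what the Standard should do with such types.

```cpp
#include <type_traits>

struct X { const int a; };

// The const member without an initializer removes the default constructor...
static_assert(!std::is_default_constructible<X>::value,
              "X cannot be default constructed");
static_assert(!std::is_trivially_default_constructible<X>::value,
              "so it is not trivially default constructible either");

// ...yet copying its bytes is still a valid way to copy its value.
static_assert(std::is_trivially_copyable<X>::value,
              "X is trivially copyable");

// Assignment over an existing X is not available.
static_assert(!std::is_copy_assignable<X>::value,
              "X cannot be assigned to");
```

So the type has no user-imposed invariant on its values, but it still falls outside "trivially default constructible".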
But it should be, because c++ should be compatible with c. The code has clearly been declared as extern "C". So it's not a c++ struct, it's a c struct.
It should obey the c memory model, and interoperate correctly with c++. In the same way that objective-c++ understands c, c++ and objective c.
#include <stdlib.h>
extern "C"
{
struct Foo {
int bars;
double bar[];
};
Foo *makeFoo() {
int a = 6;
auto vp = malloc(sizeof(Foo) + a * sizeof(double));
auto p = (Foo *) vp;
p->bars = a;
for (int i = 0; i < a; ++i) {
p->bar[i] = i * 2;
}
return p;
}
void deleteFoo(Foo *p) {
free(p);
}
}
#include <memory>
#include <iostream>
#include <algorithm>
#include <iterator>
struct FooDeleter {
void operator()(Foo *p) const {
deleteFoo(p);
}
};
int main() {
using fooptr = std::unique_ptr<Foo, FooDeleter>;
auto p = fooptr(makeFoo());
auto first = p->bar;
auto last = first + p->bars;
std::copy(first, last, std::ostream_iterator<double>(std::cout, ", "));
std::cout << std::endl;
}
X x1 = {5};
X x2 = {10};
memcpy(&x1, &x2, sizeof(X));
On Tuesday, January 17, 2017 10:33:12 PST Nicol Bolas wrote:
> > A POD object's lifetime should begin when storage for it is provided and
> > end
> > when storage is freed. In that sense, it happens before the modification
> > through a pointer. In fact, you could say it happens sometime inside
> > malloc().
>
> *Which* POD object? There are an arbitrary number of them that can fit into
> that storage. If you say that `malloc(sizeof(void*))` begins the lifetime
> of every pointer type, every pointer-to-pointer type, every
> pointer-to-pointer-to-pointer etc, then the entire idea of having a typed
> object model loses all meaning.
It's unspecified which one, but one only. That means the compiler cannot assume
that the code did not initialise, but it can infer from code that uses that
storage area what type it was.
struct S { int i; };
struct T { float f; };
auto mem = malloc(std::max(sizeof(S), sizeof(T)));
S *s = new(mem) S;
s->i;
T *t = static_cast<T*>(mem); //OK, but you can't use `t`.
T *t2 = new(mem) T; //Can use `t2`, but not `s` anymore.
t = std::launder(t); //I can use `t` now.
> C++ is not a superset of C and it never has been. Users should not expect to be able to throw any C struct at C++ and have it work with it.
And yet this is exactly what I can do, de-facto, today. And it is exactly this undefined behaviour that the entire c++ enterprise depends on, today.
All the standard needs is the addition of a paragraph that says:
'any type or intrinsic object declared inside extern "C" exists in the C memory model (see ISO standard xxx). Members of objects of an extern "C" class type shall behave as per the C language, previously cited. The C++ and C memory models must coexist in an unsurprising way'
End of problem.
--
The latter, yes. You can still bit-blast them into buffers that don't
contain live objects yet,
so for that reason it's apparently rather important that such types
are trivially copyable,
but trivially copyable doesn't necessarily mean assignable or "can
blast values over existing objects".
struct X { const int val; };
// this is all well and good
alignas(X) char buffer[sizeof(X)];
new (buffer) X{42};
::send(buffer, sizeof(buffer));
alignas(X) char recv_buffer[sizeof(X)];
::recv(recv_buffer, sizeof(recv_buffer));
// can't do this
X a; // nope
memcpy(&a, recv_buffer, sizeof(a));
// this is UB
X b(*reinterpret_cast<X const*>(recv_buffer));
// this is well-defined yet totally unmaintainable
int v;
memcpy(&v, recv_buffer, sizeof(v));
X c{v};
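A compilable version of the last, well-defined variant might look like this. `decode_x` is a name made up for the sketch, and the `static_assert` records the assumption that `X` is laid out as exactly one `int`.

```cpp
#include <cstring>

struct X { const int val; };

// Assumption of this sketch: X holds nothing but its int member.
static_assert(sizeof(X) == sizeof(int), "sketch assumes X has no padding");

// Hypothetical helper: rebuilds an X from received bytes by copying the
// bytes into an int first and then constructing the X from that int.
X decode_x(const unsigned char* bytes) {
    int v;
    std::memcpy(&v, bytes, sizeof(v));
    return X{v};
}
```

It works, but every member of the type has to be listed by hand in the decoder, which is the maintainability problem mentioned above.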
On Tuesday, January 17, 2017 at 11:32:55 AM UTC-5, Richard Hodges wrote:
> I cannot see any reasonable argument that pointer arithmetic should not be allowed to work on consecutive objects.
And nobody has made such an argument. Indeed, I'm pretty sure that I stated quite the opposite. Though for very different reasons and different restrictions.
> int* p = get_ints_from_c(); *(++p);
> should absolutely be defined behaviour in c++ provided there is actually some memory at std::addressof(*p) + sizeof(p); - there is no conceivable reason why it should not.
> Note that I am asserting *should absolutely* - a very strong statement. This is because we absolutely cannot move away from c. There are no c++ operating systems. Therefore all useful libraries are written with C interfaces. Thousands of c++ wrapper libraries exist to turn those C interfaces back into c++. We don't do that because we want to. We do that because C++ is not suitable creating portable object libraries, having no modules or common ABI.
> By all means lets talk about moving forward - after we have modules, defined ABIs, an agreed-upon means to transmitting exceptions and so on.
> Until then, the entire foundation of our C++ universe is C. To try to pretend otherwise is a fallacy.
> OK, it's "difficult" to marry the c++ abstract machine model with the C memory model. So what? That doesn't mean that it should not be done.
OK, so explain what we will gain by doing all of this work. How will it make my currently functional code faster and/or better? How will it make my programs more correct? How will it improve the C++ object model in ways that are useful for actual C++ programs?
The status quo is adequately functional. And if C is as entwined with C++ as you believe, then no compiler vendor is going to break the world with "optimizations" that don't actually make things more optimal.
On Monday, January 16, 2017 at 11:09:59 PM UTC-5, Richard Smith wrote:On 16 Jan 2017 6:25 pm, "Nicol Bolas" <jmck...@gmail.com> wrote:On Monday, January 16, 2017 at 9:05:11 PM UTC-5, Richard Smith wrote:On 16 January 2017 at 17:47, Nicol Bolas <jmck...@gmail.com> wrote:On Monday, January 16, 2017 at 7:23:30 PM UTC-5, Richard Smith wrote:On 16 January 2017 at 13:49, Martin Ba <0xcdc...@gmx.at> wrote:On Monday, January 16, 2017 at 10:11:19 PM UTC+1, Jens Maurer wrote:On 01/15/2017 09:56 PM, Martin Ba wrote:
> Hi.
>
> I'm currently trying to understand a few ... interesting ... observations I have been making wrt. the C++ Standard and using char arrays as raw storage.
>
> Essentially, as far as I can tell (have been told), the current C++ Standard only allows using a char array as raw storage (see also std::aligned_storage) when objects are put into this via placement new, even for e.g. int or other trivial(*) types.
>
> See: http://stackoverflow.com/questions/41624685/is-placement-new-legally-required-for-putting-an-int-into-a-char-array or related questions where I'm told I'm expected to do the following:
>
> alignas(int) char buf[sizeof(int)];
>
> void f() {
> // turn the memory into an int: (??) from the POV of the abstract machine!
> ::new (buf) int; // is this strictly required? (aside: it's obviously a no-op)
>
> // access storage:
> *((int*)buf) = 42; // for this discussion, just assume the cast itself yields the correct pointer value
> }
> What I would be interested in is whether this has been discussed in the committee (CWG?) in the last very few years
> and whether there is any agreement if omitting the placement new (for trivial type) should be allowed or if Standard C++ should absolutely require the placement new.
I believe I can say that CWG agrees that the words now in C++17 correctly
reflect the intent that you need the placement new in the case above.
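For concreteness, a minimal sketch of the form that this reading requires. The helper name `store_in_buffer` is invented for illustration and does not come from the thread; the only point is that the placement new starts the lifetime of the `int`, and access then goes through the pointer it returns:

```cpp
#include <cassert>
#include <new>

// Raw storage, suitably sized and aligned for one int.
alignas(int) unsigned char buf[sizeof(int)];

// Hypothetical helper: creates an int inside buf and stores a value in it.
int* store_in_buffer(int value) {
    // Placement new begins the lifetime of an int object in the buffer.
    int* p = ::new (static_cast<void*>(buf)) int;
    // The new object sits at the start of the buffer.
    assert(static_cast<void*>(p) == static_cast<void*>(buf));
    *p = value;  // access through the pointer returned by placement new
    return p;
}
```

Using the returned pointer, rather than casting `buf` again, sidesteps the separate question of whether the cast yields a pointer to the new object.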
If this is really the intent, then this needs to be more clearly communicated and, I feel, rationalized. (Maybe it already has? That's what the OP was actually about.)
If you believe that intent is misguided, feel free to propose a change.
Yes, I very much feel the intent is misguided. For two reasons:
- This intent declares UB totally reasonable legacy code. At least I consider it reasonable to *not* have to place a no-op placement new in straightforward buffer backed code for trivial types.
- Since C doesn't have placement new, any C code that uses a char buffer to back any other typed data is automatically UB in C++. Another unnecessary incompatibility.
The above code has undefined behavior in C too. C's effective type rules do not permit changing the effective type of a declared object to something other than its declared type; it only permits that for objects allocated with malloc or similar.

In the case where the storage /was/ allocated through malloc or similar, C++ requires a placement new where C simply allows the effective type to change through a store (and some parts of the C effective type model don't work as a result...). It would seem reasonable to me for such allocation functions to be specified to have implicitly created whatever set of objects the following code relies on existing[1] -- the compiler typically has to make that pessimistic assumption anyway, since it doesn't know what objects the implementation of an opaque function might create, so it seems like we'd lose little and gain more C compatibility by guaranteeing something like that.

[1]: that is, we could require the compiler to assume that malloc runs a sequence of placement news (for types with trivial default construction and destruction) before it returns, where that set is chosen to be whatever set gives the program defined behavior -- if such a set exists
The result of a "sequence of placement news" on a piece of memory is the creation of an object of the last type `new`ed. The C++ object model does not permit storage to have an indeterminate object or many separate objects (outside of nesting). If you allocate 4 bytes and new an `int` into it, then it is an int. If you new a `float` into it, it stops being an `int`.

I never said they would all be at the start of the allocation.

... that doesn't make sense. I mean, where else are they going to be except for the start? If I allocate 4 bytes, then you need to `new` up both `int` and `float` (assuming they're both 4 bytes, of course). But there's no room to `new` them at different addresses within that allocation, since the allocation is only 4 bytes.

I don't know what this example is supposed to demonstrate.
You originally said:
> we could require the compiler to assume that malloc runs a sequence of placement news (for types with trivial default construction and destruction) before it returns, where that set is chosen to be whatever set gives the program defined behavior -- if such a set exists
Therefore, if I 'malloc' 4 bytes of storage, then placement `new` will be executed on that storage for both `int` and `float`. Among others. That's what you're asking for.
And as I said, that would make the memory both an `int` and a `float` at the same time. You then said:
> I never said they would all be at the start of the allocation.
Then where is it going to be? Where does the `int` get created and where does the `float` get created, since there's not room enough for both?
I'm trying to understand what you're suggesting the standard do here, and thus far, it does not make sense.
> Um, no. This now requires that every C++ compiler also implement <insert version here> of C.

This is not what I am saying. I am saying that the objects that are imbued with extern "C" must exist in the C memory model. They already do, and the entirety of the c++ world depends upon this fact today. This is an inescapable truth.

> Just because "c++ enterprise" depends on some non-C++ features doesn't mean we should shove them into the standard.

The alternative is that almost every meaningful application and library in existence is fundamentally non-portable. This is not in the interests of c++ developers, users of their work, or indeed manufacturers of the compilers. So I have to say, with respect, that you are mistaken. An ISO standard describing a system built upon UB is meaningless because all programs become strictly non-portable. See above.

> Furthermore, declaring that the two object models "must coexist in an unsurprising way" basically says nothing. It's about as useful for deciding on behavior as the wording on pointer-to-`intptr_t` conversions.

You know full well that I am paraphrasing.

I'll make you a bet. Let's put this question to a poll of c++ developers (say with more than 4 years' experience). I'll give you even money on any bet you care to take that my position would win by a ratio exceeding 7:3.

It's what the language does. It's what the entire developer base expects the compiler to do. It is the de-facto truth of c++. The standard is currently perverse in stating otherwise.
The committee should hang its head in shame.
--
---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-discussio...@isocpp.org.
Damn right I'll step up. Who do I need to speak to?
So, who's in charge?
The latter, yes. You can still bit-blast them into buffers that don't contain live objects yet, so for that reason it's apparently rather important that such types are trivially copyable, but trivially copyable doesn't necessarily mean assignable or "can blast values over existing objects".

I can bit-blast it into a buffer. But how can I bit-blast it out of a buffer?
struct X { const int val; };
// this is all well and good
alignas(X) char buffer[sizeof(X)];
new (buffer) X{42};
::send(buffer, sizeof(buffer));
alignas(X) char recv_buffer[sizeof(X)];
::recv(recv_buffer, sizeof(recv_buffer));
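// can't do this
X x; // nope: the const member leaves X without a usable default constructor
memcpy(&x, recv_buffer, sizeof(x));
// what a (hypothetical) std::trivial_copy_construct would allow instead
auto x = std::trivial_copy_construct<X>(recv_buffer);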
On Tuesday, 17 January 2017 13:04:55 PST Nicol Bolas wrote:
> Standards do not work based off of vague inferences. They have to *specify*
> behavior. So if an "inference" is going to be made, then there *must* be an
> explicit enumeration of syntactic constructs which the compiler will use to
> "infer" the type.
It wasn't a vague inference. It was a logical conclusion based on existing
rules.
If the compiler sees you casting a memory block to a given class, unless it
has reason to doubt you, it should trust you that you're right and that
pointer points to an area containing an object of that type. That is, if your
code is:
extern "C" void *allocate();
void *ptr = allocate();
S *s = static_cast<S *>(ptr);
s->i = 0;
why should it doubt you?
What's to say that the allocate function isn't:
void *allocate() { return new S; }
To be explicit: unless the compiler can prove that the allocation function
*isn't* allocating an S, it has no reason to doubt your casting.
> At which point, those syntaxes are not being used to "infer" something;
> they now become alternate syntaxes for creating an object.
As I said in another email, static_cast does not create the object and nor
does the dereferencing of that pointer. The creation of the POD object
happened in the allocation of the storage, since the constructor is trivial.
> If you say that `static_cast` can begin the lifetime of an object, then you
> need to say under which circumstances that will happen. Does it only work
> from memory fresh out of `malloc`, or can it work on `malloc`ed memory that
> used to have an object in it and you're now replacing it with another? How
> do you tell the difference between pointer conversion and object
> initialization?
See above. Initialisation happens inside the allocation function, not on
casting.
Now, the lifetime can end if you repurpose the storage by memcpy'ing something
else there. See my other email where I said memcpy can be the same as:
x2.~X();
new (&x2) X(x1);
> C++ has a way to tell the difference: `static_cast` is for pointer
> conversion; `new()` is for object initialization. Because of that, it can
> do this:
>
> struct S { int i; };
> struct T { float f; };
>
> auto mem = malloc(std::max(sizeof(S), sizeof(T)));
>
> S *s = new(mem) S;
> s->i;
>
> T *t = static_cast<T*>(mem); //OK, but you can't use `t`.
Actually, I've seen UBSan complain about a static cast of the wrong type, so
you shouldn't cast to the wrong type, even if you don't use the pointer.
Though in that case we were talking about polymorphic types and here we're
talking about trivial ones.
> T *t2 = new(mem) T; //Can use `t2`, but not `s` anymore.
> t = std::launder(t); //I can use `t` now.
Agreed, your code is fine. And using the placement new allows us to be explicit
about the object initialisation and also safe if the type in question isn't
trivially constructible.
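A compilable sketch of the storage reuse described in the quoted code, under the assumption that only the pointers returned by placement new are used (the function name `reuse_storage` is made up here). It does not need `std::launder`, because it never goes back to a pointer obtained before the second `new`:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

struct S { int i; };
struct T { float f; };

// Hypothetical demo: uses one malloc'ed block first as an S, then as a T.
float reuse_storage() {
    constexpr std::size_t size = sizeof(S) > sizeof(T) ? sizeof(S) : sizeof(T);
    void* mem = std::malloc(size);
    assert(mem != nullptr);

    S* s = ::new (mem) S;  // an S now lives in mem
    s->i = 7;

    T* t = ::new (mem) T;  // reusing the storage ends the lifetime of the S
    t->f = 1.5f;           // from here on only t may be used, not s
    float result = t->f;

    std::free(mem);        // S and T are both trivially destructible
    return result;
}
```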
But the compiler cannot prove that malloc didn't initialise the object before
it returned.
Take the allocate() function from above: if we expand the
operator new, we get:
void *allocate()
{
auto ptr = ::operator new(sizeof(S));
new (ptr) S;
return ptr;
}
But since S has a trivial constructor, the placement new must expand to
absolutely nothing and have no side effects. Therefore, that function is
functionally identical to:
void *allocate() { return ::operator new(sizeof(S)); }
Finally, since the default ::operator new function just calls malloc, it's no
different from:
void *allocate() { return malloc(sizeof(S)); }
To me, this proves that you cannot distinguish malloc() or any other memory
allocation function from a function that initialises a trivial object.
auto ptr = allocate();
auto s_ptr = static_cast<S*>(ptr);
s_ptr->i = 5;
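To make the argument above concrete, here is a sketch in which the allocation function really does create the `S` before returning, so the later `static_cast` is uncontroversial. The function `use_allocation` is a made-up name, and the claim that the placement new compiles to nothing is an expectation about typical compilers, not a guarantee:

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

struct S { int i; };

// Allocation function that explicitly creates an S in the malloc'ed block.
void* allocate() {
    void* raw = std::malloc(sizeof(S));
    if (raw == nullptr) return nullptr;
    return ::new (raw) S;  // trivial constructor: no code is expected to run
}

// Hypothetical caller: casts the result back to S* and uses the object.
int use_allocation() {
    void* ptr = allocate();
    assert(ptr != nullptr);
    S* s = static_cast<S*>(ptr);  // an S really exists at this address
    s->i = 5;
    int value = s->i;
    std::free(ptr);
    return value;
}
```

Seen from the caller, this `allocate` and one that only calls `malloc` have the same signature, which is the indistinguishability the message relies on.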
On Tuesday, January 17, 2017 at 7:28:44 PM UTC+1, Nicol Bolas wrote:
On Tuesday, January 17, 2017 at 11:32:55 AM UTC-5, Richard Hodges wrote:
I cannot see any reasonable argument that pointer arithmetic should not be allowed to work on consecutive objects.
And nobody has made such an argument. Indeed, I'm pretty sure that I stated quite the opposite. Though for very different reasons and different restrictions.
int* p = get_ints_from_c(); *(++p); should absolutely be defined behaviour in c++ provided there is actually some memory at std::addressof(*p) + sizeof(p); - there is no conceivable reason why it should not.

Note that I am asserting *should absolutely* - a very strong statement. This is because we absolutely cannot move away from c. There are no c++ operating systems. Therefore all useful libraries are written with C interfaces. Thousands of c++ wrapper libraries exist to turn those C interfaces back into c++. We don't do that because we want to. We do that because C++ is not suitable for creating portable object libraries, having no modules or common ABI.

By all means let's talk about moving forward - after we have modules, defined ABIs, an agreed-upon means of transmitting exceptions and so on.

Until then, the entire foundation of our C++ universe is C. To try to pretend otherwise is a fallacy.

OK, it's "difficult" to marry the c++ abstract machine model with the C memory model. So what? That doesn't mean that it should not be done.
OK, so explain what we will gain by doing all of this work. How will it make my currently functional code faster and/or better? How will it make my programs more correct? How will it improve the C++ object model in ways that are useful for actual C++ programs?
What we will gain is a Standard that is not contradicting reality, which is *at least* a marketing asset for the language. (compare: isocpp.org)
What we will gain is people writing reasonable real world "low level" code *not being told* that their code is UB and that they should resort to memcpy and no-op placement-new contortions - this is at least an asset wrt. the learning curve of the language.
What we will gain is not having to spend time on rather fruitless discussion like this one here.
The status quo is adequately functional. And if C is as entwined with C++ as you believe, then no compiler vendor is going to break the world with "optimizations" that don't actually make things more optimal.
The status quo in reality is functional. The Standard contradicts reality in this regard. That it works everywhere in practice, and can be expected to do so, is only an argument for the priority of fixing this, not an argument for not fixing the Standard.
I have to say I do not quite follow your argumentation wrt. this: on one hand you seem to care very much about the Standard supplying a useful and consistent object model, but on the other hand, you seem to say that the places where this shiningly consistent model is violated by a huge fraction of programs in existence don't matter because they will continue to work anyway.
The committee has already rejected async io and continuation futures
> those too were not "rejected"

"not adopted" means exactly the same thing as "rejected" when you are waiting for a feature to come out so that you can standardise code across platforms.
> Essentially yes, but there's more to it than that.
Only if you enjoy complicating your life.
> The problem basically boils down to this: C++ makes C-isms undefined behavior, but a lot of code relies on C-isms, so compilers aren't free to discard them or do anything about them. The solutions being tossed about here are that we should make them well-defined behavior.
The problem is that intrinsic types are not objects, and neither are PODs. To treat them the same is counter-factual. aligned memory is inherently a union of all PODs that will fit.
Make it so in the standard. End the argument forever.
This makes the C++ behaviour the same as C behaviour when intrinsics and PODs are mapped onto memory. It's logical, everyone does it anyway, and it's never going away in gcc or clang. End of problem. Let's get on with something new.

> I have a different solution. Instead of promoting garbage C-isms like pointer casting and so forth, we make C++ equivalents. Placement `new` is one such C++-ism which allows the creation of C++ objects in arbitrary memory. But we can add many more.

NO - because that just adds more useless work for programmers. It's the reverse of what auto does - which is make life easier and better. Useless work like having to formally introduce storage (which is what you're suggesting)
auto ptr = malloc(4);
auto i_ptr = static_cast<int*>(ptr);
*i_ptr = 5;
auto ptr = malloc(4);
auto i_ptr = new(ptr) int;
*i_ptr = 5;
T t;
memcpy(&t, some_ptr, sizeof(T));
auto t = std::trivial_copy_construct<T>(some_ptr);
is what COBOL and Pascal did. They're dead now. Let's not do that.

> If people need a way to take memory that has been filled in from external code and use that as a C++ object which is compatible with the layout of that memory, let's provide them with a function that does that. If people need a way to initialize an object directly from compatible data externally provided, let's provide them a way to do that. Let's take all of the useful C-isms and provide C++ ways to do them, rather than promoting pointer casting and whatnot as good code.

No need for any of that. It already happens in gcc. gcc *is* the standard.
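For reference, the memcpy idiom that both sides keep returning to (`T t; memcpy(&t, some_ptr, sizeof(T));`) can be wrapped in a small helper today. This is only a sketch: `trivial_copy_from` is a made-up name, not the `std::trivial_copy_construct` proposed in the thread, and unlike that proposal it assumes `T` is default-constructible:

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical helper: builds a T from sizeof(T) bytes of external memory.
template <class T>
T trivial_copy_from(const void* src) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only meaningful for trivially copyable types");
    static_assert(std::is_default_constructible<T>::value,
                  "this sketch needs an existing T object to copy into");
    assert(src != nullptr);
    T t;                              // a real T object, value not yet set
    std::memcpy(&t, src, sizeof(T));  // copy the object representation in
    return t;
}
```

Optimising compilers commonly turn the copy into a plain load, but that is a quality-of-implementation matter rather than something the standard promises.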
> You keep saying that as though it were some objective fact rather than a choice.

The fact that I can cast properly aligned storage to a POD and use it as a POD:
a) is a de-facto reality and always will be
b) is necessary to allow c++ to interact with every computer system in the world.
c) should therefore obviously be mandated as true in the standard.

> The concept of "object" only exists at the level of the standard. And thus, an object can be whatever we choose for it to be.

Good. We agree on that. Let's choose for an 'object' to be something more complex than a BASIC POD. Let's define a BASIC POD as being a POD with only defaulted special functions. Let's also choose that a pointer to BASIC POD is a template through which we manipulate memory (subject to the as-if rule). Let's also choose that any sufficiently aligned and sized memory block can be viewed in a defined way through a pointer to a BASIC POD.
Now, if we choose to overlay a BASIC POD half-way into some other object, then OBVIOUSLY, access to that other object is undefined. But the BASIC POD is not.

Why?
a) Because this is reality and,
b) It's necessary and,
c) it solves your pet problem - implementing a vector correctly.

Let's also allow a BASIC POD to have its last member as a zero-sized array. Such an array may be validly accessed provided there is storage behind it - because this allows us to create really useful things like buffers that the average programmer can understand.

Further, let's legalise pointer arithmetic.
Finally, let's stop trying to pretend that memory is some nebulous thing. It's memory. Sometimes C++ needs to go low level and it's useful for it to be high level. Let's keep the versatility. gcc's optimiser copes with that, so does clang's. There is no reason to start writing doublespeak in the standard about it not being true. It is true.

> What I'm saying is that casting should not be something that is encouraged.

Casting cannot be avoided when you interface with C libraries. Every production c++ program interfaces with C (and sometimes objective-C) libraries. Therefore, casting cannot be avoided. Interacting with C's "BASIC PODS" cannot be avoided. Therefore it must not be undefined. If nothing else, this will prevent every 20th post on stackoverflow from being howls of outrage at being forced to write memcpy, only for the copy to be thrown away.

Let me put this another way: this code:
std::memcpy(&myints, your_chars, sizeof(int) * 10);
currently signals to the compiler that your_chars are really an array of 10 ints.
std::trivial_copy_assign_strict<int>(&myints, your_chars, 10);
so should this:
struct F { int n; int a[]; };
> Are you stoned?

No, but if you ever visit Spain I can show you where you may become so (quite legally) if you wish.

What is your problem with memory being... memory?

On a system where ints are 32 bits, 32-bit words are addressable without bitwise arithmetic, and the compiler deems that 128 bits is a reasonable alignment strategy...:
struct A { int a; int b[2]; };
struct B { int a[2]; int b; };
... both A and B occupy 128 bits. The value of the last 32 bits is irrelevant.

There is no reason whatsoever (other than handwaving from the 'c++ is not a low level language' crowd) that they should not be a union of each other.
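As a sketch of what the current rules do allow for these two structs, the bytes of an `A` can be copied into a `B` with `memcpy`, which gives the "same memory, different view" result without a cast. The helper `view_as_b` is invented for illustration, and the `static_assert`s encode an assumption that differs from the 128-bit example above: both structs are exactly three `int`s with no padding, as on typical platforms:

```cpp
#include <cstring>

struct A { int a; int b[2]; };
struct B { int a[2]; int b; };

// Assumption of this sketch: neither struct has padding.
static_assert(sizeof(A) == 3 * sizeof(int), "sketch assumes no padding in A");
static_assert(sizeof(B) == 3 * sizeof(int), "sketch assumes no padding in B");

// Hypothetical helper: reinterprets the three ints of an A as a B by copying.
B view_as_b(const A& src) {
    B dst;
    std::memcpy(&dst, &src, sizeof dst);  // both types are trivially copyable
    return dst;
}
```

With that layout, `A{1, {2, 3}}` comes out as a `B` whose `a` is `{1, 2}` and whose `b` is `3`.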