This is similar to a previous idea I had, but I think that the kinks are worked out better this time.
I think that we should split trivially-copyable into a new category, "consistent-layout", separating out the memory layout portions of trivially-copyable from triviality of copying.
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <type_traits>

// Trivially copyable, but not standard-layout: x and y differ in access control.
struct Meow
{
    int x;
    float GetY() const { return y; }
protected:
    float y;
    friend void Location1();
    friend void Location2();
};
static_assert(std::is_trivially_copyable<Meow>::value, "Meow must be trivially copyable");

// State recorded at the first place in the program.
static union
{
    unsigned char meowBytes1[sizeof(Meow)];
    Meow meow1;
};
std::ptrdiff_t offset1;
unsigned char blob1[sizeof(Meow)];
unsigned char *py1;

// State recorded at the second place in the program.
static union
{
    unsigned char meowBytes2[sizeof(Meow)];
    Meow meow2;
};
std::ptrdiff_t offset2;
unsigned char blob2[sizeof(Meow)];
unsigned char *py2;

void Location1()
{
    meow1.x = 1;
    meow1.y = std::exp(1.0f);
    // Remember where y lives, both as a pointer and as a byte offset.
    py1 = reinterpret_cast<unsigned char *>(&meow1.y);
    offset1 = py1 - meowBytes1;
    std::memcpy(blob1, &meow1, sizeof(meow1));
}

void Location2()
{
    meow2.x = 2;
    meow2.y = 2 * std::asin(1.0f);
    py2 = reinterpret_cast<unsigned char *>(&meow2.y);
    offset2 = py2 - meowBytes2;
    std::memcpy(blob2, &meow2, sizeof(meow2));
}

int main()
{
    Location1();
    Location2();

    // Copy the bytes saved at location 2 into meow1, then overwrite y
    // through the pointer that was taken at location 1.
    float f1 = 123.0f;
    std::memcpy(&meow1, blob2, sizeof(meow1));
    std::memcpy(py1, &f1, sizeof(f1));

    // Take the bytes saved at location 1, overwrite y at the offset that
    // was computed at location 2, then copy the result into meow2.
    float f2 = 321.0f;
    unsigned char temp[sizeof(meow2)];
    std::memcpy(temp, blob1, sizeof(temp));
    std::memcpy(&temp[offset2], &f2, sizeof(f2));
    std::memcpy(&meow2, temp, sizeof(meow2));

    std::printf("%f %f\n", meow1.GetY(), meow2.GetY());
    std::printf("%td %td\n", offset1, offset2);
    return 0;
}
On Monday, May 18, 2015 at 2:35:06 PM UTC-7, Nicol Bolas wrote:
> On Monday, May 18, 2015 at 4:57:20 PM UTC-4, Myriachan wrote:
>> This is similar to a previous idea I had, but I think that the kinks are worked out better this time.
>> I think that we should split trivially-copyable into a new category, "consistent-layout", separating out the memory layout portions of trivially-copyable from triviality of copying.
> Um, that's already separated. There's the concept of "standard layout". It means almost exactly the same thing as you want, and offsetof is defined to work with such types.
> The reason that standard layout does not allow for non-empty base classes or public/private shenanigans is (in part) because `offsetof` works with them. It's a macro; it's not smart enough to know how to actually do the computation for things like public/private (because implementations can reorder them) and base classes can be placed in front of or behind other classes. Both of these confound a macro like `offsetof`.
The current definition of trivially-copyable is such that trivially-copyable classes must be (conceptually) compatible with offsetof(), because otherwise they don't comply with the Standard in other ways. With trivially-copyable types, the compiler already must be assigning fixed offsets to each member, and keeping these offsets for the lifetime of the program. If the compiler did not do this, Standard-compliant programs would break.
To prove this, let's say that at two different places in a program, trivially-copyable but non-standard-layout class Meow's member "y" is at two different offsets "a" and "b".
On 2015–05–19, at 9:50 AM, Nicol Bolas <jmck...@gmail.com> wrote:
> `offsetof` is not a compiler intrinsic that is capable of accessing compiler data; it is nothing more than a C macro (which is why it doesn't give a static_assert or error if you give it a non-standard layout type).
offsetof is a macro (#ifdef will recognize it) which expands to an intrinsic.
The pointer arithmetic implementation is UB and went out of style long ago.
Clang (by default) warns on all invalid use of offsetof and produces a hard error when you apply it to a virtually inherited member. GCC does not implement the warning, but the error is the same.
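To make the "macro that yields a compile-time constant" point concrete, here is a minimal sketch. The type `Plain` and the names `offsetof_is_macro` and `offset_of_i` are made up for this illustration; nothing here shows how any particular compiler implements the macro internally.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical standard-layout type, used only for this illustration.
struct Plain
{
    char c;
    int i;
};

static_assert(std::is_standard_layout<Plain>::value,
              "offsetof is only guaranteed to work for standard-layout classes");

// offsetof is required to be a macro, so the preprocessor can see it.
#ifdef offsetof
constexpr bool offsetof_is_macro = true;
#else
constexpr bool offsetof_is_macro = false;
#endif

// The result is an integral constant expression of type std::size_t,
// so it can initialize a constexpr variable and feed static_assert.
constexpr std::size_t offset_of_i = offsetof(Plain, i);
static_assert(offset_of_i >= sizeof(char), "i is laid out after c");
static_assert(offset_of_i % alignof(int) == 0, "i is suitably aligned");
```

Because the checks are all static_asserts, the snippet does its work at compile time; that is exactly the property a pointer-arithmetic implementation could not provide.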
I think that we should split trivially-copyable into a new category, "consistent-layout", separating out the memory layout portions of trivially-copyable from triviality of copying. I feel that this would make certain aspects of C++ more orthogonal, as well as put firm grounding in certain existing practices among programmers. My proposed definitions are below.
A consistent-layout type is one that occupies contiguous bytes of storage
Just a small point, but, unless char and long long have the same alignment,
    struct Purr {
        char c;
        long long ll;
    };
does not occupy contiguous bytes of storage ;)
The fundamental storage unit in the C++ memory model is the byte. A byte is at least large enough to contain any member of the basic execution character set (2.3) and the eight-bit code units of the Unicode UTF-8 encoding form and is composed of a contiguous sequence of bits, the number of which is implementation-defined. ...
The sizeof operator yields the number of bytes in the object representation of its operand. ... sizeof(char), sizeof(signed char) and sizeof(unsigned char) are 1. ...
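The two passages quoted above can be checked against the Purr example with a small sketch. The helper name `purr_round_trips` is invented for this illustration. The point is only that padding is counted by sizeof, so the object representation is still a single run of sizeof(Purr) bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct Purr
{
    char c;
    long long ll;
};

// Any padding between c and ll is part of the object representation,
// so sizeof(Purr) covers it.
static_assert(sizeof(Purr) >= sizeof(char) + sizeof(long long),
              "sizeof counts every byte of the object, padding included");
static_assert(offsetof(Purr, ll) >= sizeof(char), "ll is laid out after c");

// Copies all sizeof(Purr) bytes out and back again (hypothetical helper).
inline bool purr_round_trips()
{
    Purr source{'x', 42};
    unsigned char bytes[sizeof(Purr)];
    std::memcpy(bytes, &source, sizeof(Purr));

    Purr copy;
    std::memcpy(&copy, bytes, sizeof(Purr));

    const bool same = (copy.c == 'x') && (copy.ll == 42);
    assert(same);
    return same;
}
```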
#include <cstddef>  // std::size_t
#include <cstdint>  // std::uintptr_t
#define myria_offsetof(type, ...) \
    (static_cast< ::std::size_t>( \
        reinterpret_cast<unsigned char *>( \
            (&reinterpret_cast<type *>( \
                static_cast< ::std::uintptr_t>(alignof(type)))->__VA_ARGS__)) - \
        reinterpret_cast<unsigned char *>( \
            static_cast< ::std::uintptr_t>(alignof(type)))))
On Tuesday, May 19, 2015 at 6:11:45 AM UTC-7, David Krauss wrote:
> On 2015–05–19, at 1:03 PM, Nicol Bolas <jmck...@gmail.com> wrote:
>> Undefined behavior can be defined by a particular implementation if they so choose. Even if it's just for that particular expression.
> Not within constant expression evaluation. An expression that invokes UB is not a constant expression, even if it’s defined by the implementation. Before that rule existed, C++98 specifically said “pointers… shall not be used” in an integral constant expression.
> In standard C++, offsetof has always been “nearly” an operator. It just doesn’t get respect because of its heritage and its rivalry with PTMs. (The committee did endorse an evolutionary direction for PTMs years ago, to supersede offsetof once and for all, which has been discussed here more recently.)
As much as I'd appreciate something like "T void::*", that wouldn't help with interfacing with lower-level code that continues to work with structure offsets. I think the best way is to fix the minor wart with offsetof, then say, "but look, here's a much better C++ way to do it!" with "T void::*" or whatever the syntax will be.
My proposal with "consistent-layout" is to ground the current situation in reality. Compilers already have to choose a fixed layout for trivially-copyable classes in order to be compatible with memcpy(); this just formalizes it. Additionally, this proposal would answer Core errata 1701.
On Tuesday, May 19, 2015 at 2:15:47 PM UTC-7, Nicol Bolas wrote:
> On Tuesday, May 19, 2015 at 5:04:31 PM UTC-4, Ville Voutilainen wrote:
>> That's not correct. A standard-layout type can easily be non-trivially copyable.
>> [class]/10 has an example.
> OK, good point.
> That being said, you could still expand the definition of `offsetof` (or better yet, make a new thing that is a real intrinsic) which can get a compile-time offset from any type+member, as long as it doesn't have to go through a virtual base class. That's a superset of both standard layout and trivially copyable.
The current rule only requires that trivially-copyable and standard-layout types occupy contiguous memory (1.8/5):
    ... An object of trivially copyable or standard-layout type (3.9) shall occupy contiguous bytes of storage.
In other words, types with nontrivial copy constructors or destructors, classes with virtual functions, and so on can have strange layouts under current rules even when they have no virtual base classes. The compiler may implement them with internal hidden pointers, similar to the way some compilers implement virtual base classes. Such an implementation is silly, particularly if virtual functions are not involved, but nothing in the Standard appears to bar it.
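The point conceded above, that standard-layout and trivially-copyable are independent properties, can be checked with the standard type traits. This is a small sketch; the type names `LayoutOnly` and `CopyOnly` and the constant `traits_are_independent` are invented for the illustration.

```cpp
#include <type_traits>

// Standard-layout, but not trivially copyable: the copy constructor is user-provided.
struct LayoutOnly
{
    int a;
    int b;
    LayoutOnly() = default;
    LayoutOnly(const LayoutOnly &other) : a(other.a), b(other.b) {}
};

// Trivially copyable, but not standard-layout: the data members differ in access control.
class CopyOnly
{
public:
    int a;
    int GetB() const { return b; }
private:
    int b;
};

static_assert(std::is_standard_layout<LayoutOnly>::value, "LayoutOnly is standard-layout");
static_assert(!std::is_trivially_copyable<LayoutOnly>::value, "LayoutOnly is not trivially copyable");
static_assert(std::is_trivially_copyable<CopyOnly>::value, "CopyOnly is trivially copyable");
static_assert(!std::is_standard_layout<CopyOnly>::value, "CopyOnly is not standard-layout");

// Neither category contains the other.
constexpr bool traits_are_independent =
    std::is_standard_layout<LayoutOnly>::value &&
    !std::is_trivially_copyable<LayoutOnly>::value &&
    std::is_trivially_copyable<CopyOnly>::value &&
    !std::is_standard_layout<CopyOnly>::value;
```

Both kinds of type fall under the contiguous-storage rule in 1.8/5, but only the first is currently covered by offsetof.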
On 2015–05–20, at 4:22 AM, Myriachan <myri...@gmail.com> wrote:
> As much as I'd appreciate something like "T void::*", that wouldn't help with interfacing with lower-level code that continues to work with structure offsets. I think the best way is to fix the minor wart with offsetof, then say, "but look, here's a much better C++ way to do it!" with "T void::*" or whatever the syntax will be.
Do you know of compilers that implement classes such that the offset from a pointer to type T to any non-virtual base member is not a compile-time constant? If all compilers implement classes such that the non-virtual-base members of any particular class have a consistent byte offset, then you don't need this classification at all. The only thing you need is a new version of `offsetof` that can take any type/member pair, so long as the member is not in a virtual base of the type.
Is your choice of excluding all polymorphic types based on actual knowledge of how compilers work? Or is it based on something else?
Note that these compilers did have lots of differences. Of particular note were variances between where the base class goes relative to the derived. On one platform, it was after, while on the others, it was before.
And yet, at no time were there non-static offsets for members of polymorphic types.
So do you have actual experience with actual compilers that show that compilers will have variances regarding polymorphic types? If you haven't done the research, then my first suggestion would be to actually do the research and find out where the real intersection of your features E&F are. Not merely to think you know it, but to actually know where that intersection is in actual, live compilers.
Also, from the examples you've given, I smell some kind of horrible serialization system behind this request, one that uses byte offsets and such to serialize members and so forth, rather than being hard-coded to the member names. My concern there is that reflection will basically do everything you need, and do it better, in a way that won't require this kind of low-level fiddling.
This would also explain why you're willing to sacrifice polymorphic types entirely. Reconstructing such data from a serialized object would be very difficult. So your serialization system would consider it out of scope. Thus, it's not part of your proposal.
If so, maybe it would be better to wait for the right solution (reflection) than to modify the standard in such a way. I would hate for the committee to standardize something, only for another feature to come along and make it a moot point (re: std::bind. Good idea at the time, dumb idea with lambdas around).
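The kind of research asked for above can start with a very small probe. The sketch below measures the byte offset of a member in two separate instances of a polymorphic class and compares them. The names `Poly`, `value_offset`, and `offsets_agree` are invented for this illustration, and the result only describes the compiler it is built with; it says nothing about what the Standard requires.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical polymorphic type with one data member.
struct Poly
{
    virtual ~Poly() {}
    virtual int Get() const { return value; }
    int value = 0;
};

// Measures, at run time, how far `value` sits from the start of this object.
inline std::ptrdiff_t value_offset(const Poly &object)
{
    const unsigned char *base = reinterpret_cast<const unsigned char *>(&object);
    const unsigned char *member = reinterpret_cast<const unsigned char *>(&object.value);
    return member - base;
}

// Compares the measured offset across two distinct instances.
inline bool offsets_agree()
{
    Poly first;
    Poly second;
    const bool same = value_offset(first) == value_offset(second);
    assert(same);
    return same;
}
```

Running a probe like this across several compilers and class shapes (single inheritance, multiple inheritance, virtual functions without virtual bases) would give the thread actual data instead of recollection.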
On Tuesday 19 May 2015 17:12:04 Myriachan wrote:
> F: A data member of a class instance may be accessed by aliasing the class
> with a char pointer and adding the member's offset.
I don't think we need to create a new type for this. Simply change 18.2
[support.types] p4:
- If type is not a standard-layout class (Clause 9), the results are
undefined.
+ If type is not a standard-layout or trivially copyable class (Clause 9), the
results are undefined.
Or refer to the contiguous storage definition from 1.8/5 that you quoted.
I personally don't think we should overcomplicate this because offsetof is
supposed to be low-level C code. C code shouldn't be dealing with more complex
C++ types -- for that, we can have a C++ solution, including the subtraction
operator with pointer-to-members.
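struct S
{
    int i;
};
void f()
{
    S s{0};
    auto pm1 = &S::i;
    // Proposed operation between a pointer to member and an object pointer;
    // this is not valid C++ today.
    int *p1 = pm1 - &s;
    *p1 = 1;
    assert(s.i == 1);
}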
This bypasses the need for F.
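For comparison, the closest thing expressible in the language as it stands is to bind the pointer to member to an object and take the address of the result. This is a minimal sketch of existing behavior, not of the proposed operator; the function name `write_through_member_pointer` is invented for the illustration.

```cpp
#include <cassert>

struct S
{
    int i;
};

// Reaches s.i through a pointer to member without naming the member at the
// point of access (hypothetical helper name).
inline bool write_through_member_pointer()
{
    S s{0};
    int S::*pm1 = &S::i;
    int *p1 = &(s.*pm1);  // bind the pointer to member to a concrete object
    *p1 = 1;
    const bool written = (s.i == 1);
    assert(written);
    return written;
}
```

No byte offset appears anywhere here, which is the sense in which a pointer-to-member solution sidesteps item F.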
struct Kitty
{
    char asdf;
    double fdsa;
};
struct Meow
{
    Kitty kitty;
};
double Meow::*member = &Meow::kitty.fdsa; // currently ill-formed
std::size_t offset = offsetof(Meow, kitty.fdsa); // extension accepted by the compilers I use
template <typename Member, typename Outer>
ptrdiff_t make_offset(Outer *object, Member Outer::*pointer_to_member);
template <typename Member, typename Outer>
Member Outer::*make_pointer_to_member(Outer *object, ptrdiff_t offset);
On quite a few ABIs, the pointer to member *is* an offset anyway and those two
functions would ignore the first argument and return a reinterpreted second
argument.
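The first of the two declarations above can be approximated today with a run-time measurement on a real object. The sketch below is only that approximation, under the assumption that subtracting byte pointers into the same object behaves as it does on mainstream compilers; `sketch_make_offset` and `sketch_matches_offsetof` are invented names, not the proposed library functions.

```cpp
#include <cassert>
#include <cstddef>

struct Kitty
{
    char asdf;
    double fdsa;
};

// Hypothetical run-time stand-in for the proposed make_offset: measure the
// distance from the start of *object to the member the pointer designates.
template <typename Member, typename Outer>
std::ptrdiff_t sketch_make_offset(Outer *object, Member Outer::*pointer_to_member)
{
    const unsigned char *base = reinterpret_cast<const unsigned char *>(object);
    const unsigned char *member =
        reinterpret_cast<const unsigned char *>(&(object->*pointer_to_member));
    return member - base;
}

// For a standard-layout class the measurement should agree with offsetof.
inline bool sketch_matches_offsetof()
{
    Kitty kitty{'k', 1.5};
    const bool same = sketch_make_offset(&kitty, &Kitty::fdsa) ==
                      static_cast<std::ptrdiff_t>(offsetof(Kitty, fdsa));
    assert(same);
    return same;
}
```

Unlike the proposed function, this version really does need the object argument, and it yields a run-time value rather than a constant.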
On Tuesday, May 19, 2015 at 6:10:23 PM UTC-7, Nicol Bolas wrote:
> Do you know of compilers that implement classes, such that the offset from a pointer to type T to any non-virtual base member is not a compile-time constant? If all compilers implement classes such that the non-virtual-base members of any particular class have a consistent byte offset, then you don't need this classification at all. The only thing you need is a new version of `offsetof` that can take any type/member pair, so long as the member is not in a virtual base of the type.
No, I'm not aware of any such implementations. I think that offsetof() fails on some compilers that don't strictly enforce the rules if attempted on a class with virtual bases, even if the chosen member is not a member of one of the virtual bases...?
> Is your choice of excluding all polymorphic types based on actual knowledge of how compilers work? Or is it based on something else?
No, it's based on what I've tried to interpret of the Standard, and from various discussions of the topic on these mailing lists. In particular, the exclusion of classes with virtual functions was because of objections raised in a previous thread, particularly by Thiago, if my memory is correct. His (I think) point of view on the matter made sense to me--the Standard does not specify how virtual dispatch works, and crazy implementations are legal, so all bets are off when it comes to classes with virtual functions. Classes with virtual bases, of course, go crazy in both theory and in reality.
> Note that these compilers did have lots of differences. Of particular note were variances between where the base class goes relative to the derived. On one platform, it was after, while on the others, it was before.
> And yet, at no time were there non-static offsets for members of polymorphic types.
Yes, that may be true, but the Standard does not appear to exclude this possibility. The Standard does not even explicitly exclude this possibility for trivially-copyable types, but that proof earlier in the thread shows that this exclusion can be derived. The exclusion should probably be stated for clarity, in my opinion; "consistent-layout" would do this.
> So do you have actual experience with actual compilers that show that compilers will have variances regarding polymorphic types? If you haven't done the research, then my first suggestion would be to actually do the research and find out where the real intersection of your features E&F are. Not merely to think you know it, but to actually know where that intersection is in actual, live compilers.
> Also, from the examples you've given, I smell some kind of horrible serialization system behind this request, one that uses byte offsets and such to serialize members and so forth, rather than being hard-coded to the member names. My concern there is that reflection will basically do everything you need, and do it better, in a way that won't require this kind of low-level fiddling.
No, it's actually not about that. It's about having a meaningful low-level memory layout for classes.
This sort of thing is used for custom container classes for optimization, and also to interface with C code or operating system calls.
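One common shape of the container use case is an intrusive list, where the link node is embedded in the element and the element is recovered from a pointer to the link. The sketch below uses offsetof on a standard-layout type; `ListLink`, `Item`, `item_from_link`, and `recovers_enclosing_item` are invented names, and the pointer arithmetic relies on behavior that mainstream compilers support rather than on anything the Standard spells out in detail.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical embedded link, as used by intrusive containers.
struct ListLink
{
    ListLink *next;
};

// Standard-layout element that carries its own link.
struct Item
{
    int value;
    ListLink link;
};

// Walks back from the embedded link to the Item that contains it.
inline Item *item_from_link(ListLink *link)
{
    unsigned char *bytes = reinterpret_cast<unsigned char *>(link);
    return reinterpret_cast<Item *>(bytes - offsetof(Item, link));
}

inline bool recovers_enclosing_item()
{
    Item item{7, {nullptr}};
    Item *recovered = item_from_link(&item.link);
    const bool same = (recovered == &item) && (recovered->value == 7);
    assert(same);
    return same;
}
```

The same pattern stops being covered by offsetof as soon as Item gains mixed access control or a non-empty base class, even though it stays trivially copyable; that gap is what the "consistent-layout" category is meant to close.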
On Tuesday, May 19, 2015 at 10:12:40 PM UTC-4, Myriachan wrote:
> Yes, that may be true, but the Standard does not appear to exclude this possibility. The Standard does not even explicitly exclude this possibility for trivially-copyable types, but that proof earlier in the thread shows that this exclusion can be derived. The exclusion should probably be stated for clarity, in my opinion; "consistent-layout" would do this.
That's why I mentioned the whole POD-to-standard layout thing. C++98/03 didn't specify
If what you want to do is standardize existing practice, you first need to know what existing practice is. Arbitrarily deciding that existing practice for compile-time layout stops at virtual functions seems... arbitrary.