Syntax for Empty Base Optimization (second attempt)


Avi Kivity

May 22, 2016, 10:49:54 AM
to ISO C++ Standard - Future Proposals
Some time ago I proposed [1] new syntax for EBO. At that time the discussion devolved into an argument about the attribute syntax.  I'm proposing it again, changing the syntax to avoid attributes.


Library writers often find themselves wrapping possibly-empty objects in synthetic structs to take advantage of EBO:

  template <typename T, typename Allocator>
  class my_container {
      struct alloc_n_size : Allocator {
          size_t size;
          // ctor etc.
      } _M_alloc_n_size;   // only occupies sizeof(size_t) if is_empty<Allocator>.
      ...
  };

Would it not be more comfortable to supply some syntax for this:

  template <typename T, typename Allocator>
  class my_container {
      size_t _M_size;
      std::allow_zero_size<Allocator> _M_allocator;
      ...
  };

std::allow_zero_size<> is a library class template that uses compiler magic to tell the compiler that it is acceptable that the address of _M_allocator compare equal to that of some other object (that is, it need not insert padding if sizeof(Allocator) == 0).  It overloads operator.() and friends so that _M_allocator can be used as if std::allow_zero_size<> were not specified.

The new syntax allows library authors to provide optimized code, while avoiding the need to write obfuscated code everywhere.
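
To make the size difference concrete, here is a minimal sketch comparing the two layouts. The names `empty_alloc`, `with_member` and `with_ebo` are made up for illustration, and the exact sizes depend on the ABI, although mainstream compilers behave as the comments say.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for a stateless allocator: no data members, no virtuals.
struct empty_alloc {};

// Plain layout: the empty allocator is an ordinary data member, so it needs
// its own address and the struct grows by one byte plus alignment padding.
struct with_member {
    std::size_t size;
    empty_alloc alloc;
};

// The workaround from the post: inherit from the allocator so that the
// empty base optimization can give the base subobject no storage at all.
struct with_ebo : empty_alloc {
    std::size_t size;
};

static_assert(std::is_empty<empty_alloc>::value,
              "the allocator stand-in is an empty class");
static_assert(sizeof(with_member) > sizeof(std::size_t),
              "an empty data member still occupies storage");
static_assert(sizeof(with_ebo) == sizeof(std::size_t),
              "typical ABIs apply EBO here; the exact size is ABI-dependent");
```

On a typical 64-bit target this gives 16 bytes for `with_member` and 8 for `with_ebo`, which is the saving the synthetic struct is after.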

Arthur O'Dwyer

May 23, 2016, 8:09:32 PM
to ISO C++ Standard - Future Proposals
On Sunday, May 22, 2016 at 7:49:54 AM UTC-7, Avi Kivity wrote:
std::allow_zero_size<> is a library class template that uses compiler magic to tell the compiler that it is acceptable that the address of _M_allocator compare equal to that of some other object (that is, it need not insert padding if sizeof(Allocator) == 0).  It overloads operator.() and friends so that _M_allocator can be used as if std::allow_zero_size<> were not specified.

I think that if it were possible to implement such a std::allow_zero_size<T> today, everybody would be doing it (and that includes Boost).
The hard part isn't so much the semantics of the allow_zero_size class template — that sounds great to me — but rather the problem is that there is no possible implementation of it today.

It's kind of like saying "wouldn't it be a good idea if an object named std::cout existed and we could just write std::cout << foo to print the value of any type at all", in the days before operator overloading existed.  Those high-level semantics (arguably) sound great... but in order to implement those semantics, we need someone to do the core-language work of figuring out what it means to overload an operator, or in this case, to have an object with the same address as a different object.

If I'm wrong and there does currently exist a (non-portable but) working proof-of-concept implementation of foo::allow_zero_size<T>, then that's awesome and I want to see it. And your proposal should include a link to it.

–Arthur

Tony V E

May 23, 2016, 8:18:02 PM
to ISO C++ Standard - Future Proposals
From the original : "std::allow_zero_size<> is a library class template that uses compiler magic"

"Compiler magic" means there is no way for me or you to write it, but we could mandate that compilers recognize allow_zero_size<> and magically make it work. 

Maybe

  Allocator a[0];

would be an acceptable language alternative. Or

  Allocator a = delete; // default? register? ...

Otherwise, you need a new keyword.


Sent from my BlackBerry portable Babbage Device

--
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposal...@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.
To view this discussion on the web visit https://groups.google.com/a/isocpp.org/d/msgid/std-proposals/fb70edf7-1666-4668-8da7-069c02ee31ee%40isocpp.org.

Jeffrey Yasskin

May 23, 2016, 9:08:20 PM
to std-pr...@isocpp.org
Or a context-sensitive keyword like override. I don't fully understand the constraints on them, but "Type var allow_empty;" could work. Dunno if we want "allow_empty" or something more directly connected to the effect like "allow_same_address". 

Nicol Bolas

May 24, 2016, 1:40:44 PM
to ISO C++ Standard - Future Proposals

To be fair, we already have that. Standard-layout types with empty base classes require that a pointer to the base class subobject and a pointer to the derived object hold the same address. In effect, they enforce EBO.

It's simply a matter of expanding that to members. But doing so in a way that doesn't break the layout of existing members.
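
A small sketch of what that standard-layout guarantee looks like in practice. The type names are illustrative only. The first data member of a standard-layout class sits at offset 0, so an empty base cannot take up room in front of it.

```cpp
#include <cstddef>
#include <type_traits>

struct empty_base {};

// Standard-layout: only one class in the hierarchy declares data members,
// and the base is not of the same type as the first member.
struct derived : empty_base {
    int value;
};

static_assert(std::is_standard_layout<derived>::value,
              "derived is a standard-layout class");
// The first non-static data member shares the object's address ...
static_assert(offsetof(derived, value) == 0,
              "no storage is reserved ahead of the first member");
// ... so on the usual ABIs the empty base adds nothing to the size.
static_assert(sizeof(derived) == sizeof(int),
              "typical result; exact sizes are ABI-dependent");
```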

Nicol Bolas

May 24, 2016, 2:24:11 PM
to ISO C++ Standard - Future Proposals

To expand on this, I once wrote up a prospective proposal that included "stateless classes": empty types which would never take up space. That's slightly different from what the OP wants; he wants a variable to take up space or not, depending on whether the type is empty. His happens at the site of use; mine happens at the site of declaration.

I think the OP's proposal could work with my `stateless` construct, with two small changes to it. First, as I specified it, a `stateless` class cannot have subobjects of non-stateless type. That could be expanded to allow base-class subobjects of empty types. Since all stateless classes are by definition standard layout, they are required to have EBO. So their pointers would already be considered "related" to a pointer to a derived class, and thus allowed to be equal.

The other change would be permitting the `stateless` specifier to take a constant expression, like `noexcept`. If that expression evaluates to `true`, then the type is stateless; if it evaluates to false, it is not.

That would allow you to implement the OP's idea as follows:

template<typename T>
stateless(is_empty_v<T>) struct allow_zero_size : public T
{
    using T::T; // Forward constructors; inherit everything else.
};

If `T` is empty, then `allow_zero_size<T>` will be stateless.

I would say that a class which is marked as conditionally stateless should have all of the limitations of stateless types as I outline them (not being able to be aggregated into arrays).

I don't think there are any issues with implementing `stateless` types, within the limitations as I outlined them in my proposal. The key issue I sidestepped with my proposal was that `stateless` types are not zero-sized. They simply do not affect the size or layout of the types they are a subobject of.

The standardese issue is a bigger deal. Though `stateless` types not being zero-sized probably sidesteps most of the big issues.

Vinnie Falco

May 25, 2016, 6:32:54 AM
to ISO C++ Standard - Future Proposals
On Sunday, May 22, 2016 at 10:49:54 AM UTC-4, Avi Kivity wrote:
std::allow_zero_size<> is a library class template that uses compiler magic to tell the compiler that it is acceptable that the address of _M_allocator compare equal to that of some other object (that is, it need not insert padding if sizeof(Allocator) == 0).

Avi Kivity

May 26, 2016, 8:21:32 AM
to ISO C++ Standard - Future Proposals

On Tuesday, May 24, 2016 at 3:09:32 AM UTC+3, Arthur O'Dwyer wrote:
I think that if it were possible to implement such a std::allow_zero_size<T> today, everybody would be doing it (and that includes Boost).
The hard part isn't so much the semantics of the allow_zero_size class template — that sounds great to me — but rather the problem is that there is no possible implementation of it today.


Of course the implementation relies on compiler magic.  For example, libstdc++ might define it as

  template <typename T>
  struct allow_zero_size {
    T _M_elem [[gnu::allow_zero_size]];
    ...
  };

relying on a new, compiler-specific attribute.  Other parts of the standard library do this; for example search for __has_trivial_copy in <type_traits>.
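
For readers arriving at this thread later: C++20 standardized an attribute in this spirit, `[[no_unique_address]]`, which is applied directly to the data member rather than through a wrapper template. Below is a minimal sketch, assuming a compiler that honors the attribute (GCC and Clang do; MSVC accepts the standard spelling but only acts on `[[msvc::no_unique_address]]`). `empty_alloc` and the member names are illustrative, not taken from any library.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for a stateless allocator.
struct empty_alloc {};

template <typename T, typename Allocator>
class my_container {
    std::size_t size_ = 0;
    // The member is allowed to share its address with another subobject,
    // so an empty Allocator needs no storage or padding of its own.
    [[no_unique_address]] Allocator alloc_;

public:
    Allocator& get_allocator() { return alloc_; }
    std::size_t size() const { return size_; }
};

static_assert(std::is_empty<empty_alloc>::value,
              "the allocator stand-in is an empty class");
// Where the attribute is honored this is exactly sizeof(std::size_t);
// a compiler that ignores it produces a larger, still valid, layout.
static_assert(sizeof(my_container<int, empty_alloc>) >= sizeof(std::size_t),
              "the container can never be smaller than its size field");
```

The member is used like any other data member, so no `operator.()` forwarding is needed.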

 

It's kind of like saying "wouldn't it be a good idea if an object named std::cout existed and we could just write std::cout << foo to print the value of any type at all", in the days before operator overloading existed.  Those high-level semantics (arguably) sound great... but in order to implement those semantics, we need someone to do the core-language work of figuring out what it means to overload an operator, or in this case, to have an object with the same address as a different object.

If I'm wrong and there does currently exist a (non-portable but) working proof-of-concept implementation of foo::allow_zero_size<T>, then that's awesome and I want to see it. And your proposal should include a link to it.




There isn't.  The intent is to introduce functionality without changing the syntax, by wrapping compiler-specific syntax in a library class.

Marc

Jun 5, 2016, 4:56:53 AM
to ISO C++ Standard - Future Proposals
On Sunday, May 22, 2016 at 4:49:54 PM UTC+2, Avi Kivity wrote:
Some time ago I proposed [1] new syntax for EBO. At that time the discussion devolved into an argument about the attribute syntax.  I'm proposing it again, changing the syntax to avoid attributes.

I believe that the best way of moving forward with this is to implement your proposal (the attribute version) as an extension in gcc ( https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63579 ) or clang. That would include: write and test a patch, submit the patch to gcc/clang's mailing list, send a heads up to the cxx-abi-dev mailing list to give developers for other compilers a chance to comment on your exact ABI choices, start adding uses of this attribute in your code, boost, etc. And then you would be able to come to the committee with a stronger position.

Avi Kivity

Jun 10, 2016, 12:13:09 PM
to ISO C++ Standard - Future Proposals
Note that EBO is actively dangerous.  If you inherit from a class that defines a virtual member function that matches the signature of one of your own methods, then you end up overriding it for your EBO'd type.
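
A minimal sketch of the hazard, with hypothetical names (`policy`, `my_container`, `demo`). The container means `size()` to be an ordinary member of its own, but because the base declares a virtual function with the same signature, the container's version silently overrides it.

```cpp
#include <cassert>

// Hypothetical policy class that happens to declare a virtual size().
struct policy {
    virtual ~policy() = default;
    virtual int size() const { return 0; }
    // Internal logic that relies on the policy's own size().
    bool is_ready() const { return size() == 0; }
};

// Container that inherits from its policy parameter, hoping for EBO.
template <typename Policy>
struct my_container : Policy {
    // Meant as an unrelated, non-virtual member function, but it becomes
    // an override of Policy::size() because the signatures match.
    int size() const { return 3; }
};

inline void demo() {
    my_container<policy> c;
    const policy& base = c;
    assert(base.size() == 3);  // dispatches to my_container::size()
    assert(!c.is_ready());     // the policy's own logic now sees 3, not 0
}
```

Nothing in `my_container` mentions `virtual` or `override`, which is what makes the problem easy to miss.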

Avi Kivity

Jun 10, 2016, 12:16:23 PM
to ISO C++ Standard - Future Proposals
That is a very expensive way of moving forward.  It requires me to learn the details of gcc/clang (both large projects with a high barrier to entry).

I understand it for a complex proposal where there is a lot of effort needed anyway, but for small/trivial proposals like mine it's a good way to kill the proposal in its infancy. 

Nicol Bolas

Jun 10, 2016, 1:02:25 PM
to ISO C++ Standard - Future Proposals

Your proposal is most assuredly not trivial.

Your proposal requires changing how the compiler lays out a class; you declare an NSDM, but it somehow takes up no room. Your proposal requires that the address of an object (the empty NSDM) need not be distinct from other unrelated objects. And so forth.

I know it sounds rather burdensome to have to go through so much effort just to get something standardized. But despite how simple the idea sounds, you are still talking about a rather significant change to some very low-level parts of the system.

Empty base optimization is something that is much easier to do mechanically, because the conversion from derived class pointer to base class pointer is designed to not require the pointer value to change.


On Friday, June 10, 2016 at 12:13:09 PM UTC-4, Avi Kivity wrote:
Note that EBO is actively dangerous.  If you inherit from a class that defines a virtual member function that matches the signature of one of your own methods, then you end up overriding it for your EBO'd type.

An issue that could easily be fixed by adding one of two features:

1: The ability to declare a function which will *not* override from a base class virtual. That is, an explicit `nonvirtual`.

void funcname() nonvirtual;

2: The ability to declare that when inheriting from a class, you want to override nothing from that class. I would call this "final inheritance"; neither you nor your child classes can override virtual members of the specified base class.

class foo : public final bar {...};

Both of these would be a *lot* easier to implement than stateless members.

Thiago Macieira

Jun 10, 2016, 1:30:20 PM
to std-pr...@isocpp.org
On sexta-feira, 10 de junho de 2016 09:13:09 PDT Avi Kivity wrote:
> Note that EBO is actively dangerous. If you inherit from a class that
> defines a virtual member function that matches the signature of one of your
> own methods, then you end up overriding it for your EBO'd type.

A class with virtuals is not empty.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center

Avi Kivity

Jun 10, 2016, 2:11:01 PM
to ISO C++ Standard - Future Proposals


On Friday, June 10, 2016 at 8:02:25 PM UTC+3, Nicol Bolas wrote:
Your proposal is most assuredly not trivial.

Your proposal requires changing how the compiler lays out a class; you declare an NSDM, but it somehow takes up no room.


C compilers (including gcc and clang) manage to do it just fine.
 
Your proposal requires that the address of an object (the empty NSDM) need not be distinct from other unrelated objects.

Something that C compilers seem to be able to live with.

 
And so forth.

Is there anything else?
 

I know it sounds rather burdensome to have to go through so much effort just to get something standardized. But despite how simple the idea sounds, you are still talking about a rather significant change to some very low-level parts of the system.

I must disagree.  Both the C compiler prior art, and the compilers laying out base classes with zero size (base classes are no more than data members at these low levels) support me.  Any non-simplicity would be in possible conflicts with aliasing rules, but since both gcc and clang support empty data members (in C), implementing the front-end syntax for that would be no help in figuring those out.
 

Empty base optimization is something that is much easier to do mechanically, because the conversion from derived class pointer to base class pointer is designed to not require the pointer value to change.


It is not easy to do mechanically, esp. for a template class.  What if the class is final?  What if the class is not an aggregate, but a primitive type?  What if the class starts making member functions of the inheriting class virtual?

Avi Kivity

Jun 10, 2016, 2:13:41 PM
to ISO C++ Standard - Future Proposals


On Friday, June 10, 2016 at 8:30:20 PM UTC+3, Thiago Macieira wrote:
On sexta-feira, 10 de junho de 2016 09:13:09 PDT Avi Kivity wrote:
> Note that EBO is actively dangerous.  If you inherit from a class that
> defines a virtual member function that matches the signature of one of your
> own methods, then you end up overriding it for your EBO'd type.

A class with virtuals is not empty.


You don't know that beforehand.

template <class PossiblyEmptyComparator>
struct my_container;

Should my_container inherit from PossiblyEmptyComparator, or should it contain it as a data member?

What if PossiblyEmptyComparator is a function pointer type?
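
The cases raised in this exchange can be told apart with standard type traits. The sketch below uses a hypothetical helper, `can_use_ebo`, and made-up comparator types; it only shows how a template could decide, not how any existing library does it.

```cpp
#include <type_traits>

struct stateless_less {                 // empty: safe and useful as a base
    bool operator()(int a, int b) const { return a < b; }
};
struct sealed_less final {              // empty but final: cannot be a base
    bool operator()(int a, int b) const { return a < b; }
};
struct virtual_less {                   // carries a vptr: not empty
    virtual ~virtual_less() = default;
    virtual bool operator()(int a, int b) const { return a < b; }
};
using fn_less = bool (*)(int, int);     // not a class type at all

// Hypothetical helper: true only when inheriting is both legal and pays off.
template <typename T>
struct can_use_ebo
    : std::integral_constant<bool,
          std::is_empty<T>::value && !std::is_final<T>::value> {};

static_assert(can_use_ebo<stateless_less>::value, "plain empty functor");
static_assert(!can_use_ebo<sealed_less>::value, "final classes are excluded");
static_assert(!can_use_ebo<virtual_less>::value, "polymorphic classes are not empty");
static_assert(!can_use_ebo<fn_less>::value, "function pointers are not classes");
```

Because `std::is_empty` is false for any class with virtual functions, a base chosen this way can never trigger the accidental-override problem.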

Avi Kivity

Jun 10, 2016, 2:17:38 PM
to ISO C++ Standard - Future Proposals


On Friday, June 10, 2016 at 8:02:25 PM UTC+3, Nicol Bolas wrote:

An issue that could easily be fixed by adding one of two features:

1: The ability to declare a function which will *not* override from a base class virtual. That is, an explicit `nonvirtual`.

void funcname() nonvirtual;


So my container type, and all derived classes, must declare all their methods nonvirtual?  That's no solution.

 
2: The ability to declare that when inheriting from a class, you want to override nothing from that class. I would call this "final inheritance"; neither you nor your child classes can override virtual members of the specified base class.

class foo : public final bar {...};


Then my

  template <class Bar>
  struct foo : public final Bar { ... };

would break every time I add some virtual method to Bar that happens to conflict with it (or any of the derived classes).

 
Both of these would be a *lot* easier to implement than stateless members.

Actually, they are far harder to implement.  Zero-sized members are already supported by (at least most) compilers; all that is missing is syntax to use the capability.
 

Ville Voutilainen

Jun 10, 2016, 2:22:19 PM
to ISO C++ Standard - Future Proposals
On 10 June 2016 at 20:02, Nicol Bolas <jmck...@gmail.com> wrote:
> 2: The ability to declare that when inheriting from a class, you want to
> override nothing from that class. I would call this "final inheritance";
> neither you nor your child classes can override virtual members of the
> specified base class.
>
> class foo : public final bar {...};


I have considered proposing and implementing something like that. I spelled it

class foo final_overrider : public bar {...};

because that allows designating a class that doesn't inherit from any
other class as a final overrider.
Sure, it's different in the sense that foo can be derived from but
none of the virtual functions
in it or its bases can be overridden in such derived classes. The base
controls the ability to override,
not the derived class, like in your idea. I'm not saying either is
necessarily better, I'm just reporting
that such ideas have been considered.

Arthur O'Dwyer

Jun 10, 2016, 3:07:19 PM
to ISO C++ Standard - Future Proposals
On Fri, Jun 10, 2016 at 11:13 AM, Avi Kivity <a...@scylladb.com> wrote:
On Friday, June 10, 2016 at 8:30:20 PM UTC+3, Thiago Macieira wrote:
On sexta-feira, 10 de junho de 2016 09:13:09 PDT Avi Kivity wrote:
> Note that EBO is actively dangerous.  If you inherit from a class that
> defines a virtual member function that matches the signature of one of your
> own methods, then you end up overriding it for your EBO'd type.

A class with virtuals is not empty.

You don't know that beforehand.

template <class PossiblyEmptyComparator>
struct my_container;

Should my_container inherit from PossiblyEmptyComparator, or should it contain it as a data member?

What if PossiblyEmptyComparator is a function pointer type?

Now you're no longer talking about EBO, though. You're talking about NEBP: the Non-Empty Base Pessimization. Obviously you should never name anything as a base class of yours if you don't know what's in it. Library implementors don't do that; why should you?

Instead, you'd do something roughly like

template<class PossiblyEmptyComparator, class Enable=void>
struct ComparatorPlusInt {
    PossiblyEmptyComparator c;
    int i;
    PossiblyEmptyComparator& get_comparator() { return c; }
    int& get_int() { return i; }
};

template<class EmptyComparator>
struct ComparatorPlusInt<EmptyComparator, enable_if_t<is_empty_v<EmptyComparator>>> : EmptyComparator {
    int i;
    EmptyComparator& get_comparator() { return *(EmptyComparator*)(this); }
    int& get_int() { return i; }
};

and then make a member of type ComparatorPlusInt<MyComparator>, which does the right thing either way. Your class doesn't need to trust the MyComparator class, because it's only ever inherited-from when it is empty (and even then, it's not inherited by your class but by the ComparatorPlusInt class).
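
The same idea can be factored into a reusable single-slot helper so that the two-specialization dance is written once. The sketch below is one possible shape under stated assumptions: `compressed_slot` and `compare_plus_int` are hypothetical names, and the check also excludes `final` classes, which the snippet above does not handle.

```cpp
#include <type_traits>
#include <utility>

// Primary template: T is stored as an ordinary data member.
template <typename T,
          bool = std::is_empty<T>::value && !std::is_final<T>::value>
class compressed_slot {
    T value_;
public:
    compressed_slot() = default;
    explicit compressed_slot(T v) : value_(std::move(v)) {}
    T& get() { return value_; }
    const T& get() const { return value_; }
};

// Specialization: T is empty and inheritable, so it is stored as a base.
template <typename T>
class compressed_slot<T, true> : private T {
public:
    compressed_slot() = default;
    explicit compressed_slot(T v) : T(std::move(v)) {}
    T& get() { return *this; }
    const T& get() const { return *this; }
};

// Example user: derives from the slot so an empty comparator costs nothing.
template <typename Compare>
struct compare_plus_int : private compressed_slot<Compare> {
    int i = 0;
    Compare& comparator() { return this->get(); }
};

struct empty_less {
    bool operator()(int a, int b) const { return a < b; }
};
using fn_less = bool (*)(int, int);

static_assert(sizeof(compare_plus_int<empty_less>) == sizeof(int),
              "typical ABIs: the empty comparator takes no space");
static_assert(sizeof(compare_plus_int<fn_less>) > sizeof(fn_less),
              "a function pointer is stored as a real member");
```

The user still has to inherit from the slot to get the saving; holding a `compressed_slot<Empty>` as a data member would again cost at least one byte.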

Which I guess raises the question of whether std::pair<EmptyComparator, int> and/or std::pair<int, EmptyComparator> should be allowed and/or required to DTRT in this case. I don't actually know what the current wording is, and suspect that it might have the goal of allowing std::pair<X,Y> to be layout-equivalent to a POD struct whenever possible (which conflicts with the goal of making it as small as possible).

I'm pretty sure you (Avi) know all this already, so I'm kind of confused how we got onto the topic of "what if my empty base class has virtual members", "what if my empty base class isn't a class at all", etc.  If you want better support for EBO, how about designing an EBO-friendly std::pair/std::tuple (which might already be done, for all I know); or else doing the heavy lifting of figuring out what it would mean for two non-zero-size objects to share a memory address; or else doing the heavy lifting of figuring out what it would mean for an object to have zero size.

Re your comments elsethread: C doesn't have a lot of these "heavy lifting" problems because it does allow zero-sized objects and it doesn't have a very strong type system the way C++ does. C++'s heavy lifting isn't in the machine-level implementation details; there you're right that the compilers can just "do what C does."  The heavy lifting is in the C++-specific stuff: the high-level semantics, the language. That's the hard part.

–Arthur

Avi Kivity

Jun 10, 2016, 3:21:50 PM
to std-pr...@isocpp.org
On Fri, Jun 10, 2016 at 10:07 PM, Arthur O'Dwyer <arthur....@gmail.com> wrote:
Now you're no longer talking about EBO, though. You're talking about NEBP: the Non-Empty Base Pessimization. Obviously you should never name anything as a base class of yours if you don't know what's in it. Library implementors don't do that; why should you?


They do it all the time, with exactly the example I gave.  A colleague hit it recently with boost's binomial heap.

Are you suggesting you should never inherit from a template parameter?  Because then template classes can choose from:

1. Inheriting from the base class and hitting weird problems
2. Using complex enable_if style solutions
3. Eliding the optimization altogether.

This could be so easily solved with 

4. Adding [[allow_empty_size]] attribute to the data member.

But I guess we must preserve C++'s reputation for making things hard on its users.
 

Instead, you'd do something roughly like

template<class PossiblyEmptyComparator, class Enable=void>
struct ComparatorPlusInt {
    PossiblyEmptyComparator c;
    int i;
    PossiblyEmptyComparator& get_comparator() { return c; }
    int& get_int() { return i; }
};

template<class EmptyComparator>
struct ComparatorPlusInt<EmptyComparator, enable_if_t<is_empty_v<EmptyComparator>>> : EmptyComparator {
    int i;
    EmptyComparator& get_comparator() { return *(EmptyComparator*)(this); }
    int& get_int() { return i; }
};

and then make a member of type ComparatorPlusInt<MyComparator>, which does the right thing either way. Your class doesn't need to trust the MyComparator class, because it's only ever inherited-from when it is empty (and even then, it's not inherited by your class but by the ComparatorPlusInt class).


I'm not saying it can't be done.  I'm saying that a 15-line solution which must be repeated every time it is used is a horrible solution.

 
Which I guess raises the question of whether std::pair<EmptyComparator, int> and/or std::pair<int, EmptyComparator> should be allowed and/or required to DTRT in this case. I don't actually know what the current wording is, and suspect that it might have the goal of allowing std::pair<X,Y> to be layout-equivalent to a POD struct whenever possible (which conflicts with the goal of making it as small as possible).

I'm pretty sure you (Avi) know all this already, so I'm kind of confused how we got onto the topic of "what if my empty base class has virtual members", "what if my empty base class isn't a class at all",


As I already explained, at the point where I (or any library writer) want to apply EBO, it is not known whether the base class is empty or not.  As I'm sure you also know already.

 
etc.  If you want better support for EBO, how about designing an EBO-friendly std::pair/std::tuple (which might already be done, for all I know); or else doing the heavy lifting of figuring out what it would mean for two non-zero-size objects to share a memory address; or else doing the heavy lifting of figuring out what it would mean for an object to have zero size.


Again I'm sure it can be done, but this obfuscates the code.  I wanted the data member not to take any space, not to pair it with another member.

I understand not wanting to add to the standard for every little code cleanup, but EBO is very common, is required for an efficient implementation of the standard itself, and would be a lot more common if it were not so hard to implement.
 

Re your comments elsethread: C doesn't have a lot of these "heavy lifting" problems because it does allow zero-sized objects and it doesn't have a very strong type system the way C++ does. C++'s heavy lifting isn't in the machine-level implementation details; there you're right that the compilers can just "do what C does."  The heavy lifting is in the C++-specific stuff: the high-level semantics, the language. That's the hard part.


Then it's totally unreasonable for a compiler newbie like myself to try and figure them out.  But can you explain where in the high level semantics a zero sized data member enters at all?  

 

–Arthur


Nicol Bolas

Jun 10, 2016, 3:25:20 PM
to ISO C++ Standard - Future Proposals
On Friday, June 10, 2016 at 3:07:19 PM UTC-4, Arthur O'Dwyer wrote:
Re your comments elsethread: C doesn't have a lot of these "heavy lifting" problems because it does allow zero-sized objects and it doesn't have a very strong type system the way C++ does. C++'s heavy lifting isn't in the machine-level implementation details; there you're right that the compilers can just "do what C does."  The heavy lifting is in the C++-specific stuff: the high-level semantics, the language. That's the hard part.

–Arthur

To be fair, the high level semantics are the easy part.

We know exactly what behavior we want from these things. All we really want is for a type to be able to take up no space while it is an NSDM or base class of another type. It can still have a non-zero `sizeof` (thus permitting allocating them on the heap), so that's a can of worms we don't need to open up. And the `this` pointer for member functions is irrelevant: because the class is stateless, it doesn't need to access any non-stateless NSDMs.

The principal standardization problems that I can see for the concept are:

1: Aliasing. Pointers to such classes have to be able to alias with unrelated types.

2: Layout. Stateless members should not disrupt the layout of the other NSDMs within a type. This also affects ABIs, since NSDMs have always disrupted C++ member layouts before. So the Itanium C++ ABI would need to be changed... I guess?

3: The rule(s) on objects having to have a specific piece of storage.

4: Pointer addressing and arithmetic. If you increment a pointer to a stateless NSDM, what do you get? Can you have NSDM arrays of stateless members, and if so, do they take up space?

5: Trivial copyability. If the object takes up no space, yet has a non-zero `sizeof()`, is the type trivially copyable? If a stateless pointer always aliases with some other object's memory, wouldn't trivially copying it bash that other object's memory? Or do we say that stateless types are not trivially copyable, but having them as subobjects does not by itself make their owning objects non-trivially copyable?

Note that #5 is something I just realized while composing this message; I hadn't considered that issue before.

I don't consider these "high-level" problems. The high-level of the idea is quite clear. But they are significant things that have to be ironed out.
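
The storage and address rules in the list above already show up with plain EBO today. A short sketch with illustrative type names: two subobjects of the same type must have distinct addresses, so the optimization is switched off when an empty base and a member share a type, while different empty types may overlap.

```cpp
#include <type_traits>

struct empty {};
struct other_empty {};

// The base subobject and the member are both of type `empty`, so they must
// live at different addresses and the base can no longer overlap the member.
struct same_type_twice : empty {
    empty e;
};

// Different empty types are allowed to share an address, so EBO applies.
struct different_types : other_empty {
    empty e;
};

static_assert(sizeof(empty) >= 1,
              "a complete object never has size zero");
static_assert(sizeof(same_type_twice) > sizeof(empty),
              "two `empty` subobjects need two distinct addresses");
static_assert(sizeof(different_types) == sizeof(empty),
              "typical ABIs overlap the base and the member here");
```

Any rule for stateless members would need to say how this same-type case behaves, since it is exactly the situation where "takes no space" and "has a distinct address" collide.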

Howard Hinnant

Jun 10, 2016, 3:32:37 PM6/10/16
to std-pr...@isocpp.org
On Jun 10, 2016, at 3:07 PM, Arthur O'Dwyer <arthur....@gmail.com> wrote:
>
> I'm pretty sure you (Avi) know all this already, so I'm kind of confused how we got onto the topic of "what if my empty base class has virtual members", "what if my empty base class isn't a class at all", etc. If you want better support for EBO, how about designing an EBO-friendly std::pair/std::tuple (which might already be done, for all I know); or else doing the heavy lifting of figuring out what it would mean for two non-zero-size objects to share a memory address; or else doing the heavy lifting of figuring out what it would mean for an object to have zero size.

Can’t do it for pair because of the need to have the named data members first and second. But for tuple…


#include <iostream>
#include <tuple>

struct empty {};

int
main()
{
    std::cout << sizeof(int) << '\n';
    std::cout << sizeof(std::tuple<int, empty>) << '\n';
    std::cout << sizeof(long long) << '\n';
    std::cout << sizeof(std::tuple<long long, empty>) << '\n';
}

gcc and clang/libc++ output:

4
4
8
8

The optimization is allowed but not required for tuple.

Howard


Nicol Bolas

Jun 10, 2016, 3:38:05 PM6/10/16
to ISO C++ Standard - Future Proposals
On Friday, June 10, 2016 at 3:21:50 PM UTC-4, Avi Kivity wrote:
On Fri, Jun 10, 2016 at 10:07 PM, Arthur O'Dwyer <arthur....@gmail.com> wrote:
On Fri, Jun 10, 2016 at 11:13 AM, Avi Kivity <a...@scylladb.com> wrote:
On Friday, June 10, 2016 at 8:30:20 PM UTC+3, Thiago Macieira wrote:
On sexta-feira, 10 de junho de 2016 09:13:09 PDT Avi Kivity wrote:
> Note that EBO is actively dangerous.  If you inherit from a class that
> defines a virtual member function that matches the signature of one of your
> own methods, then you end up overriding it for your EBO'd type.

A class with virtuals is not empty.

You don't know that beforehand.

template <class PossiblyEmptyComparator>
struct my_container;

Should my_container inherit from PossiblyEmptyComparator, or should it contain it as a data member?

What if PossiblyEmptyComparator is a function pointer type?

Now you're no longer talking about EBO, though. You're talking about NEBP: the Non-Empty Base Pessimization. Obviously you should never name anything as a base class of yours if you don't know what's in it. Library implementors don't do that; why should you?


They do it all the time, with exactly the example I gave.  A colleague hit it recently with boost's binomial heap.

Are you suggesting you should never inherit from a template parameter?

As he clearly said, you shouldn't inherit from something you don't know what it is.
 
Because then template classes can choose from:

1. Inheriting from the base class and hitting weird problems
2. Using complex enable_if style solutions
3. Eliding the optimization altogether.

This could be so easily solved with 

4. Adding [[allow_empty_size]] attribute to the data member.

Attributes are never allowed to change the behavior of a program. That's the rule with them. Period.
 
But I guess we must preserve C++'s reputation for making things hard on its users.
 
Re your comments elsethread: C doesn't have a lot of these "heavy lifting" problems because it does allow zero-sized objects and it doesn't have a very strong type system the way C++ does. C++'s heavy lifting isn't in the machine-level implementation details; there you're right that the compilers can just "do what C does."  The heavy lifting is in the C++-specific stuff: the high-level semantics, the language. That's the hard part.

Then it's totally unreasonable for a compiler newbie like myself to try and figure them out.  But can you explain where in the high level semantics a zero sized data member enters at all?

... Everywhere? Is that a place?

C++ defines an object, first and foremost, as "a region of storage". The entire C++ object model relies on that. A zero-sized object is anathema to that definition. You would have to rewrite a lot of the standard before you can permit zero-sized objects.

Nicol Bolas

Jun 10, 2016, 3:38:50 PM6/10/16
to ISO C++ Standard - Future Proposals
On Friday, June 10, 2016 at 2:11:01 PM UTC-4, Avi Kivity wrote:
On Friday, June 10, 2016 at 8:02:25 PM UTC+3, Nicol Bolas wrote:
On Friday, June 10, 2016 at 12:16:23 PM UTC-4, Avi Kivity wrote:
On Sunday, June 5, 2016 at 11:56:53 AM UTC+3, Marc wrote:
On Sunday, May 22, 2016 at 4:49:54 PM UTC+2, Avi Kivity wrote:
Some time ago I proposed [1] new syntax for EBO. At that time the discussion devolved into an argument about the attribute syntax.  I'm proposing it again, changing the syntax to avoid attributes.

I believe that the best way of moving forward with this is to implement your proposal (the attribute version) as an extension in gcc ( https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63579 ) or clang. That would include: write and test a patch, submit the patch to gcc/clang's mailing list, send a heads up to the cxx-abi-dev mailing list to give developers for other compilers a chance to comment on your exact ABI choices, start adding uses of this attribute in your code, boost, etc. And then you would be able to come to the committee with a stronger position.

That is a very expensive way of moving forward.  It requires me to learn the details of gcc/clang (both large projects with a high barrier to entry).

I understand it for a complex proposal where there is a lot of effort needed anyway, but for small/trivial proposals like mine it's a good way to kill the proposal in its infancy. 

Your proposal is most assuredly not trivial.

Your proposal requires changing how the compiler lays out a class; you declare an NSDM, but it somehow takes up no room.

C compilers (including gcc and clang) manage to do it just fine.

As Arthur said, C allows zero-sized objects. There are a lot of rules in C++ that get broken if you permit that. That's not an implementation problem so much as a standardization problem.

And so forth.

Is there anything else?

How about the Itanium ABI? I don't see the part of that which permits empty NSDMs of a class to not take up space. It permits EBO, but not stateless NSDMs.

Empty base optimization is something that is much easier to do mechanically, because the conversion from derived class pointer to base class pointer is designed to not require the pointer value to change.

It is not easy to do mechanically, esp. for a template class.  What if the class is final?  What if the class is not an aggregate, but a primitive type?  What if the class starts making member functions of the inheriting class virtual?

It's easier to do mechanically from the perspective of the standard and implementation. The layout of base classes relative to the derived class and to one another is implementation defined. The standard has to permit base class pointers to alias with derived class ones, so as to support implementations where base class members come first. This is what makes EBO possible.

That's what I meant by "mechanically". How you happen to use any available EBO in your application is entirely up to you.

Avi Kivity

Jun 10, 2016, 3:48:02 PM6/10/16
to std-pr...@isocpp.org
On Fri, Jun 10, 2016 at 10:38 PM, Nicol Bolas <jmck...@gmail.com> wrote:
On Friday, June 10, 2016 at 3:21:50 PM UTC-4, Avi Kivity wrote:
On Fri, Jun 10, 2016 at 10:07 PM, Arthur O'Dwyer <arthur....@gmail.com> wrote:
On Fri, Jun 10, 2016 at 11:13 AM, Avi Kivity <a...@scylladb.com> wrote:
On Friday, June 10, 2016 at 8:30:20 PM UTC+3, Thiago Macieira wrote:
On sexta-feira, 10 de junho de 2016 09:13:09 PDT Avi Kivity wrote:
> Note that EBO is actively dangerous.  If you inherit from a class that
> defines a virtual member function that matches the signature of one of your
> own methods, then you end up overriding it for your EBO'd type.

A class with virtuals is not empty.

You don't know that beforehand.

template <class PossiblyEmptyComparator>
struct my_container;

Should my_container inherit from PossiblyEmptyComparator, or should it contain it as a data member?

What if PossiblyEmptyComparator is a function pointer type?

Now you're no longer talking about EBO, though. You're talking about NEBP: the Non-Empty Base Pessimization. Obviously you should never name anything as a base class of yours if you don't know what's in it. Library implementors don't do that; why should you?


They do it all the time, with exactly the example I gave.  A colleague hit it recently with boost's binomial heap.

Are you suggesting you should never inherit from a template parameter?

As he clearly said, you shouldn't inherit from something you don't know what it is.


Well, then how can you apply EBO in a library? Say, std::unordered_set<Key, Hash, ...>, where Hash may be, and often is, empty.
 
 
Because then template classes can choose from:

1. Inheriting from the base class and hitting weird problems
2. Using complex enable_if style solutions
3. Eliding the optimization altogether.

This could be so easily solved with 

4. Adding [[allow_empty_size]] attribute to the data member.

Attributes are never allowed to change the behavior of a program. That's the rule with them. Period.
 

I'm fine with other syntax.  But please, not 15 lines of enable_if.

 
But I guess we must preserve C++'s reputation for making things hard on its users.
 

Re your comments elsethread: C doesn't have a lot of these "heavy lifting" problems because it does allow zero-sized objects and it doesn't have a very strong type system the way C++ does. C++'s heavy lifting isn't in the machine-level implementation details; there you're right that the compilers can just "do what C does."  The heavy lifting is in the C++-specific stuff: the high-level semantics, the language. That's the hard part.

Then it's totally unreasonable for a compiler newbie like myself to try and figure them out.  But can you explain where in the high level semantics a zero sized data member enters at all?

... Everywhere? Is that a place?

C++ defines an object, first and foremost, as "a region of storage". The entire C++ object model relies on that. A zero-sized object is anathema to that definition.


Would not the zero-sized object occupy a zero-sized region of storage?

 
You would have to rewrite a lot of the standard before you can permit zero-sized objects.


Could you give me an example?
 

--
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposal...@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.

Thiago Macieira

Jun 10, 2016, 3:53:53 PM6/10/16
to std-pr...@isocpp.org
On sexta-feira, 10 de junho de 2016 11:13:41 PDT Avi Kivity wrote:
> On Friday, June 10, 2016 at 8:30:20 PM UTC+3, Thiago Macieira wrote:
> > On sexta-feira, 10 de junho de 2016 09:13:09 PDT Avi Kivity wrote:
> >
> > > Note that EBO is actively dangerous. If you inherit from a class that
> > > defines a virtual member function that matches the signature of one of
> >
> > your
> >
> > > own methods, then you end up overriding it for your EBO'd type.
> >
> > A class with virtuals is not empty.
>
> You don't know that beforehand.

Yes, you do, otherwise you're putting the cart ahead of the ox.

EBO is a solution for the problem of taking up space when you know the class
is empty. If you know it's not empty or you don't know, you don't derive from
that class.
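One way to read "you know the class is empty" in generic code is a compile-time query. The sketch below uses `std::is_empty` and `std::is_final` (the latter needs C++14); the trait name `can_be_empty_base` and the sample types are assumptions made for this illustration, not something proposed in the thread.

```cpp
#include <cassert>
#include <type_traits>

struct Stateless {};                                       // plain empty class
struct WithVirtual { virtual ~WithVirtual() = default; };  // carries a vptr
struct FinalEmpty final {};                                // empty, but cannot be a base
using FnPtr = bool (*)(int, int);                          // not a class at all

static_assert(std::is_empty<Stateless>::value, "no NSDMs, no virtuals");
static_assert(!std::is_empty<WithVirtual>::value, "a class with virtuals is not empty");
static_assert(!std::is_empty<FnPtr>::value, "non-class types are never empty");
static_assert(std::is_empty<FinalEmpty>::value, "final does not add state");

// Hypothetical helper: true only when deriving from T is both legal and free.
template <class T>
struct can_be_empty_base
    : std::integral_constant<bool,
                             std::is_empty<T>::value && !std::is_final<T>::value> {};
```

A template can branch on `can_be_empty_base<T>::value` before deciding whether to derive from `T`, which covers the virtual-function, final-class, and function-pointer cases raised above.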

Thiago Macieira

Jun 10, 2016, 3:56:24 PM6/10/16
to std-pr...@isocpp.org
On sexta-feira, 10 de junho de 2016 12:25:20 PDT Nicol Bolas wrote:
> We know *exactly* what behavior we want from these things. All we really
> want is for a type to be able to take up no space while it is an NSDM or
> base class of another type. It can still have a non-zero `sizeof` (thus
> permitting allocating them on the heap), so that's a can of worms we don't
> need to open up. And the `this` pointer for member functions is irrelevant,
> since the class is stateless, it doesn't need to access any
> non-stateless NSDMs.

The this pointer may be used for other things, like looking up information in
a global variable keyed to the address.
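A minimal sketch of that concern, assuming a hypothetical global registry keyed by object address (`live_objects`, `Tracked`, and `Holder` are invented for this example): the class has no data members, yet its behavior depends on every instance having its own address.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <type_traits>

// Hypothetical registry: the addresses of all live Tracked objects.
inline std::set<const void*>& live_objects() {
    static std::set<const void*> registry;
    return registry;
}

struct Tracked {                              // empty class: no NSDMs at all
    Tracked()               { live_objects().insert(this); }
    Tracked(const Tracked&) { live_objects().insert(this); }
    ~Tracked()              { live_objects().erase(this); }
};

static_assert(std::is_empty<Tracked>::value, "Tracked carries no state of its own");

struct Holder {
    Tracked a;
    Tracked b;    // today the language guarantees &a != &b
};

// Number of registry entries while one Holder is alive.
inline std::size_t count_tracked_in_holder() {
    Holder h;
    (void)h;
    return live_objects().size();
}
```

If `a` and `b` were allowed to share one address, the registry would hold a single entry for both, and the first destructor to run would erase it on behalf of the other object as well.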

Avi Kivity

Jun 10, 2016, 4:11:42 PM6/10/16
to std-pr...@isocpp.org
On Fri, Jun 10, 2016 at 10:53 PM, Thiago Macieira <thi...@macieira.org> wrote:
On sexta-feira, 10 de junho de 2016 11:13:41 PDT Avi Kivity wrote:
> On Friday, June 10, 2016 at 8:30:20 PM UTC+3, Thiago Macieira wrote:
> > On sexta-feira, 10 de junho de 2016 09:13:09 PDT Avi Kivity wrote:
> >
> > > Note that EBO is actively dangerous.  If you inherit from a class that
> > > defines a virtual member function that matches the signature of one of
> >
> > your
> >
> > > own methods, then you end up overriding it for your EBO'd type.
> >
> > A class with virtuals is not empty.
>
> You don't know that beforehand.

Yes, you do, otherwise you're putting the cart ahead of the ox.

EBO is a solution for the problem of taking up space when you know the class
is empty. If you know it's not empty or you don't know, you don't derive from
that class.



Then it's very difficult to use EBO.  You have to provide two specializations for the two cases, because in the general case, you know very little about the parameter.

I'm trying to make EBO usable.
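For concreteness, the two-specialization boilerplate being described might look like the sketch below. Every name in it (`maybe_empty`, `sorted_box`, `int_less`) is hypothetical, `std::is_final` requires C++14, and the size results in the usage note are what common ABIs give rather than a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <type_traits>
#include <utility>

// Primary template: store T as a member. Works for any T, including
// function pointers, final classes, and classes that have state.
template <class T,
          bool UseBase = std::is_empty<T>::value && !std::is_final<T>::value>
class maybe_empty {
    T value_;
public:
    explicit maybe_empty(T v = T()) : value_(std::move(v)) {}
    T&       get()       { return value_; }
    const T& get() const { return value_; }
};

// Specialization: T is known to be an empty, non-final class, so derive
// from it privately and let the empty base optimization apply.
template <class T>
class maybe_empty<T, true> : private T {
public:
    explicit maybe_empty(T v = T()) : T(std::move(v)) {}
    T&       get()       { return *this; }
    const T& get() const { return *this; }
};

// A toy container that keeps its comparator through the helper.
template <class Compare>
class sorted_box : private maybe_empty<Compare> {
    std::size_t size_ = 0;
public:
    explicit sorted_box(Compare c = Compare())
        : maybe_empty<Compare>(std::move(c)) {}
    bool less(int a, int b) const { return this->get()(a, b); }
    std::size_t size() const { return size_; }
};

// A plain function, usable as a function-pointer comparator.
inline bool int_less(int a, int b) { return a < b; }
using int_less_ptr = bool (*)(int, int);
```

With this helper, `sorted_box<std::less<int>>` is typically the size of one `std::size_t`, while `sorted_box<int_less_ptr>` pays for the pointer it actually stores. Because a class with virtual functions is never empty, it always takes the member path, so the accidental-override hazard does not arise here; the cost is the extra template machinery.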

 
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel Open Source Technology Center


Nicol Bolas

Jun 10, 2016, 5:56:05 PM6/10/16
to ISO C++ Standard - Future Proposals
On Friday, June 10, 2016 at 3:48:02 PM UTC-4, Avi Kivity wrote:
On Fri, Jun 10, 2016 at 10:38 PM, Nicol Bolas <jmck...@gmail.com> wrote:
Re your comments elsethread: C doesn't have a lot of these "heavy lifting" problems because it does allow zero-sized objects and it doesn't have a very strong type system the way C++ does. C++'s heavy lifting isn't in the machine-level implementation details; there you're right that the compilers can just "do what C does."  The heavy lifting is in the C++-specific stuff: the high-level semantics, the language. That's the hard part.

Then it's totally unreasonable for a compiler newbie like myself to try and figure them out.  But can you explain where in the high level semantics a zero sized data member enters at all?

... Everywhere? Is that a place?

C++ defines an object, first and foremost, as "a region of storage". The entire C++ object model relies on that. A zero-sized object is anathema to that definition.


Would not the zero-sized object occupy a zero-sized region of storage?

And what exactly is "a zero-sized region of storage"? You cannot allocate or deallocate nothing. Even `malloc` doesn't work reasonably with zero. You may or may not get back a NULL pointer, but whatever you get back, you aren't allowed to dereference it.

What does it mean to have the address of, or a reference to, nothing? What does it mean to perform pointer arithmetic on a pointer to nothing? Can you have an array of nothing?
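For comparison, here is a sketch of what the language guarantees today for an empty class used as a complete object or as an array element: it has non-zero size, so allocation, addresses, and pointer arithmetic all behave normally. The helper function names are invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

struct Empty {};

static_assert(sizeof(Empty) >= 1, "complete objects have non-zero size");
static_assert(sizeof(Empty[4]) == 4 * sizeof(Empty),
              "array elements are laid out sizeof(T) apart");

// Each array element has its own address.
inline bool array_elements_distinct() {
    Empty arr[4];
    return &arr[0] != &arr[1] && &arr[1] != &arr[2] && &arr[2] != &arr[3];
}

// Two separately allocated empty objects never share an address.
inline bool heap_objects_distinct() {
    std::unique_ptr<Empty> p(new Empty), q(new Empty);
    return p.get() != q.get();
}

// "One past the end" is four elements away from the start, not zero.
inline std::ptrdiff_t span_of_four() {
    Empty arr[4];
    return (arr + 4) - arr;
}
```

With truly zero-sized elements, every address in such an array would coincide and the begin and end pointers would compare equal, which is the situation the questions above are probing.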

You would have to rewrite a lot of the standard before you can permit zero-sized objects.


Could you give me an example?

You're the one proposing it. You're the one claiming that it's easy. The burden of proof here is on you.

Nicol Bolas

Jun 10, 2016, 6:04:35 PM6/10/16
to ISO C++ Standard - Future Proposals
On Friday, June 10, 2016 at 4:11:42 PM UTC-4, Avi Kivity wrote:
On Fri, Jun 10, 2016 at 10:53 PM, Thiago Macieira <thi...@macieira.org> wrote:
On sexta-feira, 10 de junho de 2016 11:13:41 PDT Avi Kivity wrote:
> On Friday, June 10, 2016 at 8:30:20 PM UTC+3, Thiago Macieira wrote:
> > On sexta-feira, 10 de junho de 2016 09:13:09 PDT Avi Kivity wrote:
> >
> > > Note that EBO is actively dangerous.  If you inherit from a class that
> > > defines a virtual member function that matches the signature of one of
> >
> > your
> >
> > > own methods, then you end up overriding it for your EBO'd type.
> >
> > A class with virtuals is not empty.
>
> You don't know that beforehand.

Yes, you do, otherwise you're putting the cart ahead of the ox.

EBO is a solution for the problem of taking up space when you know the class
is empty. If you know it's not empty or you don't know, you don't derive from 
that class.

Then it's very difficult to use EBO.  You have to provide two specializations for the two cases, because in the general case, you know very little about the parameter.

I'm trying to make EBO usable.

Please stop treating the Empty Base Optimization like its sole purpose is to be used by some template type to store some other template type without taking up space in the total aggregate object. That is certainly a viable use for the EBO, but that's not the only reason why we have it.

The Empty Base Optimization is perfectly usable right now. It's just cumbersome for your specific use case.

Arthur O'Dwyer

Jun 10, 2016, 7:14:41 PM6/10/16
to ISO C++ Standard - Future Proposals
On Fri, Jun 10, 2016 at 12:25 PM, Nicol Bolas <jmck...@gmail.com> wrote:
On Friday, June 10, 2016 at 3:07:19 PM UTC-4, Arthur O'Dwyer wrote:
Re your comments elsethread: C doesn't have a lot of these "heavy lifting" problems because it does allow zero-sized objects and it doesn't have a very strong type system the way C++ does. C++'s heavy lifting isn't in the machine-level implementation details; there you're right that the compilers can just "do what C does."  The heavy lifting is in the C++-specific stuff: the high-level semantics, the language. That's the hard part.
To be fair, the high level semantics are the easy part.

We know exactly what behavior we want from these things. [...]

Hi Nicol,
For the record, we're actually in violent agreement here.  When I wrote "high-level", I meant "high-level" as in "high-level language".  I believe everyone agrees 100% on what machine code we want generated for these things; the heavy lifting is all on the philosophical, language-lawyery side. "Dude, what does zero size even mean?", you know?  So by "low-level" I meant "machine code" and by "high-level" I meant "C++ standard".

Whereas by "high-level" you (seem to have) meant "abstract handwavey idea" and by "low-level" you (seem to have) meant "nitty-gritty details of standardese".

I think we're in complete agreement on where the problems are, even if we've used exactly opposite adjectives to describe them. :)

I agree with all of your 5 specific points, especially the one about arrays of zero-sized elements.

–Arthur

Thiago Macieira

Jun 10, 2016, 7:56:11 PM6/10/16
to std-pr...@isocpp.org
On sexta-feira, 10 de junho de 2016 23:11:21 PDT Avi Kivity wrote:
> > Yes, you do, otherwise you're putting the cart ahead of the ox.
> >
> > EBO is a solution for the problem of taking up space when you know the
> > class
> > is empty. If you know it's not empty or you don't know, you don't derive
> > from
> > that class.
>
> Then it's very difficult to use EBO. You have to provide two
> specializations for the two cases, because in the general case, you know
> very little about the parameter.
>
> I'm trying to make EBO usable.

You're again putting the cart ahead of the oxen.

You don't use EBO. Compilers aren't required to have that optimisation.

It just happens that some do and therefore library writers have used that
optimisation to save space. The objective is to save space, not to derive.

Arthur O'Dwyer

Jun 10, 2016, 8:16:30 PM6/10/16
to ISO C++ Standard - Future Proposals
On Fri, Jun 10, 2016 at 12:47 PM, Avi Kivity <a...@scylladb.com> wrote:
On Fri, Jun 10, 2016 at 10:38 PM, Nicol Bolas <jmck...@gmail.com> wrote:
On Friday, June 10, 2016 at 3:21:50 PM UTC-4, Avi Kivity wrote:
On Fri, Jun 10, 2016 at 10:07 PM, Arthur O'Dwyer <arthur....@gmail.com> wrote:

Now you're no longer talking about EBO, though. You're talking about NEBP: the Non-Empty Base Pessimization. Obviously you should never name anything as a base class of yours if you don't know what's in it. Library implementors don't do that; why should you?

They do it all the time, with exactly the example I gave.  A colleague hit it recently with boost's binomial heap.

Huh. I checked with the Boost code and then with Wandbox and you're correct, boost::heap::binomial_heap is incorrectly implemented. (It's also apparently been unmaintained for a while, as it produces a whole spew of warnings when compiled with Clang.)  binomial_heap exposes (makes user-visible) private member functions such as allocate() and construct() which aren't supposed to exist in its interface; basically Boost is claiming that a heap is-an allocator, which is nonsense as far as I'm concerned.

This may be defensible on the grounds of "everybody used to do it this way," or it may be indefensible, I'm not old enough to judge. :)  But I do think that these days it's not the right way to do it.
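The hazard raised earlier in the thread, a template inheriting from a parameter whose virtual function happens to match one of its own member functions, can be sketched as follows. `LoggingPolicy`, `naive_container`, and `safer_container` are invented for illustration and are not taken from Boost.

```cpp
#include <cassert>

struct LoggingPolicy {
    virtual ~LoggingPolicy() = default;
    virtual int size() const { return 0; }       // the policy's own notion of size
    int describe() const { return size(); }      // dispatches virtually
};

// Inherits from its parameter to save space, without knowing what is in it.
template <class Policy>
struct naive_container : Policy {
    int size() const { return 3; }   // meant as the container's element count,
                                     // but silently overrides Policy::size()
};

// Holds the policy as a member instead: has-a rather than is-a.
template <class Policy>
struct safer_container {
    Policy policy;
    int size() const { return 3; }   // unrelated to Policy::size()
};

inline int describe_through_base() {
    naive_container<LoggingPolicy> c;
    const LoggingPolicy& p = c;
    return p.describe();             // reaches naive_container::size()
}

inline int describe_through_member() {
    safer_container<LoggingPolicy> c;
    return c.policy.describe();      // reaches LoggingPolicy::size()
}
```

The inheriting version changes the policy's behavior without any `virtual` or `override` keyword appearing in the container, while the member version leaves it alone.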

 

Are you suggesting you should never inherit from a template parameter?

As he clearly said, you shouldn't inherit from something you don't know what it is.

Well, then how can you apply EBO in a library? Say, std::unordered_set<Key, Hash, ...>, where Hash may be, and often is, empty.

The way I said (and provided code for).  You pick one of your members to combine in a tuple with that possibly-empty member. Thanks to Howard Hinnant's reply in this thread, I gather that on good implementations (which I'm just going to blithely assume means "all popular implementations" ;)) you can do it via

    template<class Key, class Hash, class EqualityComparator>
    class unordered_set {
        using bucket = std::list<Key>;
        std::tuple<Hash, EqualityComparator, std::vector<bucket>> m;
        auto&& hash() { return std::get<0>(m); }
        auto&& cmp() { return std::get<1>(m); }
        auto&& buckets() { return std::get<2>(m); }
    };

unordered_set is-not-a hash, and is-not-a comparator; but it does have-a hash and have-a comparator. We write our code to express that relationship, and then we get correct and efficient code basically for free.

To the extent that std::tuple doesn't give us efficient code (e.g. if it orders the members wrong and thus wastes a lot of space on padding), vendors can go fix that; that's easy. Or if users are demanding a way to implement tuple-like types (not identical to std::tuple) without so much metaprogramming, then that sounds like it might produce some kind of change. But it sounds basically like you're asking for zero-sized objects in C++ (which is unlikely to happen), and your particular use-case is already solved by std::tuple (modulo possible Quality of Implementation issues with some vendors), so there's not a lot of motivation to do anything about it.

–Arthur

Avi Kivity

Jun 11, 2016, 6:15:28 AM6/11/16
to std-pr...@isocpp.org
On Sat, Jun 11, 2016 at 12:56 AM, Nicol Bolas <jmck...@gmail.com> wrote:
On Friday, June 10, 2016 at 3:48:02 PM UTC-4, Avi Kivity wrote:
On Fri, Jun 10, 2016 at 10:38 PM, Nicol Bolas <jmck...@gmail.com> wrote:
Re your comments elsethread: C doesn't have a lot of these "heavy lifting" problems because it does allow zero-sized objects and it doesn't have a very strong type system the way C++ does. C++'s heavy lifting isn't in the machine-level implementation details; there you're right that the compilers can just "do what C does."  The heavy lifting is in the C++-specific stuff: the high-level semantics, the language. That's the hard part.

Then it's totally unreasonable for a compiler newbie like myself to try and figure them out.  But can you explain where in the high level semantics a zero sized data member enters at all?

... Everywhere? Is that a place?

C++ defines an object, first and foremost, as "a region of storage". The entire C++ object model relies on that. A zero-sized object is anathema to that definition.


Would not the zero-sized object occupy a zero-sized region of storage?

And what exactly is "a zero-sized region of storage"? You cannot allocate or deallocate nothing. Even `malloc` doesn't work reasonably with zero. You may or may not get back a NULL pointer, but whatever you get back, you aren't allowed to dereference it.



That never comes into question.  The zero-sized region is always part of a larger non-zero-sized region.

Zero-sized regions seem to exist just fine in C++, when inheriting from an empty base class, and in C, with empty struct members.  Both C++ and C seem to have solved this problem.

 
What does it mean to have the address of, or a reference to, nothing?


The same thing as

  struct B {};
  struct D : B { int x; };
  D d;
  B* ptr_to_nothing = &d;  // actually points at d.x
  B& ref_to_nothing = d;

 
What does it mean to perform pointer arithmetic on a pointer to nothing?


 B* arithmetic = ptr_to_nothing + 1;
 
Can you have an array of nothing?

No.  The attribute or whatever syntax we choose applies to structs, not arrays.

 

You would have to rewrite a lot of the standard before you can permit zero-sized objects.


Could you give me an example?

You're the one proposing it. You're the one claiming that it's easy. The burden of proof here is on you.


I'm perfectly willing to shoulder the burden of proof, but since you claimed the problem is Everywhere, I thought it would be trivial for you to help me start out.

I am unable to even see the problem.  Perhaps I'm not suited to the task, or perhaps that's because the compiler writers have already figured it out for base classes and C data members.
 


Avi Kivity

Jun 11, 2016, 6:16:30 AM6/11/16
to std-pr...@isocpp.org
On Sat, Jun 11, 2016 at 2:56 AM, Thiago Macieira <thi...@macieira.org> wrote:
On sexta-feira, 10 de junho de 2016 23:11:21 PDT Avi Kivity wrote:
> > Yes, you do, otherwise you're putting the cart ahead of the ox.
> >
> > EBO is a solution for the problem of taking up space when you know the
> > class
> > is empty. If you know it's not empty or you don't know, you don't derive
> > from
> > that class.
>
> Then it's very difficult to use EBO.  You have to provide two
> specializations for the two cases, because in the general case, you know
> very little about the parameter.
>
> I'm trying to make EBO usable.

You're again putting the cart ahead of the oxen.

You don't use EBO. Compilers aren't required to have that optimisation.

It just happens that some do and therefore library writers have used that
optimisation to save space. The objective is to save space, not to derive.



You are right, I worded it poorly.  My proposal is an alternative to EBO that makes EBO unnecessary.

 
--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel Open Source Technology Center


Avi Kivity

Jun 11, 2016, 6:24:29 AM6/11/16
to std-pr...@isocpp.org
On Sat, Jun 11, 2016 at 3:16 AM, Arthur O'Dwyer <arthur....@gmail.com> wrote:
On Fri, Jun 10, 2016 at 12:47 PM, Avi Kivity <a...@scylladb.com> wrote:
On Fri, Jun 10, 2016 at 10:38 PM, Nicol Bolas <jmck...@gmail.com> wrote:
On Friday, June 10, 2016 at 3:21:50 PM UTC-4, Avi Kivity wrote:
On Fri, Jun 10, 2016 at 10:07 PM, Arthur O'Dwyer <arthur....@gmail.com> wrote:

Now you're no longer talking about EBO, though. You're talking about NEBP: the Non-Empty Base Pessimization. Obviously you should never name anything as a base class of yours if you don't know what's in it. Library implementors don't do that; why should you?

They do it all the time, with exactly the example I gave.  A colleague hit it recently with boost's binomial heap.

Huh. I checked with the Boost code and then with Wandbox and you're correct, boost::heap::binomial_heap is incorrectly implemented. (It's also apparently been unmaintained for a while, as it produces a whole spew of warnings when compiled with Clang.)  binomial_heap exposes (makes user-visible) private member functions such as allocate() and construct() which aren't supposed to exist in its interface; basically Boost is claiming that a heap is-an allocator, which is nonsense as far as I'm concerned.

This may be defensible on the grounds of "everybody used to do it this way," or it may be indefensible, I'm not old enough to judge. :)  But I do think that these days it's not the right way to do it.



Of course it has a bug.  My point is that saving that extra word of memory is hard and error-prone.  Should the language force you to decide between verbose and error-prone code, or efficient code?
 
 

Are you suggesting you should never inherit from a template parameter?

As he clearly said, you shouldn't inherit from something you don't know what it is.

Well, then how can you apply EBO in a library? Say, std::unordered_set<Key, Hash, ...>, where Hash may be, and often is, empty.

The way I said (and provided code for).  You pick one of your members to combine in a tuple with that possibly-empty member. Thanks to Howard Hinnant's reply in this thread, I gather that on good implementations (which I'm just going to blithely assume means "all popular implementations" ;)) you can do it via

    template<class Key, class Hash, class EqualityComparator>
    class unordered_set {
        using bucket = std::list<Key>;
        std::tuple<Hash, EqualityComparator, std::vector<bucket>> m;
        auto&& hash() { return std::get<0>(m); }
        auto&& cmp() { return std::get<1>(m); }
        auto&& buckets() { return std::get<2>(m); }
    };

unordered_set is-not-a hash, and is-not-a comparator; but it does have-a hash and have-a comparator. We write our code to express that relationship, and then we get correct and efficient code basically for free.



I agree this is much better than most EBO uses I've seen.  But (a) non-EBOing std::tuple implementations exist, and are in widespread use; (b) you're still writing boilerplate (and you omitted const accessors); (c) this only works if you know for sure there's one member which is non-empty; and (d) it forces an initialization order on your members.

 
To the extent that std::tuple doesn't give us efficient code (e.g. if it orders the members wrong and thus wastes a lot of space on padding), vendors can go fix that; that's easy.

Not if they want to maintain ABI compatibility, which they do.
 
Or if users are demanding a way to implement tuple-like types (not identical to std::tuple) without so much metaprogramming, then that sounds like it might produce some kind of change. But it sounds basically like you're asking for zero-sized objects in C++ (which is unlikely to happen), and your particular use-case is already solved by std::tuple (modulo possible Quality of Implementation issues with some vendors), so there's not a lot of motivation to do anything about it.


You mean, if it can be theoretically made to work, but is completely impractical, we can consider the problem solved?

 
–Arthur


Thiago Macieira

Jun 11, 2016, 11:38:55 AM6/11/16
to std-pr...@isocpp.org
On sábado, 11 de junho de 2016 13:15:06 PDT Avi Kivity wrote:
> The same thing as
>
> struct B {};
> struct D : B { int x; };
> D d;
> B* ptr_to_nothing = &d; // actually points at d.x
> B& ref_to_nothing = d;
>
> > What does it mean to perform pointer arithmetic on a pointer to nothing?
>
> B* arithmetic = ptr_to_nothing + 1;

And where does this point to?

Also, what's "one past the last element" for zero-sized elements? The same
pointer, or different?

Nicol Bolas

Jun 11, 2016, 11:51:15 AM6/11/16
to ISO C++ Standard - Future Proposals
On Saturday, June 11, 2016 at 6:15:28 AM UTC-4, Avi Kivity wrote:
On Sat, Jun 11, 2016 at 12:56 AM, Nicol Bolas <jmck...@gmail.com> wrote:
On Friday, June 10, 2016 at 3:48:02 PM UTC-4, Avi Kivity wrote:
On Fri, Jun 10, 2016 at 10:38 PM, Nicol Bolas <jmck...@gmail.com> wrote:
Re your comments elsethread: C doesn't have a lot of these "heavy lifting" problems because it does allow zero-sized objects and it doesn't have a very strong type system the way C++ does. C++'s heavy lifting isn't in the machine-level implementation details; there you're right that the compilers can just "do what C does."  The heavy lifting is in the C++-specific stuff: the high-level semantics, the language. That's the hard part.

Then it's totally unreasonable for a compiler newbie like myself to try and figure them out.  But can you explain where in the high level semantics a zero sized data member enters at all?

... Everywhere? Is that a place?

C++ defines an object, first and foremost, as "a region of storage". The entire C++ object model relies on that. A zero-sized object is anathema to that definition.


Would not the zero-sized object occupy a zero-sized region of storage?

And what exactly is "a zero-sized region of storage"? You cannot allocate or deallocate nothing. Even `malloc` doesn't work reasonably with zero. You may or may not get back a NULL pointer, but whatever you get back, you aren't allowed to dereference it.



That never comes into question.  The zero-sized region is always part of a larger non-zero-sized region.

Zero-sized regions seem to exist just fine in C++, when inheriting from an empty base class,

But they're not zero-sized regions of memory. sizeof for the empty object will return non-zero.
 
and in C, with empty struct members.  Both C++ and C seem to have solved this problem.

C++ only "solved that problem" by making specific exceptions to certain operations when dealing with base class subobjects. Do you know where you would have to make similar exceptions for member subobjects? Do you know if you can just piggy back off of that language, or would you have to scour the spec for locations where a member subobject that doesn't take up space would be problematic?

What does it mean to have the address of, or a reference to, nothing?

The same thing as

  struct B {};
  struct D : B { int x; };
  D d;
  B* ptr_to_nothing = &d;  // actually points at d.x
  B& ref_to_nothing = d;

But `B` is not zero-sized. Nor is `D::B` zero sized.

The use of `B` as a base class simply does not take up room in the layout of `D`. That's a far cry from saying that `B` is zero-sized.

The other thing you're not getting is this.

This is perfectly legal:

  B q;
  B r;
  memcpy(&q, &r, sizeof(B));

Because `B` is non-zero sized, that makes sense. `B` is trivially copyable, so you can copy it via memcpy.

This too is legal.

  struct D { B b; };

  D q;
  D r;
  memcpy(&q.b, &r.b, sizeof(B));

This, however, is not legal:

  struct D : B {...};

  D q;
  D r;
  memcpy((B*)&q, (B*)&r, sizeof(B));

While both B and D are trivially copyable, you are not allowed to trivially copy into a base-class subobject of another object. The standard explicitly forbids this in [basic.types]/2. Why?

Because it would break empty base optimization (among other things).

By standard layout rules, the presence of `B` as a base class of `D` does not disturb `D`'s layout. Therefore, a pointer to the `B` subobject must point to some storage within `D`. And that storage is probably taken up by one of the members of `D`. And since `B` is not zero-sized, copying anything into a base-class subobject can cause problems.
 
What does it mean to perform pointer arithmetic on a pointer to nothing?

 B* arithmetic = ptr_to_nothing + 1;

That's not an answer. What's the relationship between these two pointers? Are they pointing to the same object?

Greg Marr

Jun 11, 2016, 11:53:11 AM6/11/16
to ISO C++ Standard - Future Proposals
On Saturday, June 11, 2016 at 6:15:28 AM UTC-4, Avi Kivity wrote:
On Sat, Jun 11, 2016 at 12:56 AM, Nicol Bolas <jmck...@gmail.com> wrote:
What does it mean to have the address of, or a reference to, nothing?

The same thing as

  struct B {};
  struct D : B { int x; };
  D d;
  B* ptr_to_nothing = &d;  // actually points at d.x
  B& ref_to_nothing = d;

This isn't a pointer to nothing.
In general, B * is a pointer to an object of type B, which has a non-zero size,
or an object of a type derived from B, which has a non-zero size.
 
 
What does it mean to perform pointer arithmetic on a pointer to nothing?

 B* arithmetic = ptr_to_nothing + 1;

You can't do this, because ptr_to_nothing doesn't point to an object or array
of type B or a type derived from B with no NSDMs.

Avi Kivity

Jun 11, 2016, 1:26:38 PM6/11/16
to std-pr...@isocpp.org
On Sat, Jun 11, 2016 at 6:38 PM, Thiago Macieira <thi...@macieira.org> wrote:
On sábado, 11 de junho de 2016 13:15:06 PDT Avi Kivity wrote:
> The same thing as
>
>   struct B {};
>   struct D : B { int x; };
>   D d;
>   B* ptr_to_nothing = &d;  // actually points at d.x
>   B& ref_to_nothing = d;
>
> > What does it mean to perform pointer arithmetic on a pointer to nothing?
>
>  B* arithmetic = ptr_to_nothing + 1;

And where does this point to?


I'm guessing it adds sizeof(B) to the representation of ptr_to_nothing, which is usually 1 for zero-sized classes.

Note that this is nothing new; you can do this with C++14 (and indeed C++98).  If you wish, you can check with your favorite compiler.
 

Also, what's "one past the last element" for zero-sized elements? The same
pointer, or different?

I did not say I propose this for arrays.  For arrays, the behavior can remain unchanged.

 


--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
   Software Architect - Intel Open Source Technology Center


Avi Kivity

Jun 11, 2016, 1:45:20 PM6/11/16
to std-pr...@isocpp.org
On Sat, Jun 11, 2016 at 6:51 PM, Nicol Bolas <jmck...@gmail.com> wrote:
On Saturday, June 11, 2016 at 6:15:28 AM UTC-4, Avi Kivity wrote:
On Sat, Jun 11, 2016 at 12:56 AM, Nicol Bolas <jmck...@gmail.com> wrote:
On Friday, June 10, 2016 at 3:48:02 PM UTC-4, Avi Kivity wrote:
On Fri, Jun 10, 2016 at 10:38 PM, Nicol Bolas <jmck...@gmail.com> wrote:
Re your comments elsethread: C doesn't have a lot of these "heavy lifting" problems because it does allow zero-sized objects and it doesn't have a very strong type system the way C++ does. C++'s heavy lifting isn't in the machine-level implementation details; there you're right that the compilers can just "do what C does."  The heavy lifting is in the C++-specific stuff: the high-level semantics, the language. That's the hard part.

Then it's totally unreasonable for a compiler newbie like myself to try and figure them out.  But can you explain where in the high level semantics a zero sized data member enters at all?

... Everywhere? Is that a place?

C++ defines an object, first and foremost, as "a region of storage". The entire C++ object model relies on that. A zero-sized object is anathema to that definition.


Would not the zero-sized object occupy a zero-sized region of storage?

And what exactly is "a zero-sized region of storage"? You cannot allocate or deallocate nothing. Even `malloc` doesn't work reasonably with zero. You may or may not get back a NULL pointer, but whatever you get back, you aren't allowed to dereference it.



That never comes into question.  The zero-sized region is always part of a larger non-zero-sized region.

Zero-sized regions seem to exist just fine in C++, when inheriting from an empty base class,

But they're not zero-sized regions of memory. sizeof for the empty object will return non-zero.

Let me go back and explain what I want in detail so there are no misunderstandings.

If I have

   struct A {};
   struct B : A {
      int x;
   };

then A takes up no space in B, and the A subobject will have the same address as the B object and as B::x, in most implementations.

What I want is

  struct A {};
  struct C {
      A a /* + some new syntax */;   
      int x;
  };

with the same characteristics: a pointer to C::a can have the same address as a pointer to C, and a pointer to C::x.

The motivation for this is to avoid the need for library writers to jump through hoops with error-prone EBO optimizations, or to attempt to use std::tuple<> which is verbose, and won't work in many of today's major implementations, which cannot be changed due to ABI reasons.
 
Now to answer your question, both sizeof(A) and sizeof(c.a) will be 1, despite both objects taking up no space in either B or C.  So the region of memory occupied by them is zero, despite their sizeof being non-zero. This is existing practice and is not introduced by my proposal.

 
and in C, with empty struct members.  Both C++ and C seem to have solved this problem.

C++ only "solved that problem" by making specific exceptions to certain operations when dealing with base class subobjects. Do you know where you would have to make similar exceptions for member subobjects? Do you know if you can just piggy back off of that language, or would you have to scour the spec for locations where a member subobject that doesn't take up space would be problematic?

I did not exhaustively read the standard looking for those places.  But given that:

1. the problem was solved for base classes
2. the problem was solved, in C, for member objects
3. the problem was solved, in at least gcc, for member objects, by declaring them as arrays of zero size.

it seems to be reasonable that there are no insurmountable difficulties.  There's simply too much prior art to assume it is impossible or even difficult.

 

What does it mean to have the address of, or a reference to, nothing?

The same thing as

  struct B {};
  struct D : B { int x; };
  D d;
  B* ptr_to_nothing = &d;  // actually points at d.x
  B& ref_to_nothing = d;

But `B` is not zero-sized. Nor is `D::B` zero sized.

The use of `B` as a base class simply does not take up room in the layout of `D`. That's a far cry from saying that `B` is zero-sized.


All right.  I am not asking for C::a or A to be zero sized.  I am asking for them not to take up room in C, under the same rules that apply to base classes (including, perhaps, that if C::a is not the last member in C, then it does take up room in C).

 
The other thing you're not getting is this.

This is perfectly legal:

  B q;
  B r;
  memcpy(&q, &r, sizeof(B));

Because `B` is non-zero sized, that makes sense. `B` is trivially copyable, so you can copy it via memcpy.

This too is legal.

  struct D { B b; };

  D q;
  D r;
  memcpy(&q.b, &r.b, sizeof(B));

This, however, is not legal:

  struct D : B {...};

  D q;
  D r;
  memcpy((B*)&q, (B*)&r, sizeof(B));

While both B and D are trivially copyable, you are not allowed to trivially copy into a base-class subobject of another object. The standard explicitly forbids this in [basic.types]/2. Why?

Because it would break empty base optimization (among other things).

By standard layout rules, the presence of `B` as a base class of `D` does not disturb `D`'s layout. Therefore, a pointer to the `B` subobject must point to some storage within `D`. And that storage is probably taken up by one of the members of `D`. And since `B` is not zero-sized, copying anything into a base-class subobject can cause problems.
 

Good.  We can apply the same restriction to members annotated to take up no room in their struct's layout.

 
What does it mean to perform pointer arithmetic on a pointer to nothing?

 B* arithmetic = ptr_to_nothing + 1;

That's not an answer. What's the relationship between these two pointers? Are they pointing to the same object?


This code is not using my proposal, so whatever the answers are, the standard already provides them (it might invoke undefined behavior for all I know).

My point is, these pointers and references to nothing already exist; they just exist in a hard-to-use way. 
 


Thiago Macieira

Jun 11, 2016, 6:53:46 PM6/11/16
to std-pr...@isocpp.org
On sábado, 11 de junho de 2016 20:44:58 PDT Avi Kivity wrote:
> Let me go back and explain what I want in detail so there are no
> misunderstandings.

We understand what you want. But you fail to understand how difficult it is to
get what you want in the standard.

Nicol Bolas

Jun 11, 2016, 7:51:41 PM6/11/16
to ISO C++ Standard - Future Proposals


On Saturday, June 11, 2016 at 1:45:20 PM UTC-4, Avi Kivity wrote:
On Sat, Jun 11, 2016 at 6:51 PM, Nicol Bolas <jmck...@gmail.com> wrote:
On Saturday, June 11, 2016 at 6:15:28 AM UTC-4, Avi Kivity wrote:
On Sat, Jun 11, 2016 at 12:56 AM, Nicol Bolas <jmck...@gmail.com> wrote:
On Friday, June 10, 2016 at 3:48:02 PM UTC-4, Avi Kivity wrote:
On Fri, Jun 10, 2016 at 10:38 PM, Nicol Bolas <jmck...@gmail.com> wrote:
Re your comments elsethread: C doesn't have a lot of these "heavy lifting" problems because it does allow zero-sized objects and it doesn't have a very strong type system the way C++ does. C++'s heavy lifting isn't in the machine-level implementation details; there you're right that the compilers can just "do what C does."  The heavy lifting is in the C++-specific stuff: the high-level semantics, the language. That's the hard part.

Then it's totally unreasonable for a compiler newbie like myself to try and figure them out.  But can you explain where in the high level semantics a zero sized data member enters at all?

... Everywhere? Is that a place?

C++ defines an object, first and foremost, as "a region of storage". The entire C++ object model relies on that. A zero-sized object is anathema to that definition.


Would not the zero-sized object occupy a zero-sized region of storage?

And what exactly is "a zero-sized region of storage"? You cannot allocate or deallocate nothing. Even `malloc` doesn't work reasonably with zero. You may or may not get back a NULL pointer, but whatever you get back, you aren't allowed to dereference it.



That never comes into question.  The zero-sized region is always part of a larger non-zero-sized region.

Zero-sized regions seem to exist just fine in C++, when inheriting from an empty base class,

But they're not zero-sized regions of memory. sizeof for the empty object will return non-zero.

Let me go back and explain what I want in detail so there are no misunderstandings.

...

 
Now to answer your question, both sizeof(A) and sizeof(c.a) will be 1, despite both objects taking up no space in either B or C.  So the region of memory occupied by them is zero, despite their sizeof being non-zero. This is existing practice and is not introduced by my proposal.

If that's your idea, then why do you keep bringing up zero-sized arrays and zero-sized types? That is, you keep saying that C allows for empty members, but it does so with a completely different mechanism: by allowing types to have zero size.

So don't bring up empty types in C unless that's actually what you want.

and in C, with empty struct members.  Both C++ and C seem to have solved this problem.

C++ only "solved that problem" by making specific exceptions to certain operations when dealing with base class subobjects. Do you know where you would have to make similar exceptions for member subobjects? Do you know if you can just piggy back off of that language, or would you have to scour the spec for locations where a member subobject that doesn't take up space would be problematic?

I did not exhaustively read the standard looking for those places.  But given that:

1. the problem was solved for base classes
2. the problem was solved, in C, for member objects
3. the problem was solved, in at least gcc, for member objects, by declaring them as arrays of zero size.

it seems to be reasonable that there are no insurmountable difficulties.  There's simply too much prior art to assume it is impossible or even difficult.

As previously stated, #2 and #3 are non-sequiturs for this conversation. That only leaves #1. And you also clearly state that you haven't looked for such places. And therefore, you don't know if the solutions for base classes will work for members. You also don't know if you would need to make other adjustments that the base class solution didn't need.

So the basis for your claim is flimsy.
 
The other thing you're not getting is this.

This is perfectly legal:

  B q;
  B r;
  memcpy(&q, &r, sizeof(B));

Because `B` is non-zero sized, that makes sense. `B` is trivially copyable, so you can copy it via memcpy.

This too is legal.

  struct D { B b; };

  D q;
  D r;
  memcpy(&q.b, &r.b, sizeof(B));

This, however, is not legal:

  struct D : B {...};

  D q;
  D r;
  memcpy((B*)&q, (B*)&r, sizeof(B));

While both B and D are trivially copyable, you are not allowed to trivially copy into a base-class subobject of another object. The standard explicitly forbids this in [basic.types]/2. Why?

Because it would break empty base optimization (among other things).

By standard layout rules, the presence of `B` as a base class of `D` does not disturb `D`'s layout. Therefore, a pointer to the `B` subobject must point to some storage within `D`. And that storage is probably taken up by one of the members of `D`. And since `B` is not zero-sized, copying anything into a base-class subobject can cause problems.

Good.  We can apply the same restriction to members annotated to take up no room in their struct's layout.

OK, the point of my spiel here was to explain to you just how little you understand the ramifications of what you ask. Until I brought that up, you had no idea that this copying thing was even an issue with your idea. Which proves that you simply do not know much about the standardization issues of permitting empty members.

And that's just one thing. How many others are there? You certainly don't know.

Yet you continue to claim that it will not be difficult. Why should we believe you about how difficult this is, when you have repeatedly displayed your ignorance on the complexities of such a feature?

Not to mention, when asked to demonstrate how "not difficult" it would be by actually implementing it, you balked and claimed that this request was somehow unfair.

Anyone can say "go do this; it should be easy." It's far easier to do that than to actually implement it or learn about the particulars of the spec so that you can get the feature's wording right.

Avi Kivity

Jun 12, 2016, 3:20:57 AM6/12/16
to std-pr...@isocpp.org
On Sun, Jun 12, 2016 at 1:53 AM, Thiago Macieira <thi...@macieira.org> wrote:
On sábado, 11 de junho de 2016 20:44:58 PDT Avi Kivity wrote:
> Let me go back and explain what I want in detail so there are no
> misunderstandings.

We understand what you want. But you fail to understand how difficult it is to
get what you want in the standard.



You are bringing up objections, and I am explaining why those objections are incorrect, but I don't see any response to that.

Avi Kivity

Jun 12, 2016, 3:37:07 AM6/12/16
to std-pr...@isocpp.org
On Sun, Jun 12, 2016 at 2:51 AM, Nicol Bolas <jmck...@gmail.com> wrote:


On Saturday, June 11, 2016 at 1:45:20 PM UTC-4, Avi Kivity wrote:


On Sat, Jun 11, 2016 at 6:51 PM, Nicol Bolas <jmck...@gmail.com> wrote:
On Saturday, June 11, 2016 at 6:15:28 AM UTC-4, Avi Kivity wrote:
On Sat, Jun 11, 2016 at 12:56 AM, Nicol Bolas <jmck...@gmail.com> wrote:
On Friday, June 10, 2016 at 3:48:02 PM UTC-4, Avi Kivity wrote:
On Fri, Jun 10, 2016 at 10:38 PM, Nicol Bolas <jmck...@gmail.com> wrote:
Re your comments elsethread: C doesn't have a lot of these "heavy lifting" problems because it does allow zero-sized objects and it doesn't have a very strong type system the way C++ does. C++'s heavy lifting isn't in the machine-level implementation details; there you're right that the compilers can just "do what C does."  The heavy lifting is in the C++-specific stuff: the high-level semantics, the language. That's the hard part.

Then it's totally unreasonable for a compiler newbie like myself to try and figure them out.  But can you explain where in the high level semantics a zero sized data member enters at all?

... Everywhere? Is that a place?

C++ defines an object, first and foremost, as "a region of storage". The entire C++ object model relies on that. A zero-sized object is anathema to that definition.


Would not the zero-sized object occupy a zero-sized region of storage?

And what exactly is "a zero-sized region of storage"? You cannot allocate or deallocate nothing. Even `malloc` doesn't work reasonably with zero. You may or may not get back a NULL pointer, but whatever you get back, you aren't allowed to dereference it.



That never comes into question.  The zero-sized region is always part of a larger non-zero-sized region.

Zero-sized regions seem to exist just fine in C++, when inheriting from an empty base class,

But they're not zero-sized regions of memory. sizeof for the empty object will return non-zero.

Let me go back and explain what I want in detail so there are no misunderstandings.

...
 
Now to answer your question, both sizeof(A) and sizeof(c.a) will be 1, despite both objects taking up no space in either B or C.  So the region of memory occupied by them is zero, despite their sizeof being non-zero. This is existing practice and is not introduced by my proposal.

If that's your idea, then why do you keep bringing up zero-sized arrays and zero-sized types? That is, you keep saying that C allows for empty members, but it does so with a completely different mechanism: by allowing types to have zero size.


No.  I am proposing to use the same mechanism that C++ uses for empty bases (which are not zero sized), and that C uses for empty members (which in C also happen to be zero sized), and that gcc uses for array members of size zero (which are zero sized).
 

So don't bring up empty types in C unless that's actually what you want.

I do want empty types, but they are not zero sized.

struct A {};
sizeof(A) == 1.

 

and in C, with empty struct members.  Both C++ and C seem to have solved this problem.

C++ only "solved that problem" by making specific exceptions to certain operations when dealing with base class subobjects. Do you know where you would have to make similar exceptions for member subobjects? Do you know if you can just piggy back off of that language, or would you have to scour the spec for locations where a member subobject that doesn't take up space would be problematic?

I did not exhaustively read the standard looking for those places.  But given that:

1. the problem was solved for base classes
2. the problem was solved, in C, for member objects
3. the problem was solved, in at least gcc, for member objects, by declaring them as arrays of zero size.

it seems to be reasonable that there are no insurmountable difficulties.  There's simply too much prior art to assume it is impossible or even difficult.

As previously stated, #2 and #3 are non-sequiturs for this conversation.

They are not non sequiturs.  They are similar to my case but not exact.  If only things that were in the standard were allowed to be standardized, we'd never get anywhere.

 
That only leaves #1. And you also clearly state that you haven't looked for such places. And therefore, you don't know if the solutions for base classes will work for members. You also don't know if you would need to make other adjustments that the base class solution didn't need.

I don't know, that is why I am asking the collective wisdom of this list.  And the answers I'm getting are "it's everywhere" and "go look yourself" which are just symptoms of automatic rejection, not of anyone knowing any actual objection.
 

So the basis for your claim is flimsy.

We have seen one issue with memcpy() of empty base classes explicitly disallowed.  It could be extended to empty members that take up no space (since they are distinguished by some syntax or other).  Are there any other issues, or are we left with a vague everywhere?

 
 
The other thing you're not getting is this.

This is perfectly legal:

  B q;
  B r;
  memcpy(&q, &r, sizeof(B));

Because `B` is non-zero sized, that makes sense. `B` is trivially copyable, so you can copy it via memcpy.

This too is legal.

  struct D { B b; };

  D q;
  D r;
  memcpy(&q.b, &r.b, sizeof(B));

This, however, is not legal:

  struct D : B {...};

  D q;
  D r;
  memcpy((B*)&q, (B*)&r, sizeof(B));

While both B and D are trivially copyable, you are not allowed to trivially copy into a base-class subobject of another object. The standard explicitly forbids this in [basic.types]/2. Why?

Because it would break empty base optimization (among other things).

By standard layout rules, the presence of `B` as a base class of `D` does not disturb `D`'s layout. Therefore, a pointer to the `B` subobject must point to some storage within `D`. And that storage is probably taken up by one of the members of `D`. And since `B` is not zero-sized, copying anything into a base-class subobject can cause problems.

Good.  We can apply the same restriction to members annotated to take up no room in their struct's layout.

OK, the point of my spiel here was to explain to you just how little you understand the ramifications of what you ask. Until I brought that up, you had no idea that this copying thing was even an issue with your idea. Which proves that you simply do not know much about the standardization issues of permitting empty members.



I'm not pretending to be an expert on standardization. I do happen to be an expert C++ programmer and I know the feature will be very useful to library writers.

 
And that's just one thing. How many others are there? You certainly don't know.

Yet you continue to claim that it will not be difficult. Why should we believe you about how difficult this is, when you have repeatedly displayed your ignorance on the complexities of such a feature?


Looks like we're in an infinite loop.  Yes, getting the language-lawyering right will require an expert in the standard (which I'm not). No, you can't convince me this is difficult to implement, since the implementation already exists (with some variations), or that it would be difficult for an expert to word (since new syntax is introduced, we can remove any guarantees that are given without the syntax).
 
Not to mention, when asked to demonstrate how "not difficult" it would be by actually implementing it, you balked and claimed that this request was somehow unfair.

Anyone can say "go do this; it should be easy." It's far easier to do that than to actually implement it or learn about the particulars of the spec so that you can get the feature's wording right.

What I am actually saying is that it would be easier for an expert in the standard to word it than for me, and easier for an expert in the compiler to implement it than for me. I'm not going to invest a week of my time to prove this feature; I want it, but not that much. I thought that, with the focus on making C++ easier to learn, this simplification would be welcome.  As it is now, reading a standard library implementation is very difficult.  My proposal won't make it easy, but it will make it less difficult.
 

Jonathan

Jun 12, 2016, 5:37:24 AM6/12/16
to std-pr...@isocpp.org
This seems like an interesting idea to me. 

Is anyone who could implement this interested enough to have a go?
I'm afraid I'm personally in the 'interested but unable' category.


At a guess, an implementation (or firm conclusion that implementation is impossible) might make some of the required wording changes more straightforward to understand.

Regards,

Jon

Thiago Macieira

Jun 12, 2016, 6:24:47 PM6/12/16
to std-pr...@isocpp.org
I'm not bringing up objections. I'm giving advice: what you want is actually
more difficult to explain than what you expect.

Just try and write the necessary parts of the standard and/or modify the
compilers to do what you're asking. Especially a compiler that doesn't
implement C99.

Nicol Bolas

Jun 12, 2016, 11:29:49 PM6/12/16
to ISO C++ Standard - Future Proposals
On Sunday, June 12, 2016 at 3:37:07 AM UTC-4, Avi Kivity wrote:
On Sun, Jun 12, 2016 at 2:51 AM, Nicol Bolas <jmck...@gmail.com> wrote:
On Saturday, June 11, 2016 at 1:45:20 PM UTC-4, Avi Kivity wrote:
Now to answer your question, both sizeof(A) and sizeof(c.a) will be 1, despite both objects taking up no space in either B or C.  So the region of memory occupied by them is zero, despite their sizeof being non-zero. This is existing practice and is not introduced by my proposal.

If that's your idea, then why do you keep bringing up zero-sized arrays and zero-sized types? That is, you keep saying that C allows for empty members, but it does so with a completely different mechanism: by allowing types to have zero size.

No.  I am proposing to use the same mechanism that C++ uses for empty bases (which are not zero sized), and that C uses for empty members (which in C also happen to be zero sized), and that gcc uses for array members of size zero (which are zero sized).

Do you not understand the fact that those are 3 different mechanisms? Just because they have the same result does not mean that they achieve that result in the same way.

Mechanisms are about "how" something gets done, not "what" is being done.

and in C, with empty struct members.  Both C++ and C seem to have solved this problem.

C++ only "solved that problem" by making specific exceptions to certain operations when dealing with base class subobjects. Do you know where you would have to make similar exceptions for member subobjects? Do you know if you can just piggy back off of that language, or would you have to scour the spec for locations where a member subobject that doesn't take up space would be problematic?

I did not exhaustively read the standard looking for those places.  But given that:

1. the problem was solved for base classes
2. the problem was solved, in C, for member objects
3. the problem was solved, in at least gcc, for member objects, by declaring them as arrays of zero size.

it seems to be reasonable that there are no insurmountable difficulties.  There's simply too much prior art to assume it is impossible or even difficult.

As previously stated, #2 and #3 are non-sequiturs for this conversation.

They are not non sequiturs.  They are similar to my case but not exact.  If only things that were in the standard were allowed to be standardized, we'd never get anywhere.

They're non-sequiturs because you clearly said, "I do want empty types, but they are not zero sized." If you don't want zero-sized types, then bringing up a feature involving zero-sized types is irrelevant since that is explicitly not what you're asking for.

It's like saying that rockets can carry people into space because you've seen airplanes fly. Even if your conclusion is right, your reasoning is flawed. Airplanes and rockets use very different mechanisms to achieve flight. Just as zero-sized types and base classes are very different mechanisms for achieving things.
 
That only leaves #1. And you also clearly state that you haven't looked for such places. And therefore, you don't know if the solutions for base classes will work for members. You also don't know if you would need to make other adjustments that the base class solution didn't need.

I don't know, that is why I am asking the collective wisdom of this list.  And the answers I'm getting are "it's everywhere" and "go look yourself" which are just symptoms of automatic rejection, not of anyone knowing any actual objection.

It's very important to understand how getting features standardized works. You seem to be under the impression that there are legions of spec editors just sitting around idle, champing at the bit to implement the next good idea someone posts on a forum. And so if you post a really good idea, one of them will do all the hard work for you.

That's not how it works. Pretty much nothing has ever been added to C++ just by doing that.

The way to get something standardized is by actually doing at least some of the work. For a feature like this, one for which careful wording will have to be crafted, you have to actually be able to demonstrate some knowledge of what the corner cases are. If there is some reasonable doubt about it being able to be implemented, you have to be able to show that it is implementable (and no, bringing up irrelevant other cases is not good enough). And so forth.

Ideas are worth nothing; it's effort that gets things done. That's not "automatic rejection"; that's reality.

FrankHB1989

Jun 14, 2016, 3:32:55 AM
to ISO C++ Standard - Future Proposals
What you actually need is not EBO. You need a new kind of empty base that is not an object type. Such a type can have zero size, much like a reference, whose size is explicitly unspecified. Then there is no problem with memory layout, ABI compatibility, pointer arithmetic, etc. (However, it may be more problematic to standardize.)


FrankHB1989

Jun 14, 2016, 3:41:27 AM
to ISO C++ Standard - Future Proposals


On Saturday, June 11, 2016 at 3:38:05 AM UTC+8, Nicol Bolas wrote:
On Friday, June 10, 2016 at 3:21:50 PM UTC-4, Avi Kivity wrote:
On Fri, Jun 10, 2016 at 10:07 PM, Arthur O'Dwyer <arthur....@gmail.com> wrote:
On Fri, Jun 10, 2016 at 11:13 AM, Avi Kivity <a...@scylladb.com> wrote:
On Friday, June 10, 2016 at 8:30:20 PM UTC+3, Thiago Macieira wrote:
On Friday, June 10, 2016 09:13:09 PDT Avi Kivity wrote:
> Note that EBO is actively dangerous.  If you inherit from a class that
> defines a virtual member function that matches the signature of one of your
> own methods, then you end up overriding it for your EBO'd type.

A class with virtuals is not empty.

You don't know that beforehand.

template <class PossiblyEmptyComparator>
struct my_container;

Should my_container inherit from PossiblyEmptyComparator, or should it contain it as a data member?

What if PossiblyEmptyComparator is a function pointer type?
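A minimal sketch of the hazard described above, assuming a hypothetical comparator (`TracingLess`) that happens to declare a virtual `size()` and a container that inherits from its template parameter unconditionally. None of these names come from boost or from any real library.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical comparator that happens to declare a virtual size().
struct TracingLess {
    virtual ~TracingLess() = default;
    virtual std::size_t size() const { return 0; }
    bool operator()(int a, int b) const { return a < b; }
    // Calls size() through virtual dispatch.
    std::size_t reported_size() const { return size(); }
};

// A class with virtual functions is not empty, so there is nothing for
// EBO to save here in the first place.
static_assert(!std::is_empty<TracingLess>::value, "virtuals make it non-empty");

// Naive container that inherits from whatever comparator it is given.
template <class Comparator>
struct naive_container : Comparator {
    std::size_t count = 3;
    // Same signature as TracingLess::size(), so this silently overrides it.
    std::size_t size() const { return count; }
};

// What the comparator's own code observes once it is used as a base.
inline std::size_t size_seen_by_comparator() {
    naive_container<TracingLess> c;
    return c.reported_size();
}
```

The function-pointer case fails differently: `naive_container<bool (*)(int, int)>` does not compile at all, because a pointer type cannot be a base class.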
Now you're no longer talking about EBO, though. You're talking about NEBP: the Non-Empty Base Pessimization. Obviously you should never name anything as a base class of yours if you don't know what's in it. Library implementors don't do that; why should you?


They do it all the time, with exactly the example I gave.  A colleague hit it recently with boost's binomial heap.

Are you suggesting you should never inherit from a template parameter?

As he clearly said, you shouldn't inherit from something when you don't know what it is.
 
Because then template classes can choose from:

1. Inheriting from the base class and hitting weird problems
2. Using complex enable_if style solutions
3. Eliding the optimization altogether.

This could be so easily solved with 

4. Adding [[allow_empty_size]] attribute to the data member.
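To make option 2 concrete, here is a rough sketch of the kind of trait-driven wrapper it refers to. `maybe_empty`, `my_container::less` and `fn_less` are illustrative names, not an existing library API; `std::is_final` requires C++14; and the `sizeof` result for the empty case is what mainstream compilers give, not a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <type_traits>

// General case: T is non-empty, final, or not a class at all
// (for example a function pointer), so store it as a member.
template <class T,
          bool UseBase = std::is_empty<T>::value && !std::is_final<T>::value>
class maybe_empty {
    T value_;
public:
    maybe_empty() : value_() {}
    explicit maybe_empty(const T& v) : value_(v) {}
    T& get() { return value_; }
    const T& get() const { return value_; }
};

// Empty, non-final class: inherit privately so EBO can apply.
template <class T>
class maybe_empty<T, true> : private T {
public:
    maybe_empty() : T() {}
    explicit maybe_empty(const T& v) : T(v) {}
    T& get() { return *this; }
    const T& get() const { return *this; }
};

template <class Comparator>
class my_container : private maybe_empty<Comparator> {
    std::size_t size_ = 0;
public:
    my_container() = default;
    explicit my_container(const Comparator& c) : maybe_empty<Comparator>(c) {}
    std::size_t size() const { return size_; }
    bool less(int a, int b) const { return this->get()(a, b); }
};

// Plain function used to exercise the function-pointer case.
inline bool fn_less(int a, int b) { return a < b; }
```

Because a class with virtual functions is never empty, this wrapper stores such a comparator as a member and so avoids the accidental-override problem, at the cost of two class templates and a `get()` indirection for what is conceptually one data member.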

Attributes are never allowed to change the behavior of a program. That's the rule with them. Period.
Citation needed. I only remember Herb Sutter claiming that attributes should never be allowed to change the semantics.
Both claims are doubtful in terms of effect, since neither is a requirement in the normative text of the standard. Even the published standards violate these rules. Also note that [dcl.align] is part of [dcl.attr]. Do you think that is a defect?

 
But I guess we must preserve C++'s reputation for making things hard on its users.
 
Re your comments elsethread: C doesn't have a lot of these "heavy lifting" problems because it does allow zero-sized objects and it doesn't have a very strong type system the way C++ does. C++'s heavy lifting isn't in the machine-level implementation details; there you're right that the compilers can just "do what C does."  The heavy lifting is in the C++-specific stuff: the high-level semantics, the language. That's the hard part.

Then it's totally unreasonable for a compiler newbie like myself to try and figure them out.  But can you explain where in the high level semantics a zero sized data member enters at all?

... Everywhere? Is that a place?

C++ defines an object, first and foremost, as "a region of storage". The entire C++ object model relies on that. A zero-sized object is anathema to that definition. You would have to rewrite a lot of the standard before you can permit zero-sized objects.
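A small illustration of what the "region of storage" model implies for empty types, using a made-up `Empty` struct. It only shows behaviour that existing code can rely on today; it is not a claim about how a zero-sized member would have to be specified.

```cpp
#include <cassert>
#include <cstddef>

struct Empty {};

// Even a type with no members occupies at least one byte as a complete object.
static_assert(sizeof(Empty) >= 1, "objects occupy storage");

// Pointer arithmetic over an array of empties only works because each
// element has its own storage.
inline std::ptrdiff_t distance_first_to_last() {
    Empty items[4];
    return &items[3] - &items[0];
}

// Neighbouring elements are distinct objects with distinct addresses.
inline bool neighbours_have_distinct_addresses() {
    Empty items[2];
    return &items[0] != &items[1];
}
```

If `sizeof(Empty)` were zero, both functions would lose their meaning, which is the kind of ripple effect being pointed at.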

Nicol Bolas

Jun 14, 2016, 1:19:56 PM
to ISO C++ Standard - Future Proposals
On Tuesday, June 14, 2016 at 3:41:27 AM UTC-4, FrankHB1989 wrote:
On Saturday, June 11, 2016 at 3:38:05 AM UTC+8, Nicol Bolas wrote:
Attributes are never allowed to change the behavior of a program. That's the rule with them. Period.
Citation needed. I only remember Herb Sutter claiming that attributes should never be allowed to change the semantics.

Semantics, behavior; it's effectively the same thing: could you tell the difference in the program if the attribute weren't present?

If the answer is yes, then it can't be an attribute. And since attributes-as-behavior/semantics is clearly on Herb Sutter's "over my dead body" list, using attributes to change the program's visible behavior is a non-starter.

Both claims are doubtful in terms of effect, since neither is a requirement in the normative text of the standard. Even the published standards violate these rules. Also note that [dcl.align] is part of [dcl.attr]. Do you think that is a defect?

While `alignas` is grammatically an `attribute-specifier`, it is grammatically not an `attribute`. It isn't in an `attribute-list`, so it doesn't go within a [[]] pair. So it doesn't count.
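As a concrete reference point for the `alignas` question, the sketch below shows an alignment-specifier changing something a program can observe. The struct names are invented, and 16-byte alignment is assumed to be supported by the implementation, which holds on mainstream compilers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Plain { char c; };

// alignas sits where an attribute-specifier may appear, but it is spelled
// without the [[ ]] brackets.
struct alignas(16) Padded { char c; };

// Removing alignas would make all three of these fail.
static_assert(alignof(Padded) == 16, "alignas raised the alignment requirement");
static_assert(sizeof(Padded) % 16 == 0, "size is a multiple of the alignment");
static_assert(sizeof(Padded) > sizeof(Plain), "padding was added");

// Runtime view of the same fact: a local Padded starts on a 16-byte boundary.
inline bool local_is_16_byte_aligned() {
    Padded p{};
    return reinterpret_cast<std::uintptr_t>(&p) % 16 == 0;
}
```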

FrankHB1989

Jun 15, 2016, 11:47:28 PM
to ISO C++ Standard - Future Proposals


On Wednesday, June 15, 2016 at 1:19:56 AM UTC+8, Nicol Bolas wrote:
On Tuesday, June 14, 2016 at 3:41:27 AM UTC-4, FrankHB1989 wrote:
On Saturday, June 11, 2016 at 3:38:05 AM UTC+8, Nicol Bolas wrote:
Attributes are never allowed to change the behavior of a program. That's the rule with them. Period.
Citation needed. I only remember Herb Sutter claiming that attributes should never be allowed to change the semantics.

Semantics, behavior; it's effectively the same thing: could you tell the difference in the program if the attribute weren't present?

If the answer is yes, then it can't be an attribute. And since attributes-as-behavior/semantics is clearly on Herb Sutter's "over my dead body" list, using attributes to change the program's visible behavior is a non-starter.

For example, in a program that uses [[noreturn]] correctly, the attribute's presence should not change the behavior, but it does alter the semantics.

I remember Herb did not like this. Perhaps he should use "behavior" instead.
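A short sketch of the [[noreturn]] case, with hypothetical function names. The intent is to show a program whose observable behaviour stays the same if the attribute is deleted; typically only diagnostics and code generation differ.

```cpp
#include <cassert>
#include <stdexcept>

// Always throws, so control never returns to the caller.
[[noreturn]] inline void fail(const char* message) {
    throw std::invalid_argument(message);
}

inline int require_positive(int x) {
    if (x > 0) {
        return x;
    }
    fail("x must be positive");
    // No return statement is needed: fail() never comes back. Without the
    // attribute a compiler may warn here, but the program behaves the same.
}

// Reports whether require_positive() rejected its argument.
inline bool rejects(int x) {
    try {
        require_positive(x);
    } catch (const std::invalid_argument&) {
        return true;
    }
    return false;
}
```

Had `fail` actually returned, the attribute would make that undefined behaviour, which is the sense in which it carries semantics even though a correct program cannot observe it.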
 
Both claims are doubtful in terms of effect, since neither is a requirement in the normative text of the standard. Even the published standards violate these rules. Also note that [dcl.align] is part of [dcl.attr]. Do you think that is a defect?

While `alignas` is grammatically an `attribute-specifier`, it is grammatically not an `attribute`. It isn't in an `attribute-list`, so it doesn't go within a [[]] pair. So it doesn't count.

While I agree with you on the point of grammar, the term "attribute" seems to be used as more than just a syntactic category.

N4594
7.6 Attributes [dcl.attr]
7.6.1 Attribute syntax and semantics [dcl.attr.grammar]
1 Attributes specify additional information for various source constructs such as types, variables, names, blocks, or translation units.

  attribute-specifier-seq:
      attribute-specifier-seq_opt attribute-specifier
  attribute-specifier:
      [ [ attribute-list ] ]
      alignment-specifier
  alignment-specifier:
      alignas ( type-id ..._opt )
      alignas ( constant-expression ..._opt )
  attribute-list:
      attribute_opt
      attribute-list , attribute_opt
      attribute ...
      attribute-list , attribute ...
  attribute:
      attribute-token attribute-argument-clause_opt
  attribute-token:
      identifier
      attribute-scoped-token
  attribute-scoped-token:
      attribute-namespace :: identifier
  attribute-namespace:
      identifier
  attribute-argument-clause:
      ( balanced-token-seq_opt )
  balanced-token-seq:
      balanced-token
      balanced-token-seq balanced-token
  balanced-token:
      ( balanced-token-seq_opt )
      [ balanced-token-seq_opt ]
      { balanced-token-seq_opt }
      any token other than a parenthesis, a bracket, or a brace

Perhaps `attribute-list` expresses the intent more precisely.
 

Nicol Bolas

Jun 16, 2016, 10:23:29 AM
to ISO C++ Standard - Future Proposals
On Wednesday, June 15, 2016 at 11:47:28 PM UTC-4, FrankHB1989 wrote:
On Wednesday, June 15, 2016 at 1:19:56 AM UTC+8, Nicol Bolas wrote:
On Tuesday, June 14, 2016 at 3:41:27 AM UTC-4, FrankHB1989 wrote:
Both claims are doubtful in terms of effect, since neither is a requirement in the normative text of the standard. Even the published standards violate these rules. Also note that [dcl.align] is part of [dcl.attr]. Do you think that is a defect?

While `alignas` is grammatically an `attribute-specifier`, it is grammatically not an `attribute`. It isn't in an `attribute-list`, so it doesn't go within a [[]] pair. So it doesn't count.

While I agree with you on the point of grammar, the term "attribute" seems to be used as more than just a syntactic category.

If you look at this particular thread of conversation, it is abundantly clear exactly what I was referring to when I said "attribute".

Please stop being pointlessly pedantic and derailing the conversation into irrelevant issues.

FrankHB1989

Jun 16, 2016, 9:31:11 PM
to ISO C++ Standard - Future Proposals


On Thursday, June 16, 2016 at 10:23:29 PM UTC+8, Nicol Bolas wrote:
The derailing started with your claim: "That's the rule with them. Period."

If you still believe it is valid, start a new thread in std-discussion. Period.
