Another alternative to EBO: Unused member optimization and implicit static


Jonathan Müller

Jul 7, 2016, 5:03:52 AM
to std-pr...@isocpp.org
This is a suggestion for a different approach to "solve" EBO.
Responses to "Allow zero size arrays and let them occupy zero bytes" mentioned that the C++ object model cannot have anything that has size 0.
So I got a completely different idea based on the fact that empty types are only needed for their member functions.
  1. (optional, required for current empty types to work) Make member functions that don't access any non-static member variables or functions implicitly static.
    This shouldn't break any existing code because you can call a static member function with the same syntax as a non-static member function, so any code calling a member function can also call a static function without change.
  2. "Encourage" compilers, as with RVO, to remove objects that are only used to call static member functions. Of course, once the actual object is required (e.g. if you take its address), this optimization cannot happen.

    With these changes you can simply write the following generic code:

    struct empty_policy
    {
        /*static*/ void do_sth();
    };

    template <class Policy>
    class generic_type
    {
    public:
        void do_sth_fancy()
        {
            ++value;
            policy.do_sth();
        }

    private:
        int value;
        Policy policy;
    };

    Because policy is only used to call an (implicitly) static member function, it will be optimized out and sizeof(generic_type<empty_policy>) will be sizeof(int).
    If you instantiate it with a non-empty type, however, this is not the case.

    This approach is cumbersome, however: it is very easy to write code that depends on the address of policy.
    I thus also suggest:

  3. A new attribute: maybe_empty (or a completely different name). It informs the compiler that a member could be empty for some type and warns if you write code that requires its existence.
    With this attribute, writing generic code is much easier.

    There is still a problem though: often you provide an accessor function for the policy, and the usual signature is:
    const Policy& get_policy() const;
    But this would break the optimization, because you form a reference to the empty object and thus require its existence in memory.
    You need to resort to std::conditional or a similar solution to return it by value. To ease that I also suggest:

  4. A standard library addition: std::empty_reference<T> (or a completely different name).
    It could be a simple alias for std::conditional_t<std::is_empty_v<T>, std::remove_cv_t<T>, T&> or a fully fledged class if need be.
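
A minimal sketch of the alias from point 4, assuming C++17. The namespace `sketch` is hypothetical (the proposed `std::empty_reference` is not part of the standard library), and `counting_policy` is an illustrative type that does not appear in the proposal.

```cpp
#include <type_traits>

namespace sketch {
// Hypothetical alias: by value for empty types, a reference otherwise.
template <class T>
using empty_reference =
    std::conditional_t<std::is_empty_v<T>, std::remove_cv_t<T>, T&>;
}

struct empty_policy { void do_sth() const {} };
struct counting_policy { int calls = 0; void do_sth() { ++calls; } };

// An empty policy is returned by value, so no reference to it is formed.
static_assert(std::is_same_v<sketch::empty_reference<const empty_policy>,
                             empty_policy>);
// A policy with state is still returned by (const) reference.
static_assert(std::is_same_v<sketch::empty_reference<const counting_policy>,
                             const counting_policy&>);
```

With this alias, an accessor declared as `sketch::empty_reference<const Policy> get_policy() const` would copy an empty policy and bind a reference to a non-empty one.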

With all those changes usage would look like so:

struct empty_policy
{
    /*static*/ void do_sth();
};

template <class Policy>
class generic_type
{
public:
    void do_sth_fancy()
    {
        ++value;
        policy.do_sth();
    }

    std::empty_reference<const Policy> get_policy() const
    {
        return policy;
    }

private:
    int value;
    [[maybe_empty]] Policy policy;
};


This is just a rough sketch, and there might be issues (like the copy/move constructors and assignment operators for empty types and their ability to affect the optimization), but what do you think?

Jonathan Müller

Nicol Bolas

Jul 7, 2016, 10:22:46 AM
to ISO C++ Standard - Future Proposals
On Thursday, July 7, 2016 at 5:03:52 AM UTC-4, Jonathan Müller wrote:
> This is a suggestion for a different approach to "solve" EBO.

Why do people keep talking about this in terms of EBO, like that's the only reason why you would ever want a member to take up no space?

> Responses to "Allow zero size arrays and let them occupy zero bytes" mentioned that the C++ object model cannot have anything that has size 0.
> So I got a completely different idea based on the fact that empty types are only needed for their member functions.
>   1. (optional, required for current empty types to work) Make member functions that don't access any non-static member variables or functions implicitly static.
So, a member function which calls a sibling member function that doesn't access NSDMs wouldn't be allowed? That's a pretty harsh requirement. And a very unnecessary one.

And what if the type has its own member which is empty by this definition? Are you saying that nested empty members don't work?

struct Empty1 { void DoSomething() {} };
struct Empty2 { Empty1 e; void DoSomething() { e.DoSomething(); } };

Why should `Empty2` not be just as empty as `Empty1`?
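
For reference, here is how the current language classifies these two types. This is only a sketch of today's behaviour, not of the proposal: `std::is_empty` looks at non-static data members, so a member of empty type already makes the enclosing class non-empty.

```cpp
#include <type_traits>

struct Empty1 { void DoSomething() {} };
struct Empty2 { Empty1 e; void DoSomething() { e.DoSomething(); } };

// Empty1 has no non-static data members, so the trait reports it as empty.
static_assert(std::is_empty<Empty1>::value, "Empty1 is empty");
// Empty2 holds an Empty1 member, which counts as a non-static data member.
static_assert(!std::is_empty<Empty2>::value, "Empty2 is not empty today");
// Both are complete object types and therefore have a nonzero size.
static_assert(sizeof(Empty1) >= 1 && sizeof(Empty2) >= 1, "no zero-size objects");
```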

Also, you are now doing something never before seen in C++: you are changing the definition of a class based on the definition of one of its member functions. Naturally that requires those member functions to be inline. Consider this:

struct Empty
{
    void DoSomething();
};

struct Full
{
    int foo;
    Empty bar;
};

You are saying that `sizeof(Full)` depends on the eventual definition of `DoSomething`. This means that compilers have to do one of the following:

1. In the absence of the definition of any member function, assume that the class is not stateless until all member function definitions are found.

Which means that `sizeof(Full)` will be 8 bytes or so. And yet, if you do this afterwards:

inline void Empty::DoSomething() {}

struct Full2
{
    int foo;
    Empty bar;
};

`sizeof(Full2)` will be 4. That's rather surprising, yes?

2. In the absence of the definition of any members, wait until the entire translation unit is fully explored before determining the sizes of classes.

3. Unless all members are defined within the class itself, the class can never be stateless.
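
For contrast, a sketch of what the current rules give: the layout of a class is fixed at the closing brace of its definition, so a later definition of `DoSomething` cannot change it. The final assertion reflects how mainstream implementations lay out identical member lists.

```cpp
struct Empty
{
    void DoSomething();   // declared only; the definition comes later
};

struct Full
{
    int foo;
    Empty bar;
};

// The layout of Full is already fixed here, with no definition in sight.
static_assert(sizeof(Full) > sizeof(int), "bar needs storage of its own");

inline void Empty::DoSomething() {}

struct Full2
{
    int foo;
    Empty bar;
};

// Seeing the definition makes no difference to the layout today.
static_assert(sizeof(Full2) == sizeof(Full), "same members, same size");
```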
> This shouldn't break any existing code because you can call a static member function with the same syntax as a non-member function, so any code calling a member function can also call a static function without change.
>   2. "Encourage" - like with the RVO - compilers to remove objects that are only used to call static member functions. Of course once the actual object is required (i.e. if you take the address of it), this optimization cannot happen.
You're effectively saying that I can declare a class that includes an empty class as a public member. And that public member will suddenly take up space if any code anywhere takes the address of that member? There is no way that would work. You're saying that the compiler would have to compile the entire source code, fully linked and everything, before it could know if it could optimize a variable away.

Unless you're saying that taking the address of a member can provoke an ODR violation. Which is much worse.

No, this is just not workable. However it is you want to make a subobject stateless, this cannot be implicit. You need some syntax to explicitly declare that you are invoking this power, either at the site of a member's declaration or at the site of a type's definition (or both).

> With these changes you can simply write the following generic code:
>
> struct empty_policy
> {
>     /*static*/ void do_sth();
> };
>
> template <class Policy>
> class generic_type
> {
> public:
>     void do_sth_fancy()
>     {
>         ++value;
>         policy.do_sth();
>     }
>
> private:
>     int value;
>     Policy policy;
> };
>
> Because policy is only used to call a (implicitly) static member function, it will be optimized out and sizeof(generic_type<empty_policy>) will be sizeof(int).
> If you instantiate it with a non-empty type, however, this is not the case.
You said "will be". If that's the case, then this is not like RVO, since RVO is not required. So is this required or is it not?

> This approach is cumbersome, however: It is very easy to write code that depends on the address of policy,
> I thus also suggest:
>
>   3. A new attribute
No. Stop using attributes to solve problems. Especially this one.


> This is just a rough sketch, there might be issues (like the copy/move ctors/assignment operators for empty types and their ability to affect the optimization),
> but what do you think?


You're essentially trying to define away the problem by making the object not actually be a member. It's an interesting solution to the problem, but I don't think it's the best way to handle it.

The basic problem with stateless members is that they violate one of two C++ rules: 1) Objects must occupy a region of storage, 2) different instances of objects must occupy different regions of storage. In order to have stateless objects, one of these rules must be changed.

I think changing rule #1 is folly, because that rule is a very fundamental foundation of the entire C++ object model. The fallout of making such a change is not well understood. Rule #2 is not quite so inviolate, because we allow base classes to get away with it.

I think the most effective way to move forward is to change rule #2 so that member subobjects (properly designated via some syntax) need not occupy a unique region of storage. That would let such stateless subobjects be as regular as possible: they would work like regular objects, except where a difference is necessary for them to take up no space in their containing object's layout.
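
A small sketch of the asymmetry in rule #2 described above: an empty base class is allowed to share storage with another subobject, while an empty member is not. The type names `via_member` and `via_base` are illustrative only, and the exact size of `via_base` is what mainstream ABIs produce rather than something asserted here as a language guarantee.

```cpp
struct empty_policy { void do_sth() {} };

// As a member, the policy must have its own address, so it takes up space.
struct via_member
{
    int value;
    empty_policy policy;
};

// As a base class, the empty subobject may share storage with `value`.
struct via_base : empty_policy
{
    int value;
};

static_assert(sizeof(via_member) > sizeof(int),
              "the member subobject occupies storage");
// Holds on mainstream ABIs (e.g. Itanium and MSVC).
static_assert(sizeof(via_base) == sizeof(int),
              "the empty base is optimized away");
```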

Jonathan Müller

Jul 7, 2016, 11:37:52 AM
to ISO C++ Standard - Future Proposals

On Thursday, July 7, 2016 at 4:22:46 PM UTC+2, Nicol Bolas wrote:
> Why do people keep talking about this in terms of EBO, like that's the only reason why you would ever want a member to take up no space?
No, because that's the only way to have empty members.
So every solution would be a replacement for EBO.

> So, a member function which calls a sibling member function that doesn't access NSDMs wouldn't be allowed? That's a pretty harsh requirement. And a very unnecessary one.
There are many negations here, so let me restate it to check my understanding:
A member function calls a sibling member function.
The sibling member function doesn't access NSDMs, thus it is implicitly static.
And thus the original member function is implicitly static as well.
 
> And what if the type has its own member which is empty by this definition? Are you saying that nested empty members don't work:
>
> struct Empty1 { void DoSomething() {} };
> struct Empty2 { Empty1 e; void DoSomething() { e.DoSomething(); } };
>
> Why should `Empty2` not be just as empty as `Empty1`?
`Empty2` should be just as empty as `Empty1`, yes.

 
> Also, you are now doing something never before seen in C++: you are changing the definition of a class based on the definition of one of its member functions. Naturally that requires those member functions to be inline. Consider this:
>
> struct Empty
> {
>     void DoSomething();
> };
>
> struct Full
> {
>     int foo;
>     Empty bar;
> };
>
> You are saying that `sizeof(Full)` depends on the eventual definition of `DoSomething`. This means that compilers have to do one of the following:
No. Because `Empty` doesn't have any member variables, `DoSomething` cannot possibly access any member variables and is thus implicitly static.
 
> You said "will be". If that's the case, then this is not like RVO, since that's not required. So is this required or is it not?
My mistake, I meant copy elision, not RVO. It is required.
 
> No. Stop using attributes to solve problems. Especially this one.
Isn't that exactly why attributes are there?
Note that the attribute has no influence on the optimization; it will be performed anyway.
The attribute just warns if the optimization cannot be performed.
There are already similar GCC/Clang attributes.
 
> You're essentially trying to define away the problem by making the object not actually be a member. It's an interesting solution to the problem, but I don't think it's the best way to handle it.
Yes, that's my idea.

Nicol Bolas

Jul 7, 2016, 12:15:52 PM
to ISO C++ Standard - Future Proposals
On Thursday, July 7, 2016 at 11:37:52 AM UTC-4, Jonathan Müller wrote:

> On Thursday, July 7, 2016 at 4:22:46 PM UTC+2, Nicol Bolas wrote:
>> Why do people keep talking about this in terms of EBO, like that's the only reason why you would ever want a member to take up no space?
> No, because that's the only way to have empty members.
> So every solution would be a replacement for EBO.

No, it isn't.

EBO exists to permit optimized storage of base classes. Now, people have used a type as a base class in cases that they otherwise would have just used an NSDM because EBO is available. But that's not the only reason we have EBO.

Stateless members do not replace EBO. They will replace the use of EBO for types that you want to be members but can't make members due to the overhead, that is, when the relationship between the subobject and its container is not an "is a" relationship.

EBO will still be used for CRTP purposes, since that does model the "is a" relationship.
 
>> So, a member function which calls a sibling member function that doesn't access NSDMs wouldn't be allowed? That's a pretty harsh requirement. And a very unnecessary one.
> Many negations, let me grasp this:
> A member function that calls a sibling member function.
> The sibling member function doesn't access NSDMs, thus it is implicitly static.
> And thus the original member function is implicitly static.

And what if the two functions recursively call each other?
 
 
>> And what if the type has its own member which is empty by this definition? Are you saying that nested empty members don't work:
>>
>> struct Empty1 { void DoSomething() {} };
>> struct Empty2 { Empty1 e; void DoSomething() { e.DoSomething(); } };
>>
>> Why should `Empty2` not be just as empty as `Empty1`?
> `Empty2` should be just as empty as `Empty1`, yes.
 
>> Also, you are now doing something never before seen in C++: you are changing the definition of a class based on the definition of one of its member functions. Naturally that requires those member functions to be inline. Consider this:
>>
>> struct Empty
>> {
>>     void DoSomething();
>> };
>>
>> struct Full
>> {
>>     int foo;
>>     Empty bar;
>> };
>>
>> You are saying that `sizeof(Full)` depends on the eventual definition of `DoSomething`. This means that compilers have to do one of the following:
> No, because `Empty` doesn't have any member variables, `DoSomething` cannot possibly access any member variables and is thus implicitly static.

It can do this:

static Empty *ptr = nullptr;

void Empty::DoSomething()
{
    ptr = this;
}

If `DoSomething` is a static function, then that code is ill-formed. And yet, that code is 100% legal and well-defined today. What you're proposing would therefore break this code. While you might scoff at that for this example, it would also break this code:

void Empty::DoSomething()
{
    std::sort(..., [this](const auto &lhs, const auto &rhs) {...});
}

It would break any code where the function uses `this` as a pointer/reference to its own instance. Empty types are passed by pointer/reference quite frequently, and member functions of empty types should reasonably be expected to be able to call such functions.

You cannot assume that member functions of empty classes don't use `this`. Therefore, the only way to make your proposal work is to have the determination of whether a type is stateless or not based on what its member functions are actually doing. Based on their definitions. Which means you need to have those definitions when it comes time to decide what the class is.

You can't get around this. You cannot declare that stateless objects just don't exist; they have to be real, live objects with actual, genuine pointers.
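
The point about `this` can be made concrete with a small sketch built on the `ptr` example above. The helper `identity_is_observable` is a hypothetical name added for illustration; it only shows that a member function of an empty class can observe which object it was called on.

```cpp
#include <cassert>

struct Empty
{
    void DoSomething();
};

// Records the object that DoSomething was most recently called on.
static Empty* ptr = nullptr;

void Empty::DoSomething()
{
    ptr = this;   // needs a real `this`; ill-formed in a static member function
}

// Returns true if each call recorded the address of the object it was made on.
bool identity_is_observable()
{
    Empty a, b;
    assert(&a != &b);              // distinct objects have distinct addresses

    a.DoSomething();
    const bool saw_a = (ptr == &a);
    b.DoSomething();
    const bool saw_b = (ptr == &b);

    ptr = nullptr;                 // do not leave a dangling pointer behind
    return saw_a && saw_b;
}
```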

>> You said "will be". If that's the case, then this is not like RVO, since that's not required. So is this required or is it not?
> My mistake, I meant copy elision, not RVO. It is required.

Only partially; it is only required for non-named cases.
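
A sketch of the distinction being drawn, assuming the copy elision rules as adopted for C++17. `Pinned` and `make_unnamed` are illustrative names, not anything from the thread.

```cpp
struct Pinned
{
    int id;
    constexpr explicit Pinned(int i) : id(i) {}
    Pinned(const Pinned&) = delete;             // neither copyable nor movable
    Pinned& operator=(const Pinned&) = delete;
};

// Unnamed case: a prvalue is returned. Under the C++17 rules no copy or
// move takes place, so this compiles even though Pinned cannot be copied.
constexpr Pinned make_unnamed(int i)
{
    return Pinned(i);
}

// Named case (NRVO): elision is permitted but not required, so the copy or
// move constructor must still be usable. The function below would be
// rejected for Pinned and is therefore left commented out.
// Pinned make_named(int i) { Pinned p(i); return p; }
```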

Tony V E

Jul 7, 2016, 5:49:52 PM
to Standard Proposals
On Thu, Jul 7, 2016 at 5:03 AM, Jonathan Müller <jonathanm...@gmail.com> wrote:
> This is a suggestion for a different approach to "solve" EBO.
> Responses to "Allow zero size arrays and let them occupy zero bytes" mentioned that the C++ object model cannot have anything that has size 0.
> So I got a completely different idea based on the fact that empty types are only needed for their member functions.
>   1. (optional, required for current empty types to work) Make member functions that don't access any non-static member variables or functions implicitly static.

This breaks existing ABIs. Lots of code breakage. Very bad: "...negative, negative. We had a reactor leak here now... Large leak, very dangerous..."

arthur...@mixpanel.com

Jul 7, 2016, 8:19:38 PM
to ISO C++ Standard - Future Proposals
On Thursday, July 7, 2016 at 2:03:52 AM UTC-7, Jonathan Müller wrote:
> This is a suggestion for a different approach to "solve" EBO.
> Responses to "Allow zero size arrays and let them occupy zero bytes" mentioned that the C++ object model cannot have anything that has size 0.
> So I got a completely different idea based on the fact that empty types are only needed for their member functions.
>   1. (optional, required for current empty types to work) Make member functions that don't access any non-static member variables or functions implicitly static.
>     This shouldn't break any existing code because you can call a static member function with the same syntax as a non-member function, so any code calling a member function can also call a static function without change.
As Nicol pointed out, this blows up on

struct A {
    int not_visible();
    int not_sure_if_implicitly_static_or_not() { return not_visible(); }
};

but my first thought was actually that it blows up on

struct B {
    int oops() { return 3; }
    int oops() const { return 4; }
};

and for that matter on

struct B {
    int implicitly_static() { return 3; }
};
template<class T>
auto sfinae(const T& t) -> decltype(t.implicitly_static())
{
    return 3;
}

... sfinae(B()) ...
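
Both observations can be checked against today's rules with a short sketch, assuming C++17. The second `struct B` is renamed `C` here to avoid the clash, and `C_if_static` and `callable_on_const` are hypothetical names added for illustration. If both `oops` overloads became static they would be conflicting redeclarations, and making `implicitly_static` static would flip the result of the detection trait.

```cpp
#include <type_traits>
#include <utility>

struct B
{
    constexpr int oops() { return 3; }
    constexpr int oops() const { return 4; }
};

// Overload resolution picks by the constness of the object expression.
static_assert(B{}.oops() == 3, "non-const object selects the first overload");
constexpr B const_b{};
static_assert(const_b.oops() == 4, "const object selects the second overload");

struct C
{
    int implicitly_static() { return 3; }          // non-const member today
};

struct C_if_static
{
    static int implicitly_static() { return 3; }   // what the proposal implies
};

// Detects whether `t.implicitly_static()` is well-formed for a const T.
template <class T, class = void>
struct callable_on_const : std::false_type {};

template <class T>
struct callable_on_const<
    T, std::void_t<decltype(std::declval<const T&>().implicitly_static())>>
    : std::true_type {};

// Today the call is ill-formed on a const object, so SFINAE removes it.
static_assert(!callable_on_const<C>::value, "not callable on const today");
// As a static member function the same expression becomes well-formed.
static_assert(callable_on_const<C_if_static>::value, "callable once static");
```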

 > "Encourage" - like with the RVO - compilers to remove objects that are only used to call static member functions.
> Of course once the actual object is required (i.e. if you take the address of it), this optimization cannot happen.

As in the first example (struct A) above, this can't possibly work if there are any not-currently-visible-in-this-TU member functions or non-member friend functions. Which also means that moving a member function from a header file to a .cpp file could break your struct type's ABI. Which is horrible.
Optimizations that require a whole-program or link-time optimizer are pretty much non-starters.

> 2. A new attribute: maybe_empty (or a completely different name). It informs the compiler
> that a member could be empty for some type and warns if you write code that requires its existence.

This second idea doesn't seem useful unless we also adopt your "make member functions static" idea, which is a non-starter, as described above.
If that first idea could be made to work, then this maybe_empty / do_not_rely_on_my_existence attribute idea seems nice; but given that the first idea is dead, so is this second idea.

–Arthur