Is dereferencing this pointer a UB?

Andrzej Krzemieński

Aug 10, 2017, 12:52:43 PM
to ISO C++ Standard - Discussion
Hi Everyone,
I need help in determining whether the following is UB.

The following code, compiled with the UB sanitizer, reports UB:

struct B;

struct I {
  virtual void f() {} // <- virtual
};

struct A : I {
  A();
};

struct B : A {
};

A::A() { *static_cast<B*>(this); }

int main()
{
  B{};
}

Now, if I make function f non-virtual, the sanitizer message goes away.

struct B;

struct I {
  void f() {} // <- non-virtual
};

struct A : I {
  A();
};

struct B : A {
};

A::A() { *static_cast<B*>(this); }

int main()
{
  B{};
}

But do these two programs contain UB? If so, could you point me to the places in the Standard that would back it up?

Regards,
&rzej;


Nicol Bolas

Aug 10, 2017, 1:28:30 PM
to ISO C++ Standard - Discussion

The same place this stuff usually gets stopped: [basic.lval], with an assist from the rules of object initialization in constructors, [class.base.init] and [basic.life].

We know that `this` doesn't point to an object of dynamic type `B`, because of the rules of [class.base.init]/13 and [basic.life]/1. Until `B`'s constructor finishes, the lifetime of the `B` object has not begun, since the object has non-vacuous initialization: the default constructor of `A` is not trivial, and therefore neither is the default constructor of `B`.

Since the lifetime of the `B` object has not begun, accessing the object violates [basic.lval]/8.
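
The triviality claim above can be checked with the standard type traits. This is only a sketch: the class definitions are copied from the original post with an empty body for `A::A()`, and the helper function name is made up for illustration.

```cpp
#include <cassert>
#include <type_traits>

struct I {
  virtual void f() {}
};

struct A : I {
  A();  // user-provided, so not trivial
};

struct B : A {
};

A::A() {}

// Both checks hold at compile time: neither A nor B has a trivial
// default constructor, so initialization of a B object is non-vacuous.
static_assert(!std::is_trivially_default_constructible<A>::value, "A is not trivial");
static_assert(!std::is_trivially_default_constructible<B>::value, "B is not trivial");

// Hypothetical helper so the result can also be observed at run time.
bool b_has_nontrivial_default_ctor() {
  return !std::is_trivially_default_constructible<B>::value;
}
```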

Nevin Liber

Aug 10, 2017, 1:31:02 PM
to std-dis...@isocpp.org
On Thu, Aug 10, 2017 at 11:52 AM, Andrzej Krzemieński <akrz...@gmail.com> wrote:
> The following code compiled with UB sanitizer reports an UB:
>
> struct B;
>
> struct I {
>   virtual void f() {} // <- virtual
> };
>
> struct A : I {
>   A();
> };
>
> struct B : A {
> };
>
> A::A() { *static_cast<B*>(this); }
>
> int main()
> {
>   B{};
> }
>
> Now, If I make function f non-virtual, the sanitizer message goes away.
>
> struct B;
>
> struct I {
>   void f() {} // <- non-virtual
> };
>
> struct A : I {
>   A();
> };
>
> struct B : A {
> };
>
> A::A() { *static_cast<B*>(this); }
>
> int main()
> {
>   B{};
> }
>
> But do these two programs contain UB? If so, could you point me to the places in the Standard that would back it up?

I believe they are both UB because you are trying to access the B object before its lifetime has begun.  My guess is to look at [basic.life] and [class.base.init], but I'm no core expert.
--
 Nevin ":-)" Liber  <mailto:ne...@eviloverlord.com>  +1-847-691-1404

Andrzej Krzemieński

Aug 10, 2017, 2:27:19 PM
to ISO C++ Standard - Discussion
Thanks for the prompt response, but some things still seem unclear to me. By dereferencing a pointer I only produce an lvalue. Does that mean I am "accessing" the object?

Also, if reading the value of an object whose lifetime has not yet begun is UB, does this mean that the following also has UB?

struct S
{
  int a = 0;
  S() {
    ++a; // lifetime of `*this` not yet started, `a` is part of its value
  }
};

Regards,
&rzej;

Ville Voutilainen

Aug 10, 2017, 2:30:21 PM
to std-dis...@isocpp.org
On 10 August 2017 at 21:27, Andrzej Krzemieński <akrz...@gmail.com> wrote:
> Also, if reading the value of an object whose lifetime has not yet begun is
> UB, does this mean that the following has also an UB?:
>
> struct S
> {
> int a = 0;
> S() {
> ++a; // lifetime of `*this` not yet started, `a` is part of its value
> }
> };


That's not UB; a constructor body is allowed full use of initialized subobjects.
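
A minimal sketch of that point, reusing the struct `S` from the question. The helper function is hypothetical and only there to make the result observable.

```cpp
#include <cassert>

struct S {
  int a = 0;  // the default member initializer runs before the constructor body
  S() {
    ++a;      // `a` is already initialized here, so reading and modifying it is fine
  }
};

// Hypothetical helper: constructs an S and reports the value of `a`.
int value_after_construction() {
  S s;
  assert(s.a == 1);
  return s.a;
}
```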

Thiago Macieira

Aug 10, 2017, 10:26:40 PM
to std-dis...@isocpp.org
On Thursday, August 10, 2017 09:52:43 PDT Andrzej Krzemieński wrote:
> A::A() { *static_cast<B*>(this); }

This is UB.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center

Nicol Bolas

Aug 10, 2017, 11:13:56 PM
to ISO C++ Standard - Discussion

My reading of the specification is that dereferencing a pointer is accessing what it points to. There's been some question as to whether merely dereferencing `nullptr` is UB, or if it only becomes UB if you actually do something with it.

However, I can say this for certain: if your first example is UB, then your second example will also be UB, and it will be UB for the same reason. And if your second example is not UB, then the first one will also not be UB, and for the same reason.

The presence or absence of a `virtual` function has no bearing on the correctness of this code. The object is most certainly not a `B` at the time of the cast. So accessing it as a `B` is UB. The only question is whether dereferencing the pointer is good enough to be considered "accessing".

Andrzej Krzemieński

Aug 11, 2017, 2:23:10 AM
to ISO C++ Standard - Discussion

Regarding dereferencing a pointer versus accessing the pointee, I can see the following relevant sections:

[defns.access]: "access -- <execution-time action> to read or modify the value of an object"
[intro.object]/1: "The constructs in a C++ program create, destroy, refer to, access, and manipulate objects." -- as though referring to an object was distinct from accessing it.
[expr.ref] says that operator. and operator-> are used for "class member access" -- not dereferencing.

This would seem to imply that dereferencing is not accessing the pointee?
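
To make the distinction concrete, here is a sketch for a plain `int`, under the reading that "access" means reading or modifying the value as in [defns.access]. Whether `*p` alone counts as an access for a class object under construction is exactly the open question in this thread; the function name is made up.

```cpp
#include <cassert>

// Hypothetical helper contrasting "forming an lvalue" with "reading the value".
int forms_lvalue_then_reads() {
  int x = 42;
  int* p = &x;

  (void)*p;     // discarded-value expression: yields an lvalue; for a
                // non-volatile int no lvalue-to-rvalue conversion is applied
  int& r = *p;  // binds a reference to the lvalue; the value is not read
  int y = *p;   // lvalue-to-rvalue conversion: the value of x is read here
  r = y + 1;    // modifies x through the reference
  return x;
}
```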

Regards,
&rzej;
 

Nevin Liber

Aug 11, 2017, 2:36:41 AM
to std-dis...@isocpp.org
On Fri, Aug 11, 2017 at 1:23 AM, Andrzej Krzemieński <akrz...@gmail.com> wrote:

This would seem to imply that dereferencing is not accessing the pointee?

What exactly are you trying to accomplish by doing this?  I'm not seeing a practical use case.

I can certainly guess why the class with a virtual function is caught by the sanitizer: it has a vtable to inspect to see that the type is wrong, while a class without a virtual function doesn't.
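
The effect a run-time check can rely on is observable in portable code through `typeid`: during `A`'s constructor the dynamic type of `*this` is `A`, not `B`. The sketch below assumes nothing about how a particular sanitizer is implemented; the member `type_seen_in_ctor` and the helper function are made up for illustration.

```cpp
#include <cassert>
#include <typeinfo>

struct I {
  virtual void f() {}
};

struct A : I {
  const std::type_info* type_seen_in_ctor = nullptr;
  A() { type_seen_in_ctor = &typeid(*this); }  // evaluated while only the A part exists
};

struct B : A {
};

// Hypothetical helper: true if the dynamic type was A inside A's
// constructor and is B once construction has finished.
bool dynamic_type_changes_during_construction() {
  B b;
  I& as_base = b;
  return *b.type_seen_in_ctor == typeid(A) && typeid(as_base) == typeid(B);
}
```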

Andrzej Krzemieński

Aug 11, 2017, 2:56:16 AM
to ISO C++ Standard - Discussion


On Friday, August 11, 2017 at 08:36:41 UTC+2, Nevin ":-)" Liber wrote:
> On Fri, Aug 11, 2017 at 1:23 AM, Andrzej Krzemieński <akrz...@gmail.com> wrote:
>> This would seem to imply that dereferencing is not accessing the pointee?
>
> What exactly are you trying to accomplish by doing this?  I'm not seeing a practical use case.

The original problem is (of course) more convoluted. It is part of a CRTP design. In the base class, a developer takes a reference to the derived class (this happens in the constructor of the base class) and stores this reference as a member for future use (after the Derived object's lifetime starts):

template <typename T>
struct Base
{
  T& _derived;

  Base() : _derived(*static_cast<T*>(this)) {}
};

struct Derived : Base<Derived>
{
  int i = 0;
};

int main()
{
  Derived d;

  // now d._derived.i may be used
}


> I can certainly guess why the class with a virtual is caught by the sanitizer in that it has a vtable to look at to see that the type is wrong while a class without a virtual it doesn't.

Yes, because the sanitizer sees a vtable pointer it is able to perform more checks, so it uses the opportunity. But it looks like it might be too eagerly reporting a false positive.

Interestingly, even without the suspicious dereference, the sanitizer still reports a run-time error:

struct B;

struct I {
  virtual void f() {} // <- virtual
};

struct A : I {
  A();
};

struct B : A {
};

A::A() { static_cast<B*>(this); } // <- no dereference

int main()
{
  B{};
}

Regards,
&rzej;

Florian Weimer

Aug 11, 2017, 4:15:59 AM
to std-dis...@isocpp.org, Andrzej Krzemieński
On 08/11/2017 08:23 AM, Andrzej Krzemieński wrote:
> Regarding dereferencing a pointer versus accessing the pointee, I can see
> the following relevant sections:
>
> [defns.access]: "access -- <execution-time action> to read or modify the
> value of an object"
> [intro.object]/1: "The constructs in a C++ program create, destroy, refer
> to, access, and manipulate objects." -- as though referring to an object
> was distinct from accessing it.
> [expr.ref] says that operator. and operator-> are used for "class member
> access" -- not dereferencing.
>
> This would seem to imply that dereferencing is not accessing the pointee?

I think existing implementations need the evaluation of E1->E2 to
perform an access of *E1; otherwise type-based aliasing analysis would
not be sound. And this means that *E1 itself accesses the object.

On the other hand, this rule greatly reduces the usefulness of mutexes
and atomics because you cannot embed them in the object to which they
synchronize access because the usual syntax to refer to members
introduces a data race (the entire object is accessed). It is possible
to work around this using offsetof and pointer arithmetic, but no
programmer does this.

To me, this looks like a defect in the language, but addressing it is
not easy.
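
For reference, the idiom described above (a mutex embedded in the object it protects, reached through ordinary member syntax) looks roughly like the sketch below. The names `Counter` and `increment` are made up, and the sketch takes no position on whether `c->m` counts as an access of the whole object.

```cpp
#include <cassert>
#include <mutex>

struct Counter {
  std::mutex m;   // guards `value`
  int value = 0;
};

// Intended to be callable from several threads at once.
void increment(Counter* c) {
  std::lock_guard<std::mutex> lock(c->m);  // the usual member syntax: c->m
  ++c->value;                              // protected by the lock
}

// Hypothetical single-threaded driver, only to show the calls.
int increment_twice() {
  Counter c;
  increment(&c);
  increment(&c);
  return c.value;
}
```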

Florian

Andrzej Krzemieński

Aug 11, 2017, 4:58:27 AM
to ISO C++ Standard - Discussion
Ok, I think I have found an answer in this SO question: https://stackoverflow.com/questions/28928590/safety-of-static-cast-to-pointer-to-derived-class-from-base-destructor

It has nothing to do with accessing/dereferencing a pointer, but only with the static_cast. It is UB to cast to the derived type if there is no object of the derived type (yet) to downcast to.
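
Outside of constructors, the uncontroversial version of that rule can be sketched as follows. The types `Base` and `Derived` and the helper function are made up for illustration; the undefined case is left commented out on purpose.

```cpp
#include <cassert>

struct Base {
  int b = 1;
};

struct Derived : Base {
  int i = 2;
};

// Well-defined: the Base subobject really belongs to a complete Derived object.
int downcast_when_derived_exists() {
  Derived d;
  Base* pb = &d;
  Derived* pd = static_cast<Derived*>(pb);
  return pd->i;
}

// Not well-defined: there is no Derived object to downcast to.
//   Base plain;
//   Derived* bad = static_cast<Derived*>(&plain);  // undefined behavior
```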

Regards,
&rzej;

Thiago Macieira

Aug 11, 2017, 12:30:23 PM
to std-dis...@isocpp.org
On Thursday, August 10, 2017 23:56:15 PDT Andrzej Krzemieński wrote:
> The original problem is (of course) more convoluted. It is a part of some
> CRTP design. In the base class a developer is taking reference to the
> derived class (this happens in the constructor of the base class and stores
> this reference as member for future use (after the Derived object's
> lifetime starts:

Why do you need to store a reference (which is implemented as a pointer) that
can be reached only after you already know the value of that pointer?

That is to say, you can only access this->_derived if you know the value of
this, and this == this->_derived (modulo cast, which is constant).
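
One common way to act on this observation is to perform the cast on demand rather than caching a reference in the base constructor. The following is a sketch of that alternative, not the code under review; the accessor name `derived()` and the helper function are made up.

```cpp
#include <cassert>

template <typename T>
struct Base {
  // Cast when needed, i.e. only after the Derived object exists,
  // instead of storing a T& in the constructor.
  T& derived() { return static_cast<T&>(*this); }
  const T& derived() const { return static_cast<const T&>(*this); }
};

struct Derived : Base<Derived> {
  int i = 0;
};

// Hypothetical helper: reads Derived::i through the base-class accessor.
int read_i_through_base() {
  Derived d;
  d.i = 7;
  Base<Derived>& b = d;
  return b.derived().i;
}
```

This also drops the reference member, so `Base` stays empty and `Derived` stays copy-assignable.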

Andrzej Krzemieński

Aug 12, 2017, 4:35:57 AM
to ISO C++ Standard - Discussion


On Friday, August 11, 2017 at 18:30:23 UTC+2, Thiago Macieira wrote:
> On Thursday, August 10, 2017 23:56:15 PDT Andrzej Krzemieński wrote:
>> The original problem is (of course) more convoluted. It is a part of some
>> CRTP design. In the base class a developer is taking reference to the
>> derived class (this happens in the constructor of the base class and stores
>> this reference as member for future use (after the Derived object's
>> lifetime starts:
>
> Why do you need to store a reference (which is implemented as a pointer) that
> can be reached only after you already know the value of that pointer?
>
> That is to say, you can only access this->_derived if you know the value of
> this, and this == this->_derived (modulo cast, which is constant).

I might have made the example too trivial while trying to simplify it. I do not even claim that this "pattern" makes sense, or that it is sound. It came up in a code review at work. I was just surprised that, apart from its strangeness, it was also reported as UB by the sanitizer. Before this issue came up, I would have believed that such a downcast was valid.

Regards,
&rzej;

Andrzej Krzemieński

Aug 12, 2017, 6:34:48 AM
to ISO C++ Standard - Discussion

Ok, just for completeness: according to Jens Maurer, on the SG12 list, the static_cast is valid: http://www.open-std.org/pipermail/ub/2017-August/000584.html

Regards,
&rzej;