Strange code-generation

Bonita Montero

Sep 16, 2019, 8:53:44 AM
I just experimented a bit with downcasting to a derived class.
So here's the code:

struct A
{
    int a, b, c;
};

struct B
{
    int d, e, f;
};

struct D : public A, public B
{
    int i, j, k;
};

D *f( B &b )
{
    return &static_cast<D &>(b);
}

This is what you would expect, with g++:

    leaq    -12(%rdi), %rax
    ret

This is what MSVC makes of the above code:

    xor     edx, edx
    lea     rax, QWORD PTR [rcx-12]
    test    rcx, rcx
    cmove   rax, rdx
    ret     0

So MSVC returns a null pointer when the input was also a null pointer.
Is there any requirement in the standard that mandates this behaviour?
I think it would even be stupid to put that into the standard.
C++ isn't a language with child-proof locks.
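
For reference, the -12 in both listings is the byte offset of the B subobject inside D: the cast has to step back over A's three ints. A minimal sketch of that, assuming 4-byte int and the usual layout where base subobjects are placed in declaration order (the standard guarantees neither). The helper name offset_of_B_in_D is made up for this illustration:

```cpp
#include <cstddef>

struct A { int a, b, c; };
struct B { int d, e, f; };
struct D : public A, public B { int i, j, k; };

// Same function as in the post: downcast a B subobject to its enclosing D.
D *f(B &b)
{
    return &static_cast<D &>(b);
}

// Hypothetical helper: distance in bytes from the start of a D object
// to its B base subobject (expected to equal sizeof(A) on common ABIs).
std::ptrdiff_t offset_of_B_in_D(D &d)
{
    B &b = d;  // implicit derived-to-base conversion adjusts the address
    return reinterpret_cast<char *>(&b) - reinterpret_cast<char *>(&d);
}
```

Calling f on the B base of a real D object gives back the address of that D, which is exactly the subtraction the leaq/lea instruction performs.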

Paavo Helde

Sep 16, 2019, 9:59:26 AM
In standard C++ one cannot legally construct a "null reference", so the
standard cannot mandate any behavior regarding them. The special
handling of null only applies to the pointer form of static_cast<>.

However, an implementation may define behavior for things which are UB
by the standard. MSVC supports things like this==nullptr (see e.g.
CWnd::GetSafeHwnd() function), so maybe it supports "null references" as
well, possibly for backward compatibility with their own old code.


Bonita Montero

Sep 16, 2019, 3:08:18 PM
I think this is relevant:
"A prvalue of type “pointer to cv1 B,” where B is a class type,
can be converted to a prvalue of type “pointer to cv2 D,” where
D is a class derived (Clause 10) from B, if a valid standard
conversion from “pointer to D” to “pointer to B” exists (4.10),
cv2 is the same cv-qualification as, or greater cv-qualification
than, cv1, and B is neither a virtual base class of D nor a base
class of a virtual base class of D. The null pointer value (4.10)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
is converted to the null pointer value of the destination type
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
..."
The compiler just handles it the same way for references because
the internal representation is the same as with pointers.
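
A small sketch of the pointer form that the quoted paragraph covers, reusing the structs from the first post. The function name downcast is only chosen for this illustration:

```cpp
struct A { int a, b, c; };
struct B { int d, e, f; };
struct D : public A, public B { int i, j, k; };

// Pointer form of the downcast: a null B* has to map to a null D*,
// so the compiler cannot just subtract the offset unconditionally.
D *downcast(B *b)
{
    return static_cast<D *>(b);
}
```

With a null argument the result compares equal to nullptr; with the B base of a real D object it is the address of that D. The reference form has no such rule, since the standard has no notion of a valid null reference to preserve.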

Tim Rentsch

Sep 18, 2019, 11:48:02 AM
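Paavo Helde <myfir...@osa.pri.ee> writes:

> In standard C++ one cannot legally construct a "null reference", so
> the standard cannot mandate any behavior regarding them. [...]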

Sure it could. The Standard doesn't say that they can't exist,
and even if they couldn't exist the Standard could still specify
a behavior for how they must be treated. And it might even be
useful to give such a specification, for implementations that
choose to provide null references by means of an extension.

Paavo Helde

Sep 18, 2019, 2:07:07 PM
On 18.09.2019 18:47, Tim Rentsch wrote:
> Paavo Helde <myfir...@osa.pri.ee> writes:
>>
>> In standard C++ one cannot legally construct a "null reference", so
>> the standard cannot mandate any behavior regarding them. [...]
>
> Sure it could. The Standard doesn't say that they can't exist,
> and even if they couldn't exist the Standard could still specify
> a behavior for how they must be treated. And it might even be
> useful to give such a specification, for implementations that
> choose to provide null references by means of an extension.

By that logic the C++ standard could also prescribe how the SQL language
or the invisible unicorn in my garage must behave, in the case someone
happens to incorporate them into their C++ implementation.

More to the point, the standard contains the following verbiage:
"A reference shall be initialized to refer to a valid object or
function. [ Note: in particular, a null reference cannot exist in a
well-defined program, because the only way to create such a reference
would be to bind it to the “object” obtained by indirection through a
null pointer, which causes undefined behavior. ]"

Without contradicting itself, the standard cannot say in one place "A
reference shall be initialized to refer to a valid object" and "behavior
is undefined" and then go on and discuss in another place what happens
if a program violates this "shall", thus defining the behavior.


Bonita Montero

Sep 19, 2019, 1:05:42 PM
> D *f( B &b )
> {
>     return &static_cast<D &>(b);
> }

Look what happens if I do this instead with MSVC:

D *f( B &b )
{
    __assume(&b != nullptr);
    return &static_cast<D &>(b);
}

The resulting code isn't this:

>     xor     edx, edx
>     lea     rax, QWORD PTR [rcx-12]
>     test    rcx, rcx
>     cmove   rax, rdx
>     ret     0

But this:

    lea     rax, QWORD PTR [rcx-12]
    ret     0

MSVC is such a cool compiler.

David Brown

Sep 19, 2019, 1:35:07 PM
As you said in your first post, gcc generates that anyway - without an
"__assume". That makes gcc a cooler compiler :-)


Bonita Montero

Sep 19, 2019, 2:05:21 PM
I just found the following:

gcc handles null pointers according to the standard:

D *f( B *b )
{
    return static_cast<D *>(b);
}

This results in this:

    testq   %rdi, %rdi
    je      .L3
    leaq    -12(%rdi), %rax
    ret
.L3:
    xorl    %eax, %eax
    ret

So on one side gcc is clever enough to assume that references
are never null, but on the other side you have no way to tell
gcc that a pointer to be downcast won't be null.

Paavo Helde

Sep 19, 2019, 3:01:07 PM
On 19.09.2019 21:05, Bonita Montero wrote:
>
> gcc handles null-Pointers according to the standard:
>
> D *f( B *b )
> {
> return static_cast<D *>(b);
> }
>
> This results in this:
>
> testq %rdi, %rdi
> je .L3
> leaq -12(%rdi), %rax
> ret
> .L3:
> xorl %eax, %eax
> ret
>
> So on one side gcc is clever enough to assume that references
> are never null, but on the other side you have no way to tell
> gcc that a pointer to be downcast won't be null.

Except that you have:

D *f( B *b ) {
    if (b) {
        return static_cast<D *>(b);
    } else {
        __builtin_unreachable();
    }
}

> g++ -O2 -c test1.cpp -S

_Z1fP1B:
.LFB0:
    .cfi_startproc
    leaq    -12(%rdi), %rax
    ret
    .cfi_endproc

David Brown

Sep 20, 2019, 2:21:56 AM
On 19/09/2019 20:05, Bonita Montero wrote:
>>>> D *f( B &b )
>>>> {
>>>>      return &static_cast<D &>(b);
>>>> }
>>>
>>> Look what happens if I do this instead with MSVC:
>>>
>>> D *f( B &b )
>>> {
>>>      __assume(&b != nullptr);
>>>      return &static_cast<D &>(b);
>>> }
>>>
>>> The resulting code isn't this:
>>>
>>>>      xor     edx, edx
>>>>      lea     rax, QWORD PTR [rcx-12]
>>>>      test    rcx, rcx
>>>>      cmove   rax, rdx
>>>>      ret     0
>>>
>>> But this:
>>>
>>>      lea        rax, QWORD PTR [rcx-12]
>>>      ret        0
>>>
>>> MSVC is such a cool compiler.
>
>> As you said in your first post, gcc generates that anyway
>> - without an "__assume".  That makes gcc a cooler compiler :-)
>
> I just found the following:
>
> gcc handles null-Pointers according to the standard:

Good.

>
> D *f( B *b )
> {
>     return static_cast<D *>(b);
> }
>
> This results in this:
>
>     testq    %rdi, %rdi
>     je       .L3
>     leaq    -12(%rdi), %rax
>     ret
> .L3:
>     xorl    %eax, %eax
>     ret
>
> So on one side gcc is clever enough to assume that references
> are never null, but on the other side you have no way to tell
> gcc that a pointer to be downcast won't be null.

Didn't you read the answer I gave you to your question "is there an
equivalent to MSVC's __assume in gcc" ? Use that macro:

D *f( B *b )
{
    assume(b);
    return static_cast<D *>(b);
}

f(B*):
    lea     rax, [rdi-12]
    ret

(Or write the __builtin_unreachable() directly, as Paavo did, if you
prefer.)
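
The assume macro itself is not shown in this thread. A minimal sketch of what such a wrapper might look like, assuming MSVC's __assume and the gcc/clang __builtin_unreachable() that were both used above. The name ASSUME and the fallback branch are made up for this illustration:

```cpp
// Hypothetical portable wrapper around the two compiler-specific hints.
#if defined(_MSC_VER)
#  define ASSUME(cond) __assume(cond)
#elif defined(__GNUC__)  // gcc and clang
#  define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#else
#  define ASSUME(cond) ((void)0)  // no hint available: expands to nothing
#endif

struct A { int a, b, c; };
struct B { int d, e, f; };
struct D : public A, public B { int i, j, k; };

D *f(B *b)
{
    ASSUME(b != nullptr);  // promise to the optimizer: b is never null
    return static_cast<D *>(b);
}
```

The hint is a promise, not a check: if a caller does pass a null pointer, the behaviour is undefined, so this only makes sense where the precondition really holds.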

Thus (in this example) gcc gives you as good or better code for the
plain C++, and gives you as good tools for giving the compiler extra
information to generate more efficient results.

I'm happy to see MSVC generating good code and providing useful extras -
but it is nothing special here.

(clang handles this like gcc.)

Tim Rentsch

Nov 9, 2019, 8:53:06 AM
Paavo Helde <myfir...@osa.pri.ee> writes:

> On 18.09.2019 18:47, Tim Rentsch wrote:
>
>> Paavo Helde <myfir...@osa.pri.ee> writes:
>>
>>> In standard C++ one cannot legally construct a "null reference", so
>>> the standard cannot mandate any behavior regarding them. [...]
>>
>> Sure it could. The Standard doesn't say that they can't exist,
>> and even if they couldn't exist the Standard could still specify
>> a behavior for how they must be treated. And it might even be
>> useful to give such a specification, for implementations that
>> choose to provide null references by means of an extension.
>
> By that logic the C++ standard could also prescribe how the SQL
> language or the invisible unicorn in my garage must behave, in the
> case someone happens to incorporate them into their C++
> implementation.

These analogies are pretty much meaningless, because the terms
are not in the universe of discourse for the C++ language. The
term "reference" is in the C++ universe of discourse, and every
C++ programmer understands what "null reference" means, in that
context, by analogy with "null pointer". In fact the term "null
reference" appears in the C++ standard, as you point out.

> More to the point, the standard contains the following verbiage:
> "A reference shall be initialized to refer to a valid object or
> function. [ Note: in particular, a null reference cannot exist in a
> well-defined program, because the only way to create such a reference
> would be to bind it to the “object” obtained by indirection through a
> null pointer, which causes undefined behavior. ]"

Saying "a null reference cannot exist in a well-defined program"
is a circular statement; it has no information content, because
of what is meant by well-defined. Also, just because a construct
has undefined behavior doesn't mean the result can't exist.
Obviously it can exist, because the implementation is free to
define the meaning.

> Without contradicting itself, the standard cannot say in one place "A
> reference shall be initialized to refer to a valid object" and
> "behavior is undefined" and then go on and discuss in another place
> what happens if a program violates this "shall", thus defining the
> behavior.

There is no contradiction. What would (hypothetically) be being
defined is not declaring/initializing a null reference, but using
a null reference. Do you not understand the distinction?


(Sorry to be so long in responding, life has been unusually
chaotic of late.)

Alf P. Steinbach

Nov 9, 2019, 1:15:33 PM
On 18.09.2019 20:06, Paavo Helde wrote:
> On 18.09.2019 18:47, Tim Rentsch wrote:
>> Paavo Helde <myfir...@osa.pri.ee> writes:
>>>
>>> In standard C++ one cannot legally construct a "null reference", so
>>> the standard cannot mandate any behavior regarding them.  [...]
>>
>> Sure it could.  The Standard doesn't say that they can't exist,
>> and even if they couldn't exist the Standard could still specify
>> a behavior for how they must be treated.  And it might even be
>> useful to give such a specification, for implementations that
>> choose to provide null references by means of an extension.
>
> By that logic the C++ standard could also prescribe how the SQL language
> or the invisible unicorn in my garage must behave, in the case someone
> happens to incorporate them into their C++ implementation.
>
> More to the point, the standard contains the following verbiage:
> "A reference shall be initialized to refer to a valid object or
> function. [ Note: in particular, a null reference cannot exist in a
> well-defined program, because the only way to create such a reference
> would be to bind it to the “object” obtained by indirection through a
> null pointer, which causes undefined behavior. ]"


Well, dereferencing a null pointer is explicitly supported in a
typeid-expression.

So for a short time there is a null reference in a valid program.

Happily notes in ISO standards are non-normative, as I recall, but the
wording in the sentence before that note is arguably defective. ;-)
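
A small sketch of the typeid case, assuming a polymorphic class (the rule only applies when the operand is a glvalue of polymorphic type). Base and typeid_through_null_throws are names made up for this illustration:

```cpp
#include <typeinfo>

struct Base { virtual ~Base() = default; };  // polymorphic, so *p is evaluated

// Returns true if applying typeid to *p threw std::bad_typeid,
// which is the specified outcome when p is a null pointer.
bool typeid_through_null_throws(Base *p)
{
    try {
        const std::type_info &ti = typeid(*p);
        (void)ti.name();
        return false;
    } catch (const std::bad_typeid &) {
        return true;
    }
}
```

So the program stays well-defined: the implementation has to notice the null pointer and throw instead of treating *p as an object.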


> Without contradicting itself, the standard cannot say in one place "A
> reference shall be initialized to refer to a valid object" and "behavior
> is undefined" and then go on and discuss in another place what happens
> if a program violates this "shall", thus defining the behavior.

I think that holds even taking into account the typeid special case.

- Alf