Assignment between union object members of incompatible types

Ian Abbott

Nov 30, 2020, 1:44:55 PM
A question (not posted by myself) from https://stackoverflow.com/questions/65077630 :

Consider the following:

    union { int i; char c; } x = {0};
    x.c = x.i;

Does the assignment x.c = x.i result in undefined behavior?

C18 6.15.16.1/3 says:

| If the value being stored in an object is read from another object
| that overlaps in any way the storage of the first object, then the
| overlap shall be exact and the two objects shall have qualified or
| unqualified versions of a compatible type; otherwise, the behavior is
| undefined.

The objects x.c and x.i overlap, but have incompatible types, so on first glance it appears to be UB.

James Kuyper

Nov 30, 2020, 2:45:39 PM
On 11/30/20 1:44 PM, Ian Abbott wrote:
> A question (not posted by myself) from https://stackoverflow.com/questions/65077630 :
>
> Consider the following:
>
> union { int i; char c; } x = {0};
> x.c = x.i;
>
> Does the assignment x.c = x.i result in undefined behavior?
>
> C18 6.15.16.1/3 says:

That's actually 6.5.16.1/3.

> | If the value being stored in an object is read from another object
> | that overlaps in any way the storage of the first object, then the
> | overlap shall be exact and the two objects shall have qualified or
> | unqualified versions of a compatible type; otherwise, the behavior is
> | undefined.
>
> The objects x.c and x.i overlap, but have incompatible types, so on first glance it appears to be UB.

That is correct.

Keith Thompson

Nov 30, 2020, 3:20:09 PM
On first glance, yes, but I think this passage needs to be updated to
reflect its intent.

The RHS of an assignment is not an lvalue. If it starts out as an
lvalue, then lvalue conversion is applied. Logically the value is
retrieved *and then* copied into the destination object.

A simple case like this is not likely to cause problems (in the absence
of aggressive optimization), but we can construct more problematic cases
where the object being copied is arbitrarily large and the overlap might
not be detectable at compile time.
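
As a hedged sketch of what such a case might look like (not taken from the thread): with a struct assignment through pointers, whether the two objects partially overlap can depend on run-time pointer values. The names struct big and copy_maybe_overlapping, and the 4096-byte size, are hypothetical. memmove is swapped in for the assignment because it is specified to tolerate overlapping regions.

```c
#include <string.h>

/* Hypothetical large object type; the size is arbitrary. */
struct big { unsigned char data[4096]; };

/* If *dst and *src overlap only partially, the plain assignment
 *     *dst = *src;
 * would run into the "overlap shall be exact" requirement of 6.5.16.1/3.
 * memmove copies as if through a temporary array, so it is usable
 * whether or not the two regions overlap. */
void copy_maybe_overlapping(struct big *dst, const struct big *src)
{
    memmove(dst, src, sizeof *dst);
}
```

A caller that cannot rule out overlap would call copy_maybe_overlapping(p, q) where it would otherwise write *p = *q.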

I've written an answer:
https://stackoverflow.com/a/65080498/827263

--
Keith Thompson (The_Other_Keith) Keith.S.T...@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */

Tim Rentsch

Dec 1, 2020, 5:49:17 AM
Ian Abbott <ijabb...@gmail.com> writes:

> A question (not posted by myself) from https://stackoverflow.com/questions/65077630 :
>
> Consider the following:
>
> union { int i; char c; } x = {0};
> x.c = x.i;
>
> Does the assignment x.c = x.i result in undefined behavior?
>
> C18 6.15.16.1/3 says:

As elsewhere noted, the reference is 6.5.16.1 paragraph 3.

> | If the value being stored in an object is read from another object
> | that overlaps in any way the storage of the first object, then the
> | overlap shall be exact and the two objects shall have qualified or
> | unqualified versions of a compatible type; otherwise, the behavior is
> | undefined.
>
> The objects x.c and x.i overlap, but have incompatible types, so on
> first glance it appears to be UB.

The value to be stored is read from object x.i. This value is
being stored in object x.c.

The object x.i overlaps with object x.c.

The overlap is not exact (probably). The two types involved are
not qualified or unqualified versions of a compatible type.

The assignment has undefined behavior. No doubt about it.
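
The layout facts behind this reading can be checked with a small sketch. The tag overlap_demo and both function names are hypothetical, and "exact overlap" is taken here to mean same starting address and same size, which is an assumption about the wording rather than something the Standard spells out.

```c
#include <stddef.h>

/* Hypothetical named version of the anonymous union from the question. */
union overlap_demo { int i; char c; };

/* Every member of a union starts at offset 0, so x.i and x.c overlap. */
int members_share_start(void)
{
    return offsetof(union overlap_demo, i) == offsetof(union overlap_demo, c);
}

/* Under the assumption above, the overlap is exact only when both
 * members have the same size, i.e. only where sizeof(int) == 1. */
int overlap_is_exact(void)
{
    return sizeof(int) == sizeof(char);
}
```

On typical implementations, where int is wider than one byte, overlap_is_exact() returns 0, which matches the "(probably)" above.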

Tim Rentsch

Dec 1, 2020, 6:39:50 AM
Let me start with where we agree. I agree that 6.5.16.1 p3 deserves
some clarification. After that however my reading and yours reach
different conclusions.

First I think the passage as written clearly conveys the meaning
intended in this case, that this assignment is undefined behavior.
The value being stored was read out of x.i, and is being stored
into x.c. The types of those two objects don't mesh, and so the
assignment is undefined behavior. This assignment is about the
clearest possible case that would violate 6.5.16.1 p3, and the
passage as written conveys that, IMO with no room for argument.

Second I think whether the RHS is an lvalue has no bearing on the
question. The cited paragraph does not mention anything about
lvalues; it speaks only of "the value being stored". Consider a
slight variation:

    x.c = (printf( "hello world\n" ), x.i);

The RHS of this assignment is not an lvalue. But the /value/
being stored was read from x.i. As I read the Standard this
assignment too is undefined behavior, and IMO the Standard does
convey that intention.

Here is another variation:

    x.c = 0 ? printf( "hello world\n" ) : x.i;

Once again the value being stored was read from x.i, and again
as I read the Standard this assignment too is undefined behavior,
and IMO the Standard does convey that intention.

Now let's look at an example you give in your stackoverflow answer.
Quoting a whole paragraph:

    The passage says that the value "is read from another
    object". That's ambiguous. Must name of the object be the
    entire RHS expression, or can it be just a subexpression?
    If the latter, then x.c = x.i + 1; would have undefined
    behavior, which in my opinion would be absurd.

You left out an important part: "the value /being stored/".
Here the value being stored is x.i + 1. That /value/ was not
read from x.i; it was formed by an addition operation after
reading x.i. Ostensibly this assignment would not be undefined
behavior. I say "ostensibly" because actually I think this case
is not clear, and may have been intended to be undefined behavior
along with the others. And I don't see anything absurd about
having it be undefined behavior, or even surprising if it were
discovered that this case had been meant to be undefined behavior
all along.

Disclaimer: I haven't yet done any research into Defect Reports
or committee meeting notes to see if this question has been
addressed there. I suspect it has, but I just haven't looked
yet.

Francis Glassborow

Dec 5, 2020, 9:27:52 AM
if
    x.c = x.i + 1;

has defined behaviour then

    x.c = x.i + 0;

also has defined behaviour.

So IMO, consistency requires that we tighten the requirement so that

x.c = any expression that uses x.i;

has undefined behaviour.

However I still have reservations because:

    {int const i = x.i; x.c = i;}

is surely OK. However any optimising compiler will concatenate that to

    x.c = x.i;

Note the braces limit the scope of i.

Francis

Tim Rentsch

Jul 10, 2021, 11:42:13 AM
(see note given below)

>> Now let's look at an example you give in your stackoverflow answer.
>> Quoting a whole paragraph:
>>
>> The passage says that the value "is read from another
>> object". That's ambiguous. Must name of the object be the
>> entire RHS expression, or can it be just a subexpression?
>> If the latter, then x.c = x.i + 1; would have undefined
>> behavior, which in my opinion would be absurd.

My apologies for taking so long to respond to your comments.

> if
> x.c = x.i + 1;
>
> has defined behaviour then
>
> x.c = x.i + 0;
>
> also has defined behaviour.

Yes. It does, and it should.

> So IMO, consistency requires that we tighten the requirement so that
>
> x.c = any expression that uses x.i;
>
> has undefined behaviour.

Surely that is not the intent. It is only when the value being
stored is _read directly_ from an overlapping object, and is
not instead _formed as the result of a computation using_ an
overlapping object, that undefined behavior occurs.

> However I still have reservations because:
>
> {int const i = x.i; x.c = i;}
>
> is surely OK. However any optimising compiler will concatenate that to
>
> x.c = x.i;
>
> Note the braces limit the scope of i.

What an optimizing compiler might do has no effect on the program's
semantics, which is defined in terms of an abstract machine where
no optimizations occur. Each compiler has a responsibility to
ensure its optimizer faithfully preserves the program semantics
that the Standard requires, which in this case has defined behavior
because the value being stored is read out of 'i' and not out of
'x.i'.
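
The temporary-variable form discussed above can be written out as a small function. This is only a sketch: the tag int_or_char and the name store_via_temporary are hypothetical, and the test values are assumed to fit in a char.

```c
/* Hypothetical named version of the union from the thread. */
union int_or_char { int i; char c; };

/* Stores the value of x.i into x.c by way of a separate object i,
 * so the value being stored is read out of i, which does not
 * overlap x.c. */
char store_via_temporary(int value)
{
    union int_or_char x = { value };   /* initializes the first member, x.i */
    {
        int const i = x.i;  /* copy into a distinct, non-overlapping object */
        x.c = i;            /* the value stored is read from i, not from x.i */
    }
    return x.c;
}
```

For example, store_via_temporary(65) returns 65, whatever an optimizer does with the intermediate object.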


(Note for the "see below" remark: I'm willing to be convinced that
the example

    x.c = 0 ? printf( "hello world\n" ) : x.i;

has defined behavior, because the value being stored is the result
of a ?: operator, and thus not just a direct read out of x.i. This
argument can be seen more clearly if we consider

    x.c = 0 ? 0ULL : x.i;

because the value of the RHS clearly is not the same as what is
read out of x.i, which is an int. Similarly

    x.c = (unsigned long long) x.i;

has defined behavior, because the value being stored is the result
of a conversion operation, not the value read from x.i.)
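
The type claim in the note can be checked with C11 _Generic, whose operand is not evaluated. This sketch inspects only the types of the right-hand sides and performs none of the assignments in question; the tag ic, the macro TYPE_CODE and the function names are hypothetical.

```c
/* Hypothetical named version of the union from the thread. */
union ic { int i; char c; };

/* Maps the type of an expression (after lvalue conversion) to a code.
 * The operand of _Generic is not evaluated. Requires C11 or later. */
#define TYPE_CODE(e) _Generic((e), int: 1, unsigned long long: 2, default: 0)

static union ic x = {0};

/* A plain read of x.i has type int. */
int type_of_plain_read(void)  { return TYPE_CODE(x.i); }

/* The usual arithmetic conversions make the ?: result unsigned long long. */
int type_of_conditional(void) { return TYPE_CODE(0 ? 0ULL : x.i); }

/* The cast yields an unsigned long long value. */
int type_of_cast(void)        { return TYPE_CODE((unsigned long long) x.i); }
```

The first function returns 1 and the other two return 2, which agrees with the observation that the value of those right-hand sides is not simply the int read out of x.i.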
