Type Punning via a Union

Joshua Boyce

Nov 20, 2012, 2:46:14 AM
to std-dis...@isocpp.org
Basically, my question is why C++ doesn't explicitly support type punning via a union like C99/C11 does. It is surprising to me that support for this was not added in C++11 (or am I mistaken?).

Daniel Krügler

Nov 20, 2012, 2:49:44 AM
to std-dis...@isocpp.org
2012/11/20 Joshua Boyce <raptor...@raptorfactor.com>

Basically, my question is why C++ doesn't explicitly support type punning via a union like C99/C11 does. It is surprising to me that support for this was not added in C++11 (or am I mistaken?).

Please be more specific; neither C11 nor C++11 uses the term "type punning" in a normative way. An example would also help a lot here.

- Daniel

Joshua Boyce

Nov 20, 2012, 3:22:51 AM
to std-dis...@isocpp.org
My apologies.

Basically I wish to do something like this:
union Foo
{
  int i;
  float f;
};
Foo foo;
foo.f = 2.0f;
int i = foo.i;
// do something with i

It's my understanding that in C99 this is allowed. I don't have a copy of the 'official' C99 or C11 standard on-hand, but the April 12, 2011 draft[1] contains the same footnote (§6.5.2.3 footnote 95).

If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.

I can find no such clause/footnote in the C++11 standard.


 

Nicol Bolas

Nov 20, 2012, 4:04:57 AM
to std-dis...@isocpp.org
I can't speak for the C++ committee, but here's my feeling on the matter.

C++ objects are more than blocks of bits. They can be living, breathing objects with specific lifetimes. And unions of those constructs must also have these properties.

Introducing and standardizing how type punning works in such an environment is exceedingly dangerous. It also subverts the whole type system; C++ is supposed to be "strongly typed" (relative to C at least). So standardizing something that subverts the type system without a cast (the usual method of type subversion) is... disconcerting.

It could perhaps be done for POD types, but anything more than that would become dangerous. Even standard-layout types can have private members that should not be made accessible without an explicit cast.

FrankHB1989

Nov 20, 2012, 3:41:37 PM
to std-dis...@isocpp.org

It's my understanding that in C99 this is allowed. I don't have a copy of the 'official' C99 or C11 standard on-hand, but the April 12, 2011 draft[1] contains the same footnote (§6.5.2.3 footnote 95).

If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called "type punning"). This might be a trap representation.

I can find no such clause/footnote in the C++11 standard.



There is a footnote concerning it in the C++11 standard, at 5.2.10/11 [expr.reinterpret.cast]:
72) This is sometimes referred to as a type pun.
The term also appears in the index of the standard.
But unions are not mentioned in connection with it.
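
For context, that footnote is attached to the rule for casting a glvalue to a reference type. A minimal sketch of such a pun follows, relying only on the rule that any object may be examined through unsigned char; the helper name first_byte is made up for illustration and does not come from the thread.

```cpp
// Hypothetical helper (not from the thread): reads the first byte of an int's
// object representation. reinterpret_cast<const unsigned char&>(value) has the
// same effect as *reinterpret_cast<const unsigned char*>(&value), and access
// through unsigned char is exempt from the aliasing restrictions.
inline unsigned char first_byte(const int& value) {
    return reinterpret_cast<const unsigned char&>(value);
}
```

Which byte this yields for a given value depends on the implementation's byte order, so the sketch shows the mechanism only, not a portable result.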

Jens Maurer

Nov 20, 2012, 3:54:12 PM
to std-dis...@isocpp.org
On 11/20/2012 09:22 AM, Joshua Boyce wrote:
> My apologies.
>
> Basically I wish to do something like this:
>
> union Foo
> {
> int i;
> float f;
> };
> Foo foo;
> foo.f = 2.0f;
> int i = foo.i;
> // do something with i

It seems that core issue 1116 might be related to that question.
http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#1116

When discussing this the last time around in the Core Working Group,
I think we weren't opposed to the idea that explicit union casts
like the above (where the union is visible to the compiler) should
be allowed for PODs.

Jens

Richard Smith

Nov 20, 2012, 4:09:19 PM
to std-dis...@isocpp.org
On Tue, Nov 20, 2012 at 12:22 AM, Joshua Boyce
<raptor...@raptorfactor.com> wrote:
> On Tue, Nov 20, 2012 at 6:49 PM, Daniel Krügler <daniel....@gmail.com>
> wrote:
>>
>> 2012/11/20 Joshua Boyce <raptor...@raptorfactor.com>
>>>
>>> Basically, my question is why C++ doesn't explicitly support type punning
>>> via a union like C99/C11 does. It is surprising to me that support for this
>>> was not added in C++11 (or am I mistaken?).
>>
>> Please be a more specific, neither C11 nor C++11 use the term "type
>> punning" in normative ways. An example would also help a lot here.
>>
>> - Daniel
>
>
> My apologies.
>
> Basically I wish to do something like this:
>>
>> union Foo
>> {
>> int i;
>> float f;
>> };
>> Foo foo;
>> foo.f = 2.0f;
>> int i = foo.i;
>> // do something with i
>
>
> It's my understanding that in C99 this is allowed. I don't have a copy of
> the 'official' C99 or C11 standard on-hand, but the April 12, 2011 draft[1]
> contains the same footnote (§6.5.2.3 footnote 95).

This is, in my opinion, an underspecified semantic quagmire in C.
Consensus has not been reached between implementors and the C
committee as to exactly which cases have defined behavior and which do
not, and indeed the C standard's rules break the type-based alias
analysis optimizations which current implementations perform.
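
To make the alias-analysis concern concrete, here is a rough sketch of the kind of optimization at stake. The function name store_both is hypothetical, and what a particular compiler actually does with it is an assumption, not something stated in the thread.

```cpp
// Hypothetical function illustrating type-based alias analysis (TBAA).
// Because an int object and a float object may not alias, a compiler is
// allowed to assume the store through fp leaves *ip untouched, and so may
// return the constant 1 without reloading *ip.
int store_both(int* ip, float* fp) {
    *ip = 1;
    *fp = 2.0f;
    return *ip;
}
```

If ip and fp were allowed to point at the two members of the same union, that assumption would no longer hold, which is where union punning and TBAA come into conflict.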

Personally, in order to support such a proposal, I would want to see:
* a precise description of the rules for such an extension (which C
does not have)
* a strong motivation for introducing it (which again is missing,
since the same effect can be produced with an array of unsigned char
and memcpy, and there will be no performance impact in a suitably
high-quality implementation)
* an analysis of the costs (including any loss of TBAA power and
complexity of implementation).
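
As an illustration of the memcpy alternative mentioned in the second point, a minimal sketch could look like the following. The helper names are invented here, and the sketch assumes int and float have the same size, which the static_assert checks.

```cpp
#include <cstring>

// Assumption of this sketch: int and float occupy the same number of bytes.
static_assert(sizeof(int) == sizeof(float),
              "this sketch assumes int and float have the same size");

// Hypothetical helper: copies the object representation of a float into an int
// instead of reading an inactive union member.
inline int float_bits_as_int(float value) {
    int result;
    std::memcpy(&result, &value, sizeof result);
    return result;
}

// Hypothetical helper: the reverse direction.
inline float int_bits_as_float(int bits) {
    float result;
    std::memcpy(&result, &bits, sizeof result);
    return result;
}
```

With this, the original example becomes int i = float_bits_as_int(2.0f); and no union is needed.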

wolfei...@gmail.com

Nov 22, 2012, 5:22:18 PM
to std-dis...@isocpp.org
Copying the data is more expensive than type punning. In my opinion, the simplest way to keep punning from breaking alias analysis is to forbid taking pointers or references to members of such a union (perhaps it would have to be declared as a new kind of union, to avoid breaking legacy code). In C this would be untenable, but I think that in C++ it would not be such a problem, as you could easily create a union_iterator that acts as an iterator of float, now that rvalue results from operator*() are permitted. This would be greatly simplified if the language had better support for proxy objects, such as the various proposals for overriding auto. This should interact just fine with strict aliasing: aliases to the union contents are never created, so strict aliasing is preserved.

For example, consider this hypothetical sample, where I have trimmed some of the boilerplate and named the hypothetical new kind of union "union class", for no actual reason.

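// "union class" is hypothetical syntax for the proposed new kind of union;
// no existing compiler accepts it.
union class pun {
    float f;
    int i;
};
// iterator_facade is an assumed CRTP helper in the spirit of boost::iterator_facade.
template<typename pun_iterator>
class pun_float_iterator : public iterator_facade<pun_float_iterator<pun_iterator>> {
    pun_iterator it;

public:
    pun_float_iterator(pun_iterator x) : it(x) {}
    pun_float_iterator(const pun_float_iterator& other) : it(other.it) {}

    // Proxy returned by operator*(): reads and writes go through the member f,
    // so no pointer or reference to the union member ever escapes.
    struct proxy {
        proxy(pun_iterator x) : it(x) {}
        proxy(const proxy& other) : it(other.it) {}
        pun_iterator it;
        operator float() const { return it->f; }
        proxy& operator=(const float& f) { it->f = f; return *this; }
    };

    proxy operator*() { return proxy(it); }
};
std::vector<pun> x;
// fill x
// make_float_pun_iterator is an assumed helper that wraps an iterator in pun_float_iterator.
std::vector<float> f(make_float_pun_iterator(x.begin()), make_float_pun_iterator(x.end()));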

With compile-time reflection, one could write a generic punning iterator that puns a given member of any such union, which would definitely be smoother but is not required. One could even write a functional iterator, especially with polymorphic lambdas.

As far as type requirements go, I think that requiring POD goes a little too far. What we would really need is that all member types are trivially constructible and trivially destructible, in which case the union class is both as well. If all are trivially movable, then the union class is trivially movable. If all are trivially copyable, then the union class is trivially copyable.
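
These requirements can be roughly expressed with the standard type traits. The sketch below is only an illustration: the trait name all_trivially_punnable is invented, and "trivially constructible" is read here as "trivially default constructible", which is an assumption about the intent.

```cpp
#include <type_traits>

// Hypothetical trait: true when every listed type is trivially default
// constructible and trivially destructible, the proposed minimum for members
// of a punning union. The empty list is vacuously true.
template <typename... Ts>
struct all_trivially_punnable : std::true_type {};

template <typename T, typename... Rest>
struct all_trivially_punnable<T, Rest...>
    : std::integral_constant<bool,
          std::is_trivially_default_constructible<T>::value &&
          std::is_trivially_destructible<T>::value &&
          all_trivially_punnable<Rest...>::value> {};
```

A similar trait built on std::is_trivially_copyable would cover the copyability rule.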

The main problem is trap representations. It's difficult for me to argue that this would result in safe, portable code, if in fact punning objects may result in trap representations killing the program. I mean, there's a difference between punning a pointer, which would be explicitly quite unsafe, and punning an integer and a float. On that note, it may be undesirable to introduce a Standardised type pun for objects like float where there's no Standardised way to determine their representation, as you'd just have to fall back to implementation details to make it work anyway. For example, the fast inverse square root code which employs type punning could not be made portable just because the type pun would be portable.
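
To illustrate the last point, here is a sketch of the well-known fast inverse square root written with memcpy. Even with a well-defined way to get at the bits, the code still has to assert representation details; the function name fast_rsqrt and the choice to enforce the assumptions with static_assert are made up for this example.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Representation assumptions that no standardised pun would remove:
// float must be a 32-bit IEEE 754 (IEC 559) type for the magic constant to work.
static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");
static_assert(std::numeric_limits<float>::is_iec559,
              "this sketch assumes IEEE 754 floats");

// Hypothetical helper: approximates 1/sqrt(x) for positive, finite x.
inline float fast_rsqrt(float x) {
    std::uint32_t bits;
    std::memcpy(&bits, &x, sizeof bits);   // well-defined access to the bits
    bits = 0x5f3759dfu - (bits >> 1);      // initial guess from the bit pattern
    float y;
    std::memcpy(&y, &bits, sizeof y);
    return y * (1.5f - 0.5f * x * y * y);  // one Newton-Raphson refinement step
}
```

The two static_asserts are the part that stays implementation-specific regardless of how the pun itself is spelled.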


Richard Smith

Nov 26, 2012, 1:27:04 AM
to std-dis...@isocpp.org
On Thu, Nov 22, 2012 at 2:22 PM, <wolfei...@gmail.com> wrote:
Copying the data is more expensive than type punning.

If this is true for your implementation, I suggest you file a bug on it. Breaking real optimizations (anything based on type-based alias analysis) in order to work around performance issues with some particular compiler seems like a bad idea to me.

wolfei...@gmail.com

Nov 26, 2012, 5:06:03 AM
to std-dis...@isocpp.org
The changes I proposed would not break any aliasing rules. As for copying being slower than punning, that's a simple fact: when you pun, you don't have to copy, and when you copy, well, you have to copy. It seems quite logical that this should be the case.

Nicol Bolas

Nov 26, 2012, 11:19:29 AM
to std-dis...@isocpp.org, wolfei...@gmail.com


On Monday, November 26, 2012 2:06:04 AM UTC-8, wolfei...@gmail.com wrote:
The changes I proposed would not break any aliasing rules. As for copying being slower than punning, that's a simple fact: when you pun, you don't have to copy, and when you copy, well, you have to copy. It seems quite logical that this should be the case.

It rather depends on what you mean by punning, then. For example, as posted by Joshua:


union Foo
{
  int i;
  float f;
};
Foo foo;
foo.f = 2.0f;
int i = foo.i;
// do something with i

This requires a copy, so it's no slower than just doing the copy directly into `int i`. So what is your use-case that puns without eventually copying that value out?

wolfei...@gmail.com

Nov 26, 2012, 11:21:26 AM
to std-dis...@isocpp.org, wolfei...@gmail.com
There is nothing stopping you from referencing foo.i directly. You just can't alias it.