A proposal for C++: bit_cast.


Nicholas Chapman

Oct 6, 2015, 3:57:36 PM
to ISO C++ Standard - Future Proposals
Hi all,
I'd like to propose a bit_cast function.

bit_cast<T>(x) would get the bits of x, and reinterpret them as the bits of a value of type T.

For example, you would be able to write

float x = 1.0f;
const uint32_t y = bit_cast<uint32_t>(x);

And then y would have the value 0x3f800000.

I have written some details about it on my blog here:

http://www.forwardscattering.org/post/27

I won't copy and paste the text here unless anyone requests it.

Thanks,
  Nick C.

Thiago Macieira

Oct 6, 2015, 4:53:26 PM
to std-pr...@isocpp.org
On Tuesday 06 October 2015 12:57:36 'Nicholas Chapman' via ISO C++ Standard -
Future Proposals wrote:
> Hi all,
> I'd like to propose a bit_cast function.
>
> bit_cast<T>(x) would get the bits of x, and reinterpret them as the bits of
> a value of type T.

Makes sense; it's a simple yet powerful library addition. As you've explained
in your blog, the need comes up quite often, especially in cases like your
example, which deals with floating-point values. I had to write such code
last week, because the math.h/cmath functions generated (much) worse code
than necessary, and my solution was memcpy (GCC is smart enough to optimise
memcpy+bswap to one assembly instruction).

The most common implementation will be the memcpy case, though compilers may
implement an intrinsic to do the actual operation if they wish.
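
As a rough illustration of the memcpy-plus-byte-swap pattern mentioned above, here is a minimal sketch that decodes a big-endian float from a byte buffer. The helper name load_be_float is hypothetical, the byte swap is written with portable shifts rather than a compiler intrinsic, and the code assumes a 32-bit IEEE-754 float that shares the host's integer byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: decode a big-endian IEEE-754 binary32 value from a
// 4-byte buffer. Assumes float is 32 bits wide on the host.
inline float load_be_float(const unsigned char* bytes)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "sketch assumes a 32-bit float");
    // Assemble the integer with shifts so the result does not depend on the
    // host byte order.
    const std::uint32_t bits = (std::uint32_t(bytes[0]) << 24) |
                               (std::uint32_t(bytes[1]) << 16) |
                               (std::uint32_t(bytes[2]) << 8) |
                               std::uint32_t(bytes[3]);
    float result;
    std::memcpy(&result, &bits, sizeof result);  // copy the object representation
    return result;
}

inline void load_be_float_example()
{
    const unsigned char wire[4] = {0x3f, 0x80, 0x00, 0x00};
    assert(load_be_float(wire) == 1.0f);
}
```

Whether a given compiler folds this into a single load-and-swap instruction depends on the compiler and the optimisation level.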

> For example, you would be able to write
>
> float x = 1.0f;
> const uint32_t y = bit_cast<uint32_t>(x);
>
> And then y would have the value 0x3f800000.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358

Arthur O'Dwyer

Oct 7, 2015, 2:28:46 AM
to ISO C++ Standard - Future Proposals
On Tuesday, October 6, 2015 at 12:57:36 PM UTC-7, Nicholas Chapman wrote:
> Hi all,
> I'd like to propose a bit_cast function.
>
> bit_cast<T>(x) would get the bits of x, and reinterpret them as the bits of a value of type T.
>
> For example, you would be able to write
>
> float x = 1.0f;
> const uint32_t y = bit_cast<uint32_t>(x);
>
> And then y would have the value 0x3f800000.

Do you want exactly this?

template<class T, class U>
T bit_cast(U u)
{
    static_assert(sizeof(T) == sizeof(U));
    T result;
    memcpy(&result, &u, sizeof u);
    return result;
}

If so, I think the C++ language doesn't provide it because it's easy to write on your own.
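
For anyone who wants to try the function above, here is a self-contained sketch of it with the headers it needs and a message on the static_assert (required before C++17). The name memcpy_bit_cast is hypothetical, chosen only to avoid clashing with any library facility called bit_cast, and the expected bit pattern assumes an IEEE-754 binary32 float.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// memcpy_bit_cast is a hypothetical name for the function sketched above.
template <class T, class U>
T memcpy_bit_cast(U u)
{
    static_assert(sizeof(T) == sizeof(U), "source and target sizes must match");
    T result;
    std::memcpy(&result, &u, sizeof u);  // copy the bytes of u into result
    return result;
}

inline void memcpy_bit_cast_example()
{
    const float x = 1.0f;
    const std::uint32_t y = memcpy_bit_cast<std::uint32_t>(x);
    assert(y == 0x3f800000u);                   // IEEE-754 binary32 assumed
    assert(memcpy_bit_cast<float>(y) == 1.0f);  // round trip
}
```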

The reddit discussion points out that one disadvantage of the "easy to write" version is that it's not constexpr.
But is that an argument in favor of bit_cast<>, or an argument in favor of constexpr memcpy?  (It's not a crazy idea. We already have constexpr for-loops and constexpr assignments; the problem is that we currently lack a constexpr way to talk about pointers and addresses.)

–Arthur

David Krauss

Oct 7, 2015, 4:01:53 AM
to std-pr...@isocpp.org

On 2015–10–07, at 2:28 PM, Arthur O'Dwyer <arthur....@gmail.com> wrote:

> The reddit discussion points out that one disadvantage of the "easy to write" version is that it's not constexpr.
> But is that an argument in favor of bit_cast<>, or an argument in favor of constexpr memcpy?  (It's not a crazy idea. We already have constexpr for-loops and constexpr assignments; the problem is that we currently lack a constexpr way to talk about pointers and addresses.)

We already have constexpr trivially-copyable unions and constexpr pointer arithmetic (within an array). What constexpr lacks is a way to access object representations, or any implementation-specific aspects of execution at all. It executes the portable parts of the language but otherwise it’s not a target machine emulator.

If constexpr gains anything in this direction, it should be the primitive static_cast<obj*>(void_ptr) operation. From there, you can get reinterpret_cast<char*>, memcpy, and whatever is deemed appropriate.

Nicholas Chapman

Oct 7, 2015, 5:07:57 PM
to ISO C++ Standard - Future Proposals
On Wednesday, October 7, 2015 at 7:28:46 AM UTC+1, Arthur O'Dwyer wrote:
> On Tuesday, October 6, 2015 at 12:57:36 PM UTC-7, Nicholas Chapman wrote:
> > Hi all,
> > I'd like to propose a bit_cast function.
> >
> > bit_cast<T>(x) would get the bits of x, and reinterpret them as the bits of a value of type T.
> >
> > For example, you would be able to write
> >
> > float x = 1.0f;
> > const uint32_t y = bit_cast<uint32_t>(x);
> >
> > And then y would have the value 0x3f800000.
>
> Do you want exactly this?
>
> template<class T, class U>
> T bit_cast(U u)
> {
>     static_assert(sizeof(T) == sizeof(U));
>     T result;
>     memcpy(&result, &u, sizeof u);
>     return result;
> }
>
> If so, I think the C++ language doesn't provide it because it's easy to write on your own.

Yes, basically I want exactly that.

Christopher Horvath

Oct 7, 2015, 5:21:46 PM
to std-pr...@isocpp.org


> > template<class T, class U>
> > T bit_cast(U u)
> > {
> >     static_assert(sizeof(T) == sizeof(U));
> >     T result;
> >     memcpy(&result, &u, sizeof u);
> >     return result;
> > }
> >
> > If so, I think the C++ language doesn't provide it because it's easy to write on your own.
>
> Yes, basically I want exactly that.
 


I think there's a big difference between "easy to write on your own" and "easy to know that this is the right way to do the thing".  I think having bit_cast would be tremendously valuable as a certification of this being the right thing to do, or the right way to do it.
 

Nicholas Chapman

Oct 7, 2015, 5:39:17 PM
to ISO C++ Standard - Future Proposals
Yes, exactly.  I would like the language to codify the proper and correct way to bit cast.  I want to know that the code I write will be efficient, and that it won't be declared undefined 5 years from now.
 

Matt Calabrese

Oct 7, 2015, 5:48:46 PM
to ISO C++ Standard - Future Proposals
On Wed, Oct 7, 2015 at 2:21 PM, Christopher Horvath <black...@gmail.com> wrote:
> I think there's a big difference between "easy to write on your own" and "easy to know that this is the right way to do the thing".  I think having bit_cast would be tremendously valuable as a certification of this being the right thing to do, or the right way to do it.

+1

Even for things that are otherwise trivial (or seemingly trivial), it's useful to have such a function. std::max and std::min are pretty trivial, for example, but the functions are still useful... even though it's generally acknowledged that we messed up their specification in the standard. It's easy to underestimate the value of simple functions.

Nicholas Chapman

Oct 7, 2015, 5:57:08 PM
to ISO C++ Standard - Future Proposals
As someone mentioned on the reddit thread, another possibility is to change the definition of reinterpret_cast to support the kind of stuff bit_cast would do: https://www.reddit.com/r/programming/comments/3nmoo7/a_proposal_for_c_bit_cast/cvpocfe
I don't know enough about the original design and intention of the C++ casts to say if this is a good idea or not.
Does someone want to chime in on this?
Why doesn't reinterpret_cast work like this in the first place?

Christopher Horvath

Oct 7, 2015, 6:03:21 PM
to std-pr...@isocpp.org
I'm embarrassed to say that prior to this discussion, that's _exactly_ what I thought reinterpret_cast already did! 
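
For readers who shared that assumption, the following sketch contrasts the casts with a byte copy. It is only an illustration: the function name is hypothetical, the reinterpret_cast forms are described in comments rather than compiled, and the expected bit pattern assumes an IEEE-754 binary32 float.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

inline void cast_comparison_example()
{
    const float x = 1.0f;

    // static_cast converts the value: 1.0f becomes the integer 1.
    const std::uint32_t converted = static_cast<std::uint32_t>(x);
    assert(converted == 1u);

    // reinterpret_cast<std::uint32_t>(x) does not compile: reinterpret_cast
    // cannot convert a float value to an integer type.
    // *reinterpret_cast<const std::uint32_t*>(&x) compiles, but reading a
    // float object through a uint32_t lvalue breaks the aliasing rules and
    // is undefined behaviour.

    // Copying the bytes is the well-defined route.
    std::uint32_t bits;
    static_assert(sizeof(bits) == sizeof(x), "sketch assumes a 32-bit float");
    std::memcpy(&bits, &x, sizeof(bits));
    assert(bits == 0x3f800000u);  // IEEE-754 binary32 assumed
}
```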
 

--

---
You received this message because you are subscribed to the Google Groups "ISO C++ Standard - Future Proposals" group.
To unsubscribe from this group and stop receiving emails from it, send an email to std-proposal...@isocpp.org.
To post to this group, send email to std-pr...@isocpp.org.
Visit this group at http://groups.google.com/a/isocpp.org/group/std-proposals/.

Farid Mehrabi

Oct 8, 2015, 3:35:46 AM
to std-proposals
#include <cstddef>

template <typename Target>
struct StrCaster {
    // Lets StrCaster<Target> read the private bytes of StrCaster<Other>.
    template <typename> friend struct StrCaster;

    constexpr StrCaster(Target const& src)
        : mData(src) {}

    template <typename Other>
    constexpr StrCaster(StrCaster<Other> const& init)
        : mBytes{}
    {
        static_assert(sizeof(Other) == sizeof(Target), "sizes must match");
        // Note: this reads init.mBytes while init.mData is the active member.
        for (std::size_t i = 0; i < sizeof(Target); ++i)
            mBytes[i] = init.mBytes[i];
    }

    constexpr operator Target() const { return mData; }

private:
    union {
        Target mData;
        char mBytes[sizeof(Target)];
    };
};

template <typename Out, typename In>
constexpr Out bit_cast(In const val)
{
    StrCaster<In> in{val};
    return StrCaster<Out>{in};
}
--
how am I supposed to end the twisted road of  your hair in such a dark night??
unless the candle of your face does shed some light upon my way!!!

Peter Koch Larsen

Oct 8, 2015, 5:38:30 AM
to std-pr...@isocpp.org
This code is undefined behaviour, I believe (accessing elements in a
union that are not the type you put in).

/Peter

Farid Mehrabi

Oct 8, 2015, 10:03:42 AM
to std-proposals
Except for character arrays; this used to be standard some time ago, if I am not mistaken. Of course, I am not certain about constexpr; it is still relatively new to me.

regards,
FM.

Thiago Macieira

Oct 8, 2015, 11:22:36 AM
to std-pr...@isocpp.org
On Thursday 08 October 2015 17:33:01 Farid Mehrabi wrote:
> Except for character arrays; this used to be standard some time ago, if I
> am not mistaken. Of course, I am not certain about constexpr; it is still
> relatively new to me.

That's a different undefined behaviour's exception. You're thinking of the
strict aliasing rule.

Accessing an inactive member of a union is undefined behaviour for any type,
except the active and the inactive are aggregates, share the same initial
sequence and you're accessing data in that sequence.
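
The exception described above is the common-initial-sequence rule, which the standard states for standard-layout structs held in a union. A minimal sketch follows, with hypothetical types Circle, Square and Shape invented purely for illustration.

```cpp
#include <cassert>

// Hypothetical types for illustration: two standard-layout structs whose
// members have the same types in the same order.
struct Circle { int tag; float radius; };
struct Square { int tag; float side; };

union Shape {
    Circle circle;
    Square square;
};

// Reads the tag through `circle` regardless of which member is active. This
// is permitted because `tag` lies in the common initial sequence.
inline int tag_of(const Shape& s) { return s.circle.tag; }

inline void common_initial_sequence_example()
{
    Shape s;
    s.square = Square{2, 1.5f};  // `square` is now the active member
    assert(tag_of(s) == 2);
}
```

This covers only inspection of the shared leading members. It does not make float-to-integer punning through a union well-defined.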

Myriachan

Oct 8, 2015, 6:48:16 PM
to ISO C++ Standard - Future Proposals
On Thursday, October 8, 2015 at 8:22:36 AM UTC-7, Thiago Macieira wrote:
> Accessing an inactive member of a union is undefined behaviour for any type,
> except the active and the inactive are aggregates, share the same initial
> sequence and you're accessing data in that sequence.


I wish that this were instead that accessing the inactive member of a union is defined as reinterpreting the object representation as the new type.

Melissa

Thiago Macieira

Oct 8, 2015, 7:22:13 PM
to std-pr...@isocpp.org
There was some language in some draft, somewhere, that said exactly that. It
in fact blessed the GCC/Clang/Visual Studio behaviour. It's no longer there.

Edward Catmur

Oct 9, 2015, 9:54:35 AM
to ISO C++ Standard - Future Proposals
On Friday, 9 October 2015 00:22:13 UTC+1, Thiago Macieira wrote:
On Thursday 08 October 2015 15:48:16 Myriachan wrote:
> On Thursday, October 8, 2015 at 8:22:36 AM UTC-7, Thiago Macieira wrote:
> > Accessing an inactive member of a union is undefined behaviour for any
> > type,
> > except the active and the inactive are aggregates, share the same initial
> > sequence and you're accessing data in that sequence.
>
> I wish that this were instead that accessing the inactive member of a union
> is defined as reinterpreting the object representation as the new type.

Could you explain why? Isn't memcpy enough?
 
> There was some language in some draft, somewhere, that said exactly that. It
> in fact blessed the GCC/Clang/Visual Studio behaviour. It's no longer there.

Remember that the permissive behavior is defined in C (6.5.2.3/3, footnote 95). That said, even when compiling C, gcc and clang ignore it except in the most trivial of cases (that is, they apply pointer aliasing rules instead).

Thiago Macieira

Oct 9, 2015, 10:20:07 AM
to std-pr...@isocpp.org
On Friday 09 October 2015 06:54:35 Edward Catmur wrote:
> On Friday, 9 October 2015 00:22:13 UTC+1, Thiago Macieira wrote:
> > > I wish that this were instead that accessing the inactive member of a
> > > union
> > > is defined as reinterpreting the object representation as the new type.
>
> Could you explain why? Isn't memcpy enough?

It relies on the compiler properly optimising it. It doesn't always.

Jared Grubb

Oct 9, 2015, 11:52:07 AM
to ISO C++ Standard - Future Proposals


On Friday, October 9, 2015 at 7:20:07 AM UTC-7, Thiago Macieira wrote:
> On Friday 09 October 2015 06:54:35 Edward Catmur wrote:
> > On Friday, 9 October 2015 00:22:13 UTC+1, Thiago Macieira wrote:
> > > > I wish that this were instead that accessing the inactive member of a
> > > > union
> > > > is defined as reinterpreting the object representation as the new type.
> >
> > Could you explain why? Isn't memcpy enough?
>
> It relies on the compiler properly optimising it. It doesn't always.

I like this.

However, should this algorithm be SFINAE-protected with a std::is_trivially_copyable check (I think this is the correct one)? You could include a std::unsafe_bit_copy to operate without the SFINAE check.
 

David Krauss

Oct 9, 2015, 12:22:59 PM
to std-pr...@isocpp.org

> On 2015–10–09, at 11:52 PM, Jared Grubb <jared...@gmail.com> wrote:
>
> However, should this algorithm be SFINAE-protected with a std::is_trivially_copyable check (I think this is the correct one)? You could include a std::unsafe_bit_copy to operate without the SFINAE check.


What’s the significance of trivial copyability when you’re reinterpreting the bits as some other type?

Jared Grubb

Oct 9, 2015, 1:30:30 PM
to ISO C++ Standard - Future Proposals

I believe that both the source and destination types must be trivially-copyable in order for a memcpy operation between them to have any safe meaning.

Although I'm not 100% certain that I'm picking the right trait here. Maybe "is_pod" is the right SFINAE check here? Maybe it's a combination of a couple? My point is that you need some SFINAE-check on this algorithm to make it safe. I am having trouble thinking of an example where this would prohibit something that would work otherwise (and in that case, add an unsafe version but provide a safe one to double-check what programmers are attempting).

Edward Catmur

Oct 9, 2015, 2:24:44 PM
to ISO C++ Standard - Future Proposals
On Friday, 9 October 2015 18:30:30 UTC+1, Jared Grubb wrote:


> On Friday, October 9, 2015 at 9:22:59 AM UTC-7, David Krauss wrote:
> > > On 2015–10–09, at 11:52 PM, Jared Grubb <jared...@gmail.com> wrote:
> > >
> > > However, should this algorithm be SFINAE-protected with a std::is_trivially_copyable check (I think this is the correct one)? You could include a std::unsafe_bit_copy to operate without the SFINAE check.
> >
> > What’s the significance of trivial copyability when you’re reinterpreting the bits as some other type?
>
> I believe that both the source and destination types must be trivially-copyable in order for a memcpy operation between them to have any safe meaning.

Yes; see [basic.types]/2-4.
 
> Although I'm not 100% certain that I'm picking the right trait here. Maybe "is_pod" is the right SFINAE check here? Maybe it's a combination of a couple? My point is that you need some SFINAE-check on this algorithm to make it safe. I am having trouble thinking of an example where this would prohibit something that would work otherwise (and in that case, add an unsafe version but provide a safe one to double-check what programmers are attempting).

For the obvious implementation to be valid, you'd need is_trivial (which implies is_trivially_copyable) on the destination type. That still doesn't guarantee that the result won't be UB; if the destination type is or contains a primitive other than an unsigned narrow character type the result could trap. There are also cv qualifiers and reference data members to consider (standard-layout precludes the latter, but not the former).
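
One way the constraints discussed above might be expressed is sketched below. The name checked_bit_cast is hypothetical, and the requirement that the destination be trivial is approximated here by combining is_trivially_copyable with is_trivially_default_constructible. The expected bit pattern assumes an IEEE-754 binary32 float.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// checked_bit_cast is a hypothetical name. It drops out of overload
// resolution unless the sizes match, Source is trivially copyable, and
// Target is trivially copyable and trivially default-constructible.
template <class Target, class Source,
          typename std::enable_if<
              sizeof(Target) == sizeof(Source) &&
              std::is_trivially_copyable<Source>::value &&
              std::is_trivially_copyable<Target>::value &&
              std::is_trivially_default_constructible<Target>::value,
              int>::type = 0>
Target checked_bit_cast(const Source& src)
{
    Target result;
    std::memcpy(&result, &src, sizeof result);
    return result;
}

inline void checked_bit_cast_example()
{
    assert(checked_bit_cast<std::uint32_t>(1.0f) == 0x3f800000u);
    // checked_bit_cast<std::uint64_t>(1.0f) would not compile: on typical
    // platforms the sizes differ, so no viable overload exists.
}
```

As the reply above points out, type traits cannot rule out trap representations in the destination type, so a check like this narrows the misuse cases without guaranteeing a defined result.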

Nicol Bolas

Oct 9, 2015, 3:47:23 PM
to ISO C++ Standard - Future Proposals

I'm pretty sure `is_pod` is more strict than is absolutely necessary.

Trivial copyability is needed for `memcpy` to actually produce defined results. However, trivial copyability alone only allows copying between two objects of the same type. If you want to memcpy between two objects of different types, then in order to get defined behavior, the types must be layout compatible.

Of course, the OP clearly does not care about what is and is not well-defined. Binary int-to-float conversions do not produce standard-defined behavior. So if the goal is to make bit_copy<int, float> actually work, you don't care about what is technically legal C++.

I liked the idea above about having `bit_copy` and `unsafe_bit_copy`. The former requires layout compatibility between the types, and the latter doesn't care.

Then again, I'm having trouble coming up with reasons when I would need to use the safe `bit_copy`. I guess it would be a better description of what you intend, since you'd use that rather than raw `memcpy` for copying compatible types.

Myriachan

Oct 12, 2015, 4:16:10 PM
to ISO C++ Standard - Future Proposals
On Friday, October 9, 2015 at 12:47:23 PM UTC-7, Nicol Bolas wrote:

> I'm pretty sure `is_pod` is more strict than is absolutely necessary.
>
> Trivial copyability is needed for `memcpy` to actually produce defined results. However, trivial copyability alone only allows copying between two objects of the same type. If you want to memcpy between two objects of different types, then in order to get defined behavior, the types must be layout compatible.
>
> Of course, the OP clearly does not care about what is and is not well-defined. Binary int-to-float conversions do not produce standard-defined behavior. So if the goal is to make bit_copy<int, float> actually work, you don't care about what is technically legal C++.


What's wrong with defining bit_cast/bit_copy such that the results of such a cast are implementation-defined, with a non-normative note similar to that of [expr.reinterpret.cast]/4: "A pointer can be explicitly converted to any integral type large enough to hold it.  The mapping function is implementation-defined.  [Note: It is intended to be unsurprising to those who know the addressing structure of the underlying machine. -- end note]"?  That is, the result of bit_cast would be implementation-defined, but its result should be unsurprising to those with knowledge of the type of the machine.  bit_casting std::uint32_t(0x3F800000) to float on an x86 implementation and getting 1.0f would be such an "unsurprising result" for those knowing that the machine is an x86.

The Standard can change to define whatever the Committee feels necessary.  Saying that something is undefined behavior as a reason to dismiss a proposal is circular logic to me.

Melissa

Edward Catmur

Oct 12, 2015, 5:20:27 PM
to std-pr...@isocpp.org

The standard can't ensure every bit_cast is defined, not even when the destination type is trivially copyable; the source value could be the object representation of a trap representation in the target type.

It's still possible to define bit_cast without reference to memcpy, though: something like "If the object representation of the source value contains the value representation of a value of the target type, returns that value; otherwise, the result is undefined." That gets us the expected behavior on x86 without restricting other platforms.


Myriachan

Oct 12, 2015, 7:36:27 PM
to ISO C++ Standard - Future Proposals
On Monday, October 12, 2015 at 2:20:27 PM UTC-7, Edward Catmur wrote:

> The standard can't ensure every bit_cast is defined, not even when the destination type is trivially copyable; the source value could be the object representation of a trap representation in the target type.


That is true for the x86 as well, if floating-point exceptions are enabled and you make a NaN or something via bit_cast.
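
To make the NaN case concrete, here is a small sketch that builds a NaN from an integer bit pattern. The function name is hypothetical and the pattern assumes IEEE-754 binary32. It deliberately uses a quiet NaN, which should not raise an exception under the default floating-point environment; signaling NaN patterns are the ones that can trap once floating-point exceptions are enabled.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical helper: produce a float whose bits are a quiet-NaN pattern.
inline float quiet_nan_from_bits()
{
    const std::uint32_t bits = 0x7fc00000u;  // quiet NaN in IEEE-754 binary32
    float f;
    static_assert(sizeof(f) == sizeof(bits), "sketch assumes a 32-bit float");
    std::memcpy(&f, &bits, sizeof(f));
    return f;
}

inline void quiet_nan_example()
{
    assert(std::isnan(quiet_nan_from_bits()));
}
```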
 

> It's still possible to define bit_cast without reference to memcpy, though: something like "If the object representation of the source value contains the value representation of a value of the target type, returns that value; otherwise, the result is undefined." That gets us the expected behavior on x86 without restricting other platforms.


I really like that wording, though I think you meant "object" in one place you said "value".  How about this sort of wording?

template <class Target, class Source>
Target bit_cast(const Source& src);

1. Requires: Target and Source shall meet the requirements of TriviallyCopyable, and sizeof(Target) shall equal sizeof(Source).  [Note: This means that sizeof(Target) and sizeof(Source) must be defined; i.e., Target and Source cannot be "array of unknown bound of T" for some type T. — end note]
2. Returns: If the object representation of src equals the object representation of some value t of type Target, then t.  Otherwise, the behavior is undefined.  [Note: Where defined, this is equivalent to the effect of std::memcpy. — end note]
3. Complexity: At most O(N), where N has the value sizeof(Target).
4. Remarks: Target and Source need not have the same alignment requirements.
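
A possible reading of this wording as code is sketched below. It is not a reference implementation: the function lives in a hypothetical namespace sketch, the Requires paragraph is enforced with static_assert, and the body additionally assumes that Target is default-constructible, which the wording itself does not require.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

namespace sketch {  // hypothetical namespace for this illustration

template <class Target, class Source>
Target bit_cast(const Source& src)
{
    // Paragraph 1 (Requires), checked at compile time.
    static_assert(std::is_trivially_copyable<Target>::value,
                  "Target must be trivially copyable");
    static_assert(std::is_trivially_copyable<Source>::value,
                  "Source must be trivially copyable");
    static_assert(sizeof(Target) == sizeof(Source),
                  "Target and Source must have the same size");
    // Assumption beyond the wording: Target can be default-constructed.
    Target result;
    std::memcpy(&result, &src, sizeof(Target));  // paragraph 2 note
    return result;
}

}  // namespace sketch

inline void wording_example()
{
    // Paragraph 4: the two types may have different alignment requirements.
    using Bytes = std::array<unsigned char, 4>;
    const std::uint32_t value = 0x3f800000u;
    const Bytes bytes = sketch::bit_cast<Bytes>(value);
    assert(sketch::bit_cast<std::uint32_t>(bytes) == value);  // round trip
}
```

The round trip through a byte array avoids any assumption about byte order or floating-point format, since it only checks that the same bytes come back.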

Edward Catmur

Oct 13, 2015, 10:52:23 AM
to std-pr...@isocpp.org
Yes, that sounds great. I don't think it's actually necessary to constrain Source in any way, only Target; for example, it would be useful (for debugging/pedagogical purposes at least) to be able to write:

struct A { virtual ~A() = default; };
int main() { std::cout << bit_cast<void*>(A{}) << '\n'; }