
reinterpret_cast<int&>( int* ) -- Odd behavior


Hak...@gmail.com

Apr 1, 2009, 5:53:11 PM
The following code produces output I would not expect:


#include <iostream>
using namespace std;

int main()
{
    int x = 5;
    int* y = &x;
    int& z = reinterpret_cast<int&>( y );

    cout << hex;
    cout << &y << " 0x" << z << '\n';
}

When compiled with g++, the output shows that &y and z are the same.
What I might expect is z to be 5. This is not totally off-putting
since it does exactly what it says, reinterprets the int* to an int&,
but I can't see how this is a good thing. g++ even emits a warning
about it: warning: casting ‘int*’ to ‘int&’ does not dereference
pointer. Well, yes, that is exactly the problem.

The real-world code that actually gave me a problem was trying to use
C++ and POSIX threads. I had a buffer type that I passed in as the
argument to my threadable function but I wanted to use it like a
reference. I quickly popped out this:

template< typename T >
void* f( void* _buf )
{
    Buf<T>& buf = reinterpret_cast<Buf<T>&>( _buf );
    ....
}

The error was not obvious for several reasons: I expected the cast to
somehow reference the buffer, and it did not; many other parts of the
function could easily have caused the problem (a segmentation fault);
and g++ did not emit a warning. In fact, g++ emitted the warning ONLY
for int, not even for unsigned int.

Obviously, since g++ did note the problem, I know there has been much
discussion of it, and if it were considered a problem in the
language, there'd already be a proposal for it. So, a better question
is: what is a good workaround that's not completely unintuitive and
won't require a huge comment? What is a good justification for this
behavior? Is it a bug or problem in g++ that it does not warn against
this behavior except in the case of int* to int&?


--
[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]

marcin...@gmail.com

Apr 1, 2009, 9:16:39 PM
On 1 Apr, 23:53, "Hak...@gmail.com" <Hak...@gmail.com> wrote:
> The following code produces output I would not expect:
>
> #include <iostream>
> using namespace std;
>
> int main()
> {
> int x = 5;
> int* y = &x;
> int& z = reinterpret_cast<int&>( y );
>
> cout << hex;
> cout << &y << " 0x" << z << '\n';
>
> }
>
> When compiled with g++, the output shows that &y and z are the same.
> What I might expect is z to be 5.

What you expected could be achieved in a straightforward way:

int& z = *y;

> The real-world code that actually gave me a problem was trying to use
> C++ and POSIX threads. I had a buffer type that I passed in as the
> argument to my threadable function but I wanted to use it like a
> reference. I quickly popped out this:
>
> template< typename T >
> void* f( void* _buf )
> {
> Buf<T>& buf = reinterpret_cast<Buf<T>&>( _buf );
> ....
>
> }
>

If _buf is a pointer to Buf<T> object then the proper
code could look like this:

Buf<T>& buf = *static_cast<Buf<T>*>(_buf);
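
As a minimal sketch of that pattern: a pthread-style entry point can recover the reference by casting the void* back to the exact pointer type it came from and then dereferencing. Buf below is a hypothetical stand-in for the poster's buffer type, and fill and run_fill_directly are illustrative names that do not appear in the original code. The entry point is called directly here so the example does not depend on a threading library.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the poster's Buf<T>; the real type is not shown
// in the thread.
template <typename T>
struct Buf {
    std::vector<T> items;
};

// Entry point with the shape a pthread start routine needs:
// it takes a void* and returns a void*.
template <typename T>
void* fill(void* arg)
{
    // Cast back to the exact pointer type that was converted to void*,
    // then dereference to obtain a reference to the same object.
    Buf<T>& buf = *static_cast<Buf<T>*>(arg);
    buf.items.push_back(T(42));
    return arg;
}

// Illustrative helper: erases the type the way a thread API would and then
// calls the entry point directly.
int run_fill_directly()
{
    Buf<int> b;
    void* erased = &b;   // implicit conversion Buf<int>* -> void*
    fill<int>(erased);
    return b.items.back();
}
```

The point of the design is that the void* is only ever converted back to the type it was created from, which is the one conversion static_cast guarantees here.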

I think that this behavior of reinterpret_cast is supposed
to ensure that:

static_cast<T&>(obj) == reinterpret_cast<T&>(obj)

which is rather intuitive when static_cast is applicable.

Cheers
Sfider

Martin T.

Apr 2, 2009, 3:52:21 PM
Hak...@gmail.com wrote:
> ... I quickly popped out this:

>
> template< typename T >
> void* f( void* _buf )
> {
> Buf<T>& buf = reinterpret_cast<Buf<T>&>( _buf );
> ....
> }
>
> The error was not obvious for several reasons: ... So, a better question

> is what is a good work-a-round that's not completely unintuitive and
> won't require a huge comment? ...
>

I may be missing the point, but why on earth use reinterpret_cast in the
first place ??

void* pBuf;
Buf<T>& buf = *static_cast<Buf<T>*>(pBuf);


br,
Martin

David Turner

Apr 2, 2009, 3:52:47 PM
On Apr 1, 11:53 pm, "Hak...@gmail.com" <Hak...@gmail.com> wrote:
> ...

> When compiled with g++, the output shows that &y and z are the same.
> What I might expect is z to be 5. This is not totally off-putting
> since it does exactly what it says, reinterprets the int* to an int&,
> but I can't see how this is a good thing. g++ even emits a warning
> about it: warning: casting ‘int*’ to ‘int&’ does not dereference
> pointer. Well, yes, that is exactly the problem.
> ...

You said it yourself. An int* needs to be dereferenced to become an
int&. I think the issue here is the interpretation of int& as a
"pseudo-pointer". It's not. It's an int-by-reference. Without
quoting chapter and verse, I'll ask you to consider that many language
constructs (copy constructors I'm thinking of in particular) wouldn't
work correctly without something that "looks like" T but is actually a
reference to a previously existing T. That's the justification for
the distinction between int* and int&, although I fully agree that the
transparency of the reference is somewhat unsettling.

As for a better approach to your problem... what's wrong with passing
a Buf<T>*?

Finally, when something is cast to a void*, as they say, all bets are
off. How is g++ to determine statically that your reinterpret_cast is
incorrect? For myself, I would expect reinterpret_cast<T&>((void*)x)
to be undefined behaviour for all T, but that's just because I find
the notion of dereference-by-cast to be obnoxious. But who knows, it
might make sense.

In short, casting a pointer-to-void to a reference-to-something doesn't
make sense, because a reference ain't a pointer. Pass a pointer, cast
to a pointer, and then dereference if you must.

Regards
David Turner

blargg

Apr 4, 2009, 12:05:49 AM
marcin...@gmail.com wrote:
> On 1 Apr, 23:53, "Hak...@gmail.com" <Hak...@gmail.com> wrote:
[...]

> > template< typename T >
> > void* f( void* _buf )
> > {
> > Buf<T>& buf = reinterpret_cast<Buf<T>&>( _buf );
> > ....
> >
> > }
> >
> If _buf is a pointer to Buf<T> object then the proper
> code could look like this:
>
> Buf<T>& buf = *static_cast<Buf<T>*>(_buf);
>
> I think that this behavior of reinterpret_cast is supposed
> to ensure that:
>
> static_cast<T&>(obj) == reinterpret_cast<T&>(obj)

This depends on the object's operator ==, since that's what it invokes.
Did you mean to compare the identities of the objects casted to?

&static_cast<T&>(obj) == &reinterpret_cast<T&>(obj)

People seem to be getting confused with casts to a reference type.
Something like

reinterpret_cast<T&> (obj)

is nearly equivalent to

(*reinterpret_cast<T*> (&obj))

if that helps reason more clearly about it.
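
A small sketch of that equivalence, using an illustrative function name that is not from the thread. It compares addresses only and never reads through the reinterpreted reference, and it targets unsigned char rather than int so that no aliasing question comes up.

```cpp
#include <cassert>

// Checks that casting an lvalue to a reference type designates the same
// storage as casting its address to the corresponding pointer type.
bool reference_cast_matches_pointer_cast()
{
    int x = 5;
    int* y = &x;

    // Both casts refer to the storage of the pointer object y itself,
    // not to the int that y points at.
    unsigned char& as_ref = reinterpret_cast<unsigned char&>( y );
    unsigned char* as_ptr = reinterpret_cast<unsigned char*>( &y );

    return &as_ref == as_ptr
        && static_cast<void*>( as_ptr ) == static_cast<void*>( &y );
}
```

This is why the original program printed the address of y: the reference named the pointer object, so reading it as an int read the pointer's own bytes.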

joshua...@gmail.com

Apr 4, 2009, 3:52:37 AM
On Apr 3, 9:05 pm, blargg <blargg....@gishpuppy.com> wrote:
> People seem to be getting confused with casts to a reference type.
> Something like
>
> reinterpret_cast<T&> (obj)
>
> is nearly equivalent to
>
> (*reinterpret_cast<T*> (&obj))
>
> if that helps reason more clearly about it.

No. Just no. No at this entire thread.

That may be how it's implemented on some systems, and perhaps it is
interesting to try for fun, but to write any sort of real code, do not
do this. In addition to being not portable and otherwise broken by the
standard, do not do this because there is absolutely no reason to use
reinterpret_cast here. Use static_cast instead, (and only cast from
void* to T* where T* is the exact original type of the pointer casted
to void*).

Let me again emphasize that for pretty much all code, if you're
writing a reinterpret_cast, or a C-style cast which cannot be replaced
with a static_cast, then your code probably has undefined behavior.
reinterpret_cast has no defined behavior. It's a myth that
reinterpret_cast is a portable construct with certain defined behavior
according to the C++ standard.

(Yes, casting to char* and unsigned char* is the exception. Casting
back to any other type is not. If you don't know what this exception
is, pretend I didn't say anything.)

Also, never use C-style casts in C++ code. Sometimes the C-style cast
is equivalent to a reinterpret_cast, and thus it has all of the
undefined baggage to go with it. The C-style cast's behavior changes
if the full definitions of the types are in scope or just forward
declarations. For example, someone adding or removing a header from a
header from a header which your cpp file includes can \silently\
change the behavior of your C-style cast and break your code. No
errors. No warnings. Nothing.
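
A sketch of how that can happen, with hypothetical types A, B and C and illustrative function names. The first function sees only forward declarations, so its C-style cast cannot know about the inheritance relationship; the standard leaves it unspecified which interpretation is used there, and common compilers treat it as a reinterpret_cast. The second function sees the full definitions, so the same spelling behaves as a static_cast and adjusts the pointer to the B subobject.

```cpp
#include <cassert>

struct A;
struct B;
struct C;

// Only forward declarations are visible: the cast cannot adjust the pointer,
// because the compiler does not know where (or whether) B sits inside C.
B* cast_before_definitions( C* c ) { return (B*)c; }

struct A { int a; };
struct B { int b; };
struct C : A, B { int c; };

// Full definitions are visible: the same C-style cast now acts as a
// static_cast and yields the address of the B base subobject.
B* cast_after_definitions( C* c ) { return (B*)c; }

// True when the cast made with complete types agrees with static_cast.
bool complete_cast_is_static_cast()
{
    C obj;
    return cast_after_definitions( &obj ) == static_cast<B*>( &obj );
}

// Reports whether the two spellings produced different addresses.
// The result depends on the implementation, so it is not asserted on.
bool casts_disagree()
{
    C obj;
    return cast_before_definitions( &obj ) != cast_after_definitions( &obj );
}
```

Writing static_cast<B*>(c) in the first function would turn the silent difference into a compile error, which is the argument for the named casts.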


Finally, if possible, introduce an incomplete type through forward
declaration to get some type safety instead of passing around void*,
though it sounds like an external interface is forcing void* onto you,
which is unfortunate, and possibly unavoidable.


The op's example was:

> template< typename T >
> void* f( void* _buf )
> {
> Buf<T>& buf = reinterpret_cast<Buf<T>&>( _buf );

> // ....
> }

It should be:

/* Don't use names starting with underscores. Double underscore
names are reserved for the compiler. Underscore capital-letter names
are also reserved for the compiler. */
/* Call this with the exact right template type, not a base class
or super class, otherwise you have undefined behavior. */
template< typename T >
void* f( void* buf_ )
{
    Buf<T>& buf = * static_cast<Buf<T>*>(buf_);
    // ....
}

Hakusa

Apr 5, 2009, 5:30:56 AM
On Apr 4, 3:52 am, joshuamaur...@gmail.com wrote:

> Finally, if possible, introduce an incomplete type through forward
> declaration to get some type safety instead of passing around void*,
> though it sounds like an external interface is forcing void* onto you,
> which is unfortunate, and possibly unavoidable.

That is correct. However, I plan to use C++ OOP techniques to make it
incredibly safe and intuitive. Thus, it's not really unavoidable.

> The op's example was:
>
> > template< typename T >
> > void* f( void* _buf )
> > {
> > Buf<T>& buf = reinterpret_cast<Buf<T>&>( _buf );
> > // ....
> > }
>
> It should be:
>
> /* Don't use names starting with underscores. Double underscore
> names are reserved for the compiler. Underscore capital-letter names
> are also reserved for the compiler. */

Why? I see why I should never use double underscores, but why not one?
Or, do you mean that underscore with all caps is also bad? That's
still not enough reason for me, especially since I find underscore at
the end awkward.

blargg

Apr 5, 2009, 5:30:39 AM
joshuamaurice wrote:
[...]

> Also, never use C-style casts in C++ code.

And functional-style casts, since they also silently degrade into a
reinterpret_cast if necessary:

int i = (int) "123"; // oops
int j = int ("123"); // oops again

joshua...@gmail.com

Apr 6, 2009, 5:32:10 AM
On Apr 5, 2:30 am, Hakusa <Hak...@gmail.com> wrote:
> On Apr 4, 3:52 am, joshuamaur...@gmail.com wrote:
> > /* Don't use names starting with underscores. Double underscore
> > names are reserved for the compiler. Underscore capital-letter names
> > are also reserved for the compiler. */
>
> Why? I see why I should never use double underscores, but why not one?
> Or, do you mean that underscore with all caps is also bad? That's
> still not enough reason for me, especially since I find underscore at
> the end awkward.

C++03 17.4.3.1.2.1
> Certain sets of names and function signatures are always reserved to the implementation:
> - Each name that contains a double underscore (__) or begins with an underscore followed by an uppercase letter (2.11) is reserved to the implementation for any use.

I suppose it's legal to do _foo, but I think it's frowned upon as bad
style because of this restriction.

Gerhard Menzl

Apr 6, 2009, 5:35:32 AM
Hakusa wrote:

> Why? I see why I should never use double underscores, but why not one?
> Or, do you mean that underscore with all caps is also bad? That's
> still not enough reason for me, especially since I find underscore at
> the end awkward.

17.4.3.1.2

"Each name that contains a double underscore (__) or begins with an
underscore followed by an uppercase letter (2.11) is reserved to the
implementation for any use."

"Each name that begins with an underscore is reserved to the
implementation for use as a name in the global namespace."

--
Gerhard Menzl

Non-spammers may respond to my email address, which is composed of my
full name, separated by a dot, followed by at, followed by "fwz",
followed by a dot, followed by "aero".

Alf P. Steinbach

Apr 6, 2009, 12:54:29 PM
* joshua...@gmail.com:

> On Apr 3, 9:05 pm, blargg <blargg....@gishpuppy.com> wrote:
>> People seem to be getting confused with casts to a reference type.
>> Something like
>>
>> reinterpret_cast<T&> (obj)
>>
>> is nearly equivalent to
>>
>> (*reinterpret_cast<T*> (&obj))
>>
>> if that helps reason more clearly about it.
>
> No. Just no. No at this entire thread.
>
> That may be how it's implemented on some systems, and perhaps it is
> interesting to try for fun, but to write any sort of real code, do not
> do this.

I interpret the above as saying that the "nearly equivalent" is wrong in the
direction that any equivalence is merely how the particular implementation
does it, if it does.

And if that interpretation is correct, then your stance on that is
incorrect, because the standard /guarantees/ this equivalence in §5.2.10/10.

So the original statement is wrong, but in the other direction: the word
"nearly" should be "exactly". :-)


[snip]


> Let me again emphasize that for pretty much all code, if you're
> writing a reinterpret_cast, or a C-style cast which cannot be replaced
> with a static_cast, then your code probably has undefined behavior.

Agreed, as far as formal portability is concerned.

However, to take a concrete example, in Windows programming you often have
to reinterpret_cast (or use a C style cast to do the same), because most of
the API routines' formal arguments are designed for C -- wrong types for C++!

It's in practice well defined behavior because it's defined by the system.
Any compiler that did something funny, even if allowed to do so by the
standard, would just not make it in the marketplace. So it's just formally
undefined.


> reinterpret_cast has no defined behavior.

Again, sorry, but that's incorrect, even regarding the purely formal.

First of all, the standard guarantees in §5.2.10/7 that round-trip
conversion of pointers using reinterpret_cast yields the original pointer.
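
A minimal sketch of that round trip, with an illustrative function name. It goes from int* to unsigned char* and back, which satisfies the condition quoted later in the thread that the intermediate type's alignment requirement be no stricter than the original's.

```cpp
#include <cassert>

// Converts int* -> unsigned char* -> int* and checks that the pointer
// that comes back is the one that went in.
bool round_trip_preserves_pointer()
{
    int value = 7;
    int* original = &value;

    // unsigned char has the weakest alignment requirement, so the detour
    // cannot lose information about where the int lives.
    unsigned char* detour = reinterpret_cast<unsigned char*>( original );
    int* back = reinterpret_cast<int*>( detour );

    return back == original && *back == 7;
}
```

Only the restored pointer is dereferenced; the intermediate pointer is just carried around.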

Then -- but here we're up against an inconsistency in the standard -- in
§9.2/17 the standard guarantees that a pointer to a POD struct, suitably
converted via reinterpret_cast, points to the struct's first member. This is
presumably in support of an old C technique for emulating inheritance. It's
useful for dealing with C interfaces that are based on such techniques.

The reason it is an inconsistency is that §5.2.10/7 maintains that all
reinterpret_cast pointer conversions other than the round-trip one are
unspecified.

But, considering the potentially large amounts of code Out There(TM) that
relies on the §9.2/17 guarantee, and also considering that a formally
guaranteed behavior can't be unspecified, it is IMHO §5.2.10/7 that is in
error.


> It's a myth that
> reinterpret_cast is a portable construct with certain defined behavior
> according to the C++ standard.

I agree. :-)


> (Yes, casting to char* and unsigned char* is the exception. Casting
> back to any other type is not. If you don't know what this exception
> is, pretend I didn't say anything.)

I'm sorry, but casting to char* is, AFAIK, not formally an exception. One
might argue that it "should" be an exception because otherwise the only way
to copy a POD object to an array of char (or unsigned char) and back again
would be via memcpy, whose internal magic could then not be duplicated
portably in a user-defined routine. However, this ability is very strongly
implied by §9.2/17 mentioned above. It would take a perverse implementation
to ignore the non-normative note in that paragraph that it implies no padding
at the start of a POD struct, and do type-specific things. So, taking the
stance that the first member of a POD struct /could/ be a char, say, and
reasonably assuming that the implementation is not perverse in the sense
outlined here, one has a practical guarantee for char*, and indeed for any
other POD type!

Summing up that logic:

* The formal guarantee for casting to char* is a myth.

* But §9.2/17 implies an in-practice guarantee for any POD*.


> Also, never use C-style casts in C++ code.

I used to agree 100% with that.

However, trying to write introductory material for novices, it seems
wholly wrong to introduce general casts at an early point.

Instead, where required, I've found it convenient to use the pseudo
constructor notation T(x) for built-in non-pointer types (I'd write just
"built-in" except it seems that the standard extends the meaning of that
term to cover pointers).

And this resolves to a C style cast.

So, current stance is, no C style casts except the pseudo constructor
notation for built-in non-pointer types.

There is however one thing that can be achieved by a C style cast that can't
be achieved by any of the named C++ casts, namely accessing an otherwise
inaccessible base.

One might argue that that's so dirty that one shouldn't do it, but I guess
that the capability is there in the language for a reason, that in some
situation (which I can't quite imagine) it's practically necessary.


> Sometimes the C-style cast
> is equivalent to a reinterpret_cast, and thus it has all of the
> undefined baggage to go with it. The C-style cast's behavior changes
> if the full definitions of the types are in scope or just forward
> declarations. For example, someone adding or removing a header from a
> header from a header which your cpp file includes can \silently\
> change the behavior of your C-style cast and break your code. No
> errors. No warnings. Nothing.

Good point.

I don't think I've seen it before.

*Noting*.


Cheers, & hth.,

- Alf

--
Due to hosting requirements I need visits to <url:
http://alfps.izfree.com/>.
No ads, and there is some C++ stuff! :-) Just going there is good. Linking
to it is even better! Thanks in advance!

joshua...@gmail.com

Apr 7, 2009, 5:22:07 AM
On Apr 6, 9:54 am, "Alf P. Steinbach" <al...@start.no> wrote:
> * joshuamaur...@gmail.com:

> > On Apr 3, 9:05 pm, blargg <blargg....@gishpuppy.com> wrote:
> >> People seem to be getting confused with casts to a reference type.
> >> Something like
>
> >> reinterpret_cast<T&> (obj)
>
> >> is nearly equivalent to
>
> >> (*reinterpret_cast<T*> (&obj))
>
> >> if that helps reason more clearly about it.
>
> > No. Just no. No at this entire thread.
>
> > That may be how it's implemented on some systems, and perhaps it is
> > interesting to try for fun, but to write any sort of real code, do not
> > do this.
>
> I interpret the above as saying that the "nearly equivalent" is wrong in the
> direction that any equivalence is merely how the particular implementation
> does it, if it does.
>
> And if that interpretation is correct, then your stance on that is
> incorrect, because the standard /guarantees/ this equivalence in §5.2.10/10.
>
> So the original statement is wrong, but in the other direction: the word
> "nearly" should be "exactly". :-)

Note that C++03 5.2.10/10 defines its behavior in terms of 5.2.10/7,
which is at best vague. It ends with "the result of such a pointer
conversion is unspecified".

5.2.10/7
> A pointer to an object can be explicitly converted to a pointer to an object of different type. Except that
> converting an rvalue of type “pointer to T1” to the type “pointer to T2” (where T1 and T2 are object types
> and where the alignment requirements of T2 are no stricter than those of T1) and back to its original type
> yields the original pointer value, the result of such a pointer conversion is unspecified.

It seriously says "yields the original pointer value" and "result
[...] unspecified" in the same line of text, referring, as far as I
can tell, to the same thing. I would like some clarity on this.


> > Let me again emphasize that for pretty much all code, if you're
> > writing a reinterpret_cast, or a C-style cast which cannot be replaced
> > with a static_cast, then your code probably has undefined behavior.
>
> Agreed regarding the formal for portability.
>
> However, to take a concrete example, in Windows programming you often have
> to reinterpret_cast (or use a C style cast to do the same), because most of the
> API routines' formal arguments are designed for C -- wrong types for C++!
>
> It's in practice well defined behavior because it's defined by the system.
> Any compiler that did something funny, even if allowed to do so by the standard,
> would just not make it in the marketplace. So it's just formally undefined.

Yes. POSIX is guilty of this as well. You have to reinterpret_cast the
return of a dlsym, a void*, to a function pointer, undefined behavior
according to the C++ standard. In POSIX's defense, there isn't an
alternative in C, so I'm not calling this "bad" or "the wrong thing to
do" in C or C++. In this case yes, it's well defined by a standard,
the POSIX standard, just not the C standard or C++ standard.


> > reinterpret_cast has no defined behavior.
> Again, sorry, but that's incorrect, even regarding the purely formal.

I exaggerated. It does have some well defined behavior, but people
commonly mistake exactly how little these guarantees are.


> First of all, the standard guarantees in §5.2.10/7 that round-trip
> conversion of pointers using reinterpret_cast yields the original pointer.

I noted above how this block from the standard is self contradictory.
Also, your interpretation disagrees with several other threads in
these forums in recent times.

For example, I recall a recent thread on here
http://groups.google.com/group/comp.lang.c++.moderated/browse_thread/thread/be1d6fd208dae05b/636f8ef3efad284a?lnk=raot
describing how pointers can have different sizes, how sizeof(char*)==8
and sizeof(int*)==4, and this is compliant with the standard. The
reason given for why this is allowed, and done, is that some hardware
is only 64-bit addressable, but implementers want char* to point to
units smaller than 64 bits, so a char* contains the address of the
64-bit unit plus an offset into that unit for the 8-bit "char". This
basically means that a char* cast to an int* and cast back to a char*
would not be the identity function on this hardware + compiler.

Thus I am left to ponder that thread versus an apparent schizophrenic
attempt to make this well defined in 5.2.10/7 and in the same breath
say unspecified.


> Then -- but here we're up against an inconsistency in the standard -- in
> §9.2/17 the standard guarantees that a pointer to a POD struct, suitably
> converted via reinterpret_cast, points the struct's first member. This is
> presumably in support of an old C technique for emulating inheritance. It's
> useful for dealing with C interface that are based on such techniques.
>
> The reason it is an inconsistency is that §5.2.10/7 maintains that all other
> reinterpret_cast pointer conversions than the roundtrip one, are
> unspecified.

I don't see how you can get this reading. Then again, I see 5.2.10/7
as desperately needing cleanup. I believe the intent was to allow the
reinterpret_cast use for POD types as done in C, but otherwise still
subject to the strict aliasing rule. For example, I believe the intent
is to make the following well defined program which returns 5.
struct T { int x; };
struct U { int y; int z; };
int main()
{
    T t;
    t.x = 5;
    U * u = reinterpret_cast<U*>(&t);
    return u->y;
}


> But, considering the potentially large amounts of code Out There(TM) that
> relies on the §9.2/17 guarantee, and also considering that a formally guaranteed
> behavior can't be unspecified, it is IMHO §5.2.10/7 that is in error.

As I understand the issues, I disagree. I think we can support the use
of reinterpret_cast in the C-style manual inheritance, and disallow
round trip conversions between arbitrary pointer types, and I think
that was the intent. Can we? Can we do this on bizarre architectures
where sizeof(char*) != sizeof(int*), etc.? 9.2/17 seems to indicate
that the following is well defined for all types T.
template <typename T>
struct foo
{
    T x;
    int y;
};
template <typename T>
T* getX(foo<T>& x)
{ return reinterpret_cast<T*>(&x); }

I think that's doable. It would mean that there would be a slight
pessimization for pointers to struct to be of the larger pointer kind
if its first member has a pointer of the larger pointer kind.


> > (Yes, casting to char* and unsigned char* is the exception. Casting
> > back to any other type is not. If you don't know what this exception
> > is, pretend I didn't say anything.)
>
> I'm sorry, but casting to char* is, AFAIK, not formally an exception. One
> might argue that it "should" be an exception because otherwise the only way to
> copy a POD object to an array of char (or unsigned char) and back again would be
> via memcpy, whose internal magic could then not be duplicated portably in a
> user-defined routine. However, this ability is very strongly implied by
> §9.2/17 mentioned above. It would take a perverse implementation to ignore the
> non-normative note in that paragraph that it implies no padding at the start
> of a POD struct, and do type-specific things. So, taking the stance that the
> first member of a POD struct /could/ be a char, say, and reasonably assuming that
> the implementation is not perverse in the sense outlined here, one has a
> practical guarantee for char*, and indeed for any other POD type!
>
> Summing up that logic:
>
> * The formal guarantee for casting to char* is a myth.
>
> * But §9.2/17 implies an in-practice guarantee for any POD*.

3.8/5 strongly implies that static_casting from any pointer type to
void*, and then static_casting to char* or unsigned char* is defined
behavior.

3.9/2, as you noted, suggests being able to cast to char* or unsigned
char*, but it does not say this and uses memcpy in its example.

3.10/15, the strict aliasing rule, also strongly suggests being able
to access any object through a char* or unsigned char*.

3.9.2/4 has my strongest argument, which specifically singles out
void* as being able to point to any object, suggesting other pointers
cannot. It also states that char* and void* have the same
representation and alignment requirements, strongly suggesting char*
can also point at any object. I will also note that unsigned char* is
conspicuously absent here, which I assume is an oversight.

The standard is somewhat unclear on these issues, but as above, I
think the intent is that void*, char*, and unsigned char* are the
universal pointer types which can point at any object, and that all
other pointer types may not. Thus round-trip pointer casts not going
through void*, char*, or unsigned char* are undefined (or at best
unspecified) behavior. Finally, my point is that the issues with
reinterpret_cast are largely avoidable in practice by using static_cast
and void* (though not with platform-specific APIs like Windows and
POSIX), and type safety via forward declarations is better still.
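
As a sketch of that advice, the function below (an illustrative name, not from the thread) reaches the bytes of an object through void* and unsigned char* using static_cast only, and copies them into a second object of the same type. It assumes a trivially copyable type such as int.

```cpp
#include <cassert>
#include <cstddef>

// Copies the object representation of one int into another, byte by byte,
// using only static_cast via void* to obtain the unsigned char* views.
bool bytes_round_trip()
{
    int source = 0x1234;
    int dest = 0;

    // Any object pointer converts implicitly to void*; static_cast then
    // turns the void* into a pointer to the object's bytes.
    unsigned char* from = static_cast<unsigned char*>( static_cast<void*>( &source ) );
    unsigned char* to   = static_cast<unsigned char*>( static_cast<void*>( &dest ) );

    for ( std::size_t i = 0; i < sizeof source; ++i )
        to[i] = from[i];

    return dest == source;
}
```

No reinterpret_cast is needed for this kind of byte-level access, which is the point being made about void* and the character pointer types.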



Alf P. Steinbach

Apr 7, 2009, 1:12:08 PM
* joshua...@gmail.com:

It only appears vague when the rest of the sentence is omitted, as you do. ;-)

See below.


> 5.2.10/7
>> A pointer to an object can be explicitly converted to a pointer to an
>> object of different type. Except that converting an rvalue of type
>> “pointer to T1” to the type “pointer to T2” (where T1 and T2 are object
>> types and where the alignment requirements of T2 are no stricter than
>> those of T1) and back to its original type yields the original pointer
>> value, the result of such a pointer conversion is unspecified.
>
> It seriously says "yields the original pointer value" and "result
> [...] unspecified" in the same line of text, referring, as far as I
> can tell, to the same thing. I would like some clarity on this.

OK.

The key word is the /except/, at the start of the sentence.

/Except/ for round-trip conversion of pointers with suitably aligned
referents, the result of converting a pointer is, according to this
paragraph, unspecified.

And that's very very clear.

However, as I explained in the article you're replying to here, the standard
is inconsistent, because in §9.2/17 it does define an additional conversion
which, being well-defined, cannot be unspecified. And there the alignment
isn't implicit in the types but is a property of the particular objects
pointed to, ensured by one of those objects being located at the start of a
POD struct, and the other being that POD struct.


[snip]


>>> reinterpret_cast has no defined behavior.
>> Again, sorry, but that's incorrect, even regarding the purely formal.
>
> I exaggerated. It does have some well defined behavior, but people
> commonly mistake exactly how little these guarantees are.

Yes.


>> First of all, the standard guarantees in §5.2.10/7 that round-trip
>> conversion of pointers using reinterpret_cast yields the original
>> pointer.
>
> I noted above how this block from the standard is self contradictory.

I'm sorry, your argument for self-contradiction is incorrect, based on
ignoring the relevant lead-in part of the sentence from which you lifted the
last words, namely, ignoring the important formulation "except". As explained
above. However, I noted in the article you're replying to that the standard
in this case is self-contradictory for quite a different reason, namely, that
the standard elsewhere, in §9.2/17, defines an additional conversion.


> Also, your interpretation disagrees with several other threads in
> these forums in recent times.

Don't know about that, but listen to logic and facts.

If those threads got the logic and/or facts wrong, as you indicate, then
really it's about time that this was set straight.

But are you sure that they got it wrong, or, considering your above
incorrect interpretation of the standard's text (by ignoring the lead-in part
of a sentence), perhaps you've misunderstood those threads?


> For example, I recall a recent thread on here
> http://groups.google.com/group/comp.lang.c++.moderated/browse_thread/thread/be1d6fd208dae05b/636f8ef3efad284a?lnk=raot
> describing how pointers can have different sizes, how sizeof(char*)==8
> and sizeof(int*)==4, and this is compliant with the standard. The
> example as to why this is allowed and done is some hardware is only 64
> bit addressable, but they want char* to point to smaller units than 64
> bit units, so a char* contains the address of the 64 bit unit, and
> contains an offset into that 64 bit unit of the 8 bit "char". This
> basically means that a char* casted to an int* casted back to a char*
> would not be the identity function on this hardware + compiler.

Yes.

The allowed general round-trip conversion is contingent on "where the
alignment requirements of T2 are no stricter than those of T1".

I didn't discuss that, but if we're going to be very precise it needs to be
stated. :-)


> Thus I am left to ponder that thread versus an apparent schizophrenic
> attempt to make this well defined in 5.2.10/7 and in the same breath
> say unspecified.

See above, there's no conflict.

And the standard is not schizophrenic in that regard.

The inconsistency of §5.2.10/7 is instead with the defined case of §9.2/17.


>> Then -- but here we're up against an inconsistency in the standard -- in
>> §9.2/17 the standard guarantees that a pointer to a POD struct, suitably
>> converted via reinterpret_cast, points the struct's first member. This is
>> presumably in support of an old C technique for emulating inheritance.
>> It's useful for dealing with C interface that are based on such techniques.
>>
>> The reason it is an inconsistency is that §5.2.10/7 maintains that all
>> other reinterpret_cast pointer conversions than the roundtrip one, are
>> unspecified.
>
> I don't see how you can get this reading. Then again, I see 5.2.10/7
> as desperately needing cleanup. I believe the intent was to allow the
> reinterpret_cast use for POD types as done in C, but otherwise still
> subject to the strict aliasing rule. For example, I believe the intent
> is to make the following well defined program which returns 5.
> struct T { int x; };
> struct U { int y; int z; };
> int main()
> { T t;
> t.x = 5;
> U * u = reinterpret_cast<U*>(&t);
> return u->y;
> }

It has that as a consequence and, I believe, basic motivation, yes: treating an
initial part of a POD struct X, where that part is layout-compatible with Y,
as a Y.

However, while the standard in §9.2/16 does ensure layout compatibility for the
common identically declared initial part of two POD structs, for in-practice C++
programming such C-like redundancy is Evil(TM), for even when the programmer
manages to get it right initially it can easily lead to the two definitions
diverging through maintenance of the code -- including not only changes to
the declarations themselves but e.g. packing pragmas.

And so in practice the basic example is more like

struct T { int x; };
struct U { T basePart; int z; };

not merely repeating declarations of the same elements.
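
A minimal sketch of how that composition-based layout is typically used, relying only on the §9.2/17 first-member guarantee discussed above. The helper names firstMemberOf and enclosingU are invented for this illustration.

```cpp
#include <cassert>

struct T { int x; };
struct U { T basePart; int z; };   // T is embedded, not redeclared

// Hypothetical helper: view a U through its first member.
// Relies on the guarantee that a pointer to a POD struct, suitably
// converted, points to its initial member.
inline T* firstMemberOf(U* u)
{
    return reinterpret_cast<T*>(u);
}

// Hypothetical reverse conversion. Only meaningful when t really is
// the basePart member of some U object.
inline U* enclosingU(T* t)
{
    assert(t != 0);
    return reinterpret_cast<U*>(t);
}
```

With this layout a change to T is picked up by U automatically, which is the point of embedding rather than repeating the declarations.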


>> But, considering the potentially large amounts of code Out There(TM) that
>> relies on the §9.2/17 guarantee, and also considering that a formally
>> guaranteed behavior can't be unspecified, it is IMHO §5.2.10/7 that is in
>> error.
>
> As I understand the issues, I disagree.

Well, you misunderstood the standard's text about "unspecified", by ignoring the
"except" earlier in the sentence.


> I think we can support the use
> of reinterpret_cast in the C-style manual inheritance, and disallow
> round trip conversions between arbitrary pointer types, and I think
> that was the intent.

Nope, see above.


[snip]


>>> (Yes, casting to char* and unsigned char* is the exception. Casting
>>> back to any other type is not. If you don't know what this exception
>>> is, pretend I didn't say anything.)
>> I'm sorry, but casting to char* is, AFAIK, not formally an exception. One
>> might argue that it "should" be an exception because otherwise the only way
>> to copy a POD object to an array of char (or unsigned char) and back again
>> would be via memcpy, whose internal magic could then not be duplicated
>> portably in a user-defined routine. However, this ability is very strongly
>> implied by §9.2/17 mentioned above. It would take a perverse implementation
>> to ignore the non-normative note in that paragraph that it implies no padding
>> at the start of a POD struct, and do type-specific things. So, taking the
>> stance that the first member of a POD struct /could/ be a char, say, and
>> reasonably assuming that the implementation is not perverse in the sense
>> outlined here, one has a practical guarantee for char*, and indeed for any
>> other POD type!
>>
>> Summing up that logic:
>>
>> * The formal guarantee for casting to char* is a myth.
>>
>> * But §9.2/17 implies an in-practice guarantee for any POD*.

Uh huh, I forgot to qualify this with alignment considerations.

I *apologize* for that omission, but then, I'm purportedly human... ;-)

So, add "suitably aligned" (or more precise language such as "with suitably
aligned referents" or even more precise language such as the standard's, but if
we start repeating the standard's exact language then nothing is gained).


> 3.8/5 strongly implies that static_casting from any pointer type to
> void*, and then static_casting to char* or unsigned char* is defined
> behavior.

Well, sorry, no, it talks about the storage for an object before or after the
object's lifetime.

Again, context is important.

But I agree that there is an implication and assumption here and elsewhere about
char*. Conversion to char* is practically well-defined for PODs. But although
that is necessary to know to make sense of §3.8/5, it isn't specified by §3.8/5;
it stems, AFAIK, only from the general §9.2/17 (however, given how the standard
is, it wouldn't necessarily be a surprise if it's also present somewhere else).


> 3.9/2, as you noted, suggests being able to cast to char* or unsigned
> char*, but it does not say this and uses memcpy in its example.

It's the same assumption, where to make sense of it you need to keep §9.2/17 in
sight.


> 3.10/15, the strict aliasing rule, also strongly suggests being able
> to access any object through a char* or unsigned char*.

I think you'll agree that it's trivial to construct a case of UB using one of
the ways of referring to an object listed in the §3.10/15 paragraph.

So again it's §9.2/17 that you need to make sense of it. §3.10/15 doesn't talk
about what's allowed. It talks about cases that are definitely UB by explicitly
listing all the cases that are /not always/ UB, noting that conversion to e.g.
char* is not necessarily always UB. "if ... other ... the behavior is undefined"
does not mean that "if [one of these] the behavior is defined". It means what it
says, that if you refer to an object in any other way then you're guaranteed UB,
while if you constrain yourself to the listed ways, you may or may not have UB.


> 3.9.2/4 has my strongest argument, which specifically singles out
> void* as being able to point to any object, suggesting other pointers
> cannot. It also states that char* and void* have the same
> representation and alignment requirements, strongly suggesting char*
> can also point at any object. I will also note that unsigned char* is
> conspicuously absent here, which I assume is an oversight.

Uhm, I'm not sure what you're arguing /for/, or against.

But apart from that possible implication I agree with the above paragraph.


> The standard is somewhat unclear on these issues, but as above, I
> think the intent is that void*, char*, and unsigned char* are the
> universal pointer types which can point at any object, and that all
> other pointer types may not.

Right.


> Thus round-trip pointer casts not going
> through void*, char*, or unsigned char* are undefined (or at best
> unspecified) behavior.

I'm sorry, that's incorrect. See above. All that the standard requires for
well-definedness of roundtrip conversion, as discussed and quoted above, is
suitable alignment of referents.


> Finally, my point is that the issues with
> reinterpret_cast are largely avoidable in practice using static_cast
> and void* (though not with platform specific APIs like windows and
> POSIX) (though type safety via forward declarations is better still).

I think that's a very dangerous idea. Reportedly Andrei and Herb put forth this
idea in their C++ coding guidelines book. But until I've seen some rationale
(reportedly they forgot to include any rationale) I regard it as extremely
dangerous, with no benefits and many severe problems, for by introducing void*
as an intermediary all local type information is lost, and one ends up passing
void* pointers around, which exacerbates that problem. It also, but less
importantly, is in direct conflict with writing what you mean. When you use the
double static_cast others will have to spend time on figuring out why you're
doing that, only, in the best case, discovering that it's due only to some
misguided idea that *waving frenetically* it's less unsafe or whatever.


Cheers & hth.,

- Alf

--
Due to hosting requirements I need visits to <url: http://alfps.izfree.com/>.
No ads, and there is some C++ stuff! :-) Just going there is good. Linking
to it is even better! Thanks in advance!

Francis Glassborow

Apr 7, 2009, 1:12:26 PM4/7/09
joshua...@gmail.com wrote:

>
> Yes. POSIX is guilty of this as well. You have to reinterpret_cast the
> return of a dlsym, a void*, to a function pointer, undefined behavior
> according to the C++ standard. In POSIX's defense, there isn't an
> alternative in C, so I'm not calling this "bad" or "the wrong thing to
> do" in C or C++. In this case yes, it's well defined by a standard,
> the POSIX standard, just not the C standard or C++ standard.
>
>

Yes, POSIX is just taking advantage of the allowance for an
implementation (or group of implementations) to define undefined
behaviour. As POSIX effectively requires that a function pointer shall
be storable in a void* this is not a problem for a POSIX system. However
code written for POSIX is not necessarily portable to non-POSIX systems.

Bart van Ingen Schenau

Apr 7, 2009, 3:54:55 PM4/7/09
Gerhard Menzl wrote:

> Hakusa wrote:
>
>> Why? I see why I should never use double underscores, but why not
>> one? Or, do you mean that underscore with all caps is also bad?
>> That's still not enough reason for me, especially since I find
>> underscore at the end awkward.
>
> 17.4.3.1.2
>
> "Each name that contains a double underscore (__) or begins with an
> underscore followed by an uppercase letter (2.11) is reserved to the
> implementation for any use."
>
> "Each name that begins with an underscore is reserved to the
> implementation for use as a name in the global namespace."
>

To expand a bit on this:
It is permitted to use a name like _foo as a local variable or a member,
but that is frowned upon, because then you have to check carefully each
time you want to use a leading underscore if you are in a context where
that is permitted.

To make life easier for them, many programmers adopt the far simpler
rule of not using leading underscores at all.

Bart v Ingen Schenau
--
a.c.l.l.c-c++ FAQ: http://www.comeaucomputing.com/learn/faq
c.l.c FAQ: http://c-faq.com/
c.l.c++ FAQ: http://www.parashift.com/c++-faq-lite/

joshua...@gmail.com

Apr 8, 2009, 5:40:39 PM4/8/09
On Apr 7, 10:12 am, "Alf P. Steinbach" <al...@start.no> wrote:
> * joshuamaur...@gmail.com:
> > For example, I recall a recent thread on here
> >http://groups.google.com/group/comp.lang.c++.moderated/browse_thread/...

> > describing how pointers can have different sizes, how sizeof(char*)==8
> > and sizeof(int*)==4, and this is compliant with the standard. The
> > example as to why this is allowed and done is some hardware is only 64
> > bit addressable, but they want char* to point to smaller units than 64
> > bit units, so a char* contains the address of the 64 bit unit, and
> > contains an offset into that 64 bit unit of the 8 bit "char". This
> > basically means that a char* casted to an int* casted back to a char*
> > would not be the identity function on this hardware + compiler.
>
> Yes.
>
> The allowed general round-trip conversion is contingent on "where the
> alignment requirements of T2 are no stricter than those of T1".
>
> I didn't discuss that, but if we're going to be very precise it needs to be
> stated. :-)

I'm sorry for jumping down your throat for glossing over that
(important) detail.


> > Finally, my point is that the issues with
> > reinterpret_cast are largely avoidable in practice using static_cast
> > and void* (though not with platform specific APIs like windows and
> > POSIX) (though type safety via forward declarations is better still).
>
> I think that's a very dangerous idea. Reportedly Andrei and Herb put forth
> this idea in their C++ coding guidelines book. But until I've seen some rationale
> (reportedly they forgot to include any rationale) I regard it as extremely
> dangerous, with no benefits and many severe problems, for by introducing
> void* as an intermediary all local type information is lost, and one ends up
> passing void* pointers around, which exacerbates that problem. It also, but less
> importantly, is in direct conflict with writing what you mean. When you use
> the double static_cast others will have to spend time on figuring out why you're
> doing that, only, in the best case, discovering that it's due only to some
> misguided idea that *waving frenetically* it's less unsafe or whatever.

I'm sorry. I needed to clarify my claim a little. I initially meant to
say that in the OP's code, there is absolutely no reason to use
reinterpret_cast. static_cast would suffice, no type information would
be lost, and no void* would be introduced. In such a situation, I
claim one should always prefer static_cast.

I wholeheartedly agree that static_cast to void*, then static_cast back
is about as safe as reinterpret_cast. Thus, if I have to choose
between:

1- static_cast to void*, pass that void* around, then static_cast
back,

2- and have a typed pointer, pass that typed pointer around, then
reinterpret_cast to another class with the same leading POD part,

then yes, I will choose 2 each time. I would still strongly prefer, if
possible, hiding the reinterpret_cast in an inline helper function near the
definitions of the two classes involved, as opposed to reinterpret_casting
on demand, to add some measure of type safety so one won't do a
reinterpret_cast to a wrong type by accident.
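
As a rough sketch of such a helper: the types Shape and Circle, the tag value circleKind, and the function names are all invented for illustration, and the run-time tag check is an extra assumption rather than something the cast itself requires.

```cpp
#include <cassert>

// Hypothetical pair of POD types sharing the same leading part.
struct Shape  { int kind; };
struct Circle { Shape base; int radius; };

const int circleKind = 1;

// The one place that spells out the cast, kept next to the definitions.
inline Circle* asCircle(Shape* s)
{
    assert(s != 0 && s->kind == circleKind);   // catch casts to the wrong type
    return reinterpret_cast<Circle*>(s);
}

// No cast is needed in the other direction.
inline Shape* asShape(Circle* c)
{
    return &c->base;
}
```

Callers then write asCircle(p) instead of a bare reinterpret_cast, so a cast to an unrelated type no longer compiles by accident.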


PS: If anyone ever figures it out, I would still like to know exactly
when I am allowed to do something like:

#include <iostream>

template <typename T>
std::ostream& printByteRepresentation(std::ostream& out, T const* t)
{
    unsigned char const* c = reinterpret_cast<unsigned char const*>(t);
    unsigned char const* const end = c + sizeof(T);
    for ( ; c < end; ++c)
    {
        int x = *c;
        out << x << " ";
    }
    return out;
}
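
For comparison, here is a sketch that produces the same kind of output by copying the bytes out with memcpy, the approach the example in 3.9/2 uses for POD types. It sidesteps the pointer-cast question rather than answering it. The names printByteRepresentationViaCopy and bytesAsText are made up for this illustration.

```cpp
#include <cassert>
#include <cstring>
#include <ostream>
#include <sstream>
#include <string>

template <typename T>
std::ostream& printByteRepresentationViaCopy(std::ostream& out, T const& t)
{
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &t, sizeof(T));   // copy the object representation
    for (unsigned i = 0; i < sizeof(T); ++i)
    {
        out << static_cast<int>(bytes[i]) << " ";
    }
    return out;
}

// Convenience wrapper returning the dump as a string.
template <typename T>
std::string bytesAsText(T const& t)
{
    std::ostringstream s;
    printByteRepresentationViaCopy(s, t);
    return s.str();
}
```

This is only meant for POD types, and it assumes T does not overload unary &.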


--

blargg

Apr 9, 2009, 6:46:34 PM4/9/09
Alf P. Steinbach wrote:
> * joshua...@gmail.com:

> > blargg wrote:
> >> People seem to be getting confused with casts to a reference type.
> >> Something like
> >>
> >> reinterpret_cast<T&> (obj)
> >>
> >> is nearly equivalent to
> >>
> >> (*reinterpret_cast<T*> (&obj))
> >>
> >> if that helps reason more clearly about it.
> >
> > No. Just no. No at this entire thread.
> >
> > That may be how it's implemented on some systems, and perhaps it is
> > interesting to try for fun, but to write any sort of real code, do not
> > do this.
>
> I interpret the above as saying that the "nearly equivalent" is wrong in the
> direction that any equivalence is merely how the particular implementation
> does it, if it does.

The point was that reinterpreting as a reference is just as
implementation-defined/undefined as reinterpreting the address to a
different pointer type.

> And if that interpretation is correct, then your stance on that is
> incorrect, because the standard /guarantees/ this equivalence in 5.2.10/10.
>
> So the original statement is wrong, but in the other direction: the word
> "nearly" should be "exactly". :-)

[...]

Nope, I specifically used the word "nearly" to avoid having to cover the
use of &obj. If obj's class defines its own operator &, then the
equivalence doesn't hold. I believe that a reference cast is the ONLY way
to get the address of an object of arbitrary type (boost has a function
for doing this). That is, for an object of type T which may redefine &,
this is the only way:

reinterpret_cast<T*> (&reinterpret_cast<char&> (obj))
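
A small sketch of that trick in use. The class Odd and the function name realAddressOf are invented here, and the const volatile qualification plus const_cast are additions so that the same template also accepts const or volatile objects.

```cpp
#include <cassert>

// Hypothetical class that hijacks unary &.
struct Odd
{
    int value;
    Odd* operator&() { return 0; }   // deliberately unhelpful
};

// Reference-cast trick: go through char& so that the built-in &
// is applied instead of any user-defined operator&.
template <typename T>
T* realAddressOf(T& obj)
{
    return reinterpret_cast<T*>(
        &const_cast<char&>(
            reinterpret_cast<const volatile char&>(obj)));
}
```

For an Odd object o, &o yields a null pointer while realAddressOf(o) yields the object's actual address. The boost function mentioned above is boost::addressof, and C++11 later added std::addressof in <memory> for the same purpose.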

--

joshua...@gmail.com

Apr 10, 2009, 9:53:27 AM4/10/09

I'm sorry. I had to sit down for a while to try and parse this.
"Except that" at the start of a large sentence threw me a curve
ball. (It would have been clearer to move the dependent clause after the
second clause, and perhaps to use "except for" instead of "except
that".) I see the intended reading now, that
template <typename T1, typename T2>
T1* foo(T1* x)
{ return reinterpret_cast<T1*>(reinterpret_cast<T2*>(x)); }
is guaranteed to be an identity function iff T2's alignment requirements
are no stricter than T1's. If T2's alignment requirements are stricter than
T1's, then the above function foo has unspecified behavior.
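
One way to make that condition visible in code is a compile-time check. This is only a sketch: it assumes a C++11 compiler (alignof and static_assert are newer than the standard being quoted in this thread), and the name roundTrip and the reordered template parameters are choices made for the example.

```cpp
#include <cassert>

// Round trip T1* -> T2* -> T1*, rejected at compile time unless T2's
// alignment requirement is no stricter than T1's.
// T2 comes first so that it can be named while T1 is deduced.
template <typename T2, typename T1>
T1* roundTrip(T1* x)
{
    static_assert(alignof(T2) <= alignof(T1),
                  "T2 is more strictly aligned than T1; result unspecified");
    return reinterpret_cast<T1*>(reinterpret_cast<T2*>(x));
}
```

A call such as roundTrip<char>(&someInt) compiles and returns the original pointer, whereas roundTrip<double>(&someChar) would be rejected on typical implementations.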

Thank you.


--

Alf P. Steinbach

Apr 10, 2009, 4:59:07 PM4/10/09

* blargg:


> Alf P. Steinbach wrote:
>> * joshua...@gmail.com:

>>> blargg wrote:
>>>> People seem to be getting confused with casts to a reference type.
>>>> Something like
>>>>
>>>> reinterpret_cast<T&> (obj)
>>>>
>>>> is nearly equivalent to
>>>>
>>>> (*reinterpret_cast<T*> (&obj))
>>>>
>>>> if that helps reason more clearly about it.
>>> No. Just no. No at this entire thread.
>>>
>>> That may be how it's implemented on some systems, and perhaps it is
>>> interesting to try for fun, but to write any sort of real code, do not
>>> do this.
>> I interpret the above as saying that the "nearly equivalent" is wrong in
>> the direction that any equivalence is merely how the particular
>> implementation does it, if it does.

> The point was that reinterpreting as a reference is just as
> implementation-defined/undefined as reinterpreting the address to a
> different pointer type.

>> And if that interpretation is correct, then your stance on that is
>> incorrect, because the standard /guarantees/ this equivalence in
>> §5.2.10/10.
>>
>> So the original statement is wrong, but in the other direction: the word
>> "nearly" should be "exactly". :-)

> [...]
> Nope, I specifically used the word "nearly" to avoid having to cover the
> use of &obj.

Oh. I didn't see that. In the context above "nearly" was interpreted
in a very different direction, which was what I responded to.

The standard, instead of adding "nearly" at the front making the
statement more vague, just adds "with the built-in & and * operators"
at the end, making the statement more precise.

The built-ins are implied, and with the built-ins it's an exact
equivalence.


> If obj's class defines its own operator &, then the equivalence doesn't
> hold. I believe that a reference cast is the ONLY way to get the address of
> an object of arbitrary type (boost has a function for doing this). That is,
> for an object of type T which may redefine &, this is the only way:
> reinterpret_cast<T*> (&reinterpret_cast<char&> (obj))

Yeah, I think so too.

But as James Bond reportedly remarked, "never say never". :-)


Cheers,

- Alf

