
Re: #include'ing .c files considered harmful?


James Kuyper

Feb 7, 2021, 1:43:20 PM
On 2/7/21 1:22 PM, Anton Shepelev wrote:
> Hello, all.
>
> My C module for pixel-perfect scaling comprises two files:
> ppscale.c and ppscale.h . I meant it to be compiled into the
> executable or into an intermediate object file, so that the
> higher-level code need only include the .h file. But the
> maintainer of a DOSBox fork has decided to include my module
> into a C++ file thus:
>
> extern "C" {
> #include "ppscale.h"
> #include "ppscale.c"
> }
>
> I don't like it, but the maintainer says it simplifies the
> configuration of the several build systems supported by the
> project in parallel. What problems, drawbacks, or dangers do
> you see to this approach?

The answer to your question involves C++ issues, so I've added
comp.lang.c++.

Note (unrelated to your question): the way I normally do things,
"ppscale.c" would already #include "ppscale.h", helping ensure that the
declarations in the header file are consistent with the definitions in
the .c file. That would render it unnecessary for the C++ file to do so
itself. Do you do the same?
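A sketch of that pattern, with both "files" collapsed into one listing (the function is a hypothetical stand-in; ppscale's real interface differs):

```c
/* ppscale.h -- the public interface.  The function below is a
   hypothetical stand-in; the real ppscale API differs. */
#ifndef PPSCALE_H
#define PPSCALE_H

int ppscale_best_factor(int src, int dst);

#endif

/* ppscale.c -- the implementation.  Its first line would be
   #include "ppscale.h" (already part of this translation unit
   here), so the compiler checks the declaration above against
   the definition below and flags any mismatch at compile time. */

int ppscale_best_factor(int src, int dst)
{
    return (src > 0) ? dst / src : 0;  /* placeholder body */
}
```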

If you take arbitrary C code, and process it as C++ code, there's a lot
of minor issues that could cause problems. See Annex C of the C++
standard for a list of differences that is fairly complete and fairly
comprehensive. Most of the differences are likely to prevent
compilation, or at least trigger a mandatory diagnostic.

Therefore, if he got your code to compile as C++ code with a high
warning level and no diagnostics, and in particular, if he thinks he got
it to work, then there's a decent chance that none of those problems
actually came up. However, there are silent differences - C code that
would compile without diagnostics as C++ code, but has subtly different
semantics, and if he didn't give the code a thorough test, he might have
missed those differences. It would be far safer to compile it as C code,
and then use extern "C" declarations to link to it from C++ code.
There's far fewer tricky issues to worry about (though there's still a few).
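One classic example of such a silent difference: a character literal has type int in C but char in C++, so the same expression yields different values in the two languages with no diagnostic from either compiler.

```c
#include <stdio.h>

/* In C, 'a' has type int; in C++ it has type char.  Compiling
   this file as C or as C++ changes the result silently. */
size_t char_literal_size(void)
{
    return sizeof('a');  /* == sizeof(int) in C; would be 1 in C++ */
}
```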

Anton Shepelev

Feb 7, 2021, 1:58:10 PM
James Kuyper:

> The answer to your question involves C++ issues, so I've
> added comp.lang.c++.

Yes, C++ programmers are welcome.

> Note (unrelated to your question): the way I normally do
> things, "ppscale.c" would already #include "ppscale.h",
> helping ensure that the declarations in the header file
> are consistent with the definitions in the .c file. That
> would render it unnecessary for the C++ file to do so
> itself. Do you do the same?

No:

https://launchpad.net/ppscale

but I will, which will make including ppscale.h redundant.
I don't know why I didn't do it, probably just forgot. I
kept thinking of .c and .h files as the interface and
implementation parts of a Modula module, which they are not.
Thanks for the suggestion.

> If you take arbitrary C code, and process it as C++ code,
> there's a lot of minor issues that could cause problems.
> See Annex C of the C++ standard for a list of differences
> that is fairly complete and fairly comprehensive. Most of
> the differences are likely to prevent compilation, or at
> least trigger a mandatory diagnostic.

That is a good reason to compile the C code with a C
compiler into a separate object file or static library.

> Therefore, if he got your code to compile as C++ code with
> a high warning level and no diagnostics, and in
> particular, if he thinks he got it to work, then there's a
> decent chance that none of those problems actually came
> up. However, there are silent differences - C code that
> would compile without diagnostics as C++ code, but has
> subtly different semantics, and if he didn't give the code
> a thorough test, he might have missed those differences.
> It would be far safer to compile it as C code, and then
> use extern "C" declarations to link to it from C++ code.
> There's far fewer tricky issues to worry about (though
> there's still a few).

Thank you very much for the advice. I will forward it to the
maintainer.

--
() ascii ribbon campaign -- against html e-mail
/\ http://preview.tinyurl.com/qcy6mjc [archived]

Manfred

Feb 7, 2021, 3:04:38 PM
On 2/7/21 7:57 PM, Anton Shepelev wrote:
> James Kuyper:
>
>> The answer to your question involves C++ issues, so I've
>> added comp.lang.c++.
>
> Yes, C++ programmers are welcome.
>
>> Note (unrelated to your question): the way I normally do
>> things, "ppscale.c" would already #include "ppscale.h",
>> helping ensure that the declarations in the header file
>> are consistent with the definitions in the .c file. That
>> would render it unnecessary for the C++ file to do so
>> itself. Do you do the same?
>
> No:
>
> https://launchpad.net/ppscale
>
> but I will, which will make including ppscale.h redundant.
> I don't know why I didn't do it, probably just forgot. I
> kept thinking of .c and .h files as the interface and
> implementation parts of a Modula module, which they are not.
> Thanks for the suggestion.
>
While you're at it, you may want to bracket the C declarations in the
header file with

#ifdef __cplusplus
extern "C" {
#endif // __cplusplus

...

#ifdef __cplusplus
} // extern "C"
#endif // __cplusplus

This way your C header file can be safely #include'd in the C++ files
that use it.

>> If you take arbitrary C code, and process it as C++ code,
>> there's a lot of minor issues that could cause problems.
>> See Annex C of the C++ standard for a list of differences
>> that is fairly complete and fairly comprehensive. Most of
>> the differences are likely to prevent compilation, or at
>> least trigger a mandatory diagnostic.
>
> That is a good reason to compile the C code with a C
> compiler into a separate object file or static library.
>

Definitely so.
Even if it happens to work now, the current scenario has a good chance
of causing maintenance headaches.

Brian Wood

Feb 7, 2021, 7:31:53 PM
I'm using a C compression library with my code generator:

https://github.com/Ebenezer-group/onwards/blob/master/src/quicklz.h

The author has
extern "C"
around function prototypes, but not around his data types.
Not sure if I should move the extern "C" line up so that it covers
the data types also? I've not run into any problems with it the
way it is, but don't have a lot of external users:

https://webEbenezer.net/about.html

As an aside, what do you think about merging that page into
my main page? I've merged some of my other pages into my
main page recently and have thought about going further
with it.


Brian
Ebenezer Enterprises - Enjoying programming again.
https://github.com/Ebenezer-group/onwards

Richard Damon

Feb 7, 2021, 7:51:01 PM
On 2/7/21 7:31 PM, Brian Wood wrote:
> I'm using a C compression library with my code generator:
>
> https://github.com/Ebenezer-group/onwards/blob/master/src/quicklz.h
>
> The author has
> extern "C"
> around function prototypes, but not around his data types.
> Not sure if I should move the extern "C" line up so that it covers
> the data types also? I've not run into any problems with it the
> way it is, but don't have a lot of external users:

The primary effect of the extern "C" is on function definitions. It
changes the calling conventions (if needed) and removes the effects of
name mangling for the parameter types. None of this applies to objects
or types.

I would need to study things more carefully to be sure that a C++
implementation couldn't make it cause a difference, but I suspect most
platform ABIs define things well enough to not allow it either.

Manfred

Feb 8, 2021, 7:40:07 AM
On 2/8/2021 1:50 AM, Richard Damon wrote:
> On 2/7/21 7:31 PM, Brian Wood wrote:
>> I'm using a C compression library with my code generator:
>>
>> https://github.com/Ebenezer-group/onwards/blob/master/src/quicklz.h
>>
>> The author has
>> extern "C"
>> around function prototypes, but not around his data types.
>> Not sure if I should move the extern "C" line up so that it covers
>> the data types also? I've not run into any problems with it the
>> way it is, but don't have a lot of external users:
>
> The primary effect of the extern "C" is on function definitions.

s/definitions/declarations/

> It
> changes the calling conventions (if needed) and removes the effects of
> name mangling for the parameter types. None of this applies to objects
> or types.
>
> I would need to study things more carefully to be sure that a C++
> implementation couldn't make it cause a difference, but I suspect most
> platform ABIs define things well enough to not allow it either.
>

One example that comes to mind is function pointers as struct members.
Since the entire header, including functions and types, constitutes the
interface to the module, I'd say that it's probably a good idea to
handle both as part of one entity and 'extern "C"' all of it.
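A sketch of that suggestion, with made-up names: the function-pointer member and the functions live inside one extern "C" region, so both languages agree on the linkage of everything the header declares.

```c
/* mylib.h -- hypothetical header; every name here is made up. */
#ifdef __cplusplus
extern "C" {
#endif

/* The function-pointer member is declared inside the extern "C"
   block, so when the header is consumed by C++ the pointed-to
   type has C language linkage, matching what C code stores in it. */
typedef struct callbacks {
    int (*on_event)(int code);
} callbacks;

int dispatch(const callbacks *cb, int code);

#ifdef __cplusplus
} /* extern "C" */
#endif

/* Implementation (would normally live in mylib.c): */
int dispatch(const callbacks *cb, int code)
{
    return cb->on_event ? cb->on_event(code) : -1;
}

/* A sample handler for demonstration. */
int double_code(int code) { return 2 * code; }
```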

An additional issue is that, unless all structs have their members
carefully ordered by decreasing size, some provision to enforce their
proper member alignment would be in order.

Brian Wood

Feb 11, 2021, 5:34:29 PM
On Monday, February 8, 2021 at 6:40:07 AM UTC-6, Manfred wrote:
> On 2/8/2021 1:50 AM, Richard Damon wrote:
> > On 2/7/21 7:31 PM, Brian Wood wrote:
> >> I'm using a C compression library with my code generator:
> >>
> >> https://github.com/Ebenezer-group/onwards/blob/master/src/quicklz.h
> >>
> >> The author has
> >> extern "C"
> >> around function prototypes, but not around his data types.
> >> Not sure if I should move the extern "C" line up so that it covers
> >> the data types also? I've not run into any problems with it the
> >> way it is, but don't have a lot of external users:
> >
> > The primary effect of the extern "C" is on function definitions.
> s/definitions/declarations/
> It
> > changes the calling conventions (if needed) and removes the effects of
> > name mangling for the parameter types. None of this applies to objects
> > or types.
> >
> > I would need to study things more carefully to be sure that a C++
> > implementation couldn't make it cause a difference, but I suspect most
> > platform ABIs define things well enough to not allow it either.
> >
> One example that comes to mind is function pointers as struct members.
> Since the entire header, including functions and types, constitutes the
> interface to the module, I'd say that it's probably a good idea to
> handle both as part of one entity and 'extern "C"' all of it.

I decided to go that route and enlarged what's covered by the
extern "C". Also changed this:

#if QLZ_COMPRESSION_LEVEL == 1 || QLZ_COMPRESSION_LEVEL == 2
struct qlz_state_decompress
{
#if QLZ_STREAMING_BUFFER > 0
    unsigned char stream_buffer[QLZ_STREAMING_BUFFER];
#endif
    qlz_hash_decompress hash[QLZ_HASH_VALUES];
    unsigned char hash_counter[QLZ_HASH_VALUES];
    size_t stream_counter;
};
#elif QLZ_COMPRESSION_LEVEL == 3
struct qlz_state_decompress
{
#if QLZ_STREAMING_BUFFER > 0
    unsigned char stream_buffer[QLZ_STREAMING_BUFFER];
#endif
#if QLZ_COMPRESSION_LEVEL <= 2
    qlz_hash_decompress hash[QLZ_HASH_VALUES];
#endif
    size_t stream_counter;
};
#endif

to this:

struct qlz_state_decompress
{
#if QLZ_STREAMING_BUFFER > 0
    unsigned char stream_buffer[QLZ_STREAMING_BUFFER];
#endif
#if QLZ_COMPRESSION_LEVEL == 1 || QLZ_COMPRESSION_LEVEL == 2
    qlz_hash_decompress hash[QLZ_HASH_VALUES];
    unsigned char hash_counter[QLZ_HASH_VALUES];
#endif
    size_t stream_counter;
};

Some light testing has not revealed a problem with the
change, but I mention it here for further scrutiny.

>
> An additional issue is that, unless all structs have their members
> carefully ordered by decreasing size, some provision to enforce their
> proper member alignment would be in order.

Thanks for the reminder about the ordering. I've not changed
anything related to that yet.
Doesn't the compiler insure the proper alignment of members?



Brian
Ebenezer Enterprises
https://webEbenezer.net

Chris M. Thomasson

Feb 11, 2021, 5:49:32 PM
All of those macros seem like they could be a nightmare to debug.
Perhaps you can use a function pointer to specific code. Say setting the
function pointer to qlz_compression_level_1 or something. A pure virtual
base class where you can implement each level separately?

Brian Wood

Feb 11, 2021, 9:09:50 PM
This code is from another developer and, to the best of my
knowledge, he doesn't provide a C++ version. I don't understand
his code very well so am limiting myself to making minor changes
and then running it by anyone interested.


Brian

David Brown

Feb 12, 2021, 3:25:27 AM
extern "C" language linkage does two things. It keeps the external
linkable names simple - no name mangling (for functions), no namespaces.
And it uses the C ABI for calling conventions, instead of the C++ ABI.
(I don't know of any compilers where this makes a difference - thus
function types and function pointer types are not going to be affected
on any platform where the ABIs match.)

But be careful that even within an extern "C" block, class member
declarations and member function type declarations are "C++".

<https://en.cppreference.com/w/cpp/language/language_linkage>


The common uses of extern "C" are either as a wrapper for a whole header
file (or at least the part covering declarations of C functions), or
specifically for one function at a time.

Both are horrible and unreadable. This is C++ - try making a template
or inherited structures, perhaps with a /single/ conditional compilation
part at the end to give an alias to the struct you want.

If you can't (or don't want to) avoid conditional compilation, consider
making duplications of the struct definition for the different choices,
so that you don't have conditionals in the middle of the definition.

A better choice, if you don't mind using gcc extensions, would be to
have all the members but with zero length arrays. Then it will all be
simpler, clearer, and easier to maintain.
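A sketch of the zero-length-array idea (a GNU C extension, not ISO C), using the quicklz names but assumed values for the configuration macros and a stand-in hash type:

```c
#include <stddef.h>

#define QLZ_COMPRESSION_LEVEL 3      /* assumed configuration */
#define QLZ_STREAMING_BUFFER  0
#define QLZ_HASH_VALUES       4096

typedef unsigned int qlz_hash_decompress;  /* stand-in for the real type */

/* Collapse each conditional into an array length of 0 when the
   feature is disabled: every member is always declared, but the
   disabled ones occupy no space (GNU zero-length-array extension),
   so no #if appears inside the struct body itself. */
#if QLZ_COMPRESSION_LEVEL <= 2
#define QLZ_HASH_LEN QLZ_HASH_VALUES
#else
#define QLZ_HASH_LEN 0
#endif

struct qlz_state_decompress {
    unsigned char stream_buffer[QLZ_STREAMING_BUFFER];  /* 0 here */
    qlz_hash_decompress hash[QLZ_HASH_LEN];
    unsigned char hash_counter[QLZ_HASH_LEN];
    size_t stream_counter;
};
```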

>
> Some light testing has not revealed a problem with the
> change, but mention it here for further scrutiny.
>
>>
>> An additional issue is that, unless all structs have their members
>> carefully ordered by decreasing size, some provision to enforce their
>> proper member alignment would be in order.
>
> Thanks for the reminder about the ordering. I've not changed
> anything related to that yet.
> Doesn't the compiler insure the proper alignment of members?
>

Compilers don't "insure" anything - that's what insurance companies are
for. Compilers /ensure/ correct alignment. And they do so with a
complete disregard to the ordering of fields in the struct - though the
ordering may affect the padding needed to ensure the correct alignments.
The alignment of the struct is the maximum of the alignments of the
members (or higher, if you use "alignas").
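That rule can be checked directly with C11's alignof/alignas; the exact numbers are platform-dependent, so the assertions below assume a typical target where int has alignment greater than 1:

```c
#include <stdalign.h>

struct S1 { char a; int b; char c; };    /* strictest member: int */
struct S2 { char a; char c; };           /* strictest member: char */
struct S3 { alignas(16) char buf[16]; }; /* alignment raised by alignas */

/* Struct alignment is the maximum of the member alignments
   (assuming the implementation doesn't raise it further): */
_Static_assert(alignof(struct S1) == alignof(int), "max of members");
_Static_assert(alignof(struct S2) == 1, "all-char struct: no padding needed");
_Static_assert(alignof(struct S3) == 16, "alignas can only raise alignment");
```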

David Brown

Feb 12, 2021, 3:38:43 AM
On 11/02/2021 23:49, Chris M. Thomasson wrote:

> All of those macros seem like they could be a nightmare to debug.
> Perhaps you can use a function pointer to specific code. Say setting the
> function pointer to qlz_compression_level_1 or something. A pure virtual
> base class where you can implement each level separately?
>

Adding function pointers rarely (IME) makes code easier to debug. It
might, conceivably, make it easier to read. Virtual base classes are
even worse for debugging, and code efficiency, and entirely unnecessary
for a struct of plain data.

struct common_bits {
    size_t stream_counter;
};

#if QLZ_STREAMING_BUFFER > 0
struct streaming_bits {
    unsigned char stream_buffer[QLZ_STREAMING_BUFFER];
};
#else
struct streaming_bits {
};
#endif

struct qlz_state_decompress : common_bits, streaming_bits {};


No virtual inheritance, function pointers, overhead, hidden pointers,
etc. It's just simple data.

I think with the way his macros are defined, he is not going to get rid
of the conditional compilation entirely, but it should be structured in
a way that has the least impact on readability. (That is, as always, a
somewhat subjective quality.) With gcc zero-length arrays, the code is
a bit neater.

Manfred

Feb 12, 2021, 11:15:27 AM
On 2/12/2021 9:25 AM, David Brown wrote:
> Compilers don't "insure" anything - that's what insurance companies are
> for. Compilers /ensure/ correct alignment. And they do so with a
> complete disregard to the ordering of fields in the struct - though the
> ordering may affect the padding needed to ensure the correct alignments.
> The alignment of the struct is the maximum of the alignments of the
> members (or higher, if you use "alignas").

You are right that the standard does not relate alignment or padding
with member order, and in fact gcc does not do that - thanks for
pointing this out.

However, from the MSVC docs:

https://docs.microsoft.com/en-us/cpp/build/reference/zp-struct-member-alignment?view=msvc-160#remarks

"The compiler stores members after the first one on a boundary that's
the smaller of either the size of the member type, or an N-byte boundary."

https://docs.microsoft.com/en-us/cpp/preprocessor/pack?view=msvc-160#parameters

"The alignment of a member is on a boundary that's either a multiple of
n, or a multiple of the size of the member, whichever is smaller."

So, this compiler /does/ insert padding between members depending on their order.

Chris M. Thomasson

Feb 12, 2021, 4:01:21 PM
Humm.... Good points.

David Brown

Feb 13, 2021, 8:57:34 AM
As I said, the /padding/ between members in a struct does depend on
their order. Their /alignment/ does not. That applies to MSVC, gcc,
and any other compiler.

If you have:

struct S {
    char a;
    int b;
    char c;
};

where "int" is size 4 and alignment 4, then S will have an alignment of
4, there will be 3 bytes of padding between "a" and "b", and three bytes
of padding after "c" - giving a total size of 12.

If you arrange it as:

struct S {
    char a;
    char c;
    int b;
};

then there will be 2 bytes of padding after "c", and a total size of 8
(with the same alignment of 4).


If you arrange it as:

struct S {
    int b;
    char a;
    char c;
};

then there will be 2 bytes of padding after "c" at the end of the
structure, and a total size of 8 (with the same alignment of 4).

Padding depends on the order, alignment does not.
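The three layouts above can be verified at compile time (assuming, as in the examples, an int of size 4 and alignment 4, which holds on typical desktop targets):

```c
struct S_cic { char a; int b; char c; };  /* char, int, char */
struct S_cci { char a; char c; int b; };  /* char, char, int */
struct S_icc { int b; char a; char c; };  /* int, char, char */

/* Sizes from the examples: 3+3 padding bytes vs. 2 vs. 2. */
_Static_assert(sizeof(struct S_cic) == 12, "interior + tail padding");
_Static_assert(sizeof(struct S_cci) == 8,  "tail padding only");
_Static_assert(sizeof(struct S_icc) == 8,  "tail padding only");

/* Alignment is the same for all three orderings. */
_Static_assert(_Alignof(struct S_cic) == _Alignof(int), "order-independent");
_Static_assert(_Alignof(struct S_cci) == _Alignof(int), "order-independent");
```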


Now, compilers are free to /increase/ alignments if they want, adding
padding as needed to support that. It's not uncommon on 64-bit systems
to use 8-byte alignment on structures, and data on stacks or in
statically allocated memory can be given extra alignment - this can aid
cache friendly memory layouts. And of course compilers can offer
options or extensions to give different alignment (and thereby padding)
arrangements, even if that breaks the platform's ABI.



mick...@potatofield.co.uk

Feb 13, 2021, 11:39:55 AM
On Sat, 13 Feb 2021 14:57:19 +0100
David Brown <david...@hesbynett.no> wrote:
>Padding depends on the order, alignment does not.
>
>
>Now, compilers are free to /increase/ alignments if they want, and
>padding as needed to support that. It's not uncommon on 64-bit systems
>to use 8-bit alignment on structures, and data on stacks or in
>statically allocated memory can be given extra alignment - this can aid
>cache friendly memory layouts. And of course compilers can offer
>options or extensions to give different alignment (and thereby padding)
>arrangements, even if that breaks the platform's ABI.

#pragma pack is absolutely essential if you're mapping a structure directly
to a memory block or serialising the block into a structure such as network
packet header/data or in a device driver so you let the compiler know exactly
how you want padding if any.

David Brown

Feb 13, 2021, 12:20:09 PM
No, it is not essential. It can be convenient, but it is far from the
only way to achieve the padding you want or to handle externally defined
structures (network packets, file formats, hardware registers, etc.).
I've seen it misused much more often than I've seen it used appropriately.

When you need to control padding, I usually find it cleaner and more
reliable (as well as more portable) to put the padding in explicitly,
combined with static assertions to confirm that everything is the right
size. Non-aligned data can be read or written using memcpy(), and good
compilers will optimise the results at least as efficiently as accessing
packed structures.

Some older or poorer compilers can't do a decent job of optimising
memcpy(), and there, some kind of compiler-specific "pack" solution can
be helpful.
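A sketch of that approach with a made-up wire format: padding is spelled out as a named member, a static assertion pins the size, and memcpy handles any alignment.

```c
#include <stdint.h>
#include <string.h>

/* A wire-format record laid out with explicit padding instead of
   #pragma pack (field names are hypothetical): */
struct record {
    uint8_t  tag;
    uint8_t  pad[3];   /* explicit padding, part of the format */
    uint32_t value;
};

/* Compile-time check that no hidden padding crept in: */
_Static_assert(sizeof(struct record) == 8, "record must be 8 bytes");

/* Reading a possibly unaligned record out of a raw buffer: */
struct record read_record(const unsigned char *buf)
{
    struct record r;
    memcpy(&r, buf, sizeof r);   /* safe for any alignment */
    return r;
}
```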


Brian Wood

Feb 13, 2021, 9:42:52 PM
What I want to be sure of is that the second form is a benign
refactoring.

>This is C++ - try making a template
> or inherited structures, perhaps with a /single/ conditional compilation
> part at the end to give an alias to the struct you want.
>

This is a C library that I'm using.


Brian
Ebenezer Enterprises - Enjoying programming again.
https://webEbenezer.net

mick...@potatofield.co.uk

Feb 15, 2021, 4:28:03 AM
On Sat, 13 Feb 2021 18:19:55 +0100
David Brown <david...@hesbynett.no> wrote:
>On 13/02/2021 17:39, mick...@potatofield.co.uk wrote:
>> #pragma pack is absolutely essential if you're mapping a structure directly
>> to a memory block or serialising the block into a structure such as network
>> packet header/data or in a device driver so you let the compiler know
>exactly
>> how you want padding if any.
>>
>
>No, it is not essential. It can be convenient, but it is far from the
>only way to achieve the padding you want or to handle externally defined

If you're creating the memory layout then put in whatever padding you want.

>size. Non-aligned data can read or written using memcpy(), and good

Yuck, very messy! But everyone has a preferred style, I suppose.

David Brown

Feb 15, 2021, 4:47:53 AM
memcpy can work simply and easily, it's portable, and with a good
compiler it is usually optimally efficient with known fixed sizes. And
it is always correct code, unlike some things people do with packed
structs (like taking the address of non-aligned fields - something that
may not work as expected). Any messiness can easily be wrapped in a C++
class or template - that's why you use C++.

mick...@potatofield.co.uk

Feb 15, 2021, 5:03:31 AM
On Mon, 15 Feb 2021 10:47:39 +0100
David Brown <david...@hesbynett.no> wrote:
>On 15/02/2021 10:27, mick...@potatofield.co.uk wrote:
>> On Sat, 13 Feb 2021 18:19:55 +0100
>> David Brown <david...@hesbynett.no> wrote:
>>> On 13/02/2021 17:39, mick...@potatofield.co.uk wrote:
>>>> #pragma pack is absolutely essential if you're mapping a structure directly
>
>>>> to a memory block or serialising the block into a structure such as
>network
>>>> packet header/data or in a device driver so you let the compiler know
>>> exactly
>>>> how you want padding if any.
>>>>
>>>
>>> No, it is not essential. It can be convenient, but it is far from the
>>> only way to achieve the padding you want or to handle externally defined
>>
>> If you're creating the memory layout then put in whatever padding you want.
>>
>>> size. Non-aligned data can read or written using memcpy(), and good
>>
>> Yuck, very messy! But everyone has a prefered style I suppose.
>>
>
>memcpy can work simply and easily, it's portable, and with a good
>compiler it is usually optimally efficient with known fixed sizes. And

No memcpy is going to be more efficient than a 1 line pointer cast which is
all you need to do with mapping a structure onto a block of memory. Plus for
any significant number of fields - eg for a TCP header with 9 fields - you're
going to have an equivalent number of memcpys in the code which frankly
looks fugly and is much harder to visually parse.

eg:
tcp_hdr *thdr = (tcp_hdr *)packet;

vs

memcpy((char *)thdr.src,packet,2);
memcpy((char *)thdr.dst,packet+2,2);
memcpy((char *)thdr.seqnum,packet+4,4)
etc etc

Blech. No thanks.

David Brown

Feb 15, 2021, 6:35:26 AM
On 15/02/2021 11:03, mick...@potatofield.co.uk wrote:
> On Mon, 15 Feb 2021 10:47:39 +0100
> David Brown <david...@hesbynett.no> wrote:
>> On 15/02/2021 10:27, mick...@potatofield.co.uk wrote:
>>> On Sat, 13 Feb 2021 18:19:55 +0100
>>> David Brown <david...@hesbynett.no> wrote:
>>>> On 13/02/2021 17:39, mick...@potatofield.co.uk wrote:
>>>>> #pragma pack is absolutely essential if you're mapping a structure directly
>>
>>>>> to a memory block or serialising the block into a structure such as
>> network
>>>>> packet header/data or in a device driver so you let the compiler know
>>>> exactly
>>>>> how you want padding if any.
>>>>>
>>>>
>>>> No, it is not essential. It can be convenient, but it is far from the
>>>> only way to achieve the padding you want or to handle externally defined
>>>
>>> If you're creating the memory layout then put in whatever padding you want.
>>>
>>>> size. Non-aligned data can read or written using memcpy(), and good
>>>
>>> Yuck, very messy! But everyone has a prefered style I suppose.
>>>
>>
>> memcpy can work simply and easily, it's portable, and with a good
>> compiler it is usually optimally efficient with known fixed sizes. And
>
> No memcpy is going to be more efficient than a 1 line pointer cast which is
> all you need to do with mapping a structure onto a block of memory.

Yes, it is - because the compiler knows what memcpy does, and can
optimise appropriately. memcpy does not have to be implemented as an
external library function call!


#include <stdint.h>
#include <string.h>

#pragma pack(1)
typedef struct S {
    int8_t a;
    int32_t b;
    int64_t c;
} S;

int getsize(void) { return sizeof(S); }

int32_t getb1(const S* p) {
    return p->b;
}

int32_t getb2(const S* p) {
    int32_t x;
    memcpy(&x, &p->b, sizeof x);
    return x;
}

int32_t getb3(const S* p) {
    const uint8_t * q = (const uint8_t *) &p->b;
    int32_t x;
    uint8_t b[sizeof x];
    for (size_t i = 0; i < sizeof x; i++) {
        b[i] = *q++;
    }
    memcpy(&x, b, sizeof x);
    return x;
}

gcc turns all of these into a single "mov" instruction. Modern
compilers (and many old ones) can do a lot of nice things in their
optimisation. Standard library functions are specified in the standard,
and the compiler can take advantage of that.

> Plus for
> any significant number of fields - eg for a TCP header with 9 fields - you
> going to have an equivalent number of memcpys in the code which frankly
> looks fugly and is much harder to visually parse.
>
> eg:
> tcp_hdr *thdr = (tcp_hdr *)packet;
>
> vs
>
> memcpy((char *)thdr.src,packet,2);
> memcpy((char *)thdr.dst,packet+2,2);
> memcpy((char *)thdr.seqnum,packet+4,4)
> etc etc
>
> Blech. No thanks.
>

You already have to handle accessing these with the right endianness.
You can't just read the fields and use the values. (Well, you can if
you have a compiler with extensions that cover endian specifications and
those are used in the struct definition - but that is far from standard.)
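For instance, network-order (big-endian) fields can be pulled straight from the raw bytes, portably and independent of both host endianness and struct packing (the helper names here are mine, not from the thread):

```c
#include <stdint.h>

/* Read a 16-bit big-endian (network order) field from a raw buffer. */
uint16_t read_be16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Read a 32-bit big-endian field the same way; works regardless of
   the alignment of p or the endianness of the host. */
uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```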

James Kuyper

Feb 15, 2021, 8:43:47 AM
On 2/15/21 5:03 AM, mick...@potatofield.co.uk wrote:
> On Mon, 15 Feb 2021 10:47:39 +0100
> David Brown <david...@hesbynett.no> wrote:
...
>> memcpy can work simply and easily, it's portable, and with a good
>> compiler it is usually optimally efficient with known fixed sizes. And
>
> No memcpy is going to be more efficient than a 1 line pointer cast which is
> all you need to do with mapping a structure onto a block of memory.

He said "optimally efficient", not "more efficient". What he's
suggesting is that a compiler could easily optimize a call to memcpy()
into exactly the same machine code that would be generated for the code
you describe.

> memcpy((char *)thdr.src,packet,2);
> memcpy((char *)thdr.dst,packet+2,2);
> memcpy((char *)thdr.seqnum,packet+4,4)

Why the (char*) casts? memcpy() takes void*, which can be implicitly
converted to from any object pointer type, which is the only reason why
the char* cast works. A direct conversion to void* is no less safe than
an indirect one using char* as an intermediate step.

mick...@potatofield.co.uk

Feb 15, 2021, 10:31:35 AM
On Mon, 15 Feb 2021 12:35:10 +0100
David Brown <david...@hesbynett.no> wrote:
>On 15/02/2021 11:03, mick...@potatofield.co.uk wrote:
>> No memcpy is going to be more efficient than a 1 line pointer cast which is
>> all you need to do with mapping a structure onto a block of memory.
>
>Yes, it is - because the compiler knows what memcpy does, and can
>optimise appropriately. memcpy does not have to be implemented as an
>external library function call!

So what? 9 memcpys will be a min of 9 mov's at best. 1 cast is 1 mov
though potentially zero depending on how smart the compiler is.

>> Blech. No thanks.
>>
>
>You already have to handle accessing these with the right endianness.

So what? You'd have to call ntoh*() or similar after the fact regardless of
what method you used. I suppose you could write your own endian-aware memcpy
for numeric values, but why bother; plus it's unlikely to be very efficient.

mick...@potatofield.co.uk

Feb 15, 2021, 10:33:39 AM
On Mon, 15 Feb 2021 08:43:31 -0500
James Kuyper <james...@alumni.caltech.edu> wrote:
>On 2/15/21 5:03 AM, mick...@potatofield.co.uk wrote:
>> On Mon, 15 Feb 2021 10:47:39 +0100
>> David Brown <david...@hesbynett.no> wrote:
>....
>>> memcpy can work simply and easily, it's portable, and with a good
>>> compiler it is usually optimally efficient with known fixed sizes. And
>>
>> No memcpy is going to be more efficient than a 1 line pointer cast which is
>> all you need to do with mapping a structure onto a block of memory.
>
>He said "optimally efficient", not "more efficient". What he's
>suggesting is that a compiler could easily optimize a call to memcpy()
>into exactly the same machine code that would be generated for the code
>you describe.

No it wouldn't. Maybe for 1 but not for 9.

>> memcpy((char *)thdr.src,packet,2);
>> memcpy((char *)thdr.dst,packet+2,2);
>> memcpy((char *)thdr.seqnum,packet+4,4)
>
>Why the (char*) casts? memcpy() takes void*, which can be implicitly

Habit. Plus I forgot &.

David Brown

Feb 15, 2021, 11:56:13 AM
On 15/02/2021 16:31, mick...@potatofield.co.uk wrote:
> On Mon, 15 Feb 2021 12:35:10 +0100
> David Brown <david...@hesbynett.no> wrote:
>> On 15/02/2021 11:03, mick...@potatofield.co.uk wrote:
>>> No memcpy is going to be more efficient than a 1 line pointer cast which is
>>> all you need to do with mapping a structure onto a block of memory.
>>
>> Yes, it is - because the compiler knows what memcpy does, and can
>> optimise appropriately. memcpy does not have to be implemented as an
>> external library function call!
>
> So what? 9 memcpys will be a min of 9 mov's at best. 1 cast is 1 mov
> though potentially zero depending on how smart the compiler is.

Casts frequently don't need any instructions - pointer casts on most
systems are free at run time. /Accessing/ the data takes instructions.
The point is that with a good enough compiler (and sensible flags),
memcpy is going to give you the same code.

The key difference is that casting pointer types then using them to
access data is often lying to the compiler - for all but a handful of
exceptions, it is behaviour undefined by the standard. This means you
can easily get something that works fine in your simple tests, but fails
in more complex situations when code is inlined, link-time optimised, or
otherwise used in more advanced code. Memcpy, on the other hand, is
well specified and safe.

>
>>> Blech. No thanks.
>>>
>>
>> You already have to handle accessing these with the right endianness.
>
> So what? You'd have to call ntoh*() or similar after the fact regardless of
> what method you used. I suppose you could write your own endian aware memcpy
> for numeric values but why bother plus its unlikely to be very efficient.
>

The point is that you have to have code for accessing the fields, you
can't just use them directly. And when you have an accessor function
anyway, you might as well write it correctly, safely and portably - it
will be just as efficient.

I am not at all suggesting that memcpy is always the best way to write
code - merely that it is not an expensive way to do it either in terms
of run-time costs or in source code clarity, and it is often safer and
more portable. People have been writing code to access network-defined
or file format defined structures since C has been in existence, and
#pragma pack is neither necessary nor sufficient for the task.
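To make the accessor idea concrete, here is a minimal sketch along the lines being described (the names get_be16, get_be32 and get_u32 are illustrative, not from any post in this thread) of reading big-endian network fields from an arbitrary, possibly unaligned buffer without packed structs or pointer casts:

```c
#include <stdint.h>
#include <string.h>

/* Read a 16-bit big-endian (network order) value from any buffer
   position.  Byte access through unsigned char is always legal, so
   neither alignment nor aliasing is a concern. */
static uint16_t get_be16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static uint32_t get_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* memcpy variant: copies the bytes into a properly typed object.  A
   good compiler turns this into a single load where the architecture
   allows it.  The result is in host byte order, so network data still
   needs an ntohl()-style swap afterwards. */
static uint32_t get_u32(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

With a reasonable optimisation level, gcc and clang compile both styles down to the same handful of instructions, which is the point being argued here.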

David Brown

Feb 15, 2021, 11:58:09 AM
On 15/02/2021 16:33, mick...@potatofield.co.uk wrote:
> On Mon, 15 Feb 2021 08:43:31 -0500
> James Kuyper <james...@alumni.caltech.edu> wrote:
>> On 2/15/21 5:03 AM, mick...@potatofield.co.uk wrote:
>>> On Mon, 15 Feb 2021 10:47:39 +0100
>>> David Brown <david...@hesbynett.no> wrote:
>> ....
>>>> memcpy can work simply and easily, it's portable, and with a good
>>>> compiler it is usually optimally efficient with known fixed sizes. And
>>>
>>> No memcpy is going to be more efficient than a 1 line pointer cast which is
>>> all you need to do with mapping a structure onto a block of memory.
>>
>> He said "optimally efficient", not "more efficient". What he's
>> suggesting is that a compiler could easily optimize a call to memcpy()
>> into exactly the same machine code that would be generated for the code
>> you describe.
>
> No it wouldn't. Maybe for 1 but not for 9.

I think you are misunderstanding something here.

If you read three items via a pointer, you will (in general) have a
minimum of three read instructions. If you do it with three optimised
memcpy() calls, you also have three read instructions. The code is the
same.

mick...@potatofield.co.uk

Feb 15, 2021, 12:16:43 PM
On Mon, 15 Feb 2021 17:55:58 +0100
David Brown <david...@hesbynett.no> wrote:
>On 15/02/2021 16:31, mick...@potatofield.co.uk wrote:
>> On Mon, 15 Feb 2021 12:35:10 +0100
>> So what? 9 memcpys will be a min of 9 mov's at best. 1 cast is 1 mov
>> though potentially zero depending on how smart the compiler is.
>
>Casts frequently don't need any instructions - pointer casts on most
>systems are free at run time. /Accessing/ the data takes instructions.
> The point is that with a good enough compiler (and sensible flags),
>memcpy is going to give you the same code.
>
>The key difference is that casting pointer types then using them to
>access data is often lying to the compiler - for all but a handful of

Sorry? It's standard C. Perhaps it's frowned on in C++ but I've been doing
network programming for a couple of decades and this method is used all over
the place. No one does 50 memcpys if there's a memory structure with 50
fields in it just for the sake of ivory tower correctness, you'd have to
be insane. A structure only has to be correct once in the header, memcpys have
to be correct everywhere you use them.

If you don't believe me have a look in any of the /usr/include/linux network
header files and then go through this and check out the casting to structs:

https://github.com/torvalds/linux/blob/master/net/ipv4/tcp.c

>exceptions, it is behaviour undefined by the standard. This means you
>can easily get something that works fine in your simple tests, but fails
>in more complex situations when code is inlined, link-time optimised, or

Rubbish. Maybe in Windows but that doesn't concern me.

>> So what? You'd have to call ntoh*() or similar after the fact regardless of
>> what method you used. I suppose you could write your own endian aware memcpy
>> for numeric values but why bother plus its unlikely to be very efficient.
>>
>
>The point is that you have to have code for accessing the fields, you
>can't just use them directly. And when you have an accessor function

Wtf are you talking about? You just access them as structure fields. There
may be a small cost in dereferencing but there's a large gain in code
readability and correctness.

>more portable. People have been writing code to access network-defined
>or file format defined structures since C has been in existence, and
>#pragma pack is neither necessary nor sufficient for the task.

Whether it's pragma pack or attribute packed, it's used a lot in Linux.

$ pwd
/usr/include/linux
$ grep __attribute__ *.h | grep packed | wc -l
239

But what do they know?

James Kuyper

Feb 15, 2021, 3:54:51 PM
On 2/15/21 12:16 PM, mick...@potatofield.co.uk wrote:
> On Mon, 15 Feb 2021 17:55:58 +0100
> David Brown <david...@hesbynett.no> wrote:
...
>> The key difference is that casting pointer types then using them to
>> access data is often lying to the compiler - for all but a handful of
>
> Sorry? It's standard C. Perhaps it's frowned on in C++ but I've been doing
> network programming for a couple of decades and this method is used all over
> the place.

The C++ rules are stricter than the C rules, but it's also a problem in
C. Type punning is standard C, but there are restrictions on when it can
safely be used. Those restrictions are defined in terms of the
"effective type" of a piece of memory. For objects with a declared type,
the effective type is the same as the declared type. For memory with no
declared type (which basically means dynamically allocated memory), the
effective type is set by the last store into that memory using a
non-character type T. If you used an lvalue of type T to store the
value, then the memory has an effective type of T. If you use methods
such as memcpy() or memmove(), to copy an entire object over into such
memory, or if you copied it over as an array of character type, that
memory acquires the same effective type as the object it was copied from.
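A small hypothetical sketch of the effective-type rule just described, for dynamically allocated memory (the function name is invented for illustration):

```c
#include <stdlib.h>
#include <string.h>

/* malloc'ed memory has no declared type; its effective type is set by
   the last store into it.  Copying a double in with memcpy() gives the
   memory the effective type double, as described above. */
double *make_double(double d)
{
    double *dp = malloc(sizeof *dp);
    if (!dp)
        return NULL;
    memcpy(dp, &d, sizeof d);   /* effective type is now double */
    /* Reading *dp as a double is fine; reading the same bytes through,
       say, a long long * lvalue would violate 6.5p7. */
    return dp;
}
```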

The relevant rule violated by many kinds of type punning is the
anti-aliasing rule:

"An object shall have its stored value accessed only by an lvalue
expression that has one of
the following types: 88)
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of
the object,
— a type that is the signed or unsigned type corresponding to the
effective type of the object,
— a type that is the signed or unsigned type corresponding to a
qualified version of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned
types among its members (including, recursively, a member of a
subaggregate or contained union), or
— a character type." (C standard, 6.5p7).

Since that "shall" occurs outside of a constraints section, type punning
that violates the above rule has undefined behavior. Here's an example
that shows what can go wrong as a result of violating that rule. Given:

U func(T *pt, U *pu) {
    *pt = 0;
    return *pu;
}

then *pt acquires the effective type of T. If U is not one of the types
permitted by the anti-aliasing rule, a compiler is not obligated to
consider the possibility that pt and pu might point to overlapping
blocks of memory. It could, therefore, delay the write to *pt until
after it has read the value of *pu. In such a simple piece of code, it's
unlikely to do so, but in more complicated code there's a very good
chance of such optimizations occurring.
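As a concrete instance of the generic func() above (a sketch, with T and U picked as the incompatible pair long and float):

```c
/* Because long and float are not compatible types, a compiler applying
   the anti-aliasing rule may assume pl and pu never point at the same
   memory, and is therefore free to reorder the store and the load. */
float func_instance(long *pl, float *pu)
{
    *pl = 0;
    return *pu;
}
```

Called with distinct objects this is perfectly well defined; it is only when pl and pu are made to point at overlapping memory that the undefined behaviour, and the possible reordering, bites.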

Unions provide a way to avoid this problem (see 6.5.2.3p3, and pay
attention to footnote 95), but that way only works if the object in
question is actually of the union type, and only if the declaration of
that union is in scope at the point where the problem could otherwise occur.
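A rough illustration of the union route (defined in C per 6.5.2.3p3; note the C++ rules differ, and the names here are invented):

```c
#include <stdint.h>

/* Type punning through a union: defined behaviour in C when the access
   goes through the union object itself. */
union word_bytes {
    uint32_t u;
    unsigned char b[4];
};

unsigned char first_byte(uint32_t v)
{
    union word_bytes w;
    w.u = v;
    return w.b[0];   /* which byte this is depends on endianness */
}
```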

...
>> exceptions, it is behaviour undefined by the standard. This means you
>> can easily get something that works fine in your simple tests, but fails
>> in more complex situations when code is inlined, link-time optimised, or
>
> Rubbish. Maybe in Windows but that doesn't concern me.

It's not just Windows - compilers that take advantage of the
anti-aliasing rules to optimize code generation are quite common.

Paavo Helde

Feb 15, 2021, 5:00:48 PM
On 15.02.2021 19:16, mick...@potatofield.co.uk wrote:
> On Mon, 15 Feb 2021 17:55:58 +0100
> David Brown <david...@hesbynett.no> wrote:

>> exceptions, it is behaviour undefined by the standard. This means you
>> can easily get something that works fine in your simple tests, but fails
>> in more complex situations when code is inlined, link-time optimised, or
>
> Rubbish. Maybe in Windows but that doesn't concern me.

FYI, the biggest "culprit" in this area has been gcc in recent years. It
is keen to optimize away things which are formally UB, like infinite
loops. For some pointer conversions it helpfully warns you that it is
planning to break your code ("dereferencing type-punned pointer will
break strict-aliasing rules"). For some other kind of UB one might not
get so lucky.

MSVC, on the other hand, is generally much more careful to keep alive
tons of crap code produced by hordes of cowboy programmers during last
decades, only because such code accidentally happened to work at some
time in the past.

And yes, this is C++, not C, the rules are different.

mick...@potatofield.co.uk

Feb 16, 2021, 4:48:00 AM
On Mon, 15 Feb 2021 15:54:36 -0500
James Kuyper <james...@alumni.caltech.edu> wrote:
>On 2/15/21 12:16 PM, mick...@potatofield.co.uk wrote:
>> Sorry? It's standard C. Perhaps it's frowned on in C++ but I've been doing
>> network programming for a couple of decades and this method is used all over
>
>> the place.
>
>The C++ rules are stricter than the C rules, but it's also a problem in
>C. Type punning is standard C, but there are restrictions on when it can
>safely be used. Those restrictions are defined in terms of the
>"effective type" of a piece of memory. For objects with a declared type,
>the effective type is the same as the declared type. For memory with no
>declared type (which basically means dynamically allocated memory), the
>effective type is set by the last store into that memory using a
>non-character type T. If you used an lvalue of type T to store the
>value, then the memory has an effective type of T. If you use methods
>such as memcpy() or memmove(), to copy an entire object over into such
>memory, or if you copied it over as an array of character type, that
>memory acquires the same effective type as the object it was copied from.

Memory is memory, it doesn't have a type. How the compiler sees it is another
matter of course but unless a C/C++ compiler wants to break a huge amount of
code it's going to have to treat memory in this instance as void.

>The relevant rule violated by many kinds of type punning is the
>anti-aliasing rule:

Fine, but like it or not type punning has been de facto standard C for a very
long time and any C compiler (and C++ in many cases) breaks it at its peril.

>Unions provide a way to avoid this problem (see 6.5.2.3p3, and pay
>attention to footnote 95), but that way only works if the object in
>question is actually of the union type, and only if the declaration of
>that union is in scope at the point where the problem could otherwise occur.

Unions are another matter entirely mainly because endianess issues tend to
occur with them regardless of memory alignment.

>It's not just Windows - compilers that take advantage of the
>anti-aliasing rules to optimize code generation are quite common.

IME most compilers when pushed to do heavy optimisation start making subtle
mistakes here and there. Any heavily optimised code should always be tested
much more thoroughly than non-optimised code before it's released.

mick...@potatofield.co.uk

Feb 16, 2021, 4:49:36 AM
On Tue, 16 Feb 2021 00:00:33 +0200
Paavo Helde <myfir...@osa.pri.ee> wrote:
>On 15.02.2021 19:16, mick...@potatofield.co.uk wrote:
>> On Mon, 15 Feb 2021 17:55:58 +0100
>> David Brown <david...@hesbynett.no> wrote:
>
>>> exceptions, it is behaviour undefined by the standard. This means you
>>> can easily get something that works fine in your simple tests, but fails
>>> in more complex situations when code is inlined, link-time optimised, or
>>
>> Rubbish. Maybe in Windows but that doesn't concern me.
>
>FYI, the biggest "culprit" in this area has been gcc in recent years. It

gcc optimisation has always been a bit flaky when you go beyond -O2 anyway.
Their optimisation system seems to be a permanent work in progress IMO.

David Brown

Feb 16, 2021, 5:20:37 AM
On 15/02/2021 18:16, mick...@potatofield.co.uk wrote:
> On Mon, 15 Feb 2021 17:55:58 +0100
> David Brown <david...@hesbynett.no> wrote:
>> On 15/02/2021 16:31, mick...@potatofield.co.uk wrote:
>>> On Mon, 15 Feb 2021 12:35:10 +0100
>>> So what? 9 memcpys will be a min of 9 mov's at best. 1 cast is 1 mov
>>> though potentially zero depending on how smart the compiler is.
>>
>> Casts frequently don't need any instructions - pointer casts on most
>> systems are free at run time. /Accessing/ the data takes instructions.
>> The point is that with a good enough compiler (and sensible flags),
>> memcpy is going to give you the same code.
>>
>> The key difference is that casting pointer types then using them to
>> access data is often lying to the compiler - for all but a handful of
>
> Sorry? It's standard C. Perhaps it's frowned on in C++ but I've been doing
> network programming for a couple of decades and this method is used all over
> the place. No one does 50 memcpys if there's a memory structure with 50
> fields in it just for the sake of ivory tower correctness, you'd have to
> be insane. A structure only has to be correct once in the header, memcpys have
> to be correct everywhere you use them.

Casting pointers is allowed in C and C++ - but there are very tight
limits on what you are actually able to do with them in a portable way
(by that I mean there are limits on which accesses have defined
behaviour in the standards). But you don't need pointer casts to access
structs or fields in structs - you only need them if you are messing
about taking a pointer to one type of object and using it as though it
were a pointer to a different type of object. And it is that kind of
usage that is risky - there are lots of situations where people /think/
it is valid code, and it works when they test it, but it comes without
guarantees and might fail in other circumstances.

I get the impression here that there is a bit of mismatch between what I
have been trying to say, and what you think I have been saying. I am
not sure how - whether I was unclear or you misunderstood. But to go
back to the beginning, you claimed that "packed" structs were required
to handle pre-defined structures such as for network packets, and I
pointed out that this is not correct - you can, for example, use memcpy
to access the data in a portable and efficient manner. Do you agree on
that point?


>
> If you don't believe me have a look in any of the /usr/include/linux network
> header files and then go through this and check out the casting to structs:
>
> https://github.com/torvalds/linux/blob/master/net/ipv4/tcp.c
>

Can you give a more specific reference? I'd rather not read through
four thousand lines.

>> exceptions, it is behaviour undefined by the standard. This means you
>> can easily get something that works fine in your simple tests, but fails
>> in more complex situations when code is inlined, link-time optimised, or
>
> Rubbish. Maybe in Windows but that doesn't concern me.

Who has been talking about Windows? I have been talking about C and C++.

If you mess about with pointers and objects in a way that breaks the
"strict aliasing rules", simple test code /usually/ works as you might
expect (as the "obvious" implementation is typically already optimal).
But in more complex situations the compiler might be able to generate
more efficient code by using the knowledge that you cannot use a
pointer-to-int to change a float object, for example.

<https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gcc/Optimize-Options.html#index-fstrict-aliasing>
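A classic sketch of the contrast being drawn (float_bits_bad and float_bits are illustrative names; the expected bit pattern assumes IEEE 754 single precision):

```c
#include <stdint.h>
#include <string.h>

/* The kind of pointer cast under discussion: it compiles cleanly and
   often appears to work, but it violates the strict aliasing rules,
   so it may misbehave once inlined or link-time optimised. */
uint32_t float_bits_bad(const float *f)
{
    return *(const uint32_t *)f;    /* undefined behaviour */
}

/* The well-defined form.  gcc and clang typically compile this memcpy
   to the same single move instruction the cast would produce. */
uint32_t float_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}
```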


>
>>> So what? You'd have to call ntoh*() or similar after the fact regardless of
>>> what method you used. I suppose you could write your own endian aware memcpy
>>> for numeric values but why bother plus its unlikely to be very efficient.
>>>
>>
>> The point is that you have to have code for accessing the fields, you
>> can't just use them directly. And when you have an accessor function
>
> Wtf are you talking about? You just access them as structure fields. There
> may be a small cost in dereferencing but there's a large gain in code
> readability and correctness.
>

You need to put your ntoh* functions somewhere!

>> more portable. People have been writing code to access network-defined
>> or file format defined structures since C has been in existence, and
>> #pragma pack is neither necessary nor sufficient for the task.
>
> Whether it's pragma pack or attribute packed, it's used a lot in Linux.
>
> $ pwd
> /usr/include/linux
> $ grep __attribute__ *.h | grep packed | wc -l
> 239
>
> But what do they know?
>

What do they know about writing highly portable and standard C? Not
everything, that's for sure. You can see that in many ways. For
starters, they don't /have/ to write code that is fully portable - they
can assume a number of basic features (32-bit int, 8-bit char,
little-endian or big-endian ordering, and lots of other common features
of any POSIX system). They don't have to write code that relies only on
standard C - they use gcc extensions freely. They (Torvalds in
particular) regularly get into arguments with the gcc development team
when new gcc versions come out and Torvalds says it "breaks" Linux, then
the gcc team point out that the C code was incorrect. Sometimes the
agreed solution is to fix the Linux code, sometimes it is to add flags
to gcc for finer control.

(None of this is criticism, by the way - using these assumptions lets
them write simpler or more efficient code. Most people, including me,
write non-portable code all the time.)

Oh, and I have several times said that "packed" can be /convenient/.
But it is never /necessary/. There is a difference.

And of course a sample of where someone else uses a particular feature
does not show the code is correct, and certainly does not show that the
feature is necessary. Using the Linux kernel as sample code is
particularly inappropriate, as it is a very unique piece of work with
very unique requirements and history.


David Brown

Feb 16, 2021, 5:48:07 AM
On 16/02/2021 10:47, mick...@potatofield.co.uk wrote:
> On Mon, 15 Feb 2021 15:54:36 -0500
> James Kuyper <james...@alumni.caltech.edu> wrote:
>> On 2/15/21 12:16 PM, mick...@potatofield.co.uk wrote:
>>> Sorry? It's standard C. Perhaps it's frowned on in C++ but I've been doing
>>> network programming for a couple of decades and this method is used all over
>>
>>> the place.
>>
>> The C++ rules are stricter than the C rules, but it's also a problem in
>> C. Type punning is standard C, but there are restrictions on when it can
>> safely be used. Those restrictions are defined in terms of the
>> "effective type" of a piece of memory. For objects with a declared type,
>> the effective type is the same as the declared type. For memory with no
>> declared type (which basically means dynamically allocated memory), the
>> effective type is set by the last store into that memory using a
>> non-character type T. If you used an lvalue of type T to store the
>> value, then the memory has an effective type of T. If you use methods
>> such as memcpy() or memmove(), to copy an entire object over into such
>> memory, or if you copied it over as an array of character type, that
>> memory acquires the same effective type as the object it was copied from.
>
> Memory is memory, it doesn't have a type. How the compiler sees it is another
> matter of course but unless a C/C++ compiler wants to break a huge amount of
> code it's going to have to treat memory in this instance as void.
>

I'm sorry, but you are wrong - C and C++ view memory in terms of
objects, which have specific types, and compilers will at times take
advantage of that. It is relatively rare that this makes a difference
to the code, but it happens sometimes. And yes, this results in
mistakes in people's C and C++ code giving buggy results. But it is not
that the compiler "breaks" their code - their code was broken when they
wrote it.

>> The relevant rule violated by many kinds of type punning is the
>> anti-aliasing rule:
>
> Fine, but like it or not type punning has been de facto standard C for a very
> long time and any C compiler (and C++ in many cases) breaks it at its peril.
>

Type punning is possible in a variety of ways. But the standards do
/not/ allow it just by doing pointer casts. Accessing memory by an
incompatible type breaks strong typing.

You are not the first person to misunderstand this - it is unfortunately
common amongst C and C++ programmers. (You can well argue that this is
a mistake in the way the languages are defined, and I think you'd find
support for that - but that's they way they are. Most programming
languages have similar rules - they just don't make it as easy to write
code that breaks the rules as C does.)

>> Unions provide a way to avoid this problem (see 6.5.2.3p3, and pay
>> attention to footnote 95), but that way only works if the object in
>> question is actually of the union type, and only if the declaration of
>> that union is in scope at the point where the problem could otherwise occur.
>
> Unions are another matter entirely mainly because endianess issues tend to
> occur with them regardless of memory alignment.
>

Endianness is inherent in pre-defined structures, and is orthogonal to
alignment questions and independent of unions.

>> It's not just Windows - compilers that take advantage of the
>> anti-aliasing rules to optimize code generation are quite common.
>
> IME most compilers when pushed to do heavy optimisation start making subtle
> mistakes here and there.

That is not my experience, with quality compilers (though bugs do occur
in compilers). But it /is/ my experience that heavy optimisation can
reveal bugs in the C or C++ source.

Optimisations from type-based alias analysis are not mistakes in the
compiler.

> Any heavily optimised code should always be tested
> much more thoroughly than non-optimised code before it's released.
>

That much is true.

My experience is that code that "works when optimisation is disabled but
fails when optimised" is almost invariably bugs in the code, not in the
compiler.

David Brown

Feb 16, 2021, 6:08:45 AM
On 15/02/2021 23:00, Paavo Helde wrote:
> On 15.02.2021 19:16, mick...@potatofield.co.uk wrote:
>> On Mon, 15 Feb 2021 17:55:58 +0100
>> David Brown <david...@hesbynett.no> wrote:
>
>>> exceptions, it is behaviour undefined by the standard.  This means you
>>> can easily get something that works fine in your simple tests, but fails
>>> in more complex situations when code is inlined, link-time optimised, or
>>
>> Rubbish. Maybe in Windows but that doesn't concern me.
>
> FYI, the biggest "culprit" in this area has been gcc in recent years. It
> is keen to optimize away things which are formally UB, like infinite
> loops.

Certainly it would often be nice to get more warnings about this kind of
thing - but getting good warnings with few false positives is not easy.
gcc has been getting steadily better at its warnings over the years.

I can understand why people /want/ their compiler to read their minds
and guess what they meant to write, even though the actual code is in
error. I have a harder time understanding when they /expect/ it to do so.

It is particularly difficult for me to understand in this particular
case of type punning and type-based alias analysis. It's fair enough
that this is an advanced topic and lots of programmers don't really know
about it. But when you explain to people that the C and C++ standards
have rules about how objects can be accessed, and the compiler assumes
you follow those rules, some people get completely irrational - I have
seen people call compiler writers "evil" and "obsessed with benchmarks
at the expense of users".

C and C++ are defined the way they are defined. A C or C++ compiler
implements that language. As a programmer, you are expected to write
code that follows the rules of the language. The standard (plus
additional rules defined by the compiler) form an agreement between the
programmer and the compiler. If the programmer does not hold up his/her
side of the deal by writing correct code, the compiler can't be expected
to produce correct output from incorrect input.

Having said all that, it is of course important that a compiler does its
best to help the developer find and fix his/her errors, such as by
giving warning messages. It is not in anybody's interest for the
compiler to cover up the mistakes by pretending incorrect code means
something different.


Some people don't like certain aspects of the C and C++ standards - they
want a language with additional semantics defined. In particular, some
people want to be able to access data using any pointer types, and don't
want to use supported methods (memcpy, char access, placement new,
unions, volatile, compiler extensions). gcc helpfully gives you the
option "-fno-strict-aliasing" which does precisely that. So if you want
to program in a language that is mostly like C or C++ but has this
additional feature, that's the way to do it (for gcc and clang, anyway).

(The other common case like this is that many people believe that
because their signed integers are stored as two's complement, overflow
behaviour is defined as wrapping. This is, of course, nonsense. But to
help people who want this, gcc has a "-fwrapv" flag.)
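For code that must not depend on -fwrapv, the usual approach is to test for overflow before it can happen; a minimal sketch (the helper name is invented):

```c
#include <limits.h>
#include <stdbool.h>

/* Signed overflow is undefined behaviour in standard C regardless of
   the hardware's two's complement representation, so check before
   adding instead of relying on wrapping. */
bool add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;
    return a < INT_MIN - b;
}
```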

> For some pointer conversions it helpfully warns you that it is
> planning to break your code ("dereferencing type-punned pointer will
> break strict-aliasing rules"). For some other kind of UB one might not
> get so lucky.
>
> MSVC, on the other hand, is generally much more careful to keep alive
> tons of crap code produced by hordes of cowboy programmers during last
> decades, only because such code accidentally happened to work at some
> time in the past.

MSVC has experimented with this kind of optimisation in their compiler.
But their problem is that the biggest source of crap code from cowboy
programmers is MS - the standard windows.h header relies on it.

(Contrast this with wrapping overflow for signed integers. MSVC
generally gives you wrapping behaviour, simply because it doesn't do as
good a job at optimising this kind of thing as gcc. Many people believe
that MSVC guarantees wrapping behaviour, and rely on it - but it does
not, and sometimes code that assumes wrapping will fail on MSVC. There
is, AFAIK, no "-fwrapv" flag for MSVC.)

>
> And yes, this is C++, not C, the rules are different.
>

The details are different, but many have the same effect here.

mick...@potatofield.co.uk

Feb 16, 2021, 9:25:22 AM
On Tue, 16 Feb 2021 11:20:20 +0100
David Brown <david...@hesbynett.no> wrote:
>I get the impression here that there is a bit of mismatch between what I
>have been trying to say, and what you think I have been saying. I am
>not sure how - whether I was unclear or you misunderstood. But to go
>back to the beginning, you claimed that "packed" structs were required
>to handle pre-defined structures such as for network packets, and I
>pointed out that this is not correct - you can, for example, use memcpy
>to access the data in a portable and efficient manner. Do you agree on
>that point?

Ok, when I said essential I meant for efficient coding. Obviously you can
always use other methods and for [reasons] you prefer memcpy. It seems to boil
down to personal choice and there's little point arguing over that.


Chris Vine

Feb 16, 2021, 4:36:01 PM
On Tue, 16 Feb 2021 11:47:52 +0100
David Brown <david...@hesbynett.no> wrote:
[snip]
> You are not the first person to misunderstand this - it is unfortunately
> common amongst C and C++ programmers. (You can well argue that this is
> a mistake in the way the languages are defined, and I think you'd find
> support for that - but that's the way they are. Most programming
> languages have similar rules - they just don't make it as easy to write
> code that breaks the rules as C does.)

I don't think there can be many competent C programmers who have not at
least heard of the strict aliasing rule by now, given that it has
existed since the first C89 standard was promulgated. Possibly there
is also the reverse problem - some C programmers don't properly
understand that it is fine to cast from a struct to its first member,
or back the other way again, and dereference the cast at will. This is
commonplace for example in network programming, and is basically how
POSIX's networking API is built up: POSIX does not rely on undefined
behaviour as far as that is concerned.
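A sketch of that first-member pattern (the struct names are invented; the real-world analogue is the sockaddr/sockaddr_in family):

```c
/* C11 6.7.2.1p15: a pointer to a structure, suitably converted, points
   to its first member, and vice versa, so this round trip is well
   defined - no strict aliasing violation is involved. */
struct base { int tag; };

struct message {
    struct base hdr;   /* first member */
    int payload;
};

int get_tag(const struct base *b)
{
    return b->tag;
}
```

Generic code can take a struct base * and dispatch on tag, while callers pass pointers to the larger struct, exactly as the POSIX sockets API does with sockaddr.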

But although wilful ignorance is no excuse, I do wonder about whether
there has been a proper analysis of the speed-up gains of strict
aliasing, given that it does appear to be a problem for some second
rate programmers. Furthermore I think the way that C++ has doubled down
on this by requiring the use of std::launder for any case where a
pointer cast is not "pointer interconvertible" is a mistake. Too many
obscure technical rules launched at programmers because a compiler
vendor has asserted that it might make 1% of code 0.5% faster seems to
me to be the wrong balance.

David Brown

Feb 16, 2021, 5:25:56 PM
That is a valid argument. However, optimisations and efficient code is
made from the sum of many small optimisations (either ones that are
often applicable but only make a small difference, or ones that make a
larger difference but are only rarely applicable). When you start
saying "we'll make this change in the language because people get it
wrong", where do you stop? Should you also make signed overflow
defined, because some people think it is? Should you add checks for
pointers being non-zero before dereferencing them, because some people
get it wrong and many of the checks can be optimised away?

Casting pointers is /dangerous/. It is lying to the compiler - it is
saying that an object has one type, but you want to pretend it is a
different type. Many other programming languages don't allow anything
equivalent to such conversions. However, it can be useful on occasion
in low-level code, which can usually be left to a few programmers who
understand the issues. The same applies in C++ - std::launder is likely
to find use in implementing memory pools and specialist allocators, not
in normal application code. It is also part of the move towards
defining a pointer provenance model for C and C++, to improve alias
tracking (for knowing when apparently different pointers may alias, and
for being sure that pointers of similar types do not alias).

Chris Vine

Feb 16, 2021, 6:24:26 PM
On Tue, 16 Feb 2021 23:25:43 +0100
Yes, but by the same argument should you make the language incrementally
more difficult to use with every new memory feature: honestly, what
percentage of C++ programmers have even heard of std::launder? But if
you want to construct an array in malloc'ed or 'new char'ed memory and
access it again from your original pointer you better have had. I have
to confess I have started to steer away from C++ for new projects, not
because I fail to understand it, but perhaps because I understand it
too well. I think it is getting its trade-offs wrong. (I also think
that the standard is not curated adequately as witness the contents of
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p0593r6.html
which corrected C++17 for C++20 by some last-minute messing about with
ISO procedures: between 2017 and 2020 best practices used by the best
programmers for years were technically defective - even
std::uninitialized_copy had, by its own rules, undefined behaviour.)

I think times have moved on.

> Casting pointers is /dangerous/. It is lying to the compiler - it is
> saying that an object has one type, but you want to pretend it is a
> different type. Many other programming languages don't allow anything
> equivalent to such conversions. However, it can be useful on occasion
> in low-level code, which can usually be left to a few programmers who
> understand the issues. The same applies in C++ - std::launder is likely
> to find use in implementing memory pools and specialist allocators, not
> in normal application code. It is also part of the move towards
> defining a pointer provenance model for C and C++, to improve alias
> tracking (for knowing when apparently different pointers may alias, and
> for being sure that pointers of similar types do not alias).

Casting is not necessarily lying to the compiler. It is only lying to
the compiler if an object of the relevant type does not in fact reside
at the memory in question. (Even if it does, that doesn't mean you can
get away without using std::launder. And that de-optimizer is not
required just for writing allocators.)

James Kuyper

unread,
Feb 16, 2021, 10:54:41 PM2/16/21
to
On 2/16/21 4:47 AM, mick...@potatofield.co.uk wrote:
> On Mon, 15 Feb 2021 15:54:36 -0500
> James Kuyper <james...@alumni.caltech.edu> wrote:
...
>> The C++ rules are stricter than the C rules, but it's also a problem in
>> C. Type punning is standard C, but there are restrictions on when it can
>> safely be used. Those restrictions are defined in terms of the
>> "effective type" of a piece of memory. For objects with a declared type,
>> the effective type is the same as the declared type. For memory with no
>> declared type (which basically means dynamically allocated memory), the
>> effective type is set by the last store into that memory using a
>> non-character type T. If you used an lvalue of type T to store the
>> value, then the memory has an effective type of T. If you use methods
>> such as memcpy() or memmove(), to copy an entire object over into such
>> memory, or if you copied it over as an array of character type, that
>> memory acquires the same effective type as the object it was copied from.
>
> Memory is memory, it doesn't have a type.

The C standard says otherwise, and so does the C++ standard. That type
is kept track of by the compiler, not by the hardware, but it does have
a type. That type is very important - how and when the effective type of
a piece of memory changes determines which optimizations a compiler can
perform. If a compiler performs an optimization that is permitted by
those rules, and your code breaks those rules, there's a significant
chance (but no guarantees) that you won't be happy with the resulting
behavior.

> ... How the compiler sees it is another
> matter of course but unless a C/C++ compiler wants to break a huge amount of
> code its going to have to treat memory in this instance as void.

That would make things very easy - dereferencing a pointer to void has
undefined behavior. If they did that, they wouldn't have to worry about
type-punning at all; for practical purposes, that means that they
wouldn't support type punning. I suspect you meant something else by
that statement.

>> The relevant rule violated by many kinds of type punning is the
>> anti-aliasing rule:
>
> Fine, but like it or not type punning has been de facto standard C for a very
> long time and any C compiler (and C++ in many cases) breaks it at its peril.

Most C compilers do in fact enable optimizations that are permitted only
because of the anti-aliasing rules, and can therefore cause your program
to fail if you write code that violates those rules. It's not guaranteed
to, but it can. They perform those optimizations because developers like
to see their code execute faster. That those optimizations make it
somewhat trickier to write correct code is a price that many developers
are willing to pay. If you want to ignore such issues, don't write your
code in C (or C++). Use some other language that protects you from
having to worry about them.

>> Unions provide a way to avoid this problem (see 6.5.2.3p3, and pay
>> attention to footnote 95), but that way only works if the object is
>> question is actually of the union type, and only if the declaration of
>> that union is in scope at the point where the problem could otherwise occur.
>
> Unions are another matter entirely mainly because endianess issues tend to
> occur with them regardless of memory alignment.

Unions are very much the same matter - almost every use of a union to
perform type punning would have undefined behavior according to the
anti-aliasing rules, if it weren't for the fact that the C standard says
otherwise. If a read of one member of a union is sequenced after a write
to a different member, with no other writes that are either sequenced
between them or unsequenced relative to them, then the write will be
completed before the read, and the read will result in the bit pattern
created by that write being interpreted as if it were an object of the
type used to read it (this can be problematic if that bit pattern
doesn't represent a valid value of that type, but is otherwise
well-defined behavior). Those guarantees do not normally apply when the
type written and the type read do not satisfy the anti-aliasing rules.
Note: these are C rules, the C++ rules are stricter on this issue.

>> It's not just Windows - compilers that take advantage of the
>> anti-aliasing rules to optimize code generation are quite common.
>
> IME most compilers when pushed to do heavy optimisation start making subtle
> mistakes here and there. Any heavily optimised code should always be tested
> much more thoroughly than non optimised before its released.

I suspect that if you were not already aware of this issue, then it's
possible that the "mistakes" you've seen might have been perfectly
legitimate optimizations, permitted as a result of violations of this
rule, or possibly of other similar rules that you're also unaware of.

I'm not saying that compilers are always right. I'm just saying that if
you aren't aware of how to write your code to avoid violating the
anti-aliasing rules, there's probably many other rules that you also
don't know how to avoid violating.

James Kuyper

unread,
Feb 16, 2021, 11:07:03 PM2/16/21
to
On 2/16/21 5:47 AM, David Brown wrote:
...
> Type punning is possible in a variety of ways. But the standards do
> /not/ allow it just by doing pointer casts. Accessing memory by an
> incompatible type breaks strong typing.

We are talking about the C standard, for which "compatible type" has a
very specific meaning (see section 6.2.7 of the C standard). "a type
compatible with the effective type of the object" is only the first of
the six different cases where type punning is permitted - the other five
cases all involve types that are incompatible with the effective type.

C++ uses the concepts of "layout-compatible types" and
"reference-compatible types", but the only types it refers to as simply
"compatible" are pointer types (20.3.11p6).

Keith Thompson

unread,
Feb 16, 2021, 11:20:55 PM2/16/21
to
James Kuyper <james...@alumni.caltech.edu> writes:
> On 2/16/21 4:47 AM, mick...@potatofield.co.uk wrote:
[...]
>> ... How the compiler sees it is another
>> matter of course but unless a C/C++ compiler wants to break a huge amount of
>> code its going to have to treat memory in this instance as void.
>
> That would make things very easy - dereferencing a pointer to void has
> undefined behavior. If they did that, they wouldn't have to worry about
> type-punning at all; for practical purposes, that means that they
> wouldn't support type punning. I suspect you meant something else by
> that statement.

Dereferencing a pointer to void is a constraint violation, and I'd be
surprised to see a compiler that didn't treat it as a fatal error.
There's no undefined behavior if there's no behavior. Though I suppose
the (run-time) behavior would be undefined if a compiler permitted it
for some reason.

--
Keith Thompson (The_Other_Keith) Keith.S.T...@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */

James Kuyper

unread,
Feb 16, 2021, 11:24:31 PM2/16/21
to
On 2/16/21 11:20 PM, Keith Thompson wrote:
> James Kuyper <james...@alumni.caltech.edu> writes:
>> On 2/16/21 4:47 AM, mick...@potatofield.co.uk wrote:
> [...]
>>> ... How the compiler sees it is another
>>> matter of course but unless a C/C++ compiler wants to break a huge amount of
>>> code its going to have to treat memory in this instance as void.
>>
>> That would make things very easy - dereferencing a pointer to void has
>> undefined behavior. If they did that, they wouldn't have to worry about
>> type-punning at all; for practical purposes, that means that they
>> wouldn't support type punning. I suspect you meant something else by
>> that statement.
>
> Dereferencing a pointer to void is a constraint violation,

You're right. My mistake.

Öö Tiib

unread,
Feb 17, 2021, 2:45:30 AM2/17/21
to
Programmers are expected to be users of a tool. A tool has to be
specified with usability in mind. A tool that becomes harder and harder
to use loses its users. The C++ committee understood this two decades
ago. As a small example, they realized after C++98 that programmers use
container.size() == 0 more frequently than container.empty() for
detecting whether a container is empty, and so required size() of all
container types to be O(1), despite making some containers slightly
less efficient.

By C++17 that attitude had died off, and I am unsure if it will ever
come back. The checks are often done at the level of the hardware
architecture to support other, more checked languages, and so are
technically free unless they fail. C++ does not promise any of those
common free checks and instead allows optimizing clumsily written
explicit checks away ... so those are symptoms of a sabotaged tool.

All those std::launders, std::byte full of UBs that unsigned char did not
have, broken constexpr. Pure trash.

mick...@potatofield.co.uk

unread,
Feb 17, 2021, 4:03:39 AM2/17/21
to
On Tue, 16 Feb 2021 22:54:27 -0500
James Kuyper <james...@alumni.caltech.edu> wrote:
>On 2/16/21 4:47 AM, mick...@potatofield.co.uk wrote:
>> On Mon, 15 Feb 2021 15:54:36 -0500
>> James Kuyper <james...@alumni.caltech.edu> wrote:
>> Memory is memory, it doesn't have a type.
>
>The C standard says otherwise, and so does the C++ standard. That type
>is kept track of by the compiler, not by the hardware, but it does have

As I said below.

>That would make things very easy - dereferencing a pointer to void has

Good luck with that.

>> Unions are another matter entirely mainly because endianess issues tend to
>> occur with them regardless of memory alignment.
>
>Unions are very much the same matter - almost every use of a union to

Yes I know, do try and understand idiom.

>I suspect that if you were not already aware of this issue, then it's
>possible that the "mistakes" you've seen might have been perfectly
>legitimate optimizations, permitted as a result of violations of this
>rule, or possibly of other similar rules that you're also unaware of.
>
>I'm not saying that compilers are always right. I'm just saying that if
>you aren't aware of how to write your code to avoid violating the
>anti-aliasing rules, there's probably many other rules that you also
>don't know how to avoid violating.

I'm getting tired of this patronising crap from pedants. I've NEVER seen
type punning fail so long as pragmas were used correctly regardless of
optimisation, but if any of you can provide a simple example where it
does then feel free to post it instead of a lot of hand-waving vagaries
and appeals to "the standard".

David Brown

unread,
Feb 17, 2021, 5:57:50 AM2/17/21
to
A language needs to figure out where it stands on this kind of thing at
an early stage, and when stabilising. Then it should stick to the
decisions it makes unless there is a very good reason to do otherwise.
Once you have a base of existing code in use, it's hard to make changes
at this level. If you remove semantics to allow more optimisation, you
break code that used to be correct according to the language
specification when it was written. If you add semantics and reduce
optimisation, code that was fine before now runs less efficiently.
Neither is good.

(An example of where changes have been justified is the C++17
additional rules for the sequencing of sub-expressions. The new rules
are needed to make some clear and common stream expressions valid, while
being very unlikely to lead to any extra run-time cost for existing code.)

> honestly, what
> percentage of C++ programmers have even heard of std::launder? But if
> you want to construct an array in malloc'ed or 'new char'ed memory and
> access it again from your original pointer you better have had.

That would count as making your own allocator. You are defining your
block of memory as one type, and then accessing it as a different type -
it is not unreasonable that you need extra code to tell the compiler
about it. (I'd prefer it if the language mechanics made it a
compile-time error to fail to get this right.)

Other uses for std::launder are if you are using placement new to change
an object that is defined with "const". When you define something as
"const", you are telling the compiler that it cannot change value
(merely declaring it as "const" says that you won't change its value via
that declaration). The compiler can optimise on the assumption that the
const object won't change value. And then you use placement new and
change it. So you need std::launder to inform the compiler of the change.

These are not things that come up often, in normal code. C++ is full of
obscure and difficult features that are only needed and known by a
relatively small number of people. This is, I think, a necessary evil
for big languages. (Look at Python - how many people can tell you how
"__slots__" should be used to make code more efficient? How many can
explain metaclasses?)

> I have
> to confess I have started to steer away from C++ for new projects, not
> because I fail to understand it, but perhaps because I understand it
> too well. I think it is getting its trade-offs wrong.

That's possible. It's always difficult to distinguish "wrong" from "not
what I want", but if enough existing C++ users think it is "not what I
want", then it becomes "wrong".

> (I also think
> that the standard is not curated adequately as witness the contents of
> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p0593r6.html
> which corrected C++17 for C++20 by some last-minute messing about with
> ISO procedures: between 2017 and 2020 best practices used by the best
> programmers for years were technically defective - even
> std::uninitialized_copy had, but its own rules, undefined behaviour.)
>
> I think times have moved on.
>
>> Casting pointers is /dangerous/. It is lying to the compiler - it is
>> saying that an object has one type, but you want to pretend it is a
>> different type. Many other programming languages don't allow anything
>> equivalent to such conversions. However, it can be useful on occasion
>> in low-level code, which can usually be left to a few programmers who
>> understand the issues. The same applies in C++ - std::launder is likely
>> to find use in implementing memory pools and specialist allocators, not
>> in normal application code. It is also part of the move towards
>> defining a pointer provenance model for C and C++, to improve alias
>> tracking (for knowing when apparently different pointers may alias, and
>> for being sure that pointers of similar types do not alias).
>
> Casting is not necessarily lying to the computer. It is only lying to
> the computer if an object of the relevant type does not in fact reside
> at the memory in question. (Even if it does, that doesn't mean you can
> get away without using std::launder. And that de-optimizer is not
> required just for writing allocators.)
>

Fair enough - casting itself is not lying to the compiler. But many
uses of the result of the cast are lies.

Chris Vine

unread,
Feb 17, 2021, 8:50:56 AM2/17/21
to
On Wed, 17 Feb 2021 11:57:36 +0100
David Brown <david...@hesbynett.no> wrote:
> On 17/02/2021 00:24, Chris Vine wrote:
[snip]
> > honestly, what
> > percentage of C++ programmers have even heard of std::launder? But if
> > you want to construct an array in malloc'ed or 'new char'ed memory and
> > access it again from your original pointer you better have had.
>
> That would count as making your own allocator. You are defining your
> block of memory as one type, and then accessing it as a different type -
> it is not unreasonable that you need extra code to tell the compiler
> about it. (I'd prefer it if the language mechanics made it a
> compile-time error to fail to get this right.)

I disagree. Constructing arrays dynamically in uninitialized memory is
a very common requirement. Making your own buffer is not making your
own allocator and I think it is completely wrong to describe it as such.
Converting sound code written prior to 2017 which does this very common
thing into undefined behaviour without std::launder is in my opinion (I
accept not yours) unacceptable.

James Kuyper

unread,
Feb 17, 2021, 9:34:30 AM2/17/21
to
On 2/17/21 4:03 AM, mick...@potatofield.co.uk wrote:
> On Tue, 16 Feb 2021 22:54:27 -0500
> James Kuyper <james...@alumni.caltech.edu> wrote:
>> On 2/16/21 4:47 AM, mick...@potatofield.co.uk wrote:
>>> On Mon, 15 Feb 2021 15:54:36 -0500
>>> James Kuyper <james...@alumni.caltech.edu> wrote:

<Restore snipped context>
>>> ... How the compiler sees it is another
>>> matter of course but unless a C/C++ compiler wants to break a huge amount of
>>> code its going to have to treat memory in this instance as void.
</restore snipped context>
...>> That would make things very easy - dereferencing a pointer to void has
>
> Good luck with that.

"treat memory in this instance as void" would not avoid breaking a huge
amount of code - it would break all code that qualifies as "in this
instance", for precisely the reason you are expressing by saying "good
luck with that". Whatever it is that you meant by that phrase needs to
be expressed differently.

What precisely is "this instance"? I'm having trouble interpreting your
comments in any fashion except as insisting that type-punning should
always be handled as reinterpreting the bit-pattern, the same way that
unions do. If you are instead saying only that there should be some
additions to the list of exceptions to the anti-aliasing rule, that
would be much more workable.

mick...@potatofield.co.uk

unread,
Feb 17, 2021, 9:43:05 AM2/17/21
to
On Wed, 17 Feb 2021 09:34:13 -0500
James Kuyper <james...@alumni.caltech.edu> wrote:
>On 2/17/21 4:03 AM, mick...@potatofield.co.uk wrote:
>What precisely is "this instance"? I'm having trouble interpreting your
>comments in any fashion except as insisting that type-punning should
>always be handled as reinterpreting the bit-pattern, the same way that

What have bit patterns got to do with it? All I'm saying is if I cast
from one pointer type to another the resulting pointer has the same
memory address and I'm trying to think in what circumstances it
wouldn't. I've asked for an example and no one has provided one.

Bo Persson

unread,
Feb 17, 2021, 9:53:24 AM2/17/21
to
You have the case of multiple base classes, where casting to any base
but the first will give you a different address.

And casting a char pointer to an int pointer, while preserving the
address, might give you an invalid (unaligned) pointer.

mick...@potatofield.co.uk

unread,
Feb 17, 2021, 10:25:47 AM2/17/21
to
On Wed, 17 Feb 2021 15:53:08 +0100
Bo Persson <b...@bo-persson.se> wrote:
>On 2021-02-17 at 15:42, mick...@potatofield.co.uk wrote:
>> On Wed, 17 Feb 2021 09:34:13 -0500
>> James Kuyper <james...@alumni.caltech.edu> wrote:
>>> On 2/17/21 4:03 AM, mick...@potatofield.co.uk wrote:
>>> What precisely is "this instance"? I'm having trouble interpreting your
>>> comments in any fashion except as insisting that type-punning should
>>> always be handled as reinterpreting the bit-pattern, the same way that
>>
>> What have bit patterns got to do with it? I'll I'm saying is if I cast
>> from one pointer type to another the resulting pointer has the same
>> memory address and I'm trying to think in what circumstances it
>> wouldn't. I've asked for an example and no one has provided one.
>>
>
>You have the case of multiple base classes, where casting to anyone but
>the first base will give you a different address.

Ok, I didn't know that, but then taking the raw address in that situation is
probably never something you're going to do in practice.

>And casting a char pointer to an int pointer, while preserving the
>address, might give you an invalid (unaligned) pointer.

AFAIK that doesn't matter on x86, can't remember ever seeing a bus error on
Linux. But obviously it does on other architectures.

David Brown

unread,
Feb 17, 2021, 10:45:30 AM2/17/21
to
As I explained, memcpy (used appropriately and with a decent compiler,
such as reasonably current gcc) has /zero/ run-time cost. The same
often (but not always) applies to accessing data via unsigned char
pointers and building up the composite data that way. Those are valid
techniques for accessing "weird" layouts - data with unusual alignments,
data with unusual sizes (like a 24-bit integer), data with non-native
endianness. Done well in wrapping functions, classes, templates or
macros, it does not even have to be difficult to use or ugly in the
source code. And done well, the results are optimal efficiency on good
tools, while being guaranteed correct on other tools regardless of
optimisation flags or other settings.

That of course does not mean that this is /always/ the best way to write
the code. It does not mean that it will give you optimal results with
all compilers. But in my book, /correctness/ trumps efficiency every time.

If you know you are using a single fixed compiler and fixed target, and
you know the effect of a particular extension, and you know the
additional semantic guarantees the tool gives you, then of course it can
be better to use these extensions. I do that myself in a lot of my code
- if I am dealing with endian issues, I will sometimes use gcc's
"scalar_storage_order" attribute rather than "hton*" or manual byte swaps.

But do you /need/ to use "pragma pack" to get efficient code from
structs with odd alignment? No, you most certainly do not. Do you need
to use pointer casts and use them in ways not defined in the C
standards? No, you most certainly do not - not with decent tools.

However, sometimes you /don't/ have a good compiler and you are faced
with a choice of writing technically correct code that is slow, or
technically incorrect code that you can see works in this particular
case. That's part of life as a programmer. My recommendation for that
kind of situation is pre-processor checks for the compiler you know and
have tested for the non-portable but efficient code, either with a
fall-back to a portable but potentially slow alternative, or simply
giving compilation failure when the code is used on a different tool.
That attitude improves code re-use and reduces the risk of accidents.



Scott Lurndal

unread,
Feb 17, 2021, 10:58:30 AM2/17/21
to
x86 will fault unaligned accesses if the AC (alignment check) flag is set in EFLAGS.

Linux generally doesn't set that flag.

Scott Lurndal

unread,
Feb 17, 2021, 11:00:28 AM2/17/21
to
David Brown <david...@hesbynett.no> writes:
>On 16/02/2021 15:25, mick...@potatofield.co.uk wrote:
>> On Tue, 16 Feb 2021 11:20:20 +0100
>> David Brown <david...@hesbynett.no> wrote:
>>> I get the impression here that there is a bit of mismatch between what I
>>> have been trying to say, and what you think I have been saying. I am
>>> not sure how - whether I was unclear or you misunderstood. But to go
>>> back to the beginning, you claimed that "packed" structs were required
>>> to handle pre-defined structures such as for network packets, and I
>>> pointed out that this is not correct - you can, for example, use memcpy
>>> to access the data in a portable and efficient manner. Do you agree on
>>> that point?
>>
>> Ok, when I said essential I meant for efficient coding. Obviously you can
>> always use other methods and for [reasons] you prefer memcpy. It seems to boil
>> down to personal choice and there's little point arguing over that.
>>
>
>As I explained, memcpy (used appropriately and with a decent compiler,
>such as reasonably current gcc) has /zero/ run-time cost.

Except for any cache line(s) evicted as a result of the memcpy, which
may indeed have an overall performance cost.


Manfred

unread,
Feb 17, 2021, 11:21:09 AM2/17/21
to
On 2/16/2021 10:35 PM, Chris Vine wrote:
> Furthermore I think the way that C++ has doubled down
> on this by requiring the use of std::launder for any case where a
> pointer cast is not "pointer interconvertible" is a mistake. Too many
> obscure technical rules launched at programmers because a compiler
> vendor has asserted that it might make 1% of code 0.5% faster seems to
> me to be the wrong balance.

Valid point.
std::launder is IMO a good example of bad design.

James Kuyper

unread,
Feb 17, 2021, 11:22:49 AM2/17/21
to
No one is suggesting that the pointer would point to a different memory
address (though that is in fact a potential problem in some cases - see
below). They're saying that you cannot safely use the pointer that
results from the conversion to access the memory it points at.
Implementations are allowed to perform optimizations that ignore the
possibility that two pointers to different types might point at the same
location in memory, depending upon what relationships those two types
have to each other.

With regards to the "same memory address": conversion of a pointer to
one type into a pointer to a different type which has alignment
requirements violated by the original pointer has undefined behavior,
which in particular allows for the possibility that the resulting
pointer does not point at the same memory location.
As an example of the reason why this rule exists, there have been
implementations targeting machines with large word sizes which have
pointers to word-aligned types that have fewer bits (and even, in some
cases, fewer bytes) than pointers to types with smaller alignment
requirements. Conversion of a pointer that doesn't point at the
beginning of a word to a pointer type that can only represent positions
at the beginning of a word CANNOT result in a pointer to the same location.

<pedantic>Even when pointer conversions have defined behavior, in
general that definition only says that if the resulting pointer value
gets converted back to the original pointer type, it will compare equal
to the original.
There are subtle differences between the C and C++ standards about these
issues, but neither standard specifies where the resulting pointer
points except in some special cases. Personally, I think they should
specify that it points at the same location, but they don't. The C
exceptions are easier to describe, so I'll list them here:
1. Converting a pointer to an array into a pointer to the element type
of the array results in a pointer pointing at the first element of the
array.
2. Converting a pointer to a struct type into a pointer to the type of
the first member of the struct results in a pointer to that member.
3. Converting an object pointer into a pointer to a character type
results in a pointer to the first byte of the object.
Each of these conversions is reversible.
</pedantic>

David Brown

unread,
Feb 17, 2021, 11:30:44 AM2/17/21
to
#include <inttypes.h> // for int32_t related stuff
#include <string.h>

#pragma pack(1)
typedef struct S {
    int8_t a;
    int32_t b;
    int64_t c;
} S;

int getsize(void) { return sizeof(S); }

int32_t getb1(const S* p) {
    return p->b;
}

int32_t getb2(const S* p) {
    int32_t x;
    memcpy(&x, &p->b, sizeof x);
    return x;
}

int32_t getb3(const S* p) {
    const uint8_t * q = (const uint8_t *) &p->b;
    int32_t x;
    uint8_t b[sizeof x];
    for (size_t i = 0; i < sizeof x; i++) {
        b[i] = *q++;
    }
    memcpy(&x, b, sizeof x);
    return x;
}

int32_t getb4(const S* p) {
    const uint8_t * q = (const uint8_t *) (&p->b + 1) - 1;
    int32_t x = 0;
    #pragma GCC unroll sizeof x
    for (size_t i = 0; i < sizeof x; i++) {
        x = (x << 8) | *q--;
    }
    return x;
}


gcc (via godbolt.org) on x86-64 with -O2 gives:

getb1:
        movl    1(%rdi), %eax
        ret
getb2:
        movl    1(%rdi), %eax
        ret
getb3:
        movl    1(%rdi), %eax
        ret
getb4:
        movl    1(%rdi), %eax
        ret


Exactly which cache lines are "evicted" by the use of memcpy here?


mick...@potatofield.co.uk

unread,
Feb 17, 2021, 12:12:19 PM2/17/21
to
On Wed, 17 Feb 2021 11:22:36 -0500
James Kuyper <james...@alumni.caltech.edu> wrote:
>requirements. Conversion of a pointer that doesn't point at the
>beginning of a word to a pointer type that can only represent positions
>at the beginning of a word CANNOT result in a pointer to the same location.

Really?

gondor$ cat t.c
#include <stdio.h>
#include <stdint.h>

int main()
{
    char s[5];
    uint32_t *i = (uint32_t *)s;
    *i = 123;
    printf("addr = %p, val = %u\n", i, *i);
    return 0;
}
gondor$ cc t.c
gondor$ a.out
addr = 0x7ffe017f3873, val = 123

That int looks nonaligned to me.


Chris Vine

unread,
Feb 17, 2021, 12:28:35 PM2/17/21
to
Wherever a reinterpret_cast wouldn't have alignment issues, memcpy
will be optimized out entirely - it is a compiler intrinsic/built-in.
(In fact, in C++17 it has to be an intrinsic because memcpy cannot be
implemented as a function using standard C++17 without undefined
behaviour, but that is a bug in the standard rather than a feature.)

Manfred

unread,
Feb 17, 2021, 12:35:47 PM2/17/21
to
On 2/17/2021 11:57 AM, David Brown wrote:
> On 17/02/2021 00:24, Chris Vine wrote:
>> On Tue, 16 Feb 2021 23:25:43 +0100
>> David Brown <david...@hesbynett.no> wrote:
>>> On 16/02/2021 22:35, Chris Vine wrote:
>>>>
>>>> But although wilful ignorance is no excuse, I do wonder about whether
>>>> there has been a proper analysis of the speed-up gains of strict
>>>> aliasing, given that is does appear to be a problem for some second
>>>> rate programmers. Furthermore I think the way that C++ has doubled down
>>>> on this by requiring the use of std::launder for any case where a
>>>> pointer cast is not "pointer interconvertible" is a mistake. Too many
>>>> obscure technical rules launched at programmers because a compiler
>>>> vendor has asserted that it might make 1% of code 0.5% faster seems to
>>>> me to be the wrong balance.
>>>
>>> That is a valid argument. However, optimisations and efficient code is
>>> made from the sum of many small optimisations (either ones that are
>>> often applicable but only make a small difference, or ones that make a
>>> larger difference but are only rarely applicable). When you start
>>> saying "we'll make this change in the language because people get it
>>> wrong", where do you stop?
That's not the point; it is in fact the other way around. It is the C++
committee who decided to make a change in the language, because they
decided that people were getting something wrong - despite that
something being a very basic language feature, brought up as an example
even by Bjarne himself in his book.

>>> Should you also make signed overflow
>>> defined, because some people think it is? Should you add checks for
>>> pointers being non-zero before dereferencing them, because some people
>>> get it wrong and many of the checks can be optimised away?

Again, this is the attitude taken by the C++ committee, and it is wrong.
It is they who changed the language to second the opinion of part of
the audience.

>>
>> Yes, but by the same argument should you make the language incrementally
>> more difficult to use with every new memory feature:
>
> A language needs to figure out where it stands on this kind of thing at
> an early stage, and when stabilising. Then it should stick to the
> decisions it makes unless there is a very good reason to do otherwise.
> Once you have a base of existing code in use, it's hard to make changes
> at this level. If you remove semantics to allow more optimisation, you
> break code that used to be correct according to the language
> specification when it was written. If you add semantics and reduce
> optimisation, code that was fine before now runs less efficiently.
> Neither is good.

This is why I believe the std::launder change is an example of bad design:
In the beginning there was type punning, unions, and the "effective
type" rules of C, i.e. the rules of pointer casts.
Then C++ addressed this by adding a number of cast operators designed
for the very purpose of making the semantics of these rules more explicit.

Baseline is that pointer conversion is strictly coupled with dynamic
memory allocation - and I think you agree that dynamic memory is a core
feature of C++. malloc /is/ useful in C++ too;
And C++ did "figure out where it stands on this kind of thing at an
early stage".
The fact is that the C++ committee decided, only 30+ years after this
early stage, to turn the whole thing upside down. Not smart, IMO.


>>
>>> Casting pointers is /dangerous/. It is lying to the compiler - it is
>>> saying that an object has one type, but you want to pretend it is a
>>> different type.
/Careless/ casting is dangerous. Bjarne knew about it and he addressed
the problem.
It's not lying unless you misuse it, and in C++ you need to exercise
some gymnastics to achieve that.
On the other hand, if you want to do anything with a piece of memory
returned by malloc you /need/ to cast the pointer you get (the same is
substantially true for new char[]).
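
To make that concrete, here is a minimal sketch of the malloc-and-cast
pattern (the struct and function names are mine, purely for
illustration). In C++ the cast is mandatory, where C would convert the
void* implicitly - and whether using the storage without placement new
was formally defined before C++20's implicit object creation rules is,
of course, part of what this thread is arguing about:

```cpp
#include <cstdlib>

struct Point { int x; int y; };

Point* make_point(int x, int y) {
    // The cast is required in C++ (C converts void* implicitly).
    Point* p = static_cast<Point*>(std::malloc(sizeof(Point)));
    if (!p) return nullptr;
    p->x = x;   // using the malloc'd storage as a Point
    p->y = y;
    return p;
}
```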

>>> Many other programming languages don't allow anything
>>> equivalent to such conversions.
And these programming languages are far less powerful than C++ (and C),
to the point that the most popular of them would just not work without C
or C++, e.g. Java and C#.

>>> However, it can be useful on occasion
>>> in low-level code, which can usually be left to a few programmers who
>>> understand the issues. The same applies in C++ - std::launder is likely
>>> to find use in implementing memory pools and specialist allocators, not
>>> in normal application code.

I'm not that convinced about this ivory tower argument.
I do get annoyed by broken code when written by some incompetent
keypusher when I see it, and I dislike when some major software vendor
advertises their new programming language being "easy to use" as its
primary selling point, but I don't think that making C++ more convoluted
or fragmented is a solution to that.

James Kuyper

unread,
Feb 17, 2021, 1:26:38 PM2/17/21
to
On 2/17/21 12:12 PM, mick...@potatofield.co.uk wrote:
> On Wed, 17 Feb 2021 11:22:36 -0500
> James Kuyper <james...@alumni.caltech.edu> wrote:
...
>> requirements. Conversion of a pointer that doesn't point at the
>> beginning of a word to a pointer type that can only represent positions
>> at the beginning of a word CANNOT result in a pointer to the same location.
>
> Really?
>
> gondor$ cat t.c
> #include <stdio.h>
> #include <stdint.h>
>
> int main()
> {
> char s[5];

> uint32_t *i = (uint32_t *)s;

That's not guaranteed to be misaligned. It would make a better example
to use

_Alignas(_Alignof(uint32_t)) char s[5];
uint32_t *i = (uint32_t*)(s+1);

That would guarantee misalignment on any implementation where
_Alignof(uint32_t) > 1.

> *i = 123;
> printf("addr = %p, val = %u\n",i,*i);

The values corresponding to a format specifier of %p are supposed to
have a type of void*, otherwise the behavior is undefined. On
implementations where all pointers have the same representation, which
probably includes every implementation you've ever used, that's
generally not a problem: the implementation defines the behavior that
the standard leaves undefined, in precisely the way you presumably
thought it was required to be defined.

However, on implementations where the problem I described can occur,
sizeof(void*) will be larger than sizeof(uint32_t*), and the %p
specifier will therefore generally make up the difference by
misinterpreting something else as containing the bytes it's looking for
that aren't part of i.

> return 0;
> }
> gondor$ cc t.c
> gondor$ a.out
> addr = 0x7ffe017f3873, val = 123
>
> That int looks nonaligned to me.

For the reasons given above, on an implementation which can have the
problem I described, the value printed out is meaningless. However, even
without that problem, the meaning of the string printed out with "%p" is
implementation defined. The way in which you can determine whether or
not the pointer is correctly aligned can differ from one platform to
another. That might seem a purely pedantic issue to worry about, but you
can avoid it completely by using printf("%p : %p\n", (void*)s, (void*)i).
If you're using an implementation where the problem I described can
occur, and if you made the other changes I suggested above, those two
pointers must differ.

Are you claiming that you just compiled this code using such an
implementation of C? If so, which implementation is it, and what is the
target platform? In particular, what are the values of
_Alignof(uint32_t), sizeof(uint32_t*), and sizeof(char*)?
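
Putting the two suggestions above together, a self-contained sketch (in
C++ spelling, with alignas instead of _Alignas, and assuming an ordinary
byte-addressed platform) might look like:

```cpp
#include <cstdio>
#include <cstdint>

// Build a deliberately misaligned address and print both pointers the
// portable way: %p requires its argument to be converted to void*.
// Returns the misalignment of the odd address relative to uint32_t.
std::uintptr_t misalignment_demo() {
    alignas(std::uint32_t) char s[8];
    char* odd = s + 1;   // one byte past a uint32_t-aligned boundary
    std::printf("%p : %p\n", static_cast<void*>(s),
                static_cast<void*>(odd));
    return reinterpret_cast<std::uintptr_t>(odd) % alignof(std::uint32_t);
}
```

On any implementation where alignof(std::uint32_t) > 1 the returned
value is nonzero, which is the guaranteed-misaligned case described
above.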

Manfred

unread,
Feb 17, 2021, 1:43:17 PM2/17/21
to
I think this still does not solve the issue at the language level.
The standard does not mandate the behaviour that is shown by your
example, so even if compilers do compensate for inefficiency of the
code, this does not make the language good.

Looking at your first and second example above there is no reason for
which one should prefer getb2 over getb1, in fact getb2, as written, is
less efficient than getb1 because it introduces some unneeded extra
storage and an extra function call - albeit at the level of the abstract
machine, with no benefit in readability or robustness against bugs.
If the language somehow requires to use getb2 instead of getb1, I see
this as an inefficiency in the language.
The fact that the compiler puts a remedy to this by applying some
operation that is hidden to the language specification does not make the
language itself any more efficient.

Keith Thompson

unread,
Feb 17, 2021, 2:28:11 PM2/17/21
to
mick...@potatofield.co.uk writes:
> On Wed, 17 Feb 2021 11:22:36 -0500
> James Kuyper <james...@alumni.caltech.edu> wrote:
>>requirements. Conversion of a pointer that doesn't point at the
>>beginning of a word to a pointer type that can only represent positions
>>at the beginning of a word CANNOT result in a pointer to the same location.
>
> Really?

Yes, really.

> gondor$ cat t.c
> #include <stdio.h>
> #include <stdint.h>
>
> int main()
> {
> char s[5];
> uint32_t *i = (uint32_t *)s;
> *i = 123;
> printf("addr = %p, val = %u\n",i,*i);
> return 0;
> }
> gondor$ cc t.c
> gondor$ a.out
> addr = 0x7ffe017f3873, val = 123
>
> That int looks nonaligned to me.

James was talking about "a pointer type that can only represent
positions at the beginning of a word", something that doesn't exist on
the implementation you're using.

Imagine an implementation on which machine-level addresses point to
32-bit words, and byte pointers (CHAR_BIT==8) are constructed in
software by adding bits describing the byte offset within the word. On
such an implementation, a uint32_t* pointer value cannot refer to
anything other than an entire 32-bit word. Your program would behave
differently on such an implementation.

I've worked on such implementations (except that the word size was 64
bits).

On the implementation you're using, a uint32_t* pointer value *can*
point to an odd address, and such a pointer can be dereferenced
successfully.
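
A toy model of the kind of implementation Keith describes (all names
invented for illustration): the hardware address selects a whole word,
and the byte offset lives in extra bits that a word pointer simply has
no room for.

```cpp
#include <cstdint>

// Software-synthesized "char*" on a word-addressed machine: a machine
// word address plus byte-offset bits carried separately.
struct BytePtr {
    std::uint32_t word_addr;  // machine-level pointer: selects a word
    std::uint8_t  offset;     // 0..3, maintained by compiler-generated code
};

// Converting to a word pointer has nowhere to put the offset bits, so a
// misaligned byte address cannot survive the round trip.
std::uint32_t to_word_ptr(const BytePtr& p) {
    return p.word_addr;
}

BytePtr from_word_ptr(std::uint32_t w) {
    return BytePtr{w, 0};   // any byte-offset information is gone
}
```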

David Brown

unread,
Feb 17, 2021, 3:19:57 PM2/17/21
to
std::launder is not a new feature as a change to the language - it is a
way to write code that is correct according to the way the C++ memory
model has worked since it was defined for C++03. The mistake is not
adding std::launder in C++17 - the mistake was not including it in
C++03, or not finding a memory model that did not need such a feature.

So it is /not/ the language that has changed - this is a feature that
has been needed (but only in rare situations) since C++03, that has
finally been added.


Why is std::launder needed? Let's take an example (assuming I have
understood the details correctly) :

#include <new>

struct X {
    int a = 1;
    virtual int foo() { return a + 2; }
};

void foof(X& x);

int foobar1() {
    X x;
    int f = x.foo();
    foof(x);
    int g = x.foo();
    return f + g;
}

int foobar2() {
    X x;
    int f = x.foo();
    foof(x);
    X& y = *std::launder(&x);
    int g = y.foo();
    return f + g;
}

When compiling foobar1(), the compiler knows that the function "foof"
cannot replace "x" with a new object and still access it through "x" -
that would be breaking the C++ memory model. That means it can be sure
that any "const" fields, including the vtable, are unchanged by "foof".
This lets it make very significantly more efficient code by
devirtualizing then inlining the code. It can compile foobar1() as
though it were:

int foobar1() {
    X x;
    foof(x);
    return x.a + 5;
}

If foof() used placement new to put a new object in x (such as a type
that inherits from X but is the same size, with a new implementation of
foo), then the programmer would want that new foo() to be called. The
best way to do this would be for "foof" to return the result of the
placement new - which is a pointer to the same memory, but known to
point to a different object. However, that's not always convenient. So
std::launder tells the compiler that the object may have changed. The
compiler thus cannot do the same kind of optimisation. gcc implements
it roughly as though the programmer had written:

int foobar2() {
    X x;
    foof(x);
    auto p = x.vtable[foo_slot];   // pseudocode: read the vtable entry
    if (p == &X::foo) {
        return x.a + 5;            // still X::foo - devirtualized, inlined
    } else {
        return (x.*p)() + 3;       // changed - full virtual dispatch
    }
}


Devirtualization, fully or partially, is a /big/ optimisation for C++
classes with virtual functions. It only works well because the C++
memory model (from C++03, not C++17) limits what you can do.
std::launder gives you a way to be more flexible in unusual cases.

>
>>>
>>> Yes, but by the same argument should you make the language incrementally
>>> more difficult to use with every new memory feature:
>>
>> A language needs to figure out where it stands on this kind of thing at
>> an early stage, and when stabilising.  Then it should stick to the
>> decisions it makes unless there is a very good reason to do otherwise.
>> Once you have a base of existing code in use, it's hard to make changes
>> at this level.  If you remove semantics to allow more optimisation, you
>> break code that used to be correct according to the language
>> specification when it was written.  If you add semantics and reduce
>> optimisation, code that was fine before now runs less efficiently.
>> Neither is good.
>
> This is why I believe the std::launder change is an example of bad design:
> In the beginning there was type punning, unions, and the "effective
> type" rules of C, i.e. the rules of pointer casts.
> Then C++ addressed this by adding a number of cast operators designed
> for the very purpose of making the semantics of these rules more explicit.

No, it did not. The cast operators make casts clearer and make it more
obvious what that cast does or does not do. But they don't make the
type aliasing rules clearer or more explicit, and they don't change them
significantly compared to C. In particular, if you have a "float*" that
points to a float, there is no cast operator that will turn that into an
"int*" that can be used (in a fully defined manner) to read the float
object's memory as though it were an int.
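
The defined alternative is to copy the object representation rather
than alias it - memcpy here, or std::bit_cast in C++20. A minimal
sketch (it assumes float is 32 bits, which the static_assert checks):

```cpp
#include <cstring>
#include <cstdint>

// Read a float's object representation as a 32-bit integer without any
// aliasing violation: memcpy is defined for this, the pointer cast is not.
std::uint32_t bits_of(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "assumes a 32-bit float");
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}
```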

>
> Baseline is that pointer conversion is strictly coupled with dynamic
> memory allocation - and I think you agree that dynamic memory is a core
> feature of C++. malloc /is/ useful in C++ too;

malloc can be used in C++ in the same way as in C. And as in C, the
memory returned by malloc has (AFAIUI) no type until you use it. So there
is no problem.

> And C++ did "figure out where it stands on this kind of thing at an
> early stage".
> The fact is that the C++ committee decided, only 30+ years after this
> early stage, to turn the whole thing upside down. Not smart, IMO.
>

They have clarified it, and added flexibility that was missing. The
language hasn't changed here since C++03.

What /has/ changed, is that compilers have gained optimisations that
take advantage of decisions made 18 years ago (or more), and which
perhaps people have misunderstood in the meantime.

The question of whether compilers should continue to do what some people
thought they were supposed to do, or whether they optimise more based on
what the /standards/ say they should do, is a difficult one. But that
is the question to be asking here - not whether the committee should
have added std::launder or not.

(You can, of course, ask whether std::launder was the best way to
implement the additional flexibility.)

>
>>>
>>>> Casting pointers is /dangerous/.  It is lying to the compiler - it is
>>>> saying that an object has one type, but you want to pretend it is a
>>>> different type.
> /Careless/ casting is dangerous. Bjarne knew about it and he addressed
> the problem.
> It's not lying unless you misuse it, and in C++ you need to exercise
> some gymnastics to achieve that.
> On the other hand, if you want to do anything with a piece of memory
> returned by malloc you /need/ to cast the pointer you get (the same is
> substantially true for new char[]).
>

You can cast the return value from malloc() and use it - that is an
entirely reasonable (indeed, essential) use of casting pointer types.
You can't do the same with memory returned by "new char[]".

You can't do it in C either. I'd like a std::launder equivalent in C to
be able to handle this.

(When I say "you can't do this", I mean the standards don't define a
particular behaviour for it. Particular compilers might, perhaps using
extensions, or they might simply give you the code you expect even
though it is not guaranteed by design of the language or specification
of the compiler.)

>>>> Many other programming languages don't allow anything
>>>> equivalent to such conversions.
> And these programming languages are far less powerful than C++ (and C),
> to the point that the most popular of them would just not work without C
> or C++, e.g Java and C#.
>
>>>> However, it can be useful on occasion
>>>> in low-level code, which can usually be left to a few programmers who
>>>> understand the issues.  The same applies in C++ - std::launder is
>>>> likely
>>>> to find use in implementing memory pools and specialist allocators, not
>>>> in normal application code.
>
> I'm not that convinced about this ivory tower argument.
> I do get annoyed by broken code when written by some incompetent
> keypusher when I see it, and I dislike when some major software vendor
> advertises their new programming language being "easy to use" as its
> primary selling point, but I don't think that making C++ more convoluted
> or fragmented is a solution to that.

I'm not going to argue that the C++ memory model here is the best
choice, or that the ideal balance has been found between a compiler's
opportunities for optimisation and the programmer's expectation that
code works the way it looks like it works. I'm always happier if
mistakes in the code lead to compile-time failures, or at least compiler
warnings - having to remember subtle things like std::launder in certain
types of low-level code is not great.

But complaints and blame should be appropriate. std::launder was not
added so that previously safe code would now be unsafe without it - it
was added so that previously unsafe code could now be written safely.

David Brown

unread,
Feb 17, 2021, 3:36:12 PM2/17/21
to
On 17/02/2021 19:43, Manfred wrote:
> On 2/17/2021 5:30 PM, David Brown wrote:
>> On 17/02/2021 17:00, Scott Lurndal wrote:
>>> David Brown <david...@hesbynett.no> writes:
>>>>
>>>> As I explained, memcpy (used appropriately and with a decent compiler,
>>>> such as reasonably current gcc) has /zero/ run-time cost.
>>>
>>> Except for any cache line(s) evicted as a result of the memcpy, which
>>> may indeed have an overall performance cost.
>>>
>>

>
> I think this still does not solve the issue at the language level.

What issue? The "cache lines" Scott referred to don't exist at the
"language level".

> The standard does not mandate the behaviour that is shown by your
> example, so even if compilers do compensate for inefficiency of the
> code, this does not make the language good.

That is correct. The standard only mandates that they will work with
the same effect (if the non-standard pragma were removed).

>
> Looking at your first and second example above there is no reason for
> which one should prefer getb2 over getb1, in fact getb2, as written, is
> less efficient than getb1 because it introduces some unneeded extra
> storage and an extra function call - albeit at the level of the abstract
> machine, with no benefit in readability or robustness against bugs.

In these examples, getb1() is the simplest and clearest. Consider
instead that we had:

int32_t getb1(const void* p) {
    return *(const int32_t *)p;
}

int32_t getb2(const void* p) {
    int32_t x;
    memcpy(&x, p, sizeof x);
    return x;
}

(and so on for the others).

The results from gcc are the same - a single "mov" instruction. Here
getb2() /does/ have an advantage over getb1() in that it is valid and
fully defined behaviour even if the parameter did not point to a
properly aligned int - perhaps it points into a buffer of unsigned char
of incoming raw data. In that case, it /is/ more robust - it will work
even if changes to the code and build process (such as using LTO) give
the compiler enough information to know that a particular call to
getb1() is undefined behaviour and can lead to unexpected failures. I
don't like code that appears to work in a simple test case, but has
subtle flaws that mean it might not work in all cases.

And no, at least for this compiler, getb2() is not less efficient than
getb1(). I really don't care about a couple of microseconds of the
compiler's time - the efficiency I care about is at run-time.

(There's no argument about the readability.)
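
A usage sketch for the robust variant (getb2 repeated so the fragment
is self-contained): reading an int32_t out of a raw byte buffer at an
odd offset, which is exactly the case where the cast in getb1 has no
defined behaviour.

```cpp
#include <cstring>
#include <cstdint>
#include <cstddef>

std::int32_t getb2(const void* p) {
    std::int32_t x;
    std::memcpy(&x, p, sizeof x);  // defined for any alignment
    return x;
}

// Read a native-endian int32_t from an arbitrary (possibly odd) offset
// into a buffer of raw incoming bytes.
std::int32_t read_at(const unsigned char* buf, std::size_t off) {
    return getb2(buf + off);
}
```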


> If the language somehow requires to use getb2 instead of getb1, I see
> this as an inefficiency in the language.
> The fact that the compiler puts a remedy to this by applying some
> operation that is hidden to the language specification does not make the
> language itself any more efficient.

This particular branch of the thread was in response to Scott's strange
claim that using memcpy() in the code caused cache lines to be evicted.

David Brown

unread,
Feb 17, 2021, 3:42:12 PM2/17/21
to
That is not quite accurate. In gcc, memcpy with known small sizes
(usually regardless of alignment) is handled using a built-in version
that is going to be as good as it gets - memcpy(&a, &b, sizeof a) is
going to be roughly like "a = b" (but skipping any assignment operator
stuff). That won't necessarily apply to all other compilers, though gcc
is not alone in handling this.

I don't think memcpy can be implemented (or duplicated) in pure C or C++
of any standard, with all aspects of the way it copies effective types,
but I am not sure on that. However, that doesn't mean it has to be an
"intrinsic" - it just means it has to be implemented using extensions in
the compiler, or treated specially in some other way by the tool.

Bo Persson

unread,
Feb 17, 2021, 4:05:52 PM2/17/21
to
And therefore C++20 packages this into std::bit_cast, for your convenience.

https://en.cppreference.com/w/cpp/numeric/bit_cast

constexpr double f64v = 19880124.0;
constexpr auto u64v = std::bit_cast<std::uint64_t>(f64v);


Keith Thompson

unread,
Feb 17, 2021, 4:07:58 PM2/17/21
to
James Kuyper <james...@alumni.caltech.edu> writes:
[...]
> The values corresponding to a format specifier of %p are supposed to
> have a type of void*, otherwise the behavior is undefined. On
> implementations where all pointers have the same representation, which
> probably includes every implementation you've ever used, that's
> generally not a problem: the implementation defines the behavior that
> the standard leaves undefined, in precisely the way you presumably
> thought it was required to be defined.

Do most implementations actually *define* (i.e., document) the behavior
of passing a non-void* pointer with a %p format specifier? I haven't
checked, but it's plausible that most of them implement it in the
obvious way but don't bother to mention it. Since the behavior is
undefined, not implementation-defined, implementations are not required
to document it.

Chris Vine

unread,
Feb 17, 2021, 4:17:13 PM2/17/21
to
On Wed, 17 Feb 2021 21:19:41 +0100
David Brown <david...@hesbynett.no> wrote:
[snip]
> But complaints and blame should be appropriate. std::launder was not
> added so that previously safe code would now be unsafe without it - it
> was added so that previously unsafe code could now be written safely.

You probably have already deduced this, but this is to say I disagree.
And I don't think that C++03 was the source of this but if it was, I am
absolutely certain that the authors of C++03 didn't think they were
declaring past (and future) accepted practice to be invalid.

In the beginning (C88/89) was the strict-aliasing rule. The competent
C and subsequently C++ programmer promised that (subject to certain
specified exceptions) she would not dereference a pointer to an object
which did not in fact exist at the memory location in question. You
want to construct an object in uninitialized memory and cast your
pointer to the type of that object? Fine, you complied with the
strict-aliasing rule. Examples appeared in all the texts and websites,
including Stroustrup's C++PL 4th edition. No one thought C++03 had the
effect you mention.

But if C++03 was the source of the problem, then the mistake in the
standard should have been corrected. You don't correct it by inventing
additional rules which declare past practice (and C practice) to
comprise undefined behaviour. std::launder declares itself in a part
of the standard described as "Pointer optimization barrier". This is a
conceptual error, casting (!) the compiler's job (deciding whether code
can be optimized or not) onto the programmer. Write a better compiler
algorithm or don't do it at all.

Keith Thompson

unread,
Feb 17, 2021, 4:55:57 PM2/17/21
to
David Brown <david...@hesbynett.no> writes:
[...]
> I don't think memcpy can be implemented (or duplicated) in pure C or C++
> of any standard, with all aspects of the way it copies effective types,
> but I am not sure on that. However, that doesn't mean it has to be an
> "intrinsic" - it just means it has to be implemented using extensions in
> the compiler, or treated specially in some other way by the tool.

A pure C implementation of memcpy cannot, as I understand it, be
portable to all implementations. But a pure C implementation could work
correctly with a particular implementation, especially if the compiler
doesn't try to optimize based on the effective type rules.

Violations of the effective type rules result in undefined behavior. A
compiler *could* treat such violations in a way that's consistent with a
naive memcpy.
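
For concreteness, the kind of naive copy meant here might look like
this: byte-for-byte correct on any one implementation, but nothing in
the standard says it reproduces memcpy's effective-type behaviour.

```cpp
#include <cstddef>

// A naive memcpy lookalike: copies bytes through unsigned char, which
// may legally alias anything, but is not specified to transfer the
// source's effective type the way the real memcpy is.
void* naive_memcpy(void* dst, const void* src, std::size_t n) {
    unsigned char* d = static_cast<unsigned char*>(dst);
    const unsigned char* s = static_cast<const unsigned char*>(src);
    while (n--) {
        *d++ = *s++;
    }
    return dst;
}
```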

Chris Vine

unread,
Feb 17, 2021, 5:06:26 PM2/17/21
to
On Wed, 17 Feb 2021 13:55:41 -0800
Keith Thompson <Keith.S.T...@gmail.com> wrote:
> David Brown <david...@hesbynett.no> writes:
> [...]
> > I don't think memcpy can be implemented (or duplicated) in pure C or C++
> > of any standard, with all aspects of the way it copies effective types,
> > but I am not sure on that. However, that doesn't mean it has to be an
> > "intrinsic" - it just means it has to be implemented using extensions in
> > the compiler, or treated specially in some other way by the tool.
>
> A pure C implementation of memcpy cannot, as I understand it, be
> portable to all implementations. But a pure C implementation could work
> correctly with a particular implementation, especially if the compiler
> doesn't try to optimize based on the effective type rules.
>
> Violations of the effective type rules result in undefined behavior. A
> compiler *could* treat such violations in a way that's consistent with a
> naive memcpy.

The effective type of the result of applying memcpy is the type of the
destination. The effective type of a cast is the type of the source.
That is why memcpy works. Where it can be, memcpy is a compiler
illusion.

Keith Thompson

unread,
Feb 17, 2021, 5:28:58 PM2/17/21
to
Chris Vine <chris@cvine--nospam--.freeserve.co.uk> writes:
[...]
> In the beginning (C88/89) was the strict-aliasing rule.
[...]

What is C88? The first C standard was ANSI C89, which became ISO C90.
Of course work started before 1989, but there is no C88 standard.

Keith Thompson

unread,
Feb 17, 2021, 5:29:54 PM2/17/21
to
Does that contradict what I wrote?

Chris Vine

unread,
Feb 17, 2021, 5:43:32 PM2/17/21
to
On Wed, 17 Feb 2021 14:28:41 -0800
Keith Thompson <Keith.S.T...@gmail.com> wrote:
> Chris Vine <chris@cvine--nospam--.freeserve.co.uk> writes:
> [...]
> > In the beginning (C88/89) was the strict-aliasing rule.
> [...]
>
> What is C88? The first C standard was ANSI C89, which became ISO C90.
> Of course work started before 1989, but there is no C88 standard.

Indeed. A memory bus error.

Chris M. Thomasson

unread,
Feb 17, 2021, 6:18:42 PM2/17/21
to
Shit happens!

James Kuyper

unread,
Feb 17, 2021, 10:50:08 PM2/17/21
to
On 2/17/21 4:07 PM, Keith Thompson wrote:
> James Kuyper <james...@alumni.caltech.edu> writes:
> [...]
>> The values corresponding to a format specifier of %p are supposed to
>> have a type of void*, otherwise the behavior is undefined. On
>> implementations where all pointers have the same representation, which
>> probably includes every implementation you've ever used, that's
>> generally not a problem: the implementation defines the behavior that
>> the standard leaves undefined, in precisely the way you presumably
>> thought it was required to be defined.
>
> Do most implementations actually *define* (i.e., document) the behavior
> of passing a non-void* pointer with a %p format specifier?

I meant "define" only in the sense that they actually do something
useful, not that they've necessarily publicized that fact. In practice,
on systems where all pointer types have the same representation, they'd
have to go out of their way to make such code break.

James Kuyper

unread,
Feb 17, 2021, 10:57:33 PM2/17/21
to
On 2/17/21 1:43 PM, Manfred wrote:
> On 2/17/2021 5:30 PM, David Brown wrote:
>> On 17/02/2021 17:00, Scott Lurndal wrote:
...
The point is, the standard doesn't mandate that behavior for ANY of the
functions he defined. The standard doesn't even address the issue,
beyond giving implementations the freedom to generate any code that has
the same required observable behavior.

> Looking at your first and second example above there is no reason for
> which one should prefer getb2 over getb1, in fact getb2, as written, is
> less efficient than getb1 because it introduces some unneeded extra
> storage and an extra function call - albeit at the level of the abstract
> machine, with no benefit in readability or robustness against bugs.
> If the language somehow requires to use getb2 instead of getb1, I see
> this as an inefficiency in the language.
> The fact that the compiler puts a remedy to this by applying some
> operation that is hidden to the language specification does not make the
> language itself any more efficient.

The language itself is efficient because it's been quite deliberately
and carefully designed to allow implementations that are as efficient as
this one is. It's also inefficient, in that it doesn't prohibit
implementations that would convert all four functions into the same code
you'd naively expect to see generated for getb3().

Chris M. Thomasson

unread,
Feb 18, 2021, 12:03:08 AM2/18/21
to
On 2/7/2021 10:43 AM, James Kuyper wrote:
> On 2/7/21 1:22 PM, Anton Shepelev wrote:
>> Hello, all.
>>
>> My C module for pixel-perfect scaling comprises two files:
>> ppscale.c and ppscale.h . I meant it to be compiled into the
>> executable or into an intermediate object file, so that the
>> higher-level code need only include the .h file. But the
>> maintainer of a DOSBox fork has decided to include my module
>> into a C++ file thus:
>>
>> extern "C" {
>> #include "ppscale.h"
>> #include "ppscale.c"
>> }
[...]

Well, I have personally included a .c file. However, I have had to work
with a team that did that. I just never have had the need to do it.

Chris M. Thomasson

unread,
Feb 18, 2021, 12:04:00 AM2/18/21
to
On 2/17/2021 9:02 PM, Chris M. Thomasson wrote:
> On 2/7/2021 10:43 AM, James Kuyper wrote:
>> On 2/7/21 1:22 PM, Anton Shepelev wrote:
>>> Hello, all.
>>>
>>> My C module for pixel-perfect scaling comprises two files:
>>> ppscale.c and ppscale.h . I meant it to be compiled into the
>>> executable or into an intermediate object file, so that the
>>> higher-level code need only include the .h file. But the
>>> maintainer of a DOSBox fork has decided to include my module
>>> into a C++ file thus:
>>>
>>>     extern "C" {
>>>     #include "ppscale.h"
>>>     #include "ppscale.c"
>>>     }
> [...]
>
ARGHGHGH!

> Well, I have personally included a .c file. However, I have had to work
^^^^^^^^^^^^
NEVER

Well, I have personally NEVER included a .c file. However, I have had to

David Brown

unread,
Feb 18, 2021, 3:06:08 AM2/18/21
to
On 17/02/2021 22:17, Chris Vine wrote:
> On Wed, 17 Feb 2021 21:19:41 +0100
> David Brown <david...@hesbynett.no> wrote:
> [snip]
>> But complaints and blame should be appropriate. std::launder was not
>> added so that previously safe code would now be unsafe without it - it
>> was added so that previously unsafe code could now be written safely.
>
> You probably have already deduced this, but this is to say I disagree.
> And I don't think that C++03 was the source of this but if it was, I am
> absolutely certain that the authors of C++03 didn't think they were
> declaring past (and future) accepted practice to be invalid.

To my understanding (and I freely admit I didn't follow the changes in
C++ over time to the same extent as I did C), one of the changes in
C++03 was to give a clear (well, as clear as anything in these standards
documents...) specification of the memory model. Until then, a lot more
had been left up to chance of implementation.

At this time, compilers were getting noticeably smarter and doing more
optimisation - and processors were getting more complex (speculative
execution, out of order, multiprocessing, and so on). The old "it's all
obvious, right?" attitude to memory and object storage was not good enough.

It was not about trying to make old code invalid, it was about trying to
say what guarantees the compiler had to give you no matter what
optimisations it used or what processor you ran the code on. And like
many aspects of the C and C++ standards, it was based on trying to
understand what compilers did at the time, and what a substantial number
of programmers wrote - trying to get a consistent set of rules from
existing practice. Of course it is inevitable that some existing
practices and compilers would have to be changed.

(Again, I am /not/ claiming they got everything right or ideal here.)

>
> In the beginning (C88/89) was the strict-aliasing rule. The competent
> C and subsequently C++ programmer promised that (subject to certain
> specified exceptions) she would not dereference a pointer to an object
> which did not in fact exist at the memory location in question. You
> want to construct an object in uninitialized memory and cast your
> pointer to the type of that object? Fine, you complied with the
> strict-aliasing rule.

The "-fno-strict-aliasing" flag in gcc is a great idea, and I would be
happy to see a standardised pragma for the feature in all (C and C++)
compilers. But it is not standard. So in the beginning, there was the
"strict aliasing rule" ("effective type rules" is perhaps more accurate,
or "type-based aliasing rules"). There was no good standard way to get
around them - memcpy was slow (compilers were not as smart at that
time), there was no "-fno-strict-aliasing" flag to give you guarantees
of new semantics, and even union-based type punning was at best
"implementation defined". People wrote code that had no
standards-defined behaviour but worked in practice because compilers
were limited.

The code was /wrong/ - but practicalities and real life usually trump
pedantry and nit-picking, and code that /works/ is generally all you need.


Later C and C++ standards have gradually given more complete
descriptions of how things are supposed to work in these languages.
They have added things that weren't there in the older versions. C++03
did not make changes so that "X x(1); new (&x) X(2); x.foo();" is
suddenly wrong. It made changes to say when it is /right/ - older
versions didn't give any information and you relied on luck and weak
compilers. And C++17 didn't change the meanings here either - it just
gave you a new feature to help you write such code if you want it.


> Examples appeared in all the texts and websites,
> including Stroustrup's C++PL 4th edition. No one thought C++03 had the
> effect you mention.

Lots of C and C++ books - even written by experts - have mistakes. They
also have best practices of the day, which do not necessarily match best
practices of now. And they are certainly limited in their prediction of
the future.

Remember, the limitation that is resolved by adding std::launder stems
from C++03 (because before that, no one knew what was going on as it
wasn't specified at all), but the need for a fix was not discovered
until much later. That's why it is in C++17, not C++03. C++ and C are
complex systems - defect reports are created all the time.

>
> But if C++03 was the source of the problem, then the mistake in the
> standard should have been corrected. You don't correct it by inventing
> additional rules which declare past practice (and C practice) to
> comprise undefined behaviour. std::launder declares itself in a part
> of the standard described as "Pointer optimization barrier". This is a
> conceptual error, casting (!) the compiler's job (deciding whether code
> can be optimized or not) onto the programmer. Write a better compiler
> algorithm or don't do it at all.
>

I appreciate your point, and I am fully in favour of having the compiler
figure out the details rather than forcing the programmer to do so. I
don't know if the standard could have been "fixed" to require this here,
but certainly that would have been best.

However, it is clear that the use-cases of std::launder are rare - you
only need it in a few circumstances. And the consequences of letting
the compiler assume that const and reference parts of an object remain
unchanged are huge - devirtualisation in particular is a /massive/ gain
on code with lots of virtual functions. Do you think it is worth
throwing that out because some function taking a reference or pointer to
an object might happen to use placement new on it? How many times have
you used classes with virtual functions in your code over the years?
How many times have you used placement new on these objects?

If people needed to add std::launder a dozen times per file, I could
understand the complaints. If the new standards had actually changed
existing specified behaviour, I could understand. If they had changed
the meaning or correctness of existing code, I'd understand. (And that
/has/ happened with C++ changes.) But here it is just adding a new
feature that will rarely be needed.

David Brown

unread,
Feb 18, 2021, 4:09:51 AM2/18/21
to
On 17/02/2021 22:05, Bo Persson wrote:

>
> And therefore C++20 packages this into std::bit_cast, for your convenience.
>
> https://en.cppreference.com/w/cpp/numeric/bit_cast
>
> constexpr double f64v = 19880124.0;
> constexpr auto u64v = std::bit_cast<std::uint64_t>(f64v);
>
>

Yes, std::bit_cast will be useful in some cases.

I don't know if the practice will be much neater than memcpy for cases
such as reading data from a buffer - you'd still need somewhat ugly
casts for accessing the data (such as to reference a 4-byte subsection
of a large unsigned char array). Of course that kind of stuff can be
written once in a template or function and re-used.

It would have the advantage over memcpy of being efficient even on
weaker compilers, as it will not (should not!) lead to a function call.
But are there compilers that can't optimise simple memcpy and also
support C++20?

I think the clearest use-case for bit_cast will be in situations where
you would often use union-based type punning in C:

uint32_t float_bits_C(float f) {
    union { float f; uint32_t u; } u;
    u.f = f;
    return u.u;
}

uint32_t float_bits_Cpp(float f) {
    return std::bit_cast<uint32_t>(f);
}

or if you want to combine them :-) :

uint32_t float_bits(float f) {
    union { float f; uint32_t u; } u;
    u.f = f;
#ifdef __cplusplus
    u.u = std::bit_cast<uint32_t>(u.f);
#endif
    return u.u;
}


David Brown

unread,
Feb 18, 2021, 4:14:50 AM2/18/21
to
On 17/02/2021 22:55, Keith Thompson wrote:
> David Brown <david...@hesbynett.no> writes:
> [...]
>> I don't think memcpy can be implemented (or duplicated) in pure C or C++
>> of any standard, with all aspects of the way it copies effective types,
>> but I am not sure on that. However, that doesn't mean it has to be an
>> "intrinsic" - it just means it has to be implemented using extensions in
>> the compiler, or treated specially in some other way by the tool.
>
> A pure C implementation of memcpy cannot, as I understand it, be
> portable to all implementations. But a pure C implementation could work
> correctly with a particular implementation, especially if the compiler
> doesn't try to optimize based on the effective type rules.

Yes. (Thanks for adding that clarification.)

>
> Violations of the effective type rules result in undefined behavior. A
> compiler *could* treat such violations in a way that's consistent with a
> naive memcpy.
>

Sure.

Chris Vine

unread,
Feb 18, 2021, 5:44:58 AM2/18/21
to
On Thu, 18 Feb 2021 09:05:52 +0100
David Brown <david...@hesbynett.no> wrote:
[snip]
> I appreciate your point, and I am fully in favour of having the compiler
> figure out the details rather than forcing the programmer to do so. I
> don't know if the standard could have been "fixed" to require this here,
> but certainly that would have been best.
>
> However, it is clear that the use-cases of std::launder are rare - you
> only need it in a few circumstances. And the consequences of letting
> the compiler assume that const and reference parts of an object remain
> unchanged are huge - devirtualisation in particular is a /massive/ gain
> on code with lots of virtual functions. Do you think it is worth
> throwing that out because some function taking a reference or pointer to
> an object might happen to use placement new on it? How many times have
> you used classes with virtual functions in your code over the years?
> How many times have you used placement new on these objects?
>
> If people needed to add std::launder a dozen times per file, I could
> understand the complaints. If the new standards had actually changed
> existing specified behaviour, I could understand. If they had changed
> the meaning or correctness of existing code, I'd understand. (And that
> /has/ happened with C++ changes.) But here it is just adding a new
> feature that will rarely be needed.

Well the requirement to use std::launder applies to any case where,
even though the strict aliasing rules have been met, the pointers
concerned are not 'pointer-interconvertible', which is an entirely new
concept introduced in C++17 and as far as I can see has little to do
with the C++03 memory model.

I think we have taken this about as far as it is possible to take it. I
remain sceptical that imposing std::launder on programmers in C++17
as a supplement to strict aliasing has a significant effect on the
compiler's ability to optimise code, and I don't buy the story that "Oh
we discovered in 2017 that code complying with the strict aliasing rule
still has undefined behaviour because of our memory model so we have
provided std::launder to be helpful". If helpfulness was required the
C++17 concept of pointer interconvertibility could have been developed
differently.

Chris M. Thomasson

unread,
Feb 18, 2021, 6:15:22 AM2/18/21
to
Damn!

>
> I think we have taken this about as far as it is possible to take it. I
> remain sceptical that imposing std::launder on programmers in C++17
> as a supplement to strict aliasing has a significant effect on the
> compiler's ability to optimise code, and I don't buy the story that "Oh
> we discovered in 2017 that code complying with the strict aliasing rule
> still has undefined behaviour because of our memory model so we have
> provided std::launder to be helpful". If helpfulness was required the
> C++17 concept of pointer interconvertibility could have been developed
> differently.
>

std::launder, well, shit...

David Brown

unread,
Feb 18, 2021, 6:27:38 AM2/18/21
to
Fair enough. I am unconvinced that C++17 adds new requirements for code
to be correct - but it might well be that the new clarifications give
compilers a license to optimise in a way that could be unexpected. For
practical purposes for programmers and compiler users, that amounts to
much the same thing.

> I
> remain sceptical that imposing std::launder on programmers in C++17
> as a supplement to strict aliasing has a significant effect on the
> compiler's ability to optimise code, and I don't buy the story that "Oh
> we discovered in 2017 that code complying with the strict aliasing rule
> still has undefined behaviour because of our memory model so we have
> provided std::launder to be helpful". If helpfulness was required the
> C++17 concept of pointer interconvertibility could have been developed
> differently.
>

Again - I am not going to claim that the memory model, alias rules,
std::launder, or anything else is the best or clearest solution.

mick...@potatofield.co.uk

unread,
Feb 19, 2021, 5:23:18 AM2/19/21
to
On Wed, 17 Feb 2021 13:26:24 -0500
James Kuyper <james...@alumni.caltech.edu> wrote:
>On 2/17/21 12:12 PM, mick...@potatofield.co.uk wrote:
>> That int looks nonaligned to me.
>
>For the reasons given above, on an implementation which can have the

So one minute you're saying it CANNOT happen, next minute it's "oh, except
on architectures where it can."

Whatever mate.

>Are you claiming that you just compiled this code using such an
>implementation of C? If so, which implementation is it, and what is the
>target platform? In particular, what are the values of
>_Alignof(uint32_t), sizeof(uint32_t*), and sizeof(char*)?


All irrelevant. You said - and I quote - it CANNOT happen. Well it can. End of.

mick...@potatofield.co.uk

unread,
Feb 19, 2021, 5:24:24 AM2/19/21
to
On Wed, 17 Feb 2021 11:27:56 -0800
Keith Thompson <Keith.S.T...@gmail.com> wrote:
>mick...@potatofield.co.uk writes:
>> That int looks nonaligned to me.
>
>James was talking about "a pointer type that can only represent
>positions at the beginning of a word", something that doesn't exist on
>the implementation you're using.

He said it cannot happen. He didn't say it cannot happen on certain architectures,
he said it cannot, full stop. Well it can.

>On the implementation you're using, a uint32_t* pointer value *can*
>point to an odd address, and such a pointer can be dereferenced
>successfully.

You don't say.

James Kuyper

unread,
Feb 19, 2021, 8:58:04 AM2/19/21
to
On 2/19/21 5:23 AM, mick...@potatofield.co.uk wrote:
> On Wed, 17 Feb 2021 13:26:24 -0500
> James Kuyper <james...@alumni.caltech.edu> wrote:
>> On 2/17/21 12:12 PM, mick...@potatofield.co.uk wrote:
>>> That int looks nonaligned to me.
>>
>> For the reasons given above, on an implementation which can have the
>
> So one minute you're saying it CANNOT happen, next minute its "oh, except
> on architectures where it can."

What I said was:

"Conversion of a pointer that doesn't point at the beginning of a word
to a pointer type that can only represent positions at the beginning of
a word CANNOT result in a pointer to the same location.".

That is absolutely true, without exceptions: if the pointer type is only
capable of representing positions at the beginning of a word, it cannot
represent the same position as a pointer that doesn't point to the
beginning of the word. Therefore, conversion of such a pointer to such a
type cannot result in a pointer that points at the same location.

On implementations where every pointer type can represent any position
in memory, the statement isn't false - the conditions under which it
would apply simply cannot occur.

And I was very clear at every stage of this discussion that I was
talking about behavior which can vary from one implementation to
another. Relevant quotes:

"If a compiler performs an optimization ..."
"... has undefined behavior ..."
"Most C compilers do ..."
"It's not guaranteed to, ..."
"Implementations are allowed ..."
"... there have been implementations ..."
"... on an implementation where ..."
"... the implementation defines ..."
"... on implementations where ..."
"... on an implementation which ..."

I've tried my best to make it crystal clear that I was talking about
behavior which can vary between one implementation and another. If
there's anything I could have written differently to prevent you from
getting confused about that aspect of what I was saying, please let me
know how you think I should have re-worded it.

James Kuyper

unread,
Feb 19, 2021, 9:03:34 AM2/19/21
to
On 2/19/21 5:24 AM, mick...@potatofield.co.uk wrote:
> On Wed, 17 Feb 2021 11:27:56 -0800
> Keith Thompson <Keith.S.T...@gmail.com> wrote:
>> mick...@potatofield.co.uk writes:
>>> That int looks nonaligned to me.
>>
>> James was talking about "a pointer type that can only represent
>> positions at the beginning of a word", something that doesn't exist on
>> the implementation you're using.
>
> He said it cannot happen. He didn't say it cannot on certain architectures,
> he said it cannot, full stop. Well it can.

Yes, I was quite clear about it being implementation-dependent. I said
"a pointer type which can only represent positions at the beginning of a
word". I made it quite clear that whether or not there are any such
types is something that depends upon the implementation.

mick...@potatofield.co.uk

unread,
Feb 19, 2021, 9:15:14 AM2/19/21
to
In other words it's not part of the standard, which is the position you were
arguing from.

james...@alumni.caltech.edu

unread,
Feb 19, 2021, 11:09:20 PM2/19/21
to
On Friday, February 19, 2021 at 9:15:14 AM UTC-5, mick...@potatofield.co.uk wrote:
> On Fri, 19 Feb 2021 09:03:20 -0500
> James Kuyper <james...@alumni.caltech.edu> wrote:
...
> >Yes, I was quite clear about it being implementation-dependent. I said
> >"a pointer type which can only represent positions at the beginning of a
> >word". I made it quite clear that whether or not there are any such
> >types is something that depends upon the implementation.
> In other words its not part of the standard which is the position you were
> arguing from.

I was arguing, from the standard, that the behavior is undefined, which is precisely
what allows the behavior to be different on different implementations. I see that I
never got around to citing the precise text from the standard, because I was
concentrating on the effective type issue. Well, here it is:

“A pointer to an object type may be converted to a pointer to a different object type.
If the resulting pointer is not correctly aligned 68) for the referenced type, the
behavior is undefined.” (C standard 6.3.2.3p7).

Notice that it is the conversion itself that has undefined behavior. You don’t even
have to dereference the pointer to run into problems, simply trying to create such
a pointer may cause problems.

Tim Rentsch

unread,
Feb 20, 2021, 9:27:44 AM2/20/21
to
Keith Thompson <Keith.S.T...@gmail.com> writes:

> David Brown <david...@hesbynett.no> writes:
> [...]
>
>> I don't think memcpy can be implemented (or duplicated) in pure C or C++
>> of any standard, with all aspects of the way it copies effective types,
>> but I am not sure on that. However, that doesn't mean it has to be an
>> "intrinsic" - it just means it has to be implemented using extensions in
>> the compiler, or treated specially in some other way by the tool.
>
> A pure C implementation of memcpy cannot, as I understand it, be
> portable to all implementations. [...]

Can you expand on this statement? AFAICS a rather simple writing
of a memcpy() function would be portable to all conforming C
implementations, as for example

void *
memcpy( void *vd, const void *vs, size_t n ){
    unsigned char *p = vd;
    const unsigned char *q = vs;
    while( n > 0 ) n--, *p++ = *q++;
    return vd;
}

In what way does this definition of memcpy() not satisfy the
specifications given in the ISO C standard? Or does it pass
that test?

Anton Shepelev

unread,
Feb 21, 2021, 6:55:11 AM2/21/21
to
Chris M. Thomasson:

> > Well, I have personally included a .c file.
> ^^^^^^^^^^^^
> NEVER

A Freudian slip. Now we know what coding practices Mr.
Thomasson indulges in, when in private :-)

--
() ascii ribbon campaign -- against html e-mail
/\ http://preview.tinyurl.com/qcy6mjc [archived]

Chris M. Thomasson

unread,
Feb 21, 2021, 5:16:02 PM2/21/21
to
On 2/21/2021 3:54 AM, Anton Shepelev wrote:
> Chris M. Thomasson:
>
>>> Well, I have personally included a .c file.
>> ^^^^^^^^^^^^
>> NEVER
>
> A Freudian slip. Now we know what coding practices Mr.
> Thomasson indulges in, when in private :-)
>

lol! :^D

I just never found the need to include a .c file. However I have had to
work on code that did do that. Shit happens. :^)

Juha Nieminen

unread,
Feb 22, 2021, 5:06:44 AM2/22/21
to
Sometimes including a source file (not a header file) may be the most
practical way of dealing with generated code. The most common case of this
probably being an array of values in the form of C/C++ source code generated
by a program, perhaps even your own program. If you want this array to have
internal linkage (ie. you want it to be "static") and you want to be able
to write code in that compilation unit that uses the array, then this is
basically the only option. And I don't think there's anything wrong with it.

(The only other option is to put the array in its own compilation unit and
have it have external linkage. This pollutes the global namespace and
hinders possible compiler optimizations in the code that uses the array.)

David Brown

unread,
Feb 22, 2021, 5:59:46 AM2/22/21
to
I have done exactly that with generated code. But I usually give it a
specific extension such as ".inc" to distinguish it from ordinary
human-written C or C++ files.

This is not the /only/ option, but it is often the most convenient one.
(At least some arrays that could only be generated with external
programs can now be done using compile-time calculations in C++. I'd
probably use that for CRC tables and the like in new code. But that
won't work for everything.)

Kaz Kylheku

unread,
Feb 22, 2021, 10:22:08 AM2/22/21
to
On 2021-02-22, Juha Nieminen <nos...@thanks.invalid> wrote:
> In comp.lang.c++ Chris M. Thomasson <chris.m.t...@gmail.com> wrote:
>> On 2/21/2021 3:54 AM, Anton Shepelev wrote:
>>> Chris M. Thomasson:
>>>
>>>>> Well, I have personally included a .c file.
>>>> ^^^^^^^^^^^^
>>>> NEVER
>>>
>>> A freudian slip. Now we now what coding practices Mr.
>>> Thomasson indulges in, when in private :-)
>>>
>>
>> lol! :^D
>>
>> I just never found the need to include .c file. However I have had to
>> work on code that did do that. Shit happens. :^)
>
> Sometimes including a source file (not a header file) may be the most
> practical way of dealing with generated code. The most common case of this
> probably being an array of values in the form of C/C++ source code generated
> by a program, perhaps even your own program.

Both .h and .c files are source files.

A source file that is #include -d should have a .h suffix, even
if it contains definitions that would prevent it from being
included in more than one place.

Using .c for include files causes confusion; it is a deeply entrenched
convention that .c means "I am the root file of a translation unit".

If that is a lie, you are flouting the deeply rooted convention,
and should be whipped naked with a wet noodle.

> internal linkage (ie. you want it to be "static") and you want to be able
> to write code in that compilation unit that uses the array, then this is
> basically the only option. And I don't think there's anything wrong with it.

You have the excellent option of naming it with a .h suffix.

--
TXR Programming Language: http://nongnu.org/txr
Cygna: Cygwin Native Application Library: http://kylheku.com/cygnal

anti...@math.uni.wroc.pl

unread,
Feb 22, 2021, 1:41:20 PM2/22/21
to
In comp.lang.c Kaz Kylheku <563-36...@kylheku.com> wrote:
>
> Both .h and .c files are source files.
>
> A source file that is #include -d should have a .h suffix, even
> if it contains definitions that would prevent it from being
> included in more than one place.
>
> Using .c for include files causes confusion; it is a deeply entrenched
> convention that .c means "I am the root file of a translation unit".

What about file that includes itself? It is both "root" and
included.

>
> If that is a lie, you are flouting the deeply rooted convention,
> and should be whipped naked with a wet noodle.

My convention is that .h specifies interface. I prefer something
like '.def' or '.inc' for a file that is included in specific places
(a specific context) and would not work otherwise.

--
Waldek Hebisch

Joe Pfeiffer

unread,
Feb 22, 2021, 1:46:14 PM2/22/21
to
anti...@math.uni.wroc.pl writes:

> In comp.lang.c Kaz Kylheku <563-36...@kylheku.com> wrote:
>>
>> Both .h and .c files are source files.
>>
>> A source file that is #include -d should have a .h suffix, even
>> if it contains definitions that would prevent it from being
>> included in more than one place.
>>
>> Using .c for include files causes confusion; it is a deeply entrenched
>> convention that .c means "I am the root file of a translation unit".
>
> What about file that includes itself? It is both "root" and
> included.

I find it hard to imagine a clearer indication of badly, badly broken
code than a file that includes itself.

Kaz Kylheku

unread,
Feb 22, 2021, 1:50:10 PM2/22/21
to
["Followup-To:" header set to comp.lang.c.]
On 2021-02-22, anti...@math.uni.wroc.pl <anti...@math.uni.wroc.pl> wrote:
> In comp.lang.c Kaz Kylheku <563-36...@kylheku.com> wrote:
>>
>> Both .h and .c files are source files.
>>
>> A source file that is #include -d should have a .h suffix, even
>> if it contains definitions that would prevent it from being
>> included in more than one place.
>>
>> Using .c for include files causes confusion; it is a deeply entrenched
>> convention that .c means "I am the root file of a translation unit".
>
> What about file that includes itself? It is both "root" and
> included.

Obviously, the conventions have to be relaxed if we are to
make submissions to IOCCC.

In production code, if you need such a thing, you can split
it into a non-self-referential .c part, and a self-referential .h
part.

You can always find content for that .c part that doesn't have
to be in the .h part. If nothing else, then a copyright header.

