Strange optimization

Bonita Montero

Jun 13, 2023, 2:31:19 PM

What is the if( ... ) below good for ?

template<bit_cursor_iterator BitIt>
constexpr uint64_t bit_cursor<BitIt>::dRead( uint64_t const &data ) noexcept
{
#if defined(__GNUC__) || defined(__llvm__)
    if( (size_t)&data % 8 )
        __builtin_unreachable();
#endif
    uint64_t value;
    memcpy( &value, &data, sizeof(uint64_t) );
    return value;
}

Bo Persson

Jun 13, 2023, 2:36:45 PM

It tells the compiler that the data is properly aligned. Possibly saves
a potential call to library memcpy to sort out unaligned bytes.

Alf P. Steinbach

Jun 13, 2023, 11:12:00 PM

But the function is nonsense, copying an `uint64_t` to `uint64_t`.

And with that in mind it doesn't matter that proper alignment is already
guaranteed via the types, i.e. that the alignment hint is nonsense
embedded in nonsense.

At a guess, the whole thing is probably wrapped in some nonsense at the
level where it's called, to make it Churchill-esque: nonsense containing
some nonsense wrapped in nonsense.

---

More constructively, the compiler-specific `__builtin_unreachable();`
can possibly be replaced with portable `for(;;){}`, which is formally UB
and thus tells the compiler to assume it will never be reached.

And the nonsensical `memcpy` can be replaced with safer assignment, `=`.

And the assignment can be replaced with initialization.

---

But then, the whole nonsensical shebang can be replaced with nothing,
and I guess that's worth keeping in mind in general.

- Alf

Bonita Montero

Jun 14, 2023, 12:05:24 AM

On 14.06.2023 at 05:11, Alf P. Steinbach wrote:

> But the function is nonsense, copying an `uint64_t` to `uint64_t`.

No, it isn't, because the uint64_t refers to aliased content.

Alf P. Steinbach

Jun 14, 2023, 1:52:45 AM

It's copying an `uint64_t` that is known to be correctly aliased, to an
`uint64_t`; that's nonsense.

It's using `memcpy` instead of `=` for the copying; that's some nested
nonsense.

Maybe I should state this more clearly: it's retarded code.

- Alf

Alf P. Steinbach

Jun 14, 2023, 1:54:35 AM

On 2023-06-14 7:52 AM, Alf P. Steinbach wrote:
> On 2023-06-14 6:05 AM, Bonita Montero wrote:
>> On 14.06.2023 at 05:11, Alf P. Steinbach wrote:
>>
>>> But the function is nonsense, copying an `uint64_t` to `uint64_t`.
>>
>> No, it isn't because the uint64_t is an aliased content.
>
> It's copying an `uint64_t` that is known to be correctly aliased, to an
> `uint64_t`; that's nonsense.

Aligned, not aliased. Sorry.

Bonita Montero

Jun 14, 2023, 1:56:43 AM

On 14.06.2023 at 07:52, Alf P. Steinbach wrote:

> It's copying an `uint64_t` that is known to be correctly aliased, to an
> `uint64_t`; that's nonsense.

The reference initially supplied by the caller is cast from a char
array. memcpy() is the only legal way in C++ to alias that content.

Bonita Montero

Jun 14, 2023, 3:22:29 AM

memcpy() is a partially intrinsic function, i.e. if you want to copy
a small, statically known amount of memory, the compiler uses a few moves
or even copies the content directly into a register.
My bit_cursor provides bit-field access into a char-sized array. The
compiler can't guess that the address supplied to memcpy() is aligned,
so I give it a hint with the above code. According to godbolt this works
with current clang++, but unfortunately not with g++ when you compile
code for RISC-V (support for unaligned accesses is optional on RISC-V).
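For readers following along, the pattern under discussion (reading a 64-bit value out of a byte buffer with memcpy) might look roughly like the sketch below. The name `load_u64` is hypothetical and is not part of the bit_cursor class from the post; whether the copy is inlined depends on the compiler and optimisation level.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the original post): reads
// sizeof(std::uint64_t) bytes starting at `bytes` into a uint64_t.
// No alignment is required of `bytes`, and no uint64_t lvalue is ever
// formed on the buffer, so the aliasing rules are not involved.
// The result is in the host's native byte order.
inline std::uint64_t load_u64(const unsigned char* bytes) noexcept
{
    std::uint64_t value;
    // Many compilers replace a small constant-size memcpy with plain
    // loads, but the standard does not require that.
    std::memcpy(&value, bytes, sizeof value);
    return value;
}
```

A caller would pass a pointer into its byte buffer at any offset, for example `load_u64(buf + 3)`.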

David Brown

Jun 14, 2023, 4:32:45 AM

Well, you can also use std::byte as an alternative to memcpy. But
memcpy works for all standards.

David Brown

Jun 14, 2023, 7:06:40 AM

On 14/06/2023 09:22, Bonita Montero wrote:
> On 13.06.2023 at 20:36, Bo Persson wrote:
>
>> On 2023-06-13 at 20:31, Bonita Montero wrote:
>
>>> What is the if( ... ) below good for ?
>>>
>>> template<bit_cursor_iterator BitIt>
>>> constexpr uint64_t bit_cursor<BitIt>::dRead( uint64_t const &data )
>>> noexcept
>>> {
>>> #if defined(__GNUC__) || defined(__llvm__)
>>>      if( (size_t)&data % 8 )
>>>          __builtin_unreachable();
>>> #endif
>>>      uint64_t value;
>>>      memcpy( &value, &data, sizeof(uint64_t) );
>>>      return value;
>>> }
>
>> It tells the compiler that the data is properly aligned. Possibly
>> saves a potential call to library memcpy to sort out unaligned bytes.
>
> memcpy() is a partially intrinsic function, i.e. if you want to copy
> a small, statically known amount of memory, the compiler uses a few moves
> or even copies the content directly into a register.

Note that this applies only to specific implementations of memcpy().
The function itself is a standard library function, and can be
implemented with whatever magic the compiler provides, but there is
absolutely no requirement for a given implementation to handle it as
inline or with any special optimisations.

Most decent compilers will, however, optimise memcpy() as inline when
given a small constant size for the number of bytes, and even omit it
entirely if it doesn't need to do anything in the generated code. (It
can still be useful for accessing data in ways that might be contrary to
the "strict aliasing" rules or alignment requirements, even when no code
is generated - as in this case.)

> My bit_cursor provides bit-field access into a char-sized array. The
> compiler can't guess that the address supplied to memcpy() is aligned,
> so I give it a hint with the above code. According to godbolt this works
> with current clang++, but unfortunately not with g++ when you compile
> code for RISC-V (support for unaligned accesses is optional on RISC-V).
>

I can understand the attempt, but it doesn't actually work. Look at the
generated code with godbolt.org.

First, it's worth noting that the use of "constexpr" here is wrong. Use
of uninitialised variables, memcpy and pointer casts are all naughty in
constexpr functions (later C++ standards are a little laxer than earlier
ones).

There is no need to test for "__llvm__" here when you test for
"__GNUC__", as clang/llvm also defines "__GNUC__". Your cast should be
to "uintptr_t", not "size_t". And if you want to be /really/ portable,
or at least consistent, you should use sizeof(uint64_t) instead of 8.
(All of gcc's targets have 8-bit char, so it is also not unreasonable to
use 8 everywhere instead of sizeof(uint64_t).)

Regarding the optimisation hint, you have a problem.
"__builtin_unreachable()" can /sometimes/ be used as an optimisation
hint, but it is not really the purpose of the builtin, and sometimes it
does not work as you expect. In particular, different passes of the
optimiser can make things work in different ways. So if you look at
32-bit ARM or 64-bit RISC-V, your code gives optimal results for -O1.
However, at -O2 the compiler is eliminating the unreachable part earlier
and losing that information, leading to slow unaligned load code.

The correct way to handle this kind of thing is __builtin_assume_aligned:

#include <cstdint>
#include <cstring>

uint64_t dRead2(const uint64_t &data) {
#if defined(__GNUC__) || defined(__llvm__)
    // Tell the compiler that &data is aligned to sizeof(uint64_t) bytes.
    uint64_t value;
    const uint64_t *p = (const uint64_t *)
        __builtin_assume_aligned(&data, sizeof(uint64_t));
    memcpy( &value, p, sizeof(uint64_t) );
    return value;
#else
    // No alignment hint available; plain memcpy is still correct.
    uint64_t value;
    memcpy( &value, &data, sizeof(uint64_t) );
    return value;
#endif
}

<https://godbolt.org/z/P3oj1fzs9>

Still, it would be nice if your code was also optimised the way you
want, so I filed a missed-optimisation issue for gcc.

<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110249>
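As a side note that is not from the thread: C++20 added std::assume_aligned in <memory>, which expresses the same hint portably. The sketch below is a minimal illustration under that assumption; `load_u64_aligned` is a hypothetical name, and the feature-test macro lets the code fall back to a plain memcpy on older library versions. Whether a given compiler actually uses the hint has to be checked in the generated code.

```cpp
#include <cstdint>
#include <cstring>
#include <memory>   // std::assume_aligned, if the library provides it

// Hypothetical helper (not from the thread).
// Precondition (assumed, never checked): p is aligned to
// alignof(std::uint64_t). Passing a misaligned pointer is undefined
// behaviour when the hint is active.
inline std::uint64_t load_u64_aligned(const unsigned char* p) noexcept
{
#if defined(__cpp_lib_assume_aligned)
    // Returns the same pointer, annotated with the alignment promise.
    p = std::assume_aligned<alignof(std::uint64_t)>(p);
#endif
    std::uint64_t value;
    std::memcpy(&value, p, sizeof value);
    return value;
}
```

The hint only helps when the promise is true; it removes no checks at run time because there are none.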

Bonita Montero

Jun 14, 2023, 9:31:57 AM

On 14.06.2023 at 13:06, David Brown wrote:

> Note that this applies only to specific implementations of memcpy(). ...

memcpy() is the only way to reliably alias something in C++ and
therefore it's safe to assume that it is partially intrinsic.

> I can understand the attempt, but it doesn't actually work.
> Look at the generated code with godbolt.org.

I checked it for clang++ and RISC-V and the trick worked; without
it the compiler does eight loads to eight registers followed by eight
stores to an aligned entity on the stack and a single load from the
stack. But this currently doesn't work with g++ and RISC-V.

> There is no need to test for "__llvm__" here when you test for
> "__GNUC__", as clang/llvm also defines "__GNUC__". Your cast
> should be to "uintptr_t", not "size_t".

I use size_t because that's a safe assumption with flat memory.

> Regarding the optimisation hint, you have a problem.
> "__builtin_unreachable()" can /sometimes/ be used as

> an optimisation hint, ...

Check godbolt and RISC-V clang++.

David Brown

Jun 14, 2023, 10:36:30 AM

On 14/06/2023 15:31, Bonita Montero wrote:
> On 14.06.2023 at 13:06, David Brown wrote:
>
>> Note that this applies only to specific implementations of memcpy(). ...
>
> memcpy() is the only way to reliably alias something in C++ and
> therefore it's safe to assume that it is partially intrinsic.

That's not what "intrinsic" means. An "intrinsic" is basically a
wrapper function around a cpu-specific feature or instruction. So if
you want to use advanced SIMD instructions, you are likely to include a
header with macros (they are often macros, for use in C as much as C++)
defining these for a particular CPU family.

It is not long since memcpy() on MSVC was always implemented as a call
to an external library function in a DLL, which was a huge overhead for
such a simple feature.

And no, "memcpy" is /not/ the only way to access aliased data or access
data through pointers of non-compatible types. You can use any
character pointer, with pointers to unsigned char being particularly
useful. (This is precisely how "memcpy" is able to work.) std::byte is
a more modern C++ replacement for using unsigned char in such contexts.

I agree that memcpy is often the right choice here, and I agree that
most decent compilers will optimise it very well. But /some/ compilers
do a bad job of handling it.
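The character-pointer access described above can be illustrated with a short sketch. `object_byte` is a hypothetical name used only for this example; it inspects the object representation of an existing uint64_t through an `unsigned char` pointer, which is the direction the aliasing rules permit.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the thread): returns byte i of v's
// object representation. Reading any object through an unsigned char
// lvalue is permitted by the aliasing rules.
// Precondition (assumed): i < sizeof(std::uint64_t).
inline unsigned char object_byte(const std::uint64_t& v, std::size_t i) noexcept
{
    const auto* bytes = reinterpret_cast<const unsigned char*>(&v);
    return bytes[i];
}
```

Which byte comes first depends on the target's byte order, so the values returned for a given index differ between little-endian and big-endian machines.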

>
>> I can understand the attempt, but it doesn't actually work.
>> Look at the generated code with godbolt.org.
>
> I checked it for clang++ and RISC-V and the trick worked; without
> it the compiler does eight loads to eight registers followed by eight
> stores to an aligned entity on the stack and a single load from the
> stack. But this currently doesn't work with g++ and RISC-V.
>

You did not check gcc.

You might have looked at the results for "-O1" optimisation, but you did
not look at the results for "-O2" optimisation.

I spent quite a bit of effort on the post I made to you, because I
thought it was an interesting case. I would appreciate it if you
actually read the post.

>> There is no need to test for "__llvm__" here when you test for
>> "__GNUC__", as clang/llvm also defines "__GNUC__". Your cast
>> should be to "uintptr_t", not "size_t".
>
> I use size_t because that's a safe assumption with flat memory.
>

It is still not the appropriate type. "uintptr_t" is the type for such
use-cases (even though it will almost certainly be implemented with the
same underlying integer type as size_t). The flatness of the memory is
irrelevant.

>> Regarding the optimisation hint, you have a problem.
>> "__builtin_unreachable()" can /sometimes/ be used as
>> an optimisation hint, ...
>
> Check godbolt and RISC-V clang++.
>

You did not look at gcc.

Writing code with a specific conditional compilation that explicitly
targets gcc, and then not testing properly for gcc, is just silly.

Did you try the godbolt link I sent?

Bonita Montero

Jun 14, 2023, 10:46:30 AM

On 14.06.2023 at 16:36, David Brown wrote:

> On 14/06/2023 15:31, Bonita Montero wrote:

>> memcpy() is the only way to reliably alias something in C++ and
>> therefore it's safe to assume that it is partially intrinsic.

> That's not what "intrinsic" means. <rest unread>

For me it's partially intrinsic.

> It is not long since memcpy() on MSVC was always implemented as a call
> to an external library function in a DLL, which was a huge overhead for
> such a simple feature.

Since it's the only way to have safe aliasing in C++ you can silently
rely on that.

> And no, "memcpy" is /not/ the only way to access aliased data or access
> data through pointers of non-compatible types. You can use any

> character pointer, ...

In C you can alias anything as a char array and vice versa,
but not in C++.

> You did not check gcc.

I checked it and this trick doesn't work with g++ / RISC-V
_with_ -O2 but I hope this will change, so I check for __GNUC__.

> It is still not the appropriate type. ...

For me it is because I use systems with flat memory. And in this case
I could even use a char since I only check for the three lower bits to
be zero.

Scott Lurndal

Jun 14, 2023, 11:14:23 AM

David Brown <david...@hesbynett.no> writes:
>On 14/06/2023 15:31, Bonita Montero wrote:
>> On 14.06.2023 at 13:06, David Brown wrote:
>>
>>> Note that this applies only to specific implementations of memcpy(). ...
>>
>> memcpy() is the only way to reliably alias something in C++ and
>> therefore it's safe to assume that it is partially intrinsic.
>
>That's not what "intrinsic" means. An "intrinsic" is basically a
>wrapper function around a cpu-specific feature or instruction.

FWIW, the HP-3000 MPE operating system interfaces (system calls) were called
"intrinsics".

Bonita Montero

Jun 14, 2023, 11:26:01 AM

On 14.06.2023 at 17:14, Scott Lurndal wrote:

> FWIW, the HP-3000 MPE operating system interfaces (system calls) were called
> "intrinsics".

An intrinsic is a function that maps directly to a processor instruction.
That holds true for memcpy() with small static sizes, because
memcpy() is too important to be mapped to a library call for that:
if you're pedantic, memcpy() is the only safe way to have
aliasing in C++.
In C you can alias anything as a char array and a char array as
anything, so it would be safe to alias anything as anything with
double-casting (I guess that's correctly supported by the compilers).

Scott Lurndal

Jun 14, 2023, 11:43:27 AM

Bonita Montero <Bonita....@gmail.com> writes:
>On 14.06.2023 at 17:14, Scott Lurndal wrote:
>
>> FWIW, the HP-3000 MPE operating system interfaces (system calls) were called
>> "intrinsics".
>
>An intrinsic is a function that directly maps to a processor
>instruction.

You don't get to unilaterally determine anything, much less what an
"intrinsic" is.

Alf P. Steinbach

Jun 14, 2023, 3:47:37 PM

When that function is separately compiled in a different translation
unit, how is the compiler to know when compiling calling code that the
function uses `memcpy` internally,

and when compiling the function, how is the compiler to know that it's
generally called with a `*reinterpret_cast<uint64_t*>( p_bytes )` as
argument?

Answer: it doesn't know, in either case.

So generally (let's disregard global optimization with link time
compilation) `memcpy` versus `=` can't affect the outcome.

So for the separately compiled function it does not matter technically,
except possibly for performance, whether it uses clear, concise, safe
and guaranteed max efficient `=`, or verbose and unsafe `memcpy`.

That means that regarding this matter the common interpretation of the
standard is not technical but instead specifies a formal UB that can't
happen unless one informs a really perverse compiler that it's there.
Even g++'s documentation of `-fstrict-aliasing` says "A character type
may alias any other type.". So g++ is /not/ that perverse a compiler,
and I believe you won't find any sufficiently perverse compiler.

So in practice one can and should not accommodate the common standard
interpretation; it's nonsense, built on the assumption of magic effects.

Let the committee come up with their COBOL-inspired C++26 formal
solution. Let others waste their time trying to write then formally
correct COBOL++. I.e. just ignore all that; say no to nonsense.

- Alf

Bonita Montero

Jun 14, 2023, 3:53:21 PM

On 14.06.2023 at 21:46, Alf P. Steinbach wrote:

> When that function is separately compiled in a different translation
> unit, how is the compiler to know when compiling calling code that the
> function uses `memcpy` internally,

It doesn't matter if my function is inlined or not.

> So generally (let's disregard global optimization with link
> time compilation) `memcpy` versus `=` can't affect the outcome.

That's not safe if you do aliasing.

Rest unread.

David Brown

Jun 14, 2023, 5:29:02 PM

On 14/06/2023 16:46, Bonita Montero wrote:
> On 14.06.2023 at 16:36, David Brown wrote:
>
>> On 14/06/2023 15:31, Bonita Montero wrote:
>
>>> memcpy() is the only way to reliably alias something in C++ and
>>> therefore it's safe to assume that it is partially intrinsic.
>
>> That's not what "intrinsic" means. <rest unread>
>
> For me it's partially intrinsic.

Please don't use words like that in totally different ways from everyone
else.

>
>> It is not long since memcpy() on MSVC was always implemented as a call
>> to an external library function in a DLL, which was a huge overhead
>> for such a simple feature.
>
> Since it's the only way to have safe aliasing in C++ you can silently
> rely on that.
>

You said that before - you are still wrong.

>> And no, "memcpy" is /not/ the only way to access aliased data or
>> access data through pointers of non-compatible types. You can use any
>> character pointer, ...
>
> In C you can alias anything as a char array and vice versa,
> but not in C++.

<https://en.cppreference.com/w/cpp/language/reinterpret_cast#Type_aliasing>

<https://en.cppreference.com/w/cpp/types/byte>

Look it up in the standards if you prefer.

>
>> You did not check gcc.
>
> I checked it and this trick doesn't work with g++ / RISC-V
> _with_ -O2 but I hope this will change, so I check for __GNUC__.
>

Finally you have managed to check it, and found - as I told you at the
start - that it does not work for -O2. (Not for RISC-V, and not for any
other target that makes a distinction between aligned and unaligned
accesses.)

While I too hope this will change (that's why I registered it as a bug),
the sane thing to do is write the code properly using the correct gcc
extension for the purpose - __builtin_assume_aligned - as I suggested.

>> It is still not the appropriate type. ...
>
> For me it is because I use systems with flat memory. And in this case
> I could even use a char since I only check for the three lower bits to
> be zero.
>

No, it is still the wrong type regardless of the memory flatness.

It is really very simple. "size_t" is a type used for the size of
objects. "uintptr_t" is a type used for converting pointers to unsigned
integers. Use the correct type. You have /nothing/ to gain by using
the wrong type here other than a reputation for being stubborn in your
insistence on writing pointlessly odd code.
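To make the type question concrete, here is a minimal sketch of an alignment test that converts the pointer through `std::uintptr_t`. The name `is_aligned_for` is hypothetical, and using `alignof(T)` rather than the `sizeof(uint64_t)` suggested earlier in the thread is a choice made for this example. The pointer-to-integer mapping is implementation-defined, so the check assumes the usual flat mapping in which the low bits of the integer are the low bits of the address.

```cpp
#include <cstdint>

// Hypothetical helper (not from the thread): reports whether p meets
// the alignment requirement of T. std::uintptr_t is the integer type
// intended for holding converted pointer values.
template <typename T>
inline bool is_aligned_for(const void* p) noexcept
{
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}
```

A check like this is suitable for an assertion in a debug build; it is a run-time test, not an optimisation hint.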

David Brown

Jun 14, 2023, 5:33:22 PM

It is not uncommon for system calls to be handled by embedded assembly
instructions for software interrupts. You could also argue that the
system calls are native to the target but not to the C (or C++)
language, just like assembly instructions, and can justifiably be called
"intrinsics".

"memcpy", on the other hand, is no more "intrinsic" than any other
standard library function.

James Kuyper

Jun 14, 2023, 6:27:42 PM

On 6/14/23 15:46, Alf P. Steinbach wrote:
> On 2023-06-14 7:56 AM, Bonita Montero wrote:
>> On 14.06.2023 at 07:52, Alf P. Steinbach wrote:
>>
>>> It's copying an `uint64_t` that is known to be correctly aliased, to
>>> an `uint64_t`; that's nonsense.
>>
>> The reference initially supplied by the caller is cast from a char
>> array. memcpy() is the only legal way in C++ to alias that content.

...

> So for the separately compiled function it does not matter technically,
> except possibly for performance, whether it uses clear, concise, safe
> and guaranteed max efficient `=`, or verbose and unsafe `memcpy`.
>
> That means that regarding this matter the common interpretation of the
> standard is not technical but instead specifies a formal UB that can't
> happen unless one informs a really perverse compiler that it's there.

All you need is a platform where misaligned pointers do not merely cause
the code to be inefficient, but to actually malfunction. On such a
platform, if p_bytes is not correctly aligned to store a uint64_t, then
the code will malfunction in the reinterpret_cast<>.

"When a prvalue v of object pointer type is converted to the object
pointer type “pointer to cv T”, the result is
static_cast<cv T*>(static_cast<cv void*>(v))." (7.6.1.9p7)

"A prvalue of type “pointer to cv1 void” can be converted to a prvalue
of type “pointer to cv2 T”, where T is an object type and cv2 is the
same cv-qualification as, or greater cv-qualification than, cv1. If the
original pointer value represents the address A of a byte in memory and
A does not satisfy the alignment requirement of T, then the resulting
pointer value is unspecified." (7.6.1.8p13).

In particular, because the pointer value is unspecified, it's not
guaranteed to be dereferenceable. A common possibility is that the
result of the conversion will be a pointer to the nearest preceding or
following correctly aligned location.

If p_bytes is correctly aligned, simple assignment will work just as
well as memcpy().

> Even g++'s documentation of `-fstrict-aliasing` says "A character type

> may alias any other type.". ...

True, but that's not what this reinterpret_cast does; it aliases a
character type with uint64_t, and that's a problem if p_bytes is not
correctly aligned to hold a uint64_t. The anti-aliasing rules are not
symmetric.
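One way to stay on the permitted side of that asymmetry, without memcpy and without any alignment requirement, is to read the buffer one `unsigned char` at a time and assemble the value with shifts. The sketch below is an illustration rather than anything proposed in the thread; `load_le64` is a hypothetical name, and it assumes the buffer stores the value in little-endian order with 8-bit bytes.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the thread): decodes 8 bytes starting
// at p as a little-endian 64-bit value. Only unsigned char lvalues are
// read, so neither the aliasing rules nor alignment come into play.
// Assumption: each byte holds 8 significant bits.
inline std::uint64_t load_le64(const unsigned char* p) noexcept
{
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < 8; ++i)
        value |= static_cast<std::uint64_t>(p[i]) << (8 * i);
    return value;
}
```

Unlike a memcpy-based read, the result here does not depend on the host's byte order, which may or may not be what a given bit-stream format needs.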

Scott Lurndal

Jun 14, 2023, 6:48:58 PM

David Brown <david...@hesbynett.no> writes:
>On 14/06/2023 17:14, Scott Lurndal wrote:
>> David Brown <david...@hesbynett.no> writes:
>>> On 14/06/2023 15:31, Bonita Montero wrote:
>>>> On 14.06.2023 at 13:06, David Brown wrote:
>>>>
>>>>> Note that this applies only to specific implementations of memcpy(). ...
>>>>
>>>> memcpy() is the only way to reliably alias something in C++ and
>>>> therefore it's safe to assume that it is partially intrinsic.
>>>
>>> That's not what "intrinsic" means. An "intrinsic" is basically a
>>> wrapper function around a cpu-specific feature or instruction.
>>
>> FWIW, the HP-3000 MPE operating system interfaces (system calls) were called
>> "intrinsics".
>
>It is not uncommon for system calls to be handled by embedded assembly
>instructions for software interrupts.

In this case, it was more generic than that, since:

1) The HP-3000 didn't have an assembler
2) Intrinsic calls were just function calls into a different segment
(a stack architecture similar to the Burroughs systems).

Chris M. Thomasson

Jun 14, 2023, 7:02:02 PM

On 6/14/2023 3:27 PM, James Kuyper wrote:
> On 6/14/23 15:46, Alf P. Steinbach wrote:
>> On 2023-06-14 7:56 AM, Bonita Montero wrote:
>>> On 14.06.2023 at 07:52, Alf P. Steinbach wrote:
>>>
>>>> It's copying an `uint64_t` that is known to be correctly aliased, to
>>>> an `uint64_t`; that's nonsense.
>>>
>>> The reference initially supplied by the caller is cast from a char
>>> array. memcpy() is the only legal way in C++ to alias that content.
> ...
>> So for the separately compiled function it does not matter technically,
>> except possibly for performance, whether it uses clear, concise, safe
>> and guaranteed max efficient `=`, or verbose and unsafe `memcpy`.
>>
>> That means that regarding this matter the common interpretation of the
>> standard is not technical but instead specifies a formal UB that can't
>> happen unless one informs a really perverse compiler that it's there.
>
> All you need is a platform where misaligned pointers do not merely cause
> the code to be inefficient, but to actually malfunction. On such a
> platform, if p_bytes is not correctly aligned to store a uint64_t, then
> the code will malfunction in the reinterpret_cast<>.

[...]

What about some code that crosses a L2 cache line boundary and causes
the damn processor to assert a bus lock... Argh!

David Brown

Jun 15, 2023, 4:02:30 AM

On 15/06/2023 00:27, James Kuyper wrote:
> On 6/14/23 15:46, Alf P. Steinbach wrote:
>> On 2023-06-14 7:56 AM, Bonita Montero wrote:
>>> On 14.06.2023 at 07:52, Alf P. Steinbach wrote:
>>>
>>>> It's copying an `uint64_t` that is known to be correctly aliased, to
>>>> an `uint64_t`; that's nonsense.
>>>
>>> The reference initially supplied by the caller is cast from a char
>>> array. memcpy() is the only legal way in C++ to alias that content.
> ...
>> So for the separately compiled function it does not matter technically,
>> except possibly for performance, whether it uses clear, concise, safe
>> and guaranteed max efficient `=`, or verbose and unsafe `memcpy`.
>>
>> That means that regarding this matter the common interpretation of the
>> standard is not technical but instead specifies a formal UB that can't
>> happen unless one informs a really perverse compiler that it's there.
>
> All you need is a platform where misaligned pointers do not merely cause
> the code to be inefficient, but to actually malfunction. On such a
> platform, if p_bytes is not correctly aligned to store a uint64_t, then
> the code will malfunction in the reinterpret_cast<>.
>

And in case anyone has doubts, such platforms do exist. I have used
embedded microcontrollers in which an unaligned access might mean access
to the address rounded down (i.e., a 16-bit store to 0x2001 actually
stores 16 bits at 0x2000). The stored data may or may not be
byte-swapped - the behaviour is undefined, and I don't think it was
consistent between different generations of the processor.

There are also big processors which will fault on unaligned accesses.
Even if there are OS services in place to simulate the access, the
process is so massively slower than normal accesses that it could be
considered a software malfunction for performance critical code.

David Brown

Jun 15, 2023, 4:06:29 AM

On 14/06/2023 21:46, Alf P. Steinbach wrote:
> On 2023-06-14 7:56 AM, Bonita Montero wrote:
>> On 14.06.2023 at 07:52, Alf P. Steinbach wrote:
>>
>>> It's copying an `uint64_t` that is known to be correctly aliased, to
>>> an `uint64_t`; that's nonsense.
>>
>> The reference initially supplied by the caller is cast from a char
>> array. memcpy() is the only legal way in C++ to alias that content.
>
> When that function is separately compiled in a different translation
> unit, how is the compiler to know when compiling calling code that the
> function uses `memcpy` internally,
>
> and when compiling the function, how is the compiler to know that it's
> generally called with a `*reinterpret_cast<uint64_t*>( p_bytes )` as
> argument?
>
> Answer: it doesn't know, in either case.
>
> So generally (let's disregard global optimization with link time
> compilation) `memcpy` versus `=` can't affect the outcome.
>
> So for the separately compiled function it does not matter technically,
> except possibly for performance, whether it uses clear, concise, safe
> and guaranteed max efficient `=`, or verbose and unsafe `memcpy`.
>

Code that relies on limited optimisation or separate compilation for
correct behaviour, is an extremely bad idea - it is fragile and a hidden
bug waiting to explode in the future. Shortcuts now will cost dearly
later on. Take pride in your work, and code responsibly - do what you
can to make your code /correct/, rather than relying on weak tools!

Chris M. Thomasson

Jun 15, 2023, 4:50:44 AM

CMPXCHG on an address that points to data that straddles an L2 cache line
should do it... I cannot remember for sure whether the LOCK prefix _has_ to
be present here... I cannot recall that detail right now, damn!
Fwiw, XCHG should assert a bus lock as well wrt the "bad" location, and
LOCK is automatically implied in XCHG to begin with.

Chris M. Thomasson

Jun 15, 2023, 4:55:32 AM

I also cannot remember if it only asserts the bus lock when there is
"contention" on a LOCK'ed atomic RMW using an address that goes to data
that straddles an L2 cache line on Intel. It's been a while since I have
worked with raw x86 asm. I am sure some of my work is up on the Wayback
Machine. Let me check...

I found some of my old asm work!

This is MASM:

http://web.archive.org/web/20060214112539/http://appcore.home.comcast.net/appcore/src/cpu/i686/ac_i686_masm_asm.html

This should be GAS:

http://web.archive.org/web/20060214112345/http://appcore.home.comcast.net/appcore/src/cpu/i686/ac_i686_gcc_asm.html

;^)

Bonita Montero

Jun 15, 2023, 6:06:20 AM

On 14.06.2023 at 23:28, David Brown wrote:

>> In C you can alias anything as a char array and vice versa,
>> but not in C++.

> <https://en.cppreference.com/w/cpp/language/reinterpret_cast#Type_aliasing>
> <https://en.cppreference.com/w/cpp/types/byte>

There never will be an upcoming CPU where CHAR_BIT is not eight.
Even POSIX requires that.

> Finally you have managed to check it, ...

I checked it long before your previous post.

> No, it is still the wrong type regardless of the memory flatness.

You're paranoid.

David Brown

Jun 15, 2023, 9:05:54 AM

On 15/06/2023 12:06, Bonita Montero wrote:
> On 14.06.2023 at 23:28, David Brown wrote:
>
>>> In C you can alias anything as a char array and vice versa,
>>> but not in C++.
>
>> <https://en.cppreference.com/w/cpp/language/reinterpret_cast#Type_aliasing>
>> <https://en.cppreference.com/w/cpp/types/byte>
>
> There never will be an upcoming CPU where CHAR_BIT is not eight.
> Even Posix requires that.

To the nearest percent, 0% of all cpus shipped are used in POSIX systems.

CPUs are made all the time that don't have 8-bit char. Just because you
have a limited view, does not mean C, C++ or all other programmers do so.

Of course, none of that matters in the slightest here - nothing about
std::byte, type aliasing, or accessing via char types relies on char
being 8-bit.

>
>
> > Finally you have managed to check it, ...
>
> I checked it long before your previous post.
>

Either you are lying in an attempt to look less incompetent, or you are
incompetent, or you wrote a poorly considered "optimisation" that
doesn't work at all in a major use-case and were so proud of your
half-arsed solution that you hoped no one would notice.

>> No, it is still the wrong type regardless of the memory flatness.
>
> You're paranoid.

No, understanding the point of basic language types is not paranoia.

james...@alumni.caltech.edu

Jun 15, 2023, 9:28:41 AM

On Wednesday, June 14, 2023 at 11:26:01 AM UTC-4, Bonita Montero wrote:
...

> In C you can alias anything as a char array and a char array as
> anything so it would be safe to alias anything as anything with
> double-casting (I guess that's correctly supported by the compilers).

C's anti-aliasing rules are asymmetric. C distinguishes between the effective type of an object (which is the same as its declared type, if it has one) and the type of the lvalue used to access it. Aliasing an object with an effective type of uint64_t using an lvalue of character type is allowed by the following clause:

"An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
...
- a character type" (6.5p7)

Accessing an object with an effective type that is a character type (or an array thereof) using an lvalue with a type of uint64_t is not allowed by any of the cases listed in that paragraph unless they are members of the same union, (or if uint64_t is a character type, which is pretty unlikely, but permitted).

Bonita Montero

Jun 15, 2023, 9:36:04 AM

Am 15.06.2023 um 15:05 schrieb David Brown:
> On 15/06/2023 12:06, Bonita Montero wrote:
>> Am 14.06.2023 um 23:28 schrieb David Brown:
>>
>>>> In C you can alias anything as a char-array and vise versa,
>>>> but not in C++.
>>
>>> <https://en.cppreference.com/w/cpp/language/reinterpret_cast#Type_aliasing>
>>> <https://en.cppreference.com/w/cpp/types/byte>
>>
>> There never will be an upcoming CPU where CHAR_BIT is not eight.
>> Even Posix requires that.
>
> To the nearest percent, 0% of all cpus shipped are used in POSIX systems.
>
> CPUs are made all the time that don't have 8-bit char. Just because you
> have a limited view, does not mean C, C++ or all other programmers do so.

CPUs with CHAR_BIT != 8 are rare and there won't be any further
in the future.

> Of course, none of that matters in the slightest here - nothing about
> std::byte, type aliasing, or accessing via char types relies on char
> being 8-bit.
>
>>
>>
>> > Finally you have managed to check it, ...
>>
>> I've checked it long before the posting of you before.
>>
>

> Either you are lying in an attempt to look less incompetent, ...

I checked it, you didn't.

> incompetent, or you wrote a poorly considered "optimisation" that
> doesn't work at all in a major use-case and were so proud of your
> half-arsed solution that you hoped no one would notice.
>
>>> No, it is still the wrong type regardless of the memory flatness.
>>
>> You're paranoid.
>
> No, understanding the point of basic language types is not paranoia.

I would never run into problems with that.
Your opinion is compulsive.

Bonita Montero

Jun 15, 2023, 9:43:04 AM

In C you can alias anything as a char-array and a char-array as
anything. And you can alias signed, defaulted (char) or unsigned
entities as their counterparts. That's all.
Since aliasing with a union is very common, all compilers support
that, although there's no guarantee from the standard for that.

james...@alumni.caltech.edu

Jun 15, 2023, 10:38:50 AM

Citation, please?
6.5p7 is a complete and exhaustive list of the situations where an object can be accessed with defined behavior using an lvalue with a type that is different from the effective type of that object. It starts with a "shall", so violations have undefined behavior. Please identify which item on that list covers the case where the lvalue is uint64_t and the effective type is an array of char.

> Since aliasing with a union is very common all compiler support
> that, although there's no guarantee from the standard for that.

There's a footnote in the C standard which says "If the member used to read the contents of a union object is not the same as the member last used to store a value in the object the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called type punning)."

Footnotes are non-normative. They are not supposed to contain the sole specification of some aspect of the language. They're only supposed to explain something that could be derived from the normative text of the standard. I don't believe that is the case for this footnote. I've discussed this issue with a couple of people, one of them a member of the committee, who disagreed. Neither of them was able to present an argument laying out that derivation, so I believe that you are technically correct.

However, what that footnote describes is the intent of the committee, and the expectation that almost all users of C have had since unions were first introduced, and the way essentially all real world implementors have implemented them. Therefore, I would recommend treating that footnote as if it were normative, until such time as the standard is corrected to say the same thing in normative text.

David Brown

Jun 15, 2023, 2:35:47 PM

On 15/06/2023 15:35, Bonita Montero wrote:
> Am 15.06.2023 um 15:05 schrieb David Brown:
>> On 15/06/2023 12:06, Bonita Montero wrote:
>>> Am 14.06.2023 um 23:28 schrieb David Brown:
>>>
>>>>> In C you can alias anything as a char-array and vise versa,
>>>>> but not in C++.
>>>
>>>> <https://en.cppreference.com/w/cpp/language/reinterpret_cast#Type_aliasing>
>>>> <https://en.cppreference.com/w/cpp/types/byte>
>>>
>>> There never will be an upcoming CPU where CHAR_BIT is not eight.
>>> Even Posix requires that.
>>
>> To the nearest percent, 0% of all cpus shipped are used in POSIX systems.
>>
>> CPUs are made all the time that don't have 8-bit char. Just because
>> you have a limited view, does not mean C, C++ or all other programmers
>> do so.
>
> CPUs with CHAR_BIT != 8 are rare and there won't be any further
> in the future.

You do understand that simply repeating something does not make it true?
Processors with char greater than 8 bits are niche, but certainly not
rare in numbers of devices delivered. I suppose it's fair to assume
that /you/ will never be programming any.

>
>> Of course, none of that matters in the slightest here - nothing about
>> std::byte, type aliasing, or accessing via char types relies on char
>> being 8-bit.
>>
>>>
>>>
>>> > Finally you have managed to check it, ...
>>>
>>> I've checked it long before the posting of you before.
>>>
>>
>> Either you are lying in an attempt to look less incompetent, ...
>
> I checked it, you didn't.
>
>> incompetent, or you wrote a poorly considered "optimisation" that
>> doesn't work at all in a major use-case and were so proud of your
>> half-arsed solution that you hoped no one would notice.
>>
>>>> No, it is still the wrong type regardless of the memory flatness.
>>>
>>> You're paranoid.
>>
>> No, understanding the point of basic language types is not paranoia.
>
> I would never run into problems with that.
> Your opinion is compulsive.
>

"Compulsive" is not nearly as inaccurate as "paranoid" - I do prefer to
try to be accurate in my coding. If there is a type that fits a
particular usage, I'll use that rather than one that just happens to work.

Bonita Montero

Jun 15, 2023, 3:08:09 PM

Am 15.06.2023 um 20:35 schrieb David Brown:

> Processors with char greater than 8 bits are niche, but certainly not
> rare in numbers of devices delivered. I suppose it's fair to assume
> that /you/ will never be programming any.

Almost no programmer is programming for such CPUs. And you think
everything must be portable to such CPUs if you complain about that
in sources whose purpose you don't understand. There's for sure no
C++ compiler which supports C++20 for systems that have CHAR_BIT
different from eight.
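
Whichever side of that argument one takes, code that deliberately relies on 8-bit bytes can say so. A minimal sketch, assuming nothing beyond the standard headers (the helper `bytes_for_bits` is hypothetical):

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>

// States the 8-bit-byte assumption explicitly. On a platform where it does
// not hold, the build stops here instead of misbehaving at run time.
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");

// With 8-bit bytes and no padding bits, a uint64_t spans exactly eight bytes.
static_assert(sizeof(std::uint64_t) == 8, "uint64_t should span eight bytes");

// Hypothetical helper: number of whole bytes needed to hold `bits` bits.
constexpr std::size_t bytes_for_bits(std::size_t bits)
{
    return (bits + CHAR_BIT - 1) / CHAR_BIT;
}
```

The assertion costs nothing on mainstream targets and documents the limitation for anyone porting the code later.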

> "Compulsive" is not nearly as inaccurate as "paranoid" - I do prefer

> to try to be accurate in my coding. ...

... if necessary. If it is not necessary it's just compulsiveness.

Alf P. Steinbach

Jun 16, 2023, 2:48:24 AM

On 2023-06-15 12:27 AM, James Kuyper wrote:
> On 6/14/23 15:46, Alf P. Steinbach wrote:
>> On 2023-06-14 7:56 AM, Bonita Montero wrote:
>>> Am 14.06.2023 um 07:52 schrieb Alf P. Steinbach:
>>>
>>>> It's copying an `uint64_t` that is known to be correctly aliased, to
>>>> an `uint64_t`; that's nonsense.
>>>
>>> The reference intitially supplied by the caller is casted from a char
>>> -array. memcpy() is the only legal way in C++ to alias that content.
> ...
>> So for the separately compiled function it does not matter technically,
>> except possibly for performance, whether it uses clear, concise, safe
>> and guaranteed max efficient `=`, or verbose and unsafe `memcpy`.
>>
>> That means that regarding this matter the common interpretation of the
>> standard is not technical but instead specifies a formal UB that can't
>> happen unless one informs a really perverse compiler that it's there.
>
> All you need is a platform where misaligned pointers do not merely cause
> the code to be inefficient, but to actually malfunction. On such a
> platform, if p_bytes is not correctly aligned to store a uint64_t, then
> the code will malfunction in the reinterpret_cast<>.

Yes, but irrelevant for the case discussed, because the values are
guaranteed correctly aligned.

[snip]

> If p_bytes is correctly aligned, simple assignment will work just as
> well as memcpy().

Yes.

>> Even g++'s documentation of `-fstrict-aliasing` says "A character type
>> may alias any other type.". ...
>
> True, but that's not what this reinterpret_cast does;

It is what this `reinterpret_cast` does.

> it aliases a
> character type with uint64_t, and that a problem if pbytes is not
> correctly aligned to hold a uint64_t.

It is correctly aligned.

> The anti-aliasing rules are not symmetric.

That's a tangential issue, and best discussed in a new thread.

- Alf

Alf P. Steinbach

Jun 16, 2023, 2:59:59 AM

Repeatedly copying lots of data instead of reinterpreting is
inefficient and awkward and a source of bugs in itself.

`memcpy` is not the safest tool around. It's rather the opposite,
something to avoid /if possible/. `memcpy` is the "weak tool" here.

The problem is not the standard, which simply lacks wording for what was
obviously intended, like the wording in the g++ documentation, and which
at least in C++03 was a bit inconsistent re this issue (e.g. the
separate point about address of first item in a POD struct was not part
of the allegedly exhaustive strict aliasing list), but the problem is,
clearly, the C++ standardization committee and its interpretation.

Once you realize that it should not be hard to code responsibly.

- Alf

David Brown

Jun 16, 2023, 4:56:05 AM

With a half-decent compiler, "copying" like this disappears in the
optimisation. But there's no disagreement that it is awkward and could
be a source of bugs (Bonita's half-way attempt at optimisation shows that).

Reinterpreting data as a different type from its real type is, in
general, undefined behaviour. Reinterpreting, such as by pointer casts
or reinterpret_cast<>, does not allow you to break the language's type
aliasing rules.

In C++20, there is an alternative to some related uses of memcpy() -
std::bit_cast<>. This is safer, because it hides messy details like the
size of the copy and fails to compile for inappropriate object types,
but in practice it is just a nice wrapper for a memcpy() with a little
"magic" to make it work as constexpr.

(If Bonita had been targeting C++20, presumably the code would have used
std::assume_aligned instead of a gcc/clang extension.)

>
> `memcpy` is not the safest tool around. It's rather the opposite,
> something to avoid /if possible/. `memcpy` is the "weak tool" here.
>

I don't know how the function here is supposed to be used. We know that
the pointer (it is syntactically a reference, but effectively a pointer)
is properly aligned for a uint64_t, but we don't know if it actually
points to a uint64_t. Perhaps it points to a double, or some other
64-bit type.

You are absolutely right that memcpy() is not a "safe" tool - used like
this, it is a way to bypass the normal safe typing of the language. But
it is a way to get around the rules, rather than break the rules. So if
you need to access the representation of one type as though it were a
different type, then memcpy (or equivalent use of std::byte or char
pointers) is the correct way to do it. So memcpy is not a /weak/ tool
here - it is the /best/ tool here. But you only use it if you need it.

The alternative you are suggesting - separately compiled functions - is
far less safe. It is a hidden bomb, waiting to cause trouble when later
compilers or flags combine code differently, or when someone moves the
function to a different part of the code. memcpy() is part of the
language, and is fully defined and documented behaviour, while separate
compilation is going outside the language and defined behaviour. It is
the most fragile solution, and /never/ one to be recommended.

> The problem is not the standard, which simply lacks wording for what was
> obviously intended, like the wording in the g++ documentation, and which
> at least in C++03 was a bit inconsistent re this issue (e.g. the
> separate point about address of first item in a POD struct was not part
> of the allegedly exhaustive strict aliasing list), but the problem is,
> clearly, the C++ standardization committee and its interpretation.
>

I don't think anyone would accuse the C++ standards of being too clear,
but the behaviour of memcpy is well documented and well defined. Bonita
is thoroughly confused about how and when access by character pointers
is defined behaviour in C and C++, but knows that memcpy works correctly
here. It is usually a better choice to use memcpy() than roll-your-own
character pointer access code anyway.

> Once you realize that it should not be hard to code responsibly.
>

Yes - you do so by writing code that has defined behaviour that does
what you want. You don't do it by relying on luck of compilation details.

Alf P. Steinbach

Jun 16, 2023, 6:34:44 AM

On 2023-06-16 10:55 AM, David Brown wrote:
> On 16/06/2023 08:59, Alf P. Steinbach wrote:
>>
>> `memcpy` is not the safest tool around. It's rather the opposite,
>> something to avoid /if possible/. `memcpy` is the "weak tool" here.
>>
>
> I don't know how the function here is supposed to be used. We know that
> the pointer (it is syntactically a reference, but effectively a pointer)
> is properly aligned for a uint64_t, but we don't know if it actually
> points to a uint64_t. Perhaps it points to a double, or some other
> 64-bit type.

In the case you sketch where the bits do not represent a valid
`uint64_t`, the `memcpy` does not make the behavior well-defined: that's
a (dangerous) misconception.

I'm not sure if you're addressing only the formal here.

However, if you intended this to also apply to the in-practice, then the
burden of proof is on you that some computer exists where `uint64_t` has
bits that do not participate in the value representation, so that they
/can/ be invalid: as far as I know there is no such computer.

- Alf

Bonita Montero

Jun 16, 2023, 7:26:45 AM

Am 16.06.2023 um 08:59 schrieb Alf P. Steinbach:

> `memcpy` is not the safest tool around. It's rather the opposite,
> something to avoid /if possible/. `memcpy` is the "weak tool" here.

I'm aliasing, so memcpy() is the only tool here.

David Brown

Jun 16, 2023, 8:56:18 AM

On 16/06/2023 12:34, Alf P. Steinbach wrote:
> On 2023-06-16 10:55 AM, David Brown wrote:
>> On 16/06/2023 08:59, Alf P. Steinbach wrote:
>>>
>>> `memcpy` is not the safest tool around. It's rather the opposite,
>>> something to avoid /if possible/. `memcpy` is the "weak tool" here.
>>>
>>
>> I don't know how the function here is supposed to be used. We know
>> that the pointer (it is syntactically a reference, but effectively a
>> pointer) is properly aligned for a uint64_t, but we don't know if it
>> actually points to a uint64_t. Perhaps it points to a double, or some
>> other 64-bit type.
>
> In the case you sketch where the bits do not represent a valid
> `uint64_t`, the `memcpy` does not make the behavior well-defined: that's
> a (dangerous) misconception.

A "uint64_t" has a guaranteed fully-defined format and no padding bits.
All possible bit patterns for the type have well-defined behaviour.
(Hypothetically, that would not be true for "unsigned long long", which
could contain padding bits and could have trap representations.)

In case I am missing something, please tell me where you see any
possible dangerous or not fully defined behaviour even for the more
general case :

#include <string.h>
#include <stdint.h>

uint64_t read64bits(const void * p) {
    uint64_t x;
    memcpy((void*) &x, p, sizeof(uint64_t));
    return x;
}

We can assume that "p" points to data of some type of at least 64 bits
in size. I would like to hear of any potential issues in C or C++
(hence the cross-language code).
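
As an illustration of that assumption, here is a short C++ sketch. The helper `read_at_offset` is hypothetical; the point it tries to show is that the source only needs to be readable, not aligned for uint64_t, because the copy works on bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// C++ rendering of read64bits() from the post above.
inline std::uint64_t read64bits(const void *p)
{
    std::uint64_t x;
    std::memcpy(&x, p, sizeof(std::uint64_t));
    return x;
}

// Hypothetical helper: read 64 bits starting `offset` bytes into a raw buffer.
// The caller must ensure that offset + sizeof(uint64_t) bytes are readable.
// An odd offset is fine, since no uint64_t lvalue ever refers to the buffer.
inline std::uint64_t read_at_offset(const unsigned char *buffer, std::size_t offset)
{
    return read64bits(buffer + offset);
}
```

What value comes back for a given byte sequence still depends on the implementation's byte order; only the absence of undefined behaviour is being illustrated.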

>
> I'm not sure if you're addressing only the formal here.

No. It is a general principle. Some people /do/ believe that "separate
compilation" creates magical barriers that limit a compiler's ability to
see the relationships between code sections, and therefore its ability
to "optimise using assumptions about defined and undefined behaviours",
and that this means some kinds of undefined behaviours become defined by
moving code to a different file or disabling optimisation. They are
wrong to believe this. They may manage to write code that works when
they test it, but it will be fragile - the code still has exactly the
same undefined behaviours, and these may manifest as bugs in the future.

>
> However, if you intended this to also apply to the in-practice, then the
> burden of proof is on you that some computer exists where `uint64_t` has
> bits that do not participate in the value representation, so that they
> /can/ be invalid: as far as I know there is no such computer.
>

No, not at all. It is only /you/ that has suggested, by claiming the
use of memcpy is not fully defined, that uint64_t may hypothetically
contain padding bits. (See earlier in my reply.) I know it can't, so
that is not the issue.

My argument is that the following code is /wrong/, even if the functions
are compiled in separate sources :

uint64_t read64(const uint64_t * p) {
    return *p;
}

uint64_t reinterpret(double x) {
    return read64((const uint64_t *) &x);
}

If these are placed in separate files and compiled separately, with
today's compilers, with no link-time or whole-program optimisation, then
the code will work as the programmer expected and get a bit
representation of the double (which we assume is 64-bit). But working
in a test does not make it /correct/ code, and certainly not /good/ code.
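
For contrast, a defined-behaviour counterpart might look like the sketch below. The name `reinterpret_by_copy` is hypothetical, and the code assumes, as the post does, that double is 64 bits wide.

```cpp
#include <cstdint>
#include <cstring>

// The bytes of the double are copied into a separate uint64_t object, so no
// uint64_t lvalue is ever used to access the double itself. This holds no
// matter how the translation units are split up or optimised.
inline std::uint64_t reinterpret_by_copy(double x)
{
    static_assert(sizeof(double) == sizeof(std::uint64_t),
                  "sketch assumes a 64-bit double");
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    return bits;
}
```

Unlike the cast version, this does not depend on the compiler being unable to see both functions at once.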

Ben Bacarisse

Jun 16, 2023, 10:13:04 AM

David Brown <david...@hesbynett.no> writes:

> On 16/06/2023 12:34, Alf P. Steinbach wrote:
>> On 2023-06-16 10:55 AM, David Brown wrote:
>>> On 16/06/2023 08:59, Alf P. Steinbach wrote:
>>>>
>>>> `memcpy` is not the safest tool around. It's rather the opposite,
>>>> something to avoid /if possible/. `memcpy` is the "weak tool" here.
>>>>
>>>
>>> I don't know how the function here is supposed to be used. We know that
>>> the pointer (it is syntactically a reference, but effectively a pointer)
>>> is properly aligned for a uint64_t, but we don't know if it actually
>>> points to a uint64_t. Perhaps it points to a double, or some other
>>> 64-bit type.
>> In the case you sketch where the bits do not represent a valid
>> `uint64_t`, the `memcpy` does not make the behavior well-defined: that's
>> a (dangerous) misconception.
>
> A "uint64_t" has a guaranteed fully-defined format and no padding bits. All
> possible bit patterns for the type have well-defined
> behaviour.

"Fully-defined format" says, to my mind, more than you wanted to say.
In particular the significance of the bits is not defined.

> (Hypothetically, that would not be true for "unsigned long
> long", which could contain padding bits and have could have trap
> representations.)
>
> In case I am missing something, please tell me where you see any possible
> dangerous or not fully defined behaviour even for the more general case :
>
> #include <string.h>
> #include <stdint.h>
>
> uint64_t read64bits(const void * p) {
> uint64_t x;
> memcpy((void*) &x, p, sizeof(uint64_t));
> return x;
> }

What's not "fully defined behaviour" is the return value's relationship
to the bytes pointed to by p. I think you are using "fully defined
behaviour" to mean "not undefined behaviour", but since the former is
not a technical term in the language standards, a reader might take more
from it than you intended.

Yes, this is something of a nit-pick, I know.

--
Ben.

Alf P. Steinbach

Jun 16, 2023, 11:26:00 AM

On 2023-06-16 2:55 PM, David Brown wrote:
> On 16/06/2023 12:34, Alf P. Steinbach wrote:
>> On 2023-06-16 10:55 AM, David Brown wrote:
>>> On 16/06/2023 08:59, Alf P. Steinbach wrote:
>>>>
>>>> `memcpy` is not the safest tool around. It's rather the opposite,
>>>> something to avoid /if possible/. `memcpy` is the "weak tool" here.
>>>>
>>> I don't know how the function here is supposed to be used. We know
>>> that the pointer (it is syntactically a reference, but effectively a
>>> pointer) is properly aligned for a uint64_t, but we don't know if it
>>> actually points to a uint64_t. Perhaps it points to a double, or
>>> some other 64-bit type.
>>
>> In the case you sketch where the bits do not represent a valid
>> `uint64_t`, the `memcpy` does not make the behavior well-defined:
>> that's a (dangerous) misconception.
>
> A "uint64_t" has a guaranteed fully-defined format and no padding bits.

Since you believe that, your comment about "Perhaps points to a double"
is meaningless nonsense.

You're arguing against yourself, with only three lines between your two
comments that are shooting dum-dum bullets at each other.

Make up your mind, please.

> All possible bit patterns for the type have well-defined behaviour.
> (Hypothetically, that would not be true for "unsigned long long", which
> could contain padding bits and have could have trap representations.)

We could discuss this assertion, e.g. I could helpfully mention that in
C++ "these requirements do not hold for other types [than character
types]", but better that you waste time attempting to PROVE IT.

Chapter and verse, please.

Not that it matters for what I've written, but it matters for the silly
argument that you offered, quoted above, and that you now argue against,
plus, there is the thing about being Wrong on the internet, not to
mention Doubly Wrong: just on principle one should not let that pass.

> In case I am missing something, please tell me where you see any
> possible dangerous or not fully defined behaviour even for the more
> general case :
>
> #include <string.h>
> #include <stdint.h>
>
> uint64_t read64bits(const void * p) {
>     uint64_t x;
>     memcpy((void*) &x, p, sizeof(uint64_t));
>     return x;
> }

Now you're arguing against yourself again.

That means that whatever I respond, I can expect a random direction answer.

Anyway:

* The C style cast there is both unnecessary and dangerous, because it
can cast away const, and is difficult to grep, so it's ungood code.
* You're wrong about formally no padding bits for C++, so in principle
that function can produce an invalid `uint64` with trap representation;
that's UB -- except that that's in principle, not in practice.
* If the pointer `p` is invalid, or is a nullpointer, or doesn't point
to at least sizeof(uint64_t) contiguous bytes of readable memory, then
that's UB, and it's UB both in principle and in practice.
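
The first point can be checked at compile time. A small C++17 sketch, with made-up constant names, showing that the destination pointer converts to `void*` without any cast, and that only a cast would drop `const`:

```cpp
#include <cstdint>
#include <type_traits>

// A pointer to a non-const uint64_t converts to void* implicitly, so the
// destination argument of memcpy needs no cast.
constexpr bool dest_needs_no_cast =
    std::is_convertible_v<std::uint64_t *, void *>;

// A pointer to const does not convert to void* implicitly. A C-style cast
// would force it through silently; leaving the cast out makes the compiler
// reject the mistake.
constexpr bool const_is_protected =
    !std::is_convertible_v<const std::uint64_t *, void *>;

static_assert(dest_needs_no_cast, "uint64_t* should convert to void*");
static_assert(const_is_protected, "const uint64_t* must not convert to void*");
```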

If you had been a certain other old timer in this group, then it would
also be relevant that there's possible UB due to stack overflow.

Which by his (lack of) logic leads to the conclusion that all C++
programs have UB.

> We can assume that "p" points to data of some type of at least 64 bits
> in size. I would like to hear of any potential issues in C or C++
> (hence the cross-language code).

Oh, cross language, sorry.

For the case of C /I believe/ that there's a formal guarantee of no
padding bits, so then only the last point above matters wrt. UB.

>> I'm not sure if you're addressing only the formal here.
>
> No. It is a general principle. Some people /do/ believe that "separate
> compilation" creates magical barriers that limit a compiler's ability to
> see the relationships between code sections, and therefore its ability
> to "optimise using assumptions about defined and undefined behaviours",
> and that this means some kinds of undefined behaviours become defined by
> moving code to a different file or disabling optimisation. They are
> wrong to believe this. They may manage to write code that works when
> they test it, but it will be fragile - the code still has exactly the
> same undefined behaviours, and these may manifest as bugs in the future.
>
>>
>> However, if you intended this to also apply to the in-practice, then
>> the burden of proof is on you that some computer exists where
>> `uint64_t` has bits that do not participate in the value
>> representation, so that they /can/ be invalid: as far as I know there
>> is no such computer.
>>
>
> No, not at all. It is only /you/ that has suggested, by claiming the
> use of memcpy is not fully defined, that uint64_t may hypothetically
> contain padding bits. (See earlier in my reply.) I know it can't, so
> that is not the issue.

Now you're claiming that your argument was my argument. Jeez.

[snip]

- Alf

David Brown

Jun 16, 2023, 12:03:49 PM

On 16/06/2023 16:12, Ben Bacarisse wrote:
> David Brown <david...@hesbynett.no> writes:
>
>> On 16/06/2023 12:34, Alf P. Steinbach wrote:
>>> On 2023-06-16 10:55 AM, David Brown wrote:
>>>> On 16/06/2023 08:59, Alf P. Steinbach wrote:
>>>>>
>>>>> `memcpy` is not the safest tool around. It's rather the opposite,
>>>>> something to avoid /if possible/. `memcpy` is the "weak tool" here.
>>>>>
>>>>
>>>> I don't know how the function here is supposed to be used. We know that
>>>> the pointer (it is syntactically a reference, but effectively a pointer)
>>>> is properly aligned for a uint64_t, but we don't know if it actually
>>>> points to a uint64_t. Perhaps it points to a double, or some other
>>>> 64-bit type.
>>> In the case you sketch where the bits do not represent a valid
>>> `uint64_t`, the `memcpy` does not make the behavior well-defined: that's
>>> a (dangerous) misconception.
>>
>> A "uint64_t" has a guaranteed fully-defined format and no padding bits. All
>> possible bit patterns for the type have well-defined
>> behaviour.
>
> "Fully-defined format" says, to my mind, more that you wanted to say.
> In particular the significance of the bits is not defined.

The order of the bits is implementation-defined, and the bits are
required to represent different powers of two from 0 to 63. There can
be no padding bits.

I was using "fully defined" to mean "defined by the standard or
implementation defined" - i.e., something documented that you could rely
upon. (I should have made that clear.)

It is certainly allowed for an implementation to have different
endianness for different types - so even if "double" uses the IEEE
formats, storing a value as a double and reading it as a uint64_t may
have different results on different platforms. I suspect such
inconsistently ordered implementations would be quite rare, however -
usually the bit ordering is either clearly little-endian or clearly
big-endian.
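
The implementation-defined part is easy to observe. The following sketch (the enum and function names are invented for illustration, and it assumes 8-bit bytes) copies a known uint64_t value into a byte array and classifies the layout it finds:

```cpp
#include <climits>
#include <cstdint>
#include <cstring>

enum class ByteOrder { little, big, other };

// Looks at how a known uint64_t value is laid out in memory on the
// implementation that compiles this code.
inline ByteOrder uint64_byte_order()
{
    static_assert(CHAR_BIT == 8 && sizeof(std::uint64_t) == 8,
                  "sketch assumes 8-bit bytes");
    const std::uint64_t probe = 0x0102030405060708ULL;
    const unsigned char little[8] = {8, 7, 6, 5, 4, 3, 2, 1};
    const unsigned char big[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    unsigned char bytes[8];
    std::memcpy(bytes, &probe, sizeof bytes);
    if (std::memcmp(bytes, little, sizeof bytes) == 0) return ByteOrder::little;
    if (std::memcmp(bytes, big, sizeof bytes) == 0) return ByteOrder::big;
    return ByteOrder::other;  // some mixed layout
}
```

The same probing could be applied to a double with a known bit pattern to see whether both types share one byte order on a given platform.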

>
>> (Hypothetically, that would not be true for "unsigned long
>> long", which could contain padding bits and have could have trap
>> representations.)
>>
>> In case I am missing something, please tell me where you see any possible
>> dangerous or not fully defined behaviour even for the more general case :
>>
>> #include <string.h>
>> #include <stdint.h>
>>
>> uint64_t read64bits(const void * p) {
>> uint64_t x;
>> memcpy((void*) &x, p, sizeof(uint64_t));
>> return x;
>> }
>
> What's not "fully defined behaviour" is the return value's relationship
> to the bytes pointed to by p.
> I think you are using "fully defined
> behaviour" to mean "not undefined behaviour", but since the former is
> not a technical term in the language standards, a reader might take more
> from it than you intended.
>
> Yes, this is something of a nit-pick, I know.
>

I am interested in nit-picks, and glad to hear of any you find.

If I had written "standards-defined and/or implementation defined
behaviour" instead of "fully defined behaviour", would that be sufficient?

David Brown

Jun 16, 2023, 12:33:14 PM

On 16/06/2023 17:25, Alf P. Steinbach wrote:
> On 2023-06-16 2:55 PM, David Brown wrote:
>> On 16/06/2023 12:34, Alf P. Steinbach wrote:
>>> On 2023-06-16 10:55 AM, David Brown wrote:
>>>> On 16/06/2023 08:59, Alf P. Steinbach wrote:
>>>>>
>>>>> `memcpy` is not the safest tool around. It's rather the opposite,
>>>>> something to avoid /if possible/. `memcpy` is the "weak tool" here.
>>>>>
>>>> I don't know how the function here is supposed to be used. We know
>>>> that the pointer (it is syntactically a reference, but effectively a
>>>> pointer) is properly aligned for a uint64_t, but we don't know if it
>>>> actually points to a uint64_t. Perhaps it points to a double, or
>>>> some other 64-bit type.
>>>
>>> In the case you sketch where the bits do not represent a valid
>>> `uint64_t`, the `memcpy` does not make the behavior well-defined:
>>> that's a (dangerous) misconception.
>>
>> A "uint64_t" has a guaranteed fully-defined format and no padding bits.
>
> Since you believe that,

Ref. C standards 7.20.1.1p2, 6.2.6.1p2, 6.2.6.2p1.

I believe the C++ standards inherit this from C, and the C standards are
easier to reference (IMHO) because the numbering stays consistent
between versions.

> your comment about "Perhaps points to a double"
> is meaningless nonsense.

I'm sorry, I have no idea what you mean by that.

>
> You're arguing against yourself, with only three lines between your two
> comments that are shooting dum-dum bullets at each other.
>
> Make up your mind, please.

Again, I have no idea what you mean.

Perhaps you don't remember what you wrote yourself?

>
>
>> All possible bit patterns for the type have well-defined behaviour.
>> (Hypothetically, that would not be true for "unsigned long long",
>> which could contain padding bits and have could have trap
>> representations.)
>
> We could discuss this assertion, e.g. I could helpfully mention that in
> C++ "these requirements do not hold for other types [than character
> types]", but better that you waste time attempting to PROVE IT.
>
> Chapter and verse, please.
>

The C++ standards change their numbering regularly. These numbers are
from the C++20 draft N4860 - I don't believe the contents change much
between versions.

6.8.1p3 .. p5 describes the representation of unsigned types as
collections of bits representing powers of 2. For unsigned integer
types in general, there may be padding bits, but there are no padding
bits in char, signed char, unsigned char and char8_t. (C defines the
bit order for unsigned char, but I don't think C++ does as far as I can
see.)

17.4.1 specifies <cstdint>, and in 17.4.1p2 this refers back to the C
standards section 7.20 where the size-specific integer types are defined
to have no padding bits.
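
These properties can also be checked on a particular implementation. The sketch below is only a sanity check of what a given compiler provides, not a substitute for the chapter and verse above; `uint64_has_padding_bits` is a hypothetical helper.

```cpp
#include <climits>
#include <cstdint>
#include <limits>
#include <type_traits>

// digits counts value bits only, so padding bits would make the object
// larger than its value bits.
constexpr bool uint64_has_padding_bits()
{
    return sizeof(std::uint64_t) * CHAR_BIT !=
           static_cast<unsigned>(std::numeric_limits<std::uint64_t>::digits);
}

static_assert(std::numeric_limits<std::uint64_t>::digits == 64,
              "uint64_t should have exactly 64 value bits");
static_assert(!uint64_has_padding_bits(),
              "uint64_t should have no padding bits");

// C++17: true when every object representation corresponds to a distinct
// value, which rules out padding bits for an integer type.
static_assert(std::has_unique_object_representations_v<std::uint64_t>,
              "uint64_t should have unique object representations");
```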

So again - please tell me why you think memcpy'ing data into a uint64_t
may have dangerous or poorly defined behaviour.

The ball is in your court.

> Not that it matters for what I've written, but it matters for the silly
> argument that you offered, quoted above, and that you now argue against,
> plus, there is the thing about being Wrong on the internet, not to
> mention Doubly Wrong: just on principle one should not let that pass.
>
>
>> In case I am missing something, please tell me where you see any
>> possible dangerous or not fully defined behaviour even for the more
>> general case :
>>
>> #include <string.h>
>> #include <stdint.h>
>>
>> uint64_t read64bits(const void * p) {
>>      uint64_t x;
>>      memcpy((void*) &x, p, sizeof(uint64_t));
>>      return x;
>> }
>
> Now you're arguing against yourself again.
>
> That means that whatever I respond, I can expect a random direction answer.
>
> Anyway:
>
> * The C style cast there is both unnecessary and dangerous, because it
> can cast away const, and is difficult to grep, so it's ungood code.

I am asking what you think is /dangerous/ or not fully defined here -
not whether there might be an unnecessary cast or whether the style is
"perfect" according to your questionable judgement.

> * You're wrong about formally no padding bits for C++, so in principle
> that function can produce an invalid `uint64` with trap representation;
> that's UB -- except that that's in principle, not in practice.

See above for the chapter and verse you requested.

> * If the pointer `p` is invalid, or is a nullpointer, or doesn't point
> to at least sizeof(uint64_t) contiguous bytes of readable memory, then
> that's UB, and it's UB both in principle and in practice.
>

See below for the explicit assumption that p points to sufficient
readable data.

> If you had been a certain other old timer in this group, then it would
> also be relevant that there's possible UB due to stack overflow.
>
> Which by his (lack of) logic leads to the conclusion that all C++
> programs have UB.
>

I hope we can agree to assume that kind of thing is outside the scope of
the language!

>
>> We can assume that "p" points to data of some type of at least 64 bits
>> in size. I would like to hear of any potential issues in C or C++
>> (hence the cross-language code).
>
> Oh, cross language, sorry.

No problem. As long as we avoid crass language :-)

>
> For the case of C /I believe/ that there's a formal guarantee of no
> padding bits, so then only the last point above matters wrt. UB.
>

Ah ha - now we are getting somewhere!

I suspect the key point is that the C++ standard does not explicitly
define uint64_t to have no padding bits, but it refers to the C standard
which /does/ have such guarantees. C++ inherits them from C.

>
>>> I'm not sure if you're addressing only the formal here.
>>
>> No. It is a general principle. Some people /do/ believe that
>> "separate compilation" creates magical barriers that limit a
>> compiler's ability to see the relationships between code sections, and
>> therefore its ability to "optimise using assumptions about defined and
>> undefined behaviours", and that this means some kinds of undefined
>> behaviours become defined by moving code to a different file or
>> disabling optimisation. They are wrong to believe this. They may
>> manage to write code that works when they test it, but it will be
>> fragile - the code still has exactly the same undefined behaviours,
>> and these may manifest as bugs in the future.
>>
>>>
>>> However, if you intended this to also apply to the in-practice, then
>>> the burden of proof is on you that some computer exists where
>>> `uint64_t` has bits that do not participate in the value
>>> representation, so that they /can/ be invalid: as far as I know there
>>> is no such computer.
>>>
>>
>> No, not at all. It is only /you/ that has suggested, by claiming the
>> use of memcpy is not fully defined, that uint64_t may hypothetically
>> contain padding bits. (See earlier in my reply.) I know it can't, so
>> that is not the issue.
>
> Now you're claiming that your argument was my argument. Jeez.
>

Please re-read your reply to Bonita. You were advocating using
separately compiled code so that you can use assignment instead of
memcpy - claiming, bizarrely, that assignment was "safe" and memcpy
"unsafe" despite insisting on separately compiled functions for your
"solution" to work "in practice".

And then you claimed that the memcpy version was not well defined.

Alf P. Steinbach

Jun 16, 2023, 1:27:30 PM

On 2023-06-16 6:32 PM, David Brown wrote:
[snip]

>>>
>>> A "uint64_t" has a guaranteed fully-defined format and no padding bits.

[snip]

> The C++ standards change their numbering regularly. These numbers are
> from the C++20 draft N4860 - I don't believe the contents change much
> between versions.
>
> 6.8.1p3 .. p5 describes the representation of unsigned types as
> collections of bits representing powers of 2. For unsigned integer
> types in general, there may be padding bits, but there are no padding
> bits in char, signed char, unsigned char and char8_t.

[snip]

> The ball is in your court.

In the above you contradict yourself:

* First you claim that there are no padding bits for `uint64_t`.
* Then you paraphrase the standard that "there may be padding bits".

---

I elected now to not quote or respond to the rest which likewise
involved some self-contradictions.

- Alf

Keith Thompson

Jun 16, 2023, 2:07:04 PM

"Alf P. Steinbach" <alf.p.s...@gmail.com> writes:
> On 2023-06-16 10:55 AM, David Brown wrote:
>> On 16/06/2023 08:59, Alf P. Steinbach wrote:
>>>
>>> `memcpy` is not the safest tool around. It's rather the opposite,
>>> something to avoid /if possible/. `memcpy` is the "weak tool" here.
>>>
>> I don't know how the function here is supposed to be used. We know
>> that the pointer (it is syntactically a reference, but effectively a
>> pointer) is properly aligned for a uint64_t, but we don't know if it
>> actually points to a uint64_t. Perhaps it points to a double, or
>> some other 64-bit type.
>
> In the case you sketch where the bits do not represent a valid
> `uint64_t`, the `memcpy` does not make the behavior well-defined:
> that's a (dangerous) misconception.

I think David was suggesting not that the bits are not a valid
representation for an object of type uint64_t, but that, for example, the
pointer points to an object whose type is, say, double. (I haven't
followed the discussion closely enough to have an opinion on whether
that matters.)

[...]

--
Keith Thompson (The_Other_Keith) Keith.S.T...@gmail.com
Will write code for food.
void Void(void) { Void(); } /* The recursive call of the void */

james...@alumni.caltech.edu

Jun 16, 2023, 5:16:50 PM

On Friday, June 16, 2023 at 2:48:24 AM UTC-4, Alf P. Steinbach wrote:
> On 2023-06-15 12:27 AM, James Kuyper wrote:
> > On 6/14/23 15:46, Alf P. Steinbach wrote:

...

> > All you need is a platform where misaligned pointers do not merely cause
> > the code to be inefficient, but to actually malfunction. On such a
> > platform, if p_bytes is not correctly aligned to store a uint64_t, then
> > the code will malfunction in the reinterpret_cast<>.
> Yes, but irrelevant for the case discussed, because the values are
> guaranteed correctly aligned.

You said nothing about what p_bytes was when you asked

"how is the compiler to know that it's generally called with a
`*reinterpret_cast<uint64_t*>( p_bytes )` as argument?"

In particular, you said nothing about p_bytes that would guarantee that it was
correctly aligned. The memcpy() would only be reasonable if it were possible
for "data" to be a misaligned pointer; otherwise simple assignment would be
simpler and equally safe.
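
For the case where misalignment really is possible, a minimal sketch of the memcpy() approach could look like the following. The function name `read_u64_unaligned` is hypothetical, and the sketch assumes the caller guarantees at least sizeof(uint64_t) readable bytes at `p`.

```cpp
#include <cstdint>
#include <cstring>

// Reads sizeof(uint64_t) bytes starting at p, which may be misaligned.
// memcpy copies the bytes of the object representation, so no uint64_t
// glvalue is ever formed at the possibly misaligned address.
inline std::uint64_t read_u64_unaligned(const unsigned char* p) {
    std::uint64_t value;
    std::memcpy(&value, p, sizeof value);
    return value;
}
```

The result is in the host's byte order; any endianness conversion would be a separate step.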

> [snip]
> > If p_bytes is correctly aligned, simple assignment will work just as
> > well as memcpy().
> Yes.
> >> Even g++'s documentation of `-fstrict-aliasing` says "A character type
> >> may alias any other type.". ...
> >
> > True, but that's not what this reinterpret_cast does;
> It is what this `reinterpret_cast` does.

This reinterpret_cast converts p_bytes, which presumably points to the
first element of an array of character type, into a uint64_t*. In
other words, an array of char is being aliased as a uint64_t. Are you
claiming that uint64_t is a character type?

Alf P. Steinbach

Jun 16, 2023, 6:19:34 PM

On 2023-06-16 11:16 PM, james...@alumni.caltech.edu wrote:
> On Friday, June 16, 2023 at 2:48:24 AM UTC-4, Alf P. Steinbach wrote:
>> On 2023-06-15 12:27 AM, James Kuyper wrote:
>>> On 6/14/23 15:46, Alf P. Steinbach wrote:
> ...
>>> All you need is a platform where misaligned pointers do not merely cause
>>> the code to be inefficient, but to actually malfunction. On such a
>>> platform, if p_bytes is not correctly aligned to store a uint64_t, then
>>> the code will malfunction in the reinterpret_cast<>.
>> Yes, but irrelevant for the case discussed, because the values are
>> guaranteed correctly aligned.
>
> You said nothing about what p_bytes was when you asked
>
> "how is the compiler to know that it's generally called with a
> `*reinterpret_cast<uint64_t*>( p_bytes )` as argument?"
>
> In particular, you said nothing about p_bytes that would guarantee that it was
> correctly aligned.

That was known from the optimization attempt in the original posting's
code (I'm not the OP):

#if defined(__GNUC__) || defined(__llvm__)
if( (size_t)&data % 8 )
__builtin_unreachable();
#endif

> The memcpy() would only be reasonable if it were possible
> for "data" to be a misaligned pointer; otherwise simple assignment would be
> simpler and equally safe.

On that we agree. :)

>> [snip]
>>> If p_bytes is correctly aligned, simple assignment will work just as
>>> well as memcpy().
>> Yes.
>>>> Even g++'s documentation of `-fstrict-aliasing` says "A character type
>>>> may alias any other type.". ...
>>>
>>> True, but that's not what this reinterpret_cast does;
>> It is what this `reinterpret_cast` does.
>
> This reinterpret_cast converts p_bytes, which presumably points to the
> first element of an array of character type, into a uint64_t*. In
> other words, an array of char is being aliased as a uint64_t. Are you
> claiming that uint64_t is a character type?

It may be that my English isn't good enough to understand a one-way
nature of the GCC docs' wording.

Perhaps.

The way I think, aliasing is of interest to the compiler because where
it can assume that a T value is never changed via a U* pointer, and that
the value that the U* points to is never changed by a change to the T
value, it can optimize, e.g. not reload that value from memory when it's
already in a register. If T = float and U* = int*, then it can make this
assumption. Similarly if T = int and U* = float*, it can assume this.

But if T = char-type, such as the first char in an array, and U* is a
double*, then it can not reasonably make this assumption, and as I read
the docs quote g++ doesn't make this assumption for T = char-type.

And similarly, if T = double and U* is a char-type*, then it can not
reasonably make this assumption, and as I read the docs quote g++
doesn't make this assumption for U* = char-type*.

That is, as I read it, based more on reasoning about the purpose than
expertise in English (I'm Norwegian), the wording works both ways.
char-type on either side is OK. But invoking that freedom twice with
different types, to end up with e.g. int* and double* pointers that are
the same address, so that an int assignment can change a double or vice
versa, is ungood.

- Alf

David Brown

Jun 17, 2023, 1:57:28 PM

If we knew that the original pointer pointed to a uint64_t, then we
could /all/ agree that a plain assignment would be simpler, clearer, and
as efficient as possible.

But as far as I know, no such guarantee exists. My guess, from how
functions like this are sometimes used, is that the original data is in
an array of unsigned char - perhaps a buffer for a received network
packet or a file that has been read.

And if the data does not start as a uint64_t (or compatible type), then
reading it through a uint64_t glvalue is /not/ safe, even if alignment
is guaranteed.

>
>
>>> [snip]
>>>> If p_bytes is correctly aligned, simple assignment will work just as
>>>> well as memcpy().
>>> Yes.
>>>>> Even g++'s documentation of `-fstrict-aliasing` says "A character type
>>>>> may alias any other type.". ...
>>>>
>>>> True, but that's not what this reinterpret_cast does;
>>> It is what this `reinterpret_cast` does.
>>
>> This reinterpret_cast converts p_bytes, which presumably points to the
>> first element of an array of character type, into a uint64_t*. In
>> other words, an array of char is being aliased as a uint64_t. Are you
>> claiming that uint64_t is a character type?
>
> It may be that my English isn't good enough to understand a one-way
> nature of the GCC docs' wording.
>

The "cppreference" site is often clearer than the standards language:

<https://en.cppreference.com/w/cpp/language/reinterpret_cast#Type_aliasing>

The key point is that a reinterpret_cast (or equivalent via a C-style
cast) does not let you access incompatible types. It does not, in gcc
parlance, side-step the strict aliasing rules.

Like a C cast, reinterpret_cast is a way to tell the compiler that you
think it is safe to change the type of an object (usually a pointer).
But it does not /make/ it safe. And if the compiler can see that the
actual object type is not compatible with the way you are accessing it,
then you have a conflict - you are lying to the compiler, and no good
will come of it. A reinterpret_cast will not change that situation.
(And separate compilation will only hide the problem.)

> Perhaps.
>
> The way I think, aliasing is of interest to the compiler because where
> it can assume that a T value is never changed via a U* pointer, and that
> the value that the U* points to is never changed by a change to the T
> value, it can optimize, e.g. not reload that value from memory when it's
> already in a register. If T = float and U* = int*, then it can make this
> assumption. Similarly if T = int and U* = float*, it can assume this.
>

Yes.
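
A minimal sketch of the optimisation being described, with a made-up function name `store_both`; it is only an illustration, not code from the thread.

```cpp
// Because int and float are incompatible types, the compiler may assume
// that the store through `f` cannot modify `*i`. It is therefore allowed
// to return the constant 1 without reloading `*i` from memory.
inline int store_both(int* i, float* f) {
    *i = 1;
    *f = 2.0f;
    return *i;
}
```

If a caller makes both pointers refer to the same storage, the behaviour is undefined, and the value returned may differ between optimisation levels.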

> But if T = char-type, such as the first char in an array, and U* is a
> double*, then it can not reasonably make this assumption, and as I read
> the docs quote g++ doesn't make this assumption for T = char-type.
>

No. You can use a char-type pointer to access a non-char object, but
you cannot use a non-char pointer to access a char (array) object.
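
The permitted direction can be sketched like this. The function name `same_bytes` is hypothetical; the sketch only shows that reading the bytes of a double through unsigned char is covered by the character-type exception.

```cpp
#include <cstddef>

// Allowed: an unsigned char glvalue may read the bytes of any object,
// here the object representations of two doubles.
inline bool same_bytes(const double& a, const double& b) {
    const unsigned char* pa = reinterpret_cast<const unsigned char*>(&a);
    const unsigned char* pb = reinterpret_cast<const unsigned char*>(&b);
    for (std::size_t i = 0; i < sizeof(double); ++i) {
        if (pa[i] != pb[i]) return false;
    }
    return true;
}

// The reverse direction is not covered by that exception, so it is left
// as a comment only:
//   unsigned char buf[sizeof(double)] = {};
//   double d = *reinterpret_cast<double*>(buf);
```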

I think gcc's documentation could certainly be clearer here, and I also
think that the documentation for the "-fstrict-aliasing" flag could be moved
from an optimisation flag to the "code generation options" page. IMHO,
using "-fno-strict-aliasing" is a significant change to the semantics of
the language, making previously undefined behaviour into defined
behaviour (much like "-fwrapv" does for signed integer overflow).

> And similarly, if T = double and U* is a char-type*, then it can not
> reasonably make this assumption, and as I read the docs quote g++
> doesn't make this assumption for U* = char-type*.
>

Correct.

However, gcc/g++ is not saying anything more or less than the standards
say here. The behaviour - both the defined behaviour, and the undefined
behaviour - comes straight from the standard. (The only exception is
for type-punning unions. C90 did not explicitly say they were allowed,
but the gcc documentation says they are allowed even in C90 mode. C99
onwards allows them, while C++ never has.)

Bonita Montero

Jun 17, 2023, 2:08:39 PM

Am 16.06.2023 um 23:16 schrieb james...@alumni.caltech.edu:

> ... The memcpy() would only be reasonable if it were possible
> for "data" to be a misaligned pointer; otherwise simple assignment
> would be simpler and equally safe.

I'm aliasing a char-array.

Alf P. Steinbach

Jun 17, 2023, 2:37:42 PM

Assuming no padding bits, which for clarity is not checked (AFAIK there
is no extant compiler that introduces padding bits):

#include <new>
#include <assert.h>
#include <stdio.h>

auto main() -> int
{
    alignas( int ) char chars[sizeof( int )] = {};
    int* const p = std::launder( new( (void*) chars ) int );
    assert( *p == 0 );
    chars[0] = 42;
    printf( "This system is %s-endian.\n", (*p == 42)? "little" : "big" );
}

One possible way to reconcile that working example with your assertion
"cannot" is that in your view the `*p` expressions do not access the
character array but the `double` object that resides there.

If that is the case then the assertion is meaningless nonsense,
consistent with your sequence of self-contradictions in this thread.

[snip]

- Alf

David Brown

Jun 17, 2023, 6:30:22 PM

It is an int, in your example, not a double, but otherwise that is
/exactly/ what happens. *p is accessing the int object, not a char array.

Then "chars[0] = 42;" is using a character pointer to access the memory
used for storing an int.

<https://en.cppreference.com/w/cpp/utility/launder>

>
> If that is the case then the assertion is meaningless nonsense,
> consistent with your sequence of self-contradictions in this thread.
>

I haven't contradicted myself - I have only contradicted you.

David Brown

Jun 17, 2023, 6:36:37 PM

On 16/06/2023 19:27, Alf P. Steinbach wrote:
> On 2023-06-16 6:32 PM, David Brown wrote:
> [snip]
>>>>
>>>> A "uint64_t" has a guaranteed fully-defined format and no padding bits.
> [snip]
>> The C++ standards change their numbering regularly. These numbers are
>> from the C++20 draft N4860 - I don't believe the contents change much
>> between versions.
>>
>> 6.8.1p3 .. p5 describes the representation of unsigned types as
>> collections of bits representing powers of 2. For unsigned integer
>> types in general, there may be padding bits, but there are no padding
>> bits in char, signed char, unsigned char and char8_t.
>
> [snip]
>
>> The ball is in your court.
>
> In the above you contradict yourself:

Your English is better than most native speakers - but you have to /try/
to read and understand posts. It is your comprehension that is at
fault, and I suspect also your assumptions about C and C++. (To be
fair, many of your assumptions are true in all realistic implementations
in the modern world.)

>
> * First you claim that there are no padding bits for `uint64_t`.

Correct. That's what the C standard says, and that's what the C++
standard says by delegation to the C standard.

> * Then you paraphrase the standard that "there may be padding bits".

You asked for references - I gave them. Look them up. You will see
that in general, unsigned integer types (other than unsigned char) may
have padding bits. Only specific types such as the size-specific types
in <stdint.h> are guaranteed not to have padding bits - /if/ they exist.
An implementation might not support these non-padded types at all. It
might support them as extended integer types, and have padding bits in
the standard integer types. It might have lots of unsigned types - some
with padding, some without.

But uint64_t and the other size-specific unsigned types are guaranteed
not to have padding.
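
The "/if/ they exist" part can be handled in code by testing the UINT64_MAX macro, which <cstdint> defines only when the exact-width type is provided. A minimal sketch, where the names `u64` and `kHasExactU64` are made up for illustration:

```cpp
#include <cstdint>

#ifdef UINT64_MAX
// The exact-width type exists: exactly 64 value bits and no padding.
using u64 = std::uint64_t;
constexpr bool kHasExactU64 = true;
#else
// Fallback that always exists, but may be wider or contain padding bits.
using u64 = std::uint_least64_t;
constexpr bool kHasExactU64 = false;
#endif
```

Code that depends on the object representation, such as a memcpy() from a byte buffer, would want to check `kHasExactU64` before relying on it.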

Alf P. Steinbach

Jun 17, 2023, 7:11:50 PM

Well I give up, as usual when I encounter strong religious beliefs with
argumentation consisting of self-contradictions, denials and advice.

- Alf

james...@alumni.caltech.edu

Jun 18, 2023, 1:10:50 AM

How is that supposed to help? On platforms where misaligned access
doesn't just cause inefficiency, but actually is a serious problem, the
unspecified result of the reinterpret_cast<> of a mis-aligned pointer
can be, and often is, a correctly aligned pointer which, therefore,
necessarily points at a different location in memory. It is quite
frequently one of the two nearest correctly aligned pointers. The result
of such a conversion would pass that test, and as a result it would
memcpy() from the wrong location.

>>> [snip]
>>>> If p_bytes is correctly aligned, simple assignment will work just as
>>>> well as memcpy().
>>> Yes.
>>>>> Even g++'s documentation of `-fstrict-aliasing` says "A character type
>>>>> may alias any other type.". ...
>>>>
>>>> True, but that's not what this reinterpret_cast does;
>>> It is what this `reinterpret_cast` does.
>>
>> This reinterpret_cast converts p_bytes, which presumably points to the
>> first element of an array of character type, into a uint64_t*. In
>> other words, an array of char is being aliased as a uint64_t. Are you
>> claiming that uint64_t is a character type?
>
> It may be that my English isn't good enough to understand a one-way
> nature of the GCC docs' wording.

I was responding to a statement about both the C and the C++ standards.
The GCC docs merely reword the relevant clauses of the standards.
Insofar as gcc claims standard conformance, the wording of those
standards is more authoritative than GCC's docs.

The C standard's wording is quite explicitly asymmetrical. It
distinguishes between the effective type of an object, and of the lvalue
that is used to access it. The effective type of the object, in this
case, is an array of character type. The lvalue is uint64_t. There is a
special case that allows the lvalue to have a character type regardless
of the effective type; there is no special case that allows the
effective type to be an array of character when the lvalue's type isn't
also an array of character.

The relevant section of the C++ standard is 7.2.1p11 (in n4860.pdf). It
uses the term "dynamic type" instead of "effective type", and "glvalue"
rather than "lvalue", and is asymmetrical in the same way: it has an
exception that allows the glvalue to be "char, unsigned char, or
std::byte" regardless of the dynamic type. It has no exception that
allows the dynamic type to be an array of one of those types unless the
glvalue is also such an array.

...

> The way I think, aliasing is of interest to the compiler because where
> it can assume that a T value is never changed via a U* pointer, and
> that the value that the U* points to is never changed by a change to
> the T value, it can optimize, e.g. not reload that value from memory
> when it's already in a register. If T = float and U* = int*, then it
> can make this assumption. Similarly if T = int and U* = float*, it can
> assume this.

That is one relevant issue, but it ignores the alignment issue, which is
also a barrier to aliasing, and an inherently asymmetric one.

David Brown

Jun 18, 2023, 5:00:55 AM

On 16/06/2023 20:06, Keith Thompson wrote:
> "Alf P. Steinbach" <alf.p.s...@gmail.com> writes:
>> On 2023-06-16 10:55 AM, David Brown wrote:
>>> On 16/06/2023 08:59, Alf P. Steinbach wrote:
>>>>
>>>> `memcpy` is not the safest tool around. It's rather the opposite,
>>>> something to avoid /if possible/. `memcpy` is the "weak tool" here.
>>>>
>>> I don't know how the function here is supposed to be used. We know
>>> that the pointer (it is syntactically a reference, but effectively a
>>> pointer) is properly aligned for a uint64_t, but we don't know if it
>>> actually points to a uint64_t. Perhaps it points to a double, or
>>> some other 64-bit type.
>>
>> In the case you sketch where the bits do not represent a valid
>> `uint64_t`, the `memcpy` does not make the behavior well-defined:
>> that's a (dangerous) misconception.
>
> I think David was suggesting not that the bits are not a valid
> representation for an object of type uint64_t, but that, for example, the
> pointer points to an object whose type is, say, double. (I haven't
> followed the discussion closely enough to have an opinion on whether
> that matters.)
>

All I am saying is that if you have a double in memory somewhere, and
you have a pointer-to-uint64_t that happens to contain the address of
the double, you can't use that pointer to access the double without
invoking undefined behaviour. Alignment is not the main problem here
(it is not a problem at all if the double has a valid alignment for a
uint64_t). It is the effective type (C), dynamic type (C++) or strict
aliasing rules (gcc) that are the problem.

Moving parts of the code to separate compilation, as Alf suggested, does
not help. reinterpret_cast<> does not help. std::launder, another of
Alf's suggestions, does not help either if you want to read the double
as a uint64_t. (You can use std::launder to create a new uint64_t in
the space the double occupied, but not to let you read the existing
data. I believe that std::launder was so rarely used appropriately, and
so often used incorrectly, that there are plans to remove it from C++.)

memcpy(), on the other hand, /will/ let you access the memory safely and
with well-defined behaviour. This is why the OP's original code will
work correctly. It was merely the optimisation in that code that was
not as good as it could easily have been.
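
A minimal sketch of the double case, assuming a platform where double and uint64_t have the same size (checked by the static_assert); the name `bits_of` is made up for illustration.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "this sketch assumes a 64-bit double");

// Copies the object representation of a double into a uint64_t.
// The double is never accessed through a uint64_t glvalue, so the
// aliasing rules are not violated.
inline std::uint64_t bits_of(double d) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return bits;
}
```

Copying the bits back with another memcpy() recovers the original double.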

David Brown

Jun 18, 2023, 5:01:12 AM

I strongly resent the accusation of "religious" beliefs here. I put
considerable effort into trying to understand the details of the C and
C++ languages, both the theoretical aspects from the standards, and the
practical aspects of how the languages are implemented. If
my understanding is incorrect, I like to be told, and learn from that.
I have done so many, many times over the years in this group (and c.l.c.).

I still have no idea what you are talking about when you say I am
contradicting myself. None. Not a clue. If you told me - I have asked
you to do so - then we might make progress. But no, you would rather
repeat your claims without proof or explanation - /that/ is
religious-style posting.

Vir Campestris

Jun 18, 2023, 9:14:12 AM

On 13/06/2023 19:36, Bo Persson wrote:
> On 2023-06-13 at 20:31, Bonita Montero wrote:
>> What is the if( ... ) below good for ?
>>
>> template<bit_cursor_iterator BitIt>
>> constexpr uint64_t bit_cursor<BitIt>::dRead( uint64_t const &data )
>> noexcept
>> {
>> #if defined(__GNUC__) || defined(__llvm__)
>> if( (size_t)&data % 8 )
>> __builtin_unreachable();
>> #endif
>>      uint64_t value;
>>      memcpy( &value, &data, sizeof(uint64_t) );
>>      return value;
>> }
>
> It tells the compiler that the data is properly aligned. Possibly saves
> a potential call to library memcpy to sort out unaligned bytes.
>
Coming in late to this, but having read the thread through -
It tells the compiler the data is aligned on an 8 word boundary.

I say word, because while almost all computers have byte addressing it
is not 100% (the first one I learned assembler on had 36-bit addressing
- and I didn't use C on it).

And even when it is 8-bit addressing this could be an excessive check,
for example on a machine with a 32 bit native word size.

Andy

Keith Thompson

Jun 18, 2023, 3:59:57 PM

Vir Campestris <vir.cam...@invalid.invalid> writes:
> On 13/06/2023 19:36, Bo Persson wrote:
>> On 2023-06-13 at 20:31, Bonita Montero wrote:
>>> What is the if( ... ) below good for ?
>>>
>>> template<bit_cursor_iterator BitIt>
>>> constexpr uint64_t bit_cursor<BitIt>::dRead( uint64_t const &data )
>>> noexcept
>>> {
>>> #if defined(__GNUC__) || defined(__llvm__)
>>>      if( (size_t)&data % 8 )
>>>          __builtin_unreachable();
>>> #endif
>>>      uint64_t value;
>>>      memcpy( &value, &data, sizeof(uint64_t) );
>>>      return value;
>>> }
>> It tells the compiler that the data is properly aligned. Possibly
>> saves a potential call to library memcpy to sort out unaligned
>> bytes.
>>
> Coming in late to this, but having read the thread through -
> It tells the compiler the data is aligned on an 8 word boundary.
>
> I say word, because while almost all computers have byte addressing it
> is not 100% (the first one I learned assembler on had 36-bit
> addressing - and I didn't use C on it).

C++, as far as I know, doesn't define what a "word" is, and
doesn't use that term. It does define a "byte" to be CHAR_BIT
bits, where CHAR_BIT >= 8. An implementation for a system with
36-bit addressing might either define CHAR_BIT==36 or CHAR_BIT==9
with byte level accesses implemented in software. (And such an
implementation would not define uint64_t.)

sizeof (uint64_t) is most likely 8, but it could be 4, 2, or 1.
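
A minimal sketch of how the size and alignment could be expressed without the hard-coded 8; the constant names `kU64Bytes` and `kU64Align` are made up for illustration.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>

// uint64_t has exactly 64 bits and no padding, so its size in bytes is
// 64 / CHAR_BIT: 8 when CHAR_BIT == 8, 4 when CHAR_BIT == 16, and so on.
constexpr std::size_t kU64Bytes = 64 / CHAR_BIT;
static_assert(sizeof(std::uint64_t) == kU64Bytes,
              "uint64_t occupies exactly 64 bits");

// The required alignment is a separate property; it can be smaller than
// the size, for example on some 32-bit targets.
constexpr std::size_t kU64Align = alignof(std::uint64_t);
```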

> And even when it is 8-bit addressing this could be an excessive check,
> for example on a machine with a 32 bit native word size.

Alf P. Steinbach

Jun 19, 2023, 10:43:29 AM

Regarding the cast to `size_t` you'd have to ask the original coder,
who is not necessarily even the original poster, who isn't me.

However, presumably you're asking how that quote is meant to help you
understand that there is an assumption of correct alignment.

That's because with incorrect alignment execution (with GCC and clang)
reaches `__builtin_unreachable`, which is UB by definition. This allows
the compiler to optimize /as if/ correct alignment is guaranteed. Thus
the quoted code explicitly expresses that assumption.
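
GCC and clang also provide `__builtin_assume_aligned`, which states the same assumption directly. The following is only a sketch of that alternative, not the original poster's code; the function name `read_u64_assume_aligned` is hypothetical.

```cpp
#include <cstdint>
#include <cstring>

inline std::uint64_t read_u64_assume_aligned(const void* p) {
#if defined(__GNUC__) || defined(__clang__)
    // Promise to the optimizer that p is aligned for uint64_t. Breaking
    // the promise is undefined behaviour, just as reaching
    // __builtin_unreachable() would be.
    p = __builtin_assume_aligned(p, alignof(std::uint64_t));
#endif
    std::uint64_t value;
    std::memcpy(&value, p, sizeof value);
    return value;
}
```

On other compilers the function still works; it simply carries no alignment hint.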

> On platforms where misaligned access
> doesn't just cause inefficiency, but actually is a serious problem, the
> unspecified result of the reinterpret_cast<> of a mis-aligned pointer
> can be, and often is, a correctly aligned pointer which, therefore,
> necessarily points at a different location in memory. It is quite
> frequently one of the two nearest correctly aligned pointers. The result
> of such a conversion would pass that test, and as a result it would
> memcpy() from the wrong location.

Yes that's trivially true: breaking an assumption can have dire
consequences.

However, /mentioning/ that indicates some relevance to something.

And AFAICS there is no relevance to anything.

[snip]

- Alf

James Kuyper

Jun 19, 2023, 2:25:33 PM

That is not the point I was discussing, which you might reasonably have
concluded from the fact that I didn't even mention size_t or [u]intptr_t.

I agree that it is the wrong type, but other people have already
addressed that issue more than adequately. I don't normally respond to
the original poster, for reasons that should be abundantly clear by
looking at how she responded to discussions of that issue. Keep in mind
that she's apparently not the person who wrote the code, she just ran
into it and is trying to figure it out.

> However, presumably you're asking how that quote is meant to help you
> understand that there is an assumption of correct alignment.

Nope. I'm not asking how I'm supposed to figure out that that assumption
was made. That's trivial. I'm asking why you think the assumption is
guaranteed to be true. See above, where you say "the values are
guaranteed correctly aligned".

> That's because with incorrect alignment execution (with GCC and clang)
> reaches `__builtin_unreachable`, which is UB by definition. This allows
> the compiler to optimize /as if/ correct alignment is guaranteed. Thus
> the quoted code explicitly expresses that assumption.

So, this is designed to make the compiler generate code that is
optimized for correctly aligned pointers, which is valuable on a
platform where mis-aligned access is merely slower than correctly
aligned access. It doesn't do anything useful on a platform where
mis-aligned pointers simply cannot be safely dereferenced. Nor does it
do any good on platforms where uint64_t* is incapable of representing an
address that is mis-aligned, so the reinterpret_cast<> is guaranteed to
produce a pointer that points at a different location in memory.

The fact that you don't work on such platforms doesn't change the fact
that such platforms exist, and it doesn't negate the C and C++
committees' decisions to write the two standards in such a way as to
allow fully conforming implementations on such platforms.

...

> Yes that's trivially true: breaking an assumption can have dire
> consequences.

You said that this code has "UB that can't happen unless one informs a
really perverse compiler that it's there." You've just admitted that if
you compile for a platform where that assumption is not valid, the
consequences are dire. That's precisely what the UB specified by the
standard is intended to cover. That the code can also be used on
platforms where it isn't a problem doesn't make it any less appropriate
for the standard to say that it has UB.

Bonita Montero

Jun 19, 2023, 2:29:35 PM

Am 19.06.2023 um 20:25 schrieb James Kuyper:

> I agree that it is the wrong type, but other people have already
> addressed that issue more than adequately. I don't normally respond to
> the original poster, for reasons that should be abundantly clear by
> looking at how she responded to discussions of that issue. Keep in mind
> that she's apparently not the person who wrote the code, she just ran
> into it and is trying to figure it out.

I've written the code and I don't think that I'll ever run
into problems with that; I'm simply not that compulsive.

Alf P. Steinbach

Jun 19, 2023, 6:47:55 PM

Not sure I'm buying that evaluation; it assumes too much about the
compiler's code generation and knowledge of calling context.

> Nor does it
> do any good on platforms where uint64_t* is incapable of representing an
> address that is mis-aligned, so the reinterpret_cast<> is guaranteed to
> produce a pointer that points at a different location in memory.
>
> The fact that you don't work on such platforms doesn't change the fact
> that such platforms exist, and it doesn't negate the C and C++
> committees' decisions to write the two standards in such a way as to
> allow fully conforming implementations on such platforms.
>
> ...
>> Yes that's trivially true: breaking an assumption can have dire
>> consequences.
>
> You said that this code has "UB that can't happen unless one informs a
> really perverse compiler that it's there." You've just admitted that if
> you compile for a platform where that assumption is not valid, the
> consequences are dire.

No, the assumption that was made in the code was not about the platform.

The assumption made was that it's guaranteed that the function is called
with a correctly aligned argument.

The platform has nothing to do with it.

> That's precisely what the UB specified by the
> standard is intended to cover. That the code can also be used on
> platforms where it isn't a problem doesn't make it any less appropriate
> for the standard to say that it has UB.

These errors of reasoning follow from the one about platform-specificity.

- Alf

Keith Thompson

Jun 19, 2023, 8:37:58 PM

"Alf P. Steinbach" <alf.p.s...@gmail.com> writes:

I haven't followed this thread closely enough to be sure that nobody has
raised this point:

Nothing in the language standard guarantees that converting a pointer
value to an integer type and checking the low-order bits of the result
tells you anything about the alignment of the memory the pointer points
to. The assumption is likely to be valid on any platform supported by
gcc and/or llvm (clang), but I have no idea whether either of those
compilers would be clever enough to recognize what the code is trying to
assert and perform the desired optimization.

I've worked on systems where a pointer value with a representation
ending in binary 000 might be byte-aligned, or a pointer ending in
non-zero bits might be word-aligned. (Cray vector machines, where the
hardware can only address 64-bit words and a byte offset was stored in
the high-order 4 bits and managed in software.)

>> That's precisely what the UB specified by the
>> standard is intended to cover. That the code can also be used on
>> platforms where it isn't a problem doesn't make it any less appropriate
>> for the standard to say that it has UB.
>
> These errors of reasoning follow from the one about platform-specificity.

Alf P. Steinbach

Jun 20, 2023, 2:32:58 AM

On 2023-06-20 2:37 AM, Keith Thompson wrote:
[snip]

> Nothing in the language standard guarantees that converting a pointer
> value to an integer type and checking the low-order bits of the result
> tells you anything about the alignment of the memory the pointer points
> to. The assumption is likely to be valid on any platform supported by
> gcc and/or llvm (clang), but I have no idea whether either of those
> compilers would be clever enough to recognize what the code is trying to
> assert and perform the desired optimization.

From a portability POV the example given in the original posting is
even worse:

* Your point that checking the low order bits is not guaranteed.
Apparently the standard library has no simple way to query the
alignment of an arbitrary address, only the alignment of a type. In
principle one could check the alignment by generating a corresponding
well aligned address via `std::align`, and comparing, but that may be
too convoluted to help the compiler understand an assertion about the
alignment. However, at the point where one is dealing with a byte array,
if one has ensured that the array itself is properly aligned it's
trivial to check/ensure that any part of it has some given alignment.

* The `% 8` in the code.
Should be `% alignof( Type )`. E.g. byte size /can/ be 8 octets.

* The cast to `size_t`.
Should be a cast to `uintptr_t`. `size_t` may be too small.

And I've remarked elsewhere that portable `for(;;){}` UB would probably
communicate the "can't get here" just as well as the non-portable GCC
intrinsic.

Plus, that

* The parameter type `uint64_t const &` /already/ expresses the
alignment assumption.

It's only with machine code inlining that the compiler might at all
consider that `void foo( int& )` might receive a non-aligned parameter.

Proper alignment is a tacit assumption of all C++ code.
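
The two ways of checking alignment mentioned in the list above can be sketched as follows. This is only an illustration: the helper names `is_aligned_via_std_align` and `is_aligned_via_integer` are hypothetical and do not come from the original code, and the integer variant relies on the implementation-defined pointer-to-integer mapping that is discussed in this thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical helper: asks std::align whether p already satisfies alignof(T).
// With space == sizeof(T) there is no room for any adjustment, so std::align
// returns p itself when p is aligned and nullptr otherwise.
template<class T>
bool is_aligned_via_std_align(const void* p) noexcept
{
    void* q = const_cast<void*>(p);   // std::align wants a void*&; nothing is written
    std::size_t space = sizeof(T);
    return std::align(alignof(T), sizeof(T), q, space) == p;
}

// Hypothetical helper: the integer test, using uintptr_t and alignof(T)
// instead of size_t and a hard-coded 8. Not guaranteed by the standard to
// reflect real alignment; it assumes a flat address representation.
template<class T>
bool is_aligned_via_integer(const void* p) noexcept
{
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}
```

On a mainstream platform both helpers should agree; whether a compiler can use the `std::align` form as an optimization hint is a separate question that this sketch does not answer.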

> I've worked on systems where a pointer value with a representation
> ending in binary 000 might be byte-aligned, or a pointer ending in
> non-zero bits might be word-aligned. (Cray vector machines, where the
> hardware can only address 64-bit words and a byte offset was stored in
> the high-order 4 bits and managed in software.)

Gah. :-(

But that raises the question of which platforms modern C++ really covers
/in practice/. I.e. which platforms have C++20 compilers.

It's counter-productive with modern C++ rules that ensure portability of
some construct where that portability is only an issue for systems that
don't have modern C++ compilers, so maybe the standard should be fixed.

[snip]

- Alf

David Brown

Jun 20, 2023, 3:15:24 AM

On 20/06/2023 08:32, Alf P. Steinbach wrote:
> On 2023-06-20 2:37 AM, Keith Thompson wrote:
> [snip]
>> Nothing in the language standard guarantees that converting a pointer
>> value to an integer type and checking the low-order bits of the result
>> tells you anything about the alignment of the memory the pointer points
>> to. The assumption is likely to be valid on any platform supported by
>> gcc and/or llvm (clang), but I have no idea whether either of those
>> compilers would be clever enough to recognize what the code is trying to
>> assert and perform the desired optimization.

That assumption /is/ true for all gcc targets. (I am less familiar with
clang.) And gcc will /sometimes/ use the explicit undefined behaviour
to optimise the access as aligned. Sometimes, however, that information
appears to get lost before later optimisation passes. (On RISC-V, gcc
generates fast aligned code at -O1, but slow unaligned code at -O2.)
gcc has a better solution to this - __builtin_assume_aligned() - which
could also work if gcc ever targeted systems like the Cray you
mentioned. (The OP, being Bonita, decided that the original poorer code
was better than listening to anyone else.)

>
> From a portability POV the example given in the original posting is
> even worse:
>
> * Your point that checking the low order bits is not guaranteed.
> Apparently the standard library has no simple way to query the
> alignment of an arbitrary address, only the alignment of a type.

C++20 has std::assume_aligned, which is basically a less flexible
standardisation of the gcc builtin. It does not let you query the
alignment, but it /does/ let you tell the compiler about it, which is
what you need for optimising code like the original post.
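
A minimal sketch of how that might look, assuming a byte-pointer interface rather than the `uint64_t const &` parameter of the original post. The function name `read_aligned_u64` is hypothetical. The feature-test macro selects `std::assume_aligned` when the library provides it, and the sketch falls back to the gcc builtin otherwise.

```cpp
#include <cstdint>
#include <cstring>
#include <memory>

// Hypothetical stand-in for the thread's dRead(): loads a uint64_t from bytes
// that the caller promises are aligned for uint64_t. Breaking that promise is
// undefined behaviour, exactly as with the __builtin_unreachable() version.
inline std::uint64_t read_aligned_u64(const unsigned char* bytes) noexcept
{
#if defined(__cpp_lib_assume_aligned)
    // C++20: returns the same pointer, annotated with the alignment promise.
    const unsigned char* p = std::assume_aligned<alignof(std::uint64_t)>(bytes);
#elif defined(__GNUC__)
    // gcc/clang builtin; takes and returns void pointers.
    const unsigned char* p = static_cast<const unsigned char*>(
        __builtin_assume_aligned(bytes, alignof(std::uint64_t)));
#else
    const unsigned char* p = bytes;   // no hint available; still correct
#endif
    std::uint64_t value;
    std::memcpy(&value, p, sizeof value);   // no aliasing problem with memcpy
    return value;
}
```

Whether the hint actually changes the generated code depends on the compiler, target and optimisation level.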

> In
> principle one could check the alignment by generating a corresponding
> well aligned address via `std::align`, and comparing, but that may be
> too convoluted to help the compiler understand an assertion about the
> alignment. However, at the point where one is dealing with a byte array,
> if one has ensured that the array itself is properly aligned it's
> trivial to check/ensure that any part of it has some given alignment.
>
> * The `% 8` in the code.
> Should be `% alignof( Type )`. E.g. byte size /can/ be 8 octets.
>

Yes.

> * The cast to `size_t`.
> Should be a cast to `uintptr_t`. `size_t` may be too small.
>

There's also the possibility that the target does not have an integer
type that is suitable for conversion of all data pointer types. If that
is the case, no "uintptr_t" type will exist for the platform - but
"size_t" will always exist. So if you have such a target, a cast to
"uintptr_t" will give a compile-time error, which is vastly preferable
to the silently dangerous compile you'd get with "size_t".

And of course, "uintptr_t" is better than "size_t" simply because you
would be using the correctly specified type for the purpose.

> And I've remarked elsewhere that portable `for(;;){}` UB would probably
> communicate the "can't get here" just as well as the non-portable GCC
> intrinsic.
>

In theory, perhaps. In practice, no. In some kinds of coding, infinite
loops like that can be exactly what you want, and compilers generally do
not assume that it is an error or undefined behaviour. (I am not at all
convinced that an infinite loop is undefined behaviour.)

If you are happy with C++23, then std::unreachable() would of course be
a better choice than a gcc extension.
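
As a rough sketch, the original check could be written with `std::unreachable()` where available, keeping the compiler extensions as a fallback. The names `assume_unreachable` and `read_u64_assuming_aligned` are hypothetical, and the integer-modulo alignment test is still the non-guaranteed one criticised elsewhere in this thread.

```cpp
#include <cstdint>
#include <cstring>
#include <utility>

// Hypothetical wrapper: uses std::unreachable() when the library has it
// (C++23), otherwise falls back to compiler-specific equivalents.
[[noreturn]] inline void assume_unreachable() noexcept
{
#if defined(__cpp_lib_unreachable)
    std::unreachable();
#elif defined(__GNUC__)
    __builtin_unreachable();
#elif defined(_MSC_VER)
    __assume(false);
#endif
    // If none of the branches applies, returning from a [[noreturn]]
    // function is itself undefined behaviour.
}

// Same shape as the function in the original post, with uintptr_t and
// alignof() in place of size_t and the literal 8.
inline std::uint64_t read_u64_assuming_aligned(const std::uint64_t& data) noexcept
{
    if (reinterpret_cast<std::uintptr_t>(&data) % alignof(std::uint64_t) != 0)
        assume_unreachable();   // tells the compiler this case never happens
    std::uint64_t value;
    std::memcpy(&value, &data, sizeof value);
    return value;
}
```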

James Kuyper

Jun 20, 2023, 12:23:02 PM

On 6/19/23 18:47, Alf P. Steinbach wrote:
> On 2023-06-19 8:25 PM, James Kuyper wrote:
>> On 6/19/23 10:43, Alf P. Steinbach wrote:

...

>>> That's because with incorrect alignment execution (with GCC and clang)
>>> reaches `__builtin_unreachable`, which is UB by definition. This allows
>>> the compiler to optimize /as if/ correct alignment is guaranteed. Thus
>>> the quoted code explicitly expresses that assumption.
>>
>> So, this is designed to make the compiler generate code that is
>> optimized for correctly aligned pointers, which is valuable on a
>> platform where mis-aligned access is merely slower than correctly
>> aligned access. It doesn't do anything useful on a platform where
>> mis-aligned pointers simply cannot be safely dereferenced.
>
> Not sure I'm buying that evaluation; it assumes too much about the
> compiler's code generation and knowledge of calling context.

I'm merely trying to guess at plausible reasons why someone might
consider it desirable to instruct the compiler to generate code that
can't deal safely with misaligned pointers. Obviously, those reasons
will be platform-specific.

...
>>> Yes that's trivially true: breaking an assumption can have dire
>>> consequences.
>>
>> You said that this code has "UB that can't happen unless one informs a
>> really perverse compiler that it's there." You've just admitted that if
>> you compile for a platform where that assumption is not valid, the
>> consequences are dire.
>
>
> No, the assumption that was made in the code was not about the platform.
>
> The assumption made was that it's guaranteed that the function is
> called with a correctly aligned argument.

Nothing is more normal than writing code that has pre-conditions that
must be met for the code to work properly. However, I find it quite
bizarre that you describe such a practice by claiming that it is
guaranteed that the pre-conditions will be met, and that, as a result,
the corresponding UB can't happen. And you reiterate those claims after
admitting that "breaking an assumption can have dire consequences".
You're twisting the meaning of basic words in the English language such
as "guarantee", "assume", and "can't happen" in ways I hadn't considered
plausible. I'll keep that in mind for future conversations with you.

...

> The platform has nothing to do with it.

The platform's characteristics determine whether or not alignment even
matters, and in particular whether misaligned access is even an issue,
and if it is, whether it is merely inefficient, or positively dangerous.

V

Jul 8, 2023, 2:36:52 PM

Hello, David.

Nice to see You around, man.

Seems like no one wants to post to comp.programming anymore.

Not even Amine.

It looks like you have moved on from there too?

On Thursday, June 15, 2023 at 11:02:30 AM UTC+3, David Brown wrote:
> On 15/06/2023 00:27, James Kuyper wrote:
> > On 6/14/23 15:46, Alf P. Steinbach wrote:
> >> On 2023-06-14 7:56 AM, Bonita Montero wrote:
> >>> Am 14.06.2023 um 07:52 schrieb Alf P. Steinbach:
> >>>
> >>>> It's copying an `uint64_t` that is known to be correctly aliased, to
> >>>> an `uint64_t`; that's nonsense.
> >>>
> >>> The reference initially supplied by the caller is cast from a char
> >>> array. memcpy() is the only legal way in C++ to alias that content.
> > ...
> >> So for the separately compiled function it does not matter technically,
> >> except possibly for performance, whether it uses clear, concise, safe
> >> and guaranteed max efficient `=`, or verbose and unsafe `memcpy`.
> >>
> >> That means that regarding this matter the common interpretation of the
> >> standard is not technical but instead specifies a formal UB that can't
> >> happen unless one informs a really perverse compiler that it's there.
> >
> > All you need is a platform where misaligned pointers do not merely cause
> > the code to be inefficient, but to actually malfunction. On such a
> > platform, if p_bytes is not correctly aligned to store a uint64_t, then
> > the code will malfunction in the reinterpret_cast<>.
> >
> And in case anyone has doubts, such platforms do exist. I have used
> embedded microcontrollers in which an unaligned access might mean access
> to the address rounded down (i.e., a 16-bit store to 0x2001 actually
> stores 16 bits at 0x2000). The stored data may or may not be
> byte-swapped - the behaviour is undefined, and I don't think it was
> consistent between different generations of the processor.
>
> There are also big processors which will fault on unaligned accesses.
> Even if there are OS services in place to simulate the access, the
> process is so massively slower than normal accesses that it could be
> considered a software malfunction for performance critical code.
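
For comparison, a minimal sketch of reading a value at an arbitrary offset in a byte array without ever forming a `uint64_t` pointer or reference to it, so that the misalignment problems quoted above cannot arise. The name `load_u64` is hypothetical and is not taken from the code under discussion.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies sizeof(uint64_t) bytes starting at
// bytes + offset into a properly aligned local object. No reinterpret_cast
// is involved, so no misaligned uint64_t pointer ever exists.
inline std::uint64_t load_u64(const unsigned char* bytes, std::size_t offset) noexcept
{
    std::uint64_t value;
    std::memcpy(&value, bytes + offset, sizeof value);
    return value;
}
```

The caller remains responsible for making sure that `offset + sizeof(std::uint64_t)` stays within the array.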