On 16/06/2023 17:25, Alf P. Steinbach wrote:
> On 2023-06-16 2:55 PM, David Brown wrote:
>> On 16/06/2023 12:34, Alf P. Steinbach wrote:
>>> On 2023-06-16 10:55 AM, David Brown wrote:
>>>> On 16/06/2023 08:59, Alf P. Steinbach wrote:
>>>>>
>>>>> `memcpy` is not the safest tool around. It's rather the opposite,
>>>>> something to avoid /if possible/. `memcpy` is the "weak tool" here.
>>>>>
>>>> I don't know how the function here is supposed to be used. We know
>>>> that the pointer (it is syntactically a reference, but effectively a
>>>> pointer) is properly aligned for a uint64_t, but we don't know if it
>>>> actually points to a uint64_t. Perhaps it points to a double, or
>>>> some other 64-bit type.
>>>
>>> In the case you sketch where the bits do not represent a valid
>>> `uint64_t`, the `memcpy` does not make the behavior well-defined:
>>> that's a (dangerous) misconception.
>>
>> A "uint64_t" has a guaranteed fully-defined format and no padding bits.
>
> Since you believe that,
Ref. C standards 7.20.1.1p2, 6.2.6.1p2, 6.2.6.2p1.
I believe the C++ standards inherit this from C, and the C standards are
easier to reference (IMHO) because the numbering stays consistent
between versions.
> your comment about "Perhaps points to a double"
> is meaningless nonsense.
I'm sorry, I have no idea what you mean by that.
>
> You're arguing against yourself, with only three lines between your two
> comments that are shooting dum-dum bullets at each other.
>
> Make up your mind, please.
Again, I have no idea what you mean.
Perhaps you don't remember what you wrote yourself?
>
>
>> All possible bit patterns for the type have well-defined behaviour.
>> (Hypothetically, that would not be true for "unsigned long long",
>> which could contain padding bits and could have trap
>> representations.)
>
> We could discuss this assertion, e.g. I could helpfully mention that in
> C++ "these requirements do not hold for other types [than character
> types]", but better that you waste time attempting to PROVE IT.
>
> Chapter and verse, please.
>
The C++ standards change their numbering regularly. These numbers are
from the C++20 draft N4860 - I don't believe the contents change much
between versions.
6.8.1p3 .. p5 describes the representation of unsigned types as
collections of bits representing powers of 2. For unsigned integer
types in general, there may be padding bits, but there are no padding
bits in char, signed char, unsigned char and char8_t. (C defines the
bit order for unsigned char, but as far as I can see C++ does not.)
17.4.1 specifies <cstdint>, and in 17.4.1p2 this refers back to the C
standards section 7.20 where the size-specific integer types are defined
to have no padding bits.
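As a rough compile-time check of the "no padding bits" property, here is
a minimal sketch. The names object_bits, value_bits and padding_bits are
made up for illustration; the check only confirms what a given
implementation does, it is not a substitute for the standard text.

```cpp
#include <climits>
#include <cstdint>
#include <limits>

// Bits in the object representation of uint64_t.
constexpr int object_bits = static_cast<int>(sizeof(std::uint64_t) * CHAR_BIT);

// Value bits: for an unsigned type, numeric_limits<T>::digits counts them all.
constexpr int value_bits = std::numeric_limits<std::uint64_t>::digits;

// Whatever is left over would be padding.
constexpr int padding_bits = object_bits - value_bits;

static_assert(value_bits == 64, "uint64_t must have exactly 64 value bits");
static_assert(padding_bits == 0, "uint64_t must have no padding bits");
```

If either static_assert fired, the implementation would not be providing
a conforming uint64_t in the first place.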
So again - please tell me why you think memcpy'ing data into a uint64_t
may have dangerous or poorly defined behaviour.
The ball is in your court.
> Not that it matters for what I've written, but it matters for the silly
> argument that you offered, quoted above, and that you now argue against,
> plus, there is the thing about being Wrong on the internet, not to
> mention Doubly Wrong: just on principle one should not let that pass.
>
>
>> In case I am missing something, please tell me where you see any
>> possible dangerous or not fully defined behaviour even for the more
>> general case :
>>
>> #include <string.h>
>> #include <stdint.h>
>>
>> uint64_t read64bits(const void * p) {
>>     uint64_t x;
>>     memcpy((void*) &x, p, sizeof(uint64_t));
>>     return x;
>> }
>
> Now you're arguing against yourself again.
>
> That means that whatever I respond, I can expect a random direction answer.
>
> Anyway:
>
> * The C style cast there is both unnecessary and dangerous, because it
> can cast away const, and is difficult to grep, so it's ungood code.
I am asking what you think is /dangerous/ or not fully defined here -
not whether there might be an unnecessary cast or whether the style is
"perfect" according to your questionable judgement.
> * You're wrong about formally no padding bits for C++, so in principle
> that function can produce an invalid `uint64_t` with trap representation;
> that's UB -- except that that's in principle, not in practice.
See above for the chapter and verse you requested.
> * If the pointer `p` is invalid, or is a nullpointer, or doesn't point
> to at least sizeof(uint64_t) contiguous bytes of readable memory, then
> that's UB, and it's UB both in principle and in practice.
>
See below for the explicit assumption that p points to sufficient
readable data.
> If you had been a certain other old timer in this group, then it would
> also be relevant that there's possible UB due to stack overflow.
>
> Which by his (lack of) logic leads to the conclusion that all C++
> programs have UB.
>
I hope we can agree to assume that kind of thing is outside the scope of
the language!
>
>> We can assume that "p" points to data of some type of at least 64 bits
>> in size. I would like to hear of any potential issues in C or C++
>> (hence the cross-language code).
>
> Oh, cross language, sorry.
No problem. As long as we avoid crass language :-)
>
> For the case of C /I believe/ that there's a formal guarantee of no
> padding bits, so then only the last point above matters wrt. UB.
>
Ah ha - now we are getting somewhere!
I suspect the key point is that the C++ standard does not explicitly
define uint64_t to have no padding bits, but it refers to the C standard
which /does/ have such guarantees. C++ inherits them from C.
>
>>> I'm not sure if you're addressing only the formal here.
>>
>> No. It is a general principle. Some people /do/ believe that
>> "separate compilation" creates magical barriers that limit a
>> compiler's ability to see the relationships between code sections, and
>> therefore its ability to "optimise using assumptions about defined and
>> undefined behaviours", and that this means some kinds of undefined
>> behaviours become defined by moving code to a different file or
>> disabling optimisation. They are wrong to believe this. They may
>> manage to write code that works when they test it, but it will be
>> fragile - the code still has exactly the same undefined behaviours,
>> and these may manifest as bugs in the future.
>>
>>>
>>> However, if you intended this to also apply to the in-practice, then
>>> the burden of proof is on you that some computer exists where
>>> `uint64_t` has bits that do not participate in the value
>>> representation, so that they /can/ be invalid: as far as I know there
>>> is no such computer.
>>>
>>
>> No, not at all. It is only /you/ that has suggested, by claiming the
>> use of memcpy is not fully defined, that uint64_t may hypothetically
>> contain padding bits. (See earlier in my reply.) I know it can't, so
>> that is not the issue.
>
> Now you're claiming that your argument was my argument. Jeez.
>
Please re-read your reply to Bonita. You were advocating using
separately compiled code so that you can use assignment instead of
memcpy - claiming, bizarrely, that assignment was "safe" and memcpy
"unsafe" despite insisting on separately compiled functions for your
"solution" to work "in practice".
And then you claimed that the memcpy version was not well defined.
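For concreteness, here is a minimal sketch of the memcpy version applied
to the "perhaps it points to a double" case. It assumes a 64-bit IEEE 754
double stored with the same byte order as uint64_t (true on mainstream
platforms, and the first two assumptions are checked by the
static_asserts). bits_of is a hypothetical helper name, not something
from earlier in the thread.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Assumptions of this sketch, checked at compile time.
static_assert(sizeof(double) == sizeof(std::uint64_t),
              "sketch assumes a 64-bit double");
static_assert(std::numeric_limits<double>::is_iec559,
              "sketch assumes IEEE 754 doubles");

// Same idea as the function quoted above: copy 8 bytes into a uint64_t.
// The caller must ensure p points to at least 8 readable bytes.
std::uint64_t read64bits(const void* p) {
    std::uint64_t x;
    std::memcpy(&x, p, sizeof x);
    return x;
}

// Hypothetical helper: view the object representation of a double
// as a uint64_t value. Every resulting bit pattern is a valid uint64_t.
std::uint64_t bits_of(double d) {
    return read64bits(&d);
}
```

Under those assumptions bits_of(1.0) is 0x3FF0000000000000, and reading
back a uint64_t through read64bits returns the same value. No cast to
"const uint64_t*" and no dereference of a mistyped pointer is involved.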