On 9/11/20 2:57 PM, Chris Vine wrote:
> On Fri, 11 Sep 2020 11:44:54 -0400
> James Kuyper <james...@alumni.caltech.edu> wrote:
>> On 9/11/20 6:17 AM, Chris Vine wrote:
...
>> Do you mean "second example"? Tracing back, only the second example
>> involves pointer arithmetic (i++).
>> That is just as much a problem with C as with C++. For pointer values,
>> they both define i++ as equivalent to i=i+1, except that 'i' is
>> evaluated only once, and they both define addition of integer values to
>> pointer values in terms of a containing array; if there's no such array,
>> there's no applicable definition of what the addition means.
>
> Yes I did mean the second example. The point about C11 is that
> §6.5.6/8 deals with pointer arithmetic concerning arrays and the words
> "otherwise, the behavior is undefined" seem to have been read in the
> context of arrays only and not applying to raw malloc'ed memory. That
> reading is impossible with C++20's [expr.add]/4. ...
I don't see that as being the case. It says:
"When an expression J that has integral type is added to or subtracted
from an expression P of pointer type, the result has the type of P.
(4.1) — If P evaluates to a null pointer value and J evaluates to 0, the
result is a null pointer value.
(4.2) — Otherwise, if P points to an array element i of an array object
x with n elements (9.3.3.4), the expressions P + J and J + P (where J
has the value j) point to the (possibly-hypothetical) array element i +
j of x if 0 ≤ i + j ≤ n and the expression P - J points to the
(possibly-hypothetical) array element i − j of x if 0 ≤ i − j ≤ n.
(4.3) — Otherwise, the behavior is undefined." (7.6.6p4).
Just like the corresponding words in the C standard, paragraph 4.2 is
predicated on P pointing at the i'th member of an array. If there is no
such array, then 4.3 applies, and the behavior is undefined.
If C++2020 handles this issue better than C does, the different wording
that makes that true must lie in some other part of the document.
> ... The issue with C is
> also that the "effective type" of raw memory is normally the type of
> the first object constructed in it (§6.5/6 of C11), so in a sense
> malloc'ed memory becomes an array in C by being treated as an array.
The fact that this isn't normally the case is precisely what's
problematic in C. When an lvalue of a given type is used to write to
dynamically allocated memory, that write gives that memory the effective
type of that lvalue. If you only write one element of an array at a
time, each element of that array acquires an effective type which is the
array's element type, but nothing acquires an effective type that is the
array type.
There are ways around this: you can write an entire struct object to the
memory using a single assignment expression, in which case that object
acquires that struct type as its effective type, and in particular, if
one of the members is an array, that member's memory acquires that array
type.
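To make that first workaround concrete, here is a minimal sketch. The
type struct holder and both function names are made up for illustration;
the claim about the effective type is my reading of 6.5p6, not something
the code itself can demonstrate.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical wrapper type, for illustration only. */
struct holder {
    int values[4];
};

/* Allocates a struct holder and stores into it with ONE assignment of
   the whole struct. On my reading of C11 6.5p6, that store gives the
   allocated memory the effective type struct holder, so the member
   'values' is a real int[4]. Returns NULL if the allocation fails. */
struct holder *make_holder(void)
{
    struct holder *h = malloc(sizeof *h);
    if (h == NULL)
        return NULL;

    *h = (struct holder){ .values = { 10, 20, 30, 40 } };
    return h;
}

/* Sums the array member. The pointer arithmetic here stays within
   (or one past the end of) the int[4] member. */
int sum_holder(const struct holder *h)
{
    assert(h != NULL);

    int total = 0;
    for (const int *p = h->values; p != h->values + 4; ++p)
        total += *p;
    return total;
}
```

A caller would use make_holder(), check the result for NULL, pass it to
sum_holder(), and free() it afterwards.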
Also, when you copy an object into dynamically allocated memory using
memcpy() or memmove(), or otherwise copy it as an array of character
type, there's a special rule that says that the memory acquires the
effective type of that object. As a special case of this, an array
object can be copied into the memory, giving it an array type.
Lots of C code treats dynamically allocated memory as if it were an
array, without either of those two special cases applying. Therefore, a
strict reading of the rules of pointer arithmetic makes access to any
element of such an array other than the first element undefined behavior.
> (Somewhat akin to the new C++ implicit-lifetime types.)
6.7.2p12 has code that contains a comment indicating that
X *p = (X*)std::malloc(sizeof(struct X));
implicitly creates an object of type X in the allocated memory.
20.10.12p5 appears to be the clause that actually supports that comment.
However, it's not clear to me from 20.10.12p5 that
X *p = (X*)std::malloc(n*sizeof(struct X));
would implicitly create an n-element array. I certainly wouldn't object
to that being the case, but it doesn't clearly say so.
> At any rate I doubt you will find any C programmer, or member of the C
> standard committee, who thinks that:
>
> int* i = (int*) malloc(2 * sizeof(int));
> i++;
>
> is undefined behaviour because of §6.5.6/8, as in C it is the only way
> of getting to the second element in advance of constructing the first
> one.
I think it is undefined behavior. I don't think it was intended to be
undefined behavior, so this represents a defect in the standard, but I
don't see any way to justify saying that there's an array that can be
used to give meaning to such pointer arithmetic.
> If N4860 is the definitive text for C++20, then I think this construct
> now has defined behaviour in C++20 because an array would implicitly be
> taken to arise for the case of an array of trivial types, given that
> array types are now implicit-lifetime types. I will need to consider
> revised [intro.object]/10 further on this.
Could you explain what that section says that makes you feel that way?
>>> ... You are right that an object that is not
>>> an array element whose address is taken by the unary & operator is
>>> considered to belong to an array with one element of type T for the
>>> purposes of pointer arithmetic on the value returned by that operator:
>>> in such a case you can add 1 to the address of the object.
>>>
>>> That doesn't apply here,
>>
>> In n3797.pdf, the relevant wording is "For the purposes of these
>> operators, a pointer to a nonarray object behaves the same as a pointer
>> to the first element of an array of length one with the type of the
>> object as its element type." (5.7p4), which seems perfectly applicable.
>> That wording does not include the part from your explanation requiring
>> that it be the result of the unary & operator; all that's required is
>> that it be "a pointer to a non-array object". How is the wording in
>> C++2020 different, to render that clause inapplicable?
>
> My meaning was that it was not applicable to my example, which did not
> involve construction of an object.
Now that I have a copy of n4860.pdf, I see that there's a much more
serious obstacle to having that clause apply: there is no such clause in
C++2020. The location in n4860.pdf that corresponds to 5.7 in n3797.pdf
is 7.6.6, but it contains no such wording, nor does similar wording
appear anywhere else. Given "int i;", is there no longer any meaning
defined for 1 + &i, or have I missed something that defines it? I've
just started reading this version of the standard - I wouldn't be
surprised if I missed something.