memcpy of zero bytes


martin...@yahoo.co.uk

Nov 12, 2013, 6:40:33 AM
Consider the following code:

#include <string.h>
int main(void)
{
    char source[4] = { 0, 1, 2, 3 };
    char dest[4];
    memcpy( dest, source+4, 0 );
    return 0;
}

Does the memcpy invoke undefined behaviour or not?

Obviously *in practice* it is going to work (*), but I am
trying to decide whether the C standard blesses the code
or not.

*: Until an optimizer decides it doesn't need to bother, of
course.
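
For context, a call of this shape tends to come up when copying the remainder of a buffer from some offset and the offset happens to equal the length. A minimal sketch, assuming a helper named copy_tail (a hypothetical name invented here, not part of the original post):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src[offset .. len-1] into dest and return
 * the number of bytes copied. When offset == len the call becomes
 * memcpy(dest, src + len, 0), which is the case asked about. */
static size_t copy_tail(char *dest, const char *src, size_t len, size_t offset)
{
    assert(offset <= len);   /* src + offset may reach one past the end, no further */
    size_t remaining = len - offset;
    memcpy(dest, src + offset, remaining);
    return remaining;
}
```

With a 4-byte source, copy_tail(dest, source, 4, 4) reproduces the call in the question without any special-casing in the caller.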

James Kuyper

Nov 12, 2013, 7:33:43 AM
On 11/12/2013 06:40 AM, martin...@yahoo.co.uk wrote:
> Consider the following code:
>
> #include <string.h>
> int main(void)
> {
>     char source[4] = { 0, 1, 2, 3 };
>     char dest[4];
>     memcpy( dest, source+4, 0 );
>     return 0;
> }
>
> Does the memcpy invoke undefined behaviour or not?

"Where an argument declared as size_t n specifies the length of the
array for a function, n can have the value zero on a call to that
function. Unless explicitly stated otherwise in the description of a
particular function in this subclause, pointer arguments on such a call
shall still have valid values, as described in 7.1.4. On such a call,
... a function that copies characters copies zero characters." 7.24.1p2
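
The elided part of that paragraph covers searching and comparing functions as well: with n equal to zero, a function that locates a character finds no occurrence and a function that compares two sequences returns zero. A small sketch of those three cases, using only pointers to real objects so that pointer validity is not in question (check_zero_length_calls is a hypothetical name used for illustration):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical demo of zero-length calls on valid object pointers. */
static void check_zero_length_calls(void)
{
    char a[4] = { 'a', 'b', 'c', 'd' };
    char b[4] = { 'w', 'x', 'y', 'z' };

    assert(memcmp(a, b, 0) == 0);        /* compares zero characters: returns zero */
    assert(memchr(a, 'a', 0) == NULL);   /* searches zero characters: no occurrence */

    memcpy(b, a, 0);                     /* copies zero characters */
    assert(b[0] == 'w');                 /* destination is unchanged */
}
```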
--
James Kuyper

martin...@yahoo.co.uk

Nov 12, 2013, 8:04:51 AM
Yes, I'd already got that far. source+4 is a valid pointer (but not
dereferenceable), so it looks OK. However a colleague points out that
7.21.2.1 says "from the object pointed to by s2", and source+4 doesn't
point at an object - on that basis, it's undefined.

James Kuyper

Nov 12, 2013, 8:32:35 AM
The phrase that immediately precedes that one says "The memcpy function
copies n characters ...". If no characters are copied, then it only
matters whether the pointer is valid, as described in 7.1.4 - it doesn't
matter whether or not there's an object from which characters could be
copied, if characters were to be copied - because they are not to be copied.

The clause "such as a value outside the domain of the function" from
7.1.4p1 combined with 7.21.2.1p1's specification that s2 points to an
object, might be interpreted as implying that source+4 is outside of
memcpy()'s domain, and therefore invalid, but I don't think that's a
valid interpretation. The description of memcpy() contains no direct
specification of the valid domain for its arguments. A specification
that could be inferred from the description of memcpy() is that
(char*)s2 + i, where i has the type size_t, must be dereferenceable for
all i such that i<n; but for n==0, there is no such value of i, and
therefore no basis for rejecting source+4 as being outside of memcpy()'s
domain.
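
The "no such value of i" argument can be made concrete with a byte-by-byte copy loop. This is only a sketch: sketch_memcpy and the demo function are hypothetical names, and real library implementations of memcpy() are usually far more elaborate.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical byte-by-byte copy; not any real library's memcpy. */
static void *sketch_memcpy(void *s1, const void *s2, size_t n)
{
    unsigned char *d = s1;
    const unsigned char *s = s2;

    /* s[i] is accessed only for i < n. With n == 0 the body never runs,
     * so s2 is never dereferenced. */
    for (size_t i = 0; i < n; i++)
        d[i] = s[i];

    return s1;
}

/* Returns 1 if a zero-length copy from one past the end leaves dest untouched. */
static int zero_copy_leaves_dest_alone(void)
{
    char source[4] = { 0, 1, 2, 3 };
    char dest[4] = { 9, 9, 9, 9 };

    void *r = sketch_memcpy(dest, source + 4, 0);
    assert(r == dest);
    return dest[0] == 9 && dest[3] == 9;
}
```

In this form the only requirement left on s2 when n is zero is that the pointer value itself can be passed around, which source+4 satisfies.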
--
James Kuyper

Tim Rentsch

Nov 12, 2013, 9:14:51 AM
IMO the behavior here is meant to be defined, not undefined.
Certainly the Standard could be written so as to express this
more clearly, but of the two choices defined behavior seems
less strained. In support here are two points:

1. As discussed in a recent comp.lang.c thread, the Standard
isn't always careful with how it uses the term "object". It's
easy to believe the "non-object" case of one past the end of
an array was simply overlooked when writing 7.21.2.1, or maybe
unimportant to identify specifically because of 7.21.1 (all
references are to N1256).

2. The discussion in 7.21.1 p2 refers to 7.1.4 for what arguments
constitute valid values. 7.1.4 p1 says

    If a function argument is described as being an array, the
    pointer actually passed to the function shall have a value
    such that all address computations and accesses to objects
    (that would be valid if the pointer did point to the first
    element of such an array) are in fact valid.

As the parameters for the section 7.21 functions are described as
being arrays, and since the particular argument does have a value
with the property that "all address computations and accesses to
objects [needed to carry out the stated semantics] are in fact
valid", this provision suggests that the argument in question is
okay. Since 7.1.4 is more specific than 7.21.2.1, it seems
appropriate to take the conditions in 7.1.4 as controlling the
definedness in this case.
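
The flip side of this reading is that a pointer which is not valid in the 7.1.4 sense is not rescued by n == 0 under the C99/C11 wording; 7.1.4 p1 lists a null pointer among its examples of invalid values. A common defensive pattern is therefore to skip the library call when there is nothing to copy. A sketch, assuming a wrapper named copy_bytes (a hypothetical name, not something from the Standard or from this thread):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: returns early when n == 0, so a null src or dest
 * paired with a zero count never reaches memcpy. */
static void *copy_bytes(void *dest, const void *src, size_t n)
{
    if (n == 0)
        return dest;

    assert(dest != NULL && src != NULL);
    return memcpy(dest, src, n);
}
```

A one-past-the-end pointer such as source+4 does not need this guard under the interpretation given above; the wrapper mainly helps callers whose buffers may legitimately be null when empty.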

Keith Thompson

Nov 12, 2013, 10:52:15 AM
James Kuyper <james...@verizon.net> writes:
[...]
> The clause "such as a value outside the domain of the function" from
> 7.1.4p1 combined with 7.21.2.1p1's specification that s2 points to an
> object, might be interpreted as implying that source+4 is outside of
> memcpy()'s domain, and therefore invalid, but I don't think that's a
> valid interpretation.
[...]

It's 7.21.2.1p2 (not p1) in C99 and N1256. It's 7.24.2.1p2 in C11
and N1570. Same wording in both.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Manlio Perillo

Nov 13, 2013, 5:34:32 AM
On Tue, 12 Nov 2013 03:40:33 -0800, martinfrompi wrote:

> Consider the following code:
>
> #include <string.h>
> int main(void)
> {
>     char source[4] = { 0, 1, 2, 3 };
>     char dest[4];
>     memcpy( dest, source+4, 0 );
>     return 0;
> }
>
> Does the memcpy invoke undefined behaviour or not?
>

It is not memcpy but the expression source + n that may invoke undefined
behaviour. In this case source + 4 is well defined, although you cannot
apply the `*` operator to it.
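
A short sketch of what is and is not allowed with such a pointer (sum_with_end_pointer is a hypothetical name used only for this illustration):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical demo: the one-past-the-end pointer is used as a loop bound
 * and in pointer subtraction, but is never dereferenced. */
static int sum_with_end_pointer(void)
{
    char source[4] = { 0, 1, 2, 3 };
    const char *end = source + 4;   /* valid: one past the last element */
    /* Evaluating source + 5 would be undefined, even without dereferencing it. */
    int sum = 0;

    for (const char *p = source; p != end; ++p)
        sum += *p;                  /* p is strictly before end whenever it is dereferenced */

    assert(end - source == 4);      /* subtraction involving end is also defined */
    return sum;
}
```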

> [...]


Regards Manlio

Hallvard Breien Furuseth

Nov 17, 2013, 6:47:08 AM
James Kuyper writes:
> On 11/12/2013 08:04 AM, martin...@yahoo.co.uk wrote:
>> On Tuesday, November 12, 2013 12:33:43 PM UTC, James Kuyper wrote:
>>> On 11/12/2013 06:40 AM, martin...@yahoo.co.uk wrote:
>>>> Consider the following code:
>>>>
>>>> #include <string.h>
>>>> int main(void)
>>>> {
>>>>     char source[4] = { 0, 1, 2, 3 };
>>>>     char dest[4];
>>>>     memcpy( dest, source+4, 0 );
>>>>     return 0;
>>>> }
>>>>
>>>> Does the memcpy invoke undefined behaviour or not?
>>>
>>> (....)
>> 7.21.2.1 says "from the object pointed to by s2", and source+4 doesn't
>> point at an object - on that basis, it's undefined.
>
> The phrase that immediately precedes that one says "The memcpy function
> copies n characters ...". If no characters are copied, then it only
> matters whether the pointer is valid, as described in 7.1.4 - it doesn't
> matter whether or not there's an object from which characters could be
> copied, if characters were to be copied - because they are not to be copied.
>
> The clause "such as a value outside the domain of the function" from
> 7.1.4p1 combined with 7.21.2.1p1's specification that s2 points to an
> object, might be interpreted as implying that source+4 is outside of
> memcpy()'s domain, and therefore invalid, but (...snip...)

That's backwards. memcpy refers to "the object pointed to by s2", so
by default that means an object is there. There's no need to combine
that phrase with anything to conclude this. We need something which
overrides the default meaning. 7.1.4p1's wording doesn't, since it
doesn't cover this case - though it may intend to.

If your memcpy reads nothing from s2 when copying 0 bytes, that could be
considered an implementation detail unless something says it must read
nothing, as in "the argument must be treated as volatile const void*".
Otherwise memcpy() could read the first byte (or more likely the word
containing that byte) before checking the length, to bring it into
cache quickly. It may want to treat the beginning and end specially
anyway if it tweaks the addresses and shifts input data around so it
can do aligned memory accesses.
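
To illustrate the kind of implementation meant here, the following is a sketch of a copy routine that touches the source before checking the length. The name eager_memcpy is hypothetical, and no claim is made that any real library behaves this way. The demo only passes pointers to real objects; calling eager_memcpy(dest, source + 4, 0) would read source[4], which lies outside the array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical copy routine that reads the first source byte up front,
 * before looking at n. */
static void *eager_memcpy(void *s1, const void *s2, size_t n)
{
    unsigned char *d = s1;
    const volatile unsigned char *s = s2;

    unsigned char first = s[0];     /* this read happens even when n == 0 */

    if (n == 0)
        return s1;

    d[0] = first;
    for (size_t i = 1; i < n; i++)
        d[i] = s[i];
    return s1;
}

/* Returns 1 if the routine copies correctly when s2 points at an object. */
static int eager_demo(void)
{
    char source[4] = { 0, 1, 2, 3 };
    char dest[4] = { 9, 9, 9, 9 };

    eager_memcpy(dest, source, 0);          /* fine: source[0] exists */
    assert(dest[0] == 9);

    eager_memcpy(dest, source + 1, 3);
    return dest[0] == 1 && dest[1] == 2 && dest[2] == 3 && dest[3] == 9;
}
```

Whether the Standard permits a library to be written like this is exactly the point in dispute.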

I hope I'm just being overly pedantic and this case is safe, but I'd
like to see a clarification in the Standard.

--
Hallvard

James Kuyper

Nov 17, 2013, 2:52:48 PM
> that phrase with anything to conclude this. ...

True, but you do need to combine it with something else to conclude that
the absence of an object is a problem. There's no constraint saying that
it must point at an object. It's certainly not a syntax error. There's
no explicit statement that the behavior is undefined. If the count were
non-zero, and the pointer did not point at an object, the behavior would
be implicitly undefined "... by the omission of any explicit definition
of behavior", since the only
explicit description that would otherwise be applicable becomes
meaningless if there's no object in the pointed-at location. However,
when the count is zero, the defined behavior is very explicitly a no-op,
which makes just as much sense whether there is or is not an object in
the location pointed at.

> ... We need something which
> overrides the default meaning. 7.1.4p1's wording doesn't, since it
> doesn't cover this case - though it may intend to.
> If your memcpy reads nothing from s2 when copying 0 bytes, that could be
> considered an implementation detail unless something says it must read
> nothing, as in "the argument must be treated as volatile const void*".
> Otherwise memcpy() could read the first byte (or more likely the word
> containing that byte) before checking the length, ...

Under the as-if rule, the compiler is always allowed to insert a
spurious read from memory anywhere it likes, so long as it doesn't have
any consequences that matter, such as violating protected memory.

However, such a read would be fully spurious in this context - the only
thing that gives memcpy() permission to read any memory is the
description that gives it permission to copy from that memory; and
7.21.1p2 explicitly specifies that when the count is zero, memcpy() has
no permission to copy; therefore, it has no permission to read. That
seems an obvious implication from the main description of memcpy(), but
without 7.21.1p2, it could have been concluded from the description that
a count of 0 was an invalid argument value.

If you think that memcpy() does have permission to read from that
memory, even in contexts where such a read would be problematic, and
could therefore not be justified by the as-if rule, please explain why.
--
James Kuyper

Tim Rentsch

Nov 19, 2013, 8:17:43 PM
> something which overrides the default meaning. [snip elaboration]

Clearly 7.21.1 p2 is meant to preempt each function's defined
semantics in the case that n == 0. Therefore the description in
7.21.2.1 p2 (not p1) is irrelevant. 7.21.1 p2 does require that
any pointer arguments "still have valid values, as described in
7.1.4". But the value source+4 does satisfy the description of 7.1.4,
as I explained in my other posting. Since this condition is
satisfied, and since 7.21.1 p2 takes precedence when n == 0, the
behavior is defined in this case, given by the last sentence of
that paragraph. 7.21.2.1 p2 doesn't enter into it.

(This is section 7.21 in n1256, corresponding to 7.24 in n1570.)