memcpy with NULL pointer

Steve Keller

Oct 29, 2021, 5:09:04 AM

I wonder whether calling the mem* and str* functions with a NULL
pointer has defined behavior if the count parameter is also 0, like in
this call

#include <string.h>

...
memcpy(NULL, NULL, 0);

The same for memmove(NULL, NULL, 0), memcmp(NULL, NULL, 0),
strncpy(dst, NULL, 0), strncat(dst, NULL, 0), strncmp(NULL, NULL, 0).

It seems the standard doesn't say anything about this.

Since no memory access via the NULL pointer is done I'd assume this
should not result in undefined behavior.

Steve

Tim Rentsch

Oct 29, 2021, 6:41:06 AM

Steve Keller <keller...@gmx.de> writes:

> I wonder whether calling the mem* and str* functions with a NULL
> pointer has defined behavior if the count parameter is also 0,
> like in this call
>
> #include <string.h>
>
> ...
> memcpy(NULL, NULL, 0);
>
> The same for memmove(NULL, NULL, 0), memcmp(NULL, NULL, 0),
> strncpy(dst, NULL, 0), strncat(dst, NULL, 0), strncmp(NULL, NULL, 0).

They are all undefined behavior.

> It seems the standard doesn't say anything about this.

It does. Paragraph 2 of 7.24.1, "String function conventions", says
this:

    Where an argument declared as size_t n specifies the length
    of the array for a function, n can have the value zero on a
    call to that function. Unless explicitly stated otherwise in
    the description of a particular function in this subclause,
    pointer arguments on such a call shall still have valid
    values, as described in 7.1.4. [...]

A null pointer is not among the set of valid values. Refer to
section 7.1.4, paragraph 1, for details. (I did a quick check of
the six functions you mentioned and did not see any indication
that they are exceptions to the above rule. Of course, I cannot
promise that such quick checks are 100% reliable, so please feel
free to double check me on that.)

> Since no memory access via the NULL pointer is done I'd assume
> this should not result in undefined behavior.

That's a plausible assumption but not one that the C standard
supports.

Steve Keller

Oct 29, 2021, 5:06:22 PM

Tim Rentsch <tr.1...@z991.linuxsc.com> writes:

> > The same for memmove(NULL, NULL, 0), memcmp(NULL, NULL, 0),
> > strncpy(dst, NULL, 0), strncat(dst, NULL, 0), strncmp(NULL, NULL, 0).
>
> They are all undefined behavior.

:-(

> > It seems the standard doesn't say anything about this.
>
> It does. Paragraph 2 of 7.24.1, "String function conventions", says
> this:
>
> Where an argument declared as size_t n specifies the length
> of the array for a function, n can have the value zero on a
> call to that function. Unless explicitly stated otherwise in
> the description of a particular function in this subclause,
> pointer arguments on such a call shall still have valid
> values, as described in 7.1.4. [...]

Hm, I've overseen that paragraph and the reference to 7.1.4 in "String
handling" (7.21 in my C9X draft, 7.24 in the final standard?).

> A null pointer is not among the set of valid values. Refer to
> section 7.1.4, paragraph 1, for details. (I did a quick check of
> the six functions you mentioned and did not see any indication
> that they are exceptions to the above rule. Of course, I cannot
> promise that such quick checks are 100% reliable, so please feel
> free to double check me on that.)
>
> > Since no memory access via the NULL pointer is done I'd assume
> > this should not result in undefined behavior.

I've read the description of all of these functions in "String
handling <string.h>" and found no exception, so you're right. As I
have overseen the above I came to my false assumption.

> That's a plausible assumption but not one that the C standard
> supports.

That's a pity. I really wish the standard wouldn't leave these and
things like malloc(0) undefined or implementation-defined. This would
avoid quite a few checks on corner cases, like in

void func(int n) {
    char *ptr, tmp[n];
    if (n > 0 && !(ptr = malloc(n))) {
        // handle error
    }
    ...
    int cmp_res = (n == 0) ? 0 : memcmp(ptr, ..., n);
    if (n > 0)
        memcpy(tmp, ptr, n);
    ...
}

BTW, does a VLA like char tmp[n]; with value zero for n have defined
behavior?

Steve

Keith Thompson

Oct 29, 2021, 5:23:30 PM

Steve Keller <keller...@gmx.de> writes:
[...]

> BTW, has a VLA like char tmp[n]; with value zero for n defined
> behavior?

No, the behavior is undefined if n is not positive.

N2176 (C17 draft) 6.7.6.2p5:

    If the size is an expression that is not an integer constant expression:
    if it occurs in a declaration at function prototype scope, it is treated
    as if it were replaced by * ; otherwise, each time it is evaluated it
    shall have a value greater than zero.

That "shall" is outside a constraint, so violating it has undefined behavior.
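
A minimal sketch of how a caller might stay clear of that rule, assuming a C implementation that supports VLAs. The helper name sum_via_vla is hypothetical and only for illustration: the function returns before the VLA declaration is ever reached when n is zero.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies n bytes from src into a temporary VLA and
 * returns the sum of those bytes. The VLA declaration is only reached
 * when n > 0, because a VLA size that evaluates to zero is undefined. */
static unsigned long sum_via_vla(const unsigned char *src, size_t n)
{
    unsigned long sum = 0;

    if (n == 0)
        return 0;               /* never evaluate tmp[n] with n == 0 */

    unsigned char tmp[n];       /* n > 0 is guaranteed here */
    memcpy(tmp, src, n);        /* n > 0, src and tmp are valid */
    for (size_t i = 0; i < n; i++)
        sum += tmp[i];
    return sum;
}
```

The early return also keeps the memcpy call away from the n == 0 case, so both corner cases are handled by one check.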

--
Keith Thompson (The_Other_Keith) Keith.S.T...@gmail.com
Working, but not speaking, for Philips
void Void(void) { Void(); } /* The recursive call of the void */

Scott Lurndal

Oct 29, 2021, 5:48:17 PM

Steve Keller <keller...@gmx.de> writes:
>Tim Rentsch <tr.1...@z991.linuxsc.com> writes:
>
>> > The same for memmove(NULL, NULL, 0), memcmp(NULL, NULL, 0),
>> > strncpy(dst, NULL, 0), strncat(dst, NULL, 0), strncmp(NULL, NULL, 0).
>>
>> They are all undefined behavior.
>
>:-(

We actually looked into adding EFAULT as an error to the mem* and str*
functions back in the 90's. Ultimately the overhead to validate
the pointers was too great (absent dedicated hardware support such
as CHERI[*]) and the proposal was discarded.

[*] https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/

Tim Rentsch

Oct 29, 2021, 9:44:13 PM

Steve Keller <keller...@gmx.de> writes:

> Tim Rentsch <tr.1...@z991.linuxsc.com> writes:
>
>>> The same for memmove(NULL, NULL, 0), memcmp(NULL, NULL, 0),
>>> strncpy(dst, NULL, 0), strncat(dst, NULL, 0), strncmp(NULL, NULL, 0).
>>
>> They are all undefined behavior.
>
> :-(
>
>>> It seems the standard doesn't say anything about this.
>>
>> It does. Paragraph 2 of 7.24.1, "String function conventions", says
>> this:
>>
>> Where an argument declared as size_t n specifies the length
>> of the array for a function, n can have the value zero on a
>> call to that function. Unless explicitly stated otherwise in
>> the description of a particular function in this subclause,
>> pointer arguments on such a call shall still have valid
>> values, as described in 7.1.4. [...]
>
> Hm, I've overseen that paragraph and the reference to 7.1.4 in
> "String handling" (7.21 in my C9X draft, 7.24 in the final
> standard?).

Do you mean overlooked rather than overseen?

>> A null pointer is not among the set of valid values. Refer to
>> section 7.1.4, paragraph 1, for details. (I did a quick check of
>> the six functions you mentioned and did not see any indication
>> that they are exceptions to the above rule. Of course, I cannot
>> promise that such quick checks are 100% reliable, so please feel
>> free to double check me on that.)
>>
>>> Since no memory access via the NULL pointer is done I'd assume
>>> this should not result in undefined behavior.
>
> I've read the description of all of these functions in "String
> handling <string.h>" and found no exception, so you're right. As I
> have overseen the above I came to my false assumption.
>
>> That's a plausible assumption but not one that the C standard
>> supports.
>
> That's a pity. I really wish the standard wouldn't leave these
> and things like malloc(0) undefined or implementation-defined.

> [...]

Rather than wishing the C standard would be different than it is,
avoid the problem by using wrapper functions (disclaimer: not
compiled):

static inline void *
safer_memcpy( void *d, const void *s, size_t n ){
    return n ? memcpy( d, s, n ) : d;
}

static inline int
safer_memcmp( const void *s1, const void *s2, size_t n ){
    return n ? memcmp( s1, s2, n ) : 0;
}

...

and move on to the next trouble spot.
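
The same pattern can be extended to the other functions from the original question. This is only a sketch: the names safer_memmove and safer_strncmp are hypothetical, chosen to match safer_memcpy above, and the choice to return d (or 0) when n is zero is an assumption about what a caller would want.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: skips the library call entirely when n == 0,
 * so the pointer arguments are never handed to memmove in that case. */
static inline void *safer_memmove(void *d, const void *s, size_t n)
{
    return n ? memmove(d, s, n) : d;
}

/* Hypothetical wrapper: comparing zero characters yields "equal". */
static inline int safer_strncmp(const char *s1, const char *s2, size_t n)
{
    return n ? strncmp(s1, s2, n) : 0;
}
```

With these, a call such as safer_memmove(NULL, NULL, 0) never reaches the library function at all, so the requirement for valid pointers does not come into play.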

Guillaume

Oct 30, 2021, 12:25:35 PM

Yes of course. It's straightforward and this way, you get the checks if
you want/need them instead of imposing them on everyone regardless of
the use case.

Those std functions should be considered "low-level" functions.
I for one frankly don't care if any of the above examples, like
"memcmp(NULL, NULL, 0)" are UB. If you are bound to call those with such
parameters, there's something wrong with your coding style IMHO. And
now, in particular cases for which it would make sense (because it could
make your code more elegant or something), then you can use wrappers as
suggested above.

Don't get me wrong - I rather like functions that do exhaustive
parameter validation in general. But I can also recognize tradeoffs to
this rule when performance could be an issue, for instance for this kind
of "low-level" functions.

If there was something to complain about, possibly, that would be some
inconsistency in how C std functions check their parameters, rather than
a lack of checks. For instance, it's OK to call free() with a NULL
pointer, but using a NULL pointer in some other C std functions
expecting a pointer is UB. That is kind of WTF to me. But we're usually
fine if we just RTFM.

Peter 'Shaggy' Haywood

Nov 3, 2021, 5:08:34 AM

Groovy hepcat Tim Rentsch was jivin' in comp.lang.c on Fri, 29 Oct 2021
09:40 pm. It's a cool scene! Dig it.

> Steve Keller <keller...@gmx.de> writes:
>
>> I wonder whether calling the mem* and str* functions with a NULL
>> pointer has defined behavior if the count parameter is also 0,
>> like in this call

[Snip.]

>> Since no memory access via the NULL pointer is done I'd assume
>> this should not result in undefined behavior.
>
> That's a plausible assumption but not one that the C standard
> supports.

But, of course, undefined behaviour could be perfectly benign. Just
don't count on it in every case.

--

----- Dig the NEW and IMPROVED news sig!! -----

-------------- Shaggy was here! ---------------
Ain't I'm a dawg!!

James Kuyper

Nov 3, 2021, 10:52:22 AM

On 11/2/21 8:24 PM, Peter 'Shaggy' Haywood wrote:
> Groovy hepcat Tim Rentsch was jivin' in comp.lang.c on Fri, 29 Oct 2021
> 09:40 pm. It's a cool scene! Dig it.
>
>> Steve Keller <keller...@gmx.de> writes:
>>
>>> I wonder whether calling the mem* and str* functions with a NULL
>>> pointer has defined behavior if the count parameter is also 0,
>>> like in this call
>
> [Snip.]
>
>>> Since no memory access via the NULL pointer is done I'd assume
>>> this should not result in undefined behavior.
>>
>> That's a plausible assumption but not one that the C standard
>> supports.
>
> But, of course, undefined behaviour could be perfectly benign. Just
> don't count on it in every case.

"undefined behavior" means only that the C standard imposes no
requirements - if some other document does impose requirements, it's
fine to rely upon those requirements being met - so long as you know
that the document has authority over all of the systems to which you
might need to port your code. However, if you know of no such document,
or at least none that has authority over a particular system of interest
to you, you should not count on it, at all.

Manfred

Nov 3, 2021, 12:46:50 PM

Especially considering that it's easy enough to write your own memcpy
that is robust against null pointers, although not heavily optimized.
On the other hand, aggressive optimization is the most likely motivation
for the requirement of valid pointers in all cases.

Tim Rentsch

Nov 6, 2021, 5:57:14 PM

Peter 'Shaggy' Haywood <phay...@alphalink.com.au> writes:

> Groovy hepcat Tim Rentsch was jivin' in comp.lang.c on Fri, 29 Oct 2021
> 09:40 pm. It's a cool scene! Dig it.
>
>> Steve Keller <keller...@gmx.de> writes:
>>
>>> I wonder whether calling the mem* and str* functions with a NULL
>>> pointer has defined behavior if the count parameter is also 0,
>>> like in this call
>
> [Snip.]
>
>>> Since no memory access via the NULL pointer is done I'd assume
>>> this should not result in undefined behavior.
>>
>> That's a plausible assumption but not one that the C standard
>> supports.
>
> But, of course, undefined behaviour could be perfectly benign. Just
> don't count on it in every case.

Bad advice in this particular case.

Steve Keller

Nov 23, 2021, 3:59:07 AM

sc...@slp53.sl.home (Scott Lurndal) writes:

> We actually looked into adding EFAULT as an error to the mem* and str*
> functions back in the 90's. Ultimately the overhead to validate
> the pointers was too great (absent dedicated hardware support such
> as CHERI[*]) and the proposal was discarded.

Doing a check and returning EFAULT also feels wrong to me. I think
it's best when these cases simply are not special corner cases. No
expensive check, no error, no UB or implementation-defined behavior.
Just do the equivalent of

void *memcpy(void *dst, const void *src, size_t n) {
    char *d = dst;
    const char *s = src;
    while (n-- > 0)
        *d++ = *s++;
    return dst;
}

For n == 0, the values of src and dst don't matter. There's no
special definition of the behavior, it's just that 0 items beginning
at dst or src are accessed and therefore the pointers are not used to
access anything at all.

And I don't see any optimizations that are only possible if these
special cases are left undefined.

BTW, I have also always hated in math lectures when definitions using
e.g. subsets defined the term in question in a way that excluded the
empty set and/or the full set. This often required handling these
excluded cases specially in many theorems and proofs. And simplifying
the definition to not exclude some cases would also simplify all
theorems based on it, and everything would just look cleaner.

Steve

Steve Keller

Nov 23, 2021, 4:26:26 AM

Guillaume <mes...@bottle.org> writes:

> > Rather than wishing the C standard would be different than it is,
> > avoid the problem by using wrapper functions (disclaimer: not
> > compiled):
>
> Yes of course. It's straightforward and this way, you get the checks if
> you want/need them instead of imposing them on everyone regardless of
> the use case.

I wouldn't propose additional checks. Just define the behavior
naturally, i.e. if the size argument is 0, it means 0 elements
starting at the pointed-to addresses are accessed, so the pointers don't
matter at all.

> Those std functions should be considered "low-level" functions.
> I for one frankly don't care if any of the above examples, like
> "memcmp(NULL, NULL, 0)" are UB. If you are bound to call those with
> such parameters, there's something wrong with your coding style
> IMHO.

Not in my opinion. There are often cases where the size of a dynamic
array to be allocated and then worked on is calculated from other
values and 0 is a possible outcome[1]. This is, of course, a
special case, but often the algorithms work the same on these special
cases and no special handling would be required if the functions from
the standard lib like malloc() and memcpy() didn't behave specially.

[1] Even if sometimes these are not very useful use-cases of the
software, I prefer not to add arbitrary and unneeded limits to the
range of valid arguments, making the code more complex and less
general.
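
For the malloc(0) half of this, one possible workaround is sketched below. The wrapper name alloc_nonempty is hypothetical, and always asking for at least one byte is just one design choice among several. The idea is that a null return then always means failure, and a successful return is a valid pointer even when the computed size is 0.

```c
#include <stdlib.h>

/* Hypothetical wrapper: always requests at least one byte, so a
 * successful call returns a non-null pointer even for n == 0 and a
 * null return always indicates allocation failure. */
static void *alloc_nonempty(size_t n)
{
    return malloc(n ? n : 1);
}
```

As far as I can tell from 7.1.4 and 7.24.1, a pointer obtained this way is a valid argument for memcpy() or memcmp() with a count of 0, so the n > 0 checks around those calls would no longer be needed, at the cost of one wasted byte in the empty case.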

> And now, in particular cases for which it would make sense
> (because it could make your code more elegant or something), then you
> can use wrappers as suggested above.

Yes, and I will do this. But for the reader of the code it means he
sees calls to safer_memcpy() et al. and first has to look up where and
how they are defined instead of just seeing memcpy() that everyone
knows.

I know that's not that big a deal and there are more complex things in
the software field to be solved, but still I'd like more simple, clean,
and elegant low-level libs to work with.

Steve

Dolores Filandro

Dec 20, 2021, 4:32:12 PM

> I wouldn't propose additional checks. Just define the behavior
> naturally, i.e. if the size argument is 0, it means 0 elements
> starting at the pointed-to addresses are accessed, so the pointers don't
> matter at all.

For string functions such as memmove and strncat,
there may be work which needs to be done before any copying takes place,
which would need valid pointers to objects even if no copying takes place.

An implementer could program around that
but the need for string functions to be able to take NULL arguments isn't strong.

Öö Tiib

Dec 20, 2021, 7:56:11 PM

So the functions are expected to be designed to be inefficient in the
situation where zero bytes need to be moved or copied ... because they do
some additional work? That indicates a need for a wrapper even in cases
where all the pointers are valid and just the size argument is 0, to get
rid of that additional work overhead in the standard library.

Richard Damon

Dec 20, 2021, 8:39:31 PM

They are ALLOWED to be inefficient for the zero byte case, and it is
allowed for the system to be hostile to treating NULL pointers casually.

For instance, it might have special registers for using pointers as
addresses and manipulating them, but use the 'normal' registers for just
checking for a NULL pointer, and loading a NULL pointer into the special
registers traps.

If the code begins with some alignment checks, those checks might trap
when applied to a NULL pointer.

This means that either ALL uses need to pay the penalty of checking for
zero byte transfers, or we need to restrict using NULL pointers as
parameters.

An application that wants the skip-on-0 behavior can easily write a
wrapper function (and even make it inline to negate the additional
cost), while it is impossible to remove the cost of the check if it
were added in the library. So it makes sense to write the requirements
to allow the most efficient library code.

This is common thinking in the C library design.
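
To illustrate the kind of up-front work meant above, here is a sketch of a copy routine that looks at the alignment of the destination before it ever examines n. It is not any real library's memcpy; the name sketch_copy and its structure are assumptions made for illustration. On common hardware the prologue is harmless for any pointer value, but on a machine that is hostile to null pointers this is where a trap could occur.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical copy routine, not a real library implementation. */
static void *sketch_copy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    /* Prologue: derived from the pointer value before n is examined,
     * so it runs even when n == 0. */
    size_t misalign = (size_t)((uintptr_t)d % sizeof(long));
    size_t head = misalign ? sizeof(long) - misalign : 0;
    if (head > n)
        head = n;

    /* Copy the unaligned head byte by byte. */
    for (size_t i = 0; i < head; i++)
        *d++ = *s++;

    /* Copy the rest; a tuned implementation would use wider accesses
     * here, which is why it wanted the alignment in the first place. */
    for (size_t i = head; i < n; i++)
        *d++ = *s++;

    return dst;
}
```

Adding an "if (n == 0) return dst;" at the top would make the null case safe, but every caller would then pay for that branch, which is the tradeoff described above.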

Öö Tiib

Dec 20, 2021, 9:31:35 PM

On Tuesday, 21 December 2021 at 03:39:31 UTC+2, Richard Damon wrote:
> On 12/20/21 7:56 PM, Öö Tiib wrote:
> > On Monday, 20 December 2021 at 23:32:12 UTC+2, Dolores Filandro wrote:
> >>> I wouldn't propose additional checks. Just define the behavior
> >>> naturally, i.e. if the size argument is 0, it means 0 elements
> >>> starting at the pointed-to addresses are accessed, so the pointers don't
> >>> matter at all.
> >>
> >> For string functions such as memmove and strncat,
> >> there may be work which needs to be done before any copying takes place,
> >> which would need valid pointers to objects even if no copying takes place.
> >>
> >> An implementer could program around that
> >> but the need for string functions to be able to take NULL arguments isn't strong.
> >
> > So the functions are expected to be designed inefficient for situation where
> > zero bytes are needed to be moved or copied ... as these do some additional
> > work? That indicates need for wrapper even on cases when all the pointers
> > are valid just the size argument is 0 to get rid of that additional work
> > overhead in standard library.
>
> They are ALLOWED to be inefficient for the zero byte case, and it is
> allowed for the system to be hostile to treating NULL pointers casually.

Yes, the standard allows any amount of inefficiency, as it is not its
concern to deal with issues that the market takes care of anyway.
Programmers will avoid inefficient implementations.

> For instance, it might have special registers for using pointers as
> addresses and manipulating them, but use the 'normal' registers for just
> checking for a NULL pointer, and loading a NULL pointer into the special
> registers traps.

My question was about valid non-NULL pointers and the additional work done
with those even though the size to copy or move is 0. No competent user
passes NULL pointers there anyway, so NULL is a red herring.

> If the code begins with some alignment checks, this might trap on doing
> this on a NULL pointer.

Alignment checks are a good example of additional work that is done
and then discarded as not worth doing at all.

> This leads to either ALL uses need to pay the penalty of checking for
> zero byte transfers, or we need to restrict using NULL pointers as
> parameters.

All users pay that penalty if they do too many zero byte transfers, period.
NULL is a red herring.

> Since if the application wants the skip on 0 behavior, it can easily
> write a wrapper function (and even make it inline to negate the
> additional cost) while it is impossible to remove the cost for the check
> if it was added in the library, says it makes sense to write the
> requirements to allow the most efficient library code.
>
> This is common thinking in the C library design.

Modern programmers tend to expect compilers to optimize out calls
that do nothing when they are able to predict that ... not to add
unexpected additional work. Programmers will avoid inefficient
implementations, so the inefficiency is expected only on platforms where
compilation is controlled by a monopoly ... like iOS devices and such.