*rubeyes*: realloc(ptr, 0) is UB?

Kaz Kylheku

Jan 16, 2024, 7:32:23 PM

I'm looking at the C99 and N3096 (April 2023) definitions of realloc
side by side.

N3096 says

"Otherwise, if ptr does not match a pointer earlier returned by a memory
management function, or if the space has been deallocated by a call to
the free or realloc function, or if the size is zero, the behavior is
undefined."

Yikes! In C99 there is nothing about the size being zero:

"Otherwise, if ptr does not match a pointer earlier returned by the
calloc, malloc, or realloc function, or if the space has been
deallocated by a call to the free or realloc function, the behavior is
undefined."

Nothing about "or if the size is zero".

Yikes; when did this criminal stupidity get perpetrated?

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazi...@mstdn.ca
NOTE: If you use Google Groups, I don't see you, unless you're whitelisted.

Lawrence D'Oliveiro

Jan 16, 2024, 8:57:18 PM

On Wed, 17 Jan 2024 00:32:10 -0000 (UTC), Kaz Kylheku wrote:

> Nothing about "or if the size is zero".

From <https://manpages.debian.org/3/realloc.en.html>:

If size is equal to zero, and ptr is not NULL, then the call is
equivalent to free(ptr) (but see "Nonportable behavior" for
portability issues).

And from the referenced section:

The behavior of these functions when the requested size is zero is
glibc specific; other implementations may return NULL without setting
errno, and portable POSIX programs should tolerate such behavior.

Kaz Kylheku

Jan 16, 2024, 9:32:54 PM

On 2024-01-17, Lawrence D'Oliveiro <l...@nz.invalid> wrote:
> On Wed, 17 Jan 2024 00:32:10 -0000 (UTC), Kaz Kylheku wrote:
>
>> Nothing about "or if the size is zero".
>
> From <https://manpages.debian.org/3/realloc.en.html>:
>
> If size is equal to zero, and ptr is not NULL, then the call is
> equivalent to free(ptr) (but see "Nonportable behavior" for
> portability issues).

That is just garbage documentation though. No case in realloc
is equivalent to just free(ptr) and nothing else.

> And from the referenced section:
>
> The behavior of these functions when the requested size is zero is
> glibc specific; other implementations may return NULL without setting
> errno, and portable POSIX programs should tolerate such behavior.

None of those choices is simply "undefined behavior". This is just the
old concept that implementations may have malloc(0) returning null, or
else allocate a unique pointer that can be liberated via free.

Previously, realloc(ptr, 0) behaved like free(ptr) followed by
returning the result of malloc(0).

What we see in N3096 is that this resize to zero is undefined behavior,
while malloc(0) remains well-defined.

Thus applications which previously relied on that case of realloc
now have to do this:

  #include <stdlib.h>

  void *classic_realloc(void *ptr, size_t size)
  {
    // Handle zero size case as was required in C99,
    // thereby avoiding undefined behavior.

    if (size == 0) {
      free(ptr);
      return malloc(0);
    }

    // Pass other cases to realloc.
    return realloc(ptr, size);
  }

Lawrence D'Oliveiro

Jan 16, 2024, 10:07:06 PM

On Wed, 17 Jan 2024 02:32:41 -0000 (UTC), Kaz Kylheku wrote:

> That is just garbage documentation though.

You’re talking about what the C spec says, I’m talking about POSIX.

Kaz Kylheku

Jan 16, 2024, 10:08:35 PM

On 2024-01-17, Lawrence D'Oliveiro <l...@nz.invalid> wrote:

Oh, I'm sorry, am I in the wrong newsgroup?

Lawrence D'Oliveiro

Jan 16, 2024, 10:34:41 PM

On Wed, 17 Jan 2024 03:08:22 -0000 (UTC), Kaz Kylheku wrote:

> Oh, I'm sorry, am I in the wrong newsgroup?

No need to apologize, just leave quietly.

David Brown

Jan 17, 2024, 5:37:15 AM

On 17/01/2024 01:32, Kaz Kylheku wrote:
> I'm looking at the C99 and N3096 (April 2023) definitions of realloc
> side by side.
>
> N3096 says
>
> "Otherwise, if ptr does not match a pointer earlier returned by a memory
> management function, or if the space has been deallocated by a call to
> the free or realloc function, or if the size is zero, the behavior is
> undefined."
>
> Yikes! In C99 there is nothing about the size being zero:
>
> "Otherwise, if ptr does not match a pointer earlier returned by the
> calloc, malloc, or realloc function, or if the space has been
> deallocated by a call to the free or realloc function, the behavior is
> undefined."
>
> Nothing about "or if the size is zero".
>
> Yikes; when did this criminal stupidity get perpetrated?
>

Nothing stops a particular library implementation from giving this all a
defined behaviour. It was already "implementation defined" - if you
want to be able to use malloc or realloc with size 0, then you need to
know exactly how your library says it will behave or your code will risk
serious problems.

While I don't know why this change was made, and I agree it sounds a
strange change to make, I cannot see how any code would be affected by
it. Either your code is non-portable and relies on specific behaviour
(documented by your particular library, or additional standards such as
POSIX), or it has portability issues that could lead to failures if it
is used with a library that doesn't match your guessed behaviour - the C
standards making this UB does not change anything.

All this means, in my eyes, is that developers are being encouraged to
take responsibility for size zero allocations in their own code - make
an active choice about how they will deal with it (if it can occur in
their code). After all, there is no single appropriate choice of
behaviour for malloc or realloc of zero size - the best way to handle it
will depend on the user program.

Scott Lurndal

Jan 17, 2024, 11:27:50 AM

Kaz Kylheku <433-92...@kylheku.com> writes:
>I'm looking at the C99 and N3096 (April 2023) definitions of realloc
>side by side.
>
>N3096 says
>
>"Otherwise, if ptr does not match a pointer earlier returned by a memory
>management function, or if the space has been deallocated by a call to
>the free or realloc function, or if the size is zero, the behavior is
>undefined."
>
>Yikes! In C99 there is nothing about the size being zero:
>
>"Otherwise, if ptr does not match a pointer earlier returned by the
>calloc, malloc, or realloc function, or if the space has been
>deallocated by a call to the free or realloc function, the behavior is
>undefined."
>
>Nothing about "or if the size is zero".
>
>Yikes; when did this criminal stupidity get perpetrated?

I'm not sure what stupidity you are referring to, but IIRC, there was
some recent standardization activity relating to realloc
when size == 0 because there were differences in the behavior
between different implementations. Making the behavior
undefined was the only rational choice.

Kaz Kylheku

Jan 17, 2024, 12:48:00 PM

On 2024-01-17, David Brown <david...@hesbynett.no> wrote:
> On 17/01/2024 01:32, Kaz Kylheku wrote:
>> Yikes; when did this criminal stupidity get perpetrated?
>>
>
> Nothing stops a particular library implementation from giving this all a
> defined behaviour.

Nothing stops getchar() from being locally defined as an extension;
let's make it undefined behavior!

An entire language can be defined by an implementation, with no
standard.

People use plenty of these and tech marches on.

> It was already "implementation defined" - if you
> want to be able to use malloc or realloc with size 0, then you need to
> know exactly how your library says it will behave or your code will risk
> serious problems.

No, not serious. Very minor problems in a rare combination of
circumstances.

It's possible that realloc(ptr, 0) will return null, and this null
indicates that the reallocation failed due to OOM, meaning that ptr is
still valid.

This situation is indistinguishable from ptr having been freed,
and the null just being the result of a working zero size allocation.

(Implementations can easily resolve the ambiguity internally and
guarantee that realloc(ptr, 0) will free the object, and not
leak ptr.)
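
From the caller's side, one way to sidestep that ambiguity is to never pass a zero size to realloc and to report success separately from the pointer value. The following is only a sketch: `resize_buffer` and its interface are hypothetical names chosen for illustration, not anything from the standard or from this thread.

```c
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper (name and interface are assumptions for this sketch).
 * Resizes the block at *pp to `size` bytes.
 * Returns true on success. On failure it returns false and leaves *pp
 * untouched, so the caller still owns the old block.
 */
static bool resize_buffer(void **pp, size_t size)
{
    if (size == 0) {
        /* Shrinking to nothing: free explicitly instead of calling
           realloc(*pp, 0), so a null *pp afterwards is unambiguous. */
        free(*pp);
        *pp = NULL;
        return true;
    }

    void *q = realloc(*pp, size);
    if (q == NULL)
        return false;   /* out of memory; *pp is still valid */

    *pp = q;
    return true;
}
```

With this shape, a null pointer after a successful call can only mean "empty", and an allocation failure is visible through the return value alone.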

> While I don't know why this change was made, and I agree it sounds a
> strange change to make, I cannot see how any code would be affected by
> it. Either your code is non-portable and relies on specific behaviour
> (documented by your particular library, or additional standards such as
> POSIX), or it has portability issues that could lead to failures if it
> is used with a library that doesn't match your guessed behaviour - the C
> standards making this UB does not change anything.

It isn't nonportable. Dynamic array management code can resize an object
down to zero (as an alternative to freeing it). There is no problem with
this (other than the possible ambiguity under OOM conditions). In the
abstract semantics, the old object is freed, and you get a new one as if
from malloc(0).

Code can easily be written not to care whether malloc(0) returns null
or non-null.

> All this means, in my eyes, is that developers are being encouraged to
> take responsibility for size zero allocations in their own code - make

Working code that relies on zero size allocations already (even if
that isn't the best alternative in that code) won't fix itself to become
defined again.

> an active choice about how they will deal with it (if it can occur in
> their code). After all, there is no single appropriate choice of
> behaviour for malloc or realloc of zero size - the best way to handle it
> will depend on the user program.

A malloc of zero size is still defined; only realloc to zero size is
no longer defined. If you pretend that the remark about size zero isn't
there in the description of realloc, the rest of the description of
realloc still says that the old object will be freed, and the new one
will come as if from a malloc request for that size.


Kaz Kylheku

Jan 17, 2024, 12:57:59 PM

On 2024-01-17, Scott Lurndal <sc...@slp53.sl.home> wrote:

> Kaz Kylheku <433-92...@kylheku.com> writes:
>>Yikes; when did this criminal stupidity get perpetrated?
>
> I'm not sure what stupidity you are referring to, but IIRC, there was
> some recent standardization activity relating to realloc
> when size == 0 because there were differences in the behavior
> between different implementations. Making the behavior
> undefined was the only rational choice.

No, the rational choice is letting those implementations be
nonconforming, until they fix their shit.

You cannot claw back decades-old defined behaviors in a major
language standard because some players are getting it wrong
in some way.

I now have to do things like this in all the code I work on:

  void *sane_realloc(void *ptr, size_t size)
  {
    if (size == 0) {
      free(ptr);
      return malloc(0); // don't care if that didn't work
    }
    return realloc(ptr, size);
  }

or, if I don't suspect realloc-to-zero is being used:

  void *sane_realloc(void *ptr, size_t size)
  {
    assert (size != 0);
    return realloc(ptr, size);
  }

and this is code that will never even run on those rogue implementations
whose botchery led to the poor decision of the language being damaged.

Any implementation of realloc whatsoever could just do this internally:

  void *realloc(void *ptr, size_t size)
  {
    if (size == 0) {
      __free(ptr);
      return __malloc(0);
    }
    // call our internal __realloc implementation that
    // we suspect doesn't handle zero.
    return __realloc(ptr, size);
  }

BGB

Jan 17, 2024, 1:39:31 PM

On 1/17/2024 11:47 AM, Kaz Kylheku wrote:
> On 2024-01-17, David Brown <david...@hesbynett.no> wrote:
>> On 17/01/2024 01:32, Kaz Kylheku wrote:
>>> Yikes; when did this criminal stupidity get perpetrated?
>>>
>>
>> Nothing stops a particular library implementation from giving this all a
>> defined behaviour.
>
> Nothing stops getchar() from being locally defined as an extension;
> let's make it undefined behavior!
>
> An entire language can be defined by an implementation, with no
> standard.
>
> People use plenty of these and tech marches on.
>

And then Clang goes and decides to make any use of these functions cause
the whole control-flow path to be pruned from existence or something...
Since, after all, it is UB...

( Partly satire, but wouldn't exactly be surprised... )

>> It was already "implementation defined" - if you
>> want to be able to use malloc or realloc with size 0, then you need to
>> know exactly how your library says it will behave or your code will risk
>> serious problems.
>
> No, not serious. Very minor problems in a rare combination of
> circumstances.
>
> It's possible that realloc(ptr, 0) will return null, and this null
> indicates that the reallocation failed due to OOM, meaning that ptr is
> still valid.
>
> This situation is indistinguishable from ptr having been freed,
> and the null just being the result of a working zero size allocation.
>
> (Implementations can easily resolve the ambiguity internally and
> guarantee that realloc(ptr, 0) will free the object, and not
> leak ptr.)
>

Yeah, find something sensible to do and do it.
Knowing modern compilers, declaring anything as UB throws a big wrench
into things.

>> While I don't know why this change was made, and I agree it sounds a
>> strange change to make, I cannot see how any code would be affected by
>> it. Either your code is non-portable and relies on specific behaviour
>> (documented by your particular library, or additional standards such as
>> POSIX), or it has portability issues that could lead to failures if it
>> is used with a library that doesn't match your guessed behaviour - the C
>> standards making this UB does not change anything.
>
> It isn't nonportable. Dynamic array management code can resize an object
> down to zero (as an alternative to freeing it). There is no problem with
> this (other than the possible ambiguity under OOM conditions). In the
> abstract semantics, the old object is freed, and you get a new one as if
> from malloc(0).
>
> Code can easily be written not to care whether malloc(0) returns null
> or non-null.
>

Yes.

>> All this means, in my eyes, is that developers are being encouraged to
>> take responsibility for size zero allocations in their own code - make
>
> Working code that relies on zero size allocations already (even if
> that isn't the best alternative in that code) won't fix itself to become
> defined again.
>
>> an active choice about how they will deal with it (if it can occur in
>> their code). After all, there is no single appropriate choice of
>> behaviour for malloc or realloc of zero size - the best way to handle it
>> will depend on the user program.
>
> A malloc of zero size is still defined; only realloc to zero size is
> no longer defined. If you pretend that the remark about size zero isn't
> there in the description of realloc, the rest of the description of
> realloc still says that the old object will be freed, and the new one
> will come as if from a malloc request for that size.
>

Agreed.

My personal preference is that behaviors be kept "sensible" when possible.

At least in my own compiler, I generally try for sensible behaviors.
But, within the limit that the compiler may also be buggy...

Finding and fixing bugs is a priority, but as I see it, things like
optimization concerns (or esoteric edge cases) should not be
justification for deviating from otherwise sensible behavior (and
instead limited to cases that would not otherwise be visible to the
program in question).

Granted, I would also prefer it if compilers were not allowed to make
optimizations which may change the visible behavior of a program (at
least within "sane" limits) but, alas... ( Say, we put the burden of
proof on the compiler that an optimization does not change the visible
output or behavior of the program. ).

Scott Lurndal

Jan 17, 2024, 5:13:39 PM

Kaz Kylheku <433-92...@kylheku.com> writes:
>On 2024-01-17, Scott Lurndal <sc...@slp53.sl.home> wrote:
>> Kaz Kylheku <433-92...@kylheku.com> writes:
>>>Yikes; when did this criminal stupidity get perpetrated?
>>
>> I'm not sure what stupidity you are referring to, but IIRC, there was
>> some recent standardization activity relating to realloc
>> when size == 0 because there were differences in the behavior
>> between different implementations. Making the behavior
>> undefined was the only rational choice.
>
>No, the rational choice is letting those implementations be
>nonconforming, until they fix their shit.
>
>You cannot claw back decades-old defined behaviors in a major

Nobody is clawing anything back. No compiler is going to
change its behavior because undefined behavior is now
called undefined behavior.

Everything that's been done for decades will still work just
fine.

Realloc with a length zero - never - made any sense anyway.

>language standard because some players are getting it wrong
>in some way.
>
>I now have to do things like this in all the code I work on:
>
> void *sane_realloc(void *ptr, size_t size)
> {
> if (size == 0) {
> free(ptr);
> return malloc(0); // don't care if that didn't work
> }
> return realloc(ptr, size);
> }

You can also just never call sane_realloc with a size of zero.

Kaz Kylheku

Jan 17, 2024, 5:31:27 PM

On 2024-01-17, Scott Lurndal <sc...@slp53.sl.home> wrote:
> Kaz Kylheku <433-92...@kylheku.com> writes:
>>On 2024-01-17, Scott Lurndal <sc...@slp53.sl.home> wrote:
>>> Kaz Kylheku <433-92...@kylheku.com> writes:
>>>>Yikes; when did this criminal stupidity get perpetrated?
>>>
>>> I'm not sure what stupidity you are referring to, but IIRC, there was
>>> some recent standardization activity relating to realloc
>>> when size == 0 because there were differences in the behavior
>>> between different implementations. Making the behavior
>>> undefined was the only rational choice.
>>
>>No, the rational choice is letting those implementations be
>>nonconforming, until they fix their shit.
>>
>>You cannot claw back decades-old defined behaviors in a major
>
> Nobody is clawing anything back. No compiler is going to
> change its behavior because undefined behavior is now
> called undefined behavior.

What? The behavior was defined in the absence of the new piece of text
saying that the behavior of realloc is arbitrarily undefined when size
is zero.

> Everything that's been done for decades will still work just
> fine.

Even if nothing changes in the library implementation of realloc, a
highly optimizing compiler can now treat everything after a realloc(ptr,
0) as unreachable code:

  void f(void *p)
  {
    realloc(p, 0);
  }

This function can be compiled the same as:

  void f(void *p)
  {
    __builtin_unreachable();
  }

This is not something that the library people can keep fully working by
simply not messing with realloc.

> Realloc with a length zero - never - made any sense anyway.

Yes it does; in a dynamic vector type, you can tell the thing to resize
to zero size. If that API is implemented with realloc, it now has
undefined behavior according to n3096.
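
As an illustration of the dynamic vector case, here is a minimal sketch of a resize operation that stays clear of realloc with a zero size. The type `int_vec` and the function `int_vec_resize` are hypothetical, and representing the empty state with a null `data` pointer is a design choice of this sketch, not the C99 behavior of returning whatever malloc(0) yields.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical minimal vector of ints; all names are illustrative. */
struct int_vec {
    int    *data;   /* NULL whenever len == 0 */
    size_t  len;    /* number of elements */
};

/*
 * Resize to n elements. Returns false on size overflow or allocation
 * failure, leaving the vector unchanged. New elements are uninitialized.
 */
static bool int_vec_resize(struct int_vec *v, size_t n)
{
    if (n == 0) {
        /* Resize-to-zero: release the storage without calling
           realloc(ptr, 0). */
        free(v->data);
        v->data = NULL;
        v->len = 0;
        return true;
    }

    if (n > SIZE_MAX / sizeof *v->data)
        return false;               /* n * sizeof(int) would overflow */

    int *p = realloc(v->data, n * sizeof *p);
    if (p == NULL)
        return false;               /* old storage is still valid */

    v->data = p;
    v->len = n;
    return true;
}
```

Growing from the empty state works because realloc with a null pointer behaves like malloc for the requested size.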

>>language standard because some players are getting it wrong
>>in some way.
>>
>>I now have to do things like this in all the code I work on:
>>
>> void *sane_realloc(void *ptr, size_t size)
>> {
>> if (size == 0) {
>> free(ptr);
>> return malloc(0); // don't care if that didn't work
>> }
>> return realloc(ptr, size);
>> }
>
> You can also just never call sane_realloc with a size of zero.

That could be more work. You have to identify what does that,
if anything, and fix it.

If your project already has a wrapper for realloc, it's easy
to fix it there.

Blue-Maned_Hawk

Jan 17, 2024, 6:03:49 PM

It looks to me like it was always undefined behavior, but it just wasn't
explicitly stated as such in C99. Undefined behavior does not require the
specification to explicitly state that the behavior is not defined; it just
has to provide no definition for it.

--
Blue-Maned_Hawk│shortens to Hawk│/blu.mɛin.dʰak/│he/him/his/himself/Mr.
blue-maned_hawk.srht.site
Atlanta makes it against the law to tie a giraffe to a street lamp or
telephone pole.

Tim Rentsch

Jan 17, 2024, 8:35:51 PM

Blue-Maned_Hawk <bluema...@invalid.invalid> writes:

> It looks to me like it was always undefined behavior, but it just wasn't
> explicitly stated as such in C99. Undefined behavior does not require the
> specification to explicitly state that the behavior is not defined; it just
> has to provide no definition for it.

You are mistaken. The C99 standard, and also the C11 standard,
plainly state that such requests give implementation-defined
behavior:

If the size of the space requested is zero, the behavior is
implementation-defined: either a null pointer is returned,
or the behavior is as if the size were some nonzero value,
except that the returned pointer shall not be used to access
an object.

Tim Rentsch

Jan 17, 2024, 8:39:03 PM

Kaz Kylheku <433-92...@kylheku.com> writes:

> On 2024-01-17, Blue-Maned_Hawk <bluema...@invalid.invalid> wrote:
>
>> It looks to me like it was always undefined behavior, but it just wasn't
>> explicitly stated as such in C99.
>

> That is incorrect. A behavior for the following can be readily inferred
> from C99:
>
> void *p = malloc(42);
> void *q = realloc(p, 0);
>
> The p object is freed, and a new object q is obtained as if by
> malloc(0). q may be null, or else a pointer to a unique object
> that can be freed.
>
> q and p could possibly point to the same address; just like in
> a non-zero-sized realloc call.

>
>> Undefined behavior does not require the
>> specification to explictly state that the behavior is not defined; it just
>> has to provide no definition for it.
>

> A definition is provided by virtue of the size 0 not being special in
> any way in realloc. It's defined because the behavior for size 1 is
> defined, as for size 2, ...

Actually, a size of 0 is specifically stated as implementation-defined
behavior, in section 7.20.3, paragraph 1 (in N1256), which applies to
all the memory allocation functions, realloc() included.

Tim Rentsch

Jan 17, 2024, 8:45:48 PM

Kaz Kylheku <433-92...@kylheku.com> writes:

> I'm looking at the C99 and N3096 (April 2023) definitions of realloc
> side by side.
>
> N3096 says
>
> "Otherwise, if ptr does not match a pointer earlier returned by a memory
> management function, or if the space has been deallocated by a call to
> the free or realloc function, or if the size is zero, the behavior is
> undefined."
>
> Yikes! In C99 there is nothing about the size being zero:
>
> "Otherwise, if ptr does not match a pointer earlier returned by the
> calloc, malloc, or realloc function, or if the space has been
> deallocated by a call to the free or realloc function, the behavior is
> undefined."
>
> Nothing about "or if the size is zero".
>
> Yikes; when did this criminal stupidity get perpetrated?

I guess you missed the thread from a little while back when
this change was discussed.

During that conversation, I was in complete agreement with
your reaction as stated above. Moreover, nothing has come to
light in the meantime that gives me any reason to think
otherwise.

Tim Rentsch

Jan 17, 2024, 8:47:04 PM

Is that last sentence your own assessment, or are you simply
repeating someone else's assessment?

Lawrence D'Oliveiro

Jan 18, 2024, 12:54:31 AM

On Wed, 17 Jan 2024 17:35:38 -0800, Tim Rentsch wrote:

> The C99 standard, and also the C11 standard,
> plainly state that such requests give implementation-defined behavior:
>
> If the size of the space requested is zero, the behavior is
> implementation-defined: either a null pointer is returned, or the
> behavior is as if the size were some nonzero value, except that the
> returned pointer shall not be used to access an object.

Still, most of us use C on more reasonable, POSIX-type systems, where the
behaviour is somewhat better defined.

Kaz Kylheku

Jan 18, 2024, 12:59:22 AM

No, it isn't. The above wording is exactly the same in POSIX:

"If the size of the space requested is zero, the behavior shall be
implementation-defined: either a null pointer is returned, or the
behavior shall be as if the size were some non-zero value, except that
the behavior is undefined if the returned pointer is used to access an
object."

https://pubs.opengroup.org/onlinepubs/9699919799/functions/realloc.html

If you find the behavior better defined in your POSIX system, it's
not because of POSIX.

Lawrence D'Oliveiro

Jan 18, 2024, 2:00:06 AM

On Thu, 18 Jan 2024 05:59:08 -0000 (UTC), Kaz Kylheku wrote:

> If you find the behavior better defined in your POSIX system, it's not
> because of POSIX.

From <https://manpages.debian.org/3/realloc.en.html>:

If size is equal to zero, and ptr is not NULL, then the call is
equivalent to free(ptr) (but see "Nonportable behavior" for
portability issues).

And from the referenced section:

The behavior of these functions when the requested size is zero is
glibc specific; other implementations may return NULL without setting
errno, and portable POSIX programs should tolerate such behavior.

GNU-specific, then.

Tim Rentsch

Jan 18, 2024, 3:19:10 AM

You say that like you think your statement has some relevance to
the issue being discussed. It doesn't.

Tim Rentsch

Jan 18, 2024, 3:23:16 AM

sc...@slp53.sl.home (Scott Lurndal) writes:

> Kaz Kylheku <433-92...@kylheku.com> writes:
>
>> On 2024-01-17, Scott Lurndal <sc...@slp53.sl.home> wrote:
>>
>>> Kaz Kylheku <433-92...@kylheku.com> writes:
>>>
>>>> Yikes; when did this criminal stupidity get perpetrated?
>>>
>>> I'm not sure what stupidity you are referring to, but IIRC, there was
>>> some recent standardization activity relating to realloc
>>> when size == 0 because there were differences in the behavior
>>> between different implementations. Making the behavior
>>> undefined was the only rational choice.
>>
>> No, the rational choice is letting those implementations be
>> nonconforming, until they fix their shit.
>>
>> You cannot claw back decades-old defined behaviors in a major
>

> Nobody is clawing anything back. [...]

This statement seems directly contradicted by the proposed
modification to the C standard, which changes previously
defined behavior to undefined behavior.

Scott Lurndal

Jan 18, 2024, 10:13:00 AM

My assessment. I've never found realloc useful, regardless of
the value of the size parameter. YMMV.

Scott Lurndal

Jan 18, 2024, 10:15:11 AM

Clawing back implies that existing programs that rely
on either behavior will stop working.

That's not the case in the real world.

From the application portability point of view, there is little
difference between implementation defined and undefined behavior.

James Kuyper

Jan 18, 2024, 1:27:09 PM

On 1/18/24 10:14, Scott Lurndal wrote:
...

> From the application portability point of view, there is little
> difference between implementation defined and undefined behavior.

There's a big difference when there's only a limited range of permitted
behaviors, as in this case. It's entirely feasible to write code that
copes with both possibilities, which won't be able to deal with the
infinite variety of behaviors allowed when the behavior is undefined.
The behavior of non __STDC__ pragmas is implementation-defined, but it's
almost the same as undefined behavior. There's some implicit
restrictions: since pragmas are recognized as such in translation phase
4, they cannot affect behavior that occurs during translation phases
1-3. Most other kinds of implementation-defined behavior are
substantially more restricted than that.
It's been said that real-world implementations won't take advantage of
this undefined behavior - but I think that's optimistic thinking.
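
To make the "copes with both possibilities" point concrete, here is a small sketch of an allocation helper that accepts either permitted outcome of a zero-size request. The name `try_alloc` and its interface are hypothetical, chosen only for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper: allocate `size` bytes and store the result in *out.
 * Returns false only for a genuine allocation failure.
 */
static bool try_alloc(size_t size, void **out)
{
    void *p = malloc(size);

    /* For size == 0 the implementation may return a null pointer or a
       unique non-null pointer. Both are treated as success here, since
       the result must not be used to access an object either way. */
    if (p == NULL && size != 0)
        return false;

    *out = p;   /* may be null when size == 0; free(NULL) is a no-op */
    return true;
}
```

The caller frees the result unconditionally and never dereferences it when the size was zero, so the program behaves the same under either implementation choice. Nothing comparable can be written against undefined behavior, because there is no finite set of outcomes to handle.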

Kaz Kylheku

Jan 18, 2024, 2:31:58 PM

On 2024-01-18, James Kuyper <james...@alumni.caltech.edu> wrote:
> On 1/18/24 10:14, Scott Lurndal wrote:
> ...
>> From the application portability point of view, there is little
>> difference between implementation defined and undefined behavior.
>
> There's a big difference when there's only a limited range of permitted
> behaviors, as in this case. It's entirely feasible to write code that
> copes with both possibilities, which won't be able to deal with the
> infinite variety of behaviors allowed when the behavior is undefined.

Perhaps what Scott means is that it doesn't matter because all the
implementations you're targeting will play nice and continue to define
the behavior as before, so you will be relying on a documented
extension.

You only have to deal with the infinite variety of behaviors when you're
not relying on a documented extension, but on disassembly of code and
extensive testing. For this specific feature, that doesn't even begin to
be economic; what you do is write a realloc wrapper which provides the
old behavior(s) and doesn't step on the UB.

James Kuyper

Jan 18, 2024, 3:01:51 PM

On 1/18/24 14:31, Kaz Kylheku wrote:
...

> Perhaps what Scott means is that it doesn't matter because all the
> implementations you're targeting will play nice and continue to define
> the behavior as before, ...

I already addressed that point - confidence that implementations will
play nice is unjustified.

> ... so you will be relying on a documented
> extension.

You need to make sure what behavior is actually documented in that case
by each implementation. I remember one famous bug that was due to an
implementation documenting that, when a particular option was chosen,
code which could only be reached if certain kinds of undefined behavior
occurred would be removed. This is a valid optimization, because there are
no requirements that any particular thing occur when the behavior is
undefined - in particular, there's no requirement that the otherwise
unreachable code be executed. It can be a useful optimization, because
it removes dead code, reducing the size of the executable.
Someone wrote such code, with incorrect expectations about what would
happen when the behavior was undefined, when in fact it wouldn't be
executed at all. They compiled it with that option turned on (it wasn't
even on by default, but only when explicitly requested), and got upset
when the relevant code was in fact removed.
It's OK to rely upon the requirements imposed by an implementation when
the C standard doesn't impose any - but when you do so, you need to make
sure you actually know what those requirements are.

Lawrence D'Oliveiro

Jan 18, 2024, 3:47:52 PM

On Thu, 18 Jan 2024 15:12:46 GMT, Scott Lurndal wrote:

> I've never found realloc useful, regardless of the value
> of the size parameter. YMMV.

Looking back through my own code, I could only find one example, from
quite a few years ago
<https://bitbucket.org/ldo17/dvd_menu_animator/src/master/>.

Chris M. Thomasson

Jan 18, 2024, 4:34:34 PM

IIRC, the only times I had to deal with it were when I was working on an
existing code base that used it. I had to use it, so to speak, to get
along with the team.

Blue-Maned_Hawk

Jan 18, 2024, 5:20:23 PM

Kaz Kylheku wrote:

> On 2024-01-17, Blue-Maned_Hawk <bluema...@invalid.invalid> wrote:

>> It looks to me like it was always undefined behavior, but it just
>> wasn't explicitly stated as such in C99.
>

> That is incorrect. A behavior for the following can be readily inferred
> from C99:
>
> void *p = malloc(42); void *q = realloc(p, 0);

Is that from an example? All the examples in the document are purely
informative and cannot define any behavior.

Blue-Maned_Hawk

Jan 18, 2024, 5:21:44 PM

Oh. Well, nevermind then.

--
Blue-Maned_Hawk│shortens to Hawk│/blu.mɛin.dʰak/│he/him/his/himself/Mr.
blue-maned_hawk.srht.site

Floppy inside!

Kaz Kylheku

Jan 18, 2024, 5:44:10 PM

On 2024-01-18, James Kuyper <james...@alumni.caltech.edu> wrote:

> It's OK to rely upon the requirements imposed by an implementation when
> the C standard doesn't impose any - but when you do so, you need to make
> sure you actually know what those requirements are.

Exactly, and in this specific case, it's not worth the effort compared
to writing a realloc wrapper that avoids the undefined behavior, while
itself providing a C99 conforming one.

I'm not going to use realloc(ptr, 0) and check everyone's documentation.

And then what if I don't find it defined? Then what? Back to the
wrapper I could have just written in the first place.

We could have system-specific #ifdefs to make the wrapper a trivial
inline job that just calls realloc versus something meaty.

I hate testing for specific systems; it's a better process to check
for *features*, like

#if HAVE_BRAINDAMAGED_REALLOC
...
#else

I like to have these features auto-detected. This cannot be
auto-detected; the build configuration machinery would have to fall back
on checking for specific compilers/versions and libraries/versions.

In the end I will just have a CPU-cycle-wasting wrapper for all
platforms.
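
A minimal sketch of such a wrapper, under the assumption that rounding a zero size up to one byte is an acceptable policy (it is one of the choices C99 permits for a zero-sized request). The name `safe_realloc` is hypothetical. With this policy `realloc` itself never sees a size of zero, and a null return can only mean that the allocation failed and the old block is still valid.

```c
#include <stdlib.h>

/* Hypothetical wrapper: realloc itself is never called with size 0. */
static inline void *safe_realloc(void *ptr, size_t size)
{
    /* Ask for one byte instead of zero, so a null return can only
       mean that the allocation failed and ptr is still valid. */
    return realloc(ptr, size ? size : 1);
}
```

The cost is one comparison per call; the returned block must still be passed to free, even when the requested size was zero.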

Kaz Kylheku

Jan 18, 2024, 5:46:13 PM

On 2024-01-18, Lawrence D'Oliveiro <l...@nz.invalid> wrote:

And screw your code from years ago and its users, especially if it's
just one place in the code. At least three places are required for it
to be an issue (Microsoft Rule of Three).

Kaz Kylheku

Jan 18, 2024, 5:48:00 PM

On 2024-01-18, Blue-Maned_Hawk <bluema...@invalid.invalid> wrote:
> Kaz Kylheku wrote:
>
>> On 2024-01-17, Blue-Maned_Hawk <bluema...@invalid.invalid> wrote:
>>> It looks to me like it was always undefined behavior, but it just
>>> wasn't explicitly stated as such in C99.
>>
>> That is incorrect. A behavior for the following can be readily inferred
>> from C99:
>>
>> void *p = malloc(42); void *q = realloc(p, 0);
>
> Is that from an example?

No.

> All the examples in the document are purely informative and cannot
> define any behavior.

Of course, that's not what "behavior can be readily inferred" means, no
matter where I got the example.

Chris M. Thomasson

Jan 18, 2024, 6:14:51 PM

On 1/18/2024 2:43 PM, Kaz Kylheku wrote:
> On 2024-01-18, James Kuyper <james...@alumni.caltech.edu> wrote:
>> It's OK to rely upon the requirements imposed by an implementation when
>> the C standard doesn't impose any - but when you do so, you need to make
>> sure you actually know what those requirements are.
>
> Exactly, and in this specific case, it's not worth the effort compared
> to writing a realloc wrapper that avoids the undefined behavior, while
> itself providing a C99 conforming one.
>
> I'm not going to use realloc(ptr, 0) and check everyone's documentation.
>
> And then what if I don't find it defined? Then what? Back to the
> wrapper I could have just written in the first place.
>
> We could have system-specific #ifdefs to make the wrapper a trivial
> inline job that just calls realloc versus something meaty.
>
> I hate testing for specific systems; it's a better process to check
> for *features*, like
>
> #if HAVE_BRAINDAMAGED_REALLOC
> ...
> #else

Made me literally laugh out loud! Thanks! :^)

James Kuyper

Jan 18, 2024, 6:37:59 PM

On 1/18/24 17:20, Blue-Maned_Hawk wrote:
> Kaz Kylheku wrote:
>
>> On 2024-01-17, Blue-Maned_Hawk <bluema...@invalid.invalid> wrote:
>>> It looks to me like it was always undefined behavior, but it just
>>> wasn't explicitly stated as such in C99.
>>
>> That is incorrect. A behavior for the following can be readily inferred
>> from C99:
>>
>> void *p = malloc(42); void *q = realloc(p, 0);
>
> Is that from an example? All the examples in the document are purely
> informative and cannot define any behavior.

True - but the normative text of the standard can define the behavior of
such code, whether or not the code is from a non-normative example section.

Prior to n2573.pdf, which is dated 2020-10-01, the behavior of the above
code was in fact defined by 7.22.3.5p3, which said
"If ptr is a null pointer, the realloc function behaves like the malloc
function for the specified size. Otherwise, if ptr does not match a
pointer earlier returned by a memory management function, or if the
space has been deallocated by a call to the free or realloc function,
the behavior is undefined. If size is nonzero and memory for the new
object is not allocated, the old object is not deallocated. If size is
zero and memory for the new object is not allocated, it is
implementation-defined whether the old object is deallocated. If the old
object is not deallocated, its value shall be unchanged."

Since p was initialized by a call to a memory management function, the
only undefined behavior mentioned in that paragraph doesn't apply.

If ptr is null, the behavior of malloc() referred to above was
controlled by 7.22.3p1:

"If the size of the space requested is zero, the behavior is
implementation-defined: either a null pointer is returned to indicate an
error, or the behavior is as if the size were some nonzero value, except
that the returned pointer shall not be used to access an
object."

Therefore, in n2478, which was dated 2020-02-05, and in all previous
drafts of the standard, the behavior was implementation-defined, with
only a small number of choices available to the implementation.

Scott Lurndal

Jan 18, 2024, 6:48:10 PM

Kaz Kylheku <433-92...@kylheku.com> writes:
>On 2024-01-18, Lawrence D'Oliveiro <l...@nz.invalid> wrote:
>> On Thu, 18 Jan 2024 15:12:46 GMT, Scott Lurndal wrote:
>>
>>> I've never found realloc useful, regardless of the value
>>> of the size parameter. YMMV.
>>
>> Looking back through my own code, I could only find one example, from
>> quite a few years ago
>><https://bitbucket.org/ldo17/dvd_menu_animator/src/master/>.
>
>And screw your code from years ago and its users, especially if it's
>just one place in the code. At least three places are required for it
>to be an issue (Microsoft Rule of Three).

I would wager dollars to donuts that Lawrence's code didn't do
a realloc(x, 0).

Lawrence D'Oliveiro

Jan 18, 2024, 6:48:44 PM

On Thu, 18 Jan 2024 22:45:59 -0000 (UTC), Kaz Kylheku wrote:

> On 2024-01-18, Lawrence D'Oliveiro <l...@nz.invalid> wrote:
>> On Thu, 18 Jan 2024 15:12:46 GMT, Scott Lurndal wrote:
>>
>>> I've never found realloc useful, regardless of the value of the size
>>> parameter. YMMV.
>>
>> Looking back through my own code, I could only find one example, from
>> quite a few years ago
>><https://bitbucket.org/ldo17/dvd_menu_animator/src/master/>.
>

> And screw your code ...

And screw your code too.

Kaz Kylheku

Jan 18, 2024, 7:03:53 PM

Ok.

realloc(x, 0);
dollars_and_doughtnuts_t get_your_win_here;

Keith Thompson

Jan 18, 2024, 7:14:12 PM

Kaz Kylheku <433-92...@kylheku.com> writes:
> On 2024-01-18, James Kuyper <james...@alumni.caltech.edu> wrote:
>> It's OK to rely upon the requirements imposed by an implementation when
>> the C standard doesn't impose any - but when you do so, you need to make
>> sure you actually know what those requirements are.
>
> Exactly, and in this specific case, it's not worth the effort compared
> to writing a realloc wrapper that avoids the undefined behavior, while
> itself providing a C99 conforming one.
>
> I'm not going to use realloc(ptr, 0) and check everyone's documentation.
>
> And then what if I don't find it defined? Then what? Back to the
> wrapper I could have just written in the first place.

[...]

I think I would have *liked* to see C23 drop the special permission
to return a null pointer for a requested size of zero. C11 says (and
this applies to malloc, realloc, and all other allocation functions):

If the space cannot be allocated, a null pointer is returned. If
the size of the space requested is zero, the behavior is
implementation-defined: either a null pointer is returned, or
the behavior is as if the size were some nonzero value, except
that the returned pointer shall not be used to access an object.

This could have been changed to:

If the space cannot be allocated, a null pointer is returned. If
the size of the space requested is zero, the behavior is as
if the size were some nonzero value, except that the returned
pointer shall not be used to access an object.

Any existing implementations that always return a null pointer
for malloc(0) would have to be updated. That shouldn't be a
great burden.

Note that malloc(0) or realloc(ptr, 0) can still fail and return
a null pointer if no space can be allocated, so all allocations
should still be checked. But with this proposed change, code
could rely on realloc(ptr, 0) returning a non-null pointer *unless*
available memory is critically low -- pretty much the same as in C11,
except that a null pointer would be an indication that something
is seriously wrong.

(I remember seeing a discussion about making the behavior of
realloc(ptr, 0) undefined. I'm making inquiries, and I'll follow
up if I learn anything relevant.)

--
Keith Thompson (The_Other_Keith) Keith.S.T...@gmail.com
Working, but not speaking, for Medtronic
void Void(void) { Void(); } /* The recursive call of the void */

Kaz Kylheku

Jan 18, 2024, 10:07:21 PM

On 2024-01-19, Keith Thompson <Keith.S.T...@gmail.com> wrote:
> Kaz Kylheku <433-92...@kylheku.com> writes:
>> On 2024-01-18, James Kuyper <james...@alumni.caltech.edu> wrote:
>>> It's OK to rely upon the requirements imposed by an implementation when
>>> the C standard doesn't impose any - but when you do so, you need to make
>>> sure you actually know what those requirements are.
>>
>> Exactly, and in this specific case, it's not worth the effort compared
>> to writing a realloc wrapper that avoids the undefined behavior, while
>> itself providing a C99 conforming one.
>>
>> I'm not going to use realloc(ptr, 0) and check everyone's documentation.
>>
>> And then what if I don't find it defined? Then what? Back to the
>> wrapper I could have just written in the first place.
> [...]
>
> I think I would have *liked* to see C23 drop the special permission
> to return a null pointer for a requested size of zero.

Yes. That would be the best thing.

It also works well when the memory is subject to memcpy or memset,
which have undefined behavior on null pointers.

E.g. a dynamic array management routine can allocate a zero length array
with malloc(0), get a non-null pointer and then initialize it with
memset(ptr, 0, 0); no special casing in the code for zero length.
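
On implementations where malloc(0) may return a null pointer, roughly the same effect can be had today by rounding the request up. The following is only a sketch; `new_zeroed` is a hypothetical helper name, and the one-byte minimum is an assumption rather than anything the standard requires.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a zero-filled buffer of n bytes. */
unsigned char *new_zeroed(size_t n)
{
    /* Request at least one byte so success never yields a null pointer. */
    unsigned char *buf = malloc(n ? n : 1);

    if (buf == NULL)
        return NULL;          /* genuine allocation failure */

    memset(buf, 0, n);        /* buf is non-null, so n == 0 is fine */
    return buf;
}
```

The zero-length case then goes through exactly the same path as every other length.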

> C11 says (and
> this applies to malloc, realloc, and all other allocation functions):
>
> If the space cannot be allocated, a null pointer is returned. If
> the size of the space requested is zero, the behavior is
> implementation-defined: either a null pointer is returned, or
> the behavior is as if the size were some nonzero value, except
> that the returned pointer shall not be used to access an object.
>
> This could have been changed to:
>
> If the space cannot be allocated, a null pointer is returned. If
> the size of the space requested is zero, the behavior is as
> if the size were some nonzero value, except that the returned
> pointer shall not be used to access an object.

I would add a footnote that implementors are encouraged to strive
to minimize the nonzero value.

> Any existing implementations that always return a null pointer
> for malloc(0) would have to be updated. That shouldn't be a
> great burden.

For that matter, compilers could fix this independently of libraries.

Easy case: constant size: compiler adjusts malloc(0) call to malloc(1).

Non-constant size calls: redirected through a compiler-generated stub:

static void *__gcc_malloc_wrap(size_t __size)
{
    return __size ? malloc(__size) : malloc(1);
}

Indirect calls: when address of malloc is taken, it takes this function.

This would fix the issue retroactively for a good many C libraries,
without them lifting a finger.

Needless to say, calloc and realloc and others get that compiler
treatment.

> Note that malloc(0) or realloc(ptr, 0) can still fail and return
> a null pointer if no space can be allocated, so all allocations
> should still be checked. But with this proposed change, code
> could rely on realloc(ptr, 0) returning a non-null pointer *unless*
> available memory is critically low -- pretty much the same as in C11,
> except that a null pointer would be an indication that something
> is seriously wrong.

Yes; all ambiguity in regard to the null return value is gone.

Blue-Maned_Hawk

Jan 19, 2024, 11:27:52 AM

Kaz Kylheku wrote:

> On 2024-01-18, Blue-Maned_Hawk <bluema...@invalid.invalid> wrote:
>> Kaz Kylheku wrote:
>>
>>> On 2024-01-17, Blue-Maned_Hawk <bluema...@invalid.invalid> wrote:
>>>> It looks to me like it was always undefined behavior, but it just
>>>> wasn't explicitly stated as such in C99.
>>>
>>> That is incorrect. A behavior for the following can be readily
>>> inferred from C99:
>>>
>>> void *p = malloc(42); void *q = realloc(p, 0);
>>
>> Is that from an example?
>
> No.

Ah, well then.

>> All the examples in the document are purely informative and cannot
>> define any behavior.
>
> Of course, that's not what "behavior can be readily inferred" means, no
> matter where I got the example.

Inferences may be makeäble, but they have no power.

Kaz Kylheku

Jan 19, 2024, 11:39:37 AM

On 2024-01-19, Blue-Maned_Hawk <bluema...@invalid.invalid> wrote:
> Kaz Kylheku wrote:
>> Of course, that's not what "behavior can be readily inferred" means, no
>> matter where I got the example.
>
> Inferences may be makeäble, but they have no power.

Yeah, tell that to your compiler when it surprisingly deletes code based
on inference.

Lawrence D'Oliveiro

Jan 19, 2024, 3:30:10 PM

On Thu, 18 Jan 2024 16:13:48 -0800, Keith Thompson wrote:

> But with this proposed change, code could rely on realloc(ptr,
> 0) returning a non-null pointer *unless* available memory is critically
> low -- pretty much the same as in C11,
> except that a null pointer would be an indication that something is
> seriously wrong.

So having to allocate something, when it didn’t actually need to allocate
anything, could lead to program failures in situations where things might
otherwise work fine.

Unless, of course, there was a special non-null preallocated address value
that was returned for every zero-length allocation.

Returning a null pointer for a zero-length allocation shouldn’t make any
difference to the logic of your program.

Lawrence D'Oliveiro

Jan 19, 2024, 4:12:09 PM

On Thu, 18 Jan 2024 23:47:55 GMT, Scott Lurndal wrote:

>>><https://bitbucket.org/ldo17/dvd_menu_animator/src/master/>.

>
> I would wager dollars to donuts that Lawrence's code didn't do a
> realloc(x, 0).

Is it cheating to look?

Scott Lurndal

Jan 19, 2024, 4:31:34 PM

Lawrence D'Oliveiro <l...@nz.invalid> writes:

Wouldn't be much of a wager, then.

Kaz Kylheku

Jan 19, 2024, 4:35:24 PM

Unfortunately, that requires checking for it in some cases.

For instance:

unsigned char *buf = malloc(s);
memset(buf, 0, s);

has undefined behavior if buf is null, even if s is zero.

It is undefined behavior to pass a null pointer to a C library
function, except where that is documented.

The solution of using a special, pre-allocated address for every
zero-length allocation would be fantastic, but a big change.

At the very least, it should be one of the permitted choices of
behavior.

Keith posted the opinion that zero length allocations should have
the behavior of returning a non-null allocated object which
can be freed. (I.e. remove the freedom to return null.)

If, additionally, implementations could have a single dedicated object for
representing empty allocations (which can be passed to free any
number of times), that would also be a nice requirement.

All that the standard would have to say is something like "pointers from
separate zero-sized allocations need not be distinct".
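
The standard allocator does not offer this, but the idea can be sketched at the application level. The wrappers `z_alloc` and `z_free` and the object `empty_block` are hypothetical names for illustration; pointers obtained from `z_alloc` must be released with `z_free`, never with free directly.

```c
#include <stdlib.h>

/* Hypothetical wrappers; the shared object stands for every empty block. */
static char empty_block;

void *z_alloc(size_t n)
{
    /* Zero-sized requests cost nothing and cannot fail. */
    return n ? malloc(n) : (void *)&empty_block;
}

void z_free(void *p)
{
    if (p != (void *)&empty_block)   /* the shared object is never freed */
        free(p);
}
```

Every empty allocation compares equal to every other one, which is exactly the property the "need not be distinct" wording would allow.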

Keith Thompson

Jan 19, 2024, 4:37:42 PM

Lawrence D'Oliveiro <l...@nz.invalid> writes:

> On Thu, 18 Jan 2024 16:13:48 -0800, Keith Thompson wrote:
>> But with this proposed change, code could rely on realloc(ptr,
>> 0) returning a non-null pointer *unless* available memory is critically
>> low -- pretty much the same as in C11,
>> except that a null pointer would be an indication that something is
>> seriously wrong.
>
> So having to allocate something, when it didn’t actually need to allocate
> anything, could lead to program failures in situations where things might
> otherwise work fine.
>
> Unless, of course, there was a special non-null preallocated address value
> that was returned for every zero-length allocation.

That wouldn't meet the current requirements. If malloc(0) returns a
non-null result, then two calls to malloc(0) must yield distinct results
(if free() isn't called in between), just as two calls to malloc(1) must
do.

The standard *could* be modified to permit malloc(0) to return a
non-unique non-null pointer, and to require it to always succeed. I'm
not sure that complication would be worthwhile.

> Returning a null pointer for a zero-length allocation shouldn’t make any
> difference to the logic of your program.

I dislike the fact that the behavior is currently
implementation-defined. I think requiring malloc(0) to return a null
pointer would be an improvement over the current (C11) specification,
though it would be an odd special case; a null pointer would mean either
that the allocation failed (and the system is likely in a bad state) or
that the requested size was 0. Note that an application might call
malloc() with a variable argument whose value can just happen to be
zero.

Currently, code that might call malloc(0) (or realloc(ptr, 0), or ...)
must be written to work correctly whichever way the implementation
works. Every implementation I've tried returns a non-null pointer,
which makes it likely that any such code would never be tested with an
implementation that returns a null pointer.
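
One way to write code that works under either choice is to treat a null result as a failure only when the requested size was nonzero. This is a sketch, with `alloc_failed` as a hypothetical helper name; it assumes the caller never dereferences a zero-sized result.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: stores the result of malloc(n) in *out and
   reports whether the request should be treated as a failure. */
bool alloc_failed(size_t n, void **out)
{
    *out = malloc(n);
    /* A null result for n == 0 is a permitted outcome, not a failure. */
    return *out == NULL && n != 0;
}
```

Whatever ends up in `*out` can be passed to free, since free accepts a null pointer.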

C23 making the behavior of realloc(ptr, 0) (but *not* of malloc(0))
undefined could break existing code, requiring special cases to be added
-- and again, it's likely that most implementations will continue to
work the way they do now, so code might never be tested in all
scenarios. I'm suggesting that removing special cases makes it easier
to write robust code. If malloc(0) and realloc(ptr, 0) return a
non-null pointer on success, then a null result *always* indicates an
allocation failure.

Kaz Kylheku

Jan 19, 2024, 4:50:57 PM

On 2024-01-19, Keith Thompson <Keith.S.T...@gmail.com> wrote:

> Lawrence D'Oliveiro <l...@nz.invalid> writes:
>> On Thu, 18 Jan 2024 16:13:48 -0800, Keith Thompson wrote:
>>> But with this proposed change, code could rely on realloc(ptr,
>>> 0) returning a non-null pointer *unless* available memory is critically
>>> low -- pretty much the same as in C11,
>>> except that a null pointer would be an indication that something is
>>> seriously wrong.
>>
>> So having to allocate something, when it didn’t actually need to allocate
>> anything, could lead to program failures in situations where things might
>> otherwise work fine.
>>
>> Unless, of course, there was a special non-null preallocated address value
>> that was returned for every zero-length allocation.
>
> That wouldn't meet the current requirements. If malloc(0) returns a
> non-null result, then two calls to malloc(0) must yield distinct results
> (if free() isn't called in between), just as two calls to malloc(1) must
> do.

But if malloc(0) returns null, then two such calls don't yield distinct
results.

We already don't know today whether malloc(0) == malloc(0).

> to write robust code. If malloc(0) and realloc(ptr, 0) return a
> non-null pointer on success, then a null result *always* indicates an
> allocation failure.

Not requiring the non-null return from malloc(0) to be distinct
from previous malloc(0) return values (whether they were freed or not),
could help to "sell" the idea of taking away the null return value.

Some implementors might grumble that null return allowed malloc(0) to be
efficient by not allocating anything. If they were allowed to return
(void *) -1 or something, that would placate that concern.

Say you have a large, sparse vector of dynamic vectors. Sparse in the
sense that most of the dynamic vectors in the sparse vector are empty;
only a few are nonempty. If those empty vectors come from malloc(0) in
an efficient way (nothing is allocated on the heap), that's nice.

Keith Thompson

Jan 19, 2024, 5:16:27 PM

Kaz Kylheku <433-92...@kylheku.com> writes:
[...]

> Not requiring the non-null return from malloc(0) to be distinct
> from previous malloc(0) return values (whether they were freed or not),
> could help to "sell" the idea of taking away the null return value.

But it would still be an additional special case -- and someone might
use malloc(0) specifically as a way to obtain a unique non-null pointer
while allocating minimal memory. (Of course they could use malloc(1) to
do the same thing.)

> Some implementors might grumble that null return allowed malloc(0) to be
> efficient by not allocating anything. If they were allowed to return
> (void *) -1 or something, that would placate that concern.
>
> Say you have a large, sparse vector of dynamic vectors. Sparse in the
> sense that most of the dynamic vectors in the sparse vector are empty;
> only a few are nonempty. If those empty vectors come from malloc(0) in
> an efficient way (nothing is allocated on the heap), that's nice.

What's being suggested is basically a second kind of null pointer, i.e.,
a second unique pointer value that can't be dereferenced. And if two
malloc(0) calls were allowed to return the same non-null value, that
would require additional wording in the standard. I don't strongly
object to the idea, but I don't think it's necessary.

And if, say, C26 permitted or required two non-null malloc(0) results to
be equal, code could not rely on that behavior unless it's guaranteed to
be compiled with a C26 or later compiler. It would be effectively
implementation-defined, but across editions of the standard rather than
implementations. (There's probably not much existing code that relies
on this.)

Tim Rentsch

Jan 19, 2024, 7:14:20 PM

sc...@slp53.sl.home (Scott Lurndal) writes:

> Tim Rentsch <tr.1...@z991.linuxsc.com> writes:
>
>> sc...@slp53.sl.home (Scott Lurndal) writes:
>>
>>> Kaz Kylheku <433-92...@kylheku.com> writes:
>>>
>>>> I'm looking at the C99 and N3096 (April 2023) definitions of realloc
>>>> side by side.
>>>>
>>>> N3096 says
>>>>
>>>> "Otherwise, if ptr does not match a pointer earlier returned by a memory
>>>> management function, or if the space has been deallocated by a call to
>>>> the free or realloc function, or if the size is zero, the behavior is
>>>> undefined."
>>>>
>>>> Yikes! In C99 there is nothing about the size being zero:
>>>>
>>>> "Otherwise, if ptr does not match a pointer earlier returned by the
>>>> calloc, malloc, or realloc function, or if the space has been
>>>> deallocated by a call to the free or realloc function, the behavior is
>>>> undefined."
>>>>
>>>> Nothing about "or if the size is zero".
>>>>
>>>> Yikes; when did this criminal stupidity get perpetrated?
>>>
>>> I'm not sure what stupidity you are referring to, but IIRC, there was
>>> some recent standardization activity relating to realloc
>>> when size == 0 because there were differences in the behavior
>>> between different implementations. Making the behavior
>>> undefined was the only rational choice.
>>
>> Is that last sentence your own assessment, or are you simply
>> repeating someone else's assessment?
>
> My assessment. I've never found realloc useful, regardless of
> the value of the size parameter. YMMV.

So you aren't really in a position to say whether this decision was
the only rational choice. Generally I hope people who would make
such a statement would first make an effort to learn and understand
other people's thoughts on the matter.

Scott Lurndal

Jan 19, 2024, 7:26:36 PM

The fact that I didn't find it useful doesn't imply in any way
that I don't understand others' thoughts on the matter. I did
spend several years on various standards committees; compromise
was often the best way to make forward progress, and 'undefined
behavior' was a rational compromise in that context.

I also believe that the fears over the impact of that decision
are overblown.

Tim Rentsch

Jan 20, 2024, 6:42:03 PM

sc...@slp53.sl.home (Scott Lurndal) writes:

> Tim Rentsch <tr.1...@z991.linuxsc.com> writes:
>
>> sc...@slp53.sl.home (Scott Lurndal) writes:
>>
>>> Kaz Kylheku <433-92...@kylheku.com> writes:
>>>
>>>> On 2024-01-17, Scott Lurndal <sc...@slp53.sl.home> wrote:
>>>>
>>>>> Kaz Kylheku <433-92...@kylheku.com> writes:
>>>>>
>>>>>> Yikes; when did this criminal stupidity get perpetrated?
>>>>>
>>>>> I'm not sure what stupidity you are referring to, but IIRC, there was
>>>>> some recent standardization activity relating to realloc
>>>>> when size == 0 because there were differences in the behavior
>>>>> between different implementations. Making the behavior
>>>>> undefined was the only rational choice.
>>>>
>>>> No, the rational choice is letting those implementations be
>>>> nonconforming, until they fix their shit.
>>>>
>>>> You cannot claw back decades-old defined behaviors in a major
>>>
>>> Nobody is clawing anything back. [...]
>>
>> This statement seems directly contradicted by the proposed
>> modification to the C standard, which changes previously
>> defined behavior to undefined behavior.
>
> Clawing back implies that existing programs that rely
> on either behavior will stop working.
>
> That's not the case in the real world.

There is no question that the proposed change (which is possibly
ratified by now) would claw back some defined behavior in favor of
undefined behavior. What you're saying is that there will be no
consequences of that clawing back.

First, there certainly will be _some_ consequences, because some
people will modify their code even if they think there will be no
changes in the compilers and libraries they use.

Second, even if many compilers and many C libraries make no
changes (at least in the near future), the chances are high that
at least some will, especially in the presence of library software
updates and new platforms coming on line (low-end consumer
hardware such as switches and wifi routers), and consequently
some applications will get bitten by this.

Third, even if there are no changes in the near term, if we look
out ten years or more it is highly likely that these things will
change in many implementations, especially when taking into
account cross-compiling. The idea that there will be no changes
"in the real world" over a long time frame is incredibly naive.
The Linux null-pointer-use bug illustrates the problem.

> From the application portability point of view, there is little
> difference between implementation defined and undefined behavior.

This statement is nonsense. Essentially all C programs depend on
implementation-defined behavior to some degree. If all of that ID
behavior were changed to be undefined behavior, programming in C
would be impossible for all practical purposes, whether or not
portability is a consideration.

Keith Thompson

Jan 20, 2024, 10:50:49 PM

I got a response from JeanHeyd Meneide.

If realloc(ptr, 0) returns a null pointer there's no way to tell whether
allocation failed (and ptr has not been freed), or the implementation
returns a null pointer for zero-sized allocations (and ptr has been
freed). Some implementations set errno, but C doesn't require it.
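
A caller can sidestep the ambiguity by never asking realloc for zero bytes and releasing the block explicitly instead. The sketch below assumes that policy; `resize_buffer` is a hypothetical helper, not something from the standard or from the papers listed here.

```c
#include <stdlib.h>

/* Hypothetical helper: resizes *bufp; returns 0 on success, -1 on failure.
   On failure the old block is untouched and still owned by the caller. */
int resize_buffer(void **bufp, size_t new_size)
{
    if (new_size == 0) {
        free(*bufp);          /* explicit release, no realloc(ptr, 0) */
        *bufp = NULL;
        return 0;
    }

    void *tmp = realloc(*bufp, new_size);
    if (tmp == NULL)
        return -1;            /* size was nonzero, so this is a failure */

    *bufp = tmp;
    return 0;
}
```

Because the size passed to realloc is always nonzero, a null return from it has only one meaning.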

C17 added this in 7.31.11, Future library directions:

Invoking realloc with a size argument equal to zero is an
obsolescent feature.

More references:

DR 400 <https://www.open-std.org/jtc1/sc22/wg14/www/docs/summary.htm#dr_400>
"realloc with size zero problems"

N2438 <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2438.htm>
"Clarification Request", "Realloc with size 0 ambiguity"

N2464 <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2464.pdf>
This is the paper that proposed making realloc(?, 0) undefined behavior,
whether the first argument is null or not. Part of the rationale was to
allow POSIX to define it however they please.

N2665 <https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2665.pdf>
Not directly relevant; it just removes the statement that zero-sized
reallocations are obsolescent, since it's no longer necessary.

Personally, I still think it would have been cleaner to remove the
permission for any allocation functions to return a null pointer, even
with a requested size of 0, unless there isn't enough memory for the
allocation to succeed.

Lawrence D'Oliveiro

Jan 21, 2024, 12:12:00 AM

On Sat, 20 Jan 2024 19:50:31 -0800, Keith Thompson wrote:

> If realloc(ptr, 0) returns a null pointer there's no way to tell whether
> allocation failed (and ptr has not been freed) ...

Actually, that doesn’t seem like a reasonable interpretation, because it
leads to memory leaks.

Kaz Kylheku

Jan 21, 2024, 12:20:19 AM

That's not an "interpretation"; that's just a fact.

A null return from realloc has two documented meanings; situations exist
in which it cannot be distinguished which one, in any portable way.

Tim Rentsch

Jan 21, 2024, 6:08:20 AM

Kaz Kylheku <433-92...@kylheku.com> writes:

[.. considering the behavior of malloc(0) ..]

> [If] implementations could have a single dedicated object for
> representing empty allocations (which can be passed to free any
> number of times), that would also be a nice requirement.

That defeats the whole purpose of having malloc(0) return
a non-null value. Don't you understand anything?

Tim Rentsch

Jan 21, 2024, 6:25:39 AM

Keith Thompson <Keith.S.T...@gmail.com> writes:

> What's being suggested is basically a second kind of null pointer, i.e.,
> a second unique pointer value that can't be dereferenced. And if two
> malloc(0) calls were allowed to return the same non-null value, that
> would require additional wording in the standard. I don't strongly
> object to the idea, but I don't think it's necessary.

It's a counterproductive idea. The whole point of malloc(0) being
able to return a non-null pointer is to get distinct "objects" for
different zero-sized allocations. The "objects" can't be used in
any way but comparing the pointers allows the "objects" from two
different allocations to be distinguished.
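
Portable code cannot count on malloc(0) itself for this, since it is allowed to return a null pointer. Asking for one byte gives the same distinctness on every implementation. A small sketch, with `new_token` as a hypothetical name:

```c
#include <stdlib.h>

/* Hypothetical helper: each successful call yields a distinct identity. */
void *new_token(void)
{
    /* One byte rather than zero, so the result is non-null on every
       implementation when the allocation succeeds. */
    return malloc(1);
}
```

Two live tokens never compare equal; each is released with free when its identity is no longer needed.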

Tim Rentsch

Jan 21, 2024, 6:47:49 AM

Kaz Kylheku <433-92...@kylheku.com> writes:

> On 2024-01-19, Keith Thompson <Keith.S.T...@gmail.com> wrote:
>
>> Kaz Kylheku <433-92...@kylheku.com> writes:
>>
>>> On 2024-01-18, James Kuyper <james...@alumni.caltech.edu> wrote:
>>>
>>>> It's OK to rely upon the requirements imposed by an implementation when
>>>> the C standard doesn't impose any - but when you do so, you need to make
>>>> sure you actually know what those requirements are.
>>>
>>> Exactly, and in this specific case, it's not worth the effort compared
>>> to writing a realloc wrapper that avoids the undefined behavior, while
>>> itself providing a C99 conforming one.
>>>
>>> I'm not going to use realloc(ptr, 0) and check everyone's documentation.
>>>
>>> And then what if I don't find it defined? Then what? Back to the
>>> wrapper I could have just written in the first place.
>>
>> [...]
>>
>> I think I would have *liked* to see C23 drop the special permission
>> to return a null pointer for a requested size of zero.
>
> Yes. That would be the best thing.
>
> It also works well when the memory is subject to memcpy or memset,
> which have undefined behavior on null pointers.

That's a half-assed argument. There are other ways a pointer might
have a null value than just being the result of a call to malloc().
If code might call memset() et al with a zero size and a null
pointer, it's better to address all possible cases at once rather
than just some of them:

static inline void *
safer_memset( void *s, int c, size_t n ){
    return n ? memset( s, c, n ) : s;
}

static inline void *
safer_memcpy( void *d, const void *s, size_t n ){
    return n ? memcpy( d, s, n ) : d;
}

/* ... etc ... */

bart

unread,

Jan 21, 2024, 6:55:53 AM1/21/24

to

malloc has sort of created a rod for its own back by needing to store
the size of the allocation. That will take up some space even when
malloc(0) is called, if NULL is not being returned.

I've looked at my own allocator for small objects, which does not store
the size. There, successive calls to my_alloc(0) return the same pointer
value to a zero-sized memory block (but not if there are intervening
calls to my_alloc(100).)

A call to free a zero-sized block will be my_free(p,0) so it knows no
action is needed.

malloc(0) can be made efficient by it detecting the zero-size and
returning a pointer to the same special memory block, even if there are
intervening non-zero calls.

A call to free(p) where p refers to that zero-sized block can also be
detected.

So it will save on memory if allocating millions of zero-sized blocks,
but it means extra checks on each call.
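
A minimal sketch of the shared zero-size block described above, written as wrappers over the standard functions. The names my_malloc, my_free and zero_block are invented for illustration; a real allocator would do this check internally.

```c
#include <stdlib.h>

/* One shared object stands in for every zero-sized allocation.
 * my_malloc/my_free are invented names for this sketch. */
static char zero_block;

static void *my_malloc(size_t n)
{
    /* Zero-size requests consume no heap memory at all. */
    return n ? malloc(n) : (void *)&zero_block;
}

static void my_free(void *p)
{
    if (p != (void *)&zero_block)   /* the extra check on each call */
        free(p);
}
```

Every zero-sized request yields the same pointer here, so this saves memory at the price of giving up distinct values.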

Other replies however suggested that such malloc(0) calls need to return
unique values. But you can't have both have unique values and save
memory (at best you will need 1 byte per malloc(0), and some hairy means
of detecting whether the p in free(p) refers to one of those bytes, so a
bigger runtime overhead).

Tim Rentsch

unread,

Jan 21, 2024, 7:04:22 AM1/21/24

to

Kaz Kylheku <433-92...@kylheku.com> writes:

[...]

> Not requiring the non-null return from malloc(0) to be distinct
> from previous malloc(0) return values (whether they were freed or not),
> could help to "sell" the idea of taking away the null return value.
>
> Some implementors might grumble that null return allowed malloc(0) to be
> efficient by not allocating anything. If they were allowed to return
> (void *) -1 or something, that would placate that concern. [...]

You have the tail wagging the dog here. If the results of
different malloc(0) calls don't need to be distinguishable,
they might just as well all be null.

Tim Rentsch

unread,

Jan 21, 2024, 8:10:56 AM1/21/24

to

Keith Thompson <Keith.S.T...@gmail.com> writes:

[...]

> Note that malloc(0) or realloc(ptr, 0) can still fail and return
> a null pointer if no space can be allocated, so all allocations
> should still be checked. [...]

Note that it isn't hard to write code for realloc() that
guarantees a realloc( p, 0 ) call will never fail (that
is, if 'p' is non-null, there will never be a case where
a null value indicating an allocation failure has to be
returned). To do that, all the implementation needs to
do is see if the malloc(0) allocation would fail, and
if it would then simply return 'p', without freeing it.
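
A minimal sketch of that idea, written as a wrapper on top of malloc and free rather than inside a real allocator. The name shrink_to_zero is made up, and malloc(1) stands in for the zero-size allocation so the sketch behaves the same whether or not the host's malloc(0) returns null.

```c
#include <stdlib.h>

/* Hypothetical helper (not a standard function): behaves like
 * realloc(p, 0) for a non-null p, but never reports failure.
 * If no fresh minimal block can be obtained, the old block is
 * handed back untouched instead of being freed. */
static void *shrink_to_zero(void *p)
{
    /* malloc(1) is an assumption of this sketch: it sidesteps
     * the implementation-defined result of malloc(0). */
    void *q = malloc(1);
    if (q == NULL)
        return p;   /* allocation would fail: keep the old block */
    free(p);
    return q;       /* caller must treat this as zero-sized */
}
```

Either way the caller gets back a non-null pointer that it still owns and must eventually free.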

Tim Rentsch

unread,

Jan 21, 2024, 9:55:19 AM1/21/24

to

Keith Thompson <Keith.S.T...@gmail.com> writes:

[.. considering the definition of malloc(0) ..]

> I dislike the fact that the behavior is currently
> implementation-defined. I think requiring malloc(0) to return a
> null pointer would be an improvement over the current (C11)
> specification, though it would be an odd special case; a null
> pointer would mean either that the allocation failed (and the
> system is likely in a bad state) or that the requested size was 0.
> Note that an application might call malloc() with a variable
> argument whose value can just happen to be zero. [...]

First I think it is worth going back and re-reading the Rationale
where it talks about malloc() and friends.

I'm sympathetic to the reaction about malloc() and friends having
different behavior in different implementations. On the other
hand let's look at it from the point of view of implementations:

(A) For malloc(0) returning non-null: that can be a convenience
for some applications, and can be supplied in a way that client
code cannot provide.

(B) For malloc(0) returning null: most programs don't allocate
zero-sized objects, except perhaps inadvertently; in most uses
code will work without needing to distinguish, and in those cases
where the distinction is important it isn't hard to write code
that works for both allowed behaviors; being able to return null
for zero-size requests both simplifies the code and uses less
resource than choice (A). In small systems the added resource
demands may very well be the deciding factor between choosing (A)
and choosing (B).

For most aspects, client code can ignore the distinction between
behavior (A) and behavior (B). The big exception to that is how
to detect errors. If we think the C standard should continue to
allow both possibilities (and personally I think it should), a
way around the problem is to provide a testable macro symbol, as
for example

#if __MALLOC_0_GIVES_NULL
...
#elif __MALLOC_0_GIVES_NONNULL
...
#else
... deal with uncertainty in some way
... (not hard although it may involve a run-time cost)
#endif

If some such macros were provided it isn't hard to define a macro
that does error checking in both kinds of environments, so there
could be code like this:

Foo *p = malloc( n * sizeof *p );
if( MALLOC_FAILED( p, n ) ) ...

where the MALLOC_FAILED() macro would either simply test '!p' or
would test '!p && n > 0', depending on whether the code is
running in an (A) regime or a (B) regime.
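
One way such a macro might be written, as a sketch only. The two feature-test symbols are the hypothetical ones proposed above; no implementation is known to define them, so in practice the fallback branch is the one that gets compiled.

```c
#include <stddef.h>
#include <stdlib.h>

/* __MALLOC_0_GIVES_NULL and __MALLOC_0_GIVES_NONNULL are the
 * hypothetical symbols proposed above; this sketch only tests
 * for the second one. */
#if defined(__MALLOC_0_GIVES_NONNULL) && __MALLOC_0_GIVES_NONNULL
/* Regime (A): a null result always means failure. */
#define MALLOC_FAILED(p, n) ((p) == NULL)
#else
/* Regime (B), or unknown: a null result for a zero-size
 * request is not treated as an error. */
#define MALLOC_FAILED(p, n) ((p) == NULL && (n) > 0)
#endif
```

In the unknown case this can miss a genuine failure of a zero-size request on an (A) implementation, which is usually harmless because the caller has no bytes to touch anyway.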

Personally I would like to see the memory allocation functions
augmented to be explicit about which behavior is wanted:

void *malloc0( size_t ); /* returns NULL for size of 0 */
void *malloc1( size_t ); /* returns NULL only on failure */

along with analogous changes for calloc(), realloc(), etc.
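
Until something like that exists, both behaviors can be approximated with wrappers over the standard malloc. This is a sketch; the names are the hypothetical ones from the declarations above, and asking for one byte is just one possible way to get the "null only on failure" behavior.

```c
#include <stdlib.h>

/* Hypothetical names from the proposal above, implemented here
 * as wrappers over the standard malloc. */

/* Returns NULL for a size of 0; otherwise behaves like malloc. */
static void *malloc0(size_t n)
{
    return n ? malloc(n) : NULL;
}

/* Returns NULL only on failure: a zero-size request is bumped to
 * one byte, so success always yields a distinct non-null pointer. */
static void *malloc1(size_t n)
{
    return malloc(n ? n : 1);
}
```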

Kaz Kylheku

unread,

Jan 21, 2024, 12:33:05 PM1/21/24

to

I understand very clearly from past discussions that you're attached to
a particular use case whereby malloc(0) returns unique objects.

However, I don't see that as the purpose, let alone the "whole purpose".

The standard currently does not endorse malloc(0) as a factory
for unique pointers.

Currently, it cannot be relied on due to the possibility of
the null return. If you want that behavior portably, you currently have
to use malloc(1) instead, or some other nonzero value.
Even if your program detects that malloc(0) returns a non-null
pointer one time, there is no requirement that all subsequent such
allocations will return non-null.

Tim Rentsch

unread,

Jan 21, 2024, 12:41:56 PM1/21/24

to

bart <b...@freeuk.com> writes:

> malloc has sort of created a rod for its own back by needing to
> store the size of the allocation.

malloc does not need to store the size of the space requested.

It does need to save enough information to be able to compute
how much memory needs to be reclaimed, based on the pointer
to the memory area to be free()'d. That might not be as much
memory as a whole size word, although typically the overhead
is as many bytes as a pointer, or a size_t, per block (including
both allocated blocks and free blocks).

> That will take up some space even when malloc(0) is called, if
> NULL is not being returned. [...]
>
> Other replies however
> suggested that such malloc(0) calls need to return unique values.
> But you can't have both have unique values and save memory (at
> best you will need 1 byte per malloc(0), and some hairy means of
> detecting whether the p in free(p) refers to one of those bytes,
> so a bigger runtime overhead).

In a conventional architecture, eg x64, it isn't too difficult to
devise a method for malloc() and free() of zero-sized objects that
(a) has a fast check to see if an argument value refers to a
zero-size object, and (b) takes 1.5 bits or less per zero-size
object allocated.

Chris M. Thomasson

unread,

Jan 21, 2024, 3:22:21 PM1/21/24

to

[...]

Why would malloc need to store the size of its allocations?

bart

unread,

Jan 21, 2024, 3:35:08 PM1/21/24

to

If you do this:

p = malloc(N);
...
free(p);

'free' will need to know what N was used to allocate p in order to
deallocate right size of memory.

Lew Pitcher

unread,

Jan 21, 2024, 3:52:12 PM1/21/24

to

So that, in a naive implementation, free() knows how much memory to
return to the freelist. See K&R Chapter 8, section 8.7 "Example - A
Storage Allocator" for an example.

--
Lew Pitcher
"In Skills We Trust"

David Brown

unread,

Jan 21, 2024, 3:54:55 PM1/21/24

to

It does not need to store the size of the allocation. It merely has to
store sufficient information to be able to figure out how to deallocate
properly. And any storage it needs can be in a different place from the
allocation.

Some malloc/free systems track the information separately from the
allocation data. One possibility is to use pools for different memory
sizes. When the user asks for X bytes, this is rounded up to the
nearest pool size - call it Y - and the allocation is made from the Y
pool, which can be viewed as an array of Y-size lumps. Only need one
single bit to track each allocation, to say which indexes in the array
are used. There are many other ways to do it.
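
A rough sketch of the single-pool case, assuming one static pool of 32 lumps of 64 bytes each (C11 for _Alignas and max_align_t). All names are invented; a real allocator would keep several such pools and pick one by rounding the request up.

```c
#include <stddef.h>
#include <stdint.h>

enum { LUMP_SIZE = 64, LUMP_COUNT = 32 };

/* One pool viewed as an array of LUMP_COUNT lumps of LUMP_SIZE
 * bytes.  All names here are invented for the sketch. */
static _Alignas(max_align_t) unsigned char pool[LUMP_COUNT * LUMP_SIZE];
static uint32_t used;   /* bit i set => lump i is allocated */

static void *pool_alloc(void)
{
    for (unsigned i = 0; i < LUMP_COUNT; ++i) {
        if (!(used & (UINT32_C(1) << i))) {
            used |= UINT32_C(1) << i;
            return pool + (size_t)i * LUMP_SIZE;
        }
    }
    return NULL;    /* pool exhausted */
}

/* No size is stored anywhere: the lump index is recovered from
 * the pointer by arithmetic within the pool array. */
static void pool_free(void *p)
{
    if (p == NULL)
        return;
    size_t i = (size_t)((unsigned char *)p - pool) / LUMP_SIZE;
    used &= ~(UINT32_C(1) << i);
}
```

The only bookkeeping is one bit per lump, and it lives outside the memory handed to the caller.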

Personally, I think it would be often more efficient in modern C if the
allocation system didn't track sizes at all, and "free" passed the
original size as a parameter. But that ship sailed long ago for
standard C. (C++ supports a sized deallocation system, and of course
there's nothing to stop you making your own allocator system for C.)

Kaz Kylheku

unread,

Jan 21, 2024, 4:16:33 PM1/21/24

to

While that is true in the abstract, it is not necessarily the case
that it needs to pull N out of the p object.

For instance, suppose N is a 64 byte allocation, and the allocator
has special heaps for small allocations.

It can figure out that p points into a heap that has 64 byte objects
and then do some pointer arithmetic to figure out which one,
and add that to a free list or bitmap or whatever.

Kaz Kylheku

unread,

Jan 21, 2024, 4:21:02 PM1/21/24

to

On 2024-01-21, David Brown <david...@hesbynett.no> wrote:
> Personally, I think it would be often more efficient in modern C if the
> allocation system didn't track sizes at all, and "free" passed the
> original size as a parameter. But that ship sailed long ago for
> standard C.

That ship newly sailed in 2023.

The N3096 draft describes a function free_sized that takes a size. If
the size is wrong, the behavior is undefined.

So now C has two ways to free an object: an efficient one where
the program helps by giving the size, and the old free, which
may have to do more work.

I think going forward, it may start to become wise to detect the
implementation's support for free_sized (e.g. in a configure script)
and use that as much as possible, using free only when it's
inconvenient: for instance a function resembling POSIX strdup
might not be able to safely assume that it can do
free_sized(str, strlen(str)+1), because the underlying buffer
might be larger.
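
A sketch of what that detection could feed into. HAVE_FREE_SIZED is a hypothetical macro that a configure script would define after probing for the C23 function; xfree_sized is an invented wrapper name.

```c
#include <stdlib.h>

/* HAVE_FREE_SIZED is a hypothetical macro that a configure
 * script would define when the C library provides the C23
 * free_sized function. */
static void xfree_sized(void *p, size_t n)
{
#ifdef HAVE_FREE_SIZED
    free_sized(p, n);   /* n must match the allocated size */
#else
    (void)n;            /* size is unused in the fallback */
    free(p);
#endif
}
```

Callers that do not know the exact allocated size, such as the strdup-like case, would keep calling plain free.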

Keith Thompson

unread,

Jan 21, 2024, 4:31:58 PM1/21/24

to

David Brown <david...@hesbynett.no> writes:
[...]

> Personally, I think it would be often more efficient in modern C if
> the allocation system didn't track sizes at all, and "free" passed the
> original size as a parameter. But that ship sailed long ago for
> standard C. (C++ supports a sized deallocation system, and of course
> there's nothing to stop you making your own allocator system for C.)

I suspect that calling malloc() with one size and free() with a
different one would have been a rich source of subtle bugs.

bart

unread,

Jan 21, 2024, 4:38:16 PM1/21/24

to

On 21/01/2024 20:54, David Brown wrote:
> On 21/01/2024 21:34, bart wrote:
>> On 21/01/2024 20:22, Chris M. Thomasson wrote:
>>> On 1/21/2024 3:55 AM, bart wrote:
>>
>>>> malloc has sort of created a rod for its own back by needing to
>>>> store the size of the allocation. That will take up some space even
>>>> when malloc(0) is called, if NULL is not being returned.
>>> [...]
>>>
>>> Why would malloc need to store the size of its allocations?
>>>
>>
>> If you do this:
>>
>>      p = malloc(N);
>>      ...
>>      free(p);
>>
>> 'free' will need to know what N was used to allocate p in order to
>> deallocate right size of memory.
>>
>
> It does not need to store the size of the allocation.

So it doesn't specifically need to store N for free to do its job ...

> It merely has to
> store sufficient information to be able to figure out how to deallocate
> properly. And any storage it needs can be in a different place from the
> allocation.
>
> Some malloc/free systems track the information separately from the
> allocation data. One possibility is to use pools for different memory
> sizes. When the user asks for X bytes, this is rounded up to the
> nearest pool size - call it Y - and the allocation is made from the Y
> pool, which can be viewed as an array of Y-size lumps. Only need one
> single bit to track each allocation, to say which indexes in the array
> are used. There are many other ways to do it.
>
> Personally, I think it would be often more efficient in modern C if the
> allocation system didn't track sizes at all, and "free" passed the
> original size as a parameter.

... but here you're specifically passing N for free to do its job.
Suggesting this value is a good way to determine the necessary info.

Kaz Kylheku

unread,

Jan 21, 2024, 4:49:31 PM1/21/24

to

On 2024-01-21, Keith Thompson <Keith.S.T...@gmail.com> wrote:
> David Brown <david...@hesbynett.no> writes:
> [...]
>> Personally, I think it would be often more efficient in modern C if
>> the allocation system didn't track sizes at all, and "free" passed the
>> original size as a parameter. But that ship sailed long ago for
>> standard C. (C++ supports a sized deallocation system, and of course
>> there's nothing to stop you making your own allocator system for C.)
>
> I suspect that calling malloc() with one size and free() with a
> different one would have been a rich source of subtle bugs.

Welcome to C23! free_sized(malloc(42), 73) -> UB.

bart

unread,

Jan 21, 2024, 5:09:59 PM1/21/24

to

On 21/01/2024 21:49, Kaz Kylheku wrote:
> On 2024-01-21, Keith Thompson <Keith.S.T...@gmail.com> wrote:
>> David Brown <david...@hesbynett.no> writes:
>> [...]
>>> Personally, I think it would be often more efficient in modern C if
>>> the allocation system didn't track sizes at all, and "free" passed the
>>> original size as a parameter. But that ship sailed long ago for
>>> standard C. (C++ supports a sized deallocation system, and of course
>>> there's nothing to stop you making your own allocator system for C.)
>>
>> I suspect that calling malloc() with one size and free() with a
>> different one would have been a rich source of subtle bugs.
>
> Welcome to C23! free_sized(malloc(42), 73) -> UB.
>

I've used such a sized free function in my own libraries for a long time.

Yes, you can pass the wrong size (although there are debugging options
where it will keep track of the sizes and double-check, if you suspect
such a bug).

But the real problems are the same as they are now in C:

free(p) when p has not been assigned a value from malloc
free(p); free(p); call twice
--- forget to call free
free(q); free the wrong pointer
free(p+1); pass an invalid pointer

Getting a p=malloc() correctly matched up with just one free(p) in a
different part of the program, at a different time, is the hard bit.

The easy bit is this:

p = malloc(sizeof(*p));
...
free_sized(p, sizeof(*p));

(Is it still malloc or is there a malloc_sized? Since part of the point
is not wasting time and memory managing that extra metadata.)

Tim Rentsch

unread,

Jan 21, 2024, 5:33:27 PM1/21/24

to

It's trivial to fix that problem: simply require implementations
to define a preprocessor symbol about how the implementation
works. Problem solved.

(There are other instances of implementation-defined behavior
that would benefit from analogous changes along these lines.)

Tim Rentsch

unread,

Jan 21, 2024, 5:38:18 PM1/21/24

to

Kaz Kylheku <433-92...@kylheku.com> writes:

> On 2024-01-21, Lawrence D'Oliveiro <l...@nz.invalid> wrote:
>
>> On Sat, 20 Jan 2024 19:50:31 -0800, Keith Thompson wrote:
>>
>>> If realloc(ptr, 0) returns a null pointer there's no way to tell whether
>>> allocation failed (and ptr has not been freed) ...
>>
>> Actually, that doesn't seem like a reasonable interpretation, because it
>> leads to memory leaks.
>
> That's not an "interpretation"; that's just a fact.
>
> A null return from realloc has two documented meanings; situations exist
> in which it cannot be distinguished which one, in any portable way.

I don't agree that they can't be distinguished in _any_ portable
way.

I suppose a case could be made that they can't be distinguished in
any _convenient_ portable way. But that's not the same as _any_
portable way.

Keith Thompson

unread,

Jan 21, 2024, 6:04:37 PM1/21/24

to

I tend to agree that such a preprocessor symbol would be an improvement.

I still think, as I wrote above, that removing the permission to return
a null pointer on a successful zero-sized allocation would be a greater
improvement.

A preprocessor symbol makes it easier for programmers to work around the
potential difference between implementations. The change I advocate
would make it completely unnecessary.

Except, of course, that most code would still have to allow for pre-C26
behavior, even if the change were adopted in C26. That's unavoidable in
the absence of time machines. On the gripping hand, since it seems that
most existing implementations (well, all of the few I've tried) return a
non-null pointer for malloc(0), it might be reasonable to ignore the few
pre-C26 implementations that return a null pointer.

Malcolm McLean

unread,

Jan 21, 2024, 6:50:05 PM1/21/24

to

The thing is you have to be disciplined. I have a fairly strict rule
that either allocation and free must be in the same function, or if a
function returns a structure that needs to be operated on, there is a
constructor which sets it up in a valid state and a matching destructor
(I always use the prefix kill, though a colleague who had been in the
military objected because it reminded him too much of the army).

One difficulty is when you pass up an allocated pointer which is not to
a custom structure, eg from strdup(). So you have to break the rule and
treat the strdup call as though it were a call to malloc. Then rarely
you have graph like structures which add and subtract nodes. So you
isolate the code that adds or subtracts the node, and set up the whole
thing in a valid (but maybe empty) state in one call, and destroy the
whole thing in another. But you can't keep to the rule that each
addition of a node will be matched by a freeing of the node in the same
function.
--
Check out Basic Algorithms and my other books:
https://www.lulu.com/spotlight/bgy1mm

Chris M. Thomasson

unread,

Jan 21, 2024, 8:04:39 PM1/21/24

to

Why?

Chris M. Thomasson

unread,

Jan 21, 2024, 8:05:46 PM1/21/24

to

Think of a block of memory aligned on a very large boundary, and padded
up to a boundary size. So, we free take take its address, round down to
a boundary and get at a header...

Chris M. Thomasson

unread,

Jan 21, 2024, 8:09:54 PM1/21/24

to

On 1/21/2024 5:05 PM, Chris M. Thomasson wrote:
[...]

> Think of a block of memory aligned on a very large boundary, and padded
> up to a boundary size. So, we free take take its address, round down to
> a boundary and get at a header...

God damn typos!

free takes its address, rounds down to a boundary, and can get at a
header structure.

This works fine if we make sure we take alignment and a large
boundary into account. Basically, we need each allocation to be at
least as large as a word/pointer, if you will. This pointer is used for a
linked list in the header, that sits on alignment boundaries.
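
A sketch of the round-down step, assuming a 4096-byte power-of-two boundary and C11 aligned_alloc (which some platforms lack). The header layout and all names are invented, and the pointer/integer round trip through uintptr_t is implementation-defined, though it behaves as expected on conventional flat-address machines.

```c
#include <stdint.h>
#include <stdlib.h>

enum { REGION_SIZE = 4096 };   /* assumed power-of-two boundary */

/* Header at the start of each aligned region; layout invented
 * for this sketch. */
struct region_header {
    size_t block_size;          /* size of every block in region */
    void  *free_list;           /* linked list of free blocks */
};

/* Allocate one region on a REGION_SIZE boundary (C11 aligned_alloc). */
static struct region_header *region_create(size_t block_size)
{
    struct region_header *h = aligned_alloc(REGION_SIZE, REGION_SIZE);
    if (h != NULL) {
        h->block_size = block_size;
        h->free_list = NULL;
    }
    return h;
}

/* Given any pointer into a region, mask off the low bits to
 * reach the header at the start of that region. */
static struct region_header *header_of(void *p)
{
    uintptr_t addr = (uintptr_t)p & ~(uintptr_t)(REGION_SIZE - 1);
    return (struct region_header *)addr;
}
```

A free built this way reads block_size from the header instead of from a per-allocation size word.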

Chris M. Thomasson

unread,

Jan 21, 2024, 8:14:36 PM1/21/24

to

On 1/21/2024 1:16 PM, Kaz Kylheku wrote:
> On 2024-01-21, bart <b...@freeuk.com> wrote:
>> On 21/01/2024 20:22, Chris M. Thomasson wrote:
>>> On 1/21/2024 3:55 AM, bart wrote:
>>
>>>> malloc has sort of created a rod for its own back by needing to store
>>>> the size of the allocation. That will take up some space even when
>>>> malloc(0) is called, if NULL is not being returned.
>>> [...]
>>>
>>> Why would malloc need to store the size of its allocations?
>>>
>>
>> If you do this:
>>
>> p = malloc(N);
>> ...
>> free(p);
>>
>> 'free' will need to know what N was used to allocate p in order to
>> deallocate right size of memory.
>
> While that is true in the abstract, it is not necessarily the case
> that it needs to pull N out of the p object.
>
> For instance, suppose N is a 64 byte allocation, and the allocator
> has special heaps for small allocations.
>
> It can figure out that p points into a heap that has 64 byte objects
> and then do some pointer arithmetic to figure out which one,
> and add that to a free list or bitmap or whatever.

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Right!

Malcolm McLean

unread,

Jan 21, 2024, 9:05:17 PM1/21/24

to

It depends how malloc is implemented.
But the traditional way, using a contiguous pool and a linked list of
free blocks, each allocation is for a fixed number of units the same
size as the header, and you need to know how many units are in the block
to add it to the free list on deallocation, and also for
defragmentation. So the header is a pointer to the next free block plus
a size.
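
For reference, a sketch of that header in the style of the K&R section 8.7 allocator. The union with max_align_t (C11) is this sketch's way of forcing alignment, and units_for is an invented helper name; the rounding formula is the usual one for sizing in header units.

```c
#include <stddef.h>

/* Block header: a pointer to the next free block plus a size,
 * padded to the strictest alignment so that the payload that
 * follows is suitably aligned for any object. */
typedef union header {
    struct {
        union header *next;  /* next block on the free list */
        size_t units;        /* size of this block in header units */
    } s;
    max_align_t align;       /* force worst-case alignment */
} Header;

/* Number of header-sized units needed for nbytes of payload,
 * plus one unit for the header itself. */
static size_t units_for(size_t nbytes)
{
    return (nbytes + sizeof(Header) - 1) / sizeof(Header) + 1;
}
```

With this layout even a zero-byte request costs one whole unit for the header.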

Chris M. Thomasson

unread,

Jan 21, 2024, 9:37:18 PM1/21/24

to

Think of free rounding a block down to a large boundary, where access to
a header is right there...

Richard Kettlewell

unread,

Jan 22, 2024, 3:13:00 AM1/22/24

to

Keith Thompson <Keith.S.T...@gmail.com> writes:

> Tim Rentsch <tr.1...@z991.linuxsc.com> writes:
>> It's trivial to fix that problem: simply require implementations
>> to define a preprocessor symbol about how the implementation
>> works. Problem solved.
>>
>> (There are other instances of implementation-defined behavior
>> that would benefit from analogous changes along these lines.)
>
> I tend to agree that such a preprocessor symbol would be an improvement.
>
> I still think, as I wrote above, that removing the permission to
> return a null pointer on a successful zero-sized allocation would be a
> greater improvement.

Fully agreed. That permission has been grit in the gears for a very long
time, for much of which I had the misfortune of having to deal with it
in real life thanks to IBM’s bad decisions.

Fixing it would have been, what, a 2-line change to the impacted
implementations, but apparently it’s better for all the users to deal
with the consequences instead.

> A preprocessor symbol makes it easier for programmers to work around the
> potential difference between implementations. The change I advocate
> would make it completely unnecessary.
>
> Except, of course, that most code would still have to allow for pre-C26
> behavior, even if the change were adopted in C26. That's unavoidable in
> the absence of time machines. On the gripping hand, since it seems that
> most existing implementations (well, all of the few I've tried) return a
> non-null pointer for malloc(0), it might be reasonable to ignore the few
> pre-C26 implementations that return a null pointer.

“Reasonable” isn’t really relevant (at least in my working environment):
either my code has to run on such implementations or it doesn’t.

--
https://www.greenend.org.uk/rjk/

David Brown

unread,

Jan 22, 2024, 3:20:48 AM1/22/24

to

On 21/01/2024 22:20, Kaz Kylheku wrote:
> On 2024-01-21, David Brown <david...@hesbynett.no> wrote:
>> Personally, I think it would be often more efficient in modern C if the
>> allocation system didn't track sizes at all, and "free" passed the
>> original size as a parameter. But that ship sailed long ago for
>> standard C.
>
> That ship newly sailed in 2023.
>
> The N3096 draft describes a function free_sized that takes a size. If
> the size is wrong, the behavior is undefined.
>

I hadn't noticed that addition in C23 - I'll look it up.

> So now C has two ways to free an object: an efficient one where
> the program helps by giving the size, and the old free, which
> may have to do more work.

It would be a lot more efficient if the sized free was matched with an
appropriate malloc - if malloc() still has to track the size somewhere
in case the user calls free() instead of free_sized(), the gains are
much less.

David Brown

unread,

Jan 22, 2024, 3:24:46 AM1/22/24

to

On 21/01/2024 22:31, Keith Thompson wrote:
> David Brown <david...@hesbynett.no> writes:
> [...]
>> Personally, I think it would be often more efficient in modern C if
>> the allocation system didn't track sizes at all, and "free" passed the
>> original size as a parameter. But that ship sailed long ago for
>> standard C. (C++ supports a sized deallocation system, and of course
>> there's nothing to stop you making your own allocator system for C.)
>
> I suspect that calling malloc() with one size and free() with a
> different one would have been a rich source of subtle bugs.
>

Sure. But that's part of the fun of C :-)

Many calls to malloc are of the form :

p = malloc(sizeof(*p));

so freeing them with :

free_sized(p, sizeof(*p));

should not be a significant risk.

There are circumstances where you'd have to track the size as well, and
there is then plenty of scope for mistakes.

Richard Kettlewell

unread,

Jan 22, 2024, 5:21:13 AM1/22/24

to

Tim Rentsch <tr.1...@z991.linuxsc.com> writes:
> That's a half-assed argument. There are other ways a pointer might
> have a null value than just being the result of a call to malloc().
> If code might call memset() et al with a zero size and a null
> pointer, it's better to address all possible cases at once rather
> than just some of them:
>
> static inline void *
> safer_memset( void *s, int c, size_t n ){
> return n ? memset( s, c, n ) : s;
> }
>
> static inline void *
> safer_memcpy( void *d, const void *s, size_t n ){
> return n ? memcpy( d, s, n ) : d;
> }
>
> /* ... etc ... */

Of course that’s what the cautious programmer must do in practice. But in
terms of the total cost (over all users, implementers, etc) fixing the
definitions of memcpy/memset/etc (as well as malloc) would have been the
better answer.

Standard C is trying to have its own cake and eat it here: 0-sized
allocations can be represented by null pointers when it’s malloc, but
not when it’s memcpy.

--
https://www.greenend.org.uk/rjk/

Richard Kettlewell

unread,

Jan 22, 2024, 5:22:36 AM1/22/24

to

bart <b...@freeuk.com> writes:
> malloc has sort of created a rod for its own back by needing to store
> the size of the allocation. That will take up some space even when
> malloc(0) is called, if NULL is not being returned.

The alternative is making a rod for the back of every single caller
(i.e. all consumers must track allocation sizes). I think the design of
malloc got this one right.

--
https://www.greenend.org.uk/rjk/

bart

unread,

Jan 22, 2024, 6:27:20 AM1/22/24

to

On 22/01/2024 10:22, Richard Kettlewell wrote:
> bart <b...@freeuk.com> writes:
>> malloc has sort of created a rod for its own back by needing to store
>> the size of the allocation. That will take up some space even when
>> malloc(0) is called, if NULL is not being returned.
>
> The alternative is making a rod for the back of every single caller
> (i.e. all consumers must track allocation sizes).

Not at all. Let's first emulate a pair of functions where the caller is
expected to remember the size:

void* malloc_s (size_t n) {return malloc(n);}
void free_s (void* p, size_t n) {free(p);}

Then a typical alloc/dealloc sequence might look like this:

typedef struct {int d,m,y;}Date;
Date* p;

p=malloc_s(sizeof(*p));
...
free_s(p, sizeof(*p));

Is that particularly onerous? If you have a fixed-size object like a
struct, then you will always know its size.

For variable-sized objects, then yes you need to keep a record of the
size, but the chances are you have to do that anyway. For example to be
able to iterate over that dynamic array.

But if you really wanted (for example when allocating variable length,
zero-terminated strings), you can write a couple of wrappers around
malloc_s and free_s to emulate what malloc and free provide in terms of
not needing to remember the allocation size:

typedef long long int i64;

void* malloc2(i64 n) {
    void* p;

    p=malloc_s(n+sizeof(i64));
    if (p==NULL) return NULL;
    *((i64*)p) = n;
    return (char*)p+sizeof(i64);
}

void free2(void* p) {
    i64* q = (i64*)((char*)p-sizeof(i64));
    i64 size = *q;

    /* Pass the full size that malloc2 requested, header included. */
    free_s(q, size+sizeof(i64));
}

Untested code, it's to demonstrate what's involved: you ask for an
allocation 8 bytes bigger, use the 8 bytes at the beginning to store the
size, and return a pointer to just after those 8 bytes. (I'm assuming
8-byte alignment will suffice.)

Now you can do this:

p=malloc2(sizeof(*p));
...
free2(p);

> I think the design of malloc got this one right.

I think it got it wrong. Now everyone is lumbered with an allocation
scheme that will ALWAYS have to store the size of that struct, perhaps
taking as much space as the struct itself.

Imagine allocating 100M of those structs, and also storing 100M copies
of sizeof(Date) or whatever metadata is needed.

Getting around that, by writing your own small-object allocator on top
of malloc, is a lot harder than adding your own size-tracking on top of
a malloc that does not store any extra data. As I've shown.

(This is also a scheme where, if a user needs to get the size of an
allocation block, it can do so:

i64 size_s(void* p) {
    i64* q = (i64*)((char*)p-sizeof(i64));
    return *q;
}

But this will be the requested size not the capacity of the allocated
block. That would depend on how malloc_s/free_s are implemented.)

bart

unread,

Jan 22, 2024, 6:55:51 AM1/22/24

to

Try writing an allocator with zero memory overheads that doesn't spend
all its time on trying to deduce the extent of the block being allocated.

Apparently, 'malloc' on my machines doesn't know about such techniques,
because if I run this program:

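#include <stdio.h>
#include <stdlib.h>

int main(void) {
    enum {n=1024};
    char* lastp = malloc(n);
    char* p;

    for(int i=0; i<10; ++i) {
        p=malloc(n);
        /* %p needs a void*, and a pointer difference is a ptrdiff_t (%td) */
        printf("%p %td\n", (void*)p, p-lastp);
        lastp=p;
    }
}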

Then on Windows I might get:

0000026d06d15b30 18288
0000026d06d15f40 1040
0000026d06d16350 1040
0000026d06d16760 1040
0000026d06d16b70 1040
0000026d06d16f80 1040
0000026d06d17390 1040
0000026d06d177a0 1040
0000026d06d17bb0 1040
0000026d06d17fd0 1056

Some odd results, but generally it uses 16 bytes more than 1024. If I
run it under WSL, the same thing. Also on rextester.com.

You're saying it should be only 1024 bytes that are occupied? OK, tell
the authors of these various mallocs that.

BTW if I run this program (in my language which uses an allocator that
requires a size to free):

const n=1024
ref byte p:=pcm_alloc(n), lastp

to 10 do
p:=pcm_alloc(n)
println p, p-lastp
lastp:=p
od

I get these results:

02E2D440 1024
02E2D840 1024
02E2DC40 1024
02E2E040 1024
02E2E440 1024
02E2E840 1024
02E2EC40 1024
02E2F040 1024
02E2F440 1024
02E2F840 1024

Unless I use a bigger N, since then it switches to using malloc, and I
get similar results to above. But the overheads then are less significant.
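To illustrate why requiring the size on free lets consecutive blocks sit exactly 1024 bytes apart, here is a rough sketch of a sized allocator in C. It is not pcm_alloc; the names sized_alloc and sized_free, the 64 KiB chunk size, the 2048-byte cutoff and the 16-byte granularity are all made up for the example. Small requests are carved from a chunk with no header, freed blocks go on a per-size free list, and large requests fall through to malloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical sized allocator: the caller supplies the size again on
   free, so no per-block header is stored. All constants are arbitrary. */
enum { CHUNK = 64 * 1024, MAX_SMALL = 2048, GRAIN = 16 };

typedef struct FreeNode { struct FreeNode *next; } FreeNode;

static FreeNode *free_lists[MAX_SMALL / GRAIN + 1];  /* one list per rounded size */
static char *chunk_pos, *chunk_end;                  /* current bump region */

/* Index of the free list for a request of n bytes (n <= MAX_SMALL). */
static size_t list_index(size_t n) {
    size_t idx = (n == 0) ? 1 : (n + GRAIN - 1) / GRAIN;
    assert(idx < sizeof free_lists / sizeof free_lists[0]);
    return idx;
}

void *sized_alloc(size_t n) {
    if (n > MAX_SMALL)
        return malloc(n);                 /* large requests go to malloc */
    size_t idx = list_index(n);
    size_t r = idx * GRAIN;               /* rounded block size */
    FreeNode *node = free_lists[idx];
    if (node != NULL) {                   /* reuse a freed block of this size */
        free_lists[idx] = node->next;
        return node;
    }
    if (chunk_pos == NULL || (size_t)(chunk_end - chunk_pos) < r) {
        chunk_pos = malloc(CHUNK);        /* tail of the old chunk is abandoned */
        if (chunk_pos == NULL)
            return NULL;
        chunk_end = chunk_pos + CHUNK;
    }
    void *p = chunk_pos;
    chunk_pos += r;                       /* next block starts right here */
    return p;
}

void sized_free(void *p, size_t n) {
    if (p == NULL)
        return;
    if (n > MAX_SMALL) {
        free(p);
        return;
    }
    FreeNode *node = p;                   /* the freed block itself holds the link */
    size_t idx = list_index(n);
    node->next = free_lists[idx];
    free_lists[idx] = node;
}
```

The cost is pushed onto the caller, who has to pass the same size to sized_free that was passed to sized_alloc; getting it wrong corrupts the free lists.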

bart

Jan 22, 2024, 7:01:04 AM

On 22/01/2024 11:55, bart wrote:

>
> BTW if I run this program (in my language which uses an allocator that
> requires a size to free):
>
> const n=1024
> ref byte p:=pcm_alloc(n), lastp

(p and lastp are declared in the wrong order. The results shown are from
the fixed version.)

> I get these results:
>
> 02E2D440 1024
> 02E2D840 1024

...

(If I compile to an .obj file, which enables high-memory code, and link
using gcc, then it will show the bigger addresses like the C, but still
with 1024-byte allocations.)

Malcolm McLean

Jan 22, 2024, 7:36:49 AM

On 21/01/2024 12:04, Tim Rentsch wrote:
> Kaz Kylheku <433-92...@kylheku.com> writes:
>
> [...]
>
>> Not requiring the non-null return from malloc(0) to be distinct
>> from previous malloc(0) return values (whether they were freed or not),
>> could help to "sell" the idea of taking away the null return value.
>>
>> Some implementors might grumble that null return allowed malloc(0) to be
>> efficient by not allocating anything. If they were allowed to return
>> (void *) -1 or something, that would placate that concern. [...]
>
> You have the tail wagging the dog here. If the results of
> different malloc(0) calls don't need to be distinguishable,
> they might just as well all be null.

No, because it's natural to write

    employees = malloc(Nemployees * sizeof(EMPLOYEE));
    if (!employees)
        goto out_of_memory;

You don't want to have to special case Nemployees == 0.
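As things stand, one way to avoid the special case at every call site is to push it into a small wrapper. This is only a sketch; alloc_array is a hypothetical helper, not anything from the standard library. It asks malloc for one byte when the computed size is zero, so a null return always means failure, and it also rejects a count that would overflow the multiplication.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate count elements of elem_size bytes.
   A null return always means failure, even when count is 0. */
void *alloc_array(size_t count, size_t elem_size) {
    assert(elem_size != 0);              /* precondition for this sketch */
    if (count > SIZE_MAX / elem_size)
        return NULL;                     /* count * elem_size would overflow */
    size_t bytes = count * elem_size;
    return malloc(bytes ? bytes : 1);    /* never pass 0 to malloc */
}
```

With that, the fragment above becomes employees = alloc_array(Nemployees, sizeof(EMPLOYEE)); and the out_of_memory test stays as written.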

Keith Thompson

Jan 22, 2024, 11:06:27 AM

Richard Kettlewell <inv...@invalid.invalid> writes:
> Keith Thompson <Keith.S.T...@gmail.com> writes:
>> Tim Rentsch <tr.1...@z991.linuxsc.com> writes:
>>> It's trivial to fix that problem: simply require implementations
>>> to define a preprocessor symbol about how the implementation
>>> works. Problem solved.
>>>
>>> (There are other instances of implementation-defined behavior
>>> that would benefit from analogous changes along these lines.)
>>
>> I tend to agree that such a preprocessor symbol would be an improvement.
>>
>> I still think, as I wrote above, that removing the permission to
>> return a null pointer on a successful zero-sized allocation would be a
>> greater improvement.
>
> Fully agreed. That permission has been grit in the gears for a very long
> time, for much of which I had the misfortune of having to deal with it
> in real life thanks to IBM’s bad decisions.

Can you expand on "IBM's bad decisions"?

[...]

Keith Thompson

Jan 22, 2024, 11:23:43 AM

There are three scenarios being considered.

1. malloc(0) always returns a null pointer.
2. All malloc(0) calls return the same non-null pointer, perhaps to a
single pre-allocated object.
3. malloc(0) acts like malloc(1), returning a null pointer only if
memory has been exhausted.

Implementations currently choose between 1 and 3; 2 would be
non-conforming. Kaz suggested allowing (or requiring?) 2. (I advocate
requiring 3, which is the current behavior of all the implementations
I've tried.)

Tim is talking about scenario 2, wh

Keith Thompson

Jan 22, 2024, 11:27:39 AM

[Ignore my previous followup. I accidentally posted it before I
finished writing it.]

There are three scenarios being considered.

1. malloc(0) always returns a null pointer.
2. All malloc(0) calls return the same non-null pointer, perhaps to a
single pre-allocated object.
3. malloc(0) acts like malloc(1), returning a null pointer only if
memory has been exhausted.

Implementations currently choose between 1 and 3; 2 would be
non-conforming. Kaz suggested allowing (or requiring?) 2. (I advocate
requiring 3, which is the current behavior of all the implementations
I've tried.)

Tim is talking about scenario 2. Your code sample would work correctly
in that scenario.
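For anyone who wants to check which of the three scenarios a given implementation follows, a rough probe could look like the following. The names ZeroPolicy and probe_malloc0 are invented for this sketch, and the result is only a heuristic: a null return could also mean genuine memory exhaustion under scenario 3.

```c
#include <assert.h>
#include <stdlib.h>

/* Invented names; the three values mirror scenarios 1, 2 and 3 above. */
typedef enum { ZERO_NULL = 1, ZERO_SHARED = 2, ZERO_DISTINCT = 3 } ZeroPolicy;

ZeroPolicy probe_malloc0(void) {
    void *a = malloc(0);
    void *b = malloc(0);     /* requested while a is still live */
    ZeroPolicy result;

    if (a == NULL && b == NULL)
        result = ZERO_NULL;          /* scenario 1 (or memory exhausted) */
    else if (a == b)
        result = ZERO_SHARED;        /* scenario 2 */
    else
        result = ZERO_DISTINCT;      /* scenario 3 */

    free(a);
    if (b != a)
        free(b);                     /* avoid freeing a shared pointer twice */
    return result;
}

/* Scenario 2 is the one described above as non-conforming. */
void check_malloc0_conforming(void) {
    assert(probe_malloc0() != ZERO_SHARED);
}
```

Note that an optimizing compiler is allowed to remove or fold malloc/free pairs like these, so the probe is best built without optimization if the actual library behavior is what you want to see.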

Scott Lurndal

Jan 22, 2024, 11:35:09 AM

I see no problem in special casing it.