Black magic, or insanity?

Robbie Brown

Jan 21, 2014, 7:33:12 AM
I've been reviewing what I've learned about pointers.

I thought I'd do a few tests just to consolidate what I thought I'd
learned and frankly .. I'm dumfounded.

int main(int argc, char *argv[]){

//declare a pointer to int
int *ip;

//print ... what exactly, prints 'nil'
printf("ip is %p\n", ip);
//dereference the pointer, seg fault
printf("*ip is %d\n", *ip);

}

the output is what I expected

ip is (nil)
Segmentation fault (core dumped)

I then add the following statement after the last printf

int *ip2;

compile and exec and get the same output

ip is (nil)
Segmentation fault (core dumped)

Now then, the next bit is a total head****

If I modify the last statement so that it reads

int *ip2 = NULL;

so the code is now

int main(int argc, char *argv[]){

//declare a pointer to int
int *ip;

//print ... what exactly, prints 'nil'
printf("ip is %p\n", ip);
//dereference the pointer, seg fault
printf("*ip is %d\n", *ip);

int *ip2 = NULL;

}

then compile and exec I get the following

ip is 0x7fff0dfeb230
*ip is 1

WTF!!! ... how does initializing ip2 to NULL cause the
previous code to now display ... something.

Is this for real?
I mean seriously, this is just ... what

I have no idea

Dazed and confused.


--
rob

Zoltan Kocsi

Jan 21, 2014, 8:47:21 AM
On Tue, 21 Jan 2014 12:33:12 +0000
Robbie Brown <d...@nomail.invalid> wrote:

> I've been reviewing what I've learned about pointers.
>
> I thought I'd do a few tests just to consolidate what I thought I'd
> learned and frankly .. I'm dumfounded.
>
> int main(int argc, char *argv[]){
>
> //declare a pointer to int
> int *ip;
>
> //print ... what exactly, prints 'nil'
> printf("ip is %p\n", ip);
> //dereference the pointer, seg fault
> printf("*ip is %d\n", *ip);
>
> }
>
> the output is what I expected
>
> ip is (nil)
> Segmentation fault (core dumped)

Your expectation is completely wrong. The fact that ip is nil is due to
luck. You do not initialise it. Automatic variables (i.e. ones defined
inside a function without the 'static' keyword) are *not* initialised
by the compiler. Whatever junk is on the stack, that's the initial
value. If your compiler does any optimisation, then it's not even
the stack. Most likely ip was allocated in a register, which the start
code (which executes before your main() enters) happened to set to 0.

> [ snip ]
> WTF!!! ... how does initializing ip2 to NULL cause the
> previous code to now display ... something.

Chances are, ip was now allocated in a different register, due to the
need of allocating space for ip2. The new register contained a valid
address.

Since you have not initialised the pointers and they were not in the
BSS, you could expect nothing, absolutely nothing about their values.

Any decent compiler should have given you a warning about the
uninitialised nature of ip. Also note that even zeroing the BSS is a
hosted environment thing, many embedded systems do not initialise the
memory before starting main() at all.
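
A minimal sketch of the storage-duration difference described above. The names `static_ptr`, `static_pointer_is_null` and `automatic_pointer_is_null` are made up for illustration. In a hosted implementation, an object with static storage duration and no initialiser starts out as a null pointer; an automatic one has no such guarantee, so the sketch initialises it explicitly rather than reading whatever happens to be there.

```c
#include <stddef.h>

/* File scope: static storage duration, so with no initialiser this
 * starts out as a null pointer. */
static int *static_ptr;

/* Returns 1 if the static pointer is null. */
int static_pointer_is_null(void)
{
    return static_ptr == NULL;
}

/* An automatic pointer gets no default value. It is initialised here;
 * without "= NULL" the comparison would read an indeterminate value. */
int automatic_pointer_is_null(void)
{
    int *ip = NULL;
    return ip == NULL;
}
```

Both functions return 1, but only because the second one sets `ip` itself; remove the initialiser and nothing can be said about the result.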

Zoltan
--
Zoltán Kócsi
Bendor Research Pty. Ltd.

Ben Bacarisse

Jan 21, 2014, 9:09:32 AM
Robbie Brown <d...@nomail.invalid> writes:

> I've been reviewing what I've learned about pointers.
>
> I thought I'd do a few tests just to consolidate what I thought I'd
> learned and frankly .. I'm dumfounded.

You just need to re-adjust your expectations. All of your examples have
what C calls undefined behaviour. The language standard does not say
what should happen, so compilers can do pretty much what they like.
Having any expectation at all is going to lead to puzzlement.

If, on the other hand, you want to know what is actually going on, then
just look at the generated code, but keep in mind that this will tell
you about one version of one compiler with one set of command-line flags
on one system at some particular time. You probably won't learn much of
use.

<snip>
--
Ben.

Robbie Brown

Jan 21, 2014, 9:14:27 AM
On 21/01/14 13:47, Zoltan Kocsi wrote:
> On Tue, 21 Jan 2014 12:33:12 +0000
> Robbie Brown <d...@nomail.invalid> wrote:
>
>> I've been reviewing what I've learned about pointers.

<snip>

> Any decent compiler should have given you a warning about the
> uninitialised nature of ip.

Hmm, I'm using gcc version 4.6.3 ... is this a 'decent compiler'

gcc -std=gnu99 -Wall pointers.c -g -o pointers
gives no warnings about uninitialised anything.

I hear what you are saying though and have taken it on board.

Thanks for your time

--
rob

Robbie Brown

Jan 21, 2014, 9:19:59 AM
On 21/01/14 14:09, Ben Bacarisse wrote:
> Robbie Brown <d...@nomail.invalid> writes:
>
>> I've been reviewing what I've learned about pointers.
>>
>> I thought I'd do a few tests just to consolidate what I thought I'd
>> learned and frankly .. I'm dumfounded.
>
> You just need to re-adjust your expectations. All of your examples have
> what C calls undefined behaviour. The language standard does not say
> what should happen, so compilers can do pretty much what they like.
> Having any expectation at all is going to lead to puzzlement.

I'm discovering this, fascinating stuff.

Thanks


--
rob

Eric Sosman

Jan 21, 2014, 9:43:19 AM
On 1/21/2014 9:14 AM, Robbie Brown wrote:
> On 21/01/14 13:47, Zoltan Kocsi wrote:
>> On Tue, 21 Jan 2014 12:33:12 +0000
>> Robbie Brown <d...@nomail.invalid> wrote:
>>
>>> I've been reviewing what I've learned about pointers.
>
> <snip>
>
>> Any decent compiler should have given you a warning about the
>> uninitialised nature of ip.
>
> Hmm, I'm using gcc version 4.6.3 ... is this a 'decent compiler'
>
> gcc -std=gnu99 -Wall pointers.c -g -o pointers
> gives no warnings about uninitialised anything.

Strange. Even a much older (4.4.1) gcc gives me

foo.c: In function 'main':
foo.c:7: warning: implicit declaration of function 'printf'
foo.c:7: warning: incompatible implicit declaration of built-in
function 'printf'
foo.c:7: warning: 'ip' is used uninitialized in this function

A truly ancient (3.4.4) version emits only the `printf' warning,
but if invoked with optimization at -O1 or higher it also squawks
"warning: 'ip' might be used uninitialized in this function" (note
"might be" rather than "is"; this could be a different warning).

Wild guess: The detection of uninitialized uses depends on data
developed while optimizing, and the default optimization level when
no -Ox is specified varies from one gcc version to another. Try
adding -O1 or -O2 (or even -O3) to your command line, to see if
the compiler will offer more commentary.

--
Eric Sosman
eso...@comcast-dot-net.invalid

Kaz Kylheku

Jan 21, 2014, 12:27:52 PM
On 2014-01-21, Robbie Brown <d...@nomail.invalid> wrote:
> I've been reviewing what I've learned about pointers.
>
> I thought I'd do a few tests just to consolidate what I thought I'd
> learned and frankly .. I'm dumfounded.
>
> int main(int argc, char *argv[]){
>
> //declare a pointer to int
> int *ip;

Since this is a non-static local variable that is uninitialized, it contains
data which is traditionally called "garbage" in programmer lingo.

In C standard formal terms, its value is "indeterminate": which means that
it is an unspecified value which may be a trap representation.

By dumb luck, this indeterminate value could look like a valid pointer,
and dereference successfully.

The indeterminate garbage inside ip could be different upon different
executions of the program, and could be influenced by changes to seemingly
irrelevant parts of the program.
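
One way to stay out of this territory, as a hedged sketch: give every pointer a known value when it is declared, and test it before dereferencing. The helper `read_or_default` is hypothetical. Note that the null check only means something if the pointer was initialised in the first place; it cannot detect garbage.

```c
#include <stddef.h>

/* Hypothetical helper: returns *p when p is non-null, otherwise fallback.
 * The caller must pass either NULL or a pointer to a live int. */
int read_or_default(const int *p, int fallback)
{
    return (p != NULL) ? *p : fallback;
}
```

With `int *ip = NULL;` at the point of declaration, `read_or_default(ip, -1)` yields -1 instead of dereferencing an indeterminate pointer.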

> //print ... what exactly, prints 'nil'
> printf("ip is %p\n", ip);

This is undefined behavior already: you're accessing the value of the
indeterminately-valued object ip.

> //dereference the pointer, seg fault
> printf("*ip is %d\n", *ip);

We have no basis for expecting a "seg fault" here. The behavior here is
also undefined for the same reason. Undefined means not defined by the ISO
standard document which describes the C language. (If there were a requirement
to produce a segmentation fault, that would be a definition of behavior; it
would not be "undefined".)

In the case of some undefined behaviors, we do have a basis for expecting
some particular behavior on a particular platform. That happens when the
language implementors give us a definition, or else we can otherwise deduce
the behavior from the structure of the platform, or from knowing something
about the compiler behavior, etc.

Keith Thompson

Jan 21, 2014, 3:36:16 PM
Robbie Brown <d...@nomail.invalid> writes:
> I've been reviewing what I've learned about pointers.
>
> I thought I'd do a few tests just to consolidate what I thought I'd
> learned and frankly .. I'm dumfounded.
>
> int main(int argc, char *argv[]){
>
> //declare a pointer to int
> int *ip;
>
> //print ... what exactly, prints 'nil'
> printf("ip is %p\n", ip);
> //dereference the pointer, seg fault
> printf("*ip is %d\n", *ip);
>
> }
[...]

This is not directly relevant to your question, but the "%p" printf
format expects an argument of type void*. You're giving it an argument
of type int*, which strictly speaking causes undefined behavior.

It's very very likely to work correctly on any system where void* and
int* have the same representation (which is the vast majority of
existing systems), but for maximum portability you should cast the
pointer value to void*:

printf("ip is %p\n", (void*)ip);

This is one of the few cases where casting, particularly pointer
casting, is a good habit.
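
A small sketch of the same cast in a reusable form. The function name `format_address` is an assumption made for this example; `snprintf` is used instead of `printf` only so the result can be inspected. The text that "%p" produces is implementation-defined, so nothing should depend on its exact form.

```c
#include <stdio.h>

/* Writes the address held in p into buf using "%p".
 * "%p" expects a void *, so the int * is converted explicitly.
 * Returns the value returned by snprintf. */
int format_address(char *buf, size_t size, int *p)
{
    return snprintf(buf, size, "%p", (void *)p);
}
```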

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Helmut Tessarek

Jan 21, 2014, 4:20:47 PM
On 21.01.14 7:33 , Robbie Brown wrote:
>
> Is this for real?
> I mean seriously, this is just ... what

check out my mail signature. it will also answer your question.

--
Helmut K. C. Tessarek

/*
Thou shalt not follow the NULL pointer for chaos and madness
await thee at its end.
*/

Robbie Brown

Jan 21, 2014, 4:39:14 PM
On 21/01/14 21:20, Helmut Tessarek wrote:
> On 21.01.14 7:33 , Robbie Brown wrote:
>>
>> Is this for real?
>> I mean seriously, this is just ... what
>
> check out my mail signature. it will also answer your question.
>

Heh, that's about right.

--
rob

Keith Thompson

Jan 21, 2014, 4:51:53 PM
Helmut Tessarek <tess...@evermeet.cx> writes:
> On 21.01.14 7:33 , Robbie Brown wrote:
> check out my mail signature. it will also answer your question.
>
> [...]
> /*
> Thou shalt not follow the NULL pointer for chaos and madness
> await thee at its end.
> */

Good advice, but not actually relevant in this case.

The OP *expected* a segmentation fault on dereferencing a null pointer.
The problem was that the pointer object in question was uninitialized,
and therefore might or might not contain a null pointer value.

Helmut Tessarek

Jan 21, 2014, 6:59:52 PM
On 21.01.14 16:51 , Keith Thompson wrote:
> Good advice, but not actually relevant in this case.

A lot of people already gave extensive explanations and I think the main point
is that anything can and will happen.

So I think 'chaos and madness' is quite relevant, if you mess with null
pointers (or pointers that are potentially null pointers). ;-)

> The OP *expected* a segmentation fault on dereferencing a null pointer.
> The problem was that the pointer object in question was uninitialized,
> and therefore might or might not contain a null pointer value.

Yep, for me a(n) (uninitialized) pointer that is a potential null pointer
still falls in the category not to mess with.

Cheerio!

--
Helmut K. C. Tessarek

Robbie Brown

Jan 22, 2014, 7:06:28 AM
On 21/01/14 21:51, Keith Thompson wrote:
> Helmut Tessarek <tess...@evermeet.cx> writes:
>> On 21.01.14 7:33 , Robbie Brown wrote:
>> check out my mail signature. it will also answer your question.
>>
>> [...]
>> /*
>> Thou shalt not follow the NULL pointer for chaos and madness
>> await thee at its end.
>> */
>
> Good advice, but not actually relevant in this case.
>
> The OP *expected* a segmentation fault on dereferencing a null pointer.
> The problem was that the pointer object in question was uninitialized,
> and therefore might or might not contain a null pointer value.

Yes, I'm starting to get the impression that, unlike other languages I
have used, C (or rather the C compiler perhaps) doesn't stop you from
doing all manner of exceptionally stupid things.

For example, for no other reason than experimentation I tried to get my
head around pointers to pointers and came up with the following.
Trying hard not to make assumptions, just observations.

[Linux 3.2.0-23-generic x86_64 GNU/Linux]

int **arpi = (int**) malloc(sizeof(int*) * 5);
*(arpi + 4) = malloc(sizeof(int));
*(*(arpi + 4)) = 14;

If I run this through gdb I can see what I expected to see (there's that
word again, what other word can I use?).

arpi is a pointer to the first of 5 64 bit addresses.
the first 4 addresses contain 0x0000000000000000 I hope I understand
that these are uninitialized addresses ... or maybe they have been
initialized to 0 by some voodoo priest :-) anyway
the fifth address contains the 64 bit address 0x0000000000602010
this seems reasonable as I malloc'd enough space for a pointer to int.
if I inspect the contents of 0x602010 I see 0x0e which is (I hope) what
I was expecting

Then it got all strange again

I changed the first line to
int **arpi = (int**) malloc(sizeof(int) * 5);

now I malloc int instead of int*
Compile, run, inspect, same old results
I think this works because an int is probably 64 bits same as an address
(gross assumption)

Then it gets weirder
int **arpi = (int**) malloc(0);
Now realistically what should I 'expect' to happen

I sort of expected it not to compile ... wrong, it compiled
I sort of expected it to blow up ... wrong, ran and exited normally
I even found 0x0e lurking about almost where I hoped it would be.

gdb exposed the memory and it was obviously not right but it still ran.

This *is* fun isn't it?

Ah well, onwards and upwards.

--
rob

James Kuyper

Jan 22, 2014, 8:22:30 AM
On 01/22/2014 07:06 AM, Robbie Brown wrote:
...
> int **arpi = (int**) malloc(0);
> Now realistically what should I 'expect' to happen
>
> I sort of expected it not to compile ... wrong, it compiled
> I sort of expected it to blow up ... wrong, ran and exited normally
> I even found 0x0e lurking about almost where I hoped it would be.
>
> gdb exposed the memory and it was obviously not right but it still ran.

What the C standard requires is that malloc(0) may return either
a) a null pointer
b) a pointer suitably aligned for any type, but which points at memory
that cannot be safely written to.
--
James Kuyper

Malcolm McLean

Jan 22, 2014, 8:35:47 AM
On Wednesday, January 22, 2014 12:06:28 PM UTC, Robbie Brown wrote:
>
> Yes, I'm starting to get the impression that, unlike other languages I
> have used, C (or rather the C compiler perhaps) doesn't stop you from
> doing all manner of exceptionally stupid things.
>
All you really need to understand is that C allows you to write to "raw"
addresses. Often the bits in the pointer are the actual bits which go on the
address bus to fetch data to and from RAM. Other times there's a very low-level
layer of indirection which prevents programs from corrupting each other and,
possibly, damaging hardware.
Now if you write to a random address, it's very hard to say what will happen.
You might hit another variable, you might destroy your call stack, you might
send a byte to a memory-mapped port or put up a pixel on a memory-mapped
screen. The system might detect that what you are doing is illegal and issue
a segfault (this is the best, most desirable result from the point of view
of someone trying to write a useful program). You might even hit the pointer
itself.

That's all there really is to it. Some systems also put in protections against
reading from random addresses.

James Kuyper

Jan 22, 2014, 10:10:24 AM
I should have mentioned that malloc(0) returns any non-null pointer
value, that value must be the result of malloc() having behaved exactly
the same as if it had been asked to allocate some non-zero amount of
memory. This implies that each non-null value returned by malloc(0) will
be unique, in the sense that will not compare equal to any other valid
pointer to an object.

Robbie Brown

Jan 22, 2014, 11:12:57 AM
On 22/01/14 15:10, James Kuyper wrote:
> On 01/22/2014 08:22 AM, James Kuyper wrote:
>> On 01/22/2014 07:06 AM, Robbie Brown wrote:

<snip>

> I should have mentioned that malloc(0) returns any non-null pointer
> value, that value must be the result of malloc() having behaved exactly
> the same as if it had been asked to allocate some non-zero amount of
> memory. This implies that each non-null value returned by malloc(0) will
> be unique, in the sense that will not compare equal to any other valid
> pointer to an object.

Now to me, that just seems perverse. By what strange incantation of
inverse logic was the decision made to use a request for 0 bytes of
memory as meaning 'give me anything but 0 bytes'.

I would have thought NULL was the perfect value to return in this case.
I suppose there is a good reason for it but I can't for the life of me
think what it could be. It's almost as if it were *designed* to confuse
and befuddle the unwary neophyte ........ no, surely not?

--
rob

Malcolm McLean

Jan 22, 2014, 11:37:13 AM
Do you ask for a bag of no beans or no bag of beans?
Some took the former view, some the latter. It's a difficult problem how to
handle the empty case, you tend to want programs that treat it as part of
normal control flow, because that's likely to be more robust and correct.
But often treating it specially is more efficient and easier to think through.

Lowell Gilbert

Jan 22, 2014, 11:37:50 AM
Both usages were already extant by the time standardization came around,
so we're stuck with them. The logic by which the not-returning-null
approach came about was the idea that a valid return value should not be
the same as an error return. I don't see that as completely silly.

--
Lowell Gilbert, embedded/networking software engineer
http://be-well.ilk.org/~lowell/

James Kuyper

Jan 22, 2014, 12:00:51 PM
On 01/22/2014 11:12 AM, Robbie Brown wrote:
> On 22/01/14 15:10, James Kuyper wrote:
>> On 01/22/2014 08:22 AM, James Kuyper wrote:
>>> On 01/22/2014 07:06 AM, Robbie Brown wrote:
>
> <snip>
>
>> I should have mentioned that malloc(0) returns any non-null pointer

Missing word: ^ if

>> value, that value must be the result of malloc() having behaved exactly
>> the same as if it had been asked to allocate some non-zero amount of
>> memory. This implies that each non-null value returned by malloc(0) will
>> be unique, in the sense that will not compare equal to any other valid
>> pointer to an object.
>
> Now to me, that just seems perverse. By what strange incantation of
> inverse logic was the decision made to use a request for 0 bytes of
> memory as meaning 'give me anything but 0 bytes'.

For some purposes, it's convenient to create objects of varying sizes,
without having to do special case handling for objects with a size of 0.
It's sometimes important that each such object be distinguishable.
Objects allocated by using malloc(0), if it returns a non-null value,
are distinguishable by their addresses. The cost of making that possible
is that those addresses cannot be used for any other purpose, which is
pretty much the same effect as if those addresses had been used to store
something. Portable code cannot rely upon this behavior, but unportable
code exists that relies upon the fact that malloc(0) has this behavior
on a particular implementation of C.
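
The distinguishability property can be probed with a short sketch like the one below. The function name is invented for this example, and it accepts both permitted behaviours: if either call returns a null pointer there is nothing to compare, otherwise the two live allocations must have different addresses.

```c
#include <stdlib.h>

/* Returns 1 if two live malloc(0) results are either null or distinct,
 * which is what a conforming implementation should give. */
int zero_size_blocks_are_distinct(void)
{
    void *a = malloc(0);
    void *b = malloc(0);
    int ok = (a == NULL || b == NULL || a != b);

    free(a);   /* free(NULL) is a no-op, so both calls are safe */
    free(b);
    return ok;
}
```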

> I would have thought NULL was the perfect value to return in this case.
> I suppose there is a good reason for it but I can't for the life of me
> think what it could be. It's almost as if it were *designed* to confuse
> and befuddle the unwary neophyte ........ no, surely not?

No, the standard was designed to accommodate the wide variety of
existing implementations of C. This often results in confusion and
befuddlement, but that wasn't the purpose. There are arguments for
either way of implementing malloc(0), but I don't think anyone would
have chosen to allow both if they'd been free to ignore existing
implementations.

Keith Thompson

Jan 22, 2014, 12:00:55 PM
Robbie Brown <d...@nomail.invalid> writes:
> On 21/01/14 21:51, Keith Thompson wrote:
>> Helmut Tessarek <tess...@evermeet.cx> writes:
>>> On 21.01.14 7:33 , Robbie Brown wrote:
>>> check out my mail signature. it will also answer your question.
>>>
>>> [...]
>>> /*
>>> Thou shalt not follow the NULL pointer for chaos and madness
>>> await thee at its end.
>>> */
>>
>> Good advice, but not actually relevant in this case.
>>
>> The OP *expected* a segmentation fault on dereferencing a null pointer.
>> The problem was that the pointer object in question was uninitialized,
>> and therefore might or might not contain a null pointer value.
>
> Yes, I'm starting to get the impression that, unlike other languages I
> have used, C (or rather the C compiler perhaps) doesn't stop you from
> doing all manner of exceptionally stupid things.

Yes. Another interesting, um, feature of C is that the syntax is what I
think of as "dense". What that means is that a single-character typo in
an otherwise correct C program can easily produce something that's
perfectly correct as far as the compiler is concerned, but has
completely different behavior.
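
A hedged illustration of that density, using two made-up functions that differ by a single character. Both are valid C and a compiler will typically accept both without comment, yet they disagree for many inputs.

```c
/* Intended: returns 1 only when both arguments are non-zero. */
int both_set(int a, int b)
{
    return a && b;
}

/* One '&' missing: this is bitwise AND, so the result depends on
 * which bits the two values have in common. */
int both_set_typo(int a, int b)
{
    return a & b;
}
```

For the arguments 1 and 2, `both_set` returns 1 while `both_set_typo` returns 0, because the two values share no set bits.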

> For example, for no other reason than experimentation I tried to get my
> head around pointers to pointers and came up with the following.
> Trying hard not to make assumptions, just observations.
>
> [Linux 3.2.0-23-generic x86_64 GNU/Linux]
>
> int **arpi = (int**) malloc(sizeof(int*) * 5);

A good idiom for malloc that mostly avoids type mismatches is:

int **arpi = malloc(5 * sizeof *arpi);

Casting the result of malloc is unnecessary and can mask errors in some
cases. Applying sizeof to *arpi (more generally, to what the LHS points
to) ensures that you have the correct size and type without having to
repeat the type name.

> *(arpi + 4) = malloc(sizeof(int));

Probably better written as:

arpi[4] = malloc(sizeof *(arpi[4]));

> *(*(arpi + 4)) = 14;

*(arpi[4]) = 14;
>
> If I run this through gdb I can see what I expected to see (there's that
> word again, what other word can I use?).
>
> arpi is a pointer to the first of 5 64 bit addresses.
> the first 4 addresses contain 0x0000000000000000 I hope I understand
> that these are uninitialized addresses ... or maybe they have been
> initialized to 0 by some voodoo priest :-) anyway

malloc returns a pointer to uninitialized memory. The contents might
happen to be all bits zero, but that's not guaranteed, and you shouldn't
rely on it. And the null pointer is very commonly represented as
all-bits-zero, but that's not guaranteed either.
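
Putting those points together, here is a minimal sketch assuming C99 and two hypothetical helpers, `make_table` and `free_table`. It uses the `sizeof *ptr` idiom, checks the result of malloc, and assigns NULL to each element explicitly instead of relying on what the fresh memory happens to contain.

```c
#include <stdlib.h>

/* Allocates an array of n pointers to int and sets every element to NULL.
 * Returns NULL if n is 0 or if the allocation fails. */
int **make_table(size_t n)
{
    if (n == 0)
        return NULL;

    int **table = malloc(n * sizeof *table);
    if (table == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++)
        table[i] = NULL;   /* explicit, not assumed from all-bits-zero */
    return table;
}

/* Frees every element and then the table itself. */
void free_table(int **table, size_t n)
{
    if (table == NULL)
        return;
    for (size_t i = 0; i < n; i++)
        free(table[i]);    /* free(NULL) is a no-op */
    free(table);
}
```

After `int **arpi = make_table(5);`, the first four elements are known to be null pointers, and `arpi[4]` can be given its own allocation as in the original experiment.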

> the fifth address contains the 64 bit address 0x0000000000602010
> this seems reasonable as I malloc'd enough space for a pointer to int.
> if I inspect the contents of 0x602010 I see 0x0e which is (I hope) what
> I was expecting

Yes.

> Then it got all strange again
>
> I changed the first line to
> int **arpi = (int**) malloc(sizeof(int) * 5);
>
> now I malloc int instead of int*
> Compile, run, inspect, same old results
> I think this works because an int is probably 64 bits same as an address
> (gross assumption)

Yes, that's the kind of type mismatch that can be avoided by the idiom I
suggested above.
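
To see how far off the mistaken request is, the sizes can simply be compared. The helper names below are invented for this sketch. On x86_64 Linux an int is typically 4 bytes and a pointer 8, so `sizeof(int) * 5` asks for less space than five pointers need, and the program only appears to work.

```c
#include <stddef.h>

/* Bytes needed for n elements of type int *, which is what the table holds. */
size_t bytes_for_int_pointers(size_t n)
{
    return n * sizeof(int *);
}

/* Bytes requested by the mistaken call: n elements of type int. */
size_t bytes_for_ints(size_t n)
{
    return n * sizeof(int);
}

/* Returns 1 if the mistaken request is too small to hold n pointers. */
int request_too_small(size_t n)
{
    return bytes_for_ints(n) < bytes_for_int_pointers(n);
}
```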

> Then it gets weirder
> int **arpi = (int**) malloc(0);
> Now realistically what should I 'expect' to happen
>
> I sort of expected it not to compile ... wrong, it compiled

I'm not sure why you'd expect it not to compile. malloc is a library
function, not a built-in language feature. It takes an integer argument
(specifically an argument of the unsigned integer type size_t), and
you've called it with an integer value. Even if 0 were not a valid
argument value, it's of the right type (or rather, is implicitly
convertible to the right type), so there's nothing for the compiler to
complain about. The run time behavior may be another matter; as James
Kuyper already explained, the behavior of malloc(0) is
implementation-defined.

> I sort of expected it to blow up ... wrong, ran and exited normally
> I even found 0x0e lurking about almost where I hoped it would be.
>
> gdb exposed the memory and it was obviously not right but it still ran.

It's likely that malloc(0) allocated some small amount of memory from
the heap (it could have returned a null pointer, but then your program
probably would have crashed). The actual amount of memory allocated for
malloc(N) is likely to be a bit bigger than N, but you can only safely
access the first N bytes (and only if malloc(N) actually succeeded).
But if you try to access memory beyond those first N bytes, you're
*probably* still accessing memory within your program's memory space.
The behavior is undefined, but that doesn't mean it's going to crash;
if you're *unlucky*, it will appear to "work".

> This *is* fun isn't it?
>
> Ah well, onwards and upwards.

--

Kaz Kylheku

Jan 22, 2014, 12:13:01 PM
On 2014-01-22, Robbie Brown <d...@nomail.invalid> wrote:
> On 22/01/14 15:10, James Kuyper wrote:
>> On 01/22/2014 08:22 AM, James Kuyper wrote:
>>> On 01/22/2014 07:06 AM, Robbie Brown wrote:
>
><snip>
>
>> I should have mentioned that malloc(0) returns any non-null pointer
>> value, that value must be the result of malloc() having behaved exactly
>> the same as if it had been asked to allocate some non-zero amount of
>> memory. This implies that each non-null value returned by malloc(0) will
>> be unique, in the sense that will not compare equal to any other valid
>> pointer to an object.
>
> Now to me, that just seems perverse. By what strange incantation of
> inverse logic was the decision made to use a request for 0 bytes of
> memory as meaning 'give me anything but 0 bytes'.

The actual logic is "return either null, or return a unique pointer".
The issue is not the number of bytes, but rather the important expectation that
malloc doesn't return the same pointer two or more times (when nothing is freed
in between), unless perhaps it is null.

Since pointers are basically addresses, the requirement for returning unique
pointers requires a non-zero amount of allocation.

Note that the blocks returned by malloc are often larger than what is
requested, though there isn't any portable way to find out how much larger.
This is done for the sake of alignment of the meta-data structures that
lie between the allocated blocks.

If there is a free-space block after the block you've just allocated, a common
strategy is to put a header structure into that free space, which places it
into a list of other such free space blocks. On many architectures, such a
structure has to be properly aligned since it contains word-sized quantities
like pointers.

Also, some malloc implementations simply have "buckets" of fixed-sizes of
blocks. For instance there might be a bucket for, say, 32 byte objects, one
for 48 byte ones, then 64, 92, 128, ...

If you allocate a 49 byte object, you may actually get 64 bytes; you just don't
know.

It is not unreasonable to get a 16 byte object when you asked for zero.
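
A toy sketch of the bucket idea, to show why a request can come back larger than asked. The size classes in `bucket_sizes` and the function `bucket_for` are invented for illustration and do not describe any particular allocator.

```c
#include <stddef.h>

/* Hypothetical size classes; real allocators choose their own. */
static const size_t bucket_sizes[] = { 16, 32, 48, 64, 96, 128 };

/* Returns the smallest bucket that can hold `request` bytes,
 * or 0 if the request is larger than every bucket. */
size_t bucket_for(size_t request)
{
    size_t count = sizeof bucket_sizes / sizeof bucket_sizes[0];

    for (size_t i = 0; i < count; i++) {
        if (request <= bucket_sizes[i])
            return bucket_sizes[i];
    }
    return 0;
}
```

With these made-up classes a 49 byte request lands in the 64 byte bucket, and a zero byte request lands in the smallest one, 16 bytes.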

> I would have thought NULL was the perfect value to return in this case.

Yes, and so did some traditional C library implementors. So when it came time
to standardize the language, it was found that some libraries produced null,
whereas others returned something new.

This was simply captured in the standard: that programs being ported
among implementations could expect either behavior.

--
Music DIY Mailing List: http://www.kylheku.com/diy
ADA MP-1 Mailing List: http://www.kylheku.com/mp1

Joe Pfeiffer

Jan 22, 2014, 1:33:40 PM
Robbie Brown <d...@nomail.invalid> writes:

> On 21/01/14 21:51, Keith Thompson wrote:
>> Helmut Tessarek <tess...@evermeet.cx> writes:
>>> On 21.01.14 7:33 , Robbie Brown wrote:
>>> check out my mail signature. it will also answer your question.
>>>
>>> [...]
>>> /*
>>> Thou shalt not follow the NULL pointer for chaos and madness
>>> await thee at its end.
>>> */
>>
>> Good advice, but not actually relevant in this case.
>>
>> The OP *expected* a segmentation fault on dereferencing a null pointer.
>> The problem was that the pointer object in question was uninitialized,
>> and therefore might or might not contain a null pointer value.
>
> Yes, I'm starting to get the impression that, unlike other languages I
> have used, C (or rather the C compiler perhaps) doesn't stop you from
> doing all manner of exceptionally stupid things.

Years and years ago I came across ways to shoot yourself in the foot in
various programming languages (in assembly code, you started by building
a gun. In Pascal, you changed your mind and shot yourself in the head
when you realized you couldn't actually accomplish anything useful in
the language. And so forth.). For C, it simply stated "you shoot
yourself in the foot".

For me, that's always been simultaneously C's strongest and weakest
point: it will let you do what you say you want to do without arguing
with you about it.

Ken Brody

Jan 22, 2014, 1:36:40 PM
On 1/22/2014 11:12 AM, Robbie Brown wrote:
> On 22/01/14 15:10, James Kuyper wrote:
[...]
>> I should have mentioned that malloc(0) returns any non-null pointer

I assume there is a missing "if"? ("... if malloc(0) returns ...")

>> value, that value must be the result of malloc() having behaved exactly
>> the same as if it had been asked to allocate some non-zero amount of
>> memory. This implies that each non-null value returned by malloc(0) will
>> be unique, in the sense that will not compare equal to any other valid
>> pointer to an object.
>
> Now to me, that just seems perverse. By what strange incantation of inverse
> logic was the decision made to use a request for 0 bytes of memory as
> meaning 'give me anything but 0 bytes'.
>
> I would have thought NULL was the perfect value to return in this case.
> I suppose there is a good reason for it but I can't for the life of me think
> what it could be. It's almost as if it were *designed* to confuse and
> befuddle the unwary neophyte ........ no, surely not?

Consider the fact that, for non-zero lengths, a return of NULL means
failure. If malloc(0) returns NULL, did it really fail? (Valid arguments
can be made for both sides.)

I'm sure that, at the time the Standard was written, there were
implementations on both sides of the argument, and there was no compelling
reason to require one over the other. If there was any change to existing
implementations, it would have been to add the requirement that non-NULL
returns from malloc(0) must be different than any previous non-free()ed
return from malloc(), just as would be the case of non-zero malloc()s.

In short, you can think of "malloc(len)" where len==0 to be no different
than any other malloc(len) call -- if it succeeds, it returns a buffer of
the requested length.

--
Kenneth Brody

James Kuyper

Jan 22, 2014, 1:52:40 PM
On 01/22/2014 01:36 PM, Ken Brody wrote:
> On 1/22/2014 11:12 AM, Robbie Brown wrote:
>> On 22/01/14 15:10, James Kuyper wrote:
> [...]
>>> I should have mentioned that malloc(0) returns any non-null pointer
>
> I assume there is a missing "if"? ("... if malloc(0) returns ...")

Correct.

...
> In short, you can think of "malloc(len)" where len==0 to be no different
> than any other malloc(len) call -- if it succeeds, it returns a buffer of
> the requested length.

"... of at least the requested length.". malloc(n) is always permitted
to allocate more than n bytes. In the case of malloc(0), a non-null
return value is not only allowed to point at a larger allocation, it is
required to do so.


Keith Thompson

unread,
Jan 22, 2014, 2:45:20 PM1/22/14
to
But even if malloc(0) returns a non-null value, it's not necessarily
*quite* the same as a value returned by malloc() with some non-zero
argument:

If the size of the space requested is zero, the behavior is
implementation-defined: either a null pointer is returned, or the
behavior is as if the size were some nonzero value, except that the
returned pointer shall not be used to access an object.

So this:

char *p1 = malloc(1);
if (p1 != NULL) *p1 = 'x';

is well behaved, but this:

char *p0 = malloc(0);
if (p0 != NULL) *p0 = 'x';

has undefined behavior.

A reasonable implementation would probably either return NULL for
malloc(0), or treat malloc(0) as equivalent to malloc(1), but other
behaviors are permitted.
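
Code that wants the same behavior on every implementation can simply avoid asking for zero bytes. A minimal sketch, assuming that wasting one byte per empty request is acceptable; `malloc_at_least_one` is a hypothetical name:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrapper: never passes 0 to malloc(), so a null return
 * always means failure and a non-null return always points to at least
 * one byte that may be accessed. */
static void *malloc_at_least_one(size_t n)
{
    return malloc(n != 0 ? n : 1);
}
```

With this wrapper the `*p0 = 'x'` store above would be well defined whenever the returned pointer is non-null, because the request was really for one byte.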

James Kuyper

unread,
Jan 22, 2014, 3:38:30 PM1/22/14
to
On 01/22/2014 02:45 PM, Keith Thompson wrote:
> James Kuyper <james...@verizon.net> writes:
...
>> "... of at least the requested length.". malloc(n) is always permitted
>> to allocate more than n bytes. In the case of malloc(0), a non-null
>> return value is not only allowed to point at a larger allocation, it is
>> required to do so.
>
> But even if malloc(0) returns a non-null value, it's not necessarily
> *quite* the same as a value returned by malloc() with some non-zero
> argument:
>
> If the size of the space requested is zero, the behavior is
> implementation-defined: either a null pointer is returned, or the
> behavior is as if the size were some nonzero value, except that the
> returned pointer shall not be used to access an object.
>
> So this:
>
> char *p1 = malloc(1);
> if (p1 != NULL) *p1 = 'x';
>
> is well behaved, but this:
>
> char *p0 = malloc(0);
> if (p0 != NULL) *p0 = 'x';
>
> has undefined behavior.
>
> A reasonable implementation would probably either return NULL for
> malloc(0), or treat malloc(0) as equivalent to malloc(1), but other
> behaviors are permitted.

Yes, it's permitted to behave like malloc(n) where n is an arbitrary
positive number which could even, in principle, differ between one call
to malloc(0) and another. But every permitted variation for malloc(0)
that involves returning a non-null pointer is correctly described by the
phrase "allocates more than 0 bytes". The as-if rule provides a limited
amount of protection - the memory need not actually be allocated, since
the pointer cannot be safely used to access that memory. However, the
address returned must not point to memory allocated for any other
purpose that is visible from the user code, which is almost the same thing.

Eric Sosman

unread,
Jan 22, 2014, 3:56:28 PM1/22/14
to
On 1/22/2014 2:45 PM, Keith Thompson wrote:
> [...]
> But even if malloc(0) returns a non-null value, it's not necessarily
> *quite* the same as a value returned by malloc() with some non-zero
> argument:
>
> If the size of the space requested is zero, the behavior is
> implementation-defined: either a null pointer is returned, or the
> behavior is as if the size were some nonzero value, except that the
> returned pointer shall not be used to access an object.
>
> So this:
>
> char *p1 = malloc(1);
> if (p1 != NULL) *p1 = 'x';
>
> is well behaved, but this:
>
> char *p0 = malloc(0);
> if (p0 != NULL) *p0 = 'x';
>
> has undefined behavior.

True, but that's just a special case of

size_t n = ...;
char *pn = malloc(n);
if (pn != NULL) pn[n] = 'x';

... having undefined behavior.

--
Eric Sosman
eso...@comcast-dot-net.invalid

Paul N

unread,
Jan 22, 2014, 6:00:48 PM1/22/14
to
On Wednesday, 22 January 2014 12:06:28 UTC, Robbie Brown wrote:

> Yes, I'm starting to get the impression that, unlike other languages I
> have used, C (or rather the C compiler perhaps) doesn't stop you from
> doing all manner of exceptionally stupid things.

C is derived from BCPL, of which a book co-written by the author of the language (Martin Richards) says "The philosophy of BCPL is not one of the tyrant who thinks he knows best and lays down the law on what is and what is not allowed; rather, BCPL acts more as a servant offering his services to the best of his ability without complaint, even when confronted with apparent nonsense. The programmer is always assumed to know what he is doing and is not hemmed in by petty restrictions."

Kaz Kylheku

unread,
Jan 22, 2014, 6:44:10 PM1/22/14
to
On 2014-01-22, Paul N <gw7...@aol.com> wrote:
> On Wednesday, 22 January 2014 12:06:28 UTC, Robbie Brown wrote:
>
>> Yes, I'm starting to get the impression that, unlike other languages I
>> have used, C (or rather the C compiler perhaps) doesn't stop you from
>> doing all manner of exceptionally stupid things.
>
> C is derived from BCPL, of which a book co-written by the author of the
> language (Martin Richards) says "The philosophy of BCPL is not one of the
> tyrant who thinks he knows best and lays down the law on what is and what is
> not allowed; rather, BCPL acts more as a servant offering his services to the

BCPL is completely "typeless"; everything is a word. If you use a word as
a pointer, then it's a pointer. If you use it as a number, it's a number.

C has a comparatively "rich" type system, and its declarations and type
checking are the tyranny the above alludes to.

Richard Damon

unread,
Jan 23, 2014, 8:38:44 AM1/23/14
to
No, the allocation space may have 0 bytes of user data. Most mallocs
return a block which includes a few bytes before the block as memory
management. This often is exactly the size (or a multiple of) the
alignment requirement for allocations. Thus malloc(0) CAN return a
pointer to 0 bytes of usable memory while still making every allocation
unique.

Yes, some versions "handle" the problem of a 0 byte request by bumping it
up to 1 byte, but that is not required (unless the memory allocation
method is overheadless in the memory pool being allocated from).
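
To make the header argument concrete, here is a toy bump allocator. It is only a sketch: the names (`header`, `toy_alloc`, `ARENA_UNITS`) are invented, there is no free() and no thread safety, and it assumes C11 for `max_align_t`. Each block is preceded by one header unit, so a zero-byte request still consumes a header and gets an address of its own.

```c
#include <stddef.h>

/* One allocation unit; the union forces the strictest alignment. */
typedef union header {
    size_t size;        /* bytes of user data requested for this block */
    max_align_t align;  /* never read, present only for alignment */
} header;

#define ARENA_UNITS 256
static header arena[ARENA_UNITS];  /* static arena, carved up in units */
static size_t units_used;

static void *toy_alloc(size_t n)
{
    size_t free_units = ARENA_UNITS - units_used;

    /* Need one unit for the header plus enough units for n bytes. */
    if (free_units == 0 || n > (free_units - 1) * sizeof(header))
        return NULL;

    /* Round the payload up to whole units; this is 0 when n == 0. */
    size_t payload_units = (n + sizeof(header) - 1) / sizeof(header);

    header *h = &arena[units_used];
    h->size = n;
    units_used += 1 + payload_units;
    return h + 1;  /* user pointer starts just past the header */
}
```

Two consecutive `toy_alloc(0)` calls return addresses exactly `sizeof(header)` apart, and the first of them points at the spot where the second block's header lives, so there really are zero usable bytes behind it.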

James Kuyper

unread,
Jan 23, 2014, 9:02:38 AM1/23/14
to
On 01/23/2014 08:38 AM, Richard Damon wrote:
> On 1/22/14, 1:52 PM, James Kuyper wrote:
>> On 01/22/2014 01:36 PM, Ken Brody wrote:
>>> On 1/22/2014 11:12 AM, Robbie Brown wrote:
>>>> On 22/01/14 15:10, James Kuyper wrote:
>>> [...]
>>>>> I should have mentioned that malloc(0) returns any non-null pointer
>>>
>>> I assume there is a missing "if"? ("... if malloc(0) returns ...")
>>
>> Correct.
>>
>> ...
>>> In short, you can think of "malloc(len)" where len==0 to be no different
>>> than any other malloc(len) call -- if it succeeds, it returns a buffer of
>>> the requested length.
>>
>> "... of at least the requested length.". malloc(n) is always permitted
>> to allocate more than n bytes. In the case of malloc(0), a non-null
>> return value is not only allowed to point at a larger allocation, it is
>> required to do so.
>>
>>
>
> No, the allocation space may have 0 bytes of user data. Most mallocs
> return a block which includes a few bytes before the block as memory
> management. This often is exactly the size (or a multiple of) the
> alignment requirement for allocations. Thus malloc(0) CAN return a
> pointer to 0 bytes of usable memory while still making every allocation
> unique.

The standard requires that "If the size of the space requested is zero,
the behavior is implementation-defined: either a null pointer is
returned, or the behavior is as if the size were some nonzero value,
...". This means that, since returning a non-null pointer to a block of
memory 0 bytes long is not permissible behavior for malloc(n) when n is
non-zero, it is therefore not permissible behavior for malloc(0).
However, since you can't access the memory allocated, the as-if rule
probably covers that.

Some malloc()s use other methods of memory management, such as rounding
all allocations up to the next power of 2, and reserving distinct blocks
of memory for each power of two. They can then figure out the size of
each allocation by determining which block it was allocated from, and
therefore don't need to store the allocation's size in a header. Such an
implementation cannot allocate 0 bytes when malloc(0) is called, because
then it could return the same pointer for multiple calls to malloc(0).
That would not be covered by the as-if rule: different allocations of
non-zero amounts of memory cannot have the same starting address,
therefore different calls to malloc(0) are also not allowed to return
equivalent pointer values.
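
A sketch of the rounding step for such a size-class allocator, assuming the simplest possible scheme (one class per power of two); `size_class` is a hypothetical name. The point is that a request for 0 bytes is bumped to the smallest class rather than being given a zero-length slot:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round a request up to the next power of two.
 * A request of 0 maps to class 1, so every allocation occupies a slot
 * of its own. Returns 0 if no power of two >= n fits in a size_t. */
static size_t size_class(size_t n)
{
    size_t c = 1;
    while (c < n) {
        if (c > SIZE_MAX / 2)
            return 0;  /* doubling would overflow */
        c *= 2;
    }
    return c;
}
```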
--
James Kuyper

Keith Thompson

unread,
Jan 23, 2014, 2:47:40 PM1/23/14
to
I think the "..." in your quotation hides something critical.

The full sentence is:

If the size of the space requested is zero, the behavior is
implementation-defined: either a null pointer is returned, or
the behavior is as if the size were some nonzero value, except
that the returned pointer shall not be used to access an object.

So the behavior of malloc(0), if it returns a non-null pointer, is *not*
necessarily the same as malloc(n) for some positive n. It can be, but
it can behave differently.

For example, the implementation could maintain a pool of addresses that
point outside the actual memory space, and dole them out only for
malloc(0) calls. As long as they're non-null, unique, and comparable
for equality to other addresses, the implementation is still conforming
(which would not be the case if the "except that" clause weren't there).

Certainly malloc(0) *can* behave exactly like malloc(1), but it doesn't
have to.
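
The pool idea might look roughly like this. Portable user code cannot conjure addresses outside the memory space, so in this sketch a small static array stands in for the reserved address range; `zero_pool` and `zero_size_alloc` are invented names, and nothing ever reads or writes through the pointers handed out.

```c
#include <stddef.h>

#define ZERO_POOL_SLOTS 64

/* Stand-in for a reserved address range; the bytes are never accessed. */
static char zero_pool[ZERO_POOL_SLOTS];
static size_t zero_pool_next;

/* Hypothetical handler for zero-size requests: returns a non-null
 * address that differs from every address handed out before, or NULL
 * once the pool is exhausted. */
static void *zero_size_alloc(void)
{
    if (zero_pool_next == ZERO_POOL_SLOTS)
        return NULL;
    return &zero_pool[zero_pool_next++];
}
```

Successive results are one byte apart, which is what keeps them distinct when compared for equality.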

James Kuyper

unread,
Jan 23, 2014, 3:17:37 PM1/23/14
to
On 01/23/2014 02:47 PM, Keith Thompson wrote:
> James Kuyper <james...@verizon.net> writes:
>> On 01/23/2014 08:38 AM, Richard Damon wrote:
...
That's certainly acceptable, so long as they are doled out with a
spacing of at least 1 byte; which is essentially an allocation of 1
byte, even if the byte itself is never used. However, they can't be
doled out with a spacing of 0 bytes, which is the possibility I was
concerned about.

Richard Damon

unread,
Jan 24, 2014, 7:43:18 AM1/24/14
to
But the 1+ byte spacing might not be memory which the user program can
use. For instance, it might (likely) be the control block for the
allocation, sitting at locations prior to the returned pointer; the
memory which the pointer is pointing to may well be the control block
for some other allocation block in the heap.

The phrase "the behavior is as if the size were some nonzero value", due
to the restrictions later, basically means that if it returns a non-null
pointer, that pointer is usable for all normal pointer operations
(except dereferencing) and is distinct from all other malloc()ed and not
yet free()d results.
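
A small check of that reading, as a sketch; `zero_allocs_are_distinct` is a hypothetical name. It only compares the pointers and passes them to free(), never dereferencing them:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Returns true if two live malloc(0) results behave as described:
 * each is either null, or non-null and different from the other. */
static bool zero_allocs_are_distinct(void)
{
    void *a = malloc(0);
    void *b = malloc(0);

    /* Comparison is a permitted operation; dereferencing is not. */
    bool ok = (a == NULL || b == NULL) || a != b;

    free(a);  /* fine for both outcomes: free(NULL) does nothing */
    free(b);
    return ok;
}
```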