May calloc() allocate more than SIZE_MAX bytes?

Keith Thompson

Jan 6, 2007, 4:45:00 PM
This came out of a discussion on comp.lang.c.

In most cases, an object cannot exceed SIZE_MAX bytes. If a declared
object exceeded SIZE_MAX bytes, the sizeof operator could not yield
its size. (VLAs might complicate this; I haven't thought much about
that aspect of the question.) An object created by malloc() or
realloc() cannot exceed SIZE_MAX bytes simply because the size
parameter is of type size_t. But calloc() takes two arguments, and
attempts to create an object whose total size is the product of those
arguments.

Assume a 32-bit size_t, with SIZE_MAX equal to 4294967295 (2**32-1).
Consider a call calloc(65521, 65552). The mathematical product of the
arguments is 4295032592; reducing this modulo SIZE_MAX+1 yields 65296.
A call malloc(65521U * 65552U) would simply attempt to allocate 65296
bytes, since that's the result of the multiplication. But the
standard's definition of calloc() doesn't say it multiplies its
arguments; it just says it allocates an object of the requested size.
An implementation in which calloc(65521, 65552) allocated just
65296 bytes would be non-conforming.
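
To make the arithmetic above concrete, here is a minimal sketch. It assumes nothing about the platform's size_t; it models a 32-bit size_t with uint32_t, and the helper names full_product and wrapped_product32 are invented for illustration.

```c
#include <stdint.h>

/* Mathematical product of two 32-bit values, computed in 64 bits
   so that nothing is lost. */
uint64_t full_product(uint32_t nmemb, uint32_t size)
{
    return (uint64_t)nmemb * size;
}

/* The same product reduced modulo 2**32, which is what an unsigned
   32-bit multiplication would yield. */
uint32_t wrapped_product32(uint32_t nmemb, uint32_t size)
{
    return (uint32_t)full_product(nmemb, size);
}
```

For the arguments 65521 and 65552, full_product gives 4295032592 and wrapped_product32 gives 65296, matching the figures quoted above.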

A correct implementation would most likely detect that the
mathematical product exceeds SIZE_MAX and return a null pointer; the
calloc() implementations I've seen work by calling malloc(). But
*could* it legally allocate the full 4295032592 bytes?

If so, this is (I think) the only case where an object can be bigger
than SIZE_MAX bytes. sizeof isn't an issue here, because you can't
apply sizeof to the allocated object.

If this is allowed, I suspect it was unintended. Is there anything in
the standard that explicitly disallows such a huge allocation?

(As I said, VLAs could also present some interesting issues, and a VLA
can be an operand of sizeof. Would sizeof invoke undefined behavior
for such a huge VLA?)

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Harald van Dijk

Jan 6, 2007, 6:01:18 PM
Keith Thompson wrote:
> This came out of a discussion on comp.lang.c.
>
> In most cases, an object cannot exceed SIZE_MAX bytes.

Actually, according to DR 266, objects are allowed to exceed SIZE_MAX
bytes if the implementation supports such large objects, even if sizeof
cannot meaningfully be applied to them.

Keith Thompson

Jan 6, 2007, 9:34:39 PM

Interesting. The reasonable thing to do would be to make size_t
bigger, but implementations aren't required to be reasonable.

Peter Nilsson

Jan 6, 2007, 11:24:49 PM
Keith Thompson wrote:
> "Harald van Dijk" <tru...@gmail.com> writes:
> > Keith Thompson wrote:
> > > ...In most cases, an object cannot exceed SIZE_MAX bytes.

> >
> > Actually, according to DR 266, objects are allowed to exceed SIZE_MAX
> > bytes if the implementation supports such large objects, even if sizeof
> > cannot meaningfully be applied to them.
>
> Interesting.

The discussion resolved that the standard is unclear (more than one
interpretation is reasonable), but the proposed response is not to
modify the wording of the standard.

I don't see that the response to DR 266 makes allocation of such
objects defined or not. Only the application of sizeof was mentioned.
An (unrequired) diagnostic was recommended.

> The reasonable thing to do would be to make size_t
> bigger, but implementations aren't required to be reasonable.

Discussion on DR266 ignored calloc (cf n1061), so the question remains
whether objects larger than SIZE_MAX are allowable via calloc.

N1085 calls for a sizeof_alloc function that can take an allocated
pointer and return its size (as size_t). Obviously such a function
would have the same problem as sizeof in DR266, and strlen in the
program below.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define N (((size_t) -1) / 2 + 2) /* 2N is (SIZE_MAX+1) + 2 */

int main(void)
{
    char *p = calloc(N,2);
    if (p)
    {
        memset(p    , 'X', N);
        memset(p + N, 'X', N - 1);
        strlen(p);
    }
    puts("Hello World");
    return 0;
}

Is this strictly conforming?

My copy of gcc/glibc segfaults because it treats the calloc call as
a request for 2 bytes, and the subsequent memset calls result in
buffer overflow.

Is the implementation broken, or does the program invoke undefined
behaviour?

--
Peter

Douglas A. Gwyn

Jan 7, 2007, 3:02:02 PM
"Peter Nilsson" <ai...@acay.com.au> wrote in message
news:1168143889....@s34g2000cwa.googlegroups.com...

> Is the implementation broken, or does the program invoke undefined
> behaviour?

I think it's pretty clear that any program that attempts to create or use
an object with more than SIZE_MAX bytes does not conform to the
standard. It seems to satisfy the definition of "undefined behavior"
(C99 subclause 3.4.3).


Keith Thompson

Jan 7, 2007, 4:10:17 PM

Can you expand on that? Consider the following:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    char *huge = calloc(SIZE_MAX, 2);
    if (huge == NULL) {
        printf("calloc failed\n");
    }
    else {
        huge[0] = 'x';
        putchar(huge[0]);
        putchar('\n');
    }
    return 0;
}

On most implementations, I'd expect calloc() to fail. But if it
happens to succeed, I don't see how the output can be anything other
than "x" followed by a newline.

It may be "pretty clear" that objects bigger than SIZE_MAX bytes are
not allowed, but as far as I can tell it's not actually stated in the
standard.

kuy...@wizard.net

Jan 7, 2007, 7:38:08 PM
Douglas A. Gwyn wrote:
> "Peter Nilsson" <ai...@acay.com.au> wrote in message
> news:1168143889....@s34g2000cwa.googlegroups.com...
> > Is the implementation broken, or does the program invoke undefined
> > behaviour?
>
> I think it's pretty clear that any program that attempts to create or use
> an object with more than SIZE_MAX bytes does not conform to the
> standard.

Citation, please? What requirement specified in the standard does it
fail to conform to?

Even if it is true, I would disagree most strenuously that it is
clearly true, and that isn't just a matter of my normal pedantry. I
have, in fact, from the first time I was aware of calloc(), assumed
that one of the reasons calloc() was created was to give
implementations the freedom to implement it in a fashion that allows
the allocation of objects bigger than can be allocated by malloc().

christian.bau

Jan 7, 2007, 8:31:51 PM
kuy...@wizard.net wrote:

> Citation, please? What requirement specified in the standard does it
> fail to conform to?
>
> Even if it is true, I would disagree most strenuously that it is
> clearly true, and that isn't just a matter of my normal pedantry. I
> have, in fact, from the first time I was aware of calloc(), assumed
> that one of the reasons calloc() was created was to give
> implementations the freedom to implement it in a fashion that allows
> the allocation of objects bigger than can be allocated by malloc().

There is one problem that needs to be overcome: If I use calloc to
allocate more than SIZE_MAX bytes, then I could fill the allocated
memory with a C string whose size cannot be represented in a value of
type size_t. Clearly an implementation must meet all the requirements
of the C standard, including the one that strlen always returns the
length of a string, as long as a valid string is passed in. If I can
construct a string where strlen cannot possibly return the correct
result, then the C implementation is not conforming.

This is, however, a weak argument. You could argue that by passing
parameters to strlen that make it impossible for the function to
deliver the correct result, I invoke undefined behavior.
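
The strlen problem described above can be illustrated with a minimal sketch of a length counter that accumulates in uintmax_t rather than size_t. The name wide_strlen is hypothetical, and the sketch only sidesteps the issue on an implementation where uintmax_t is actually wider than size_t, which the standard does not guarantee.

```c
#include <stdint.h>

/* Counts the characters before the terminating null character,
   accumulating the count in uintmax_t instead of size_t. */
uintmax_t wide_strlen(const char *s)
{
    uintmax_t n = 0;
    for (const char *p = s; *p != '\0'; p++)
        n++;
    return n;
}
```

For ordinary strings this agrees with strlen; the difference would only show up for a string longer than SIZE_MAX characters, which is exactly the case whose legality is in question.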

Old Wolf

Jan 7, 2007, 9:50:06 PM
christian.bau wrote:
> There is one problem that needs to be overcome: If I use calloc to
> allocate more than SIZE_MAX bytes, then I could fill the allocated
> memory with a C string whose size cannot be represented in a value of
> type size_t. Clearly an implementation must meet all the requirements
> of the C standard, including the one that strlen always returns the
> length of a string, as long as a valid string is passed in.

This can easily be resolved by including a limit on the length
of a string, in the definition of "string".

> This is, however, a weak argument.

Dare I say that if someone is trying to allocate more than SIZE_MAX
bytes, they probably aren't planning to store a single 0-terminated
string in it?

Peter Nilsson

Jan 7, 2007, 11:23:20 PM
> kuy...@wizard.net wrote:
> > Citation, please? What requirement specified in the standard does it
> > fail to conform to?
> >
> > Even if it is true, I would disagree most strenuously that it is
> > clearly true, and that isn't just a matter of my normal pedantry. I
> > have, in fact, from the first time I was aware of calloc(), assumed
> > that one of the reasons calloc() was created was to give
> > implementations the freedom to implement it in a fashion that allows
> > the allocation of objects bigger than can be allocated by malloc().

But such an allocation cannot be realloc-ed to a size that is larger
than SIZE_MAX bytes. So the intent (if not the wording) is probably
there.

christian.bau wrote:
> There is one problem that needs to be overcome: If I use calloc to
> allocate more than SIZE_MAX bytes, then I could fill the allocated
> memory with a C string whose size cannot be represented in a value of
> type size_t. Clearly an implementation must meet all the requirements
> of the C standard, including the one that strlen always returns the
> length of a string, as long as a valid string is passed in. If I can
> construct a string where strlen cannot possibly return the correct
> result, then the C implementation is not conforming.

I included such a strlen() test in a sample program upthread and
asked if it was strictly conforming. Doug's answer was that the
calloc itself invoked undefined behaviour, but his assertion that
it was _clearly_ UB is obviously debatable.

> This is, however, a weak argument. You could argue that by passing
> parameters to strlen that make it impossible for the function to
> deliver the correct result, I invoke undefined behavior.

The argument still leaves the case of exactly SIZE_MAX+1 bytes,
e.g. calloc(SIZE_MAX/2+1,2). It is possible for strlen to return
the correct result for a string within such an object.
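
As a rough illustration of the boundary case, the sketch below shows what a naive size_t multiplication does with such arguments. The helper wrapped_request is made up for this example, and it assumes size_t is at least as wide as unsigned int so that the multiplication is performed in size_t.

```c
#include <stddef.h>
#include <stdint.h>

/* The byte count a naive calloc would pass on to malloc: the product
   of its arguments reduced modulo SIZE_MAX+1. */
size_t wrapped_request(size_t nmemb, size_t size)
{
    return nmemb * size;
}
```

With these definitions, wrapped_request(SIZE_MAX/2+1, 2) is 0 (the true product is exactly SIZE_MAX+1), and wrapped_request(SIZE_MAX/2+2, 2) is 2. A string filling a SIZE_MAX+1 byte object has length at most SIZE_MAX, which size_t can still represent.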

A further (extreme) case to consider is that of a freestanding
implementation that supports <stdlib.h> but not <string.h>.

--
Peter

christian.bau

Jan 8, 2007, 8:13:12 PM
Peter Nilsson wrote:
> The argument still leaves the case of exactly SIZE_MAX+1 bytes,
> e.g. calloc(SIZE_MAX/2+1,2). It is possible for strlen to return
> the correct result for a string within such an object.

I think a 16 bit DOS implementation (8086 processor, 64KB segments,
size_t = 16 bits) could have done exactly that. (I don't think any did,
but they could have).

Douglas A. Gwyn

Jan 8, 2007, 11:46:05 PM
"Keith Thompson" <ks...@mib.org> wrote in message
news:lnps9qz...@nuthaus.mib.org...

> "Douglas A. Gwyn" <DAG...@null.net> writes:
>> "Peter Nilsson" <ai...@acay.com.au> wrote in message
>> news:1168143889....@s34g2000cwa.googlegroups.com...
>>> Is the implementation broken, or does the program invoke undefined
>>> behaviour?
>> I think it's pretty clear that any program that attempts to create or use
>> an object with more than SIZE_MAX bytes does not conform to the
>> standard. It seems to satisfy the definition of "undefined behavior"
>> (C99 subclause 3.4.3).
> Can you expand on that? Consider the following:
> char *huge = calloc(SIZE_MAX, 2);

A conforming implementation cannot return anything other than a
null pointer from such a call: the spec for calloc says that if it
were to return non-null, that would point to an array of objects
(meaning an aggregate object having array type), and the spec for
sizeof says (by omission of any alternative) that sizeof can be
applied to any object type, and that the value of the sizeof shall
be the number of bytes in that object type (including padding).
Since SIZE_MAX is the largest value representable in type
size_t, which is the type that contains the result of sizeof, at
least one of the requirements cannot be met, which would be
nonconforming.

Of course I didn't mean to imply that a s.c. application cannot
invoke calloc like that (it can in fact assume that such a call to
calloc always returns null). What I had in mind was more like
int x[SIZE_MAX/2][3];
which is logically invalid.


Keith Thompson

Jan 9, 2007, 12:45:32 AM

I'm not quite convinced that sizeof "can be applied to any object
type". It's conceivable that some applications of sizeof might
invoke undefined behavior.

Consider:

size_t count = some_huge_value();
/* assume sizeof(int) > 1 */
/* assume count * sizeof(int) > SIZE_MAX */
void *ptr = calloc(count, sizeof(int));
/* assume ptr != NULL */

The only way to apply sizeof to the allocated object would be:

sizeof(int[count]);

which is a VLA type -- but the object created by calloc() wasn't
declared as a VLA, so I'm not sure whether it makes sense to treat it
as one.
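
For reference, a minimal sketch of applying sizeof to a VLA type whose length is only known at run time. It assumes an implementation that supports VLAs (mandatory in C99, optional since C11) and a nonzero count; the function name int_array_bytes is hypothetical.

```c
#include <stddef.h>

/* sizeof applied to a variable length array type is evaluated at
   run time. count must be greater than zero. */
size_t int_array_bytes(size_t count)
{
    return sizeof(int[count]);
}
```

For small counts this simply yields count * sizeof(int). The open question is what happens when that product cannot be represented in size_t.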

Note that in C90, the rules for sizeof and the *alloc() functions were
very similar to those in C99, but VLAs didn't exist; there was *no*
way to apply sizeof to the object created by calloc(). Was this kind
of huge allocation legal in C90; did the introduction of VLAs in C99
cause it to become illegal?

Or are VLAs even relevant here? Is the type of the allocated object
really int[N], where N is some constant that happens to be equal to
the value (computed at runtime) of count? Are int[10] and int[expr]
the same type if expr happens to equal 10?

> Of course I didn't mean to imply that a s.c. application cannot
> invoke calloc like that (it can in fact assume that such a call to
> calloc always returns null). What I had in mind was more like
> int x[SIZE_MAX/2][3];
> which is logically invalid.

The alternative is that the object could be created, but an
attempt to apply sizeof to it would invoke undefined behavior.

I suspect this was an oversight. I'd be perfectly happy with a rule
that no object may exceed SIZE_MAX bytes, or that any attempt to
create such an object either fails (for calloc()) or invokes undefined
behavior (for a declared object). It just appears that the standard
doesn't say this directly, and I'm not convinced that it says it
indirectly either.

Jun Woong

Jan 9, 2007, 3:06:44 AM

Douglas A. Gwyn wrote:
> "Keith Thompson" <ks...@mib.org> wrote in message
> news:lnps9qz...@nuthaus.mib.org...
> > "Douglas A. Gwyn" <DAG...@null.net> writes:
[...]

> >> I think it's pretty clear that any program that attempts to create or use
> >> an object with more than SIZE_MAX bytes does not conform to the
> >> standard. It seems to satisfy the definition of "undefined behavior"
> >> (C99 subclause 3.4.3).
> > Can you expand on that? Consider the following:
> > char *huge = calloc(SIZE_MAX, 2);
>
> A conforming implementation cannot return anything other than a
> null pointer from such a call: the spec for calloc says that if it
> were to return non-null, that would point to an array of objects
> (meaning an aggregate object having array type), and the spec for
> sizeof says (by omission of any alternative) that sizeof can be
> applied to any object type, and that the value of the sizeof shall
> be the number of bytes in that object type (including padding).
> Since SIZE_MAX is the largest value representable in type
> size_t, which is the type that contains the result of sizeof, at
> least one of the requirements cannot be met, which would be
> nonconforming.

I don't see the connection here between the result of sizeof and
calloc. Even in DR266 the committee didn't mention the description
of sizeof to answer the question on the max. size of an object that
is determined at compile-time.

>
> Of course I didn't mean to imply that a s.c. application cannot
> invoke calloc like that (it can in fact assume that such a call to
> calloc always returns null). What I had in mind was more like
> int x[SIZE_MAX/2][3];
> which is logically invalid.

The committee answered in a DR that a program containing such a
declaration is undefined (it can simply fail to generate the code),
because it violates an environmental limit, not because it is
impossible for sizeof to return the correct size of the object.
This leaves many questions:

- The environmental limit is applied to the number of bytes in an
object rather than in an object *type*. Then, what makes the
following expression result in UB?

sizeof(int [SIZE_MAX][3]);

- The committee made sure that the translation limits do not apply
to an object allocated at run-time. Then what makes the following
attempt result in UB?

calloc(SIZE_MAX, 2);

- Or where does the standard say that the above call to calloc has
to return NULL rather than simply resulting in UB?

- An object whose size is determined at run-time also includes VLAs.
Then what makes the following attempt result in UB,

int a[n][2];
sizeof(a);

where n is SIZE_MAX?

- and so on.

A single requirement stating that the size of every object shall
not exceed the max. value of size_t would answer all of these
questions.


--
Jun, Woong (woong at icu.ac.kr)
Samsung Electronics Co., Ltd.

``All opinions expressed are mine, and do not represent
the official opinions of any organization.''

Keith Thompson

Jan 9, 2007, 3:36:56 AM
"Jun Woong" <wo...@icu.ac.kr> writes:
[...]

> - The committee made sure that the translation limits do not apply
> to an object allocated at run-time. Then what makes the following
> attempt result in UB?
>
> calloc(SIZE_MAX, 2);
>
> - Or where does the standard say that the above call to calloc has
> to return NULL rather than simply resulting in UB?
[...]

I think I can answer that one. The standard's description of calloc()
says it either allocates the requested space or returns NULL; there
are no other choices. Since there's no practical reason why an
implementation *can't* detect any error conditions, there's no need to
allow undefined behavior. (This applies whether allocating more than
SIZE_MAX bytes is allowed or not.)

Jun Woong

Jan 9, 2007, 4:59:00 AM

Keith Thompson wrote:

> "Jun Woong" <wo...@icu.ac.kr> writes:
> [...]
> > - The committee made sure that the translation limits do not apply
> > to an object allocated at run-time. Then what makes the following
> > attempt result in UB?
> >
> > calloc(SIZE_MAX, 2);
> >
> > - Or where does the standard say that the above call to calloc has
> > to return NULL rather than simply resulting in UB?
> [...]
>
> I think I can answer that one. The standard's description of calloc()
> says it either allocates the requested space or returns NULL; there
> are no other choices. Since there's no practical reason why an
> implementation *can't* detect any error conditions, there's no need to
> allow undefined behavior. (This applies whether allocating more than
> SIZE_MAX bytes is allowed or not.)
>

Assuming that the standard requires the size of every single object
not to exceed SIZE_MAX (yes, I believed it allowed such a huge
object, even if that leads us to other problems), a call like
calloc(SIZE_MAX, 2) can be taken as invoking UB (*as* int
a[SIZE_MAX][2] is), after which all bets are off and which makes the
overflow check in calloc unnecessary. That's the reason I asked those
questions.

If the following are intended to hold:

- int a[SIZE_MAX][2] is UB (specified in DR266)
- sizeof(a) is UB (specified in DR266)
- sizeof(int [SIZE_MAX][2]) is UB (probably because of the
description of sizeof)
- calloc(SIZE_MAX, 2) *always* returns NULL
- int vla[n][2] (where n is assigned SIZE_MAX) is UB
- sizeof(vla) is UB (probably because of the description of sizeof)
- sizeof(int [n][2]) is UB (probably because of the description of
sizeof)

(I tried to make an exhaustive list; is there anything missing?)

some clarification should be put into the standard. At the very least,
the current wording does not answer the questions about the reasons
for the fourth and fifth items.

Keith Thompson

Jan 9, 2007, 5:28:20 AM
"Jun Woong" <wo...@icu.ac.kr> writes:
> Keith Thompson wrote:
>> "Jun Woong" <wo...@icu.ac.kr> writes:
>> [...]
>> > - The committee made sure that the translation limits do not apply
>> > to an object allocated at run-time. Then what makes the following
>> > attempt result in UB?
>> >
>> > calloc(SIZE_MAX, 2);
>> >
>> > - Or where does the standard say that the above call to calloc has
>> > to return NULL rather than simply resulting in UB?
>> [...]
>>
>> I think I can answer that one. The standard's description of calloc()
>> says it either allocates the requested space or returns NULL; there
>> are no other choices. Since there's no practical reason why an
>> implementation *can't* detect any error conditions, there's no need to
>> allow undefined behavior. (This applies whether allocating more than
>> SIZE_MAX bytes is allowed or not.)
>>
>
> Assuming that the standard requires the size of every single object
> not to exceed SIZE_MAX (yes, I believed it allowed such a huge
> object, even if that leads us to other problems), a call like
> calloc(SIZE_MAX, 2) can be taken as invoking UB (*as* int
> a[SIZE_MAX][2] is), after which all bets are off and which makes the
> overflow check in calloc unnecessary. That's the reason I asked those
> questions.
[snip]

Hmm. If wording were going to be added to the standard requiring that
no object can exceed SIZE_MAX bytes, I'd still like it to guarantee
that calloc(SIZE_MAX, 2) must return NULL. I don't think there's any
need to allow undefined behavior; it's already possible for calloc()
to detect this error.

kuy...@wizard.net

Jan 9, 2007, 7:09:04 AM
Douglas A. Gwyn wrote:
...
> ... the spec for

> sizeof says (by omission of any alternative) that sizeof can be
> applied to any object type, ...

So sizeof(int[SIZE_MAX][SIZE_MAX]) is allowed, and will return the
correct size for that type? Or is it not an object type, and if so, on
what grounds?

I believe that the only thing we can validly conclude from
consideration of such cases is that the definition of how sizeof works
was not properly written to cover those cases.

David R Tribble

Jan 9, 2007, 6:54:01 PM
Peter Nilsson wrote:
>> The argument still leaves the case of exactly SIZE_MAX+1 bytes,
>> e.g. calloc(SIZE_MAX/2+1,2). It is possible for strlen to return
>> the correct result for a string within such an object.
>

Christian Bau wrote:
> I think a 16 bit DOS implementation (8086 processor, 64KB segments,
> size_t = 16 bits) could have done exactly that. (I don't think any did,
> but they could have).

Perhaps more realistically nowadays we should consider implementations
having 64-bit pointers and objects no larger than 4GB (32-bit sizeof).
I believe there are, in fact, several implementations like this.

That being the case, it might still be possible for such an
implementation to support allocating user-space memory pages in
allocations larger than 4GB. Provided the pointer arithmetic is
correct, it would still be possible to access all of the bytes within
such an object without using an array index wider than 32 bits:

#include <limits.h>   // UINT_MAX
#include <stdlib.h>   // calloc, free

// Assume unsigned is 32 bits wide
// Assume size_t is 32 bits wide
// Assume char * is 64 bits wide

void big_alloc_test(void)
{
    char * obj;
    char * ptr[10];

    // Allocate a large object > 4GB
    obj = calloc(10, UINT_MAX);         // 10 x 4GB
    if (obj == NULL)
        return;

    // Assign pointers into the large object
    ptr[0] = obj;
    for (int i = 1; i < 10; i++)
        ptr[i] = ptr[i-1] + UINT_MAX;

    // Use the pointers into the object
    // ... use ptr[i][0 ... UINT_MAX-1] ...

    free(obj);
}

I don't know if this is conforming, though.
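
The chunked-access idea can be tried at a small scale without needing a huge allocation. The following is only a sketch under that assumption: chunk_pointers is a hypothetical helper, and whether the same pointer arithmetic is valid across an object larger than SIZE_MAX bytes is precisely what remains unclear.

```c
#include <stddef.h>

/* Fills ptrs[0 .. nchunks-1] with pointers to consecutive chunks of
   chunk_size bytes inside the block starting at base. The block must
   hold at least nchunks * chunk_size bytes. */
void chunk_pointers(char *base, size_t chunk_size, size_t nchunks,
                    char **ptrs)
{
    if (nchunks == 0)
        return;
    ptrs[0] = base;
    for (size_t i = 1; i < nchunks; i++)
        ptrs[i] = ptrs[i - 1] + chunk_size;
}
```

Each byte is then reached as ptrs[i][j] with j < chunk_size, so no single index ever needs to exceed the chunk size.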

-drt

Charlie Gordon

Jan 9, 2007, 8:46:26 PM

"Keith Thompson" <ks...@mib.org> wrote in message
news:lnmz4rr...@nuthaus.mib.org...

> "Charlie Gordon" <ne...@chqrlie.org> writes:
> > "Keith Thompson" <ks...@mib.org> wrote in message
> > news:ln64bfs...@nuthaus.mib.org...
> >> "Charlie Gordon" <ne...@chqrlie.org> writes:
> >> [...]
> >> > I looked up the GNU libc, calloc does test for overflow and
> >> > returns NULL. Their test is a bit smarter as they don't compute
> >> > the division if neither of size and nmemb is larger than SIZE_MAX
> >> > / 2.
> >>
> >> I think you mean (some approximation of) sqrt(SIZE_MAX).
> >
> > Yes of course, using the gazinta exponential ;-)
> >
> > 2 __________
> > |/ SIZE_MAX
> >
> > computed as: (((size_t)1) << (8 * sizeof(size_t) / 2))
> >
> > How can we get rid of the 8 bit byte assumption here ?
>
> I haven't thought about it much, but replacing 8 by CHAR_BIT would be
> a good start.

Of course! and it would even work if both CHAR_BIT and sizeof(size_t) are
odd...

Chqrlie.


Charlie Gordon

Jan 9, 2007, 5:19:41 PM
"christian.bau" <christ...@cbau.wanadoo.co.uk> wrote in message
news:1168305192.7...@i15g2000cwa.googlegroups.com...

That was called the 'huge' memory model.
Array indices were still limited to SIZE_MAX, but aggregate objects could be
larger than SIZE_MAX bytes. There were severe restrictions on the types of such
aggregate objects: for instance arrays of structures with certain sizes (powers
of 2).
Depending on the compiler vendor, such objects could be declared statically or
needed to be allocated with calloc or other non-standard library functions.
strlen() was still limited to arrays of SIZE_MAX bytes.

These days are gone, but 32/64 bit issues arise now that can revive some of
these oddities.

Let's first answer some simple questions:
7.20.3 does not specify anything regarding the simple case where nmemb * size is
greater than SIZE_MAX.
- Should the norm keep this unspecified case in limbo?
- Should it say that such parameter values invoke undefined behaviour?
- Should calloc() check for such a possibility as if by such simple code:

void *calloc(size_t nmemb, size_t size) {
    if (size > 1 && nmemb > 1 && SIZE_MAX / size < nmemb)
        return NULL;
    ...
}
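
A self-contained sketch of that kind of check, written as a wrapper around the library's calloc rather than as a replacement for it. The name checked_calloc is hypothetical; the test relies on the fact that the mathematical product exceeds SIZE_MAX exactly when nmemb > SIZE_MAX / size.

```c
#include <stdint.h>
#include <stdlib.h>

/* Returns a null pointer when nmemb * size cannot be represented in
   size_t; otherwise defers to calloc. */
void *checked_calloc(size_t nmemb, size_t size)
{
    if (size != 0 && nmemb > SIZE_MAX / size)
        return NULL;
    return calloc(nmemb, size);
}
```

With this wrapper, checked_calloc(SIZE_MAX, 2) returns a null pointer without ever computing the wrapped product, while small requests behave as with plain calloc.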

A similar issue is present in fread(7.19.8.1) and fwrite(7.19.8.2), but the
language of the norm is more precise and implementations that would naively
compute size * nmemb and try to read them are non-conformant. On another level,
I'm surprised to see it specified that fread() must make size calls to fgetc()
for each object... most implementations use smarter methods, are they
non-conformant ?

Chqrlie.


Charlie Gordon

Jan 9, 2007, 6:11:22 PM
"Charlie Gordon" <ne...@chqrlie.org> wrote in message
news:45a41476$0$29759$426a...@news.free.fr...

> Let's first answer some simple questions:
> 7.20.3 does not specify anything regarding the simple case where nmemb * size is
> greater than SIZE_MAX.
> - Should the norm keep this unspecified case in limbo?
> - Should it say that such parameter values invoke undefined behaviour?
> - Should calloc() check for such a possibility as if by such simple code:
>
> void *calloc(size_t nmemb, size_t size) {
>     if (size > 1 && nmemb > 1 && SIZE_MAX / size < nmemb)
>         return NULL;
>     ...
> }

I looked up the GNU libc, calloc does test for overflow and returns NULL.


Their test is a bit smarter as they don't compute the division if neither of
size and nmemb is larger than SIZE_MAX / 2.

Void_t*
public_cALLOc(size_t n, size_t elem_size)
{
    INTERNAL_SIZE_T bytes;
    /* size_t is unsigned so the behavior on overflow is defined. */
    bytes = n * elem_size;
#define HALF_INTERNAL_SIZE_T \
    (((INTERNAL_SIZE_T) 1) << (8 * sizeof (INTERNAL_SIZE_T) / 2))
    if ((n | elem_size) >= HALF_INTERNAL_SIZE_T) {
        if (elem_size != 0 && bytes / elem_size != n) {
            return 0;
        }
    ...

The code assumes 8-bit bytes; I didn't know that was a requirement for the GNU
libc :-(

> A similar issue is present in fread(7.19.8.1) and fwrite(7.19.8.2), but the
> language of the norm is more precise and implementations that would naively
> compute size * nmemb and try to read them are non-conformant. On another level,
> I'm surprised to see it specified that fread() must make size calls to fgetc()
> for each object... most implementations use smarter methods, are they
> non-conformant ?

As for fread and fwrite, their implementation is indeed naive, and will fail
this simple test:

// fp is a FILE* for a 2 byte file.
char buf[2];
size_t n = fread(buf, sizeof(buf), (SIZE_MAX / 2) + 1, fp);
if (n != 1)
    printf("Fail: n=%zu\n", n);

Like most C libraries, Glibc will return 0 on this test.

Even worse:
n = fread(buf, sizeof(buf), (SIZE_MAX / 2) + 2, fp);
On this one, it will return (SIZE_MAX / 2) + 2 instead of 1 !!!

n = fread(buf, SIZE_MAX, SIZE_MAX, fp);
On this one, it will return SIZE_MAX instead of 0 !!!

Try it with your favorite C library and decide whether failures on these tests
should be fixed.
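
The expected results quoted in these tests can be derived without ever multiplying size by nmemb. A minimal sketch, assuming the number of bytes remaining in the stream is known; expected_items is a hypothetical helper, not part of any library.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of complete elements fread should report when only
   available_bytes remain in the stream. Uses a division, so the
   product size * nmemb is never formed and cannot wrap. */
size_t expected_items(size_t available_bytes, size_t size, size_t nmemb)
{
    if (size == 0 || nmemb == 0)
        return 0;
    size_t whole = available_bytes / size;
    return whole < nmemb ? whole : nmemb;
}
```

For a 2-byte file this gives 1 for the first two calls (size 2, nmemb SIZE_MAX/2+1 or SIZE_MAX/2+2) and 0 for the third (size SIZE_MAX), which are the values the tests treat as correct.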

There is always one more bug ;-)

Chqrlie.


Keith Thompson

Jan 9, 2007, 6:29:52 PM
"Charlie Gordon" <ne...@chqrlie.org> writes:
[...]

> I looked up the GNU libc, calloc does test for overflow and returns NULL.
> Their test is a bit smarter as they don't compute the division if neither of
> size and nmemb is larger than SIZE_MAX / 2.

I think you mean (some approximation of) sqrt(SIZE_MAX).


Charlie Gordon

Jan 9, 2007, 7:53:03 PM
"Keith Thompson" <ks...@mib.org> wrote in message
news:ln64bfs...@nuthaus.mib.org...
> "Charlie Gordon" <ne...@chqrlie.org> writes:
> [...]
> > I looked up the GNU libc, calloc does test for overflow and returns NULL.
> > Their test is a bit smarter as they don't compute the division if neither of
> > size and nmemb is larger than SIZE_MAX / 2.
>
> I think you mean (some approximation of) sqrt(SIZE_MAX).

Yes of course, using the gazinta exponential ;-)

2 __________
|/ SIZE_MAX

computed as: (((size_t)1) << (8 * sizeof(size_t) / 2))

How can we get rid of the 8 bit byte assumption here ?

Chqrlie.


Keith Thompson

Jan 9, 2007, 8:11:46 PM

I haven't thought about it much, but replacing 8 by CHAR_BIT would be
a good start.
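
A sketch of what that substitution might look like, modelled loosely on the glibc test quoted earlier in the thread. It assumes size_t has no padding bits and an even number of value bits; HALF_SIZE_T and product_overflows are names invented for this illustration.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* 2 to the power of half the width of size_t, roughly sqrt(SIZE_MAX). */
#define HALF_SIZE_T (((size_t)1) << (CHAR_BIT * sizeof(size_t) / 2))

/* Reports whether n * elem_size exceeds SIZE_MAX. The division is
   only performed when at least one operand is large enough for the
   product to overflow at all. */
bool product_overflows(size_t n, size_t elem_size)
{
    size_t bytes = n * elem_size;   /* unsigned, so wraparound is defined */
    if ((n | elem_size) >= HALF_SIZE_T) {
        if (elem_size != 0 && bytes / elem_size != n)
            return true;
    }
    return false;
}
```

If both operands are below HALF_SIZE_T, their product is below SIZE_MAX+1, so the cheap bitwise test is enough to skip the division in the common case.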


Peter Nilsson

Jan 9, 2007, 10:08:23 PM
David R Tribble wrote:
> ...

> Perhaps more realistically nowadays we should consider implementations
> having 64-bit pointers and objects no larger than 4GB (32-bit sizeof).
> I believe there are, in fact, several implementations like this.

It would be useful if you could name any where calloc is capable of
allocations of exactly 4GB or larger.

> That being the case, it might still be possible for such an
> implementation to support allocating user-space memory pages in
> allocations larger than 4GB. Provided the pointer arithmetic is
> correct, it would still be possible to access all of the bytes within
> such an object without using an array index wider than 32 bits:

<snip>


> I don't know if this is conforming, though.

That's the million dollar question. [The intent appears to be no, but
the standard's wording appears to be... maybe.]

As previously mentioned though, if the suggestions in N1085 make
it into the standard, calloc will have no choice but to only allocate
an object whose size is representable as a size_t.

--
Peter

christian.bau

Jan 10, 2007, 4:58:40 PM
Charlie Gordon wrote:
> Lets we first answer simple questions:
> 7.20.3 does not specify anything regarding the simple case where nmemb * size is
> greater than SIZE_MAX.

It specifies that calloc either returns a pointer to space for nmemb
objects of size bytes each, or NULL. If the product is larger than
SIZE_MAX then returning space for these objects may be difficult,
impossible, or not allowed for some other reason in the C Standard;
that doesn't change the basic requirement: Return a pointer to enough
space, or NULL.

That specification is quite clear.
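
A minimal sketch of a calloc-like wrapper that takes the "or NULL" branch whenever the mathematical product is not representable in size_t. The name checked_calloc is hypothetical, and this is not how any particular library implements calloc():

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper, not any real library's calloc. */
void *checked_calloc(size_t nmemb, size_t size)
{
    /* Reject requests whose mathematical product exceeds SIZE_MAX. */
    if (size != 0 && nmemb > SIZE_MAX / size)
        return NULL;

    size_t total = nmemb * size;           /* cannot wrap after the check */
    void *p = malloc(total ? total : 1);   /* sidestep malloc(0) variations */
    if (p != NULL)
        memset(p, 0, total);
    return p;
}
```

The division avoids computing nmemb * size before it is known to fit, so a wrapped product is never passed to malloc().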

Douglas A. Gwyn

Jan 12, 2007, 1:06:20 AM
"christian.bau" <christ...@cbau.wanadoo.co.uk> wrote ...

> It specifies that calloc either returns a pointer to space for nmemb
> objects of size bytes each, or NULL. If the product is larger than
> SIZE_MAX then returning space for these objects may be difficult,

Actually it returns a pointer to an array (which is an object)
of nmemb objects of size bytes each, or a null pointer.
It is the objectness of the array that requires calloc to fail if
the size of that array would exceed what sizeof can represent.


Keith Thompson

Jan 12, 2007, 1:42:25 AM
"Douglas A. Gwyn" <DAG...@null.net> writes:

Can you support this assertion (that no object can exceed SIZE_MAX
bytes) by citing the standard? The authors of the standard, including
yourself, may have had such a restriction in mind when they wrote the
standard, but if it's not expressed in the words of the standard
itself, it doesn't count.

kuy...@wizard.net

Jan 12, 2007, 6:01:30 AM

I've repeatedly asked for a citation to justify this assertion; I
haven't seen one yet.

Consider:

char c[SIZE_MAX][SIZE_MAX];                          // 1
size_t n = sizeof c;                                 // 2
n = sizeof(char[SIZE_MAX][SIZE_MAX]);                // 3
void *pv = calloc(SIZE_MAX, SIZE_MAX);               // 4
if (pv)
{
    char (*pvec)[SIZE_MAX] = (char (*)[SIZE_MAX])pv; // 5
    char (*parr)[SIZE_MAX][SIZE_MAX] =
        (char (*)[SIZE_MAX][SIZE_MAX])pv;            // 6
    n = sizeof(*pvec);                               // 7
    n = sizeof(*parr);                               // 8

    free(pv);
}

It is clearly not possible for sizeof() to meet its requirements on
lines 2, 3, and 8. However, the standard says nothing about how that
fact is to be dealt with. If there's a syntax error, it's my fault;
this example was not intended to have any. Line 1 could violate an
implementation limit, though an implementation is certainly not
required to set the limit that low. There's certainly no constraint
violation. It isn't implicitly undefined behavior, because there is
well-defined behavior for sizeof that applies to these cases; it's
simply a definition that's impossible to implement. It is trivially
feasible for an implementation to diagnose these cases at compile time.
I think that a diagnostic should be made mandatory, by making lines 1,
3, and 6 constraint violations.

However, the standard clearly specifies the behavior of the calloc()
call on line 4. It must either allocate enough memory, zero it, and
return a pointer to that memory, or return a null pointer. If it
returns a non-null pointer, there's no intrinsic reason why lines 5 and
7 should cause any problems. If it's the committee's intent that this
not be allowed, the standard should be modified to say so; but I see no
reason why the committee has to disallow it.

Douglas A. Gwyn

Jan 12, 2007, 7:26:27 PM
Keith Thompson wrote:
> Can you support this assertion (that no object can exceed SIZE_MAX
> bytes) by citing the standard? The authors of the standard, including
> yourself, may have had such a restriction in mind when they wrote the
> standard, but if it's not expressed in the words of the standard
> itself, it doesn't count.

The specification for sizeof says that it returns the size of
the specified object type, and does not provide any exception
to that requirement. Also, the specification for size_t says
that it is defined as the type returned by sizeof. Clearly,
sizeof could not meet its spec if size_t is incapable of
representing the (correct) value.

Douglas A. Gwyn

Jan 12, 2007, 7:37:11 PM
kuy...@wizard.net wrote:
> I've repeatedly asked for a citation to justify this assertion; I
> haven't seen one yet.

I cited the relevant portions of the standard previously.

> char c[SIZE_MAX][SIZE_MAX]; // 1

Undefined behavior (see the definition of that term):
an erroneous construct, because it attempts to construct
an object that is larger than the implementation supports...

> size_t n = sizeof c; //2

... as illustrated by this construct. If c were a valid
object, this line would be a valid construct, so taking
the contrapositive, the evident invalidity of this proves
the invalidity of the object c.

> n = sizeof(char[SIZE_MAX][SIZE_MAX]); //3

Not essentially different from //1 and //2.

> void *pv = calloc(SIZE_MAX, SIZE_MAX); //4

In order for calloc to meet its spec, it must return a
null pointer value, since otherwise it has to return a
pointer to a valid object of the specified size, and we
can easily see that there isn't such an object (same as
//1 except the burden of compliance is on the
implementation rather than on the programmer).

> if(pv)

The rest is irrelevant since pv must be null.

Richard Tobin

Jan 12, 2007, 8:35:54 PM
In article <45A82733...@null.net>,

Douglas A. Gwyn <DAG...@null.net> wrote:

>The specification for sizeof says that it returns the size of
>the specified object type, and does not provide any exception
>to that requirement.

But if I do

char *p = calloc(100000, 100000);

there is no way to apply sizeof to the object created by calloc(),
so the fact that sizeof returns a size_t is irrelevant.

To be concrete, on a system where size_t is 32 bits, exactly what use
of sizeof would be unable to return a size_t as required, as a
result of an implementation allowing the above call to calloc() to
succeed?

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.

Richard Tobin

Jan 12, 2007, 9:00:37 PM
>In article <45A82733...@null.net>,
>Douglas A. Gwyn <DAG...@null.net> wrote:

>>The specification for sizeof says that it returns the size of
>>the specified object type, and does not provide any exception
>>to that requirement.

Thinking further about this, it seems to me that we can reason:

- sizeof returns the size of any type;
- sizeof returns a size_t;
- so no type can have a size that cannot fit in a size_t.
- but every expression has a type (I assume, I didn't find c&v for this);
- so no expression can refer to the object allocated by

> char *p = calloc(100000, 100000);

(given 32-bit size_t) because that expression would not have a type.

But this still does not prove that such an object cannot be created by
calloc(), and it does not require an expression of the impossible type
to use the object.
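
To illustrate that last point, here is a sketch that touches every byte of a calloc(rows, cols) block one row at a time, so no expression ever has a type or a value as large as rows * cols. The helper name is hypothetical, and whether pointer arithmetic across an object larger than SIZE_MAX bytes is well defined is part of what this thread is arguing about:

```c
#include <stddef.h>

/* Hypothetical helper: visit rows * cols bytes starting at base
 * without ever forming the product rows * cols. */
int rows_all_zero(const void *base, size_t rows, size_t cols)
{
    const unsigned char *row = base;
    for (size_t i = 0; i < rows; i++, row += cols) {
        for (size_t j = 0; j < cols; j++) {
            if (row[j] != 0)
                return 0;   /* found a nonzero byte */
        }
    }
    return 1;               /* every byte visited was zero */
}
```

Every index used here is bounded by rows or cols individually, both of which are ordinary size_t values.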

Keith Thompson

Jan 12, 2007, 8:55:51 PM
"Douglas A. Gwyn" <DAG...@null.net> writes:

If I call

calloc(SIZE_MAX/2, 3);

and it returns a non-null result, what is the object type which sizeof
would not be able to handle?

The description of the unary "-" operator says:

The result of the unary - operator is the negative of its
(promoted) operand. The integer promotions are performed on the
operand, and the result has the promoted type.

It does not provide any exception to that requirement. But it clearly
can overflow if, for example, the operand is INT_MIN and int has a
two's-complement representation with no padding bits.

C99 6.5p5 says:

If an _exceptional condition_ occurs during the evaluation of an
expression (that is, if the result is not mathematically defined
or not in the range of representable values for its type), the
behavior is undefined.

The sizeof operator is described in section 6.5, Expressions. Why
does 6.5p5 not apply to sizeof?
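
The unary minus case can be made concrete with a sketch that refuses to negate a value whose negative is not representable, rather than running into 6.5p5. The helper name is made up for illustration:

```c
#include <limits.h>

/* Hypothetical helper: store -x in *out and return 1, or return 0
 * when -x is not representable in int. */
int checked_negate(int x, int *out)
{
    if (x < -INT_MAX)   /* only possible when INT_MIN == -INT_MAX - 1 */
        return 0;
    *out = -x;
    return 1;
}
```

The operator's description has no exception for INT_MIN; the guard is entirely the caller's job, which is the analogy being drawn with sizeof.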

Clark S. Cox III

Jan 12, 2007, 9:19:54 PM

That only tells us that no *type* that is larger than SIZE_MAX can
exist. calloc doesn't create a type, so that is not relevant.

--
Clark S. Cox III
clar...@gmail.com

kuy...@wizard.net

Jan 12, 2007, 11:05:59 PM
Douglas A. Gwyn wrote:
> kuy...@wizard.net wrote:
> > I've repeatedly asked for a citation to justify this assertion; I
> > haven't seen one yet.
>
> I cited the relevant portions of the standard previously.

No, you cited the same things you've mentioned here, which aren't
relevant.

> > char c[SIZE_MAX][SIZE_MAX]; // 1
>
> Undefined behavior (see the definition of that term):

How? The standard defines the meaning of that construct in the section
that describes how to declare arrays, without making any exceptions
based upon the dimensions of the array. Therefore, it's not undefined
by reason of a lack of definition. That definition leads to problems
when combined with the definition of sizeof, but the standard doesn't
specify how that conflict is to be resolved.

> an erroneous construct, because it attempts to construct
> an object that is larger than the implementation supports...

That's circular logic - you're concluding that the object is too big,
by arguing that declaration of the object has undefined behavior
because it's too big. You still haven't cited an explicit statement
that the behavior is undefined, or identified a way in which the
behavior of that declaration is any less well defined than the behavior
of the declaration of char c[128][128].

> > size_t n = sizeof c; //2
>
> ... as illustrated by this construct. If c were a valid
> object, this line would be a valid construct, so taking
> the contrapositive, the evident invalidity of this proves
> the invalidity of the object c.

There's certainly a problem here. You're asserting the equivalent of
having an explicit statement in the standard that declaring an object
with size greater than SIZE_MAX bytes has undefined behavior. If we're
going to invent extra imaginary text for the standard, that imaginary
text could equally easily be a statement that it's undefined behavior
to apply the sizeof operator to such objects. The reality is that the
standard contains neither statement, and no means of validly inferring
either one.

> > void *pv = calloc(SIZE_MAX, SIZE_MAX); //4
>
> In order for calloc to meet its spec, it must return a
> null pointer value, since otherwise it has to return a
> pointer to a valid object of the specified size, and we
> can easily see that there isn't such an object (same as
> //1 except the burden of compliance is on the
> implementation rather than on the programmer).

Since there's no named object that you can apply sizeof to, your
previous invocation of the definition of sizeof is even less relevant
here than it was before. I could vaguely see an application of that
argument to the variable I named 'parr', since you could form the
expression sizeof(*parr). Again, it's equally arguable that the
undefined behavior is in the application of the sizeof operator, and
not in the call to calloc. Neither argument is correct; the correct
statement is that the standard doesn't identify where this goes wrong; it
merely defines the behavior in a way that guarantees that something
must go wrong.

However, since you can't apply sizeof directly to *pv, I don't see how
the objectness of the allocated memory has any bearing on the question.
The standard doesn't have a requirement that objects be less than a
given size, only a requirement that sizeof, when correctly applied to
an object, must give that object's size.

Douglas A. Gwyn

Jan 13, 2007, 12:33:22 AM
> But if I do
> char *p = calloc(100000, 100000)
> there is no way to apply sizeof to the object created by calloc(),
> so the fact that sizeof returns a size_t is irrelevant.

True, you can't directly access the supposed object as a unit.
However, calloc is supposed to return a pointer to an object,
and one of the most essential properties of an object is its
size, as determinable by sizeof in many cases, so it seems at
least inconsistent for an implementation to create objects
with sizes not representable in size_t.

In other words, the intent of size_t was that each conforming
implementation should make it big enough to hold the size of
the largest object supported by the implementation. We didn't
want another of those fseek/fseeko fiascos.

Douglas A. Gwyn

Jan 13, 2007, 12:50:58 AM
kuy...@wizard.net wrote:

> Douglas A. Gwyn wrote:
> > Undefined behavior (see the definition of that term):
> How? The standard defines the meaning of that construct in the section
> that describes how to declare arrays, without making any exceptions

No, I meant the definition of "undefined behavior".

> > > size_t n = sizeof c; //2
> > ... as illustrated by this construct. If c were a valid
> > object, this line would be a valid construct, so taking
> > the contrapositive, the evident invalidity of this proves
> > the invalidity of the object c.
> There's certainly a problem here. You're asserting the equivalent of
> having an explicit statement in the standard that declaring an object
> with size greater than SIZE_MAX bytes has undefined behavior.

No, please try to understand the argument I gave rather than
some substitute for it that I did not give. c is not a valid
object because sizeof cannot work for it, contrary to the
requirements imposed by the standard. (As has recently been
pointed out, this argument doesn't work as directly for the
supposed object that might be pointed to by a non-null return
from calloc. We could discuss that further, once you grasp
why the argument does work for the declared supposed object
c.)

> ... Neither argument is correct; the correct
> statement is that the standard doesn't identify where this goes wrong; it
> merely defines the behavior in a way that guarantees that something
> must go wrong.

Actually, no: the implementation can ensure that it doesn't
go wrong by not "successfully" allocating such objects.

> The standard doesn't have a requirement that objects be less than a
> given size, only a requirement that sizeof, when correctly applied to
> an object, must give that object's size.

As to dynamically allocated storage, a cast (to an appropriate
array type) could be used to access the supposed object, if
calloc reports success. And then sizeof clearly can be
applied to the effective object thus "impressed" upon that
storage, leading back to the same contradiction and thus the
proof that that is not a conforming implementation.

kuy...@wizard.net

Jan 13, 2007, 11:35:42 AM
Douglas A. Gwyn wrote:
> kuy...@wizard.net wrote:
> > Douglas A. Gwyn wrote:
> > > Undefined behavior (see the definition of that term):
> > How? The standard defines the meaning of that construct in the section
> > that describes how to declare arrays, without making any exceptions
>
> No, I meant the definition of "undefined behavior".

So, what feature of this code meets that definition? It's not erroneous
code, it's not erroneous data - the only "error" it contains is a
violation of the rule that you're trying to argue for the existence of.
Arguing that there is a size limit because the behavior is undefined
because the code violates the size limit is circular reasoning. This
code is non-portable, in that it depends upon whether or not the size
of the array exceeds implementation limits. However, you have to read
the whole definition: non-portable code has undefined behavior only
when the standard imposes no requirements on its behavior. When the
size of that array does not exceed a given implementation's limits, the
standard very clearly imposes requirements; they just happen to be
requirements that can't be met.

> > > > size_t n = sizeof c; //2
> > > ... as illustrated by this construct. If c were a valid
> > > object, this line would be a valid construct, so taking
> > > the contrapositive, the evident invalidity of this proves
> > > the invalidity of the object c.
> > There's certainly a problem here. You're asserting the equivalent of
> > having an explicit statement in the standard that declaring an object
> > with size greater than SIZE_MAX bytes has undefined behavior.
>
> No, please try to understand the argument I gave rather than
> some substitute for it that I did not give. c is not a valid
> object because sizeof cannot work for it, contrary to the

I understand that argument - it just doesn't work. The fact that
sizeof(c) cannot work doesn't uniquely identify the declaration of c as
the invalid construct. You could equally say that it is the sizeof
expression that has undefined behavior, and that code which makes no
use of sizeof on that object has defined behavior, assuming the
implementation's size limits are high enough. The standard provides no
basis for deciding whether it's the declaration of c or the sizeof(c)
expression which is at fault.

> > ... Neither argument is correct; the correct
> > statement is that the standard doesn't identify where this goes wrong; it
> > merely defines the behavior in a way that guarantees that something
> > must go wrong.
>
> Actually, no: the implementation can ensure that it doesn't
> go wrong by not "successfully" allocating such objects.

Yes; but that's not in conflict with my statement. The fact that an
implementation can do this doesn't mean that the standard identifies
this as something an implementation must do. It doesn't, unless you've
got an additional citation from the standard that you haven't brought
into the discussion yet. An implementation can also avoid the problem
by rejecting the sizeof expression, or implementing it to return the
size of the object modulo 1 more than SIZE_MAX, or any number of other
ways. The standard, by imposing conflicting requirements, doesn't
distinguish between the many different ways that those conflicts could
be resolved.
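
As one illustration of the "modulo 1 more than SIZE_MAX" option, this sketch models a 32-bit size_t with uint32_t and computes what a wrapped size would be. The function name is hypothetical:

```c
#include <stdint.h>

/* Hypothetical helper: the product of a and b reduced modulo 2**32,
 * which is what a 32-bit size_t multiplication would yield. */
uint32_t wrapped_product32(uint32_t a, uint32_t b)
{
    /* Widen first so the full product is computed, then truncate. */
    return (uint32_t)((uint64_t)a * b);
}
```

For the calloc(65521, 65552) example from the start of this thread, wrapped_product32(65521, 65552) gives 65296, while the mathematical product is 4295032592.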

> > The standard doesn't have a requirement that objects be less than a
> > given size, only a requirement that sizeof, when correctly applied to
> > an object, must give that object's size.
>
> As to dynamically allocated storage, a cast (to an appropriate
> array type) could be used to access the supposed object, if
> calloc reports success. And then sizeof clearly can be
> applied to the effective object thus "impressed" upon that
> storage, leading back to the same contradiction and thus the
> proof that that is not a conforming implementation.

But, by the same argument, all dynamically allocated objects are prohibited. I
can take any pointer to void which points at dynamically allocated
memory, regardless of the size of the block of memory that it points
at. I can convert it into a pointer (let's call it pT) to an arbitrary
type T - it's guaranteed to be correctly aligned for type T. The
results of sizeof(*pT) depend only upon the type which pT points at.
They don't depend upon the place where pT points at, or whether or not
there's an object with an effective type of T at the location where the
pointer points; the dereferencing operator is not actually evaluated
inside a sizeof expression.
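
A small sketch of that point: a one-byte allocation viewed through a pointer to a much larger array type. The function name and the size 1000 are arbitrary choices for illustration:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical demo: sizeof(*pT) reflects only the pointed-to type. */
size_t pointee_size_of_tiny_block(void)
{
    void *pv = malloc(1);      /* one byte, or possibly a null pointer */
    char (*pT)[1000] = pv;     /* pointer to an array of 1000 chars */
    size_t n = sizeof(*pT);    /* operand is not evaluated: no dereference */
    free(pv);
    return n;
}
```

The result is 1000 whether malloc(1) succeeded or not, because the size comes from the type char[1000] and not from the block that pv points at.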

kuy...@wizard.net

Jan 13, 2007, 11:36:35 AM
Douglas A. Gwyn wrote:
> > But if I do
> > char *p = calloc(100000, 100000)
> > there is no way to apply sizeof to the object created by calloc(),
> > so the fact that sizeof returns a size_t is irrelevant.
>
> True, you can't directly access the supposed object as a unit.
> However, calloc is supposed to return a pointer to an object,
> and one of the most essential properties of an object is its
> size, as determinable by sizeof in many cases, so it seems at
> least inconsistent for an implementation to create objects
> with sizes not representable in size_t.

The key question is, is it inconsistent with the standard. The standard
doesn't define size_t as being able to store the size of any object, it
only documents it as the type of a sizeof expression. Without the
application of sizeof, your argument doesn't connect to SIZE_MAX.

> In other words, the intent of size_t was that each conforming
> implementation should make it big enough to hold the size of
> the largest object supported by the implementation. We didn't
> want another of those fseek/fseeko fiascos.

Intent is one thing; the actual words of the standard are another. Was
that intent clearly expressed in the words of the standard? If so,
where?

Keith Thompson

Jan 13, 2007, 3:24:39 PM
"Douglas A. Gwyn" <DAG...@null.net> writes:
[...]

> In other words, the intent of size_t was that each conforming
> implementation should make it big enough to hold the size of
> the largest object supported by the implementation. We didn't
> want another of those fseek/fseeko fiascos.

And how is that "intent" expressed in the normative wording of the
standard? As far as I can tell, it isn't.

If you say that's the intent, I'll take your word for it -- but that's
*not* what I'm asking.

jacob navia

Jan 13, 2007, 4:15:34 PM
kuy...@wizard.net a écrit :
C Standard 7.21.6.3 The strlen function

size_t strlen(const char *s);

If you use that object with size bigger than size_t to store a string,
strlen could not return the size of the object.

Q.E.D

jacob navia

Jan 13, 2007, 4:17:02 PM
Douglas A. Gwyn wrote:

size_t strlen(const char *s);

If you store a string in such an object, strlen can't
return its length, hence that object is illegal.

Q.E.D

:-)

Keith Thompson

Jan 13, 2007, 4:45:38 PM

Ok. So what if you *don't* store a string in such an object? What's
wrong with a program that creates an object bigger than SIZE_MAX
bytes, and either doesn't store a string in it or doesn't attempt to
apply strlen() to it?

(And even if you do call strlen() with a pointer to such an object, I
*think* it invokes undefined behavior, though I'm not 100% certain
this is supported by the standard.)

jacob navia

Jan 13, 2007, 5:32:24 PM
Keith Thompson wrote:

> jacob navia <ja...@jacob.remcomp.fr> writes:
>
>>Douglas A. Gwyn wrote:
>>
>>>>But if I do
>>>> char *p = calloc(100000, 100000)
>>>>there is no way to apply sizeof to the object created by calloc(),
>>>>so the fact that sizeof returns a size_t is irrelevant.
>>>
>>>True, you can't directly access the supposed object as a unit.
>>>However, calloc is supposed to return a pointer to an object,
>>>and one of the most essential properties of an object is its
>>>size, as determinable by sizeof in many cases, so it seems at
>>>least inconsistent for an implementation to create objects
>>>with sizes not representable in size_t.
>>>In other words, the intent of size_t was that each conforming
>>>implementation should make it big enough to hold the size of
>>>the largest object supported by the implementation. We didn't
>>>want another of those fseek/fseeko fiascos.
>>
>>size_t strlen(const char *s);
>>
>>If you store a string in such an object, strlen can't
>>return its length, hence that object is illegal.
>>
>>Q.E.D
>
>
> Ok. So what if you *don't* store a string in such an object? What's
> wrong with a program that creates an object bigger than SIZE_MAX
> bytes, and either doesn't store a string in it or doesn't attempt to
> apply strlen() to it?

Nothing. I am just saying that it is not legal because
it *could* have been used to store a string whose length
does not comply with the strlen specs.

If calloc returns a valid pointer to some chunk of contiguous
memory and you can use the hardware to address that, then by
all means do not be ashamed because of a little sin...
In that case sizeof(void *) > sizeof(size_t).

Don't forget however that you can't pass that pointer to realloc
since THAT would fail. That function receives a size_t.

Besides, other functions that receive a size_t could be slightly disturbed:

Functions that receive a void * and
a size_t will not work, "including but not limited to"
memcpy, memset, and family.

Yes, everything can be constructed by some implementation
including that calloc returns a special kind of pointer
(by changing the prototype?), or that the compiler
"automagically" knows and keeps track of several families
of pointers with size_t limit and the others etc...

But all those are just theoretical exercises, Keith.

Yes, maybe it would be better that the standard says so explicitly
but...

Keith Thompson

Jan 13, 2007, 5:59:19 PM

C99 7.1.4:

If an argument to a function has an invalid value (such as a value
outside the domain of the function, or [...]) or [...], the
behavior is undefined.

If it's possible to create a string bigger than SIZE_MAX bytes, then
such a string's address is an invalid value for strlen(), and calling
strlen() with such an argument invokes undefined behavior.

> If calloc returns a valid pointer to some chunk of contiguous
> memory and you can use the hardware to address that, then by
> all means do not be ashamed because of a little sin...

"ashamed"? "sin"? Sorry, I have no idea what your point is here.

> In that case sizeof(void *) > sizeof(size_t).

Not necessarily, due to padding bits. But in any case, size_t only
needs to hold the size of an object (though possibly not of all
possible objects); void* needs to uniquely address each byte of *each*
object.

If I can create an object bigger than SIZE_MAX bytes, there are some
things you can't do with that object without invoking undefined
behavior. As long as you avoid those things, I don't see any problem.
(Of course, an implementation doesn't *have* to support objects bigger
than SIZE_MAX bytes; my argument is that it's allowed to.)

> Don't forget however that you can't pass that pointer to realloc
> since THAT would fail. That function receives a size_t.

Undefined behavior, as above. (It would be reasonable for realloc()
to treat this as an error and return a null pointer, but it's not
required.)

> Besides other functions that receive a size_t could be slightly disturbed:
>
> Functions that receive a void * and
> a size_t will not work, "including but not limited to"
> memcpy, memset, and family.
>
> Yes, everything can be constructed by some implementation
> including that calloc returns a special kind of pointer
> (by changing the prototype?), or that the compiler
> "automagically" knows and keeps track of several families
> of pointers with size_t limit and the others etc...

There's no need for any "special kind of pointer", just an ordinary
void*.

> But all those are just theoretical exercises, Keith.

And I believe they're all covered by the standard (as undefined
behavior if nothing else).

> Yes, maybe it would be better that the standard says so explicitly
> but...

But what?

kuy...@wizard.net

Jan 14, 2007, 10:48:54 AM
jacob navia wrote:
> kuy...@wizard.net wrote:
> > Douglas A. Gwyn wrote:
...

> >>In other words, the intent of size_t was that each conforming
> >>implementation should make it big enough to hold the size of
> >>the largest object supported by the implementation. We didn't
> >>want another of those fseek/fseeko fiascos.
> >
> >
> > Intent is one thing; the actual words of the standard are another. Was
> > that intent clearly expressed in the words of the standard? If so,
> > where?
> >
> C Standard 7.21.6.3 The strlen function
>
> size_t strlen(const char *s);
>
> If you use that object with size bigger than size_t to store a string,
> strlen could not return the size of the object.

Which means that passing a pointer to such a string to strlen() would
be a problem. Like previous arguments, however, this one fails to
identify what step in that process, if any, has undefined behavior. I
like Keith's argument that such a pointer is an invalid argument for
strlen() (and similar functions). However, in the absence of text that
explicitly covers that case, I'm not sure his argument is valid. Even if
that were indeed the intent when that clause was written, it's too
obscure a way of imposing such a requirement. Such a requirement should
be expressed directly, either in the location where size_t is defined,
or possibly in the one where SIZE_MAX is defined.

The reality is that there's something missing from the standard: an
explicit statement that some particular construct has undefined
behavior when the relevant object would have a size greater than
SIZE_MAX. I think that making any use (including as an argument to
sizeof) of a type whose size is greater than SIZE_MAX should be
undefined behavior. However, when the memory that contains an object
was allocated by a call to calloc(), it seems to me that there's a
small but real advantage to allowing the object to have a size greater
than SIZE_MAX, so long as that object is never accessed using an
over-sized type. I'm not arguing strongly for this; I would be quite
comfortable if such allocations were explicitly forbidden. What I
object to is the claim that the existing text already clearly forbids
them.

Richard Tobin

Jan 14, 2007, 11:56:29 AM
In article <45a94c4d$0$25911$ba4a...@news.orange.fr>,
jacob navia <ja...@jacob.remcomp.fr> wrote:

>size_t strlen(const char *s);
>
>If you store a string in such an object, strlen can't
>return its length, hence that object is illegal.
>
>Q.E.D

How do you know that it's the object that's illegal, rather than calling
strlen on it? What if the memory doesn't contain a string, is it illegal
then?

Richard Tobin

Jan 14, 2007, 11:59:07 AM
In article <45a95df8$0$27409$ba4a...@news.orange.fr>,
jacob navia <ja...@jacob.remcomp.fr> wrote:

>Nothing. I am just saying that it is not legal because
>it *could* have been used to store a string whose length
>does not comply with the strlen specs.

char *p = malloc(1);
p[0] = 'a';
strlen(p);

Oops, I stored something in the memory allocated by malloc(1) that
does not comply with the strlen() spec. So do we conclude that
malloc(1) must fail? No, we conclude that you mustn't call strlen()
on such a thing.
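
For memory that may not hold a terminated string, a caller who knows the block size can bound the scan instead of calling strlen(). A sketch, with a made-up helper name, assuming cap is no larger than the number of readable bytes at s:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the string at s, or cap if no
 * terminating null character is found in the first cap bytes. */
size_t bounded_strlen(const char *s, size_t cap)
{
    const char *end = memchr(s, '\0', cap);
    return end != NULL ? (size_t)(end - s) : cap;
}
```

With the malloc(1) block above, bounded_strlen(p, 1) stays inside the allocation, where strlen(p) would read past it.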

jacob navia

Jan 14, 2007, 3:32:37 PM
Richard Tobin wrote:

> In article <45a94c4d$0$25911$ba4a...@news.orange.fr>,
> jacob navia <ja...@jacob.remcomp.fr> wrote:
>
>
>>size_t strlen(const char *s);
>>
>>If you store a string in such an object, strlen can't
>>return its length, hence that object is illegal.
>>
>>Q.E.D
>
>
> How do you know that it's the object that's illegal, rather than calling
> strlen on it? What if the memory doesn't contain a string, is it illegal
> then?
>
> -- Richard

You can't

memset
memcpy
bsearch
memchr
realloc
or do anything with that memory...

But this whole discussion is completely useless. It is obvious
that nobody in the standards committee thought of this weird case.

It is obvious for you that this is not written down and it should be.

I think so too, but in my opinion, this doesn't have any
practical significance...

Keith Thompson

Jan 14, 2007, 3:59:17 PM
kuy...@wizard.net writes:
> jacob navia wrote:
[...]

>> If you use that object with size bigger than size_t to store a string,
>> strlen could not return the size of the object.
>
> Which means that passing a pointer to such a string to strlen() would
> be a problem. Like previous arguments, however, this one fails to
> identify what step in that process, if any, has undefined behavior. I
> like Keith's argument that such a pointer is an invalid argument for
> strlen() (and similar functions). However, in the absence of text that
> explicitly covers that case, I'm not sure his argument is valid.

I'm not entirely sure of it either.

If the authors of the standard implicitly assumed that no object can
exceed SIZE_MAX bytes, but merely neglected to say so, it's not
surprising that it's difficult to prove that such objects are allowed.

A decision needs to be made one way or the other, and the standard
needs to be augmented to clearly state it.

> Even if
> that were indeed the intent when that clause was written, it's too
> obscure a way of imposing such a requirement. Such a requirement should
> be expressed directly, either in the location where size_t is defined,
> or possibly in the one where SIZE_MAX is defined.

Or perhaps even where "object" is defined.

> The reality is that there's something missing from the standard: an
> explicit statement that some particular construct has undefined
> behavior when the relevant object would have a size greater than
> SIZE_MAX.

Assuming we want to ban objects bigger than SIZE_MAX bytes, there are
cases that should be constraint violations or run-time errors rather
than undefined behavior. For example, declaring an object whose size
can be determined at compilation time (i.e., not a VLA) to be bigger
than SIZE_MAX bytes could be a constraint violation. calloc() could be
required to return a null pointer if the mathematical product of its
arguments exceeds SIZE_MAX.
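
A minimal sketch of the check described above, assuming a hypothetical wrapper named checked_calloc (the name and the malloc()-plus-memset() structure are illustrative assumptions, not anything the standard specifies). It returns a null pointer whenever the mathematical product of its arguments would exceed SIZE_MAX:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper: refuse any request whose mathematical
   product nmemb * size cannot be represented in size_t. */
void *checked_calloc(size_t nmemb, size_t size)
{
    /* Dividing avoids performing the multiplication that might wrap. */
    if (size != 0 && nmemb > SIZE_MAX / size) {
        return NULL;
    }
    size_t total = nmemb * size;   /* cannot wrap at this point */
    void *p = malloc(total);
    if (p != NULL) {
        memset(p, 0, total);       /* calloc() zero-fills the storage */
    }
    return p;
}
```

With this sketch, checked_calloc(SIZE_MAX/2, 3) fails cleanly instead of allocating a wrapped-around size.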

> I think that making any use (including as an argument to
> sizeof) of a type whose size is greater than SIZE_MAX should be
> undefined behavior. However, when the memory that contains an object
> was allocated by a call to calloc(), it seems to me that there's a
> small but real advantage to allowing the object to have a size greater
> than SIZE_MAX, so long as that object is never accessed using an
> over-sized type. I'm not arguing strongly for this; I would be quite
> comfortable if such allocations were explicitly forbidden. What I
> object to is the claim that the existing text already clearly forbids
> them.

Allowing objects bigger than SIZE_MAX bytes would cause some real
problems. For example, a library that allows information about
arbitrary objects to be represented as an address (void*) and a size
(size_t) would not work for such objects; see memcpy(), for example.
And banning such objects shouldn't be a burden on implementers; if you
want to support huge objects, make size_t bigger.
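
As an illustration of the kind of library interface meant here, the following sketch describes an object by an address and a size_t. The struct region type and the region_copy function are hypothetical names invented for this example. Because the size field is a size_t, such a descriptor has no way to refer to an object of more than SIZE_MAX bytes as a whole:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor: an object identified by address and size. */
struct region {
    void  *addr;
    size_t size;   /* cannot describe more than SIZE_MAX bytes */
};

/* Copy src into dst if it fits; returns 0 on success, -1 otherwise. */
int region_copy(struct region dst, struct region src)
{
    if (src.size > dst.size) {
        return -1;
    }
    memcpy(dst.addr, src.addr, src.size);
    return 0;
}
```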

As far as I'm concerned, an agreement that (a) the intent is to
disallow objects bigger than SIZE_MAX bytes, but (b) the standard does
not express this intent should end this whole debate. Probably a DR
would be appropriate. I object to unfounded assertions that the
standard is already clear on this point.

Keith Thompson

Jan 14, 2007, 4:47:37 PM
jacob navia <ja...@jacob.remcomp.fr> writes:
> Richard Tobin a écrit :
>> In article <45a94c4d$0$25911$ba4a...@news.orange.fr>,
>> jacob navia <ja...@jacob.remcomp.fr> wrote:
>>
>>>size_t strlen(const char *s);
>>>
>>>If you store a string in such an object, strlen can't
>>>return its length, hence that object is illegal.
>>>
>>>Q.E.D
>> How do you know that it's the object that's illegal, rather than
>> calling
>> strlen on it? What if the memory doesn't contain a string, is it illegal
>> then?
>
> You can't
>
> memset
> memcpy
> bsearch
> memchr
> realloc
> or do anything with that memory...

I don't see why you couldn't use bsearch(), which doesn't take an
argument specifying the size of the whole object. Like calloc(), it
takes two size_t arguments specifying a number of elements and the
size of each element.

Furthermore, you could use any of the above with a *subset* of a huge
object, and you could do anything you like with anything that doesn't
require the size of the entire object to be stored in a size_t. For
example, if unsigned long long is wider than size_t, you can use values
of type unsigned long long as array indices over the entire object.
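
A rough sketch of the "subset" idea, assuming a hypothetical helper named fill_in_chunks. It tracks the total length in an unsigned long long and only ever hands memset() a piece whose size fits in a size_t. Whether pointer arithmetic across an object larger than SIZE_MAX bytes is itself well defined is part of what is being debated; the sketch simply takes that for granted:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: set `total` bytes starting at `base` to `value`,
   calling memset() on pieces of at most `chunk` bytes each.
   `total` is unsigned long long so that it is not limited by size_t. */
void fill_in_chunks(unsigned char *base, unsigned long long total,
                    size_t chunk, int value)
{
    if (chunk == 0) {
        return;                      /* nothing sensible to do */
    }
    while (total > 0) {
        /* Each piece is no larger than `chunk`, so it fits in size_t. */
        size_t n = (total < chunk) ? (size_t)total : chunk;
        memset(base, value, n);
        base  += n;
        total -= n;
    }
}
```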


>
> But this whole discussion is completely useless. It is obvious
> that nobody in the standards committee thought of this weird case.
>
> It is obvious for you that this is not written down and it should be.
>
> I think so too, but in my opinion, this doesn't have any
> practical significance...

And contrary opinions have been expressed (see Doug Gwyn's responses).

This started when I asked a specific question:

May calloc() allocate more than SIZE_MAX bytes?

and that's what we've been discussing. If you think the discussion is
useless, why are you participating?

kuy...@wizard.net

Jan 14, 2007, 6:52:01 PM
jacob navia wrote:
> Richard Tobin a écrit :
> > In article <45a94c4d$0$25911$ba4a...@news.orange.fr>,
> > jacob navia <ja...@jacob.remcomp.fr> wrote:
> >
> >
> >>size_t strlen(const char *s);
> >>
> >>If you store a string in such an object, strlen can't
> >>return its length, hence that object is illegal.
> >>
> >>Q.E.D
> >
> >
> > How do you know that it's the object that's illegal, rather than calling
> > strlen on it? What if the memory doesn't contain a string, is it illegal
> > then?
> >
> > -- Richard
>
> You can't
>
> memset
> memcpy
> bsearch
> memchr
> realloc

With the exception of bsearch(), I agree that those functions can't be
used on the whole object allocated by calloc(), if that object were
larger than SIZE_MAX. However, there's no reason why the mem*() family
can't be used on any arbitrary subset of the object with a size less
than SIZE_MAX. bsearch() should work fine so long as the nmemb and size
arguments are in range, regardless of what their product is.

> or do anything with that memory...

The single most plausible case where you might want to use calloc() to
allocate more than SIZE_MAX bytes is when calloc(n_elements, sizeof(T))
is used to dynamically allocate an array of n_elements of objects of
type T, where n_elements and sizeof(T) are both <= SIZE_MAX. An
implementation that chose to support such an allocation should give you
no problems when accessing any single element of the array, nor when
performing pointer arithmetic on pointers to elements of that array,
nor when using the mem*() and str*() families on portions of the array
with a length shorter than SIZE_MAX. That seems to leave open a pretty
wide variety of uses for such memory. Most of the things I do with
large arrays in my own programs would fit within those restrictions.
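
To make the bsearch() point concrete, here is a small sketch using hypothetical helper names compare_ints and find_int. Note that bsearch() receives the element count and the element size as two separate size_t arguments and is never given their product:

```c
#include <stdlib.h>

/* Three-way comparison for int keys, as bsearch() expects. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Returns a pointer to an element equal to `key` in the sorted
   array, or NULL if there is none. */
const int *find_int(const int *sorted, size_t nmemb, int key)
{
    /* nmemb and the element size are passed separately. */
    return bsearch(&key, sorted, nmemb, sizeof sorted[0], compare_ints);
}
```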

Douglas A. Gwyn

Jan 15, 2007, 3:23:44 AM
"jacob navia" <ja...@jacob.remcomp.fr> wrote in message
news:45aa9363$0$27403$ba4a...@news.orange.fr...

> But this whole discussion is completely useless. It is obvious
> that nobody in the standards committee thought of this weird case.

Actually we assumed that not having any objects larger than the
largest supported object prevented the issue from arising.


Douglas A. Gwyn

Jan 15, 2007, 3:28:48 AM
"Keith Thompson" <ks...@mib.org> wrote in message
news:lnmz4le...@nuthaus.mib.org...

> As far as I'm concerned, an agreement that (a) the intent is to
> disallow objects bigger than SIZE_MAX bytes, but (b) the standard does
> not express this intent should end this whole debate. Probably a DR
> would be appropriate. I object to unfounded assertions that the
> standard is already clear on this point.

First, SIZE_MAX has nothing to do with it -- SIZE_MAX merely
measures a particular type, and the largest supported object might
be much smaller than SIZE_MAX. The key is that size_t is
supposed to be large enough to represent the size of the largest
supported object. We thought that was implied by the specs for
sizeof and size_t, and didn't expect anybody to think that calloc
was meant as a way to produce an object bigger than that, which
would be problematic for a variety of reasons.


Harald van Dijk

Jan 15, 2007, 3:55:42 AM
Douglas A. Gwyn wrote:
> "Keith Thompson" <ks...@mib.org> wrote in message
> news:lnmz4le...@nuthaus.mib.org...
> > As far as I'm concerned, an agreement that (a) the intent is to
> > disallow objects bigger than SIZE_MAX bytes, but (b) the standard does
> > not express this intent should end this whole debate. Probably a DR
> > would be appropriate. I object to unfounded assertions that the
> > standard is already clear on this point.
>
> First, SIZE_MAX has nothing to do with it -- SIZE_MAX merely
> measures a particular type, and the largest supported object might
> be much smaller than SIZE_MAX. The key is that size_t is
> supposed to be large enough to represent the size of the largest
> supported object.

In other words, that SIZE_MAX is large enough that the size of the
largest supported object does not exceed it?

> We thought that was implied by the specs for
> sizeof and size_t, and didn't expect anybody to think that calloc
> was meant as a way to produce an object bigger than that, which
> would be problematic for a variety of reasons.

I don't think anyone expects that calloc is meant as a way of producing
such a large object. The problem is whether the specs actually prohibit
it from being one.

kuy...@wizard.net

Jan 15, 2007, 7:23:37 AM
Douglas A. Gwyn wrote:
...
> be much smaller than SIZE_MAX. The key is that size_t is
> supposed to be large enough to represent the size of the largest
> supported object. We thought that was implied by the specs for
> sizeof and size_t,

It isn't.

kuy...@wizard.net

Jan 15, 2007, 7:38:17 AM
Harald van Dijk wrote:
> Douglas A. Gwyn wrote:
...
> > We thought that was implied by the specs for
> > sizeof and size_t, and didn't expect anybody to think that calloc
> > was meant as a way to produce an object bigger than that, which
> > would be problematic for a variety of reasons.
>
> I don't think anyone expects that calloc is meant as a way of producing
> such a large object. The problem is whether the specs actually prohibit
> it from being one.

Doug is referring to me - as I mentioned in a previous message, I did
expect precisely that. I thought, when I first learned about calloc() a
few decades ago, that it seemed excessive to define a special function
that differed from malloc() only by encapsulating a multiplication,
and a call to memset(); so I assumed it was permitted (but not
required) to support allocations that malloc() couldn't do.

I trust that Doug is correctly expressing the intent of the committee,
even if the standard itself does not, so I was apparently wrong about
that. In which case I fall back on my original judgement: it's a
function with insufficient justification for its presence in the
standard library. However, it does have some justification, so I
wouldn't bother lobbying to have it removed, even if backwards
compatibility weren't an issue.

kuy...@wizard.net

Jan 15, 2007, 7:40:38 AM

You assumed that you had written text which required size_t to be big
enough to store the size of the largest supported object, even if that
object were one to which sizeof could not be directly applied. That
text is not actually present in the standard.

Harald van Dijk

Jan 15, 2007, 12:58:29 PM
kuy...@wizard.net wrote:
> Harald van Dijk wrote:
> > Douglas A. Gwyn wrote:
> ...
> > > We thought that was implied by the specs for
> > > sizeof and size_t, and didn't expect anybody to think that calloc
> > > was meant as a way to produce an object bigger than that, which
> > > would be problematic for a variety of reasons.
> >
> > I don't think anyone expects that calloc is meant as a way of producing
> > such a large object. The problem is whether the specs actually prohibit
> > it from being one.
>
> Doug is referring to me - as I mentioned in a previous message, I did
> expect precisely that. I thought, when I first learned about calloc() a
> few decades ago, that it seemed excessive to define a special function
> that differed from malloc() only by encapsulating a multiplication,
> and a call to memset(); so I assumed it was permitted (but not
> required) to support allocations that malloc() couldn't do.

Sorry, I completely overlooked your message. While I didn't re-read the
start of this thread before commenting on Doug Gwyn's message, I missed
it before, even.

Keith Thompson

Jan 15, 2007, 4:26:19 PM
"Douglas A. Gwyn" <DAG...@null.net> writes:
> "Keith Thompson" <ks...@mib.org> wrote in message
> news:lnmz4le...@nuthaus.mib.org...
>> As far as I'm concerned, an agreement that (a) the intent is to
>> disallow objects bigger than SIZE_MAX bytes, but (b) the standard does
>> not express this intent should end this whole debate. Probably a DR
>> would be appropriate. I object to unfounded assertions that the
>> standard is already clear on this point.
>
> First, SIZE_MAX has nothing to do with it -- SIZE_MAX merely
> measures a particular type, and the largest supported object might
> be much smaller than SIZE_MAX. The key is that size_t is
> supposed to be large enough to represent the size of the largest
> supported object.

SIZE_MAX has everything to do with it. SIZE_MAX is the maximum value
of type size_t. If size_t can represent the size of the largest
supported object, then no object can be bigger than SIZE_MAX bytes.
Certainly an implementation can impose a smaller limit; the question
is whether it can impose a larger one.

In other words, the statements "No object can be larger than SIZE_MAX
bytes" and "size_t can represent the size of the largest supported
object" are equivalent. I'm just using SIZE_MAX because it makes the
idea a bit easier to express.

> We thought that was implied by the specs for
> sizeof and size_t, and didn't expect anybody to think that calloc
> was meant as a way to produce an object bigger than that, which
> would be problematic for a variety of reasons.

I agree that allowing objects whose size cannot be represented in
size_t is problematic. But the specs for sizeof and size_t do not
exclude the possibility of such objects.

My only disagreement is on one narrow point: The standard does not
disallow objects bigger than SIZE_MAX bytes, and an implementation
that conforms to the letter of the standard may support such objects
(and if it does so, it must follow the rules of the standard with
respect to such objects). Some uses of such objects may invoke
undefined behavior, but others do not. I understand that that wasn't
the intent, but it's the only conclusion I can draw from the wording
of the standard itself.

So far, you have neither produced wording from the standard that
supports your assertion, nor acknowledged that there is no such
wording.

Douglas A. Gwyn

Jan 15, 2007, 4:41:25 PM
<kuy...@wizard.net> wrote...

> Douglas A. Gwyn wrote:
> You assumed that you had written text which required size_t to be big
> enough to store the size of the largest supported object, even if that
> object were one to which sizeof could not be directly applied. That
> text is not actually present in the standard.

No, and please quit putting words into my mouth.


Douglas A. Gwyn

Jan 15, 2007, 4:50:16 PM
"Keith Thompson" <ks...@mib.org> wrote in message
news:ln7ivoc...@nuthaus.mib.org...

> In other words, the statements "No object can be larger than SIZE_MAX
> bytes" and "size_t can represent the size of the largest supported
> object" are equivalent.

No, they are not equivalent.

> I'm just using SIZE_MAX because it makes the idea a bit easier to express.

Yes, I understood that. More precisely, an object size larger
than SIZE_MAX is perforce unrepresentable in type size_t
and thus cannot be the result of sizeof applied to that object.

> I agree that allowing objects whose size cannot be represented in
> size_t is problematic. But the specs for sizeof and size_t do not
> exclude the possibility of such objects.

Seems to me that they do, since otherwise the program could
apply sizeof to the supposed object (perhaps via a dereferenced
cast of a dynamically allocated pointer), producing a situation
that a conforming implementation both must support and cannot
support. The contradiction disproves the premise.

> So far, you have neither produced wording from the standard that
> supports your assertion, nor acknowledged that there is no such
> wording.

I've more than once given the argument (repeated above).


Richard Tobin

Jan 15, 2007, 6:29:10 PM
In article <Q5ydnWE2eriIajbY...@comcast.com>,

Douglas A. Gwyn <DAG...@null.net> wrote:

>> In other words, the statements "No object can be larger than SIZE_MAX
>> bytes" and "size_t can represent the size of the largest supported
>> object" are equivalent.

>No, they are not equivalent.

If no object can be larger than SIZE_MAX bytes, then its size must fit in
a size_t, because a size_t can handle values up to SIZE_MAX.

If size_t can represent the size of the largest supported object, then
that size must be no greater than SIZE_MAX, because otherwise it would not
fit in a size_t.

So they are equivalent.

Neither of them implies that you can have an object that big.

>> I agree that allowing objects whose size cannot be represented in
>> size_t is problematic. But the specs for sizeof and size_t do not
>> exclude the possibility of such objects.

>Seems to me that they do, since otherwise the program could
>apply sizeof to the supposed object (perhaps via a dereferenced
>cast of a dynamically allocated pointer), producing a situation
>that a conforming implementation both must support and cannot
>support. The contradiction disproves the premise.

So you're suggesting something like:

void *p = calloc(100000, 100000);
sizeof(*(char[100000][100000] *)p);

and sizeof would be unable to return the right result, so the calloc() must
be in error. Is that right?

If so, you could perfectly well do:

sizeof(char[100000][100000])

without bothering to do the calloc(), and have the same problem, so I
don't see how you deduce that the calloc() is the source of the error.

Keith Thompson

Jan 15, 2007, 6:36:22 PM
"Douglas A. Gwyn" <DAG...@null.net> writes:
> "Keith Thompson" <ks...@mib.org> wrote in message
> news:ln7ivoc...@nuthaus.mib.org...
>> In other words, the statements "No object can be larger than SIZE_MAX
>> bytes" and "size_t can represent the size of the largest supported
>> object" are equivalent.
>
> No, they are not equivalent.

I don't understand; how do they differ? Are you making a distinction
between "object" and "supported object"?

>> I'm just using SIZE_MAX because it makes the idea a bit easier to express.
>
> Yes, I understood that. More precisely, an object size larger
> than SIZE_MAX is perforce unrepresentable in type size_t
> and thus cannot be the result of sizeof applied to that object.

Agreed.

>> I agree that allowing objects whose size cannot be represented in
>> size_t is problematic. But the specs for sizeof and size_t do not
>> exclude the possibility of such objects.
>
> Seems to me that they do, since otherwise the program could
> apply sizeof to the supposed object (perhaps via a dereferenced
> cast of a dynamically allocated pointer), producing a situation
> that a conforming implementation both must support and cannot
> support. The contradiction disproves the premise.

I disagree. I'll repeat some arguments I've made before.

Suppose I execute

void *ptr = calloc(SIZE_MAX/2, 3);

and suppose I get a non-null result. How do I apply "sizeof" to get
the size of the allocated object?

The standard requires unary "-" to yield the negative of its
(promoted) operand. There are values, such as INT_MIN in many
implementations, that do not have a representable negative. There
is no contradiction here; the situation is resolved by 6.5p5:

If an _exceptional condition_ occurs during the evaluation of an
expression (that is, if the result is not mathematically defined
or not in the range of representable values for its type), the
behavior is undefined.

This clearly applies to the unary "-" operator; why would it not apply
equally to the unary "sizeof" operator?
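
The unary "-" case can be sketched as follows. The function name checked_negate is a hypothetical helper invented for this illustration. It tests for the one operand whose negative is not representable before applying the operator, which is the situation 6.5p5 leaves undefined:

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: negate x only when the result is representable.
   Returns false (leaving *result untouched) otherwise. */
bool checked_negate(int x, int *result)
{
    /* Where INT_MIN == -INT_MAX - 1, the value -INT_MIN does not
       fit in an int, so evaluating -x would be undefined. */
    if (x < -INT_MAX) {
        return false;
    }
    *result = -x;
    return true;
}
```

The operand INT_MIN is a perfectly legal int value; it is only the application of "-" to it that has to be avoided.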

[snip]

James Dennett

Jan 15, 2007, 9:40:23 PM

It seemed, to me, to be a reasonable summary of what you
have said in this thread (combined with the sensible
interpretation that the standard text does not, in fact,
contain text prohibiting objects whose size is not
representable as a size_t).

You wrote (presumably referring to the C committee) that

"The key is that size_t is supposed to be large enough to
represent the size of the largest supported object. We
thought that was implied by the specs for sizeof and size_t"
which appears to document your assumption.

It's not clear to which item you are referring with the
accusation that words are being put into your mouth; it
seems that the post to which you replied was closer to
a paraphrasing of what you'd written than to a fabrication.

That aside, your argument that objects larger than SIZE_MAX bytes
are not permitted is a weak one, and a number of holes in
it have been highlighted. I've not seen you respond to
those with specificity. Based on the discussion so far,
I see nothing in the standard to prevent objects larger
than SIZE_MAX bytes being created by calloc, though many
uses of such an object would have undefined behavior.

-- James

kuy...@wizard.net

Jan 16, 2007, 6:32:20 AM

The only words I was "putting in your mouth" are the following: "You
assumed that you had written text which required size_t to be big
enough to store the size of the largest supported object". As far as I
can see, that's an accurate description of the assumption you've been
making. If it is not, I'd appreciate knowing how it falls short of
being accurate.

I am not attributing to you, even as a paraphrase, the remainder of
that sentence: "even if that object were one to which sizeof could not
be directly applied". That was a comment on your response, or lack
thereof, when we pointed out this fact to you. I was not in any way
trying to assert that you understood this fact; quite the contrary. I
am fully aware that you believe that sizeof can somehow be directly
applied to these objects, despite the fact that we've repeatedly shown
that it cannot be.

My final comment, "That text is not actually present in the standard",
was simply a statement of fact; I was not trying in any way to libel
you by implying that you had ever accepted it as being true.

Douglas A. Gwyn

Jan 16, 2007, 1:06:34 PM
Richard Tobin wrote:
> So you're suggesting something like:
> void *p = calloc(100000, 100000);
> sizeof(*(char[100000][100000] *)p);
> and sizeof would be unable to return the right result, so the calloc() must
> be in error. Is that right?

Not exactly. Let's assume there is an if(p!=NULL) inserted;
then for the code to be part of a strictly conforming (s.c.)
program the implementation must support object sizes at least
as large as 10^10 bytes,
otherwise the array construct is erroneous and the behavior
is undefined. (I expect any good compiler to diagnose the
too-large object; the one I usually use issues a fatal
message: "array dimension too big", i.e. the program is not
accepted, which as a non-s.c. program it doesn't have to be.)

> If so, you could perfectly well do:
> sizeof(char[100000][100000])
> without bothering to do the calloc(), and have the same problem, so I
> don't see how you deduce that the calloc() is the source of the error.

Since that is the same kind of undefined behavior if the
attempted object is too large, it would also be u.b.

And of course if the object is not too large this is all
uninteresting.

My actual point was that *if* calloc were to return non-null,
then by the spec for calloc it *has* allocated an object
that *could* be denoted by *(char[100000][100000] *)p, and
in that case sizeof can be (correctly) applied to it. The
contrapositive (which has precisely the same logical truth
value) is that *if* sizeof cannot be (correctly) applied to
that supposed object, then calloc cannot return non-null.

Douglas A. Gwyn

Jan 16, 2007, 1:25:27 PM
Keith Thompson wrote:
> Suppose I execute
> void *ptr = calloc(SIZE_MAX/2, 3);
> and suppose I get a non-null result. How do I apply "sizeof" to get
> the size of the allocated object?

Per the calloc spec, you would be allowed to use
sizeof *(char(*)[SIZE_MAX/2][3])ptr
for example.

You might try the example on your favorite compiler and see
what happens.

> If an _exceptional condition_ occurs during the evaluation of an
> expression (that is, if the result is not mathematically defined
> or not in the range of representable values for its type), the
> behavior is undefined.
> This clearly applies to the unary "-" operator; why would it not apply
> equally to the unary "sizeof" operator?

Because sizeof is specified as returning the size of its
(object type) operand, without exception.

Douglas A. Gwyn

Jan 16, 2007, 1:39:00 PM
"Douglas A. Gwyn" wrote:
> that *could* be denoted by *(char[100000][100000] *)p,

I copied Tobin's construct without looking at it too closely.
Of course it is syntactically incorrect, but that can be
easily corrected, as I did in another nearby posting.

Richard Tobin

Jan 16, 2007, 2:22:12 PM
In article <45AD142A...@null.net>,

>> If so, you could perfectly well do:
>> sizeof(char[100000][100000])
>> without bothering to do the calloc(), and have the same problem, so I
>> don't see how you deduce that the calloc() is the source of the error.

>Since that is the same kind of undefined behavior if the
>attempted object is too large, it would also be u.b.

There is no object here, only a type. sizeof(type) returns a size_t,
and sizeof(char[100000][100000]) can't, so presumably we can deduce
that (for this implementation) char[100000][100000] is not a type.

If that is true, then presumably *(char[100000][100000] *)p is not a
legal cast, so you can't call sizeof on it.

>My actual point was that *if* calloc were to return non-null,
>then by the spec for calloc it *has* allocated an object

>that *could* be denoted by *(char[100000][100000] *)p [...]

It *could* be represented that way, and if it was, then sizeof could
not be applied to it. But how does that show that the object must not
exist, rather than that it must not be represented that way?

Richard Tobin

Jan 16, 2007, 2:23:25 PM
In article <45AD1BC4...@null.net>,

Take it to be so corrected in my followup posting.

Keith Thompson

Jan 16, 2007, 2:56:08 PM
"Douglas A. Gwyn" <DAG...@null.net> writes:
> Keith Thompson wrote:
[...]

>> If an _exceptional condition_ occurs during the evaluation of an
>> expression (that is, if the result is not mathematically defined
>> or not in the range of representable values for its type), the
>> behavior is undefined.
>> This clearly applies to the unary "-" operator; why would it not apply
>> equally to the unary "sizeof" operator?
>
> Because sizeof is specified as returning the size of its
> (object type) operand, without exception.

And unary "-" is specified as returning the negative of its (promoted)
operand, without exception.

If you give unary "-" an operand whose negative cannot be represented
in the appropriate type, it invokes undefined behavior. That doesn't
imply that such an operand cannot legally exist.

If you give "sizeof" an operand whose size cannot be represented in
the appropriate type, it invokes undefined behavior. That doesn't
imply that such an operand cannot legally exist.

I'll ask you again. Why does 6.5p5 apply to unary "-" but not to
"sizeof"?

kuy...@wizard.net

Jan 16, 2007, 3:06:28 PM

Douglas A. Gwyn wrote:
> Keith Thompson wrote:
> > Suppose I execute
> > void *ptr = calloc(SIZE_MAX/2, 3);
> > and suppose I get a non-null result. How do I apply "sizeof" to get
> > the size of the allocated object?
>
> Per the calloc spec, you would be allowed to use
> sizeof *(char(*)[SIZE_MAX/2][3])ptr
> for example.
>
> You might try the example on your favorite compiler and see
> what happens.

But that doesn't retrieve the size of the object pointed at by ptr; it
retrieves the size of the object that ptr might or might not be
pointing at, whether or not it actually exists, and without paying any
attention to the actual value of ptr.

> > If an _exceptional condition_ occurs during the evaluation of an
> > expression (that is, if the result is not mathematically defined
> > or not in the range of representable values for its type), the
> > behavior is undefined.
> > This clearly applies to the unary "-" operator; why would it not apply
> > equally to the unary "sizeof" operator?
>
> Because sizeof is specified as returning the size of its
> (object type) operand, without exception.

Keith has already addressed this point.

Douglas A. Gwyn

Jan 17, 2007, 10:49:37 AM
Richard Tobin wrote:
> There is no object here, only a type. sizeof(type) returns a size_t,
> and sizeof(char[100000][100000]) can't, so presumably we can deduce
> that (for this implementation) char[100000][100000] is not a type.

Structurally, it's a type, and I don't recall anything in the
standard that says otherwise. It happens to be an impossible-
to-realize type (under our working assumptions), and the only
escape I'm aware of is to consider it an erroneous construct.

> >My actual point was that *if* calloc were to return non-null,
> >then by the spec for calloc it *has* allocated an object
> >that *could* be denoted by *(char[100000][100000] *)p [...]
> It *could* be represented that way, and if it was, then sizeof could
> not be applied to it. But how does that show that the object must not
> exist, rather than that it must not be represented that way?

Because if it is an object, sizeof *could* be applied to it,
per the sizeof spec.

kuy...@wizard.net

Jan 17, 2007, 12:38:55 PM

How could it be applied? Please provide an example. The following is
NOT an example:

void *pv = calloc(SIZE_MAX, SIZE_MAX);
char (*pvec)[SIZE_MAX][SIZE_MAX] = (char(*)[SIZE_MAX][SIZE_MAX])pv;
size_t s = sizeof(*pvec);

That was not an example because, as you yourself have just argued, the
second line involves an erroneous construct. That prevents us from
reaching any conclusions about the validity of the first line based
upon the undeniable problems presented by the third line. Please
provide an example which does not involve any such erroneous
constructs, where a successful allocation by calloc() of an object with
a size greater than SIZE_MAX would make it impossible for sizeof to
meet its requirements.

Wojtek Lerch

Jan 17, 2007, 2:36:25 PM
"Douglas A. Gwyn" <DAG...@null.net> wrote in message
news:45AE4591...@null.net...

> Because if it is an object, sizeof *could* be applied to it,
> per the sizeof spec.

No, sizeof is not applied to objects; it's applied to expressions or type
names. If the operand is an lvalue that involves pointers (and does not
involve a VLA), neither the validity nor the value of the sizeof expression
depends on the amount of storage that is actually allocated where the
pointer points to (or whether it actually points to anything or not).
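
A short sketch of this point, using a hypothetical function name size_via_null_pointer. The pointer below points at nothing at all, yet the sizeof expression is valid and its value is fixed by the pointed-to type alone, because the (non-VLA) operand is not evaluated:

```c
#include <stddef.h>

/* Hypothetical demonstration: sizeof depends on the type of its
   operand, not on what storage the pointer actually refers to. */
size_t size_via_null_pointer(void)
{
    char (*p)[10][20] = NULL;   /* points at no object */
    return sizeof *p;           /* operand is not evaluated; p is
                                   never actually dereferenced */
}
```

The result is the same as sizeof(char[10][20]), whatever p happens to hold.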

Douglas A. Gwyn

Jan 17, 2007, 3:34:31 PM
kuy...@wizard.net wrote:
> That was not an example because, as you yourself have just argued, the
> second line involves an erroneous construct. ...

That is one way of analyzing it, but we're talking about an
impossibility/contradiction, so there are numerous alternative
ways to set it up, none of them permissible. Therefore your
demand for a correct coding example is not reasonable. The
example I gave was meant merely to exhibit the contradiction.

It appears that what you are maintaining is that calloc could
successfully allocate something that is an object with a size
that cannot be measured by sizeof. What I am maintaining is
that sizeof has always been intended to work with any validly
constructed object type; there is no hint in the standard to
the contrary, such as a limit macro for the largest supported
object size. (SIZE_MAX is *not* that; it's just a property
of the integer type chosen to represent object sizes.) The
spec for calloc describes an object type like the ones we've
been using in the examples, relevant when calloc reports
success. The infeasibility of such an object type is why I
say that calloc should not be reporting success for such an
invocation. Another way of putting it is that if the
implementation can handle such a large object, then it should
also be using a sufficiently wide size_t to represent its size.

jacob navia

Jan 17, 2007, 4:26:04 PM
Douglas A. Gwyn wrote:

[snip]

> Another way of putting it is that if the
> implementation can handle such a large object, then it should
> also be using a sufficiently wide size_t to represent its size.

YES.

I think this is the crux of the matter. The implementation should change
size_t to hold the size of the object. That way, it could be passed
to realloc/memcpy/memset/and other functions!

It just makes NO SENSE to have an object whose size is bigger than
what size_t can represent.

jacob

Keith Thompson

Jan 17, 2007, 5:37:11 PM

I agree that if an implementation can allocate an object whose size is
N bytes, then size_t *should* be able to represent the value N.
Making size_t bigger if necessary is more sensible than letting
calloc() create objects whose size cannot be represented as a size_t.
And I understand that it was not the intent of the committee to permit
such objects.

But that is *not* the point of the question I asked at the beginning
of this thread.

The point, once again, is that there is no *normative wording* in the
standard that forbids the creation of objects whose size cannot be
represented using size_t. You can shout all you like that having such
an object makes NO SENSE, but that doesn't change the fact that the
standard does not forbid such objects.

Consider the following program:

    #include <stdint.h>   /* SIZE_MAX */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        void *p = calloc(SIZE_MAX/2, 3);
        if (p == NULL) {
            puts("ok, calloc failed");
        }
        else {
            *(int*)p = 42;
            if (*(int*)p == 42) {
                puts("ok, calloc succeeded");
            }
            else {
                puts("oops");
            }
        }
        return 0;
    }

I assert the following:

1. It must print either "ok, calloc failed" or "ok, calloc succeeded".

2. In a conforming implementation, it *may* print "ok, calloc
succeeded". There is no undefined behavior; if it prints "oops",
then the implementation is non-conforming.

I understand it was the committee's *intent* that the program must
always print "ok, calloc failed", but that intent is not expressed in
the normative wording of the standard. In the 80 articles so far in
this thread, I don't believe anyone has demonstrated that there is any
such wording. (Doug Gwyn claims to have done so; I believe I have
refuted his arguments, and in my opinion he has not adequately
answered my refutations.)

This discussion is really more about the wording of the standard than
about whether objects bigger than SIZE_MAX bytes should be allowed.

I will gladly accept the idea that this lack of an explicit statement
is a flaw in the standard.
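
To make the arithmetic behind the program above concrete, here is a minimal sketch of what happens if the two arguments are simply multiplied in size_t arithmetic. The helper name wrapped_product is hypothetical, and the sketch assumes size_t is at least as wide as unsigned int, so that the multiplication is carried out modulo SIZE_MAX+1.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the product of nmemb and size as computed in
   size_t arithmetic, i.e. reduced modulo SIZE_MAX+1. */
size_t wrapped_product(size_t nmemb, size_t size)
{
    return nmemb * size;
}
```

For the arguments SIZE_MAX/2 and 3, the mathematical product exceeds SIZE_MAX and the wrapped value is SIZE_MAX/2 - 2, so a calloc() that merely handed the wrapped product to malloc() would provide only about a third of the storage requested.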

kuy...@wizard.net

Jan 17, 2007, 5:42:23 PM
Douglas A. Gwyn wrote:
> kuy...@wizard.net wrote:
> > That was not an example because, as you yourself have just argued, the
> > second line involves an erroneous construct. ...
>
> That is one way of analyzing it, but we're talking about an
> impossibility/contradiction, so there are numerous alternative
> ways to set it up, none of them permissible. Therefore your
> demand for a correct coding example is not reasonable. The
> example I gave was meant merely to exhibit the contradiction.

We're in agreement that there's a contradiction. You conclude from the
contradiction that one particular way of avoiding the contradiction is
supposed to be inferred. I see the same contradiction, and notice that
there are at least two or three different ways of avoiding it, and
conclude from the contradiction that we cannot infer reliably which of
them applies. Even if the method of avoiding the problem were unique,
it would be inappropriate to specify that kind of thing implicitly.
It's too fragile a method; a tiny change in the wording of one clause
could remove either the uniqueness of the solution, or its very
existence. Such restrictions should be stated explicitly.

> ... What I am maintaining is
> that sizeof has always been intended to work with any validly
> constructed object type; ...

As usual, you pay too much attention to the intent, and not enough
attention to whether or not that intent has been clearly and correctly
communicated.

> ... there is no hint in the standard to
> the contrary, ...

To me, the fact that calloc() exists, and takes two size_t arguments,
seemed to be precisely such a hint. I'm willing to take your word for
it that it was not so intended, but I can't really see much point in
having calloc() in the standard if the only difference between
calloc(m,n) and malloc(m*n) is the equivalent of a call to memset().

> ... Another way of putting it is that if the
> implementation can handle such a large object, then it should
> also be using a sufficiently wide size_t to represent its size.

It would be nice if the standard actually said that.

Keith Thompson

Jan 17, 2007, 5:44:08 PM
"Douglas A. Gwyn" <DAG...@null.net> writes:
> kuy...@wizard.net wrote:
> > That was not an example because, as you yourself have just argued, the
> > second line involves an erroneous construct. ...
>
> That is one way of analyzing it, but we're talking about an
> impossibility/contradiction, so there are numerous alternative
> ways to set it up, none of them permissible. Therefore your
> demand for a correct coding example is not reasonable. The
> example I gave was meant merely to exhibit the contradiction.

In my opinion, there is no contradiction, and you have not
demonstrated that there is.

> It appears that what you are maintaining is that calloc could
> successfully allocate something that is an object with a size
> that cannot be measured by sizeof. What I am maintaining is
> that sizeof has always been intended to work with any validly
> constructed object type; there is no hint in the standard to
> the contrary, such as a limit macro for the largest supported
> object size. (SIZE_MAX is *not* that; it's just a property
> of the integer type chosen to represent object sizes.)

I've discussed this in terms of whether an object's size may exceed
SIZE_MAX. You've asserted that this is not equivalent to the question
of whether size_t can represent the size of any possible object, but
you haven't explained how they're not equivalent.

If your assertion is correct, then SIZE_MAX is not the upper bound on
the size of an object, but it is *an* upper bound.

> The
> spec for calloc describes an object type like the ones we've
> been using in the examples, relevant when calloc reports
> success. The infeasibility of such an object type is why I
> say that calloc should not be reporting success for such an
> invocation. Another way of putting it is that if the
> implementation can handle such a large object, then it should
> also be using a sufficiently wide size_t to represent its size.

I agree that it *should*. I do not agree that the standard states
that it *must*. If, as you say, the (unstated) intent is that size_t
must be able to represent the size of any object, then the failure of
the standard to express that intent is a flaw in the standard, one
that should probably be addressed by a DR and by revised wording in the
(hypothetical) next edition of the standard.

Keith Thompson

Jan 17, 2007, 6:03:53 PM
kuy...@wizard.net writes:
> Douglas A. Gwyn wrote:
[...]

> > ... there is no hint in the standard to
> > the contrary, ...
>
> To me, the fact that calloc() exists, and takes two size_t arguments,
> seemed to be precisely such a hint. I'm willing to take your word for
> it that it was not so intended, but I can't really see much point in
> having calloc() in the standard if the only difference between
> calloc(m,n) and malloc(m*n) is the equivalent of a call to memset().
[...]

There's another significant difference. If the result of m*n wraps
around to a value smaller than the mathematical product, then calloc()
must detect this and return a null pointer (or possibly allocate a
huge object if that's permitted). In malloc(m*n), the multiplication
takes place in user code, and malloc() never sees the values of m and n,
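
[A minimal sketch of the detection described here, assuming a calloc-like function layered on malloc(). The name checked_calloc is hypothetical, and this is one common way to write the check, not a quotation from any particular library.]

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical calloc-like wrapper: returns a null pointer when the
   mathematical product nmemb * size cannot be represented in size_t,
   instead of letting the multiplication wrap. */
void *checked_calloc(size_t nmemb, size_t size)
{
    void *p;

    /* For size != 0: nmemb * size > SIZE_MAX  <=>  nmemb > SIZE_MAX / size */
    if (size != 0 && nmemb > SIZE_MAX / size)
        return NULL;

    p = malloc(nmemb * size);       /* the product cannot wrap here */
    if (p != NULL)
        memset(p, 0, nmemb * size); /* all bits zero, as calloc() requires */
    return p;
}
```

[The division-based test avoids computing the product before it is known to fit. With this wrapper, checked_calloc(SIZE_MAX/2, 3) returns a null pointer, while a small request such as checked_calloc(4, 4) yields 16 zeroed bytes if malloc() succeeds.]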
only their (possibly wrapped) product.

Jun Woong

Jan 17, 2007, 7:52:17 PM

jacob navia wrote:

> Douglas A. Gwyn wrote:
>
> [snip]
>
> Another way of putting it is that if the
> > implementation can handle such a large object, then it should
> > also be using a sufficiently wide size_t to represent its size.
>
> YES.
>
> I think this is the crux of the matter.

What matter? The crux of this discussion is that there is nothing in
the standard to explicitly (or even implicitly depending on the
standpoint) say it is not allowed to *have* an object or object type
(e.g., a pointer to a huge array) whose size cannot be represented with
size_t. Whether such a huge object should be allowed or not doesn't
matter, at least in this discussion. And most (including me) seem to
want it not to be allowed because of the nuisances it brings.


--
Jun, Woong (woong at icu.ac.kr)
Samsung Electronics Co., Ltd.

``All opinions expressed are mine, and do not represent
the official opinions of any organization.''

lawrenc...@ugs.com

Jan 17, 2007, 11:26:21 PM
Keith Thompson <ks...@mib.org> wrote:
>
> A decision needs to be made one way or the other, and the standard
> needs to be augmented to clearly state it.

Why? What practical impact would it have? I understand the desire for
theoretical completeness and consistency, and sympathize with it, but
the Standard is intended first and foremost to solve practical problems,
not to be an academic exercise.

-Larry Jones

I obey the letter of the law, if not the spirit. -- Calvin

Keith Thompson

Jan 18, 2007, 12:39:29 AM
lawrenc...@ugs.com writes:
> Keith Thompson <ks...@mib.org> wrote:
> >
> > A decision needs to be made one way or the other, and the standard
> > needs to be augmented to clearly state it.
>
> Why? What practical impact would it have? I understand the desire for
> theoretical completeness and consistency, and sympathize with it, but
> the Standard is intended first and foremost to solve practical problems,
> not to be an academic exercise.

Replace "needs to be" with "should be, in my opinion" in the above.

The practical impact is that, if it's made clear that size_t can
represent the size of any object, programmers can safely depend on
that guarantee. At least one person in this thread assumed, on
encountering calloc(), that it's intended to be capable of allocating
objects larger than can be allocated by malloc(). If I want to write
code that deals with generalized objects, is size_t enough, or do I
need to use some bigger type for full generality? size_t is
*supposed* to be able to represent object sizes; the fact that the
standard doesn't actually say so is a problem.

In any case, completeness and consistency are good things. The current
ambiguity of the standard has, in this case, led to a lengthy (and
perhaps somewhat wasteful) argument about what it really says. (And
yes, I started it.)

Douglas A. Gwyn

Jan 18, 2007, 2:06:12 AM
<kuy...@wizard.net> wrote in message
news:1169073741....@l53g2000cwa.googlegroups.com...
> ... I can't really see much point in
> having calloc() in the standard if the only difference between
> calloc(m,n) and malloc(m*n) is the equivalent of a call to memset().

calloc was part of the legacy (Unix) base library, which is the
main reason it was standardized (and has the interface it has).
You'd have to find the original inventor of calloc to hear why
he thought it should have two arguments. Certainly, early
implementations of Unix had no way for calloc to allocate
anything whose size could not be represented in size_t (the
type used for sizeof), and as we have discussed that wouldn't
be very useful anyway.

It may be somewhat interesting to note that very early on,
before unsigned integer types were sufficiently well supported,
the result of sizeof was (accidentally) signed; thus on the PDP-11
(which has a 16-bit address space) it was possible to get into
trouble by designating an object greater than 2^15-1 bytes.
(You didn't need to use calloc to do that, however.) This was
properly considered a bug and was soon fixed in the compilers.


Douglas A. Gwyn

Jan 18, 2007, 2:12:31 AM
"jacob navia" <ja...@jacob.remcomp.fr> wrote in message
news:45ae946d$0$27415$ba4a...@news.orange.fr...

> It just makes NO SENSE to have an object whose size is bigger than
> what size_t can represent.

Or rather, it wouldn't be very useful.

Some might argue that there is no need to prevent calloc
from creating an oversized object, because the program
wouldn't have any legitimate way to access it beyond
the initial SIZE_MAX bytes. However, I think that
argument is not very good, because then it would be
easy for a program to accidentally specify too much to
calloc and if the allocation was reported successful,
the program would then go on to try to use the storage
in ways that would malfunction.


Douglas A. Gwyn

Jan 18, 2007, 2:16:30 AM
"Keith Thompson" <ks...@mib.org> wrote...

> "Douglas A. Gwyn" <DAG...@null.net> writes:
>> "Keith Thompson" <ks...@mib.org> wrote...

>>> In other words, the statements "No object can be larger than SIZE_MAX
>>> bytes" and "size_t can represent the size of the largest supported
>>> object" are equivalent.
>> No, they are not equivalent.
> I don't understand; how do they differ? Are you making a distinction
> between "object" and "supported object"?

No, I'm making a distinction between the range of numbers that
can be encoded in an integer type and the largest object that can
exist in an implementation. While for some architectures those
may be identical, in most cases those limits have different values.


Douglas A. Gwyn

Jan 18, 2007, 2:19:39 AM
"Douglas A. Gwyn" <DAG...@null.net> wrote...

> You'd have to find the original inventor of calloc to hear why
> he thought it should have two arguments. ...

> It may be somewhat interesting to note that very early on,
> before unsigned integer types were sufficiently well supported,
> the result of sizeof was (accidentally) signed; ...

It occurs to me that perhaps calloc was given two arguments
back during that time when there wasn't a good way to
specify a mallocation greater than half the address space.


Keith Thompson

Jan 18, 2007, 2:26:10 AM
"Douglas A. Gwyn" <DAG...@null.net> writes:

There are some ways a program *could* access the entire object. For
example, array indices may be of any integer type, including types
bigger than size_t.
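
As a small illustration of that point, the sketch below walks a byte array with an index of type uintmax_t, which may be wider than size_t. The function name sum_bytes is hypothetical. The sketch only shows that the language accepts such a subscript; whether the object being indexed may legitimately extend past SIZE_MAX bytes is the open question of this thread.

```c
#include <stdint.h>

/* Sums n bytes starting at p, using an index of type uintmax_t, which
   may be wider than size_t. */
uintmax_t sum_bytes(const unsigned char *p, uintmax_t n)
{
    uintmax_t total = 0;
    uintmax_t i;

    for (i = 0; i < n; i++)
        total += p[i];   /* the subscript may have any integer type */
    return total;
}
```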

Keith Thompson

Jan 18, 2007, 3:13:57 AM

Of course. I've never said otherwise.

Let's consider a concrete example. Suppose for a given implementation
size_t is 32 bits, SIZE_MAX == 2**32-1, and the largest supported
object is, let's say, 2**24 bytes.

For such an implementation, "size_t can represent the size of the
largest supported object".

It's also true that "No object can be larger than SIZE_MAX bytes".
(And it's also true that no object can be larger than 2**24 bytes.)

This does *not* imply that any object can be as large as SIZE_MAX
bytes. I think you've been assuming that I meant to imply that.
I did not. Please read more carefully.
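
The concrete example can be sketched in code, assuming a size_t of at least 32 bits. Both MAX_OBJECT_BYTES and limited_malloc are hypothetical names invented for this illustration; the standard defines no macro for the largest supported object.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical implementation limit: the largest object this allocator
   will create (2**24 bytes). It is a separate quantity from SIZE_MAX,
   the largest value that size_t can hold. */
#define MAX_OBJECT_BYTES ((size_t)1 << 24)

/* Hypothetical malloc-like wrapper that enforces the limit. */
void *limited_malloc(size_t n)
{
    if (n > MAX_OBJECT_BYTES)
        return NULL;   /* representable in size_t, but not supported */
    return malloc(n);
}
```

Here every object is at most MAX_OBJECT_BYTES bytes, hence also at most SIZE_MAX bytes, yet most values of size_t do not correspond to any object that can be created.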

Jean-Marc Bourguet

Jan 18, 2007, 4:10:41 AM
jacob navia <ja...@jacob.remcomp.fr> writes:

> It just makes NO SENSE to have an object whose size is bigger than
> what size_t can represent.

I think this is just the reverse: it makes no sense to choose a size_t
which can not hold the size of all possible objects.

Yours,

--
Jean-Marc

kuy...@wizard.net

Jan 18, 2007, 6:48:36 AM
Keith Thompson wrote:
> kuy...@wizard.net writes:
> > Douglas A. Gwyn wrote:
> [...]
> > > ... there is no hint in the standard to
> > > the contrary, ...
> >
> > To me, the fact that calloc() exists, and takes two size_t arguments,
> > seemed to be precisely such a hint. I'm willing to take your word for
> > it that it was not so intended, but I can't really see much point in
> > having calloc() in the standard if the only difference between
> > calloc(m,n) and malloc(m*n) is the equivalent of a call to memset().
> [...]
>
> There's another significant difference. If the result of m*n wraps
> around to a value smaller than the mathematical product, then calloc()
> must detect this and return a null pointer (or possibly allocate a
> huge object if that's permitted). In malloc(m*n), the multiplication
> takes place in user code, and malloc() never sees the values of m and n,
> only their (possibly wrapped) product.

Granted - I should have said "biggest difference" rather than "only
difference"; I consider that a fairly minor additional advantage. It's
also something that, as recent discussions have shown, some people
didn't even expect calloc() to do, or at least didn't believe that
calloc() was required to do it. Some have reported implementations that
don't do it, and others have provided example code that was intended to
perform the check, which turned out to be defective. It doesn't sound
like something to rely on.
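
To show how easily such a check can go wrong, here is a hedged sketch contrasting a plausible-looking but flawed test with a division-based one. Neither function is the code that was posted in those discussions; the names naive_overflows and really_overflows are hypothetical, and size_t is assumed to be at least as wide as unsigned int.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, deliberately flawed test: it assumes that a wrapped
   product is always smaller than one of the operands. */
int naive_overflows(size_t m, size_t n)
{
    size_t p;

    if (m == 0 || n == 0)
        return 0;
    p = m * n;               /* may already have wrapped */
    return p < m || p < n;
}

/* Division-based test: nonzero exactly when the mathematical product
   m * n exceeds SIZE_MAX. */
int really_overflows(size_t m, size_t n)
{
    return n != 0 && m > SIZE_MAX / n;
}
```

With m = 3 and n = SIZE_MAX/2 + 2, the mathematical product exceeds SIZE_MAX, but the wrapped product SIZE_MAX/2 + 4 is larger than both operands, so the naive test reports no overflow while the division-based test catches it.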

David R Tribble

Jan 19, 2007, 12:21:24 PM
Keith Thompson wrote:
> I agree that if an implementation can allocate an object whose size is
> N bytes, then size_t *should* be able to represent the value N.
> Making size_t bigger if necessary is more sensible than letting
> calloc() create objects whose size cannot be represented as a size_t.
> And I understand that it was not the intent of the committee to permit
> such objects.
>
> But that is *not* the point of the question I asked at the beginning
> of this thread.
>
> The point, once again, is that there is no *normative wording* in the
> standard that forbids the creation of objects whose size cannot be
> represented using size_t. You can shout all you like that having such
> an object makes NO SENSE, but that doesn't change the fact that the
> standard does not forbid such objects.

That may be so, but it looks like the standard does not allow calloc()
to create such an object.

7.20.3.1 The calloc function
Synopsis
#include <stdlib.h>
void *calloc(size_t nmemb, size_t size);
Description
The calloc function allocates space for an array of 'nmemb'
objects, each of whose size is 'size'. The space is initialized
to all bits zero.

So calloc() is defined as allocating an array of objects. An array
type is also an object type [6.2.5p20]. An object's size must be
representable as a size_t value. Thus calloc() allocates an object
whose size is representable as a size_t value. Q.E.D.

Or did I miss something? I think it boils down to whether or not
"array of objects" is an object itself. If so, then it must have a
size representable as a size_t value.

-drt

Richard Tobin

Jan 19, 2007, 12:29:54 PM
In article <1169227284....@38g2000cwa.googlegroups.com>,

David R Tribble <da...@tribble.com> wrote:

>An object's size must be
>representable as a size_t value.

Isn't that what is at issue? It's not stated explicitly in the standard.

David R Tribble

Jan 19, 2007, 12:26:36 PM
Kuyper wrote:
>> ... I can't really see much point in
>> having calloc() in the standard if the only difference between
>> calloc(m,n) and malloc(m*n) is the equivalent of a call to memset().
>

Douglas A. Gwyn wrote:
> calloc was part of the legacy (Unix) base library, which is the
> main reason it was standardized (and has the interface it has).

The fact that it also initializes (zeroes) the allocated memory is
significant, since this could be done with specialized O/S
support or CPU instructions more efficiently than just calling
memset().

However, "existing practice" alone was sufficient to put it into C89.

-drt

David R Tribble

Jan 19, 2007, 12:32:48 PM
Douglas A. Gwyn wrote:
> It may be somewhat interesting to note that very early on,
> before unsigned integer types were sufficiently well supported,
> the result of sizeof was (accidentally) signed; thus on the PDP-11
> (which has a 16-bit address space) it was possible to get into
> trouble by designating an object greater than 2^15-1 bytes.
> (You didn't need to use calloc to do that, however.) This was
> properly considered a bug and was soon fixed in the compilers.

IIRC, Harbison & Steele mentioned this in their book about C
implementations. Having a signed size_t type the same width as
array index types (i.e., 'signed int') has the curious effect of
preventing one from allocating 'char' arrays larger than INT_MAX,
but allowing one to allocate arrays larger than this of any other
type wider than 'char'.

Changing the signedness of type 'size_t' was half the fix needed.
The other half was ensuring that array subscript expressions were
treated as unsigned.

-drt

David R Tribble

Jan 19, 2007, 12:45:55 PM
Jacob Navia wrote:
>> It just makes NO SENSE to have an object whose size is bigger than
>> what size_t can represent.
>

Douglas A. Gwyn wrote:
> Or rather, it wouldn't be very useful.
>
> Some might argue that there is no need to prevent calloc
> from creating an oversized object, because the program
> wouldn't have any legitimate way to access it beyond
> the initial SIZE_MAX bytes.

Not so. I gave code in an earlier post [2007-01-09]
http://groups.google.com/group/comp.std.c/msg/332c389974fd1d3d
showing how it could be done, provided that pointers are wide
enough to allow addressing such large objects (e.g., on a system
having 32-bit size_t and 64-bit pointers).

Here it is again:

    #include <limits.h>   // UINT_MAX
    #include <stdlib.h>   // calloc

    // Assume unsigned is 32 bits wide
    // Assume size_t is 32 bits wide
    // Assume char * is 64 bits wide

    void big_alloc_test(void)
    {
        char * obj;
        char * ptr[10];

        // Allocate a large object > 4GB
        obj = calloc(10, UINT_MAX); // 10 x 4GB
        if (obj == NULL)
            return; // allocation refused; nothing to test

        // Assign pointers into the large object
        ptr[0] = obj;
        for (int i = 1; i < 10; i++)
            ptr[i] = ptr[i-1] + UINT_MAX;

        // Use the pointers into the object
        // ... use ptr[i][0 ... UINT_MAX-1] ...
    }

I believe this is conforming code.

Such an implementation would most likely have a 64-bit
ptrdiff_t type.
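
For experimenting on an ordinary system, the same pointer-stepping pattern can be tried at a reduced scale. This is only a sketch: CHUNK is a hypothetical stand-in for UINT_MAX, chosen small enough that the total size fits comfortably in size_t, so it shows the access pattern and says nothing about whether the oversized case is conforming.

```c
#include <stddef.h>
#include <stdlib.h>

#define NCHUNKS 10               /* number of chunks stepped through */
#define CHUNK   ((size_t)1024)   /* hypothetical stand-in for UINT_MAX */

/* Allocates NCHUNKS * CHUNK zeroed bytes, steps a pointer through the
   block one chunk at a time, and writes to the last byte of each chunk.
   Returns the number of chunks written, or 0 if calloc() fails. */
int chunked_access_demo(void)
{
    char *obj = calloc(NCHUNKS, CHUNK);
    char *ptr[NCHUNKS];
    int i, written = 0;

    if (obj == NULL)
        return 0;

    ptr[0] = obj;
    for (i = 1; i < NCHUNKS; i++)
        ptr[i] = ptr[i - 1] + CHUNK;   /* stays inside the allocation */

    for (i = 0; i < NCHUNKS; i++) {
        ptr[i][CHUNK - 1] = 'x';       /* last byte of chunk i */
        written++;
    }

    free(obj);
    return written;
}
```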

-drt
