detecting size_t overflow

m n

Aug 29, 2002, 10:41:28 AM

Dear c.l.c,

I'm concerned about the possibility of malloc's parameter silently
overflowing, leading to undefined behavior if the program thinks the
allocation succeeded as it expected.

My current method of detecting potential size_t overflow is something like:

/* Concatenate file lines pointed to by foo, bar, and baz */

#define SIZE_T_MAX (size_t)-1

if(SIZE_T_MAX - strlen(foo) - strlen(bar) - strlen(baz) <= 1)
{
    /* Handle overflow error */
}
else
{
    if( (ptr = malloc(strlen(foo) + strlen(bar) + strlen(baz) + 1)) == NULL)
    {
        /* Handle error */
    }
    /* Perform concatenation */
}

My reasoning is that a potential exists for a size_t overflow
if a bunch of size_t's are added together. Thus, by subtracting
each of them from SIZE_T_MAX, I can verify that the result leaves
room for at least one character (a '\0').

Any comments on my methodology/logic? Does anyone have other, perhaps
more elegant/robust/guaranteed methods?

Thanks,
Matt

Darkling¹

Aug 29, 2002, 10:54:29 AM

"m n"

>
> My reasoning is that a potential exists for a size_t overflow
> if a bunch of size_t's are added together.

size_t can not overflow as size_t is an unsigned type (instead it is reduced
modulo SIZE_MAX + 1).

>Thus, by subtracting each of them from SIZE_T_MAX, I can verify that the result
>leaves room for at least one character (a '\0').

Your method is not guaranteed to work. What if strlen(foo) > SIZE_T_MAX / 2 and
strlen(bar) > SIZE_T_MAX /2?

> Any comments on my methodology/logic? Does anyone have other, perhaps
> more elegant/robust/guaranteed methods?

You can use something like the following to detect a "wrap-around" of values
when adding two variables a and b of type size_t.

size_t sum = a + b;
if (sum < a) {
    /* value "wrapped-around" */
}

Darkling¹

Aug 29, 2002, 11:10:23 AM

"Darkling¹"

> >
> >Thus, by subtracting each of them from SIZE_T_MAX, I can verify that the
> >result leaves room for at least one character (a '\0').

In fact, your method is guaranteed not to work in most cases; here a
"wrap-around" is only detected if the sum of the size_t values reduced modulo
SIZE_MAX+1 takes on a value of either 0 or 1.

Dan Pop

Aug 29, 2002, 11:38:16 AM

In <eb4acddc.02082...@posting.google.com> iced_p...@yahoo.com (m n) writes:

>I'm concerned about the possibility of malloc's parameter silently
>overflowing, leading to undefined behavior if the program thinks the
>allocation succeeded as it expected.
>
>My current method of detecting potential size_t overflow is something like:
>
>/* Concatenate file lines pointed to by foo, bar, and baz */
>
>#define SIZE_T_MAX (size_t)-1
>
>if(SIZE_T_MAX - strlen(foo) - strlen(bar) - strlen(baz) <= 1)
>{
> /* Handle overflow error */
>}

The logic behind this test (if any) is hopelessly broken. There is no
good reason to expect this expression to produce 0 or 1 in case of
overflow, any result is possible.

>else
>{
> if( (ptr = malloc(strlen(foo) + strlen(bar) + strlen(baz) + 1)) == NULL)
> {
> /* Handle error */
> }
> /* Perform concatenation */
>}
>
>My reasoning is that a potential exists for a size_t overflow
>if a bunch of size_t's are added together. Thus, by subtracting
>each of them from SIZE_T_MAX, I can verify that the result leaves
>room for at least one character (a '\0').

Nope, you can't. If overflow occurs, the result can be anything.

>Any comments on my methodology/logic? Does anyone have other, perhaps
>more elegant/robust/guaranteed methods?

Since size_t arithmetic is unsigned arithmetic, the behaviour in case
of overflow is well defined. If the result of adding two size_t values
is smaller than either of them, you know that overflow occurred. This
cannot be extended to more than two values, so multiple checks are needed
if you have to add more than two values. In your case, you have to do
something like:

size_t temp = strlen(foo) + strlen(bar);
if (temp < strlen(foo) || temp + strlen(baz) < temp) /* overflow */;
if (temp + strlen(baz) == SIZE_T_MAX) /* no room for the null char */;

In practice, unless you're on a brain dead platform, size_t is large
enough to cover the whole address space available to your application,
so you have a big problem if you detect overflow.
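
As a rough sketch of the pairwise check described above (not code from this post), the three additions and the extra byte for the null character can be folded into one helper. The names checked_add and concat_size are hypothetical, and the sketch assumes size_t is not narrower than int, so that the additions really are carried out in unsigned arithmetic.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: stores a + b in *out and returns 0,
   or returns 1 (leaving *out untouched) if the sum wrapped around. */
static int checked_add(size_t a, size_t b, size_t *out)
{
    size_t sum = a + b;

    if (sum < a)        /* the sum wrapped: it is smaller than an operand */
        return 1;
    *out = sum;
    return 0;
}

/* Hypothetical helper: computes the number of bytes needed to hold the
   concatenation of foo, bar and baz plus the terminating '\0'.
   Returns 0 on success, 1 if any intermediate sum wrapped. */
static int concat_size(const char *foo, const char *bar, const char *baz,
                       size_t *total)
{
    size_t n;

    if (checked_add(strlen(foo), strlen(bar), &n)) return 1;
    if (checked_add(n, strlen(baz), &n)) return 1;
    if (checked_add(n, 1, &n)) return 1;   /* room for the null char */
    *total = n;
    return 0;
}
```

Each addition gets its own check, which is the point made above: one comparison after a three-way sum cannot tell whether an earlier step wrapped.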

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Dan...@ifh.de

Jeremy Yallop

Aug 29, 2002, 11:42:41 AM

* m n

| I'm concerned about the possibility of malloc's parameter silently
| overflowing, leading to undefined behavior if the program thinks the
| allocation succeeded as it expected.

This is a legitimate concern. This very problem (well, with realloc()
instead of malloc()) was discovered recently in the main
implementation of a popular scripting language; it caused a
segmentation fault when it showed up. By the way, "overflow" isn't
the correct way to describe the "wrap-around" (reduction modulo
FOO_MAX + 1) that takes place with unsigned integer arithmetic.

| My current method of detecting potential size_t overflow is something like:
|
| /* Concatenate file lines pointed to by foo, bar, and baz */
|
| #define SIZE_T_MAX (size_t)-1

C99 calls this SIZE_MAX, FWIW.

| if(SIZE_T_MAX - strlen(foo) - strlen(bar) - strlen(baz) <= 1)
| {
| /* Handle overflow error */
| }
| else
| {
| if( (ptr = malloc(strlen(foo) + strlen(bar) + strlen(baz) + 1)) == NULL)
| {
| /* Handle error */
| }
| /* Perform concatenation */
| }
|
| My reasoning is that a potential exists for a size_t overflow
| if a bunch of size_t's are added together. Thus, by subtracting
| each of them from SIZE_T_MAX, I can verify that the result leaves
| room for at least one character (a '\0').
|
| Any comments on my methodology/logic?

Just one small thing: it's completely wrong. On most implementations
the arithmetic will be done in size_t, an unsigned type, yielding an
unsigned result. This means that

SIZE_T_MAX - strlen(foo) - strlen(bar) - strlen(baz) <= 1

will only be true if the left side is equal to zero or one.
Suppose that each of the strings has length SIZE_T_MAX/2 + 1. Then

SIZE_T_MAX - strlen(foo) - strlen(bar) - strlen(baz) == SIZE_T_MAX/2

so you'll pass about SIZE_T_MAX/2 to malloc and try to write about
3 * SIZE_T_MAX/2 bytes into the allocated space.
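
The arithmetic can be checked with a small sketch. The function names op_test_fires and wrapped_request are made up for illustration, and the sketch assumes size_t is not narrower than int. It reproduces the original test and the size that would actually reach malloc.

```c
#include <stddef.h>

/* The test from the original post: returns 1 if it reports "overflow". */
int op_test_fires(size_t a, size_t b, size_t c)
{
    const size_t max = (size_t)-1;

    return max - a - b - c <= 1;
}

/* The value the original code would hand to malloc.  The sum is reduced
   modulo (size_t)-1 + 1, so it can be far smaller than the true total. */
size_t wrapped_request(size_t a, size_t b, size_t c)
{
    return a + b + c + 1;
}
```

With all three lengths equal to SIZE_T_MAX/2 + 1, op_test_fires returns 0 and wrapped_request returns SIZE_T_MAX/2 + 2, so the test stays silent while the request is much smaller than the data to be copied.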

In addition, it's probably more efficient to only call strlen() once
for each string (although an optimizing compiler may fix this for
you).

Jeremy.

Dan Pop

Aug 29, 2002, 12:46:54 PM

In <aklfdh$1k9r3r$1...@ID-114079.news.dfncis.de> Jeremy Yallop <jer...@jdyallop.freeserve.co.uk> writes:

>* m n
>| I'm concerned about the possibility of malloc's parameter silently
>| overflowing, leading to undefined behavior if the program thinks the
>| allocation succeeded as it expected.
>
>This is a legitimate concern. This very problem (well, with realloc()
>instead of malloc()) was discovered recently in the main
>implementation of a popular scripting language; it caused a
>segmentation fault when it showed up. By the way, "overflow" isn't
>the correct way to describe the "wrap-around" (reduction modulo
>FOO_MAX + 1) that takes place with unsigned integer arithmetic.

In the usual computing jargon, a result that is out of the range of values
representable in a given type causes overflow. Even if the behaviour
is well defined for unsigned arithmetic, it is still overflow, despite
the C standard suggesting otherwise.

Dan Pop

Aug 29, 2002, 12:50:48 PM

In <aklfdh$1k9r3r$1...@ID-114079.news.dfncis.de> Jeremy Yallop <jer...@jdyallop.freeserve.co.uk> writes:

>In addition, it's probably more efficient to only call strlen() once
>for each string (although an optimizing compiler may fix this for
>you).

Highly unlikely, because C has no concept of "pure" functions, i.e.
functions with no side effects. It is still theoretically possible in
the case of strlen, because, being a standard library function, the
compiler is allowed to make additional assumptions about its behaviour.
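
Whatever a given compiler does, storing each length once sidesteps the question. The following is only a sketch: concat3 is a hypothetical name, and the checks assume size_t is not narrower than int. Each length is measured once, checked against the room left below (size_t)-1 before anything is added, and then reused for the copy.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated concatenation of
   foo, bar and baz, or NULL if the total size cannot be represented
   in a size_t or if malloc fails.  The caller frees the result. */
char *concat3(const char *foo, const char *bar, const char *baz)
{
    const size_t max = (size_t)-1;
    const size_t lf = strlen(foo);   /* each strlen is called only once */
    const size_t lb = strlen(bar);
    const size_t lz = strlen(baz);
    char *p;

    /* Ask "is there room left?" before adding, so no sum ever wraps. */
    if (lb > max - lf) return NULL;
    if (lz > max - lf - lb) return NULL;
    if (max - lf - lb - lz < 1) return NULL;   /* no room for '\0' */

    p = malloc(lf + lb + lz + 1);
    if (p == NULL) return NULL;

    memcpy(p, foo, lf);
    memcpy(p + lf, bar, lb);
    memcpy(p + lf + lb, baz, lz + 1);          /* also copies the '\0' */
    return p;
}
```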

Jeremy Yallop

Aug 29, 2002, 2:22:46 PM

* Dan Pop

| In the usual computing jargon, a result that is out of the range of values
| representable in a given type causes overflow. Even if the behaviour
| is well defined for unsigned arithmetic, it is still overflow, despite
| the C standard suggesting otherwise.

I understand your point, but as the OP was clearly confused about
signed and unsigned arithmetic I thought it was important to use the
correct terminology. In the context of C, and with C's definitions,
unsigned arithmetic cannot overflow.

AFAICS, the standard (which, as you know, states that a "computation
involving unsigned operands can never overflow") is following K&R (who
state that the "handling of overflow [...] is not defined by the
language") in the use of the term.

Jeremy.

Jeremy Yallop

Aug 29, 2002, 2:35:36 PM

* Dan Pop

I wrote "may" because I know of at least one implementation that
performs precisely this optimization (based on special knowledge of
strlen, as you suggest). I wouldn't say it's "highly unlikely".

Jeremy.

Micah Cowan

Aug 29, 2002, 6:05:39 PM

iced_p...@yahoo.com (m n) writes:

> Dear c.l.c,
>
> I'm concerned about the possibility of malloc's parameter silently
> overflowing, leading to undefined behavior if the program thinks the
> allocation succeeded as it expected.
>
> My current method of detecting potential size_t overflow is something like:
>
> /* Concatenate file lines pointed to by foo, bar, and baz */
>
> #define SIZE_T_MAX (size_t)-1
>
> if(SIZE_T_MAX - strlen(foo) - strlen(bar) - strlen(baz) <= 1)
> {
> /* Handle overflow error */
> }
> else
> {
> if( (ptr = malloc(strlen(foo) + strlen(bar) + strlen(baz) + 1)) == NULL)
> {
> /* Handle error */
> }
> /* Perform concatenation */
> }

Other people have pointed out the problems with your solution. I'll
just present an alternative:

#define SIZES_ELEMS 3

...

size_t size_to_allocate = 1; /* for '\0' */
size_t remaining_size = SIZE_T_MAX - size_to_allocate;
size_t sizes_to_add[SIZES_ELEMS];
size_t *i;

sizes_to_add[0] = strlen(foo);
sizes_to_add[1] = strlen(bar);
sizes_to_add[2] = strlen(baz);

for (i = sizes_to_add; i != sizes_to_add + SIZES_ELEMS;
     ++i)
{
    if (remaining_size > *i)
    {
        size_to_allocate += *i;
        remaining_size -= *i;
    }
    else
    {
        /* Not enough space left in your size_t to add *i.
           Deal with it. */
        printf("Not enough space to add size number %d.\n",
               (int)(i - sizes_to_add + 1));
        break;
    }
}

-Micah

m n

Aug 30, 2002, 7:13:26 PM

Thank you all for your responses...you are all correct that
I was confused about unsigned arithmetic--I was assuming that
while unsigned integers cannot hold a negative value, the result
of an expression involving unsigned integers could
be negative, and therefore would be caught by my test.

That said, I'd like to offer another solution, more akin to your
responses, and ask for further feedback. What follows is a
variadic function which will return the sum of a set of integers
of type size_t; the function will further set the integer pointed
to by the overflow parameter to 1 if overflow occurred or zero
otherwise.

#include <stdarg.h>

size_t add_size_t(int *overflow, size_t num, size_t firstVal, ...)
{
    va_list ap;
    size_t oldSum, result;

    *overflow = 0;
    va_start(ap, firstVal);

    for(result = firstVal; num > 1 && !(*overflow); num--)
    {
        oldSum = result;
        result += va_arg(ap, size_t);
        if(result < oldSum) *overflow = 1;
    }

    va_end(ap);
    return result;
}

Clearly, the caller is responsible for ensuring that the number
of size_t integers to add is equivalent to the value passed through
the num parameter. If the overflow is set to zero, the caller
can proceed with a malloc using the return value of the function
instead of calling strlen() again for each char *. Please offer
your thoughts...
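
One point for callers, offered as a sketch rather than as part of the function's contract: arguments passed through the "..." are not converted to size_t, so every one of them must already have that type, including constants. The function add_size_t is repeated from above so that the example compiles on its own; concat_request is a hypothetical caller, and the sketch assumes size_t is not narrower than int.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* add_size_t as posted above, repeated so the sketch is self-contained. */
size_t add_size_t(int *overflow, size_t num, size_t firstVal, ...)
{
    va_list ap;
    size_t oldSum, result;

    *overflow = 0;
    va_start(ap, firstVal);
    for (result = firstVal; num > 1 && !(*overflow); num--)
    {
        oldSum = result;
        result += va_arg(ap, size_t);
        if (result < oldSum) *overflow = 1;
    }
    va_end(ap);
    return result;
}

/* Hypothetical caller: computes the malloc request for three strings
   plus the '\0'.  Returns the overflow flag (0 means *request is valid). */
int concat_request(const char *foo, const char *bar, const char *baz,
                   size_t *request)
{
    int overflow;

    /* The literal 1 is cast: an int passed through "..." and read back
       with va_arg(ap, size_t) would be undefined behaviour. */
    *request = add_size_t(&overflow, 4, strlen(foo), strlen(bar),
                          strlen(baz), (size_t)1);
    return overflow;
}
```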

Cheers,
Matt

Darkling¹

Aug 31, 2002, 1:08:42 AM

"m n"

>
> #include <stdarg.h>
>
> size_t add_size_t(int *overflow, size_t num, size_t firstVal, ...)
> {
> va_list ap;
> size_t oldSum, result;
>
> *overflow = 0;
> va_start(ap, firstVal);
>
> for(result = firstVal; num > 1 && !(*overflow); num--)
> {
> oldSum = result;
> result += va_arg(ap, size_t);
> if(result < oldSum) *overflow = 1;
> }
>
> va_end(ap);
> return result;
> }
>
> Clearly, the caller is responsible for ensuring that the number
> of size_t integers to add is equivalent to the value passed through
> the num parameter. If the overflow is set to zero, the caller
> can proceed with a malloc using the return value of the function
> instead of calling strlen() again for each char *. Please offer
> your thoughts...

Your code is correct for the intended purpose. However, I would prefer to have
the indication of a possible overflow to be the return value of the function, as
this allows you to write your client code in a more convenient way. Something
like:

int add_size_t(size_t *sum, size_t n, size_t first, ...)
{
    int rc = 0;
    size_t s = first;
    va_list ap;
    va_start(ap, first);
    for ( ; 1 < n; --n) {
        const size_t t = s;
        s += va_arg(ap, size_t);
        if (s < t) {
            rc = 1;
            break;
        }
    }
    va_end(ap);
    *sum = s;
    return rc;
}

Darkling¹

Aug 31, 2002, 2:18:55 AM

"Darkling¹"
> "m n"
> > Please offer your thoughts...

A function I personally use, which might be of interest to you, for adding
values of type size_t is presented below. It is capable of adding two values
only (which is quite common in practice; I find the necessity to add more than
two such values quite rare), and ''clips'' a possible ''overflow'' to a value of
((size_t)-1). This means that the value ((size_t)-1) can no longer be
discriminated from an occurrence of ''overflow'', but this is hardly a problem in
practice. This function also allows for adding more than two values in a
convenient way as demonstrated. Hope this can be of any use to you.

#include <stddef.h>

size_t summise_size_t(const size_t a, const size_t b)
{
    const size_t sum = a + b;
    return (sum < a) ? ((size_t)-1) : sum;
}

int main(void)
{
    size_t a = 6, b = 100, c = 4300;
    size_t sum = summise_size_t(a, summise_size_t(b, c));
    if (sum == (size_t)-1) {
        /* ''overflow'' */
    }
    else {
        /* no ''overflow'' */
    }
    return 0;
}

Peter Nilsson

Aug 31, 2002, 3:06:21 AM

"Darkling¹" <Darkling¹@invalid.email> wrote in message
news:Z2Zb9.28326$28.35...@zwoll1.home.nl...

> "Darkling¹"
> > "m n"
> > > Please offer your thoughts...
>
> A function I personally use, which might be of interest to you, for adding
> values of type size_t is presented below. It is capable of adding two
> values only (which is quite common in practice; I find the necessity to add
> more than two such values quite rare), and ''clips'' a possible
> ''overflow'' to a value of ((size_t)-1). This means that the value
> ((size_t)-1) can no longer be discriminated from an occurrence of
> ''overflow'', but this is hardly a problem in practice. This function also
> allows for adding more than two values in a convenient way as demonstrated.
> Hope this can be of any use to you.
>
> #include <stddef.h>
>
> size_t summise_size_t(const size_t a, const size_t b)
> {
> const size_t sum = a + b;
> return (sum < a) ? ((size_t)-1) : sum;
> }

[snip]

Ignoring the DeathStation scenario of size_t promoting to int, consider the
call...

summise_size_t((size_t)-1, 4)

Better is...

return (a < (size_t)-1 - b) ? a + b
: (size_t) -1;
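
Written out as a complete function, the pre-check form might look like the sketch below. The name sat_add_size_t is hypothetical. The comparison is made against the room left below (size_t)-1, so a + b is only evaluated when it is known to fit.

```c
#include <stddef.h>

/* Hypothetical saturating add: returns a + b, or (size_t)-1 when the
   true sum would reach or exceed (size_t)-1. */
size_t sat_add_size_t(size_t a, size_t b)
{
    const size_t max = (size_t)-1;

    /* max - b cannot wrap because b <= max. */
    return (a < max - b) ? a + b : max;
}
```

As with the clipping function quoted above, a true sum of exactly (size_t)-1 is indistinguishable from a clipped one, and calls can be nested to add more than two values, for example sat_add_size_t(a, sat_add_size_t(b, c)).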

--
Peter

Darkling¹

Aug 31, 2002, 6:53:30 AM

"Peter Nilsson"

>
> Ignoring the DeathStation scenario of size_t promoting to int, consider the
> call...
>
> summise_size_t((size_t)-1, 4)

Afaik the C standard does not allow for an integral promotion of the type
size_t.

pete

Aug 31, 2002, 9:30:30 AM

What is the extent of your knowledge based on?

--
pete

Darkling¹

Aug 31, 2002, 11:00:54 AM

"pete"

I will have to guess what your phrase exactly means as I am not that familiar
with the English language to understand your question, but here goes: the way I
interpret the standard, the integral promotions do not apply to the type size_t.
I would be interested in the section(s) of the standard that tell me
otherwise.

Mathew Hendry

Aug 31, 2002, 12:14:35 PM

Darkling¹ wrote:

>"pete" wrote:

>
>> Darkling¹ wrote:
>>
>> > Afaik the C standard
>> > does not allow for an integral promotion of the type size_t.
>>
>> What is the extent of your knowledge based on?
>
>I will have to guess what your phrase exactly means as I am not that familiar
>with the English language to understand your question, but here goes: the way I
>interpret the standard integral promotions do not apply to the type size_t. I
>will be interested in the section(s) from the standard from which you can tell
>me different.

They do apply because there is nothing in the standard to imply that
they do not. size_t is only a typedef for one of the standard unsigned
integer types, and they are all subject to the promotions.

-- Mat.

pete

Aug 31, 2002, 4:41:59 PM

Darkling¹ wrote:
>
> "pete"
> > Darkling¹ wrote:
> > >
> > > "Peter Nilsson"
> > > >
> > > > Ignoring the DeathStation scenario of size_t promoting to int,
> > > > consider the call...
> > > >
> > > > summise_size_t((size_t)-1, 4)
> > >
> > > Afaik the C standard
> > > does not allow for an integral promotion of the type size_t.
> >
> > What is the extent of your knowledge based on?
>
> I will have to guess what your phrase exactly means as
> I am not that familiar with the English language to understand
> your question, but here goes:
> the way I interpret the standard integral promotions do not apply
> to the type size_t.

Do you have any reason to believe that the range of size_t
must be equal or greater than the range of int ?

> I will be interested in the section(s)
> from the standard from which you can tell me different.

N869
6.3.1.1 Boolean, characters, and integers

[#2] The following may be used in an expression wherever an
int or unsigned int may be used:
-- An object or expression with an integer type whose
integer conversion rank is less than the rank of int
and unsigned int.
-- A bit-field of type _Bool, int, signed int, or unsigned
int.
If an int can represent all values of the original type, the
value is converted to an int; otherwise, it is converted to
an unsigned int. These are called the integer
promotions.
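
A small illustration of the quoted rule, using unsigned short as a stand-in for a narrow size_t. This is a sketch that assumes a platform where int is wider than unsigned short (common, but an assumption here); the function names are made up for the example.

```c
#include <limits.h>

/* With int wider than unsigned short, a is promoted to int, so a + 1 is
   computed as an int and does not wrap: the comparison yields 1. */
int sum_exceeds_operand(void)
{
    unsigned short a = USHRT_MAX;

    return a + 1 > a;
}

/* Only the conversion back to unsigned short reduces the value
   modulo USHRT_MAX + 1, giving 0. */
unsigned short stored_sum(void)
{
    unsigned short a = USHRT_MAX;
    unsigned short s = (unsigned short)(a + 1);

    return s;
}
```

If size_t were such a narrow type, a test like a + b < a, written without first storing the sum in a size_t, would never fire, because the sum would be formed in int.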

--
pete

Darkling¹

Sep 1, 2002, 2:46:58 AM

"pete"

>
> Do you have any reason to believe that the range of size_t
> must be equal or greater than the range of int ?

Ah yes, you are right. I have not considered size_t to be an alias for a type
''smaller'' than int; until now that is, though this seems quite exotic to me.
Anyway, you are right in that my code invokes UB in case the size_t values are
promoted to an int and the resulting sum overflows. If I am not mistaken, this
can be resolved by casting the parameters to type size_t in the summation to
have addition performed on unsigned types, so no overflow occurs.

size_t summise_size_t(const size_t a, const size_t b)
{

const size_t sum = (size_t)a + (size_t)b;

Peter Nilsson

Sep 1, 2002, 4:57:26 AM

"Darkling¹" <Darkling¹@invalid.email> wrote in message

news:bzic9.34685$28.42...@zwoll1.home.nl...

> "pete"
> >
> > Do you have any reason to believe that the range of size_t
> > must be equal or greater than the range of int ?
>
> Ah yes, you are right. I have not considered size_t to be an alias for a
> type ''smaller'' than int; until now that is, though this seems quite
> exotic to me.

It is. No sane implementation would do it that way.

> Anyway, you are right in that my code invokes UB in case the size_t
> values are promoted to an int and the resulting sum overflows. If I am
> not mistaken, this can be resolved by casting the parameters to type
> size_t in the summation to have addition performed on unsigned types,
> so no overflow occurs.

You are mistaken. :-)

>
> size_t summise_size_t(const size_t a, const size_t b)
> {
> const size_t sum = (size_t)a + (size_t)b;

The promotion would still apply. Your casts have literally left the
semantics unchanged. Even this would be the same...

size_t sum = a;
sum += b;

> return (sum < a) ? ((size_t)-1) : sum;
> }

In C, you cannot (portably) perform the sum, then check for overflow. The
parameters (size_t) -2 and 4, say, /still/ have the potential to produce a
false positive from your function! Reread my original reply for the
correction.

--
Peter

Darkling¹

Sep 1, 2002, 2:32:19 PM

"Peter Nilsson"
> "Darkling¹"

> >
> > size_t summise_size_t(const size_t a, const size_t b)
> > {
> > const size_t sum = (size_t)a + (size_t)b;
>
> The promotion would still apply. Your casts have literally left the
> semantics unchanged. Even this would be the same...

Holy kanarie, yes, you are right. It took me some time to see this though, but
then again I am just a new bee. It seems I will have to be more careful in my
coding. Do you happen to know the rationale behind these integral promotions (or
could you give a reference). Are there more of these "counterintuitive" things
in C for which I should code with some more care? TIA

Dan Pop

Sep 2, 2002, 7:13:44 AM

In <sUsc9.36626$28.47...@zwoll1.home.nl> "Darkling¹" <Darkling¹@invalid.email> writes:

>"Peter Nilsson"
>> "Darkling¹"

>> >
>> > size_t summise_size_t(const size_t a, const size_t b)
>> > {
>> > const size_t sum = (size_t)a + (size_t)b;
>>
>> The promotion would still apply. Your casts have literally left the
>> semantics unchanged. Even this would be the same...
>
>Holy kanarie, yes, you are right. It took me some time to see this though, but
>then again I am just a new bee. It seems I will have to be more careful in my
>coding. Do you happen to know the rationale behind these integral promotions (or
>could you give a reference).

Performance. Certain machines had no instructions for arithmetic on
entities shorter than the machine register.

>Are there more of these "counterintuitive" things
>in C for which I should code with some more care? TIA

The canonical example is an implementation where sizeof(int) == 1.
Plenty of things we commonly do in our programs would break on such an
implementation.

Chris Torek

Sep 2, 2002, 5:28:31 PM

In article <aklf58$q6e$1...@sunnews.cern.ch> Dan Pop <Dan...@ifh.de> wrote:
>Since size_t arithmetic is unsigned arithmetic, the behaviour in case

>of overflow is well defined. ...

Unless, of course, size_t is a narrow unsigned type (such as
unsigned short) that widens to a signed type (e.g., signed int)
under "the usual arithmetic conversions". See how the unsigned
widening rules in ANSI C are broken? :-)

>In practice, unless you're on a brain dead platform, size_t is large
>enough to cover the whole address space available to your application,
>so you have a big problem if you detect overflow.

Also, only an obtuse implementor would make size_t a narrow unsigned
type that widens to a signed type. But I still consider this a
(small) bug in the standard.
--
In-Real-Life: Chris Torek, Wind River Systems (BSD engineering)
Salt Lake City, UT, USA Domain: to...@bsdi.com
http://67.40.109.61/torek/ (for the moment) I report spam to abuse@.
"nos...@elf.eng.bsdi.com" *is* my address (one of many actually).

Dan Pop

Sep 3, 2002, 5:18:08 AM

In <al0l5v$hgv$1...@elf.eng.bsdi.com> Chris Torek <nos...@elf.eng.bsdi.com> writes:

>In article <aklf58$q6e$1...@sunnews.cern.ch> Dan Pop <Dan...@ifh.de> wrote:
>>Since size_t arithmetic is unsigned arithmetic, the behaviour in case
>>of overflow is well defined. ...
>
>Unless, of course, size_t is a narrow unsigned type (such as
>unsigned short) that widens to a signed type (e.g., signed int)
>under "the usual arithmetic conversions". See how the unsigned
>widening rules in ANSI C are broken? :-)
>
>>In practice, unless you're on a brain dead platform, size_t is large
>>enough to cover the whole address space available to your application,
>>so you have a big problem if you detect overflow.
>
>Also, only an obtuse implementor would make size_t a narrow unsigned
>type that widens to a signed type. But I still consider this a
>(small) bug in the standard.

I am not convinced that a size_t which is subject to integral promotions
is allowed.

Consider the following program:

void *malloc();

int main()
{
char *p = malloc(sizeof(int));
return 0;
}

As far as I can tell, this is a strictly conforming program. If size_t
is promoted to anything else, it will invoke undefined behaviour, because
malloc will not be called with a size_t argument.

The standard requires a correct declaration for malloc, but not a
prototype one:

[#2] Provided that a library function can be declared
without reference to any type defined in a header, it is
also permissible to declare the function and use it without
including its associated header.

Peter Nilsson

Sep 3, 2002, 6:09:19 PM

"Dan Pop" <Dan...@ifh.de> wrote in message
news:al1uog$sip$1...@sunnews.cern.ch...
...

>
> I am not convinced that a size_t which is subject to integral promotions
> is allowed.
>
> Consider the following program:
>
> void *malloc();
>
> int main()
> {
> char *p = malloc(sizeof(int));
> return 0;
> }
>
> As far as I can tell, this is a strictly conforming program. If size_t
> is promoted to anything else, it will invoke undefined behaviour, because
> malloc will not be called with a size_t argument.

A program is either strictly conforming or it isn't. :-) Your use of the
term "invoke" is misleading here since an /implementation/ is never required
to /impose/ undefined behaviour.

> The standard requires a correct declaration for malloc, but not a
> prototype one:
>
> [#2] Provided that a library function can be declared
> without reference to any type defined in a header, it is
> also permissible to declare the function and use it without
> including its associated header.

The first word of the above clause is "Provided". AFAIK, size_t cannot be
guessed by a strictly conforming program. Indeed, under C99, size_t need not
be a standard integer type. Since the application of a function call may not
impose any implicit promotion on parameters other than the standard
promotions, that would seem to preclude the option of a strictly conforming
program declaring malloc without a prototype.

--
Peter

pete

Sep 3, 2002, 9:29:09 PM

It's a different situation depending on whether or not the result
is guaranteed to be representable within the range of the type.
"overflow" is for when the result isn't guaranteed to be
representable within the range of the type, and the same word
shouldn't be used for what happens to unsigned integers.

--
pete

Dan Pop

Sep 4, 2002, 4:39:39 AM

That's why we talk about signed arithmetic overflow and unsigned
arithmetic overflow. One results in undefined behaviour, the other in
truncation modulo Ux_MAX + 1. But in neither case can the mathematically
correct result of the operation be represented by the given type,
therefore we do have overflow.
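
The practical difference between the two cases can be sketched as follows (the function names are hypothetical). The unsigned sum may be computed first and inspected afterwards, whereas the signed sum has to be guarded before the addition, because evaluating it out of range is already undefined behaviour.

```c
#include <limits.h>

/* Unsigned: always defined, the result is reduced modulo UINT_MAX + 1. */
unsigned wrap_add(unsigned a, unsigned b)
{
    return a + b;
}

/* Signed: test against the limits first.  Returns 1 if a + b is not
   representable in an int, otherwise stores the sum in *out and returns 0. */
int checked_int_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return 1;
    *out = a + b;
    return 0;
}
```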

Dan Pop

Sep 4, 2002, 4:46:59 AM

In <3d75...@news.rivernet.com.au> "Peter Nilsson" <ai...@acay.com.au> writes:

>"Dan Pop" <Dan...@ifh.de> wrote in message
>news:al1uog$sip$1...@sunnews.cern.ch...
>...
>>
>> I am not convinced that a size_t which is subject to integral promotions
>> is allowed.
>>
>> Consider the following program:
>>
>> void *malloc();
>>
>> int main()
>> {
>> char *p = malloc(sizeof(int));
>> return 0;
>> }
>>
>> As far as I can tell, this is a strictly conforming program. If size_t
>> is promoted to anything else, it will invoke undefined behaviour, because
>> malloc will not be called with a size_t argument.
>
>A program is either strictly conforming or it isn't. :-) Your use of the
>term "invoke" is misleading here since an /implementation/ is never required
>to /impose/ undefined behaviour.

That was precisely my point: if the program is strictly conforming, such
an implementation would be non-conforming.

>> The standard requires a correct declaration for malloc, but not a
>> prototype one:
>>
>> [#2] Provided that a library function can be declared
>> without reference to any type defined in a header, it is
>> also permissible to declare the function and use it without
>> including its associated header.
>
>The first word of the above clause is "Provided".

And my declaration of malloc is, AFAICT, correct.

>AFAIK, size_t cannot be
>guessed by a strictly conforming program.

It doesn't have to.

>Indeed, under C99, size_t need not
>be a standard integer type.

So what?

>Since the application of a function call may not
>impose any implicit promotion on parameters other than the standard
>promotions, that would seem to preclude the option of a strictly conforming
>program declaring malloc without a prototype.

Or to preclude the option of size_t being subject to the integral
promotions.

pete

Sep 4, 2002, 9:03:11 AM

Dan Pop wrote:

> That's why we talk about signed arithmetic overflow and unsigned
> arithmetic overflow.

I think it's best not to use words in a way that directly
contradicts the C standard, when discussing C,
and especially in clc.

--
pete

Dan Pop

Sep 4, 2002, 9:44:47 AM

Then, what do you propose as a "better" replacement for unsigned
arithmetic overflow?

Peter Nilsson

Sep 4, 2002, 5:42:44 PM

"Dan Pop" <Dan...@ifh.de> wrote in message

news:al4ha3$d3h$1...@sunnews.cern.ch...

> In <3d75...@news.rivernet.com.au> "Peter Nilsson" <ai...@acay.com.au> writes:
> >"Dan Pop" <Dan...@ifh.de> wrote in message
> >news:al1uog$sip$1...@sunnews.cern.ch...
> >...
> >>
> >> I am not convinced that a size_t which is subject to integral
> >> promotions is allowed.
> >>
> >> Consider the following program:
> >>
> >> void *malloc();
> >>
> >> int main()
> >> {
> >> char *p = malloc(sizeof(int));
> >> return 0;
> >> }
> >>
> >> As far as I can tell, this is a strictly conforming program. If size_t
> >> is promoted to anything else, it will invoke undefined behaviour,
> >> because malloc will not be called with a size_t argument.

...

> >> The standard requires a correct declaration for malloc, but not a
> >> prototype one:
> >>
> >> [#2] Provided that a library function can be declared
> >> without reference to any type defined in a header, it is
> >> also permissible to declare the function and use it without
> >> including its associated header.
> >
> >The first word of the above clause is "Provided".
>
> And my declaration of malloc is, AFAICT, correct.

Well... I guess it's a matter of interpretation! You see [#2] as allowing an
unprototyped declaration of malloc. I see it as stating a prototype
requirement for those functions which take an implementation defined
parameter type, like malloc.

I find it curious that functions like cosf under C99 can also be "declared
without reference to any type /defined in a header/", and yet its
unprototyped declaration without an associated header is certainly not
permissible in a strictly conforming program.

--
Peter

pete

Sep 4, 2002, 11:45:37 PM

Dan Pop wrote:
>
> In <3D7604...@mindspring.com> pete <pfi...@mindspring.com> writes:
>
> >Dan Pop wrote:
> >
> >> That's why we talk about signed arithmetic overflow and unsigned
> >> arithmetic overflow.
> >
> >I think it's best not to use words in a way that directly
> >contradicts the C standard, when discussing C,
> >and especially in clc.
>
> Then, what do you propose as a "better" replacement for unsigned
> arithmetic overflow?

"unsigned arithmetic wrap around"

--
pete

Dan Pop

Sep 5, 2002, 5:17:28 AM

Since when is "wrap around" a technical term?

"wrap around" describes what happens when unsigned arithmetic overflow
occurs, not the unsigned arithmetic overflow condition itself.

Dan Pop

Sep 5, 2002, 5:27:04 AM

Before C99, NO standard library function was designed to have parameters
whose types would have been subjected to the default argument promotions
if used as arguments (i.e. no parameters narrower than int, no parameters
of float type). My argument is based on the fact that malloc() predates
C99.

E. Gibbons

Sep 5, 2002, 1:29:53 PM

In article <al77f8$387$1...@sunnews.cern.ch>, Dan Pop <Dan...@ifh.de> wrote:
>In <3D76D3...@mindspring.com> pete <pfi...@mindspring.com> writes:
>
>>Dan Pop wrote:
>>>
>>> In <3D7604...@mindspring.com> pete <pfi...@mindspring.com> writes:
>>>
>>> >Dan Pop wrote:
>>> >
>>> >> That's why we talk about signed arithmetic overflow and unsigned
>>> >> arithmetic overflow.
>>> >
>>> >I think it's best not to use words in a way that directly
>>> >contradicts the C standard, when discussing C,
>>> >and especially in clc.
>>>
>>> Then, what do you propose as a "better" replacement for unsigned
>>> arithmetic overflow?
>>
>>"unsigned arithmetic wrap around"
>
>Since when is "wrap around" a technical term?

Since now, or some time in the future: you asked for a proposal for
a replacement, not for something that is necessarily in-use currently.
The original issue, apparently, was that *your* term was not a proper
technical term either, in that it wasn't in the Standard (I don't know
this to be the case firsthand, it's just my paraphrasing of the thread).

Personally, I call it "rollover".

>"wrap around" describes what happens when unsigned arithmetic overflow
>occurs, not the unsigned arithmetic overflow condition itself.

Isn't the "condition" one and the same with "what happens when it occurs"?

I don't like calling what happens with unsigned ints, "overflow", because
I view the situation analogously to a buffer that one is indexing and
writing to: after one hits the end of the buffer, if one writes past the
last element, that is "overflow", whereas if one starts back at the
beginning and overwrites the first element, that is "rollover" (or
"wraparound" if you will).

--Ben

--

Darklingน

Sep 5, 2002, 3:11:06 PM

"E. Gibbons" <euph...@u.washington.edu>

> In article <al77f8$387$1...@sunnews.cern.ch>, Dan Pop <Dan...@ifh.de> wrote:
> >In <3D76D3...@mindspring.com> pete <pfi...@mindspring.com> writes:
> >
> >>Dan Pop wrote:
> >>>
> >>> In <3D7604...@mindspring.com> pete <pfi...@mindspring.com> writes:
> >>>
> >>> >Dan Pop wrote:
> >>> >
> >>> >> That's why we talk about signed arithmetic overflow and unsigned
> >>> >> arithmetic overflow.
> >>> >
> >>> >I think it's best not to use words in a way that directly
> >>> >contradicts the C standard, when discussing C,
> >>> >and especially in clc.
> >>>
> >>> Then, what do you propose as a "better" replacement for unsigned
> >>> arithmetic overflow?
> >>
> >>"unsigned arithmetic wrap around"
> >
> >Since when is "wrap around" a technical term?
>
> Since now, or some time in the future: you asked for a proposal for
> a replacement, not for something that is necessarily in-use currently.
> The original issue, apparently, was that *your* term was not a proper
> technical term either, in that it wasn't in the Standard (I don't know
> this to be the case firsthand, it's just my paraphrasing of the thread).
>
> Personally, I call it "rollover".

I opt for 'reduced', as this is the closest (short) term in accordance with the
standard.

pete

Sep 5, 2002, 7:42:14 PM

The conditions are:
1 A computation involving unsigned operands that
cannot be represented by the resulting unsigned integer type
2 Assignment of an out of range value to an unsigned integer type

--
pete

Dan Pop

Sep 6, 2002, 7:34:22 AM

In <al84ah$8bm$1...@nntp6.u.washington.edu> euph...@u.washington.edu (E. Gibbons) writes:

>In article <al77f8$387$1...@sunnews.cern.ch>, Dan Pop <Dan...@ifh.de> wrote:
>>"wrap around" describes what happens when unsigned arithmetic overflow
>>occurs, not the unsigned arithmetic overflow condition itself.
>
>Isn't the "condition" one and the same with "what happens when it occurs"?

Nope. Think about signed arithmetic: when the condition (i.e. overflow)
occurs, what happens is undefined behaviour.
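
A minimal sketch of why that distinction matters in practice. The function
names are made up for illustration and are not from the standard or from
anyone's post. Because the unsigned result is well defined, the unsigned
test can be made after the addition; the signed test has to be made before
it, since the addition itself would be the undefined operation.

```c
#include <limits.h>
#include <stddef.h>

/* Unsigned addition: the result is reduced modulo (size_t)-1 + 1, so
   the check can look at the result after the fact. */
int add_size_wraps(size_t a, size_t b, size_t *sum)
{
    *sum = a + b;      /* well defined even when the true sum does not fit */
    return *sum < a;   /* nonzero if the result was reduced */
}

/* Signed addition: overflow is undefined behaviour, so the check must
   be done beforehand, using only operations that cannot overflow. */
int add_int_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;
    return a < INT_MIN - b;
}
```

A caller would add two ints only after add_int_would_overflow() has
returned 0, whereas add_size_wraps() can be called unconditionally.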

Simon Biber

Sep 6, 2002, 7:06:59 PM

"pete" <pfi...@mindspring.com> wrote:

> Dan Pop wrote:
> > "wrap around" describes what happens when unsigned
> > arithmetic overflow occurs, not the unsigned
> > arithmetic overflow condition itself.
>
> The conditions are:
> 1 A computation involving unsigned operands that
> cannot be represented by the resulting unsigned
> integer type
> 2 Assignment of an out of range value to an
> unsigned integer type

Wrap-around can also happen in integer promotions, argument conversion for
function calls, through explicit casts, etc. Condition 2 should read
"Conversion of an out-of-range value to an unsigned integer type". Such
conversions do not only happen on assignment.
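
The cases listed above can be sketched roughly as follows. All function
names here are hypothetical, chosen only to label each kind of conversion;
in every case an out-of-range int value is reduced modulo UCHAR_MAX + 1.

```c
#include <limits.h>

/* Conversion by assignment. */
unsigned char by_assignment(int v)
{
    unsigned char c;
    c = v;
    return c;
}

/* Conversion by explicit cast. */
unsigned char by_cast(int v)
{
    return (unsigned char)v;
}

/* Conversion of a return value to the function's return type. */
unsigned char by_return(int v)
{
    return v;
}

/* Conversion of an argument to the parameter type of a prototyped function. */
unsigned char identity(unsigned char c)
{
    return c;
}

unsigned char by_argument(int v)
{
    return identity(v);
}
```

Passing UCHAR_MAX + 2 to any of these should give 1, and passing -1
should give UCHAR_MAX, with no assignment needed in three of the four.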

--
Simon.

Joe Wright

Sep 7, 2002, 10:41:52 AM

I'm jumping in here out of turn so to speak but I think the subject begs
a question that these responses are not answering. Consider..

typedef unsigned long size_t; /* Somewhere */

Do we think this means that size_t overflows at ULONG_MAX+1 ? Maybe not.
Consider..

#define ULONG_MAX 4294967295UL
#define SIZE_MAX 2147483647UL

I would think size_t overflows at SIZE_MAX+1.
--
Joe Wright mailto:joeww...@earthlink.net
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---

Mathew Hendry

Sep 7, 2002, 11:48:51 AM

On Sat, 07 Sep 2002 14:41:52 GMT, Joe Wright
<joeww...@earthlink.net> wrote:

>I'm jumping in here out of turn so to speak but I think the subject begs
>a question that these responses are not answering. Consider..
>
>typedef unsigned long size_t; /* Somewhere */
>
>Do we think this means that size_t overflows at ULONG_MAX+1 ? Maybe not.
>Consider..
>
>#define ULONG_MAX 4294967295UL
>#define SIZE_MAX 2147483647UL

I don't think that's allowed. The *_MAX macros defined in <stdint.h>
specify "the minimum and maximum limits of integer types corresponding
to types defined in other standard headers" (C99 7.18.3), so if
unsigned long corresponds to size_t, then SIZE_MAX must equal
ULONG_MAX.

-- Mat.

Simon Biber

Sep 7, 2002, 11:59:25 AM

"Joe Wright" <joeww...@earthlink.net> wrote:
> I'm jumping in here out of turn so to speak but I think
> the subject begs a question that these responses are not
> answering. Consider..
>
> typedef unsigned long size_t; /* Somewhere */
>
> Do we think this means that size_t overflows at
> ULONG_MAX+1 ? Maybe not. Consider..
>
> #define ULONG_MAX 4294967295UL
> #define SIZE_MAX 2147483647UL
>
> I would think size_t overflows at SIZE_MAX+1.

I would think that your three definitions there are inconsistent. If size_t
is a typedef alias for unsigned long, then they will wrap around at the
same value. I should think it's guaranteed that both:
SIZE_MAX == (size_t)-1
and
ULONG_MAX == (unsigned long)-1
must be true.
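
A small sketch of that claim, assuming a C99 implementation that provides
SIZE_MAX in <stdint.h>. The two function names are invented for the
example. On a pre-C99 implementation only the (size_t)-1 spelling is
available.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>   /* SIZE_MAX (C99) */

/* Converting -1 to an unsigned type yields that type's maximum value,
   so each comparison is expected to hold on a conforming implementation. */
int size_max_matches(void)
{
    return SIZE_MAX == (size_t)-1;
}

int ulong_max_matches(void)
{
    return ULONG_MAX == (unsigned long)-1;
}
```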

--
Simon.

pete

Sep 7, 2002, 10:07:41 PM

Simon Biber wrote:
>
> "pete" <pfi...@mindspring.com> wrote:
> > Dan Pop wrote:
> > > "wrap around" describes what happens when unsigned
> > > arithmetic overflow occurs, not the unsigned
> > > arithmetic overflow condition itself.
> >
> > The conditions are:
> > 1 A computation involving unsigned operands that
> > cannot be represented by the resulting unsigned
> > integer type
> > 2 Assignment of an out of range value to an
> > unsigned integer type
>
> Wrap-around can also happen in integer promotions,
> argument conversion for
> function calls, through explicit casts, etc.

I don't think that non-negative values can be altered during
integer promotions, but "conversions" in general seems OK.

> Condition 2 should read
> "Conversion of an out-of-range value to an unsigned integer type".
> Such conversions do not only happen on assignment.

--
pete

Dan Pop

Sep 9, 2002, 8:25:02 AM

In <3D7A11...@earthlink.net> Joe Wright <joeww...@earthlink.net> writes:

>I'm jumping in here out of turn so to speak but I think the subject begs
>a question that these responses are not answering. Consider..
>
>typedef unsigned long size_t; /* Somewhere */
>
>Do we think this means that size_t overflows at ULONG_MAX+1 ?

Well, I do.

> Maybe not. Consider..
>
>#define ULONG_MAX 4294967295UL
>#define SIZE_MAX 2147483647UL
>
>I would think size_t overflows at SIZE_MAX+1.

I would think this is a non-conforming implementation. SIZE_MAX
reflects a property of the type aliased by size_t. If this type is
unsigned long, there is no way to have SIZE_MAX != ULONG_MAX in a
conforming implementation. Even if the implementation doesn't support
objects whose size exceeds 2 GB.

Joe Wright

Sep 9, 2002, 10:31:40 PM

My apologies, please. I jumped in here just a little off beam. My take
on "detecting size_t overflow" was not about types so much as values. If
SIZE_MAX is 2 GB then detecting size_t overflow at 4 GB misses the
point.

Dan Pop

Sep 10, 2002, 5:15:02 AM

In <3D7D5B...@earthlink.net> Joe Wright <joeww...@earthlink.net> writes:

>Dan Pop wrote:
>>
>> In <3D7A11...@earthlink.net> Joe Wright <joeww...@earthlink.net> writes:
>>
>> >I'm jumping in here out of turn so to speak but I think the subject begs
>> >a question that these responses are not answering. Consider..
>> >
>> >typedef unsigned long size_t; /* Somewhere */
>> >
>> >Do we think this means that size_t overflows at ULONG_MAX+1 ?
>>
>> Well, I do.
>>
>> > Maybe not. Consider..
>> >
>> >#define ULONG_MAX 4294967295UL
>> >#define SIZE_MAX 2147483647UL
>> >
>> >I would think size_t overflows at SIZE_MAX+1.
>>
>> I would think this is a non-conforming implementation. SIZE_MAX
>> reflects a property of the type aliased by size_t. If this type is
>> unsigned long, there is no way to have SIZE_MAX != ULONG_MAX in a
>> conforming implementation. Even if the implementation doesn't support
>> objects whose size exceeds 2 GB.
>>
>My apologies, please. I jumped in here just a little off beam. My take
>on "detecting size_t overflow" was not about types so much as values. If
>SIZE_MAX is 2 GB then detecting size_t overflow at 4 GB misses the
>point.

But, since SIZE_MAX cannot be 2 GB, in context, I can't see your point.

Joe Wright

Sep 10, 2002, 9:02:22 PM

Really? What is the point then of SIZE_MAX if not to limit the value of
valid size_t values?

pete

Sep 11, 2002, 1:22:16 AM

Dan Pop wrote:

> That's why we talk about signed arithmetic overflow and unsigned
> arithmetic overflow.

It's like when I hear people talk about
"fake psychics" and "real psychics".

--
pete

Simon Biber

Sep 11, 2002, 3:14:49 AM

Dan Pop wrote:
> > But, since SIZE_MAX cannot be 2 GB, in context, I can't
> > see your point.

Joe Wright replied:

> Really? What is the point then of SIZE_MAX if not to limit
> the value of valid size_t values?

size_t is an alias for one of the unsigned integer types. It must have the
same range of valid values, and behave the same way on 'overflow'
(wraparound), as the underlying unsigned integer type.

If size_t is unsigned int, then SIZE_MAX == UINT_MAX.
If size_t is unsigned long, then SIZE_MAX == ULONG_MAX.
In C99, size_t may be an alias of an extended integer type.
But still, you can be sure that SIZE_MAX == (size_t)-1

For a value of size_t to be "valid" does not mean that the implementation
must allow objects of that size. It's just a matter of what values are
representable in the type.

--
Simon.

Dan Pop

Sep 11, 2002, 5:19:00 AM

Really! Did you bother to read (and understand) what I wrote in the
paragraph starting with "I would think this is a non-conforming
implementation..." still quoted above?

>What is the point then of SIZE_MAX if not to limit the value of
>valid size_t values?

To document a property of the size_t type. The standard doesn't impose
*any* restriction on how this type is used by a program, therefore there
is no such thing as an "invalid" size_t value.

Most implementations have no idea, *at compile time*, what the size of the
largest object they can support is, because this value depends
on the resources available at run time. And knowing what is the size of
the largest object an implementation could theoretically support (if
"infinite" resources were available at run time) is of no particular use
for most applications.

Furthermore, on many modern systems, the size of the largest supported
object can't be properly determined at run time either: the amount of
available memory can change while the program is running in ways the
program has no control over.
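
In practical terms, this suggests two separate checks for the concatenation
problem that started the thread: make sure the computed size did not wrap,
then let malloc() report whether an object of that size is available right
now. The following is only a sketch under those assumptions; add_checked()
and concat3() are hypothetical names, not anything posted earlier.

```c
#include <stdlib.h>
#include <string.h>

/* Add b to *total. Returns 0 and leaves *total unchanged if the true
   sum is not representable in size_t, 1 otherwise. */
int add_checked(size_t *total, size_t b)
{
    if (b > (size_t)-1 - *total)
        return 0;
    *total += b;
    return 1;
}

/* Returns a newly allocated concatenation of a, b and c, or NULL if the
   required size wraps or the allocation fails. The caller frees it. */
char *concat3(const char *a, const char *b, const char *c)
{
    size_t la = strlen(a), lb = strlen(b), lc = strlen(c);
    size_t total = 1;   /* room for the terminating '\0' */
    char *p;

    if (!add_checked(&total, la) ||
        !add_checked(&total, lb) ||
        !add_checked(&total, lc))
        return NULL;    /* size not representable in size_t */

    p = malloc(total);
    if (p == NULL)
        return NULL;    /* representable, but not available at run time */

    memcpy(p, a, la);
    memcpy(p + la, b, lb);
    memcpy(p + la + lb, c, lc + 1);   /* copies the '\0' as well */
    return p;
}
```

Neither check makes the other redundant: the first guards against a size
that was silently reduced, the second against a size that is perfectly
representable but cannot be satisfied.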