
size_t vs long.


Amit

Nov 17, 2022, 1:20:45 AM
Hi,

I prefer long over size_t.

This is because, in case the user passes a negative number by mistake then I will be able to check it if the type is long and return immediately.

But if size_t is used, then most probably, it will result in a crash - like malloc(-1) will crash the program because unsigned -1 is 0xFFFFFFFF and this much memory is not available on today's computers and probably may not be available at all in future also (RAM size of 2^64 bits is really really huge).

Another thing is that if size_t is used as an array index then array[-1] will result in wrong behavior or a program crash. But with long, the developer can check whether the index is negative, thus avoiding a crash.

So, in my opinion, long should be used instead of size_t.

I know that original glibc authors had chosen size_t, so there must be some reason for that, however that reason is not clear to me.

Amit

James Kuyper

Nov 17, 2022, 2:44:49 AM
On 11/17/22 01:20, Amit wrote:
> Hi,
>
> I prefer long over size_t.

For which purpose? C supports a large number of different types,
precisely because different types are preferred for different purposes.
More often than not, the type I need to use is dictated by the
interfaces to standard and third-party libraries that I'm using.

When I do have a choice in the matter, I prefer unsigned types like
size_t in any context where bitwise operations are involved, because
many of those operations can have undefined behavior under certain
circumstances when using signed types, but not with unsigned types.

Unsigned types are often preferred when the quantity in question is
incapable of being negative, though care must be used when applying this
criterion. Operations involving quantities that cannot be negative may
have results that can be negative. If such operations are relevant to a
particular variable, that variable should have a signed type.

Amit

Nov 17, 2022, 3:16:14 AM
I am not talking about signed vs unsigned.

I am specifically talking about size_t. In general, size_t is used in glibc where a size is concerned, such as in malloc(), memcmp(), etc.

size_t is defined as unsigned long.

My question is: why is size_t used in these functions when, if the user passes a negative value by mistake, these functions will crash the program?

What reasons did the original glibc authors have when they used size_t in these functions instead of long?

If we don't know those reasons, then if a developer has to implement a function called allocate_memory(), which implementation should he use and WHY?

Implementation 1:
-----------------------------

void *allocate_memory(size_t len)
{
}

Implementation 2:
-----------------------------

void *allocate_memory(long len)
{
}

Of the above two implementations, which implementation should the developer use and WHY?

In my opinion, using size_t in malloc(), memcmp(), etc. is wrong. I could be wrong, but if we don't know the original glibc authors' reasons then maybe I am right.

Amit

Amit

Nov 17, 2022, 3:28:44 AM
To shorten the discussion and keep it to the point, I would like to know why size_t is used in malloc() when a negative value (passed by the user by mistake) can crash the program. Using long and checking for negative values can prevent the program from crashing.

Some people might say that user should check for value being negative but for checking for negative values long is required, not size_t. So, then the user will end up using long instead of size_t. So, in effect using size_t in malloc is not correct (unless we get further insight that explains why size_t is correct in malloc()).

Amit

Kaz Kylheku

Nov 17, 2022, 4:31:54 AM
On 2022-11-17, Amit <amitchou...@gmail.com> wrote:
> I know that original glibc authors had chosen size_t, so there must be
> some reason for that, however that reason is not clear to me.

The ANSI C standard was ratified in 1989; it specified a malloc function
prototyped in <stdlib.h> like this:

void *malloc(size_t);

It has nothing to do with glibc.

ANSI C made size_t an unsigned type because:

- sizes of objects are never negative, so a type representing
object size need not represent negative values.

- at the time it was not unusual to have objects which needed
the full unsigned range for expressing their size:
like declaring a 60 kilobyte array under the "small"
memory model on an Intel 8088 PC. The size would be a 16
bit type, which would have to be unsigned to capture the
60 kilobyte size.

- Also relevant in 32 bits. Under 32 bits, a signed, 32 bit
size_t means that only objects up to 2Gb have a meaningful
sizeof value.

Your preference for long spells trouble in a 16 bit system,
because long must be at least 32 bits wide. That's wasteful
and unnatural for object sizes in that system.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal

Kaz Kylheku

Nov 17, 2022, 4:37:24 AM
On 2022-11-17, Amit <amitchou...@gmail.com> wrote:
> On Thursday, November 17, 2022 at 1:14:49 PM UTC+5:30, james...@alumni.caltech.edu wrote:
>> On 11/17/22 01:20, Amit wrote:
>> > Hi,
>> >
>> > I prefer long over size_t.
> [ ... ]
> I am not talking about signed vs unsigned.

Effectively, you are, because long is signed, by definition; and size_t
is unsigned, by definition.

> What reasons did the original glibc authors have when they used size_t in these functions instead of long?

They were conforming to the requirements written in the ISO C standard,
which defines those functions.

>
> If we don't know those reasons then if a developer has to implement a function called allocate_memory() then which implementation should he use and WHY?
>
> Implementation 1:
> -----------------------------
>
> void *allocate_memory(size_t len)
> {
> }

This can easily express the allocation of 3Gb on a 32 bit system, or 48K on a 16 bit system.

>
> Implementation 2:
> -----------------------------
>
> void *allocate_memory(long len)
> {
> }

This forces a 16 bit system to wastefully pass the object size as a 32
bit operand, the top 16 bits of which will always be zero.

The function cannot express the allocation of more than 2Gb on a 32 bit system;
at least not with straightforward positive values.

David Brown

Nov 17, 2022, 5:16:31 AM
On 17/11/2022 07:20, Amit wrote:
> Hi,
>
> I prefer long over size_t.
>
> This is because, in case the user passes a negative number by mistake
> then I will be able to check it if the type is long and return
> immediately.
>
> But if size_t is used, then most probably, it will result in a crash
> - like malloc(-1) will crash the program because unsigned -1 is
> 0xFFFFFFFF and this much memory is not available on today's computers
> and probably may not be available at all in future also (RAM size of
> 2^64 bits is really really huge).

You are mixing 64-bit and 32-bit here. Do you mean
0xffff'ffff'ffff'ffff ? After all, 0xffff'ffff is just shy of 4 GB,
which is not excessive on modern computers.

And do you mean "long long", rather than "size_t" ? On some platforms
"long" is 32-bit and "size_t" is 64-bit. On some, "long" is 64-bit. On
others, "long" is 32-bit and "size_t" is 16-bit.

>
> Another thing is that if size_t is used an array index then array[-1]
> will result in wrong behavior or program crash. But with long, the
> developer can check whether the index is negative, thus avoiding
> program crash.
>
> So, in my opinion, long should be used instead of size_t.
>
> I know that original glibc authors had chosen size_t, so there must
> be some reason for that, however that reason is not clear to me.
>
> Amit

This all seems a pretty drastic decision based solely on spotting a very
specific type of error that is unlikely to happen in practice, and which
will immediately be found in testing.

Why just check for negative values? What happens if the caller passes
0x0bad0bad0bad0bad as the size? It is a perfectly good value in a 64-bit
"long", yet just as unrealistic for a memory size as -1. You are
drawing arbitrary lines that I don't think help anyone.

A

Nov 17, 2022, 5:43:07 AM
>
> - at the time it was not unusual to have objects which needed
> the full unsigned range for expressing their size:
> like declaring a 60 kilobyte array under the "small"
> memory model on an Intel 8088 PC. The size would be a 16
> bit type, which would have to be unsigned to capture the
> 60 kilobyte size.
>

The Intel 8088 came in 1979, but the C standard library came with C89 in 1989. In 1989, the i486 was released. The size of "int" on the i486 was 32 bits, and RAM size was probably 4 MB (22 bits). I don't think that in 1989 the C standard would even have thought about the Intel 8088 (it was probably obsolete by 1989).

In fact, the i386 came in 1985, and the size of "int" on it was probably also 32 bits.

So, I don't think C89 would have thought about the Intel 8088.

Amit

A

Nov 17, 2022, 5:48:26 AM

> This all seems a pretty drastic decision based solely on spotting a very
> specific type of error that is unlikely to happen in practice, and which
> will immediately be found in testing.

> Why just check for negative values? What happens if the caller passes
> 0x0bad0bad0bad0bad as the size? It is perfectly good value in a 64-bit
> "long", yet just as unrealistic for a memory size as -1. You are
> drawing arbitrary lines that I don't think help anyone.

Let's say that you have to design an allocate_memory() function. Would you choose a size_t argument over long and WHY?

Amit

Bart

Nov 17, 2022, 6:30:22 AM
I'd assume a 64-bit target and use u64 or the C equivalent.

I really wouldn't bother with 'long' at all, which on Windows 64 is 32
bits anyway.

A

Nov 17, 2022, 6:52:13 AM
So, the reason for not using long is that it is 32 bits on Windows 64. Let's assume that long is 64 bits on Windows. Then would you use long? If not, WHY?

No one is actually giving any compelling reason(s) for using size_t over long in malloc().

Amit

Bart

Nov 17, 2022, 7:27:54 AM
Because malloc takes a size_t parameter?

Mark Bluemel

Nov 17, 2022, 7:28:42 AM
On Thursday, 17 November 2022 at 11:52:13 UTC, A wrote:

> No one is actually giving any compelling reason(s) for using size_t over long in malloc().

Your definition of compelling seems very personal, and can largely be ignored, I feel.
I'm not sure that I'm going to worry too much about the views of someone who can't recognise the distinction between a standard (the C library definition) and an implementation (glibc).

The most compelling reason for malloc to take a size_t argument is that that is what the standard mandates. It's a little late to try and debate the decision.

A

Nov 17, 2022, 7:29:04 AM
Let me ask a similar question.

Please see the function below:

void *allocate_memory(long size)
{
    if (size < 0) {
        errno = -ENEGATIVESIZE;
        return NULL;
    }
    return (malloc((size_t)(size)));
}

Would you ask the developer to use size_t instead of long and WHY?

Amit

A

Nov 17, 2022, 7:34:55 AM
On Thursday, 17 November 2022 at 17:58:42 UTC+5:30, Mark Bluemel wrote:
> On Thursday, 17 November 2022 at 11:52:13 UTC, A wrote:

> I'm not sure that I'm going to worry too much about the views of someone who can't recognise the distinction between a standard (the C library definition) and an implementation (glibc).

How did you come to the conclusion that I can't recognize the distinction between a standard (the C library definition) and an implementation (glibc)?

Please explain.

Amit

Bart

Nov 17, 2022, 7:46:42 AM
On 17/11/2022 12:28, A wrote:
> On Thursday, 17 November 2022 at 17:22:13 UTC+5:30, A wrote:
>> On Thursday, 17 November 2022 at 17:00:22 UTC+5:30, Bart wrote:
>>> On 17/11/2022 10:48, A wrote:
>>>>
>>>>> This all seems a pretty drastic decision based solely on spotting a very
>>>>> specific type of error that is unlikely to happen in practice, and which
>>>>> will immediately be found in testing.
>>>>
>>>>> Why just check for negative values? What happens if the caller passes
>>>>> 0x0bad0bad0bad0bad as the size? It is perfectly good value in a 64-bit
>>>>> "long", yet just as unrealistic for a memory size as -1. You are
>>>>> drawing arbitrary lines that I don't think help anyone.
>>>>
>>>> Let's say that you have to design an allocate_memory() function. Would you choose a size_t argument over long and WHY?
>>> I'd assume a 64-bit target and use u64 or the C equivalent.
>>>
>>> I really wouldn't bother with 'long' at all, which on Windows 64 is 32
>>> bits anyway.
>> So, the reason for not using long is that it is 32 bits on Windows 64. Lets assume that long is 64 bits on Windows. Then would you use long? If not WHY?
>>
>> No one is actually giving any compelling reason(s) for using size_t over long in malloc().
>>
>> Amit
>
> Let me ask a similar question.
>
> Please see the function below:
>
> void *allocate_memory(long size)
> {
>     if (size < 0) {

Why is the parameter signed?

>         errno = -ENEGATIVESIZE;
>         return NULL;
>     }
>     return (malloc((size_t)(size)));
> }
>
> Would you ask the developer to use size_t instead of long and WHY?

As I said, on Windows 'long' is always 32 bits. You can't change that,
so it would be silly to use what will be an i32 type on a 64-bit machine.

AIUI, 'size_t' is usually defined as either u32 or u64 depending on
whether the platform is 32 or 64 bits. So it's a conditional type.

I personally don't care about 32-bit systems, and for such a function,
I'd use u64, or even i64, but then you might need that check.

However, I think it's pointless to check for a size of -1, but not for
one of 2**52 for example. Passing -1 (0xFFFFFFFFFFFFFFFF interpreted as
unsigned) to malloc will make it fail too.

Ben Bacarisse

Nov 17, 2022, 7:48:05 AM
Amit <amitchou...@gmail.com> writes:

> I prefer long over size_t.

Hmm... ok, but you probably can't avoid using it. The argument to
malloc is a size_t, for example, as is the result of strlen.

> This is because, in case the user passes a negative number by mistake
> then I will be able to check it if the type is long and return
> immediately.

You can do the same check with size_t. Any value > SIZE_MAX/2 is
probably the result of converting a negative value to size_t. You are
cutting out half of the possible sizes, but you are doing that (for the
common widths of these types) already by excluding negative values of
type long.
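This check can be sketched as follows (an illustration only; `xmalloc_checked` is an invented name, not from the thread):

```c
#include <stdint.h>
#include <stdlib.h>

/* Reject implausible sizes: any value above SIZE_MAX / 2 is almost
   certainly a negative signed value that was converted to size_t. */
void *xmalloc_checked(size_t size)
{
    if (size > SIZE_MAX / 2)
        return NULL;   /* looks like a converted negative value */
    return malloc(size);
}
```

This keeps the standard size_t interface while still catching the malloc(-1) mistake the thread worries about.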

The only case where you would not want to lose half the range like this
is when size_t is shorter than long. But on those implementations,
there is probably some non-trivial cost to doing long arithmetic so on
these systems you might prefer to use size_t as it was intended to be
used. I'm prepared to bet your code would not run on such a system
anyway given that you assume -1 to be 0xFFFFFFFF!

<cut>
> I know that original glibc authors had chosen size_t, so there must be
> some reason for that, however that reason is not clear to me.

size_t is not a glibc invention. It's from the very first C standard
(1989), when machines were much smaller. On a 16-bit implementation,
size_t might be only 16 bits wide, so making it unsigned was worthwhile.
At the same time, insisting on using long would have incurred
unnecessary run-time costs, since long might not be supported by the
hardware.

--
Ben.

Richard Damon

Nov 17, 2022, 7:59:46 AM
But with an unsigned type, you CAN'T pass a negative number, because as you
have said, it becomes a very big number, which for something like
malloc should simply produce an error return.

>
> Some people might say that user should check for value being negative but for checking for negative values long is required, not size_t. So, then the user will end up using long instead of size_t. So, in effect using size_t in malloc is not correct (unless we get further insight that explains why size_t is correct in malloc()).
>
> Amit

But unsigneds CAN'T be negative, just too big to be reasonable, which
would also need to be checked with a long.

Thus with unsigned sizes you can make FEWER checks for bad values: you
only need to check that it isn't too big, not that it is negative.


Note, before we got to 64-bit computers, it was sometimes reasonable to
ask for more than half of the maximum unsigned value, so
making size_t signed would have been an error.

A

Nov 17, 2022, 8:10:00 AM
Is there an example of this?

As far as I know, malloc(size_t) was introduced in 1989 with C89 standard. At that time, i486 was also released. In 1985, i386 was released. They both had size of "long" as 32 bits but RAM size was only about 4 MB (22 bits) or 8 MB (23 bits).

So, I don't think that your comment is correct unless you can give an example.

Regards,
Amit

Ike Naar

Nov 17, 2022, 8:15:26 AM
On 2022-11-17, Amit <amitchou...@gmail.com> wrote:
> But if size_t is used, then most probably, it will result in a crash -
> like malloc(-1) will crash the program because unsigned -1 is 0xFFFFFFFF
> and this much memory is not available on today's computers and probably
> may not be available at all in future also (RAM size of 2^64 bits is really
> really huge).

If malloc cannot allocate the requested size, it does not crash, it returns
a null pointer.
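This is easy to check with a small sketch (an illustration added here, not from the thread; `absurd_malloc_fails` is an invented name):

```c
#include <stdlib.h>

/* An absurd request: malloc reports failure by returning a null
   pointer; it does not crash the program. */
int absurd_malloc_fails(void)
{
    void *p = malloc((size_t)-1);   /* SIZE_MAX bytes: cannot succeed */
    int failed = (p == NULL);
    free(p);                        /* free(NULL) is a harmless no-op */
    return failed;
}
```

On typical hosted implementations (e.g. glibc, which rejects requests larger than PTRDIFF_MAX) the function returns 1.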

> Another thing is that if size_t is used an array index then array[-1]
> will result in wrong behavior or program crash. But with long, the
> developer can check whether the index is negative, thus avoiding program
> crash.

The array index should be in the range [0 .. N-1] where N is the number
of elements in the array.
Passing an index < 0 or an index >= N is an error, so don't do that.
If the index is an unsigned type, it cannot be < 0, but you should still
be sure it's < N.
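The point about unsigned indices can be made concrete (a sketch; `N` and `in_bounds` are invented for illustration): with an unsigned index, one comparison rejects both "negative" and too-large values.

```c
#include <stddef.h>

#define N 10   /* hypothetical array length */

/* A single unsigned comparison covers both error cases: a negative
   index converted to size_t wraps to a huge value, which also
   fails the i < N test. */
int in_bounds(size_t i)
{
    return i < N;
}
```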

Paul N

Nov 17, 2022, 8:24:23 AM
On Thursday, November 17, 2022 at 12:29:04 PM UTC, A wrote:
> Please see the function below:
>
> void *allocate_memory(long size)
> {
>     if (size < 0) {
>         errno = -ENEGATIVESIZE;
>         return NULL;
>     }
>     return (malloc((size_t)(size)));
> }

If you really want to do this test, you could always do:

void *allocate_memory(size_t size)
{
    if ((long)size < 0) {
        errno = -ENEGATIVESIZE;
        return NULL;
    }
    return (malloc(size));
}

Bart

Nov 17, 2022, 8:45:22 AM
On 17/11/2022 13:09, A wrote:
The comment was about the 32-bit machines just before 64-bit ones. And
those 32-bit machines could easily have more than 4GB memory in total,
and up to 4GB per task.

So asking for a 3GB allocation was reasonable, but this needs a 32-bit
unsigned value to represent.

(Windows 32 would limit the address space for user code to 2GB, but
other systems can work differently. What you don't want is to throw away
half the possible available memory just because of choosing i32 over u32.)

Scott Lurndal

Nov 17, 2022, 10:01:23 AM
Amit <amitchou...@gmail.com> writes:
>Hi,

>
>I know that original glibc authors had chosen size_t, so there must be some
>reason for that, however that reason is not clear to me.

size_t predates glibc by several years. size_t was specified by the
SVID and adopted into POSIX and XPG.

It defines the size of an object/region, and it is defined such that it can
describe the largest object/region that can be created or allocated by
the implementation.

A negative size is not sensible.

Lew Pitcher

Nov 17, 2022, 10:24:35 AM
POSIX now defines a ssize_t, a signed integer type which is "used for a
count of bytes or an error indication", with negative values indicating
error conditions, and positive (or zero) values indicating a "count of
bytes".
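POSIX's read() is the canonical user of this convention: its ssize_t return value can carry either a byte count or an error. A sketch (the helper `classify_count` is invented for illustration):

```c
#include <string.h>
#include <sys/types.h>   /* ssize_t (POSIX) */

/* POSIX read() returns ssize_t: -1 on error, 0 at end of file,
   otherwise a non-negative count of bytes. Branching on the sign
   is possible precisely because the type is signed; a size_t
   return could not express the error case this way. */
const char *classify_count(ssize_t n)
{
    if (n < 0)  return "error";
    if (n == 0) return "eof";
    return "data";
}
```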


--
Lew Pitcher
"In Skills, We Trust"

Kaz Kylheku

Nov 17, 2022, 10:49:07 AM
Because earlier you were writing nonsense about how the glibc authors
chose the size_t parameter for malloc, and why they did so.

Kaz Kylheku

Nov 17, 2022, 10:56:05 AM
On 2022-11-17, A <amit2342...@gmail.com> wrote:
> Let me ask a similar question.
>
> Please see the function below:
>
> void *allocate_memory(long size)
> {
>     if (size < 0) {
>         errno = -ENEGATIVESIZE;

errno values aren't negated; might you have been studying Linux kernel
code recently?

The existing ERANGE value might be suitable here.

>         return NULL;
>     }
>     return (malloc((size_t)(size)));
> }
>
> Would you ask the developer to use size_t instead of long and WHY?

It depends on the expected uses and portability requirements
for the code.

If the code had to be portable to 32 bit systems, and it was expected
that the allocator would have to handle huge objects, I would flag it
as a problem. With a 32 bit long, you cannot express an allocation
larger than 2Gb, whereas a 32 bit size_t will let you do that.

A possible portability concern would be 64 bit targets that have not
chosen 64 bits for the long type.

Note also that the "sizeof" operator in C yields a value of type size_t.
So when we write malloc(sizeof (some_type)), the sizeof expression
yields exactly the type which malloc expects. That's a powerful
argument to me. While you may get to define your own allocation
function, you don't get to redefine how sizeof works.
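That sizeof argument in code form (a sketch; `struct point` and `make_points` are invented for illustration):

```c
#include <stdlib.h>

struct point { double x, y; };

/* sizeof yields a size_t, so count * sizeof (struct point) is a
   size_t expression that matches malloc's parameter exactly;
   no signed/unsigned conversion happens anywhere. (Overflow of
   the multiplication is not checked in this sketch.) */
struct point *make_points(size_t count)
{
    return malloc(count * sizeof (struct point));
}
```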

Kaz Kylheku

Nov 17, 2022, 11:17:35 AM
On 2022-11-17, A <amit2342...@gmail.com> wrote:
> On Thursday, 17 November 2022 at 18:29:46 UTC+5:30, Richard Damon wrote:
>> On 11/17/22 3:28 AM, Amit wrote:
>>
>>
>> Note, before we got to 64 bit computers, it was sometimes reasonable to
>> ask for more than 1/2 of the max value of the max unsigned value, so
>> making size_t signed would have been an error.
>
> Is there an example of this?
>
> As far as I know, malloc(size_t) was introduced in 1989 with C89
> standard.

ANSI C wasn't suddenly produced in 1989; it was ratified that year after
years of standardization.

Lots of 16-bit systems existed in that era and were programmed in C.

There are still 16-bit microcontrollers with C toolchains today.
The most recent C standard allows implementations to be
conforming which have a 16-bit size_t, and the library continues
to be organized accordingly.

The 8088 itself was used in some embedded systems, long after it
disappeared from new personal computers.

IBM-PC compatibles continued to run 16 bit platforms: MS-DOS was
still used in 1989. Windows 3.0 was released in May, 1990.

In 1990 I had a programming job, which was on 16-bit DOS. I was
wrestling with near and far pointers and such. That crud was still
everywhere, and development for it was going on; it took some years
after that for 16-bit PCs to begin disappearing.

I was doing contract work in 1995 in various companies, and distinctly
remember installations of Windows for Workgroups. Windows NT and 95
weren't brought in everywhere overnight.

> At that time, i486 was also released. In 1985, i386 was
> released. They both had size of "long" as 32 bits but RAM size was
> only about 4 MB (22 bits) or 8 MB (23 bits).

Eventually, 32-bit x86 machines did have RAM sizes of 4Gb and beyond,
where some kinds of programs that malloced over 2Gb made sense.
(Not to mention that it's possible to do with virtual memory before
you actually have that much RAM.)

Ben Bacarisse

Nov 17, 2022, 11:20:42 AM
This won't work if long is wider than size_t (as is permitted).

> }
> return (malloc(size));
> }

--
Ben.

Ben Bacarisse

Nov 17, 2022, 11:59:50 AM
A <amit2342...@gmail.com> writes:

> Let me ask a similar question.
>
> Please see the function below:
>
> void *allocate_memory(long size)
> {
>     if (size < 0) {
>         errno = -ENEGATIVESIZE;

It's better to avoid using names starting with E since implementations
may add their own to errno.h.

My main objection is that this should not be reported as an allocation
failure. This *should* "crash" the program. I'd have an assert in
there to force termination during testing.

>         return NULL;
>     }
>     return (malloc((size_t)(size)));

Why all the brackets? return malloc((size_t)size); is clearer, isn't it?

> }
>
> Would you ask the developer to use size_t instead of long and WHY?

If the sort of bug you are hoping to catch is at all likely, I'd try to
improve testing. I'd have a testing malloc function that has

assert(size < 0x10000);

(or whatever is a reasonable maximum for this program) at the top, so as
to catch all these bugs as early as possible. I most certainly don't
want code that detects a NULL return and then does something if errno is
-ENEGATIVESIZE. It's not a condition I want the code to handle like an
allocation failure!

--
Ben.

Keith Thompson

Nov 17, 2022, 1:03:11 PM
In 1989, the Intel 8088 was 10 years old, and was still in extensive
use. Of course the ANSI C committee would have thought about 8088-based
systems. Even today, there are embedded systems with small word sizes.

And on 64-bit Windows, size_t is 64 bits but long is only 32 bits.

C implementations are far more varied than the ones you or I have
encountered.

--
Keith Thompson (The_Other_Keith) Keith.S.T...@gmail.com
Working, but not speaking, for XCOM Labs
void Void(void) { Void(); } /* The recursive call of the void */

Chris M. Thomasson

Nov 18, 2022, 12:48:37 AM
On 11/16/2022 10:20 PM, Amit wrote:
> Hi,
>
> I prefer long over size_t.
>
> This is because, in case the user passes a negative number by mistake then I will be able to check it if the type is long and return immediately.
>
> But if size_t is used, then most probably, it will result in a crash - like malloc(-1) will crash the program because unsigned -1 is 0xFFFFFFFF and this much memory is not available on today's computers and probably may not be available at all in future also (RAM size of 2^64 bits is really really huge).
[...]

Patient: Doctor, it hurts when I malloc(-1)...

Doctor: Well, then don't do that...

;^)

Lynn McGuire

Nov 18, 2022, 1:27:02 AM
On 11/17/2022 12:20 AM, Amit wrote:
> Hi,
>
> I prefer long over size_t.
>
> This is because, in case the user passes a negative number by mistake then I will be able to check it if the type is long and return immediately.
>
> But if size_t is used, then most probably, it will result in a crash - like malloc(-1) will crash the program because unsigned -1 is 0xFFFFFFFF and this much memory is not available on today's computers and probably may not be available at all in future also (RAM size of 2^64 bits is really really huge).
>
> Another thing is that if size_t is used an array index then array[-1] will result in wrong behavior or program crash. But with long, the developer can check whether the index is negative, thus avoiding program crash.
>
> So, in my opinion, long should be used instead of size_t.
>
> I know that original glibc authors had chosen size_t, so there must be some reason for that, however that reason is not clear to me.
>
> Amit

Are longs and long longs the same in Unix?

Thanks,
Lynn

Amit

Nov 18, 2022, 1:36:19 AM

On Thursday, November 17, 2022 at 9:19:07 PM UTC+5:30, Kaz Kylheku wrote:
> On 2022-11-17, A <amit2342...@gmail.com> wrote:
> >
> > How did you came to the conclusion that I can't recognize the
> > distinction between a standard (the C library definition) and an
> > implementation (glibc)?
>
> Because earlier you were writing nonsense about how the glibc authors
> chose the size_t parameter for malloc, and why did they do so.

Kaz,

You are a stupid person.

Amit

Amit

Nov 18, 2022, 1:39:44 AM
On Thursday, November 17, 2022 at 7:15:22 PM UTC+5:30, Bart wrote:
> On 17/11/2022 13:09, A wrote:
> > On Thursday, 17 November 2022 at 18:29:46 UTC+5:30, Richard Damon wrote:
> >> On 11/17/22 3:28 AM, Amit wrote:
> >>
> >>
> >> Note, before we got to 64 bit computers, it was sometimes reasonable to
> >> ask for more than 1/2 of the max value of the max unsigned value, so
> >> making size_t signed would have been an error.
> >
> > Is there an example of this?
> >
> > As far as I know, malloc(size_t) was introduced in 1989 with C89 standard. At that time, i486 was also released. In 1985, i386 was released. They both had size of "long" as 32 bits but RAM size was only about 4 MB (22 bits) or 8 MB (23 bits).
> >
> > So, I don't think that your comment is correct unless you can give an example.
> The comment was about the 32-bit machines just before 64-bit ones. And
> those 32-bit machines could easily have more then 4GB memory in total,
> and up to 4GB per task.
>
> So asking for a 3GB allocation was reasonable, but this needs a 32-bit
> unsigned value to represent.
>

Did some application/software really ask for a 3 GB allocation in one go? Do you know of any such application/software?

Amit

Ike Naar

Nov 18, 2022, 3:19:29 AM
On 2022-11-18, Amit <amitchou...@gmail.com> wrote:
> Did some application/software really ask for 3 GB allocation in one go?
> Do you know of any such application/software?

Here's one:

------8<------ start application/software.c ------>8------
#include <stdio.h>
#include <stdlib.h>

void try(size_t size)
{
    printf("try: size as size_t: %zu, size as long: %ld\n", size, (long) size);
    char * const p = malloc(size);
    printf("try: malloc(%zu) = %p\n", size, p);
    free(p);
}

int main(void)
{
    printf("sizeof (long) = %zu\n", sizeof (long));
    printf("sizeof (size_t) = %zu\n", sizeof (size_t));
    size_t const giga = 1024 * 1024 * 1024;
    try(3 * giga);
    return 0;
}
------8<------ end application/software.c ------>8------

$ gcc -m32 application/software.c
$ ./a.out
sizeof (long) = 4
sizeof (size_t) = 4
try: size as size_t: 3221225472, size as long: -1073741824
try: malloc(3221225472) = 0x30800000

David Brown

Nov 18, 2022, 3:56:42 AM
The history of both "malloc" and C goes back a lot longer than the first
ANSI C standard.

And in 1989 there were 64-bit computers with 32 GB of RAM. Not many, to
be fair - Cray was not a mass-market supplier - but they existed.

Addresses had already been used for far bigger ranges than physical
memory, however, as people used virtual memory for large tasks.


For 16-bit systems, it is entirely reasonable to ask for more than half
the address range. "size_t" is 16-bit unsigned, while "long" will be
32-bit (the minimum size allowed), and programs may deal with objects
that are greater than 32 KB but less than the full 64K address range.

We did have 64-bit systems before 4 GB ram became an economically and
physically practical size of memory. But 32-bit processors were still
common with a long overlap with 64-bit processors in the mass market,
and the Windows world was especially limited. Normal 32-bit Windows
limited user space to 2 GB, and thus a signed 32-bit size would be
enough, but IIRC 32-bit server versions of Windows supported a 3/1 GB
split between user and kernel space. That was common on 32-bit Linux too.


This is all history, of course. The main reasons to use "size_t"
instead of "long" for sizes in C are that it is the standard practice
and you'd need a /very/ good reason for doing something else, and that
on Windows systems, "long" is not big enough - you'd need "long long".
And on 32-bit systems, "long long" would be too big. You need a type
that is "just right" for all systems - "size_t".



David Brown

Nov 18, 2022, 4:19:36 AM
On 17/11/2022 11:43, A wrote:
>>
>> - at the time it was not unusual to have objects which needed
>> the full unsigned range for expressing their size:
>> like declaring a 60 kilobyte array under the "small"
>> memory model on an Intel 8088 PC. The size would be a 16
>> bit type, which would have to be unsigned to capture the
>> 60 kilobyte size.
>>
>
> Intel 8088 came in 1979. But C standard library came with C89 in 1989. In 1989, i486 was released. Size of "int" on i486 was 32 bits and probably RAM size was 4 MB (22 bits). I don't think that in 1989 C standard would have even thought about Intel 8088 (Intel 8088 would have been probably obsolete by 1989).
>
> In fact, i386 came in 1985 and size of "int" on it was also probably 32 bits.
>
> So, I don't think C89 would have thought about Intel 8088.
>
> Amit

The 80386 and 80486 were used primarily as 16-bit systems, due to the
incredible slowness of innovation in the Microsoft world - it wasn't
until 2001 (with XP) that the most common versions of MS's OS's were
actually 32-bit at core. They had hybrid Frankenstein's monsters with
16-bit kernel and some 32-bit user-land code (from Windows 3 through to
Windows ME) or with 32-bit kernel and some 16-bit user-land code
(original NT).

So for most people, "int" on the 80386 and 80486 was 16-bit.



And your history of C is bizarre. The ANSI C standard in 1989 was a
standardisation of /existing/ C common usage - it did not suddenly
invent C and its standard library out of thin air.

You also miss the point that the C standard is not remotely interested
in what happens in the DOS/Windows/x86 world - it is designed to work
with an enormous range of systems. That includes 8-bit devices, which
still ship in similar numbers to 16-bit and 32-bit devices. (64-bit
devices are minor in comparison, in terms of device numbers. Sale
prices are a different matter entirely.)

David Brown

unread,
Nov 18, 2022, 4:30:29 AM11/18/22
to
On 17/11/2022 11:48, A wrote:
>
>> This all seems a pretty drastic decision based solely on spotting a very
>> specific type of error that is unlikely to happen in practice, and which
>> will immediately be found in testing.
>
>> Why just check for negative values? What happens if the caller passes
>> 0x0bad0bad0bad0bad as the size? It is perfectly good value in a 64-bit
>> "long", yet just as unrealistic for a memory size as -1. You are
>> drawing arbitrary lines that I don't think help anyone.
>
> Let's say that you have to design an allocate_memory() function. Would you choose a size_t argument over long and WHY?
>
> Amit

Reason number one for choosing "size_t" is that it is what programmers
expect, fits with all existing memory allocation functions (such as
those in the standard library), fits the common usage of C programmers
when dealing with something that is the "size of something", and is the
type returned by the "sizeof" operator.

Reason number two would be that it is the right range for the purpose on
a wide variety of systems - including those where "size_t" is smaller
than "long", and those where it is bigger than "long", as these are all
in mainstream usage.

Reason number three is that sometimes (admittedly not often) the range
of "long" is not enough. I still have a 32-bit computer with 8 GB ram
under my desk at home, though it's a number of years since I last had it
switched on.


So, lots of benefits of "size_t". And what is the benefit of "long",
assuming you happen to be on a target where "long" is the same size as
"size_t" ? No one is going to pass a negative value to the function -
it's not going to happen intentionally because it makes no sense, and it
is extremely unlikely to happen by accident. (Testing for 0 might be
appropriate.) So in reality, it gives no advantage whatsoever.

David Brown

unread,
Nov 18, 2022, 4:38:08 AM11/18/22
to
No.

They are the same size and range in all 64-bit *nix systems (AFAIK), but
on 32-bit *nix systems they are different (long is 32-bit, long long is
64-bit). That also applies to the x32 ABI for Linux on amd64 processor,
where you have 64-bit registers but 32-bit pointers.

Even if the size and range are the same, they are still different types
- a "long *" and a "long long *" pointer are incompatible.


A

unread,
Nov 18, 2022, 4:46:16 AM11/18/22
to

On Friday, 18 November 2022 at 14:49:36 UTC+5:30, David Brown wrote:
> On 17/11/2022 11:43, A wrote:
> >>
>
> And your history of C is bizarre. The ANSI C standard in 1989 was a
> standardisation of /existing/ C common usage - it did not suddenly
> invent C and its standard library out of thin air.
>

I was talking in context of glibc. glibc 0.1 came out in 1991. In 1988, glibc pre-release appeared.

Link: https://sourceware.org/glibc/wiki/Glibc%20Timeline

Amit

tTh

unread,
Nov 18, 2022, 5:02:05 AM11/18/22
to
On 11/18/22 10:19, David Brown wrote:

>
> So for most people, "int" on the 80386 and 80486 was 16-bit.

I remember, back in the 80's, working on DOS software
with the Lattice C compiler, where int was 32 bits.
Don't remember if it was the default or an option, but
very nice when your software had to run on both PC-DOS and
Atari ST :)

tTh

--
+------------------------------------------------------------------+
| https://danstonchat.com/1138.html |
+------------------------------------------------------------------+

David Brown

unread,
Nov 18, 2022, 7:02:09 AM11/18/22
to
On 18/11/2022 10:46, A wrote:
>
> On Friday, 18 November 2022 at 14:49:36 UTC+5:30, David Brown wrote:
>> On 17/11/2022 11:43, A wrote:
>>>>
>>
>> And your history of C is bizarre. The ANSI C standard in 1989 was a
>> standardisation of /existing/ C common usage - it did not suddenly
>> invent C and its standard library out of thin air.
>>
>
> I was talking in context of glibc. glibc 0.1 came out in 1991. In 1988, glibc pre-release appeared.
>

So when you wrote "C standard library came with C89 in 1989", you meant
to write "glibc was released in 1991" ? It is /really/ difficult to
figure out what you are trying to say here.

Regardless of history, every C implementation of malloc() uses "size_t"
because that is what the C standards have always said was required for
the standard library. It's that simple.

Pretty much every other allocation function written for C has followed
the same style and used "size_t", because it is the appropriate type for
the task.


Richard Damon

unread,
Nov 18, 2022, 7:33:20 AM11/18/22
to
And the glibc library was supposed to be an implementation of the C
Standard Library, so the function signatures were defined by the C
Standard.

To deviate would have been an error.

A

unread,
Nov 18, 2022, 7:55:36 AM11/18/22
to
My assumption was that the original glibc authors would have been part of the C89 standardization committee.

Amit

A

unread,
Nov 18, 2022, 7:56:24 AM11/18/22
to
On Friday, 18 November 2022 at 17:32:09 UTC+5:30, David Brown wrote:
> On 18/11/2022 10:46, A wrote:
> >
> > On Friday, 18 November 2022 at 14:49:36 UTC+5:30, David Brown wrote:
> >> On 17/11/2022 11:43, A wrote:
> >>>>
> >>
> >> And your history of C is bizarre. The ANSI C standard in 1989 was a
> >> standardisation of /existing/ C common usage - it did not suddenly
> >> invent C and its standard library out of thin air.
> >>
> >
> > I was talking in context of glibc. glibc 0.1 came out in 1991. In 1988, glibc pre-release appeared.
> >
> So when you wrote "C standard library came with C89 in 1989", you meant
> to write "glibc was released in 1991" ? It is /really/ difficult to
> figure out what you are trying to say here.
>

Scott Lurndal

unread,
Nov 18, 2022, 9:50:04 AM11/18/22
to
You know what they say about assumptions.

Paul

unread,
Nov 18, 2022, 1:21:26 PM11/18/22
to
Windows XP (x86), allows the address space split to be changed.

The default virtual addresses are split 2GB userspace, 2GB kernelspace.

You can change this to 3GB userspace, 1GB kernelspace, but
this may cause various issues with the operation of the OS
for day-to-day activity (at least on WinXP it does -- the design is a
bit better on Windows 7).

I made that modification once, to get 32-bit Firefox to build on my WinXP x86
machine. The build completed; the linking stage for XUL.dll
took every ounce of RAM the machine had. So at least the machine
behaved well enough to finish a Firefox build that way.

As for hobby programming projects, this might help...

/* gcc -o malloc.exe -Wl,--large-address-aware malloc.c */

So once you change the split on your Windows XP machine, that's
how a little hobby program could use all 3GB of addresses.

And you may notice something a little weird about this 2:2
or 3:1 thing, as you don't get "exactly 2" or "exactly 3".
But I can't figure out why virtual addressing would have
any "holes cut in it", so I can't explain any size
discrepancy you might find. My test program, on 2:2 case, notes...

01919 megabytes t=001.263926

2048 minus 1919 is about 128MB. The test program in that
case, was very small, so it does not need 128MB of
address space just for the read-only code segment.

The result was not as pure as I had hoped. I first noticed
this on Photoshop and I blamed it on "bloated" code, but
that does not seem to be the reason. Even a small test program,
doesn't get to use the amount of space you might expect.

Paul

Opus

unread,
Nov 18, 2022, 2:12:50 PM11/18/22
to
Just a note regarding this, on top of all that has been already said:
the above would be easily caught by a decent compiler. Unfortunately,
for historical reasons, GCC (and LLVM, which strives to behave the same
as GCC) will not emit a warning for this (yes even with '-Wall') unless
you explicitly pass the additional '-Wconversion' warning flag. And I
highly recommend it. Of course, with this flag enabled, a crapton of
existing code is bound to throw warnings all over the place. Usually
because said code is very flaky.

You can also use third-party static analyzers, which would easily catch
this and a LOT of other potentially problematic stuff in your code. Do
not hesitate to use them. Heck, some are even free.

Chris M. Thomasson

unread,
Nov 18, 2022, 2:21:18 PM11/18/22
to
A volumetric renderer.

James Kuyper

unread,
Nov 19, 2022, 1:41:49 AM11/19/22
to
On 11/17/22 03:16, Amit wrote:
> On Thursday, November 17, 2022 at 1:14:49 PM UTC+5:30,
> james...@alumni.caltech.edu wrote:
>> On 11/17/22 01:20, Amit wrote:
>>> Hi,
>>> I prefer long over size_t.
>> For which purpose? C supports a large number of different types,
>> precisely because different types are preferred for different
>> purposes. More often than not, the type I need to use is dictated by
>> the interfaces to standard and third-part libraries that I'm using.
>> When I do have a choice in the matter, I prefer unsigned types like
>> size_t in any context where bitwise operations are involved, because
>> many of those operations can have undefined behavior under certain
>> circumstances when using a signed types, but not with unsigned types.
>> Unsigned types are often preferred when the quantity in question is
>> incapable of being negative, though care must be used when applying
>> this criterion. Operations involving quantities that cannot be
>> negative, may have results that can be negative. If such operations
>> are relevant to a particular variable, that variable should have a
>> signed type.
>
> I am not talking about signed vs unsigned.

True, but signed vs. unsigned is the most important difference between
long and size_t that would affect decisions about which one to use. The
only other difference is that long is guaranteed to be able to hold any
value between -2147483648 and 2147483647, which requires at least 32
bits, while size_t is only required to have at least 16 bits (and on
some systems is in fact that small).

> What reasons glibc original authors had when they used size_t in these
> functions instead of long.

They wanted to match the existing implementations of malloc(). Keep in
mind that C dates back to the early 70s. It was released to the general
public by the publication of "The C programming language" in 1978. I
started programming in C in 1979. What we now call the C standard
library had already started taking form at that time, though it wasn't
standardized until 1989. I'm not sure when in the process malloc() was
added to that library, but I believe it was well before glibc came out
in the late 1980s.

> If we don't know those reasons then if a developer has to implement a
> function called allocate_memory() then which implementation should he
> use and WHY?

Well, if he's creating an implementation of C, he should use the
interface that has been specified for malloc().

If you were instead to ask why the people who designed malloc() chose
size_t, I would guess that it's because it is never valid to request a
negative amount, so they chose an interface that guarantees that it will
never receive a negative amount. It's also never valid for the size of a
type to be negative, which is why the value of a sizeof expression has the
type size_t. In most cases, the value which should be passed to a
malloc() call should be the result of a calculation involving size_t.
Because of the way C's usual arithmetic conversions work, there's a very
good chance that any such calculated value has, itself, a type of
size_t, which is how it all fits together.


James Kuyper

unread,
Nov 19, 2022, 1:42:14 AM11/19/22
to
On 11/17/22 03:28, Amit wrote:
> On Thursday, November 17, 2022 at 1:46:14 PM UTC+5:30, Amit wrote:
...
>> Amit
>
> To shorten the discussion and making it to the point, I would like to
> know why is size_t used in malloc() when a negative value (passed by
> user by mistake) can crash the program. Using long and checking for
> negative values can prevent the program from crashing.

It shouldn't crash the program. Any negative value passed to malloc()
automatically gets converted to size_t. Such conversions always succeed
and have a well-defined result. Any value of type size_t is a valid
argument for malloc(). If that value is too large to be allocated,
malloc() may return a null value, but it is not permitted to crash the
program. Since any non-zero value passed to malloc() might result in a
failure, you should always check the value returned by malloc(), and
take appropriate action if that value is null. If you do so, a null
value shouldn't crash your program.

James Kuyper

unread,
Nov 19, 2022, 1:42:47 AM11/19/22
to
On 11/18/22 03:19, Ike Naar wrote:
...
> size_t const giga = 1024 * 1024 * 1024;

The initializer for that expression is evaluated as an int. INT_MAX is
permitted to be low enough for that expression to overflow, and that's
because there have been (and I believe, still are) real-world machines
where int had as few as 16 bits. Note that, on many such machines,
size_t also has only 16 bits, which is also permitted by the standard.
I'd recommend using a hexadecimal constant here instead.


James Kuyper

unread,
Nov 19, 2022, 1:43:25 AM11/19/22
to
On 11/17/22 10:55, Kaz Kylheku wrote:
> On 2022-11-17, A <amit2342...@gmail.com> wrote:
>> Let me ask a similar question.
>>
>> Please see the function below:
>>
>> void *allocate_memory(long size)
>> {
>> if (size < 0) {
>> errno = -ENEGATIVESIZE;
>
> errno values aren't negated; might you have been studying Linux kernel
> code recently?

Standard library routines are required to set errno to positive values,
specifically to allow users to use negative values for their own
purposes. allocate_memory() is not supposed to be a standard library
function, and therefore can use a negative value.

James Kuyper

unread,
Nov 19, 2022, 1:44:00 AM11/19/22
to
On 11/18/22 01:26, Lynn McGuire wrote:
...
> Are longs and long longs the same in Unix ?

Unix imposes only the same constraints on long and long long as the C
standard itself:

{LLONG_MAX}
Maximum value for an object of type long long.
Minimum Acceptable Value: +9223372036854775807
{LLONG_MIN}
Minimum value for an object of type long long.
Maximum Acceptable Value: -9223372036854775807

{LONG_MAX}
Maximum value for an object of type long.
Minimum Acceptable Value: +2 147 483 647
{LONG_MIN}
Minimum value for an object of type long.
Maximum Acceptable Value: -2 147 483 647

James Kuyper

unread,
Nov 19, 2022, 2:23:56 AM11/19/22
to
On 11/18/22 07:56, A wrote:
...
> My assumption was that glibc original authors would have been part of C89 standardization committee.

When C89 was being created, not a single member of the committee was
very familiar with either gcc or glibc. Many people have criticized the
committee for this, but it was actually the fault of the Free Software
Foundation (FSF).

There's multiple committees that are relevant to this question, so I'll
give a little background about the committees before directly explaining
that comment.

C89 was a US standard. It got taken over by ISO, which released almost
exactly the same text as C90. They added 3 sections at the beginning of
the standard to meet ISO requirements, which resulted in every section
number after that being increased by 3 (which means every single
cross-reference had to be updated, too), but it was otherwise almost
identical. Each member of the ISO C committee is a national
standardization organization. The single most influential member is the
US one, primarily because it has better funding and a very large number
of people working on it. When doing volunteer work, the people who can
do the most work tend to have the most influence over what gets done.

The US standardization committee is open to membership by almost anyone.
As a result, many people whose country is not represented on the ISO
committee, or whose national standards organization is too expensive to
join, participate by joining the US committee. That's one reason it has
so many members.
They only allow one committee member representing any given
organization, except for "Self", but arbitrarily many people can choose
"Self" as their organization. Non-voting membership is free, but you can
be influential by participating in discussions even if you can't vote.
Voting membership is slightly expensive for an individual, and includes
a requirement that you attend in person at least 2 of the 3 annual
meetings each year, but that's not a serious problem for any group as
big as the Free Software Foundation. Despite that fact, no one from the
FSF chose to participate in C89 (that changed in later versions of the
standard). Keep in mind that gcc and glibc were both still very new at
that time, and not many people outside FSF knew much about them.

A

unread,
Nov 19, 2022, 5:52:06 AM11/19/22
to
They do take a lot of memory. However, I doubt that they will malloc 3 GB in one go - like malloc(3 GB). I think they will do malloc() as and when required.

Amit

Tim Rentsch

unread,
Nov 19, 2022, 9:20:04 AM11/19/22
to
Makes an ass out of you and umption.

Tim Rentsch

unread,
Nov 19, 2022, 9:42:45 AM11/19/22
to
James Kuyper <james...@alumni.caltech.edu> writes:

> C89 was a US standard. It got taken over by ISO, which released
> almost exactly the same text as C90. They added 3 sections at the
> beginning of the standard to meet ISO requirements, which resulted
> in every section number after that being increased by 3 (which
> means every single cross-reference had to be updated, too), but it
> was otherwise almost identical. [...]

For the most part the ISO (C90) standard uses exactly the same
wording that was used in the ANSI (C89) standard (ignoring
differences in numbering, like you say). In places though there
are some non-trivial changes.

A

unread,
Nov 19, 2022, 9:44:10 AM11/19/22
to
People really lack group mail etiquette.

Tim,

If your statement is directed at me, then I must say that you are an asshole.

Amit

Richard Damon

unread,
Nov 19, 2022, 10:00:18 AM11/19/22
to
My understanding was that, except for the early "boiler-plate" section
added near the beginning which causes the numbering difference, the rest
of the content was adopted verbatim.

This accounts for the ISO version being adopted so quickly after the ANSI
version.

Tim Rentsch

unread,
Nov 19, 2022, 12:54:09 PM11/19/22
to
A <amit2342...@gmail.com> writes:

> On Saturday, 19 November 2022 at 19:50:04 UTC+5:30, Tim Rentsch wrote:
>
>> sc...@slp53.sl.home (Scott Lurndal) writes:
>>
>>> A <amit2342...@gmail.com> writes:

[...]

>>>> My assumption was that glibc original authors would have been
>>>> part of C89 standardization committee.
>>>
>>> You know what they say about assumptions.
>>
>> Makes an ass out of you and umption.
>
> People really lack group mail etiquettes.
>
> Tim,
>
> If your statement is directed at me, then I must say that you are
> an asshole.

My statement wasn't directed at you, nor at anyone else. Nothing
in what I said was meant to be directed at anyone. It was simply
a joke, a variation on the usual rejoinder to the word "assume".
(Incidentally, I can't take credit for the joke - it's a line
from the movie The Long Kiss Goodnight.)

Tim Rentsch

unread,
Nov 19, 2022, 1:09:24 PM11/19/22
to
My observation is this. I have a copy of ansi.c.txt, which is
supposed to be a text-based copy of the ANSI C standard (and
presumably after the ANSI C standard was ratified). Perusing
that document from time to time, and occasionally going back and
forth between that and the first ISO C document, I happened to
see passages in places that were different between the ANSI
document and the ISO document. Indeed most of the text in the
two documents is word-for-word identical, which made the few
instances where there were discrepancies stand out rather
vividly. Unfortunately I remember the fact of there being
differences but not where they are specifically. But I am quite
sure that I observed some differences in the wordings of the two
documents.

Chris M. Thomasson

unread,
Nov 19, 2022, 2:38:08 PM11/19/22
to
Loading up a high res volume, say 2048^3, does take a toll on the system...

Keith Thompson

unread,
Nov 19, 2022, 3:33:11 PM11/19/22
to
Just in case someone misunderstands, you're recommending 0x40000000, not
0x400 * 0x400 * 0x400.

--
Keith Thompson (The_Other_Keith) Keith.S.T...@gmail.com
Working, but not speaking, for XCOM Labs
void Void(void) { Void(); } /* The recursive call of the void */

Keith Thompson

unread,
Nov 19, 2022, 3:47:21 PM11/19/22
to
I don't think that's correct. The standard requires the values of EDOM,
EILSEQ, and ERANGE to be positive, but makes no such guarantee about any
implementation-defined E* macros, or about values that errno might be
set to by library function calls.

There is a convention for Linux kernel functions that implement system
calls to return negative values corresponding to E* macros, for example
`return -ENOENT;`; the wrapper typically sets errno to the corresponding
positive value and returns -1 to denote failure.

Somebody may have intended to reserve negative error values for user
code (though I've never heard of it), but that intent is not expressed
in the standard.

Keith Thompson

unread,
Nov 19, 2022, 3:57:08 PM11/19/22
to
I suggest that the most important difference is that size_t is
guaranteed to be able to hold the size of any object, and long is not.
That difference is not directly implied by the standard's guarantees
about their ranges.

[...]

If there were a good reason to dislike size_t because it's an unsigned
type, one could probably use the signed type that corresponds to it,
which is long long in some implementations. That loses the ability to
represent very large size_t values, which might or might not matter
depending on the implementation. There is no straightforward way to
determine what that signed type is. (POSIX defines ssize_t, but it's not
specified to be the signed type corresponding to size_t, so for example
the "%du" format is not guaranteed to work with it.)

Or one could use a signed type that can hold all the values that size_t
can represent, but there might not be such a type. long long can almost
certainly hold all *meaningful* size_t values.

Unsigned types can be tricky, but anyone who wants to program in C needs
to deal with their idiosyncrasies. Accidentally passing a negative
value to malloc (which will be implicitly converted to a very large
positive value) just isn't a common enough error to be worth such a
drastic solution.

Chris M. Thomasson

unread,
Nov 19, 2022, 4:03:24 PM11/19/22
to
[...]

ptrdiff_t?

Scott Lurndal

unread,
Nov 19, 2022, 4:17:50 PM11/19/22
to
Keith Thompson <Keith.S.T...@gmail.com> writes:
>James Kuyper <james...@alumni.caltech.edu> writes:
>> On 11/17/22 03:16, Amit wrote:
> (POSIX defines ssize_t, but it's not
>specified to be the signed type corresponding to size_t, so for example
>the "%du" format is not guaranteed to work with it.)

%zu?

Keith Thompson

unread,
Nov 19, 2022, 4:27:34 PM11/19/22
to
Assuming you have the same ansi.c.txt that I have, I believe it's a
pre-release draft of the 1989 ANSI C standard. The first paragraph is:

(This foreword is not a part of American National Standard for
Information Systems --- Programming Language C, X3.???-1988.)

The "???" and the 1988 date suggest to me that it had not yet been
finalized.

My original source for the file was
http://flash-gordon.me.uk/ansi.c.txt
but it's no longer there -- but a Google search indicates that
there are copies elsewhere.

I speculate that that explains any differences between ansi.c.txt and
the ISO C90 standard.

If you can find any specific differences, and if someone out there has a
copy of the published 1989 ANSI C standard (I don't), perhaps we can
clear this up.

I have a copy of Schildt's very bad book "The Annotated ANSI C
Standard", which has the text of the standard on every other page, but I
think it actually uses the 1990 ISO C standard, so it would not be
useful for this exercise -- and I can't find my copy anyway.
Background: https://www.lysator.liu.se/c/schildt.html

Keith Thompson

unread,
Nov 19, 2022, 8:41:55 PM11/19/22
to
No, I actually meant "%zd" (thanks for catching my error).

"%zu" expects an argument of type size_t. "%zd" expects an argument of
the signed type corresponding to size_t, but there's no good way to
determine what that type is.

Bonita Montero

unread,
Nov 19, 2022, 9:07:00 PM11/19/22
to
I always use int, it's the best choice for indexing on all machines.

Chris M. Thomasson

unread,
Nov 19, 2022, 10:58:59 PM11/19/22
to
On 11/19/2022 6:06 PM, Bonita Montero wrote:
> I always use int, it's the best choice for indexing on all machines.

Until the index gets larger than INT_MAX?

James Kuyper

unread,
Nov 20, 2022, 1:32:53 AM11/20/22
to
On 11/19/22 15:32, Keith Thompson wrote:
> James Kuyper <james...@alumni.caltech.edu> writes:
>> On 11/18/22 03:19, Ike Naar wrote:
>> ...
>>> size_t const giga = 1024 * 1024 * 1024;
>>
>> The initializer for that expression is evaluated as an int. INT_MAX is
>> permitted to be low enough for that expression to overflow, and that's
>> because there have been (and I believe, still are) real-world machines
>> where int had as few as 16 bits. Note that, on many such machines,
>> size_t also has only 16 bits, which is also permitted by the standard.
>> I'd recommend using a hexadecimal constant here instead.
>
> Just in case someone misunderstands, you're recommending 0x40000000, not
> 0x400 * 0x400 * 0x400.

You're right - I left out one crucial word: "... a single hexadecimal
constant ...".


James Kuyper

unread,
Nov 20, 2022, 1:33:19 AM11/20/22
to
On 11/19/22 15:47, Keith Thompson wrote:
> James Kuyper <james...@alumni.caltech.edu> writes:
...
>> Standard library routines are required to set errno to positive values,
>> specifically to allow users to use negative values for their own
>> purposes. allocate_memory() is not supposed to be a standard library
>> function, and therefore can use a negative value.
>
> I don't think that's correct. The standard requires the values of EDOM,
> EILSEQ, and ERANGE to be positive, but makes no such guarantee about any
> implementation-defined E* macros, or about values that errno might be
> set to by library function calls.

The description of errno in 7.5p2 says "... the value of which is set to
a positive error number by several library functions."

James Kuyper

unread,
Nov 20, 2022, 1:33:42 AM11/20/22
to
On 11/19/22 15:56, Keith Thompson wrote:
...
> I suggest that the most important difference is that size_t is
> guaranteed to be able to hold the size of any object, and long is not.
> That difference is not directly implied by the standard's guarantees
> about their ranges.

The standard doesn't quite say that about size_t.

The standard does say that sizeof(type) gives the size of an object of
that type, and that sizeof(expression) gives the size of an object of
the type of that expression. But it's trivial to specify a type for
which the size of that type cannot have a value within range of size_t:
sizeof(char[2][SIZE_MAX]). Such an expression cannot have the behavior
mandated by the C standard for such an expression, but it's less than
perfectly clear how it's wrong.

There's a great many functions in the C standard library that take a
size_t parameter, and which therefore cannot be used in any context
where the value to be passed to the function is larger than SIZE_MAX.



Keith Thompson

unread,
Nov 20, 2022, 1:38:28 AM11/20/22
to
You're right, I missed that -- but p3 says "The value of errno may be
set to nonzero by a library function call whether or not there is an
error, provided the use of errno is not documented in the description of
the function in this International Standard."

And I'm sure the intent is that the "additional macro definitions"
beginning with E are supposed to expand to positive constant expressions
of type int, but the standard doesn't say so. (It would IMHO be nice if
it did.)

Chris M. Thomasson

unread,
Nov 20, 2022, 1:39:44 AM11/20/22
to
Some low res experiments, some are from volumetrics. The rest are from
raw triangles in an obj file (wavefront).

https://sketchfab.com/ChrisThomasson

Works in VR... ;^) C++ is fun. Can crank out an obj file, no problem.

Love this one, a circle in 3d can either be a circle, an ellipse or a
line wrt an observing agent in the volume.

https://skfb.ly/6TYVW


a simple 3d von Koch curve:

https://skfb.ly/6TuEV

Using raw volumetric's requires big computers, so to speak... Think of
very high res DICOM.

James Kuyper

unread,
Nov 20, 2022, 1:46:14 AM11/20/22
to
On 11/17/22 08:09, A wrote:
> On Thursday, 17 November 2022 at 18:29:46 UTC+5:30, Richard Damon wrote:
>> On 11/17/22 3:28 AM, Amit wrote:
>>
>>
>> Note, before we got to 64 bit computers, it was sometimes reasonable to
>> ask for more than 1/2 of the max value of the max unsigned value, so
>> making size_t signed would have been an error.
>
> Is there an example of this?>
> As far as I know, malloc(size_t) was introduced in 1989 with C89 standard. At that time, i486 was also released. In 1985, i386 was released. They both had size of "long" as 32 bits but RAM size was only about 4 MB (22 bits) or 8 MB (23 bits).
>
> So, I don't think that your comment is correct unless you can give an example.

The C programming language is targeted at a much wider range of systems
than you seem to be willing to consider. The minimum permitted value for
SIZE_MAX is 65535, and that minimum was set precisely because, at the
time it was set, there were some systems for which a value that low was
appropriate, and such systems still exist. On systems with such small
amounts of memory, a need to allocate more than half of it at one time
might be rare, but not very rare.

Another issue to consider is that SIZE_MAX is simply the largest amount
of memory that you can request be allocated. It need not be greater than
the total amount of memory available on a system to be allocated. On
systems where SIZE_MAX is much smaller than the total amount of memory,
allocating more than SIZE_MAX/2 bytes would not be particularly uncommon.

Keith Thompson

unread,
Nov 20, 2022, 1:51:07 AM11/20/22
to
James Kuyper <james...@alumni.caltech.edu> writes:
> On 11/19/22 15:56, Keith Thompson wrote:
> ...
>> I suggest that the most important difference is that size_t is
>> guaranteed to be able to hold the size of any object, and long is not.
>> That difference is not directly implied by the standard's guarantees
>> about their ranges.
>
> The standard doesn't quite say that about size_t.

True (and I think I've actually said so here before).

> The standard does say that sizeof(type) gives the size of an object of
> that type, and that sizeof(expression) gives the size of an object of
> the type of that expression. But it's trivial to specify a type for
> which the size of that type cannot have a value within range of size_t:
> sizeof(char[2][SIZE_MAX]). Such an expression cannot have the behavior
> mandated by the C standard for such an expression, but it's less than
> perfectly clear how it's wrong.

My interpretation is that the sizeof operator can overflow, just like
most other operators can.

On the other hand, an implementation *should* IMHO make size_t big
enough to represent the size of any object it can create.

I vaguely recall a recent suggestion to require all objects to be no
more than SIZE_MAX bytes in size.

> There's a great many functions in the C standard library that take a
> size_t parameter, and which therefore cannot be used in any context
> where the value to be passed to the function is larger than SIZE_MAX.

James Kuyper

unread,
Nov 20, 2022, 1:52:02 AM11/20/22
to
On 11/20/22 01:38, Keith Thompson wrote:
> James Kuyper <james...@alumni.caltech.edu> writes:
>> On 11/19/22 15:47, Keith Thompson wrote:
>>> James Kuyper <james...@alumni.caltech.edu> writes:
>> ...
>>>> Standard library routines are required to set errno to positive values,
>>>> specifically to allow users to use negative values for their own
>>>> purposes. allocate_memory() is not supposed to be a standard library
>>>> function, and therefore can use a negative value.
>>>
>>> I don't think that's correct. The standard requires the values of EDOM,
>>> EILSEQ, and ERANGE to be positive, but makes no such guarantee about any
>>> implementation-defined E* macros, or about values that errno might be
>>> set to by library function calls.
>>
>> The description of errno in 7.5p2 says "... the value of which is set to
>> a positive error number by several library functions."
>
> You're right, I missed that -- but p3 says "The value of errno may be
> set to nonzero by a library function call whether or not there is an
> error, provided the use of errno is not documented in the description of
> the function in this International Standard."
>
> And I'm sure the intent is that the "additional macro definitions"
> beginning with E are supposed to expand to positive constant expressions
> of type int, but the standard doesn't say so. (It would IMHO be nice if
> it did.)

That this requirement was imposed with the intent of allowing user code
to assign negative values to errno is something I've only heard
third-hand, I've no official source for it.


Bonita Montero

unread,
Nov 20, 2022, 2:34:24 AM11/20/22
to
Am 20.11.2022 um 04:58 schrieb Chris M. Thomasson:

>> I always use int, it's the best choice for indexing on all machines.

> Until the index gets larger than INT_MAX?

Then I use "int x : 128" !!!

Tim Rentsch

unread,
Nov 20, 2022, 9:52:17 AM11/20/22
to
James Kuyper <james...@alumni.caltech.edu> writes:

> On 11/19/22 15:56, Keith Thompson wrote:
> ...
>
>> I suggest that the most important difference is that size_t is
>> guaranteed to be able to hold the size of any object, and long is not.
>> That difference is not directly implied by the standard's guarantees
>> about their ranges.
>
> The standard doesn't quite say that about size_t.
>
> The standard does say that sizeof(type) gives the size of an object of
> that type, and that sizeof(expression) gives the size of an object of
> the type of that expression. But it's trivial to specify a type for
> which the size of that type cannot have a value within range of size_t:
> sizeof(char[2][SIZE_MAX]). Such an expression cannot have the behavior
> mandated by the C standard for such an expression, but it's less than
> perfectly clear how it's wrong.

Surely the intention is that such an expression runs afoul of the
constraint in 6.6 p4, and also that the program may be rejected
by virtue of exceeding a minimum implementation limit. On a
64-bit implementation, try compiling this:

printf( "%zu\n", sizeof (char[0xfffffffffffffffe]) );

Both gcc and clang reject this program, complaining about the
size implied by the type, even though the size is less than
SIZE_MAX (which is 0xffffffffffffffff in that compilation
environment).

Kaz Kylheku

unread,
Nov 20, 2022, 10:29:16 AM11/20/22
to
On 2022-11-19, James Kuyper <james...@alumni.caltech.edu> wrote:
> On 11/17/22 10:55, Kaz Kylheku wrote:
>> On 2022-11-17, A <amit2342...@gmail.com> wrote:
>>> Let me ask a similar question.
>>>
>>> Please see the function below:
>>>
>>> void *allocate_memory(long size)
>>> {
>>> if (size < 0) {
>>> errno = -ENEGATIVESIZE;
>>
>> errno values aren't negated; might you have been studying Linux kernel
>> code recently?
>
> Standard library routines are required to set errno to positive values,
> specifically to allow users to use negative values for their own
> purposes. allocate_memory() is not supposed to be a standard library
> function, and therefore can use a negative value.

Even if every standard as well as implementation-added constant is
positive, it would be a bad idea to define your own negative-valued
constant via its additive inverse:

#define ENEGATIVESIZE 42

and then have to remember to negate it everywhere it is used.

The constants might /be/ negative, but code should not be /negating/
them to make them negative at the points of use.

That's just about as silly as

#define ENEGATIVESIZE (3 * 42)

And then remember to:

errno = ENEGATIVESIZE / 3;

You generally define any given symbolic constant to produce the intended
literal constant.

It really looks to me like the statement might be a coding mistake by
someone who formed a habit due to mainly coding in the Linux kernel.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal

Kaz Kylheku

unread,
Nov 20, 2022, 10:44:32 AM11/20/22
to
On 2022-11-20, Keith Thompson <Keith.S.T...@gmail.com> wrote:
> James Kuyper <james...@alumni.caltech.edu> writes:
>> On 11/19/22 15:47, Keith Thompson wrote:
>>> James Kuyper <james...@alumni.caltech.edu> writes:
>> ...
>>>> Standard library routines are required to set errno to positive values,
>>>> specifically to allow users to use negative values for their own
>>>> purposes. allocate_memory() is not supposed to be a standard library
>>>> function, and therefore can use a negative value.
>>>
>>> I don't think that's correct. The standard requires the values of EDOM,
>>> EILSEQ, and ERANGE to be positive, but makes no such guarantee about any
>>> implementation-defined E* macros, or about values that errno might be
>>> set to by library function calls.
>>
>> The description of errno in 7.5p2 says "... the value of which is set to
>> a positive error number by several library functions."
>
> You're right, I missed that -- but p3 says "The value of errno may be
> set to nonzero by a library function call whether or not there is an
> error, provided the use of errno is not documented in the description of
> the function in this International Standard."

There is also "strerror shall map any value of type int to a message".

So we know that if we stick a negative value into errno, and something
tries to interpret it, that is required to be robust.

That is important, because if it said something like that the behavior
is undefined if strerror is invoked on anything than the values of the
standard-defined constants, plus any implementation-defined ones, that
would put a damper on the use of application-defined errno values. A
program sticking nonstandard values into errno would have to ensure that
no function is invoked that might try to extract a message from errno.

Though that is required to be robust, I don't think it's a great idea to
stick custom values into errno because strerror will not come up with a
useful message. There is a risk the value stored in errno will be
accessed beyond its intended perimeter of use, and result in some users
having to deal with some cryptic "errno -123" diagnostic that will send
them on some wild goose chase.

Only as a hack between internal routines: say a lower-level routine
setting a custom value in errno, knowing that the intermediate level
intercepts it, handles it, and overwrites the value with zero or
a standard value before returning to the higher level.

Opus

unread,
Nov 20, 2022, 2:12:10 PM11/20/22
to
Hilarious!

Bonita Montero

unread,
Nov 20, 2022, 2:49:04 PM11/20/22
to
Sometimes I even use "int x : 1024" ! And not an unsigned
to address the negative portion of the address space.

David Brown

unread,
Nov 21, 2022, 4:31:05 AM11/21/22
to
On 20/11/2022 02:41, Keith Thompson wrote:
> sc...@slp53.sl.home (Scott Lurndal) writes:
>> Keith Thompson <Keith.S.T...@gmail.com> writes:
>>> James Kuyper <james...@alumni.caltech.edu> writes:
>>>> On 11/17/22 03:16, Amit wrote:
>>> (POSIX defines ssize_t, but it's not
>>> specified to be the signed type corresponding to size_t, so for example
>>> the "%du" format is not guaranteed to work with it.)
>>
>> %zu?
>
> No, I actually meant "%zd" (thanks for catching my error).
>
> "%zu" expects an argument of type size_t. "%zd" expects an argument of
> the signed type corresponding to size_t, but there's no good way to
> determine what that type is.
>

It's a pity you can't "return" a type from a _Generic. Otherwise you
could have a _Generic macro Signed(T) that generated a signed version of
its parameter. And even if your _Generic returns a number, it can't be
used in pre-processing (for conditional compilation).

sizeof(size_t) could be helpful, but I can't see any way to distinguish
between different types of the same size (usually "long" is the same
size as either "int" or "long long", but the type is different).

David Brown

unread,
Nov 21, 2022, 5:31:58 AM11/21/22
to
On 18/11/2022 13:56, A wrote:
> On Friday, 18 November 2022 at 17:32:09 UTC+5:30, David Brown wrote:
>> On 18/11/2022 10:46, A wrote:
>>>
>>> On Friday, 18 November 2022 at 14:49:36 UTC+5:30, David Brown wrote:
>>>> On 17/11/2022 11:43, A wrote:
>>>>>>
>>>>
>>>> And your history of C is bizarre. The ANSI C standard in 1989 was a
>>>> standardisation of /existing/ C common usage - it did not suddenly
>>>> invent C and its standard library out of thin air.
>>>>
>>>
>>> I was talking in context of glibc. glibc 0.1 came out in 1991. In 1988, glibc pre-release appeared.
>>>
>> So when you wrote "C standard library came with C89 in 1989", you meant
>> to write "glibc was released in 1991" ? It is /really/ difficult to
>> figure out what you are trying to say here.
>>
>
> My assumption was that glibc original authors would have been part of C89 standardization committee.
>

That is a very odd assumption. glibc was quite young at that time, and
neither glibc, nor gcc, nor their umbrella organisation the FSF, was
among the "big players" at the time.

I don't know what time size_t and malloc became de facto standard in C
prior to the first ANSI standard, but I think it would have been long
before glibc was started. (I presume it was before the first edition of
The C Programming Language in 1978.)

So even if the glibc authors had been involved in the C89
standardisation committee, the use of "size_t" in malloc() would have
preceded that standardisation by a decade at least.

You do know that glibc is only one of perhaps (my estimates) several
thousand standard C library implementations over the decades, as well as
several thousand C compilers for various targets?



Keith Thompson

unread,
Nov 21, 2022, 11:52:12 AM11/21/22
to
David Brown <david...@hesbynett.no> writes:
[...]
> I don't know what time size_t and malloc became de facto standard in C
> prior to the first ANSI standard, but I think it would have been long
> before glibc was started. (I presume it was before the first edition
> of The C Programming Language in 1978.)

K&R1 doesn't mention size_t or malloc.

It does have a sample implementation of a memory allocator called
"alloc" (presented as an example, not as part of the standard library).
It returns an unsigned result; unsigned long didn't exist yet.

It does mention calloc() as part of the standard library.

James Kuyper

unread,
Nov 21, 2022, 3:12:04 PM11/21/22
to
On 11/21/22 05:31, David Brown wrote:
...
> I don't know what time size_t and malloc became de facto standard in C
> prior to the first ANSI standard, but I think it would have been long
> before glibc was started. (I presume it was before the first edition of
> The C Programming Language in 1978.)

There is no mention of malloc() in that edition. As an example of how to
use C, it describes two functions named alloc() and free(). alloc()
calls morecore() to allocate a new block of memory, which in turn calls
the UNIX system routine sbrk(). It also describes talloc(), a function
that calls alloc(). In one example a value of type int was passed to
alloc(), in another it passed the result of a sizeof expression. The
type returned by sizeof expressions is described only as an integer
type, and a sizeof expression is said to be "semantically an integer
constant".

It describes a standard library function named calloc(), but fails to
say whether it takes a signed or unsigned argument. Neither size_t nor
function prototypes were part of C yet at that time.


Kaz Kylheku

unread,
Nov 21, 2022, 11:27:49 PM11/21/22
to
On 2022-11-21, David Brown <david...@hesbynett.no> wrote:
> It's a pity you can't "return" a type from a _Generic.

Might it be that in GNU C you can, via typeof(_Generic(...))?

Chris M. Thomasson

unread,
Nov 22, 2022, 12:57:33 AM11/22/22
to
Have you ever had to iterate larger than INT_MAX times before?

David Brown

unread,
Nov 22, 2022, 2:55:20 AM11/22/22
to
On 22/11/2022 05:27, Kaz Kylheku wrote:
> On 2022-11-21, David Brown <david...@hesbynett.no> wrote:
>> It's a pity you can't "return" a type from a _Generic.
>
> Might it be that in GNU C you can, via typeof(_Generic(...))?
>

That's a possibility I had not considered - I was trying to stick to
standard C. If you allow common extensions, then "ssize_t" is surely
the easiest solution! But it's an interesting idea and could give more
general macros.

Malcolm McLean

unread,
Nov 22, 2022, 5:33:10 AM11/22/22
to
Mostly limits like that are not known at compile time, but are passed in by the user.

I'm currently working on a general purpose 2D graphics library, and most of the data
are Bezier paths. It's possible for the caller to construct a path of any size up to the
maximum a size_t will hold. Some of the algorithms then construct an N-squared matrix
of points. So of course it's possible to exceed their capabilities.
But you'd have other problems with such big paths. The display side of the software
might struggle to rasterise them. And they would be impossible for the user to edit.
Virtually all paths are under a thousand points and most well under a hundred, and
the algorithms aren't required to return in reasonable time for paths over about a thousand
points.

Philipp Klaus Krause

unread,
Nov 23, 2022, 3:24:15 AM11/23/22
to
Am 17.11.22 um 11:43 schrieb A:
>>
>> - at the time it was not unusual to have objects which needed the
>> full unsigned range for expressing their size: like declaring a 60
>> kilobyte array under the "small" memory model on an Intel 8088 PC.
>> The size would be a 16 bit type, which would have to be unsigned to
>> capture the 60 kilobyte size.
>>
>
> Intel 8088 came in 1979. But C standard library came with C89 in
> 1989. In 1989, i486 was released. Size of "int" on i486 was 32 bits
> and probably RAM size was 4 MB (22 bits). I don't think that in 1989
> C standard would have even thought about Intel 8088 (Intel 8088 would
> have been probably obsolete by 1989).
>
> In fact, i386 came in 1985 and size of "int" on it was also probably
> 32 bits.
>
> So, I don't think C89 would have thought about Intel 8088.

The future ISO C23 standard will change the minimum size of ptrdiff_t
from 17 to 16 bits to better support 16- and 8-bit systems.

Philipp

Philipp Klaus Krause

unread,
Nov 23, 2022, 3:28:39 AM11/23/22
to
Am 20.11.22 um 03:06 schrieb Bonita Montero:
> I always use int, it's the best choice for indexing on all machines.

I often use uint_fast8_t, since it tends to be the "best" choice for my
indexing for the machines I target.

Bonita Montero

unread,
Nov 23, 2022, 7:29:56 AM11/23/22
to
Am 23.11.2022 um 09:28 schrieb Philipp Klaus Krause:

> I often use uint_fast8_t, since it tends to be the "best" choice for my
> indexing for the machines I target.

I was joking ...

Philipp Klaus Krause

unread,
Nov 24, 2022, 5:26:35 AM11/24/22
to
Am 19.11.22 um 22:27 schrieb Keith Thompson:

>
> I have a copy of Schildt's very bad book "The Annotated ANSI C
> Standard", which has the text of the standard on every other page, but I
> think it actually uses the 1990 ISO C standard, so it would not be
> useful for this exercise -- and I can't find my copy anyway.
> Background: https://www.lysator.liu.se/c/schildt.html
>

In case someone is looking for that style of book, "The New C Standard"
by Derek M. Jones is a better alternative (it is about C99, though so
not that relevant to the OP here). It also has much higher annotation to
standard text ratio.

Philipp

Kaz Kylheku

unread,
Nov 24, 2022, 3:09:41 PM11/24/22
to
On 2022-11-23, Philipp Klaus Krause <p...@spth.de> wrote:
> The future ISO C23 standard will change the minimum size of ptrdiff_t
> from 17 to 16 bits to better support 16- and 8-bit systems.

That's unproductive; you can't "support" anything by flipping a digit in
a new revision of a document. It has no meaning.

It makes no difference as to what kinds of programs actually work or
don't work on those systems.

If some implementation can't provide a 17-bit ptrdiff_t, it just won't
conform in that regard; all that changes is the empty word semantics of
what we call "conforming" or not.

David Brown

unread,
Nov 25, 2022, 2:29:30 AM11/25/22
to
On 24/11/2022 21:09, Kaz Kylheku wrote:
> On 2022-11-23, Philipp Klaus Krause <p...@spth.de> wrote:
>> The future ISO C23 standard will change the minimum size of ptrdiff_t
>> from 17 to 16 bits to better support 16- and 8-bit systems.
>
> That's unproductive you can't "support" anything by flipping a digit in
> the new revision of a document. It has no meaning.
>
> It makes no difference as to what kinds of programs actually work or
> don't work on those systems.
>
> If some implementation can't make a 17 bit ptrdiff_t, it just won't
> conform in that regard, just the empty word semantics of what we
> call "conforming" or not.
>

The change is an acknowledgement that 16-bit and 8-bit systems are
important to the C world, and that it is a good thing for compilers for
such systems to aim for conformance rather than being content with lots
of non-conformances. A compiler for a 16-bit device can usually get
quite close to conformance.
