
usage of size_t


Francis Moreau

Feb 21, 2010, 8:37:35 AM
Hello,

I usually use 'unsigned int' for variables which hold the length
of a buffer.

However, someone suggested that I use 'size_t' instead.

So I took a look at the C99 spec to see what it says about size_t:
it's the type of the value returned by sizeof (6.5.3.4 p4) and
its max value is 65535 (7.18.3 p2).

size_t doesn't seem to be the right type to use when the variable
describes the number of elements of a buffer whose element type is
not 'char', and when the buffer size is less than 65535 bytes.

Is that correct ?

Thanks

Alexander Bartolich

Feb 21, 2010, 9:09:16 AM
Francis Moreau wrote:
> [...]

> So I took a look to the C99 spec and see what it tells about size_t:
> and it's the type of the returned value by sizeof() (6.5.3.4 p4) and
> its max value is 65535 (7.18.3 p2).

*sigh*

Please, read again. The range of size_t must cover at least that of
a 16 bit unsigned integer. That is, the smallest legal definition of
size_t must be able to hold the values from 0 to 0xFFFF. However,
nothing in the standard forbids a size_t with a larger range.

You could also try this yourself:
- define an object larger than 64 KiB, apply sizeof on that object
- apply sizeof on size_t

--

Malcolm McLean

Feb 21, 2010, 9:44:04 AM
On Feb 21, 3:37 pm, Francis Moreau <francis.m...@gmail.com> wrote:
>
> size_t doesn't seem to be the good type to use when the variable of
> that type describes the number of elements of a buffer whose type is
> not 'char' and if the buffer size is less than 65535 bytes.
>
size_t is an int designed by committee.

The idea was that you would have a special type to hold amounts of
memory. Since, usually, the address space of a processor is the same
as the pointer width which is the same as an integer data register,
size_t was specified as unsigned.
The problem is that size_t ends up being the default index variable
type, which causes all sorts of problems. Mostly it's psychological -
people would much rather write int i; than size_t i when declaring a
counter. However there are also many situations where unsigned indices
are inconvenient, eg for(i=N-1;i>=0;i--).
The worst problem is that, because C is strictly typed, an int * is
not compatible with a size_t *. So you can end up writing little
adaptor functions to convert a vector of size_ts to a vector of ints,
and vice versa, even though the underlying bit patterns may be
identical.

Richard

Feb 21, 2010, 10:28:27 AM
Alexander Bartolich <alexander...@gmx.at> writes:

> Francis Moreau wrote:
>> [...]
>> So I took a look to the C99 spec and see what it tells about size_t:
>> and it's the type of the returned value by sizeof() (6.5.3.4 p4) and
>> its max value is 65535 (7.18.3 p2).
>
> *sigh*

Nice. You fit into c.l.c well. Never pass up an opportunity to belittle
others. Imagine how much nicer it would have been for you to merely
point out he had misread without the added exclamation to belittle and
embarrass him.

>
> Please, read again. The range of size_t must cover at least that of
> a 16 bit unsigned integer. That is, the smallest legal definition of
> size_t must be able to hold the values from 0 to 0xFFFF. However,
> nothing in the standard forbids a size_t with a larger range.
>
> You could also try this yourself:
> - define an object larger than 64 KiB, apply sizeof on that object
> - apply sizeof on size_t

--
"Avoid hyperbole at all costs, its the most destructive argument on
the planet" - Mark McIntyre in comp.lang.c

santosh

Feb 21, 2010, 10:49:49 AM
Francis Moreau <franci...@gmail.com> writes:

> Hello,
>
> I usually use 'unsigned int' type for variables which hold the
> length of a buffer.
>
> However, someone suggests me to use 'size_t'.
>
> So I took a look to the C99 spec and see what it tells about
> size_t: and it's the type of the returned value by sizeof() (6.5.3.4
> p4) and its max value is 65535 (7.18.3 p2).

That's not correct, at least not the last part. The type size_t must
be large enough to represent the size in bytes of the largest single
object that the implementation supports. This is not restricted to
65535 bytes.

Incidentally, standard C (C99) does require a hosted implementation to
support at least one object of 65535 bytes, but it can, and commonly
does, support more and bigger objects than that.

Under 32 bit systems, the usual theoretical upper limit for object
size is roughly 4 Gb, which a 32 bit size_t can just represent. Under
some 32 bit and 64 bit systems, this is a much higher limit, about 18
Tb, if I'm not wrong.

To be more concrete, to find the upper limit of size_t's
range under a particular implementation, look up the value of the
SIZE_MAX macro in the stdint.h header.

> size_t doesn't seem to be the good type to use when the variable of
> that type describes the number of elements of a buffer whose type
> is not 'char' and if the buffer size is less than 65535 bytes.
>
> Is that correct ?

Why do you say it's not a good type to represent the size of non-char
objects? What's your reasoning for this?

And for arrays less than 65536 bytes, you can safely store their
sizes in an unsigned int or unsigned long, but it doesn't make much
difference to store it in size_t too.

If you're tracking the sizes of a large number of relatively small
objects and size_t on your system is wastefully big, you could
conceivably use unsigned int, or even unsigned short or unsigned char
to store the sizes. But consider what happens if you later modify your
program, the size of one or more of these objects grows, and your
unsigned int/short/char wraps around. size_t is the only type guaranteed
by the standard to hold the size of any object the implementation
supports. unsigned long should work in most situations, but again,
what's the big gain in using it instead of size_t?

> Thanks


Richard Heathfield

Feb 21, 2010, 10:50:19 AM
Malcolm McLean wrote:
> On Feb 21, 3:37 pm, Francis Moreau <francis.m...@gmail.com> wrote:
>> size_t doesn't seem to be the good type to use when the variable of
>> that type describes the number of elements of a buffer whose type is
>> not 'char' and if the buffer size is less than 65535 bytes.
>>
> size_t is an int designed by committee.

I don't think it's as bad as you paint it. In fact, it's a lot, lot,
lot, lot, lot, lot, lot better than you paint it.

> The problem is that size_t ends up being the default index variable
> type, which causes all sorts of problems. Mostly it's psychological -
> people would much rather write int i; than size_t i when declaring a
> counter. However there are also many situations where unsigned indices
> are inconvenient, eg for(i=N-1;i>=0;i--).

Idioms is there for those as wants to count down:

size_t i = N;
while(i--)

is simpler, shorter, and more correcterer.

> The worst problem is that, because C is strictly typed, an int * is
> not compatible with a size_t *. So you can end up writing little
> adaptor functions to convert a vector of size_ts to a vector of ints,
> and vice versa, even though the underlying bit patterns may be
> identical.

I don't recall ever having to do that, in 20+ years of using C.


--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within

Malcolm McLean

Feb 21, 2010, 11:09:22 AM
On Feb 21, 5:50 pm, Richard Heathfield <r...@see.sig.invalid> wrote:
>
> > The worst problem is that, because C is strictly typed, an int * is
> > not compatible with a size_t *. So you can end up writing little
> > adaptor functions to convert a vector of size_ts to a vector of ints,
> > and vice versa, even though the underlying bit patterns may be
> > identical.
>
> I don't recall ever having to do that, in 20+ years of using C.
>
I suppose it depends how you work. For instance some people never use
complex numbers in their entire programming career.
I not infrequently find myself having to call a routine that takes an
int * or a size_t * as input. It's not always possible to make them
match with the data in the rest of the program. Most of the time, I'll
grant you, people expect lists of integers as int *s, which is also my
preference.

Ike Naar

Feb 21, 2010, 3:16:38 PM
In article <hlrkn1$nbs$1...@news.eternal-september.org>,

santosh <santo...@gmail.com> wrote:
>Under 32 bit systems, the usual theoretical upper limit for object
>size is roughly 4 Gb, which a 32 bit size_t can just represent. Under
>some 32 bit and 64 bit systems, this is a much higher limit, about 18
>Tb, if i'm not wrong.

Small nit: 18 exabyte (EB). That's about 18 million TB.

Seebs

Feb 21, 2010, 3:44:02 PM
On 2010-02-21, Francis Moreau <franci...@gmail.com> wrote:
> So I took a look to the C99 spec and see what it tells about size_t:
> and it's the type of the returned value by sizeof() (6.5.3.4 p4) and
> its max value is 65535 (7.18.3 p2).

You are incorrect.

Its max value is *AT LEAST* 65535.

It may be much, much, much, larger.

> size_t doesn't seem to be the good type to use when the variable of
> that type describes the number of elements of a buffer whose type is
> not 'char' and if the buffer size is less than 65535 bytes.

> Is that correct ?

I don't understand what you are trying to do.

Use size_t for sizes. If you are recording the number of items in a thing,
and it's zero or more, use size_t, that's what size_t is for. It doesn't
matter what the type is or whether or not it's 65535 bytes or more or
less. If you can have a buffer of over 65535 bytes, then size_t will be
able to represent sizes over 65535.

-s
--
Copyright 2010, all wrongs reversed. Peter Seebach / usenet...@seebs.net
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!

santosh

Feb 22, 2010, 2:50:33 AM
Ike Naar <i...@localhost.claranet.nl> writes:

Oops you're right. Thanks for the correction.


gwowen

Feb 22, 2010, 2:50:47 AM
On Feb 21, 3:50 pm, Richard Heathfield <r...@see.sig.invalid> wrote:

> Idioms is there for those as wants to count down:
>
> size_t i = N;
> while(i--)
>
> is simpler, shorter, and more correcterer.

Sadly, it's also less clear. It requires the reader to remember the
difference between --i and i--, and it requires them to be aware of
the implicit int-to-bool conversion. It's idiomatic precisely
because, until you've seen it many times, it requires more thought
than should be necessary.

size_t i=N-1; // implicitly assume N!=0
do {
    foo(i); // or more likely foo(bar[i-1])
    i = i - 1; // or --i or i--, as you prefer.
} while(i != 0);

seems, to me, the one that most clearly expresses intent (though
obviously, I'm aware this is purely a personal preference).
Alternatively

size_t i=N; // No need for assumption N!=0 this time...
while(i != 0) {
    foo(i-1); // ibid...
    --i; // ibid...
};

and trust your compiler to do the right thing, optimisation wise...

Richard Heathfield

Feb 22, 2010, 2:58:17 AM

It's a trivially small error to make, if you compare it to a similar
error once made by Isaac Asimov. He once managed to mislay a factor of
10^23, which rather knocks 10^6 into the shade.

(No, this is not an attack on Isaac Asimov. When the error was pointed
out to him, he cheerfully acknowledged it.)

Richard Heathfield

Feb 22, 2010, 3:01:43 AM
gwowen wrote:
> On Feb 21, 3:50 pm, Richard Heathfield <r...@see.sig.invalid> wrote:
>
>> Idioms is there for those as wants to count down:
>>
>> size_t i = N;
>> while(i--)
>>
>> is simpler, shorter, and more correcterer.
>
> Sadly, its also less clear.

I disagree. I would argue that it's a well-known idiom. Still, I accept
that there are arguments on both sides.

> It requires the reader to remember the
> difference between --i and i--, and it requires them to be aware of
> the implicit int-to-bool conversion.

I would expect any serious C programmer to be aware of both of these
without having to think too strenuously about it, but the second at
least is easily dealt with:

while(i-- > 0)


> It's idiomatic precisely
> because, until you've seen it many times, it requires more thought
> than should be necessary.
>
> size_t i=N-1; // implicitly assume N!=0
> do {
> foo(i); // or more likely foo(bar[i-1])
> i = i - 1; // or --i or i--, as you prefer.
> } while(i != 0);

I find my version much easier to read. But then I would, wouldn't I? :-)

<snip>

Keith Thompson

Feb 22, 2010, 3:27:14 AM
gwowen <gwo...@gmail.com> writes:
> On Feb 21, 3:50 pm, Richard Heathfield <r...@see.sig.invalid> wrote:
>
>> Idioms is there for those as wants to count down:
>>
>> size_t i = N;
>> while(i--)
>>
>> is simpler, shorter, and more correcterer.
>
> Sadly, its also less clear. It requires the reader to remember the
> difference between --i and i--, and it requires them to be aware of
> the implicit int-to-bool conversion.

Any C programmer needs to know the difference between --i and i--, and
there is no implicit int-to-bool conversion here. The condition in a
while statement is a scalar that's tested for inequality to 0.

[snip]

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

gwowen

Feb 22, 2010, 3:48:11 AM
On Feb 22, 8:27 am, Keith Thompson <ks...@mib.org> wrote:

> Any C programmer needs to know the difference between --i and i--,

I know the difference. I've known the difference for years. But I
still have to think about it (if you see what I mean: I know which
one's Ant and which one's Dec, but I have to think about that too).
If I'm reading some code, I don't want my concentration unnecessarily
broken by having to recall some syntactical nicety, even a
relatively minor one. The next guy to read my code may have to think
harder than me.

> and there is no implicit int-to-bool conversion here.  
> The condition in a while statement is a scalar that's tested for inequality to 0.

That test for inequality is implicit: an explicit one would look like
while(--i != 0). I defer to your knowledge on whether this counts as
a conversion to bool, but whatever such an implicit test is called, I
don't care for it with --i or i--. That's writing for the compiler,
not the human reader.

Personally, I almost never use --i as anything but a stand-alone
expression, and don't use i-- if I can possibly help it. Is there a
compiler anywhere for which

z = i--;

produces different code than

z = i;
i = i-1;

And, if not, which one is clearer to a neophyte C coder who's been
given my code to maintain (poor bastard), or a Fortran programmer
trying to see how my C code works, or a mathematician checking my
implementation of his algorithm? Yes, it's a minor stylistic point,
and such points are inherently subjective, but that's my opinion. I don't
doubt yours is at least as valid, and probably more widely held.

Richard Heathfield

Feb 22, 2010, 4:47:21 AM
Keith Thompson wrote:
> gwowen <gwo...@gmail.com> writes:
>> On Feb 21, 3:50 pm, Richard Heathfield <r...@see.sig.invalid> wrote:
>>
>>> Idioms is there for those as wants to count down:
>>>
>>> size_t i = N;
>>> while(i--)
>>>
>>> is simpler, shorter, and more correcterer.
>> Sadly, its also less clear. It requires the reader to remember the
>> difference between --i and i--, and it requires them to be aware of
>> the implicit int-to-bool conversion.
>
> Any C programmer needs to know the difference between --i and i--, and
> there is no implicit int-to-bool conversion here. The condition in a
> while statement is a scalar that's tested for inequality to 0.

Yes to the first, but I think it's clear what he /meant/ by int-to-bool
in this case. (Strictly speaking, you're right; just trying to cut the
guy some slack here.)

Nick Keighley

Feb 22, 2010, 5:17:15 AM
On 21 Feb, 15:28, Richard <rgrd...@gmail.com> wrote:
> Alexander Bartolich <alexander.bartol...@gmx.at> writes:
> > Francis Moreau wrote:

> >> So I took a look to the C99 spec and see what it tells about size_t:
> >> and it's the type of the returned value by sizeof() (6.5.3.4 p4) and
> >> its max value is 65535 (7.18.3 p2).
>
> > *sigh*
>
> Nice. You fit into c.l.c well. Never pass up an opportunity to belittle
> others.

eeek! recursion!

<snip>

Nick Keighley

Feb 22, 2010, 5:46:56 AM
On 22 Feb, 08:01, Richard Heathfield <r...@see.sig.invalid> wrote:
> gwowen wrote:
> > On Feb 21, 3:50 pm, Richard Heathfield <r...@see.sig.invalid> wrote:


> >> Idioms is there for those as wants to count down:
>
> >> size_t i = N;
> >> while(i--)
>
> >> is simpler, shorter, and more correcterer.
>
> > Sadly, its also less clear.

which, arguably, makes it less simple


> I disagree. I would argue that it's a well-known idiom.

I don't remember seeing it before (which, of course, isn't a good
definition of "not well known").

> still, I accept that there are arguments on both sides.


>
> > It requires the reader to remember the
> > difference between --i and i--, and it requires them to be aware of
> > the implicit int-to-bool conversion.

yes, I'd classify it as "slightly obscure". I'd wonder why they didn't
use a for-loop. I'd probably comment it if I decided to use it.

> I would expect any serious C programmer to be aware of both of these
> without having to think too strenuously about it,

I wasn't aware of it, but I didn't have to think too strenuously.


> but the second at least is easily dealt with:
>
> while(i-- > 0)

yes, I prefer the test to be explicit. I think it is clearer. For
similar reasons I usually test for NULL.

> > It's idiomatic precisely
> > because, until you've seen it many times, it requires more thought
> > than should be necessary.
>
> > size_t i=N-1;  // implicitly assume N!=0
> > do {
> >   foo(i);      // or more likely foo(bar[i-1])
> >   i = i - 1;   // or --i or i--, as you prefer.
> > } while(i != 0);
>
> I find my version much easier to read. But then I would, wouldn't I? :-)

I'm never happy with do-while. Are we certain the loop body should
always be executed at least once? I use do-while but I think rather
carefully first. This was a bug in the original Fortran- it always did
a loop at least once.


--
I don't like [quantum mechanics],
and I'm sorry I ever had anything to do with it.
--Erwin Schrödinger

Phil Carmody

Feb 22, 2010, 5:50:09 AM
Richard Heathfield <r...@see.sig.invalid> writes:

> Malcolm McLean wrote:
>> The worst problem is that, because C is strictly typed, an int * is
>
> I don't recall ever having to do that, in 20+ years of using C.

I don't remember C being a strictly typed language, on the
presumption that implies something similar to being strongly
typed.

Phil
--
Any true emperor never needs to wear clothes. -- Devany on r.a.s.f1

Malcolm McLean

Feb 22, 2010, 5:52:34 AM
On Feb 22, 12:46 pm, Nick Keighley <nick_keighley_nos...@hotmail.com>
wrote:

>
> I'm never happy with do-while. Are we certain the loop body should
> always be executed at least once? I use do-while but I think rather
> carefully first. This was a bug in the original Fortran- it always did
> a loop at least once.
>
I almost never use a do-while.

Usually you need an empty case where the loop never executes. Other
times it's easier to code the logic into a while loop.
In Fortran 77 the fact that a loop body always executes at least once
can be a real nuisance.

Nick Keighley

Feb 22, 2010, 5:54:00 AM
On 22 Feb, 07:50, santosh <santosh....@gmail.com> wrote:
> Ike Naar <i...@localhost.claranet.nl> writes:
> > In article <hlrkn1$nb...@news.eternal-september.org>,
> > santosh  <santosh....@gmail.com> wrote:

> >>Under 32 bit systems, the usual theoretical upper limit for object
> >>size is roughly 4 Gb, which a 32 bit size_t can just represent.
> >>Under some 32 bit and 64 bit systems, this is a much higher limit,
> >>about 18 Tb, if i'm not wrong.
>
> > Small nit: 18 exabyte (EB). That's about 18 million TB.
>
> Oops you're right. Thanks for the correction.

as I used to say to my physics tutor. What's a few orders of magnitude
between friends?

Phil Carmody

Feb 22, 2010, 5:55:40 AM

I see yours coping with N==0, and the others not coping with it,
to be the black and white distinguisher. Almost every time I
have a varying number of something, 0 is a valid count. To even
think of simply dismissing such a case out of hand seems sloppy.
Which is why, if the order I do things doesn't matter, I also
use your construct. (But if I'm accessing things by index, and
need to count forwards, I clearly won't)

gwowen

Feb 22, 2010, 6:11:12 AM
On Feb 22, 10:55 am, Phil Carmody <thefatphil_demun...@yahoo.co.uk>
wrote:

Au contraire, Blackadder. As I posted earlier...

size_t i=N;
while(i != 0) {
    foo(i-1);
    --i;
};

Having given it some thought, I now prefer this...

while(i != 0) { // Here i tells us how many loop iterations remain
    --i; // i now indexes the i'th element of an array...
    foo(bar[i]);
};

> Almost every time I have a varying number of something, 0 is a valid count.

Absolutely. But with zero-based indexing, unsigned types which wrap,
somewhere there's going to be some ugliness -- people are going to
want use the loop variable as an index, as well as a count of how many
unprocessed elements remain. The least ugly way to do this is a
matter of taste.

> To even think of simply dismissing such a case out of hand seems sloppy.

I didn't dismiss it out of hand, I noted my assumption, and then
provided an alternative which dealt with it. Dismissing solutions
having not read them seems sloppy ;)

Kelsey Bjarnason

Feb 22, 2010, 6:13:46 AM
[snips]

On Mon, 22 Feb 2010 00:48:11 -0800, gwowen wrote:

> That test for inequality is implicit: an explicit one would look like
> while(--i != 0). I defer to your knowledge on whether this counts as a
> conversion to bool, but whatever such an implicit test is called, I
> don't care for it with --i or i--. That's writing for the compiler, not
> the human reader.

Well, in much C code, constructs such as while(x) are fairly common, with
or without increment or decrement, eg while ( *s++ ) n++;

This is hardly a novel usage.

> Personally, I almost never use --i as anything but an stand-alone
> expression, don't use i-- unless I can absolutely help it. Is there a
> compiler anywhere for which
>
> z = i--;
>
> produces different code than
>
> z = i;
> i = i-1;

Probably, somewhere, there is some pathological implementation which
does, but who cares?


> And, if not, which one is clearer to a neophyte C coder who's been given
> my code to maintain (poor bastard), or a Fortran programmer trying to
> see how my C code works, or a mathematician checking my implementation
> of his algorithm?


Pointless complexity or code density accomplishes nothing, to be sure,
but I know a lot of C coders - myself included - who would look at "i = i
- 1" and worry that whoever wrote it did not understand C, and thus the
code needs to undergo serious - and total - review. It's just not how C
programmers write C.

And while you're at it, are you writing code for C programmers, or for
mathematicians? They (presumably) wouldn't be competent to do anything
with the code anyhow, unless they were _also_ C programmers, in which
case they'd know the common usages and idioms.

Just a thunk.

Kelsey Bjarnason

Feb 22, 2010, 6:22:55 AM
[snips]

On Mon, 22 Feb 2010 02:54:00 -0800, Nick Keighley wrote:

> as I used to say to my physics tutor. What's a few orders of magnitude
> between friends?

Send me your paycheck each pay period, and I'll send you back an order or
three less and we'll see. :)

Phil Carmody

Feb 22, 2010, 6:26:03 AM
Kelsey Bjarnason <kbjar...@gmail.com> writes:
> [snips]
>
> On Mon, 22 Feb 2010 00:48:11 -0800, gwowen wrote:
>
>> That test for inequality is implicit: an explicit one would look like
>> while(--i != 0). I defer to your knowledge on whether this counts as a
>> conversion to bool, but whatever such an implicit test is called, I
>> don't care for it with --i or i--. That's writing for the compiler, not
>> the human reader.
>
> Well, in much C code, constructs such as while(x) are fairly common, with
> or without increment or decrement, eg while ( *s++ ) n++;
>
> This is hardly a novel usage.
>
>> Personally, I almost never use --i as anything but an stand-alone
>> expression, don't use i-- unless I can absolutely help it. Is there a
>> compiler anywhere for which
>>
>> z = i--;
>>
>> produces different code than
>>
>> z = i;
>> i = i-1;
>
> Probably, somewhere, there is some pathological implementation which
> does, but who cares?

In the current absence of a declaration for i, I'll suggest a volatile one.

>
>> And, if not, which one is clearer to a neophyte C coder who's been given
>> my code to maintain (poor bastard), or a Fortran programmer trying to
>> see how my C code works, or a mathematician checking my implementation
>> of his algorithm?
>
>
> Pointless complexity or code density accomplishes nothing, to be sure,
> but I know a lot of C coders - myself included - who would look at "i = i
> - 1" and worry that whoever wrote it did not understand C, and thus the
> code needs to undergo serious - and total - review. It's just not how C
> programmers write C.

Yes, but I also see macho code (MS - you know who you are, _please_
stop doing that!), and when I encounter just one mistake in that I
know I have to review thousands of lines of macho code, which is
often way worse than thousands of lines of naive code.

> And while you're at it, are you writing code for C programmers, or for
> mathematicians? They (presumably) wouldn't be competent to do anything
> with the code anyhow, unless they were _also_ C programmers, in which
> case they'd know the common usages and idioms.

Yup.

> Just a thunk.

No! If we have thunks, the lisp contingent will emerge, and we'll soon
have recursive function calls, and exploding stacks!

Richard

Feb 22, 2010, 6:31:42 AM
gwowen <gwo...@gmail.com> writes:


Using clean C it's easier to read in this form.

while(i--)
    foo(i);

Note the decrement in the loop - easier to spot and cleaner should you
need more statements with that value of i.

The -- in the loop is a well known C usage and if it's not clear to you
then your C is hazy to say the least. It is much more confusing and hard
to read to put the decrement on some line in the body.

gwowen

Feb 22, 2010, 7:13:12 AM
On Feb 22, 11:13 am, Kelsey Bjarnason <kbjarna...@gmail.com> wrote:

> And while you're at it, are you writing code for C programmers, or for
> mathematicians?  

Yes. If a reasonably computer-savvy mathematician, with some
programming experience, not necessarily in C, cannot understand my C
code, sufficiently well to detect whether I've correctly implemented
an algorithm with which they're familiar, then it probably needs
refactoring for clarity.

Yes, I know this is a fairly extreme position.
No, I don't expect anyone who is not working on my codebase to code in
this manner.

> It's just not how C programmers write C.

It's how some C programmers write C. I block out a lot of algorithms
in Matlab, then port the timing-critical bits to C (mainly to avoid
the copy-in-copy-out that plagues matrix operations in Matlab). ++i
does nothing in Matlab, and i++ is a syntax error, so you see a lot
of

i = i+1;

When porting, I'm not going to change that to ++i just for idiomatic
reasons.

gwowen

Feb 22, 2010, 7:28:44 AM
On Feb 22, 11:31 am, Richard <rgrd...@gmail.com> wrote:

> The -- in the loop is a well known C usage and if its not clear to you
> then your C is hazy to say the least. It is much more confusing and hard
> to read to put the decrement on some line in the body.

Your code is clear for people with good C skills.
My code is clear for people without good C skills, and clear (but non-
idiomatic) for those with good skills, and clear-but-hideous for C
mavens. I'm OK with that.

All programmers and many non-programmers can read well-written
pseudo-code. If I can make my code look like pseudo-code, by omitting
unnecessary idioms, why shouldn't I? If I'm writing for Usenet, where
many posters are not native English speakers, I'm not going to use
strongly idiomatic English, even though this is an English language
newsgroup.

So, if you want your code understood as widely as possible, don't be a
vicar of Bray; grasp the nettle, and do Yeoman's service and all
things being equal, Bob's your uncle and you'll come up smelling of
roses... Otherwise you'll do a Devon Loch, be hoist by your own
petard, be gone for a right royal Burton, or otherwise come a
cropper. I wouldn't touch idiomatic English with a bargepole. It's
just not cricket.

Richard

Feb 22, 2010, 7:52:07 AM
gwowen <gwo...@gmail.com> writes:

> On Feb 22, 11:31 am, Richard <rgrd...@gmail.com> wrote:
>
>> The -- in the loop is a well known C usage and if its not clear to you
>> then your C is hazy to say the least. It is much more confusing and hard
>> to read to put the decrement on some line in the body.
>
> Your code is clear for people with good C skills.
> My code is clear for people without good C skills, and clear (but non-
> idiomatic) for those with good skills, and clear-but-hideous for C
> mavens. I'm OK with that.

But your code isn't any clearer at all.

Anyone modifying or reading that code is far likelier to understand the
common C idiom of the decrement in the loop than in some random body
line.

Anyone NOT understanding "while(i--)" has no business modifying the code in
the first place.

Francis Moreau

Feb 22, 2010, 8:21:14 AM
On Feb 21, 4:49 pm, santosh <santosh....@gmail.com> wrote:
> Francis Moreau <francis.m...@gmail.com> writes:
> > Hello,
>
> > I usually use 'unsigned int' type for variables which hold the
> > length of a buffer.
>
> > However, someone suggests me to use 'size_t'.

>
> > So I took a look to the C99 spec and see what it tells about
> > size_t: and it's the type of the returned value by sizeof() (6.5.3.4
> > p4) and its max value is 65535 (7.18.3 p2).
>
> That's not correct, at least not the last part. The type size_t must
> be large enough to represent the size in bytes of the largest single
> object that the implementation supports. This is not restricted to
> 65535 bytes.
>
> Incidentally, standard C (C99) does require a hosted implementation to
> support at least one object of 65535 bytes, but it can, and commonly
> does, support more and bigger objects than that.

>
> Under 32 bit systems, the usual theoretical upper limit for object
> size is roughly 4 Gb, which a 32 bit size_t can just represent. Under
> some 32 bit and 64 bit systems, this is a much higher limit, about 18
> Tb, if i'm not wrong.
>
> To be more concrete, to find the upper limit of size_t's
> range under a particular implementation, look up the value of the
> SIZE_MAX macro in the stdint.h header.

Ah yes, sorry, I misread the spec.

>
> > size_t doesn't seem to be the good type to use when the variable of
> > that type describes the number of elements of a buffer whose type
> > is not 'char' and if the buffer size is less than 65535 bytes.
>
> > Is that correct ?
>

> Why do you say it's not a good type to represent the size of non-char
> objects? What's your reasoning for this?
>

Well, size_t is the type of the value returned by sizeof(). And
sizeof() returns the number of bytes (ie char) of its operand. So I
assumed that size_t was introduced to represent a number of char.

Francis Moreau

unread,
Feb 22, 2010, 8:31:42 AM2/22/10
to
On Feb 21, 9:44 pm, Seebs <usenet-nos...@seebs.net> wrote:

> On 2010-02-21, Francis Moreau <francis.m...@gmail.com> wrote:
>
> > So I took a look to the C99 spec and see what it tells about size_t:
> > and it's the type of the retuned value by sizeof() (6.5.3.4 p4) and
> > its max value is 65535 (7.18.3 p2).
>
> You are incorrect.
>
> Its max value is *AT LEAST* 65535.
>
> It may be much, much, much, larger.
>
> > size_t doesn't seem to be the good type to use when the variable of
> > that type describes the number of elements of a buffer whose type is
> > not 'char' and if the buffer size is less than 65535 bytes.
> > Is that correct ?
>
> I don't understand what you are trying to do.
>

I'm just trying to understand what (expert) people can deduce when
they're seeing an object whose type is size_t.

For example, if you see the following declaration:

int do_something_on_an_array(struct foo array[], size_t len);

Does the 'len' parameter imply the size in bytes of 'array' (the one
that the sizeof operator would return, assuming the length of the array
is known), or does it mean the number of objects of type 'struct foo'
in 'array'?

To sum up, I was wondering whether there are any assumptions that can
be made about a size_t.

Thanks

Francis Moreau

unread,
Feb 22, 2010, 8:36:03 AM2/22/10
to
On Feb 21, 3:44 pm, Malcolm McLean <malcolm.mcle...@btinternet.com>
wrote:

> On Feb 21, 3:37 pm, Francis Moreau <francis.m...@gmail.com> wrote:
>
> > size_t doesn't seem to be the good type to use when the variable of
> > that type describes the number of elements of a buffer whose type is
> > not 'char' and if the buffer size is less than 65535 bytes.
>
> size_t is an int designed by committee.
>
> The idea was that you would have a special type to hold amounts of
> memory. Since, usually, the address space of a processor is the same
> as the pointer width which is the same as an integer data register,
> size_t was specified as unsigned.
> The problem is that size_t ends up being the default index variable
> type, which causes all sorts of problems. Mostly it's psychological -
> people would much rather write int i; than size_t i when declaring a
> counter.

That's true, I did read quite a lot of C code and I've never seen an
index variable whose type was 'size_t'.


gwowen

unread,
Feb 22, 2010, 8:42:44 AM2/22/10
to
On Feb 22, 12:52 pm, Richard <rgrd...@gmail.com> wrote:

> But your code isn't any clearer at all.

So you keep asserting. "Clear" is not an on-off concept, or even a
one-dimensional one. Programming idioms are like jargon. Speaking in
jargon and acronyms, I can be clear and concise to someone familiar
with my areas of expertise, but baffling to an outsider. By dropping
that jargon I'll probably be less concise, but understood by a less
exclusive group.

The question is, to whom am I trying to be clear -- to people with my
exact skillset or people in general? To any programmer who is not
familiar with idiomatic C, and is used to writing a language that does
not have the --i idiom[0], it is not clear. To someone familiar with
the idiom, yes, of course it is, but to anyone else it is unclear.

> Anyone NOT understanding "while(i--)" has no business modifying the code in
> the first place.

Modifying is not the same as understanding. My code is comprehensible
to people who are not, by training or desire, C programmers, and they
can (and do) spot errors in it, because it's deliberately written so
that that is possible.

[0] Say Fortran. Or Matlab. Or Lisp, Haskell or pretty much any
functional language. Or Basic, Pascal, Logo, or Python. Or anyone who
can read pseudo-code, or anyone who can follow a flow chart. But
other than that, almost no-one.

santosh

unread,
Feb 22, 2010, 8:50:46 AM2/22/10
to
Richard <rgr...@gmail.com> writes:
> gwowen <gwo...@gmail.com> writes:
>> On Feb 22, 11:31 am, Richard <rgrd...@gmail.com> wrote:
>>
>>> The -- in the loop is a well known C usage and if its not clear
>>> to you then your C is hazy to say the least. It is much more
>>> confusing and hard to read to put the decrement on some line in
>>> the body.
>>
>> Your code is clear for people with good C skills.
>> My code is clear for people without good C skills, and clear (but
>> non- idiomatic) for those with good skills, and clear-but-hideous
>> for C mavens. I'm OK with that.
>
> But your code isn't any clearer at all.
>
> Anyone modifying or reading that code is far likelier to understand
> the common C idiom of the decrement in the loop than in some random
> body line.

This can be solved to some extent by placing the x = x + 1 in a for
loop, instead of somewhere within a while.

> Anyone NOT understanding "while(i--)" has no business modifying the
> code in the first place.

From what I understand, it seems he is aiming to write code which is
easier to understand (and not necessarily modify. The two don't need
to go together), for those with some knowledge of algorithms and
pseudo-code, but little or no knowledge of C.


Nick Keighley

unread,
Feb 22, 2010, 8:52:16 AM2/22/10
to
On 22 Feb, 10:52, Malcolm McLean <malcolm.mcle...@btinternet.com>
wrote:

> On Feb 22, 12:46 pm, Nick Keighley <nick_keighley_nos...@hotmail.com>
> wrote:
>
> > I'm never happy with do-while. Are we certain the loop body should
> > always be executed at least once? I use do-while but I think rather
> > carefully first. This was a bug in the original Fortran- it always did
> > a loop at least once.
>
> I almost never use a do-while.

ditto

> Usually you need a empty case where the loop never executes. Other
> times it's easier to code the logic into a while loop.

I sometimes end up with a do-while when it's "get an object and if it's
no good try again". This seems to code quite naturally as a do-while,
though C's ability to do assignment in the while test means you don't
have to use it.

do
{
    read_an_item (&item);
} while (!is_valid (item));

for some reason I like this with user input

of course

while (!is_valid (read_an_item (&item)))
;

and reserve for(;;) for break-out-of-the-middle cases

for (;;)
{
    msg = get_msg();
    if (msg == STOP)
        break; /* <-- break here! */
    process_msg(msg);
}

> In Fortran 77 the fact that a loop body always executes at least once
> can be a real nuisance..

I remember.

People could accidentally create this problem with Algol-60 and its
descendants (Algol-60 didn't have a proper while loop)

FOR i := 0, i + 1 WHILE i <= last_used_entry DO
process (item [i]);

it always processes item[0] (assuming I have the syntax right!)


Malcolm McLean

unread,
Feb 22, 2010, 8:53:08 AM2/22/10
to
On Feb 22, 3:31 pm, Francis Moreau <francis.m...@gmail.com> wrote:
>
> For example, if you see the following declaration:
>
>    int do_something_on_an_array(struct foo array[], size_t len);
>
> Does 'len' parameter imply a size in bytes of 'array' (the one that
> sizeof() operator would return assuming the length of the array is
> known) or does it mean the number of object of type 'struct foo' in
> 'array'.
>
In qsort, no. The function takes two size_ts, one giving element width
in bytes, which is where you'd expect a size_t, the other giving the
number of elements, which we would expect to be an int.
The justification is that int may not be big enough to index an entire
array. This could happen a) if int is the address size of the
processor, and the array is an array of chars taking up more than half
of memory, or b) if int is smaller than the address space of the
machine.
a) is so unlikely that we can ignore it. b) can happen if int is not
64 bits on a machine with a 64 bit address space.
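
To make the qsort case concrete, here is a small sketch. The wrapper sort_ints and the comparison function are hypothetical names; only qsort itself is the standard library call, and both its second argument (element count) and third argument (element width in bytes) have type size_t.

```c
#include <stddef.h>
#include <stdlib.h>

/* Comparison callback for qsort: orders ints ascending. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);   /* avoids the possible overflow of x - y */
}

/* Hypothetical wrapper: 'count' is a number of elements, not bytes. */
void sort_ints(int *values, size_t count)
{
    /* qsort(base, nmemb, size, compar): nmemb and size are both size_t. */
    qsort(values, count, sizeof values[0], compare_ints);
}
```

A caller with a true array can pass sizeof v / sizeof v[0] as the count, which is already a size_t.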

gwowen

unread,
Feb 22, 2010, 8:53:26 AM2/22/10
to
On Feb 22, 1:50 pm, santosh <santosh....@gmail.com> wrote:

> From what I understand, it seems he is aiming to write code which is
> easier to understand (and not necessarily modify. The two don't need
> to go together), for those with some knowledge of algorithms and
> pseudo-code, but little or no knowledge of C.

Yes, that's exactly right.

Francis Moreau

unread,
Feb 22, 2010, 8:57:15 AM2/22/10
to
On Feb 21, 4:50 pm, Richard Heathfield <r...@see.sig.invalid> wrote:

> Malcolm McLean wrote:
>
> > The problem is that size_t ends up being the default index variable
> > type, which causes all sorts of problems. Mostly it's psychological -
> > people would much rather write int i; than size_t i when declaring a
> > counter. However there are also many situations where unsigned indices
> > are inconvenient, eg for(i=N-1;i>=0;i--).

>
> Idioms is there for those as wants to count down:
>
> size_t i = N;
> while(i--)
>
> is simpler, shorter, and more correcterer.
>

FWIW, I prefer just keep using 'int' for index, because

a) a variable whose name is 'i' has always 'int' type
for me;

b) I feel more confident to write this:

int i = N;
while (i-- >= 0) { .... };

because this is more robust and you can have in
the body of the while construct something like
this: "i -= X" where X > 1 without worrying if 'i'
is greater than X.

santosh

unread,
Feb 22, 2010, 9:03:07 AM2/22/10
to

As far as I know, the purpose of size_t seems to be to serve as a
portable type to hold the sizes of objects. However it's also the
only type guaranteed to hold the number of elements of an object, in
a strict sense. However I agree it seems unnatural to use it for
indexing arrays. It's probably the ugly name. As I said, unsigned
long should work in nearly all cases, if you'd prefer that.

But for maximum portability, I guess size_t is the way to go, both
for holding sizes and index values for arrays.


Richard Tobin

unread,
Feb 22, 2010, 9:07:42 AM2/22/10
to
In article <M-SdnUFuaLJKpB_W...@bt.com>,
Richard Heathfield <r...@see.sig.invalid> wrote:

>> It requires the reader to remember the
>> difference between --i and i--, and it requires them to be aware of
>> the implicit int-to-bool conversion.

>I would expect any serious C programmer to be aware of both of these
>without having to think too strenuously about it

I also think it's unclear, but not because the reader is likely to
be unaware of the difference. The trouble is that it seems natural
for a test at the top of a loop to be testing the value that will
be used in the loop, but here it is testing a different value.

I suppose you could use

for(i=N-1; i != (size_t)-1; i--)

but it's not pretty.
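
A sketch of the two unsigned count-down forms side by side, with made-up function names and a summing body standing in for real work. Both rely on unsigned wrap-around being well defined, so N == 0 is handled.

```c
#include <stddef.h>

/* Sum arr[n-1] .. arr[0] using the "while (i--)" idiom. */
long sum_reverse_while(const int *arr, size_t n)
{
    long total = 0;
    size_t i = n;
    while (i--) {           /* tests the old value; the body sees old - 1 */
        total += arr[i];
    }
    return total;
}

/* Same traversal with the explicit (size_t)-1 sentinel. */
long sum_reverse_for(const int *arr, size_t n)
{
    long total = 0;
    size_t i;
    /* n - 1 wraps to (size_t)-1 when n == 0, so the loop is skipped. */
    for (i = n - 1; i != (size_t)-1; i--) {
        total += arr[i];
    }
    return total;
}
```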

-- Richard
--
Please remember to mention me / in tapes you leave behind.

Richard Tobin

unread,
Feb 22, 2010, 9:17:51 AM2/22/10
to
In article <43661e07-5551-4f93...@u9g2000yqb.googlegroups.com>,
gwowen <gwo...@gmail.com> wrote:

>size_t i=N-1; // implicitly assume N!=0
>do {
> foo(i); // or more likely foo(bar[i-1])
> i = i - 1; // or --i or i--, as you prefer.
>} while(i != 0);

This runs the loop with values N-1 ... 1. If you're going to use i-1
as the array index in the loop, you should have set it to N, not N-1,
at the start.

And if you're going to use i-1, you might as well write

for(i=N; i>0; i++)
... i-1 ...;

Richard Tobin

unread,
Feb 22, 2010, 9:21:55 AM2/22/10
to
In article <YZmdndsa4rGUpB_W...@bt.com>,
Richard Heathfield <r...@see.sig.invalid> wrote:

>It's a trivially small error to make, if you compare it to a similar
>error once made by Isaac Asimov. He once managed to mislay a factor of
>10^23, which rather knocks 10^6 into the shade.

I once saw someone on Usenet mistakenly use 2^70 as the number of
atoms in the universe, instead of 10^70, which is out by a factor of
about 10^49.

Richard

unread,
Feb 22, 2010, 9:25:14 AM2/22/10
to
santosh <santo...@gmail.com> writes:

That doesn't wash with me.

Putting the decrement in the body makes it less clear.

If a post decrement is too clever for the reader then so is using C.

--
"Avoid hyperbole at all costs, its the most destructive argument on
the planet" - Mark McIntyre in comp.lang.c

Francis Moreau

unread,
Feb 22, 2010, 10:06:00 AM2/22/10
to
On Feb 22, 2:53 pm, Malcolm McLean <malcolm.mcle...@btinternet.com>
wrote:

But 'unsigned long' type could have been used, couldn't it ?


gwowen

unread,
Feb 22, 2010, 10:07:21 AM2/22/10
to
On Feb 22, 2:25 pm, Richard <rgrd...@gmail.com> wrote:

> Putting the decrement in the body makes it less clear.

So you keep asserting. If you say it again, does it become magically
true?

> If a post decrement is too clever for the reader then so is using C.

Using is not understanding. Understanding is not modifying.
Modifying is not bug-spotting. Which do you mean?

Francis Moreau

unread,
Feb 22, 2010, 10:08:56 AM2/22/10
to

and

c) size_t is just a very misleading name for
something that doesn't hold a size (ie index)

santosh

unread,
Feb 22, 2010, 10:23:26 AM2/22/10
to
Francis Moreau <franci...@gmail.com> writes:
> On Feb 22, 2:53 pm, Malcolm McLean <malcolm.mcle...@btinternet.com>
> wrote:
>> On Feb 22, 3:31 pm, Francis Moreau <francis.m...@gmail.com> wrote:
>>
>> > For example, if you see the following declaration:
>>
>> > int do_something_on_an_array(struct foo array[], size_t len);
>>
>> > Does 'len' parameter imply a size in bytes of 'array' (the one
>> > that sizeof() operator would return assuming the length of the
>> > array is known) or does it mean the number of object of type
>> > 'struct foo' in 'array'.
>>
>> In qsort,no. The function takes two size_ts, one giving element
>> width in bytes, which is where you'd expect a size_t, the other
>> giving the number of elements, which we would expect to be an int.

They are both integer types. Why expect one instead of another? By
your reasoning then one would expect an int at any place where an
integer type is warranted, but C hasn't evolved that way. Instead we
have a multiplicity of integer types.

>> The justification is that int may not be big enough to index an
>> entire array. This could happen a) if int is the address size of
>> the processor, and the array is an array of chars taking up more
>> than half of memory, or b) if int is smaller than the address
>> space of the machine.
>> a) is so unlikely that we can ignore it. b) can happen if int is
>> not 64 bits on a machine with a 64 bit address space.
>
> But 'unsigned long' type could have been used, couldn't it ?

Right, but size_t is maximally portable (whatever that means) while
unsigned long is not. It's possible to have a 32 bit unsigned long on
a machine with a 64 bit address space, though I don't know of any
actual implementation that does that. size_t is guaranteed to "just
work" across all standard implementations for holding sizes, and
serving as indexes.

Consider an architecture with 64 bit or higher integers and a 16 bit
or lower address space. In this case using unsigned long to hold
sizes and indexes would waste storage, while size_t would presumably
be more economical, being a typedef for an unsigned short or an
unsigned int.
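
Whether unsigned long is wide enough can be tested rather than guessed. The sketch below uses a hypothetical helper name and assumes C99's <stdint.h>. As far as I know, 64 bit Windows (the LLP64 model) is one real case where unsigned long is 32 bits while size_t is 64 bits, so the check fails there.

```c
#include <limits.h>   /* ULONG_MAX */
#include <stdint.h>   /* SIZE_MAX (C99) */

/* Returns 1 if unsigned long can represent every size_t value on
   this implementation, 0 otherwise. */
int ulong_covers_size_t(void)
{
    return ULONG_MAX >= SIZE_MAX;
}
```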


Malcolm McLean

unread,
Feb 22, 2010, 10:53:03 AM2/22/10
to
On Feb 22, 5:23 pm, santosh <santosh....@gmail.com> wrote:
>
> Right, but size_t is maximally portable (whatever that means) while
> unsigned long is not.
>
size_t is the only type that is guaranteed to be able to index any
array. So if the number of elements is arbitrary, it's the only
correct type to use.
The problem is that very few people actually do so. So we've got a
very undesirable situation.

santosh

unread,
Feb 22, 2010, 11:06:27 AM2/22/10
to

You make a good point. IMHO there is not much practical difference
between unsigned long and size_t, at least on most architectures. So
you might as well use size_t where you'd otherwise use unsigned long.
But int or long should be perfectly fine for objects which you know
won't exceed their limits, by design and intent. Using size_t to index
into a 1 KB array, say holding a line from a config file, seems a bit
paranoid to me :-)


blm...@myrealbox.com

unread,
Feb 22, 2010, 11:15:33 AM2/22/10
to
In article <83e91186-de5c-4695...@k41g2000yqm.googlegroups.com>,

Malcolm McLean <malcolm...@btinternet.com> wrote:
> On Feb 22, 12:46 pm, Nick Keighley <nick_keighley_nos...@hotmail.com>
> wrote:
> >
> > I'm never happy with do-while. Are we certain the loop body should
> > always be executed at least once? I use do-while but I think rather
> > carefully first. This was a bug in the original Fortran- it always did
> > a loop at least once.
> >
> I almost never use a do-while.
>
> Usually you need a empty case where the loop never executes. Other
> times it's easier to code the logic into a while loop.
> In Fortran 77 the fact that a loop body always executes at least once
> can be a real nuisance..

Not that it matters in this group, really, but I was under the
impression that one of the things that made FORTRAN 77 different
from its predecessors was that loops could execute zero times
if the range of indices was empty. Possibly the old behavior
(at least one trip through the loop no matter what) was supposed
to be made available via some compiler option?

--
B. L. Massingill
ObDisclaimer: I don't speak for my employers; they return the favor.

Richard

unread,
Feb 22, 2010, 11:18:05 AM2/22/10
to
Malcolm McLean <malcolm...@btinternet.com> writes:

The whole use of size_t is a load of bollox and c.l.c pedantry.

Use an int when you know you're indexing an array of "only a few"
elements. I know I find it more readable and just more natural. The
counter and index is an integer. size_t is silly in all but the most
extreme circumstances.

Keith Thompson

unread,
Feb 22, 2010, 11:33:20 AM2/22/10
to
gwowen <gwo...@gmail.com> writes:
> On Feb 22, 8:27 am, Keith Thompson <ks...@mib.org> wrote:
>
>> Any C programmer needs to know the difference between --i and i--,
>
> I know the difference. I've known the difference for years. But I
> still have to think about it (if you see what I mean. I know which
> one's Ant and which one's Dec, but I have to think about that too).
> If I'm reading some code, I don't want my concentration unnecessarily
> broken by having to recall some syntactical nicety, even a
> relatively. The next guy to read my code may have to think harder
> than me.
>
>> and there is no implicit int-to-bool conversion here.   The
>> condition in a while statement is a scalar that's tested for
>> inequality to 0.
>
> That test for inequality is implicit: an explicit one would look like
> while(--i != 0). I defer to your knowledge on whether this counts as
> a conversion to bool, but whatever such an implicit test is called, I
> don't care for it with --i or i--. That's writing for the compiler,
> not the human reader.

The test for inequality is part of the definition of the while
statement; it also occurs in "while (x > 0)" (x > 0 yields 0 or 1;
the behavior of the while statement is controlled by whether the
result is unequal to 0).

There could hardly be a conversion to bool, since bool (or _Bool)
didn't exist in C prior to C99. C99 could have changed the rules for
conditions, but it didn't.

That's what the language says. Personally, though, I agree with you.
I dislike the use of expressions that aren't logically Boolean
as conditions. By "logically Boolean", I mean having two possible
meaningful values, where 0 denotes a false condition and anything
non-zero denotes a true condition, with no meaningful distinction
among non-zero values. This includes results of certain operators,
values of variables used in this way, and of course anything of
type _Bool. For other kinds of expressions, I prefer an explicit
"!= 0" or "!= '\0'" or "!= 0.0" or "!= NULL".

But plenty of C programmers don't feel that way, and we all need
to be able to read and understand their code.

> Personally, I almost never use --i as anything but an stand-alone
> expression, don't use i-- unless I can absolutely help it. Is there a
> compiler anywhere for which
>
> z = i--;
>
> produces different code than
>
> z = i;
> i = i-1;

Maybe, maybe not. They have the same effect; it doesn't make
sense to choose one or the other based on the generated code
(unless you're working around a serious compiler bug).

> And, if not, which one is clearer to a neophyte C coder who's been
> given my code to maintain (poor bastard), or a Fortran programmer
> trying to see how my C code works, or a mathematician checking my
> implementation of his algorithm? Yes, its minor a stylistic point,
> and they're automatically subjective, but that's my opinion. I don't
> doubt yours is at least as valid, and probably more widely held.

As a standalone statement, I find "i = i - 1;" *less* clear than
"i--;" or "--i;". It would make me ask myself whether the author
is unaware of the "--" operator, and therefore probably shouldn't
be writing C.

And I don't write code for neophyte programmers (except for
examples I post here), programmers who don't know the language,
or non-programmers. It needs to be clear to my peers, including
myself a year later. My style may be more straightforward than some
(I might write "z = i; i--;" rather than "z = i--;"), but I'm not
going to dumb it down to cater to people who probably won't see my
code anyway.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Keith Thompson

unread,
Feb 22, 2010, 11:45:47 AM2/22/10
to
Phil Carmody <thefatphi...@yahoo.co.uk> writes:
> Richard Heathfield <r...@see.sig.invalid> writes:
>> Malcolm McLean wrote:
>>> The worst problem is that, because C is strictly typed, an int * is
>>
>> I don't recall ever having to do that, in 20+ years of using C.
>
> I don't remember C being a strictly typed language, on the
> presumption that implies something similar to being strongly
> typed.

That depends on what you mean by "strongly typed".

Malcolm's point (which you partially snipped) is that int* and
size_t* are incompatible types, and he's correct; assigning an int*
to a size_t*, or vice versa, is a constraint violation.

In a sense, C pointer types (other than void*) are "strongly typed",
but C arithmetic types are not.
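
A sketch of what that incompatibility means in practice. The function names are invented for illustration: an interface that writes through a size_t * cannot be handed the address of an int, so a caller that keeps its count in an int needs a temporary.

```c
#include <stddef.h>

/* Hypothetical API that reports a count through a size_t pointer. */
void count_positive(const int *arr, size_t n, size_t *out_count)
{
    size_t i, count = 0;
    for (i = 0; i < n; i++) {
        if (arr[i] > 0) {
            count++;
        }
    }
    *out_count = count;
}

/* A caller that wants an int result goes through a size_t temporary;
   passing the address of an int directly is a constraint violation. */
int count_positive_as_int(const int *arr, size_t n)
{
    size_t tmp;
    count_positive(arr, n, &tmp);
    return (int)tmp;   /* assumes the count fits in an int */
}
```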

Keith Thompson

unread,
Feb 22, 2010, 11:54:11 AM2/22/10
to

And that strikes me as a specialized and fairly rare requirement.

If you have a specific need for your code to be legible by "those
with some knowledge of algorithms and pseudo-code, but little or
no knowledge of C", then of course that's what you should do.

Most of the rest of us have no such requirement, and catering to it
has a non-zero cost (making the code less clear to the experienced C
programmers who are actually likely to read it) that we're unwilling
to pay.

Yet another case is writing code for the purpose of teaching C.
Such code needs to be readable by inexperienced C programmers,
but it should introduce common C idioms. It sounds like you're
not in that position.

Code should be written with its audience in mind -- and the compiler
is not the only audience to be considered.

Keith Thompson

unread,
Feb 22, 2010, 12:00:24 PM2/22/10
to
Francis Moreau <franci...@gmail.com> writes:
> On Feb 21, 4:49 pm, santosh <santosh....@gmail.com> wrote:
[...]

>> Why do you say it's not a good type to represent the size of non-char
>> objects? What's your reasoning for this?
>
> Well, size_t is the type of the value returned by sizeof(). And
> sizeof() returns the number of bytes (ie char) of its operand. So I
> assumed that size_t was introduced to represent a number of char.

And a consequence of that is that, since array elements are at
least one byte, size_t can also safely be used to represent a
number of array elements, or an array index. No other predefined
type guarantees this (other than uintmax_t, which is overkill).
Maybe a smaller type than size_t suffices to count elements of an
array of double, but it's not worth the effort to figure that out.

There are only a limited number of predefined integer types.
We can't expect each one to have a name that precisely
reflects all the purposes for which it can be used. A name like
"size_or_count_or_index_t" might be more accurate than "size_t",
but I like "size_t" just fine.
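
A small sketch of that argument, using a hypothetical struct and macro name: the element count of an array can never exceed its size in bytes, so any count fits wherever the byte size fits.

```c
#include <stddef.h>

/* Number of elements in a true array (not a pointer). */
#define ARRAY_LEN(a) (sizeof (a) / sizeof (a)[0])

struct sample { double value; int id; };   /* hypothetical element type */

/* Returns 1 if the element count is 10 and no larger than the byte size. */
int count_fits_where_size_fits(void)
{
    struct sample buf[10];
    size_t nbytes = sizeof buf;        /* size in bytes */
    size_t count  = ARRAY_LEN(buf);    /* number of elements */
    return count == 10 && count <= nbytes;
}
```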

Richard Heathfield

unread,
Feb 22, 2010, 12:56:51 PM2/22/10
to
Francis Moreau wrote:
> On Feb 21, 4:50 pm, Richard Heathfield <r...@see.sig.invalid> wrote:
>> Malcolm McLean wrote:
>>
>>> The problem is that size_t ends up being the default index variable
>>> type, which causes all sorts of problems. Mostly it's psychological -
>>> people would much rather write int i; than size_t i when declaring a
>>> counter. However there are also many situations where unsigned indices
>>> are inconvenient, eg for(i=N-1;i>=0;i--).
>> Idioms is there for those as wants to count down:
>>
>> size_t i = N;
>> while(i--)
>>
>> is simpler, shorter, and more correcterer.
>>
>
> FWIW, I prefer just keep using 'int' for index, because
>
> a) a variable whose name is 'i' has always 'int' type
> for me;
>
> b) I feel more confident to write this:
>
> int i = N;
> while (i-- >= 0) { .... };
>
> because this is more robust

It's also wrong. It should be > 0, not >= 0.
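
A sketch that shows the difference, with invented function names. Each function returns the lowest value of i that the loop body sees; a real body would use it as arr[i].

```c
/* Lowest index seen by the body with the ">= 0" test as posted. */
int lowest_index_ge(int n)
{
    int lowest = n;
    int i = n;
    while (i-- >= 0) {
        if (i < lowest) {
            lowest = i;   /* reaches -1: one step past the array start */
        }
    }
    return lowest;
}

/* Lowest index seen by the body with the "> 0" test. */
int lowest_index_gt(int n)
{
    int lowest = n;
    int i = n;
    while (i-- > 0) {
        if (i < lowest) {
            lowest = i;   /* stops at 0 */
        }
    }
    return lowest;
}
```

For N == 3 the first form runs the body with i = 2, 1, 0, -1 and the second with i = 2, 1, 0.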

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within

Richard Heathfield

unread,
Feb 22, 2010, 12:57:53 PM2/22/10
to
Richard Tobin wrote:
> In article <43661e07-5551-4f93...@u9g2000yqb.googlegroups.com>,
> gwowen <gwo...@gmail.com> wrote:
>
>> size_t i=N-1; // implicitly assume N!=0
>> do {
>> foo(i); // or more likely foo(bar[i-1])
>> i = i - 1; // or --i or i--, as you prefer.
>> } while(i != 0);
>
> This runs the loop with values N-1 ... 1. If you're going to use i-1
> as the array index in the loop, you should have set it to N, not N-1,
> at the start.
>
> And if you're going to use i-1, you might as well write
>
> for(i=N; i>0; i++)

ITYM i--
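
For reference, a sketch comparing the two loops with the decrement in place. The function names are made up; each returns the lowest index the body would use. The do-while version assumes n >= 2, because with n == 1 the unsigned counter wraps and the loop runs for a very long time.

```c
#include <stddef.h>

/* The do-while as posted, using i itself as the index: visits
   n-1 .. 1 and never reaches 0. Assumes n >= 2. */
size_t lowest_index_do_while(size_t n)
{
    size_t i = n - 1;
    size_t lowest = i;
    do {
        lowest = i;       /* stands in for foo(i) */
        i = i - 1;
    } while (i != 0);
    return lowest;
}

/* The for loop with i-- and i - 1 as the index: visits n-1 .. 0. */
size_t lowest_index_for(size_t n)
{
    size_t i, lowest = n;
    for (i = n; i > 0; i--) {
        lowest = i - 1;   /* index actually used in the body */
    }
    return lowest;
}
```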

Ersek, Laszlo

unread,
Feb 22, 2010, 12:56:32 PM2/22/10
to
In article <bnga57-...@news.eternal-september.org>, Richard <rgr...@gmail.com> writes:
> santosh <santo...@gmail.com> writes:
>
>> Richard <rgr...@gmail.com> writes:
>>> gwowen <gwo...@gmail.com> writes:
>>>> On Feb 22, 11:31 am, Richard <rgrd...@gmail.com> wrote:
>>>>
>>>>> The -- in the loop is a well known C usage and if its not clear
>>>>> to you then your C is hazy to say the least. It is much more
>>>>> confusing and hard to read to put the decrement on some line in
>>>>> the body.
>>>>
>>>> Your code is clear for people with good C skills.
>>>> My code is clear for people without good C skills, and clear (but
>>>> non- idiomatic) for those with good skills, and clear-but-hideous
>>>> for C mavens. I'm OK with that.
>>>
>>> But your code isn't any clearer at all.
>>>
>>> Anyone modifying or reading that code is far likelier to understand
>>> the common C idiom of the decrement in the loop than in some random
>>> body line.
>>
>> This can be solved to some extent by placing the x = x + 1 in a for
>> loop, instead of somewhere within a while.
>>
>>> Anyone NOT understanding "while(i--)" has no business modifying the
>>> code in the first place.
>>
>> From what I understand, it seems he is aiming to write code which is
>> easier to understand (and not necessarily modify. The two don't need
>> to go together), for those with some knowledge of algorithms and
>> pseudo-code, but little or no knowledge of C.
>>
>
> That doesnt wash with me.
>
> Putting the decrement in the body makes it less clear.
>
> If a post decrement is too clever for the reader then so is using C.

for (i = 0; i < N; ++i) {
arr[i];
}

is the same as (barring continue / break etc)

i = 0;
while (i < N) {
arr[i];
++i;
}


and to reverse the traversal, I like to write the following:

i = N;
while (0 < i)
{
--i;
arr[i];
}

Reasons:

- The syntax of arr[i] doesn't change.

- The set of arr subscripts is the same as before ([0 .. N-1]), only in
reverse order.

- "i" traverses the exact same value set as before ([0 .. N]), only in
reverse order.

The formula

i = N;
while (i--) {
arr[i];
}

contains one superfluous decrement and invalidates the last point: "i"
will finish with (type)-1, and the set of values visited by "i" will
become [0 .. N] U { (type)-1 }.
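
The difference in the final value of "i" can be checked with a sketch like this one (function names are invented, and size_t stands in for "type"):

```c
#include <stddef.h>

/* Decrement at the top of the body: i ends at 0. */
size_t final_i_decrement_in_body(size_t n)
{
    size_t i = n;
    while (0 < i) {
        --i;
        /* arr[i] would be used here */
    }
    return i;
}

/* Post-decrement in the test: one extra decrement happens when the
   test fails, so i ends at (size_t)-1. */
size_t final_i_decrement_in_test(size_t n)
{
    size_t i = n;
    while (i--) {
        /* arr[i] would be used here */
    }
    return i;
}
```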

Cheers,
lacos

Richard Heathfield

unread,
Feb 22, 2010, 1:00:48 PM2/22/10
to
Richard Tobin wrote:
> In article <M-SdnUFuaLJKpB_W...@bt.com>,
> Richard Heathfield <r...@see.sig.invalid> wrote:
>
>>> It requires the reader to remember the
>>> difference between --i and i--, and it requires them to be aware of
>>> the implicit int-to-bool conversion.
>
>> I would expect any serious C programmer to be aware of both of these
>> without having to think too strenuously about it
>
> I also think it's unclear, but not because the reader is likely to
> be unaware of the difference. The trouble is that it seems natural
> for a test at the top of a loop to be testing the value that will
> be used in the loop,

You mean as in while(*s++=*t++); ? :-)

Seebs

unread,
Feb 22, 2010, 1:28:14 PM2/22/10
to
On 2010-02-22, Francis Moreau <franci...@gmail.com> wrote:
> I'm just trying to understand what (expert) people can deduce when
> they're seeing an object whose type is size_t.

That it's non-negative.

> For example, if you see the following declaration:

> int do_something_on_an_array(struct foo array[], size_t len);

> Does 'len' parameter imply a size in bytes of 'array' (the one that
> sizeof() operator would return assuming the length of the array is
> known) or does it mean the number of object of type 'struct foo' in
> 'array'.

I would expect the latter.

The times when I'll assume len to be in bytes will be when it's associated
with a char * or void *. Otherwise, I assume it to be in the relevant unit.
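
A sketch of that convention, with a hypothetical struct foo layout and made-up function names: the typed array takes a count of elements, the void * buffer takes a count of bytes.

```c
#include <stddef.h>
#include <string.h>

struct foo { int id; };   /* hypothetical layout, for illustration only */

/* 'len' is a number of struct foo elements. */
long sum_ids(const struct foo array[], size_t len)
{
    long total = 0;
    size_t i;
    for (i = 0; i < len; i++) {
        total += array[i].id;
    }
    return total;
}

/* 'nbytes' is a number of bytes, as suggested by the void * parameter. */
void clear_bytes(void *buffer, size_t nbytes)
{
    memset(buffer, 0, nbytes);
}
```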

-s
--
Copyright 2010, all wrongs reversed. Peter Seebach / usenet...@seebs.net
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!

Kelsey Bjarnason

unread,
Feb 22, 2010, 1:37:54 PM2/22/10
to
On Mon, 22 Feb 2010 04:13:12 -0800, gwowen wrote:

> On Feb 22, 11:13 am, Kelsey Bjarnason <kbjarna...@gmail.com> wrote:
>
>> And while you're at it, are you writing code for C programmers, or for
>> mathematicians?
>
> Yes. If a reasonably computer-savvy mathematician, with some
> programming experience, not necessarily in C, cannot understand my C
> code, sufficiently well to detect whether I've correctly implemented an
> algorithm with which they're familiar, then it probably needs
> refactoring for clarity.

Assuming he's conversant on the finicky details of C, such as which
operations can overflow, the effects of shifting, etc, etc, etc, etc,
etc. And if he knows this, a little pre-inc or post-inc isn't going to
bother him a bit, nor is an implicit comparison to zero.


> Yes, I know this is a fairly extreme position. No, I don't expect anyone
> who is not working on my codebase to code in this manner.
>
>> It's just not how C programmers write C.
>
> It's how some C programmers write C.

I've been using C, personally and professionally, for, oh, a couple
decades and change and I have yet to encounter C code from anyone who was
not either a rank newbie or attempting obfuscation which used constructs
such as you suggest.

They may be out there, professional C coders who, for some reason, avoid
standard C idioms, but I don't know _where_ out there.

> I block out a lot of algorithms in
> Matlab, then port the timing-critical bits to C (mainly to avoid the
> copy-in-copy-out that plagues matrix operations in Matlab). ++i does
> nothing in Matlab, and i++ is a syntax error, so you see a lot of
>
> i = i+1;
>
> When porting, I'm not going to change that to ++i just for idiomatic
> reasons.

Which would be fine in a module or routine documented as such: "Code
generated [in|by] MatLab; the use of non-idiomatic expressions is
annoying but functional."

At least then you're letting the future maintainers know that the code
isn't written by a newbie, but appears that way for a specific reason.

John Bode

Feb 22, 2010, 1:46:54 PM
On Feb 22, 10:18 am, Richard <rgrd...@gmail.com> wrote:

> Malcolm McLean <malcolm.mcle...@btinternet.com> writes:
> > On Feb 22, 5:23 pm, santosh <santosh....@gmail.com> wrote:
>
> >> Right, but size_t is maximally portable (whatever that means) while
> >> unsigned long is not.
>
> > size_t is the only type that is guaranteed to be able to index any
> > array. So if the number of elements is arbitrary, it's the only
> > correct type to use.
> > The problem is that very few people actually do so. So we've got a
> > very undesirable situation.
>
> The whole use of size_t is a load of bollox and c.l.c pedantry.
>
> Use an int  when you know you're indexing an array of "only a few"
> elements.

This attitude has bitten me in the ass more than once in a mixed-
platform environment (where I spent most of the '90s and the first
half of the '00s); I was working on a machine with 32-bit ints, and
what started out as "only a few" wound up growing into "a metric
assload", where "a metric assload" > 2^16. I didn't realize there was
a problem until the next build went into testing, where the 16-bit int
machine started blowing up.

As a result of the time I wasted on that and other projects, I now
uniformly declare array indices as size_t, regardless of the array
size. Requirements change, environments change, and it just makes
sense to pick the one type that will work in any conceivable scenario,
rather than saying "I'll use an int here, an unsigned long there, and
size_t that other place, and if the requirements or operating
environment change I can always waste a little time to go back and
tweak the small stuff." Use the thing that works everywhere all the
time and be done with it; you've got bigger issues to worry about.

That's not pedantry, that's bitter experience.

> I know I find it more readable and just more natural. The
> counter and index is an integer.

So is size_t (an integer, that is).

> size_t is silly in all but the most extreme circumstances.
>

It's one of the few things I know I won't have to go back and change
later. That makes up for any silliness.
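
A minimal sketch of the failure mode described above, not taken from the
original project. It assumes uint16_t as a stand-in for a platform whose
int is 16 bits wide; both function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Counts n items with a 16-bit counter, as a 16-bit-int platform would.
 * The counter silently wraps modulo 65536 once n exceeds 65535. */
uint16_t count_with_16bit(size_t n)
{
    uint16_t count = 0;
    for (size_t i = 0; i < n; i++)
        count++;
    return count;
}

/* Same loop with a size_t counter: no wrap for any n that size_t can hold. */
size_t count_with_size_t(size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count++;
    return count;
}
```

On a host where size_t is wider than 16 bits, passing 70000 to the first
function yields 4464 (70000 - 65536), while the second returns 70000.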

ImpalerCore

Feb 22, 2010, 5:08:21 PM
On Feb 21, 8:37 am, Francis Moreau <francis.m...@gmail.com> wrote:
> Hello,
>
> I usually use 'unsigned int' type for variables which hold the length
> of a buffer.
>
> However, someone suggests me to use 'size_t'.
>
> So I took a look to the C99 spec and see what it tells about size_t:
> and it's the type of the returned value by sizeof() (6.5.3.4 p4) and
> its max value is 65535 (7.18.3 p2).

>
> size_t doesn't seem to be the good type to use when the variable of
> that type describes the number of elements of a buffer whose type is
> not 'char' and if the buffer size is less than 65535 bytes.
>
> Is that correct ?
>
> Thanks

I'll throw in my size_t issue here.

I have a few functions that take an 'ssize_t' parameter, where -1
means "use the whole string" or "up to the end of the container",
e.g.

my_string_t* my_string_append_n( my_string_t* mstr, const char* str,
ssize_t n );

If n is -1, the entire string 'str' is appended to 'mstr'.

I use size_t on a regular basis, but what is the correct type for
ssize_t? I currently check something like HAVE_SSIZE_T, and if it
doesn't exist, define ssize_t as ptrdiff_t. Is this a viable way to
approach indexes that you want to allow negative values? If not, how
would you do it?
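
One possible shape for this, as a sketch only. It assumes ptrdiff_t is an
acceptable signed counterpart to size_t (the standard does not promise
that ptrdiff_t can represent every object size), and the names
my_ssize_t and my_append_n are hypothetical, not from any real library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical signed length type that does not depend on POSIX ssize_t. */
typedef ptrdiff_t my_ssize_t;

/*
 * Append at most n bytes of src to the NUL-terminated string in dst,
 * whose total capacity is cap bytes. n == -1 means "all of src".
 * Returns 0 on success, -1 if n is invalid or the result would not fit.
 */
int my_append_n(char *dst, size_t cap, const char *src, my_ssize_t n)
{
    size_t src_len = strlen(src);
    size_t dst_len = strlen(dst);
    size_t count;

    if (n == -1)
        count = src_len;                 /* sentinel: take everything */
    else if (n < 0)
        return -1;                       /* any other negative is an error */
    else
        count = (size_t)n < src_len ? (size_t)n : src_len;

    /* Need room for the existing text, the new bytes, and the NUL. */
    if (dst_len >= cap || count > cap - dst_len - 1)
        return -1;

    memcpy(dst + dst_len, src, count);
    dst[dst_len + count] = '\0';
    return 0;
}
```

Converting n to size_t only after the negative cases are handled keeps
the signed/unsigned comparison out of the picture.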

Thanks for the opinions.

Bill Cunningham

Feb 22, 2010, 5:29:33 PM

"Richard" <rgr...@gmail.com> wrote in message
news:bnga57-...@news.eternal-september.org...

> That doesn't wash with me.
>
> Putting the decrement in the body makes it less clear.
>
> If a post decrement is too clever for the reader then so is using C.

Use a debugger.


bartc

Feb 22, 2010, 6:06:44 PM

"Richard Tobin" <ric...@cogsci.ed.ac.uk> wrote in message
news:hlu3u3$15bj$5...@pc-news.cogsci.ed.ac.uk...

> In article <YZmdndsa4rGUpB_W...@bt.com>,
> Richard Heathfield <r...@see.sig.invalid> wrote:
>
>>It's a trivially small error to make, if you compare it to a similar
>>error once made by Isaac Asimov. He once managed to mislay a factor of
>>10^23, which rather knocks 10^6 into the shade.
>
> I once saw someone on Usenet mistakenly use 2^70 as the number of
> atoms in the universe, instead of 10^70, which is out by a factor of
> about 10^49.

And I'd heard it was 10^80. Which is enough material for ten thousand
million universes, if your figure is correct. And if mine is the correct
one, your universe would only run to about one galaxy.

--
Bart


Richard Heathfield

Feb 22, 2010, 6:17:18 PM

Ten thousand million universes ought to be enough for anybody.

pete

Feb 22, 2010, 10:31:30 PM
Richard Heathfield wrote:
> Malcolm McLean wrote:

>
>> On Feb 21, 3:37 pm, Francis Moreau <francis.m...@gmail.com> wrote:
>>
>>> size_t doesn't seem to be the good type to use when the variable of
>>> that type describes the number of elements of a buffer whose type is
>>> not 'char' and if the buffer size is less than 65535 bytes.
>>>
>> size_t is an int designed by committee.
>
>
> I don't think it's as bad as you paint it. In fact, it's a lot, lot,
> lot, lot, lot, lot, lot better than you paint it.

>
>> The problem is that size_t ends up being the default index variable
>> type, which causes all sorts of problems. Mostly it's psychological -
>> people would much rather write int i; than size_t i when declaring a
>> counter. However there are also many situations where unsigned indices
>> are inconvenient, eg for(i=N-1;i>=0;i--).
>
>
> Idioms is there for those as wants to count down:
>
> size_t i = N;
> while(i--)
>
> is simpler, shorter, and more correcterer.

I would write that last part as

while(i-- != 0)
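
For illustration, a small sketch of that count-down form in use. The
function name reverse_copy is made up for this example.

```c
#include <stddef.h>

/* Copy src[0..n-1] into dst in reverse order, walking src from the top down. */
void reverse_copy(int *dst, const int *src, size_t n)
{
    size_t i = n;
    size_t out = 0;

    /* The test reads i, then decrements it, so the body sees n-1 ... 0.
     * When the loop exits, i has wrapped to SIZE_MAX, which is well
     * defined for an unsigned type; i is simply not used afterwards. */
    while (i-- != 0)
        dst[out++] = src[i];
}
```

With n == 0 the body never runs, so no separate empty-array check is
needed.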

--
pete

Nick Keighley

Feb 23, 2010, 4:15:11 AM
On 22 Feb, 22:29, "Bill Cunningham" <nos...@nspam.invalid> wrote:
> "Richard" <rgrd...@gmail.com> wrote in message

no, don't
a debugger is not the right tool to learn what C constructs do

Nick Keighley

Feb 23, 2010, 4:29:24 AM
On 22 Feb, 18:37, Kelsey Bjarnason <kbjarna...@gmail.com> wrote:
> On Mon, 22 Feb 2010 04:13:12 -0800, gwowen wrote:
> > On Feb 22, 11:13 am, Kelsey Bjarnason <kbjarna...@gmail.com> wrote:

> >> And while you're at it, are you writing code for C programmers, or for
> >> mathematicians?
>
> > Yes.  If a reasonably computer-savvy mathematician, with some
> > programming experience, not necessarily in C, cannot understand my C
> > code, sufficiently well to detect whether I've correctly implemented an
> > algorithm with which they're familiar, then it probably needs
> > refactoring for clarity.
>
> Assuming he's conversant on the finicky details of C, such as which
> operations can overflow, the effects of shifting, etc, etc, etc, etc,
> etc.  And if he knows this, a little pre-inc or post-inc isn't going to
> bother him a bit, nor is an implicit comparison to zero.

yes, it's hard to imagine a mathematician being knowledgeable about
rounding and overflow but not being capable of understanding pre- and
post- increment operators. I'd accept I might have to explain them to
him, once.


> > Yes, I know this is a fairly extreme position. No, I don't expect anyone
> > who is not working on my codebase to code in this manner.
>
> >> It's just not how C programmers write C.

? this is a phrase that seems to get tossed about a lot. Isn't this
the "true Scotsman" fallacy? If they don't write C like me they aren't
*really* a C programmer.

The guy who said that if he saw
i = i + 1;

he'd suspect the whole code base! What planet is he from?!


> > It's how some C programmers write C.

yes


> I've been using C, personally and professionally, for, oh, a couple
> decades and change and I have yet to encounter C code from anyone who was
> not either a rank newbie or attempting obfuscation which used constructs
> such as you suggest.

in what sense is this code obfuscated! There are nearly always several
reasonable ways to skin a cat in C. This C-as-APL isn't the only way
to code.


> They may be out there, professional C coders who, for some reason, avoid
> standard C idioms, but I don't know _where_ out there.
>
> > I block out a lot of algorithms in
> > Matlab, then port the timing-critical bits to C (mainly to avoid the
> > copy-in-copy-out that plagues matrix operations in Matlab). ++i does
> > nothing in Matlab, and i++ is a syntax error, so you see a lot of
>
> > i = i+1;
>
> > When porting, I'm not going to change that to ++i just for idiomatic
> > reasons.
>
> Which would be fine in a module or routine documented as such: "Code
> generated [in|by] MatLab; the use of non-idiomatic expressions is
> annoying but functional."

good grief. If I saw a comment like that I'd think "pretentious ****",
but that's just me.

Nick Keighley

Feb 23, 2010, 4:35:31 AM
On 22 Feb, 11:22, Kelsey Bjarnason <kbjarna...@gmail.com> wrote:

> On Mon, 22 Feb 2010 02:54:00 -0800, Nick Keighley wrote:
> > as I used to say to my physics tutor. What's a few orders of magnitude
> > between friends?
>
> Send me your paycheck each pay period, and I'll send you back an order or
> three less and we'll see.  :)

I once heard someone refer to the earth's population as 6 million.
To which I responded "where'd everyone else go?!"

Kelsey Bjarnason

Feb 23, 2010, 4:50:32 AM
[snips]

On Tue, 23 Feb 2010 01:29:24 -0800, Nick Keighley wrote:

>> >> It's just not how C programmers write C.
>
> ? this is a phrase that seems to get tossed about a lot. Isn't this the
> "true Scotsman" fallacy? If they don't write C like me they aren't *really*
> a C programmer.

No, it's not a "True Scotsman" fallacy. To be so it would have to assert
that one _cannot_ do this and be considered a C programmer. That is not
the case here. The case here is simply noting that, experientially and
observationally, people who program in C beyond the "entry" level - that
is to say "C programmers", rather than "people learning C" - tend to use
the common and compact form, to the point that use of the "extended" form
is sufficiently unusual as to send up flags about probable "newbie"
status.

Putting it in another context... almost all electricians I knew, when
working as a sparky, used "Klein" dikes. There's a reason for it; Kleins
are particularly well designed, making the job considerably easier and
more efficient (and less physically demanding) than most other tools.

Could a professional electrician use a $6 set of dikes from Bob's House
of Tools? Yes. Would doing so mean he's not a professional
electrician? No. Would seeing such a tool in his pouch make you wonder
whether he'd been doing this long enough to be regarded as a professional
electrician? Absolutely.

> The guy who said that if he saw
> i = i + 1;
>
> he'd suspect the whole code base! What planet is he from?!

That'd be me, and Earth. I'd suspect it because, barring some
documentation explaining *why* the code used such unconventional
constructs, the use of the construct is sufficient on its face to suggest
he has very little exposure to very commonplace C coding practices -
which one would only expect to be true of a neophyte, not a seasoned vet.

Would *you* trust your app to unreviewed code produced by a neophyte? I
wouldn't.


>> I've been using C, personally and professionally, for, oh, a couple
>> decades and change and I have yet to encounter C code from anyone who
>> was not either a rank newbie or attempting obfuscation which used
>> constructs such as you suggest.
>
> in what sense is this code obfuscated!

In the sense that it rejects the standard C usage patterns for something
else, which implies there is a _reason_ for rejecting the standard usage
patterns, but the reasons for it, unless laid out explicitly, are not
obvious. The non-obvious reasoning for the use of atypical constructs
constitutes obfuscation.

> There are nearly always several
> reasonable ways to skin a cat in C. This C-as-APL isn't the only way to
> code.

No, it's not. Nor has anyone suggested it is. What has been said is
that there are some standard C idioms which are so nearly universal that
when they are rejected in favour of something else, there must be a reason
for it. That reason (in the particular case being examined) is _usually_
(based on observation) the neophyte status of the coder. If the coder is
not a neophyte, and is using such an atypical construct, then presumably
he has a specific reason for doing so, in which case his reasoning should
be explicit (to avoid confusion over why he's doing it, or his skill
level) _or_ he is intentionally doing something atypical for reasons of
obfuscation - such as relying on a non-obvious side-effect of the
construct he used which doesn't occur with the conventional construct.


>> Which would be fine in a module or routine documented as such: "Code
>> generated [in|by] MatLab; the use of non-idiomatic expressions is
>> annoying but functional."
>
> good grief. If I saw a comment like that I'd think "pretentious ****",
> but that's just me.

Why? A comment explaining that neophyte-like code looks that way for a
reason, not because the coder is, in fact, a neophyte, would be very
helpful to maintainers.

Given a non-trivial piece of "working code" to use, knowing it came from
someone such as, oh, Pfaff or Heathfield, out of their own production
code base, would tend to suggest a certain level of review was
necessary. Given a piece of code which appears to be from a rank newbie
suggests a rather different level of code review would be necessary.

Seeing a piece of code which _appears_ to be from a rank newbie, but
which is documented as appearing that way for a particular and specific
reason, such commentary provided by someone such as Pfaff or Heathfield,
gives a lot more confidence that the code is in fact correct, than its
appearance as neophyte-level code does.

Richard Heathfield

Feb 23, 2010, 5:00:21 AM

I'm *reasonably* sure he was joking - yanking Richard NoName
MyHammerIsADebuggerAndEveryProblemIsANail Riley's chain a little.

gwowen

Feb 23, 2010, 6:43:39 AM
On Feb 23, 9:50 am, Kelsey Bjarnason <kbjarna...@gmail.com> wrote:

> Why?  A comment explaining that neophyte-like code looks that way for a
> reason, not because the coder is, in fact a neophyte, would be very
> helpful to maintainers.

Or, alternatively, just drop the assumption that "code that doesn't
look like mine is neophyte code". How about judging the quality of
the coder by the robustness and correctness of the code, rather than
whether they use certain syntactic idioms.

True, this will require more thought, but occasionally using thought
rather than dogma is beneficial.

Kelsey Bjarnason

Feb 23, 2010, 7:43:49 AM
On Tue, 23 Feb 2010 03:43:39 -0800, gwowen wrote:

> On Feb 23, 9:50 am, Kelsey Bjarnason <kbjarna...@gmail.com> wrote:
>
>> Why?  A comment explaining that neophyte-like code looks that way for a
>> reason, not because the coder is, in fact a neophyte, would be very
>> helpful to maintainers.
>
> Or, alternatively, just drop the assumption that "code that doesn't look
> like mine is neophyte code".

Again, not "like mine", but "like the vast majority I have seen which was
produced by skilled C programmers".

If someone is violating common conventions, they either have a reason for
doing so, or they're simply so unfamiliar with what the rest of the world
has been doing for the last couple of decades that they may as well be a
neophyte.

We had one particular coder around here for a while who persisted in
using such bizarre macros and the like that his code was virtually
unreadable. Was it good code? Was it safe, reliable, usable code?

Maybe it was, but his choice to make it _appear_ as something else meant
few cared to bother. If one chooses to make code appear bad or
amateurish or otherwise undesirable, why blame the reader for rejecting it?


Michael Foukarakis

Feb 23, 2010, 9:25:59 AM
On Feb 22, 9:50 am, gwowen <gwo...@gmail.com> wrote:

> On Feb 21, 3:50 pm, Richard Heathfield <r...@see.sig.invalid> wrote:
>
> > Idioms is there for those as wants to count down:
>
> > size_t i = N;
> > while(i--)
>
> > is simpler, shorter, and more correcterer.
>
> Sadly, it's also less clear.  It requires the reader to remember the

> difference between --i and i--, and it requires them to be aware of
> the implicit int-to-bool conversion.

If the reader doesn't remember those, maybe he/she is worried about
concepts different than the clarity of idioms.

Michael Foukarakis

Feb 23, 2010, 9:51:08 AM
On Feb 22, 5:08 pm, Francis Moreau <francis.m...@gmail.com> wrote:
> On Feb 22, 2:57 pm, Francis Moreau <francis.m...@gmail.com> wrote:
>
>
>
> > On Feb 21, 4:50 pm, Richard Heathfield <r...@see.sig.invalid> wrote:

>
> > > Malcolm McLean wrote:
>
> > > > The problem is that size_t ends up being the default index variable
> > > > type, which causes all sorts of problems. Mostly it's psychological -
> > > > people would much rather write int i; than size_t i when declaring a
> > > > counter. However there are also many situations where unsigned indices
> > > > are inconvenient, eg for(i=N-1;i>=0;i--).
>
> > > Idioms is there for those as wants to count down:
>
> > > size_t i = N;
> > > while(i--)
>
> > > is simpler, shorter, and more correcterer.
>
> > FWIW, I prefer just keep using 'int' for index, because
>
> >    a) a variable whose name is 'i' has always 'int' type
> >       for me;
>
> >    b) I feel more confident to write this:
>
> >         int i = N;
> >         while (i-- >= 0) { .... };
>
> >       because this is more robust and you can have in
> >       the body of the while construct something like
> >       this: "i-= X" where X > 1 without worrying if 'i'
> >       is greater than X.
>
> and
>
>      c) size_t is just a very misleading name for
>         something that doesn't hold a size (ie index)

Since when is index a synonym for size?

Richard Heathfield

Feb 23, 2010, 10:00:32 AM
Michael Foukarakis wrote:
> On Feb 22, 5:08 pm, Francis Moreau <francis.m...@gmail.com> wrote:
<snip>

>> c) size_t is just a very misleading name for
>> something that doesn't hold a size (ie index)
>
> Since when is index a synonym for size?

That isn't what he meant. He meant 'index' as an example of something
that is /not/ a size, but which is nevertheless often represented by
size_t. And there's a good reason for that - a size_t is guaranteed to
be able to hold *any* correct index value into any array, because size_t
must be able to hold the size of the array, which is necessarily greater
than or equal to the largest legal offset into that array.

Michael Foukarakis

Feb 23, 2010, 10:00:58 AM
On Feb 22, 1:11 pm, gwowen <gwo...@gmail.com> wrote:
> On Feb 22, 10:55 am, Phil Carmody <thefatphil_demun...@yahoo.co.uk>
> wrote:
>
>
>
> > Richard Heathfield <r...@see.sig.invalid> writes:
> > > gwowen wrote:

> > >> On Feb 21, 3:50 pm, Richard Heathfield <r...@see.sig.invalid> wrote:
>
> > >>> Idioms is there for those as wants to count down:
>
> > >>> size_t i = N;
> > >>> while(i--)
>
> > >>> is simpler, shorter, and more correcterer.
>
> > >> Sadly, it's also less clear.
>
> > > I disagree. I would argue that it's a well-known idiom. Still, I
> > > accept that there are arguments on both sides.

>
> > >> It requires the reader to remember the
> > >> difference between --i and i--, and it requires them to be aware of
> > >> the implicit int-to-bool conversion.
>
> > > I would expect any serious C programmer to be aware of both of these
> > > without having to think too strenuously about it, but the second at
> > > least is easily dealt with:
>
> > > while(i-- > 0)
>
> > >> It's idiomatic precisely
> > >> because, until you've seen it many times, it requires more thought
> > >> than should be necessary.

>
> > >> size_t i=N-1;  // implicitly assume N!=0
> > >> do {
> > >>   foo(i);      // or more likely foo(bar[i-1])
> > >>   i = i - 1;   // or --i or i--, as you prefer.
> > >> } while(i != 0);
>
> > > I find my version much easier to read. But then I would, wouldn't I? :-)
>
> > I see yours coping with N==0, and the others not coping with it,
> > to be the black and white distinguisher.
>
> Au contraire, Blackadder. As I posted earlier...
> Having given it some thought, I now prefer this...
>
> while(i != 0) { // Here i tells us how many loop iterations remain
>   --i;          // i now indexes the i'th element of an array...
>   foo(bar[i]);
> };

That loop will make anyone who reads your code cringe, for three
reasons:

1) A while(i != 0) loop with an array indexed by i needs special
handling, which adds to confusion. (not the case with do { } while(),
hint hint)
2) Pre-decrement (--i) as special case handling raises questions. Why
--i; over i--;, or the more verbose, less idiomatic and generally
clearer i = i - 1; ?
3) It's neither common usage nor idiomatic per popular use of the term
(natural to a specific group of people, usually large enough to even
be recognized as one).

People without good C skills will be more concerned with (1) and (2)
when trying to understand whether the loop does indeed do what it's
supposed to.

Why avoid common idioms if you're indeed trying to write
comprehensible code? This is a trivial thing we're talking about; a
loop. This:

for(i = 0; i < N; i++) {
    foo(bar[i]);
}

is a million (give or take) times clearer than your solution, is
probably what any C newbie has been taught first in the subject of
loops, and conveys exactly the intended behaviour. If you are so
desperate to put it in a while loop, you can use:

while(i++ < N)
foo(bar[i]);

which does NOT raise the concerns you mention (implicit in-/equality
test, etc).

> To any programmer who is not
> familiar with idiomatic C, and is used to writing a language that does
> not have the --i idiom[0] is not clear. To someone familiar with
> idiom, yes of course it is, but to anyone else unclear.

Yet you advocate for it. Why? Are you always this inconsistent with
yourself?

Richard Heathfield

Feb 23, 2010, 10:06:09 AM
Michael Foukarakis wrote:

<snip>

> for(i = 0; i < N; i++) {
>     foo(bar[i]);
> }
>
> is a million (give or take) times clearer than your solution, is
> probably what any C newbie has been taught first in the subject of
> loops, and conveys exactly the intended behaviour.

Not if the intent is to count /down/ (which is in fact what we're
talking about here).

> If you are so
> desperate to put it in a while loop, you can use:
>
> while(i++ < N)
> foo(bar[i]);
>
> which does NOT raise the concerns you mention (implicit in-/equality
> test, etc).

But (a) it's still counting in the wrong direction, and (b) it
introduces two bugs - the failure to deal with bar[0] and the attempt to
deal with the (presumably) non-existent element bar[N]. Well, you might
argue that it's actually only a single, but double-whammy, bug.
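
To make the two bugs concrete, here is a small sketch that records which
indices each loop form would touch instead of calling foo(). It assumes
i starts at 0 for the post-increment loop and at n for the count-down;
both function names are invented for the example.

```c
#include <stddef.h>

/* Indices touched by:  i = 0; while (i++ < n) foo(bar[i]);
 * Stores up to cap of them in seen[] and returns how many were stored. */
size_t indices_postinc_while(size_t n, size_t *seen, size_t cap)
{
    size_t i = 0, k = 0;
    while (i++ < n)              /* i is already incremented in the body */
        if (k < cap)
            seen[k++] = i;
    return k;
}

/* Indices touched by:  i = n; while (i-- != 0) foo(bar[i]); */
size_t indices_countdown(size_t n, size_t *seen, size_t cap)
{
    size_t i = n, k = 0;
    while (i-- != 0)             /* body sees n-1 down to 0 */
        if (k < cap)
            seen[k++] = i;
    return k;
}
```

For n == 3 the first form records 1, 2, 3 (it skips bar[0] and reaches
bar[3]), while the count-down records 2, 1, 0.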

>> To any programmer who is not
>> familiar with idiomatic C, and is used to writing a language that does
>> not have the --i idiom[0] is not clear. To someone familiar with
>> idiom, yes of course it is, but to anyone else unclear.
>
> Yet you advocate for it. Why? Are you always this inconsistent with
> yourself?

"A foolish hobgoblin minds small consistencies." - Wolf Aldo Remerson.

Seebs

Feb 23, 2010, 12:11:22 PM
On 2010-02-23, gwowen <gwo...@gmail.com> wrote:
> Or, alternatively, just drop the assumption that "code that doesn't
> look like mine is neophyte code". How about judging the quality of
> the coder by the robustness and correctness of the code, rather than
> whether they use certain syntactic idioms.

There isn't enough time in the world to give every piece of code the level
of review you'd give to something you knew was written by, say, Nilges,
or Bill Cunningham.

In practice, heuristics are an EXTREMELY effective way to allocate scarce
resources. The heuristic that certain kinds of quirky writing are a red
flag that the rest of the code will likely contain weirdness, errors, or
things that need careful re-reading to comprehend them, turns out to be
stunningly effective.

> True, this will require more thought, but occasionally using thought
> rather than dogma is beneficial.

It's nearly always beneficial. However, heuristics aren't dogma; they're just
a first pass to quickly spot cases where it's likely to be necessary to spend
extra time studying some code.

Richard Bos

Feb 23, 2010, 4:20:36 PM
Malcolm McLean <malcolm...@btinternet.com> wrote:

> On Feb 22, 5:23 pm, santosh <santosh....@gmail.com> wrote:
> >
> > Right, but size_t is maximally portable (whatever that means) while
> > unsigned long is not.
> >
> size_t is the only type that is guaranteed to be able to index any
> array. So if the number of elements is arbitrary, it's the only
> correct type to use.
> The problem is that very few people actually do so. So we've got a
> very undesirable situation.

The problem, however, is not with size_t. It is with people who do not
want to use size_t.

Or do you also blame the large number of drunk drivers on the law
against drunk driving?

Richard

Richard Bos

Feb 23, 2010, 4:20:29 PM
Malcolm McLean <malcolm...@btinternet.com> wrote:

> On Feb 21, 3:37 pm, Francis Moreau <francis.m...@gmail.com> wrote:
> >
> > size_t doesn't seem to be the good type to use when the variable of
> > that type describes the number of elements of a buffer whose type is
> > not 'char' and if the buffer size is less than 65535 bytes.
> >
> size_t is an int designed by committee.

Nonsense...

> The idea was that you would have a special type to hold amounts of
> memory. Since, usually, the address space of a processor is the same
> as the pointer width which is the same as an integer data register,

...because this is nonsense. There have been _many_ situations in which
sizeof(void *) != sizeof (size_t) != sizeof (int).

Richard

Richard Bos

Feb 23, 2010, 4:20:32 PM
gwowen <gwo...@gmail.com> wrote:

> On Feb 22, 11:31 am, Richard <rgrd...@gmail.com> wrote:
>
> > The -- in the loop is a well known C usage and if its not clear to you
> > then your C is hazy to say the least. It is much more confusing and hard
> > to read to put the decrement on some line in the body.
>
> Your code is clear for people with good C skills.
> My code is clear for people without good C skills, and clear (but non-
> idiomatic) for those with good skills, and clear-but-hideous for C
> mavens. I'm OK with that.

People without good C skills should learn a bit more before trying to
hack on a program written in C. If you want COBOL, you know where to
find it.

> So, if you want your code understood as widely as possible, don't be a
> vicar of Bray; grasp the nettle, and do Yeoman's service and all
> things being equal, Bob's your uncle and you'll come up smelling of
> roses... Otherwise you'll do a Devon Loch, be hoist by your own
> petard, be gone for a right royal Burton, or otherwise come a
> cropper. I wouldn't touch idiomatic English with a bargepole. It's
> just not cricket.

And yet, I would prefer a novel written in English for literate readers
_not_ to eschew idioms. You should compare C to a Shaw play or a book by
Joyce. Do not write C as if you are Dr. Seuss - that's what BASIC is
for.

Richard

Keith Thompson

Feb 23, 2010, 4:37:58 PM
ral...@xs4all.nl (Richard Bos) writes:
[...]

> ...because this is nonsense. There have been _many_ situations in which
> sizeof(void *) != sizeof (size_t) != sizeof (int).

I don't think I've ever used a system where sizeof(void*) != sizeof(size_t).

That's not to say that such systems don't exist, of course, but on
most modern systems with a linear monolithic address space, it makes
sense for void* and size_t to be the same size.
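
Anyone curious about their own implementation can check directly. The
sketch below only reports what the local compiler happens to use; the
function names are made up, and the one portable guarantee it relies on
is the C99 minimum for SIZE_MAX cited earlier in the thread (7.18.3 p2).

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Print the storage widths this implementation uses; results vary by platform. */
void print_widths(void)
{
    printf("void *        : %zu bits\n", sizeof(void *) * CHAR_BIT);
    printf("size_t        : %zu bits\n", sizeof(size_t) * CHAR_BIT);
    printf("unsigned long : %zu bits\n", sizeof(unsigned long) * CHAR_BIT);
    printf("int           : %zu bits\n", sizeof(int) * CHAR_BIT);
    printf("SIZE_MAX      : %ju\n", (uintmax_t)SIZE_MAX);
}

/* The portable floor: SIZE_MAX must be at least 65535, and may be larger. */
int size_max_meets_minimum(void)
{
    return SIZE_MAX >= 65535u;
}
```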

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Seebs

Feb 23, 2010, 5:49:52 PM
On 2010-02-23, Richard Bos <ral...@xs4all.nl> wrote:
> And yet, I would prefer a novel written in English for literate readers
> _not_ to eschew idioms. You should compare C to a Shaw play or a book by
> Joyce. Do not write C as if you are Dr. Seuss - that's what BASIC is
> for.

I have to take some exception to this, because Dr. Seuss was actually an
extremely skilled writer of English, even though many of his books don't
make this obvious to casual observation.

... But the point is still valid. Idiomatic writing is used because it is
clearer and more communicative, and yes, that does impose the cost of learning
the idioms on the reader. It's still worth it.

Peter Nilsson

Feb 23, 2010, 6:09:15 PM
Keith Thompson <ks...@mib.org> wrote:

> ralt...@xs4all.nl (Richard Bos) writes:
> > ...because this is nonsense. There have been _many_
> > situations in which sizeof(void *) != sizeof (size_t)
> > != sizeof (int).
>
> I don't think I've ever used a system where sizeof(void*)
> != sizeof(size_t).

I think it's more common for size_t to match unsigned long,
rather than being dependent on the size of void *. N1256 has
a 'recommended practice'...

"The types used for size_t and ptrdiff_t should not
have an integer conversion rank greater than that of
signed long int unless the implementation supports
objects large enough to make this necessary."

There must be 64-bit systems where it's possible, but
not _necessary_, to make unsigned long and size_t larger
than 32-bit.

> That's not to say that such systems don't exist, of course,

Early 68k based Macs were capable of addressing 16M, but most
applications still had to fit into 32K, or 32K chunks stored
in the resource fork of the application. Many applications
were actually limited to using less than 32K in total. So it
wouldn't surprise me if there were some early Mac C
implementations where pointers were 32-bit, but size_t and
int were only 16-bit due to the relative cost of 32-bit
operations and storage. [The 68k processor had separate
data and address registers. Even though they were all 32-bit,
16-bit operations on data registers were quicker than 32-bit
ones.]

> but on most modern systems with a linear monolithic address
> space, it makes sense for void* and size_t to be the same
> size.

Why? Serious question! It's a very common assumption, but
not one that's guaranteed by the standard. size_t is only
required to be able to store the size of one object (more
precisely the result of sizeof.) It isn't required to be
large enough to store the combined size of all objects.

Recall the calloc kerfuffle and the possibility of creating
objects too big to fit in a size_t! Whilst I think that was
ruled out, the question remains as to whether C allows a
program to allocate more combined space than will fit in
a size_t.
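
The calloc concern is that count * size can wrap around in size_t before
the allocator ever sees it. A minimal sketch of the usual guard follows;
checked_alloc_array is a hypothetical wrapper, not a standard function,
and it says nothing about the separate question of total allocated space.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate count elements of elem_size bytes each, refusing any request
 * whose total byte count cannot be represented in size_t. */
void *checked_alloc_array(size_t count, size_t elem_size)
{
    /* count * elem_size would wrap exactly when count > SIZE_MAX / elem_size. */
    if (elem_size != 0 && count > SIZE_MAX / elem_size)
        return NULL;
    return malloc(count * elem_size);
}
```

The division-based test avoids performing the multiplication until it is
known to be safe.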

--
Peter

Keith Thompson

Feb 23, 2010, 6:31:19 PM
Peter Nilsson <ai...@acay.com.au> writes:
> Keith Thompson <ks...@mib.org> wrote:
>> ralt...@xs4all.nl (Richard Bos) writes:
>> > ...because this is nonsense. There have been _many_
>> > situations in which sizeof(void *) != sizeof (size_t)
>> > != sizeof (int).
>>
>> I don't think I've ever used a system where sizeof(void*)
>> != sizeof(size_t).
>
> I think it's more common for size_t to match unsigned long,
> rather than being dependent on the size of void *.

For what it's worth (which isn't a whole lot), every system I've
checked has size_t, unsigned long, and void* all the same size.

> N1256 has
> a 'recommended practice'...
>
> "The types used for size_t and ptrdiff_t should not
> have an integer conversion rank greater than that of
> signed long int unless the implementation supports
> objects large enough to make this necessary."

I've never seen a system that violates this recommendation.

> There must be 64-bit systems where it's possible, but
> not _necessary_, to make unsigned long and size_t larger
> than 32-bit.

Sure, but all the 64-bit systems I've seen have 64-bit unsigned long.
(I vaguely recall that 64-bit Windows has 32-bit unsigned long; I
don't know what it uses for size_t.)

>> That's not to say that such systems don't exist, of course,
>
> Early 68k based Macs were capable of addressing 16M, but most
> applications still had to fit into 32K, or 32K chunks stored
> in the resource fork of the application. Many applications
> were actually limited to using less than 32K in total. So it
> wouldn't surprise me if there were some early mac C
> implementations where pointers were 32-bit, but size_t and
> int were only 16-bit due to the relative cost of 32-bit
> operations and storage. [The 68k processor had separate
> data and address registers. Even though they were all 32-bit,
> 16-bit operations on data registers were quicker than 32-bit
> ones.]

Sure, if the maximum size of a single object is smaller than the
total addressing space (e.g., because an object must fit into a
single memory segment), it makes sense for size_t to be smaller
than void*.

>> but on most modern systems with a linear monolithic address
>> space, it makes sense for void* and size_t to be the same
>> size.
>
> Why? Serious question! It's a very common assumption, but
> not one that's guaranteed by the standard. size_t is only
> required to be able to store the size of one object (more
> precisely the result of sizeof.) It isn't required to be
> large enough to store the combined size of all objects.

Absolutely. But most modern systems (at least the ones I've been
exposed to) have a monolithic linear address space, where the size of
a single object, at least in principle, has the same upper bound as
the size of all of memory. Smaller limits might be imposed by the
operating system, but those limits aren't typically enforced by the
compiler by making size_t smaller than void*.

> Recall the calloc kerfuffle and the possibility of creating
> objects too big to fit in a size_t! Whilst I think that was
> ruled out, the question remains as to whether C allows a
> program to allocate more combined space than will fit in
> a size_t.

I suspect that's not quite what you meant to say.

There's some question whether a single object can be bigger than
SIZE_MAX bytes, but I'm quite sure that the language doesn't require
the total size of all objects to be no bigger than SIZE_MAX.

For example, on a system with 64-bit void* and 32-bit size_t (and
sufficient resources), a single object arguably couldn't be bigger
than 4 gigabytes, but you could have 1000 distinct 1-gigabyte objects.

Richard Heathfield

unread,
Feb 23, 2010, 8:04:23 PM2/23/10
to
Keith Thompson wrote:
<snip>

>
> For what it's worth (which isn't a whole lot), every system I've
> checked has size_t, unsigned long, and void* all the same size.

So you never used C under MS-DOS? Lucky fellow!

<snip>

Keith Thompson

unread,
Feb 23, 2010, 8:23:21 PM2/23/10
to
Richard Heathfield <r...@see.sig.invalid> writes:
> Keith Thompson wrote:
> <snip>
>>
>> For what it's worth (which isn't a whole lot), every system I've
>> checked has size_t, unsigned long, and void* all the same size.
>
> So you never used C under MS-DOS?

Nope.

> Lucky fellow!

Yup.

Nick Keighley

unread,
Feb 24, 2010, 4:24:47 AM2/24/10
to
On 23 Feb, 10:00, Richard Heathfield <r...@see.sig.invalid> wrote:
> Nick Keighley wrote:
> > On 22 Feb, 22:29, "Bill Cunningham" <nos...@nspam.invalid> wrote:
> >> "Richard" <rgrd...@gmail.com> wrote in message
> >>news:bnga57-...@news.eternal-september.org...

> >>> That doesnt wash with me.
> >>> Putting the decrement in the body makes it less clear.
> >>> If a post decrement is too clever for the reader then so is using C.
>
> >>     Use a debugger.
>
> > no, don't
> > a debugger is not the right tool to learn what C constructs do
>
> I'm *reasonably* sure he was joking - yanking Richard NoName
> MyHammerIsADebuggerAndEveryProblemIsANail Riley's chain a little.


I didn't think Bill was that witty.

I must confess I was expecting a response from /a/ Richard, just not
from you!

Nick Keighley

unread,
Feb 24, 2010, 4:33:01 AM2/24/10
to
On 23 Feb, 21:20, ralt...@xs4all.nl (Richard Bos) wrote:

> And yet, I would prefer a novel written in English for literate readers
> _not_ to eschew idioms. You should compare C to a Shaw play or a book by
> Joyce.

‘Sir Tristram, violer d’amores, fr’over the short sea, has passencore
rearrived from North Armorica on this side the scraggy isthmus of
Europe Minor to wielderfight his penisolate war; nor had topsawyer’s
rocks by the stream Oconee exaggerated themselse to Laurens County’s
giorgios while they went doubling their mumper all the time’


> Do not write C as if you are Dr. Seuss - that's what BASIC is for.

I would not, could not, in a box.
I could not, would not, with a fox.
I will not eat them with a mouse.
I will not eat them in a house.
I will not eat them here or there.
I will not eat them anywhere.
I do not eat green eggs and ham.
I do not like them, Sam-I-am.

no contest really...

Nick Keighley

unread,
Feb 24, 2010, 4:46:27 AM2/24/10
to

you are comparing oranges with orchards.

count = count + 1;

is not obscure code.

io_x style

#define B {
#define P printf

is obscure code

Richard

unread,
Feb 24, 2010, 4:46:28 AM2/24/10
to
Richard Heathfield <r...@see.sig.invalid> writes:

> Nick Keighley wrote:
>> On 22 Feb, 22:29, "Bill Cunningham" <nos...@nspam.invalid> wrote:
>>> "Richard" <rgrd...@gmail.com> wrote in message
>>> news:bnga57-...@news.eternal-september.org...
>>>
>>>> That doesnt wash with me.
>>>> Putting the decrement in the body makes it less clear.
>>>> If a post decrement is too clever for the reader then so is using C.
>>> Use a debugger.
>>
>> no, don't
>> a debugger is not the right tool to learn what C constructs do
>
> I'm *reasonably* sure he was joking - yanking Richard NoName
> MyHammerIsADebuggerAndEveryProblemIsANail Riley's chain a little.

My surname begins with L. But you SHOULD know that from the email I sent
you.

As it happens, a debugger is PERFECT for seeing how it works. If you don't
realise this then you're a bigger idiot than you appear.

Run the debugger and step the loop keeping an eye on the locals. Set a
breakpoint maybe for the terminal conditions such as i==1 or 0 etc.

If you do not see that as being useful for someone learning C then thank
god you're not in a position to teach anything. While we all realise
you have a very high opinion of your self worth, please don't spread your
own ignorance of how to use such useful tools.


--
"Avoid hyperbole at all costs, its the most destructive argument on
the planet" - Mark McIntyre in comp.lang.c

Ed Vogel

unread,
Feb 24, 2010, 8:23:17 AM2/24/10
to

"Keith Thompson" <ks...@mib.org> wrote in message
news:lnhbp74...@nuthaus.mib.org...

> ral...@xs4all.nl (Richard Bos) writes:
> [...]
>> ...because this is nonsense. There have been _many_ situations in which
>> sizeof(void *) != sizeof (size_t) != sizeof (int).
>
> I don't think I've ever used a system where sizeof(void*) !=
> sizeof(size_t).
>
> That's not to say that such systems don't exist, of course, but on
> most modern systems with a linear monolithic address space, it makes
> sense for void* and size_t to be the same size.
>
I worked on a C compiler for OpenVMS. On that system size_t is
always 32 bits. By default, pointers were 32 bits, but one could
compile (or use a #pragma) to make the size of pointers 64 bits.
In that mode sizeof(void *) != sizeof(size_t).

Ed Vogel


Richard Heathfield

unread,
Feb 24, 2010, 9:06:46 AM2/24/10
to

Neither did I. Hence the **s around "reasonably".

Kelsey Bjarnason

unread,
Feb 24, 2010, 9:16:56 AM2/24/10
to
[snips]

On Wed, 24 Feb 2010 01:46:27 -0800, Nick Keighley wrote:

> count = count + 1;
>
> is not obscure code.

When the construct used in virtually every piece of C code one runs
across reads "count++", where "count++" is such a common idiom that
avoiding its use suggests there is some reason (either neophyte status,
or something less obvious) for doing so, then yes, lacking comments
explaining precisely _why_ such a screwball construct is being used, the
result _is_ obscure code.

Without additional explanation (eg "Imported from MatLab, which uses this
sort of construct") there is no readily apparent reason for using such a
construct. If we assume the coder is not a neophyte, it then follows he
is using this screwball notation for a specific purpose, which implies
there is some behaviour involved which shows up in "count = count + 1"
but _does not_ show up in "count++".

Which means now we have to scratch our heads, go running for the standard
(and the compiler documentation), check "count" to see if there's some
special magic associated with it, and try to figure out _what_ the
different behaviour is that's being relied upon.

When the search fails (assuming it does, i.e. we find no special magic)
we're left not with confidence the construct works as we'd expect, but
rather the uneasy feeling it is relying on some bizarre behaviour, quite
possibly of an implementation-specific optimizer, or some equivalent,
which we'll never be able to fully understand, let alone rely upon. The
code, as a result, simply cannot be trusted.

There are languages in which "count = count + 1" is a common idiom. To
people used to those languages, such a construct may be clear and
concise. C is not one of those languages.

Indeed, the very fact this has engendered a discussion as involved as
this should be sufficient to show that such constructs are _not_ trusted
by C coders, but _are_ treated as flags suggesting extreme review is
warranted.

Malcolm McLean

unread,
Feb 24, 2010, 9:22:51 AM2/24/10
to
On Feb 24, 3:23 pm, "Ed Vogel" <edward.vogel@hp_stopping_spam.com>
wrote:

>
>     I worked on a C compiler for OpenVMS.   On that system size_t is
>     always 32-bits.  By default, pointers were 32-bits, but one could
>     compile (or use a #pragma) to make the size of pointers 64-bits.
>     In that mode sizeof(void *) != sizeof(size_t).
>
People aren't going to be using 64 bits for long before they start
asking for objects greater than 4GB.
Your compiler is an intermediate step along the way.

Ersek, Laszlo

unread,
Feb 24, 2010, 11:00:30 AM2/24/10
to
In article <hm397h$7hb$1...@usenet01.boi.hp.com>, "Ed Vogel" <edward.vogel@hp_stopping_spam.com> writes:

> I worked on a C compiler for OpenVMS. On that system size_t is
> always 32-bits. By default, pointers were 32-bits, but one could compile
> (or use a #pragma) to make the size of pointers 64-bits. In that mode
> sizeof(void *) != sizeof(size_t).


Aah, great!

ludens$ cc /version

HP C V7.1-015 on OpenVMS Alpha V8.3


ludens$ help cc /pointer_size

CC

/POINTER_SIZE

/POINTER_SIZE=option
/NOPOINTER_SIZE (D)

Controls whether or not pointer-size features are enabled, and
whether pointers are 32 bits or 64 bits long.

The default is /NOPOINTER_SIZE, which disables pointer-size
features, such as the ability to use #pragma pointer_size, and
directs the compiler to assume that all pointers are 32-bit
pointers. This default represents no change over previous versions
of the compiler.

You must specify one of the following options:

SHORT The compiler assumes 32-bit pointers.

32 Same as SHORT.

LONG The compiler assumes 64-bit pointers.

64 Same as LONG.

Specifying /POINTER_SIZE=32 directs the compiler to assume that all
pointers are 32-bit pointers. But unlike the default of
/NOPOINTER_SIZE, /POINTER_SIZE=32 enables use of the #pragma
pointer_size long and #pragma pointer_size short preprocessor
directives to control pointer size throughout your program.

Specifying /POINTER_SIZE=64 directs the compiler to assume that all
pointers are 64-bit pointers, and also enables use of the #pragma
pointer_size directives.


ludens$ type siz.c

#include <stdio.h>
#include <stdlib.h>

int
main(void)
{
  /* sorry for the stupid indentation */
  return 0 <= fprintf(stdout, "%u %u\n", (unsigned)sizeof(void *),
      (unsigned)sizeof(size_t)) && 0 == fflush(stdout) ? EXIT_SUCCESS
      : EXIT_FAILURE;
}


ludens$ cc /standard=ansi89 /pointer_size=32 siz.c
ludens$ link siz.obj
ludens$ run siz
4 4


ludens$ cc /standard=ansi89 /pointer_size=64 siz.c
ludens$ link siz.obj
ludens$ run siz
8 4


The results are the same with /standard=c99.

Cheers,
lacos

Ersek, Laszlo

unread,
Feb 24, 2010, 11:07:52 AM2/24/10
to
In article <Xt8Bokvz0a6x@ludens>, la...@ludens.elte.hu (Ersek, Laszlo) writes:
> In article <hm397h$7hb$1...@usenet01.boi.hp.com>,
> "Ed Vogel" <edward.vogel@hp_stopping_spam.com> writes:
^^ ^^^^

>> I worked on a C compiler for OpenVMS.

> HP C V7.1-015 on OpenVMS Alpha V8.3

So you worked on *the* C compiler for OpenVMS, then; I notice.

Thanks,
lacos
