A newbie's code

main()

Aug 23, 2006, 6:52:36 AM

Hi all,

I'm learning this great language called C.
I thought I would try my hand at writing some small code.
I wrote a very small function that takes two char pointers, a and b.
It is meant to remove from a all letters that occur in b.

#include <stdio.h>

void fun(char *a,const char *b)
{
    int i=0,apos=0,j;
    if(!a || !b) /* basic check*/
        return;
    for(;a[i];i++)
    {
        /* loop until end of b or until a match is found */
        for(j=0; b[j] && a[i] != b[j] ;j++);
        if(!b[j])
            a[apos++] = a[i];
    }
    a[apos] = '\0';
}

int main(void)
{
    char a[] = "hai how are you ?";
    char b[] = "aeiou";
    fun(a,b);
    printf("%s\n",a);
    return 0;
}

I have compiled and run it, and it seems OK.
But I need your comments on my code.
I'm a newbie. I welcome any comment (on style, logic, any other neat
way to write the same function, proper usage, etc.).

Thanks for your time,
Yugi.

pete

Aug 23, 2006, 7:39:25 AM

It's not bad.

/* BEGIN new.c */

#include <stdio.h>

#define STRING "\n\n\n\tThere's\n a\r beat in \r\tmy head.\n\n\n"
#define WHITE "\n\r\t"

void fun(char *a,const char *b)
{
    size_t i = 0, apos = 0, j;
/*
** size_t is guaranteed to be big enough
** to represent the number of bytes in any object.
*/
    if (a == NULL || b == NULL) {
        return;
    }
/*
** Explicit comparisons against NULL for pointers
** are easier to read.
*/
    for (; a[i]; i++) {
        for (j=0; b[j] && a[i] != b[j]; j++) {
            ;
        }
/*
** Always using the optional {braces}
** and putting the empty statement on its own line
** makes the above empty loop more obvious.
*/
        if (b[j] == '\0') {
            a[apos++] = a[i];
        }
/*
** The same convention of using '\0' below
** in assignment
** should also be used in the above if()
** in comparison,
** to compare a byte in a string with the null character.
*/
    }
    a[apos] = '\0';
}

int main(void)
{
char a[] = "hai how are you ?";
char b[] = "aeiou";

char c[] = STRING;

puts(a);
fun(a, b);
puts(a);
puts(c);
fun(c, WHITE);
puts(c);
return 0;
}

/* END new.c */

(fun) is K&R2 Exercise 2-4, Alternate squeeze functions.
I have three overly complicated versions of the same function.
http://groups.google.com/group/comp.lang.c/msg/d9c5bff63bc679f1

--
pete

Richard Bos

Aug 23, 2006, 7:40:47 AM

"main()" <dnu...@gmail.com> wrote:

> #include <stdio.h>
>
> void fun(char *a,const char *b)

Wise use of const - you don't see that too often in newbie code.

> {
> int i=0,apos=0,j;
> if(!a || !b) /* basic check*/
> return;
> for(;a[i];i++)

I'd initialise both i and apos here, for clarity:

for(i=apos=0; a[i]; i++)

> {
> for(j=0; b[j] && a[i] != b[j] ;j++); /*loop until end of b or until
> a match is found*/
> if(!b[j])
> a[apos++] = a[i];
> }
> a[apos] = '\0';
> }
>
> int main(void)
> {
> char a[] = "hai how are you ?";

And not making the all too common error of passing a pointer to a string
literal to this function - another bonus point.

> char b[] = "aeiou";
> fun(a,b);
> printf("%s\n",a);
> return 0;
> }

> I'm a newbie.I welcome any comment (like style, logic, any other neat

> way to write the same function or proper usage etc).

I'd use spaces a little differently, but everybody has a different style
where spacing is concerned. Just be consistent, and don't skip them
altogether. I'd also use more meaningful function and parameter names.

You could write the function completely differently using strcspn() or
strpbrk(). I'm not sure whether that would buy you anything, though.
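
For illustration, here is a minimal sketch of the strcspn() route. The name squeeze_strcspn is hypothetical and does not come from the thread. It assumes the same contract as fun(): remove from s every character that occurs in reject, and do nothing if either pointer is null. It alternates between strcspn() to find a run of characters to keep and strspn() to skip a run of characters to drop.

```c
#include <string.h>

/* Hypothetical helper: removes from s every character found in reject. */
void squeeze_strcspn(char *s, const char *reject)
{
    char *dst = s;

    if (s == NULL || reject == NULL)
        return;
    while (*s != '\0') {
        size_t keep = strcspn(s, reject);   /* run with no rejected chars */
        memmove(dst, s, keep);              /* regions may overlap */
        dst += keep;
        s += keep;
        s += strspn(s, reject);             /* skip the run of rejected chars */
    }
    *dst = '\0';
}
```

Whether this is any clearer or faster than the hand-written loop is an open question; it mostly trades an explicit inner loop for two library calls.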

Apart from that - no remarks.

Richard

spi...@gmail.com

Aug 23, 2006, 8:14:11 AM

main() wrote:

It's fine; short and elegant. The only thing
that bothers me a bit is that the value of b[j]
is checked both inside the loop

for(j=0; b[j] && a[i] != b[j] ;j++);

and right after you exit the loop. There's a way
to avoid that at the cost of readability.

void fun_version2(char *a,const char *b)
{
    int i=0,apos=0,j;
    if(!a || !b) /* basic check*/
        return;
    for(;a[i];i++)
    {
        j=0 ;
    loop:
        if (b[j] == 0) {
            a[apos++] = a[i] ;  /* reached the end of b: keep a[i] */
            continue ;
        }
        if (a[i] == b[j])       /* match found: drop a[i] */
            continue ;
        j++ ;
        goto loop ;
    }
    a[apos] = '\0';
}

I guess it's possible that a clever enough compiler
wouldn't actually check the value of b[j] twice.

It would be nice if C had a "continue n" statement
where n==1 would be like regular continue ,
"continue 2" would mean continue the loop in which
this one is contained etc. But since it doesn't,
one has to rely on goto or on checking the same
condition more than once.
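
One way to sidestep both the goto and the repeated test is to hand the inner search to a library function. The following is only a sketch, assuming strchr() from <string.h> is acceptable here; the name squeeze_strchr is hypothetical.

```c
#include <string.h>

/* Hypothetical variant: the inner loop is replaced by strchr(). */
void squeeze_strchr(char *a, const char *b)
{
    size_t i, apos = 0;

    if (a == NULL || b == NULL)
        return;
    for (i = 0; a[i] != '\0'; i++) {
        /* a[i] is never '\0' here, so strchr cannot match b's terminator */
        if (strchr(b, a[i]) == NULL)
            a[apos++] = a[i];
    }
    a[apos] = '\0';
}
```

The end-of-b test now happens once, inside strchr(), and the caller only looks at its result.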

Spiros Bousbouras

Flash Gordon

Aug 23, 2006, 10:37:39 AM

pete wrote:
> main() wrote:
>> Hi all,
>>
>> I'm learning this great language called C.
>> I thought i would try my hand at writing small some code.
>> I wrote a very small function that takes two char pointers a and b.
>> It is to remove all letters from a that occurs in b.

You've described the problem and provided your code for implementing it.
This is good.

>> #include <stdio.h>
>>
>> void fun(char *a,const char *b)

You should choose a better name than fun. As your projects grow picking
sensible names makes it far easier for you, let alone anyone else, to
read your code.

<snip>

Others have commented on the rest of your code.

>> I have done and complied, it seems ok.
>> But i need your comment on my code.
>> I'm a newbie.
>> I welcome any comment (like style, logic, any other neat
>> way to write the same function or proper usage etc).

My other main comment is you are to be congratulated on coming here for
the right purpose and in the right way. It was a far better first post
than many we see.

> It's not bad.

<snip>

> (fun) is K&R2 Exercise 2-4, Alternate squeeze functions.
> I have three overly complicated versions of the same function.
> http://groups.google.com/group/comp.lang.c/msg/d9c5bff63bc679f1

The OP may also find it interesting to look at Richard Heathfield's
implementation which is available at
http://clc-wiki.net/wiki/K%26R2_solutions:Chapter_2:Exercise_4

I would recommend only looking at solutions to other exercises on this
site after you have attempted the exercise yourself.
--
Flash Gordon

Kenneth Brody

Aug 23, 2006, 11:08:30 AM

"main()" wrote:
>
> Hi all,
>
> I'm learning this great language called C.

[...]

Welcome.

The first thing I noticed is that your pseudonym should probably be
"main(void)" rather than "main()".

:-) :-) :-) :-) :-) :-) :-) :-) :-) :-) :-) :-) :-)

Ancient_Hacker

Aug 23, 2006, 11:31:52 AM

Just a *few* suggestions:

> int i=0,apos=0,j;

You're executing TWO assignment statements unnecessarily in the case
where one or both params are NULL. And you needn't set "j" at all
here, as it's always set in the for initializer part (whatever you call
the first for() thingamaggummy).

> if(!a || !b) /* basic check*/

You are to be complimented on checking your parameters; I wish Microsoft
did so with any regularity in their examples.

But would it kill you to write clearer code, a la:

if( a == NULL || b == NULL ) ....

Yes, I know !a is a well established C idiom. Even highly regarded
snippets of code are full of that kind of thing. Pls ponder the
relative clarity of each, especially when you get a call at 2:21AM to
fix your program.

> return;

"return" is a mighty handy thing, but it's in a sense an amorphous goto.
Consider having just one exit point, at the bottom of the function. It
makes adding debug lines, assertions, and return value settings sooo
much clearer.

> for(;a[i];i++)

This is fine; a real purist would suggest initializing "i" here,
just in case someone ever adds code between the initialization and this
point and one loses track of what's in "i".

Also a bubbly-headed coder might suggest "a[i] != EndOfString" or even
"a[i] IsNot EndOfString". Much clearer when the F133 payroll program
for the whole Skunk Works Division of Lockheed isn't running (a friend
of mine was called at 4 AM to fix that).
Yes, it takes a smidgen more typing, but disk space is not exactly
expensive these days.

Otherwise okay.

Frederick Gotham

Aug 23, 2006, 12:11:27 PM

main() posted:

> #include <stdio.h>
>
> void fun(char *a,const char *b)
> {
> int i=0,apos=0,j;
> if(!a || !b) /* basic check*/
> return;

You should decide whether you want the programmer to be able to supply this
function with null pointers. If you forbid this, then use assertions:

assert(a);
assert(b);

I'll give you an example of how I would go about this:

#include <assert.h>

void RemoveCertainChars(char *alter, char const *const arg_certain)
{
char const *certain;

assert(alter);
assert(arg_certain);
assert(*alter);
assert(*arg_certain);

do
{
certain = arg_certain;

do if(*certain++ == *alter) *alter = 'X';
while(*certain);

}while(*++alter);
}

--

Frederick Gotham

Aug 23, 2006, 12:13:58 PM

Flash Gordon posted:

> The OP may also find it interesting to look at Richard Heathfield's
> implementation which is available at
> http://clc-wiki.net/wiki/K%26R2_solutions:Chapter_2:Exercise_4

I would frown upon the use of array subscripting rather than pointers.

--

Frederick Gotham

Aug 23, 2006, 12:17:11 PM

Ancient_Hacker posted:

>> if(!a || !b) /* basic check*/
>
> You are to be complimented n checking your parameters, wish MicroSoft
> did so with any regularity in their examples.
>
> But would it kill you to write clearer code, ala:
>
> if( a == NULL || b == NULL ) ....
>
> Yes, I know !a is a well established C idiom.

Idiom... ?! Something has to be at least mildly exotic to achieve the status
of "idiom".

Using the logical negation operator on a pointer is perfectly clear, and a
fundamental feature of the language.

--

Frederick Gotham

Aug 23, 2006, 12:47:37 PM

Frederick Gotham posted:

> I'll give you an example of how I would go about this:

Or, actually removing the characters rather than replacing them with X:

void ShiftStringBackwardsOnce(char *behind)
{
char const *ahead = behind + 1;

assert(behind); assert(*behind);

while(*behind++ = *ahead++);
}

void RemoveCertainChars(char *alter, char const *const arg_certain)
{
char const *certain;

assert(alter); assert(*alter);
assert(arg_certain); assert(*arg_certain);

do
{
certain = arg_certain;

do if(*certain++ == *alter) ShiftStringBackwardsOnce(alter--);

Flash Gordon

Aug 23, 2006, 1:36:28 PM

You might, but I and many others would not. Array subscripting is
perfectly clear in my opinion.
--
Flash Gordon

Keith Thompson

Aug 23, 2006, 2:47:08 PM

Why?

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Keith Thompson

Aug 23, 2006, 2:59:36 PM

"Ancient_Hacker" <gr...@comcast.net> writes:
[...]

>> for(;a[i];i++)

The above was posted by "main()" <dnu...@gmail.com>. Please don't
snip attributions.

> This is fine, a really puristt would suggest initializing "i" here,
> just in case someone ever adds code between the initialization and this
> point and one loses track of wat's in "i".

Yes, I'd probably drop the initialization in the declaration of i, and
change the for loop to:

for (i = 0; a[i]; i++)

If I could depend on having a C99 compiler, I'd write:

for (int i = 0; a[i]; i++)

> Also a bubbly-headed coder might suggest "a[i] != EndOfString" or even
> "a[i] IsNot EndOfString". Much clearer when the F133 payroll program
> for the whole Skunk Works Division of Lockheed isnt running (a friend
> of mine was called at 4 AM to fix that).
> Yes, it takes a smidgen more typing, but disk space is not exactly
> expensive these days.

Defining the identifiers (macros?) "EndOfString" and "IsNot" would do
nothing but obfuscate the code. I would actually write:

for (i = 0; a[i] != '\0'; i ++)

Anyone capable of understanding C code will know what "!=" and '\0'
mean. Anyone who doesn't know that won't be helped by pseudo-English
aliases. See also question 10.2 in the comp.lang.c FAQ.

Keith Thompson

Aug 23, 2006, 3:01:04 PM

Yes, it's a fundamental feature of the language, but in my opinion an
explicit comparison against NULL is clearer. For one thing, it makes
it obvious that the variable being tested is a pointer and not some
other kind of scalar. (And I'm well aware that plenty of very smart
people disagree.)

Keith Thompson

Aug 23, 2006, 3:07:50 PM

Frederick Gotham <fgot...@SPAM.com> writes:
> main() posted:
>
>> #include <stdio.h>
>>
>> void fun(char *a,const char *b)
>> {
>> int i=0,apos=0,j;
>> if(!a || !b) /* basic check*/
>> return;
>
>
> You should decide whether you want the programmer to be able to supply this
> function with null pointers. If you forbid this, then use assertions:
>
> assert(a);
> assert(b);

This will work in C99, but C90 doesn't guarantee that the argument to
assert() can be anything other than an int (though I think it would be
ok in most implementations). I'd write:

assert(a != NULL);
assert(b != NULL);

IMHO, this is clearer both in the source code and in the error message
that's produced if the assertion fails.

Frederick Gotham

Aug 23, 2006, 3:23:34 PM

Keith Thompson posted:

> Frederick Gotham <fgot...@SPAM.com> writes:
>> Flash Gordon posted:
>>> The OP may also find it interesting to look at Richard Heathfield's
>>> implementation which is available at
>>> http://clc-wiki.net/wiki/K%26R2_solutions:Chapter_2:Exercise_4
>>
>> I would frown upon the use of array subscripting rather than pointers.
>
> Why?

Efficiency and clarity. And efficiency again.

--

Frederick Gotham

Aug 23, 2006, 3:24:22 PM

Keith Thompson posted:

> Yes, it's a fundamental feature of the language, but in my opinion an
> explicit comparison against NULL is clearer. For one thing, it makes
> it obvious that the variable being tested is a pointer and not some
> other kind of scalar. (And I'm well aware that plenty of very smart
> people disagree.)

If my own opinion is worth anything, I've never used NULL once in my code.

--

Frederick Gotham

Walter Roberson

Aug 23, 2006, 3:41:37 PM

In article <WW1Hg.13008$j7.3...@news.indigo.ie>,

>> Frederick Gotham <fgot...@SPAM.com> writes:

>> Why?

Many compilers these days optimize array subscripting very well,
often better than they can optimize most pointers. Pointers are
usually much harder to analyze to prove that aliasing cannot be
happening.

I have also seen implementations in which larger arrays (including
those in automatic variables) were automagically aligned to optimize
cache considerations. I have never -seen- an implementation in which
malloc() did that, especially as malloc() cannot be handed
access pattern information. Compilers can take access patterns
into account when positioning subscripted arrays.
--
If you lie to the compiler, it will get its revenge. -- Henry Spencer

Keith Thompson

Aug 23, 2006, 3:44:54 PM

Why do you think code using pointers is more efficient than code using
array indexing? It probably would be given a sufficiently naive
compiler (or even a modern compiler invoked without optimization), but
any modern optimizer should be able to generate equally good code
either way.

Clarity, of course, is in the eye of the beholder.

Ian Collins

Aug 23, 2006, 4:13:18 PM

Can you provide an example?

--
Ian Collins.

Bill Pursell

Aug 23, 2006, 4:37:23 PM

How do you terminate your lists?

--
Bill Pursell

ena8...@yahoo.com

Aug 23, 2006, 11:13:47 PM

Using arrays rather than pointers is often clearer
and also often produces faster code. If you measure
you may find yourself giving up some of your ideas
about which choice is better.

ena8...@yahoo.com

Aug 23, 2006, 11:22:20 PM

So what you're saying is you've used NULL a lot? :)

Herbert Rosenau

Aug 24, 2006, 1:49:29 AM

On Wed, 23 Aug 2006 15:31:52 UTC, "Ancient_Hacker" <gr...@comcast.net>
wrote:

> Just a *few* suggestions:
>
> > int i=0,apos=0,j;
>
> You're executing TWO assignment statements unecessarily in the case
> where one or both params are NULL. And you neednt set "j" at all
> here, as it's always set in the for initializer part (whatever you call
> the first for() thingamaggummy).
>
> > if(!a || !b) /* basic check*/
>
> You are to be complimented n checking your parameters, wish MicroSoft
> did so with any regularity in their examples.

Microsoft does so many things absolutely wrong that using M$ as an
example makes no sense, except as a bad one.

> But would it kill you to write clearer code, ala:
>
> if( a == NULL || b == NULL ) ....

Don't bother about style. It makes no sense here, as one can find more
sources using the short form than the long form you use.

> Yes, I know !a is a well established C idiom. Even highly regarded
> snippets of code are full of that kind of thing. Pls ponder the
> relative clarity of each, especially when you get a call at 2:21AM to
> fix your program.

Boa, ey! Not everybody is a beginner like you who can't read and
understand the most commonly used style. As said above: don't bother about
style.

>
> > return;
>
> "return" is a mighty hand thing, but it's in a sense a amorphous goto.
> Consider having just one exit point, at the bottom of the function. It
> makes adding debug lines, assertions, and return value settings sooo
> much clearer.

Anyone who has learned C in practice will follow the styles beginners
get told.

In practice there are 2 different forms of functions:

1. Return quickly on error, so the rest of the function can forget
about it, while telling the caller that something is faulty.
Multiple returns, each flagging an error. Not having returned yet
means that everything done so far is o.k.
The last return signals complete success of everything the function
has to do.

2. Return as soon as the work is done.
The single return at the end is reached only when an error shows that
the whole work failed.
It makes only formalists happy to have 50 cascaded ifs, or to check
flags again and again and again, only to avoid a single
return statement. It ends up in code that is hard to read and hard to
understand, where a simple return cleans up well.

Which form is used depends on the question: what is easier to reach?
- complete success? Return on success, leave the error case open.
- failure? Return the failure found, leave the work open,
as it can't be done once the error is reached.

Hanging around inside the function only because one has to reach the
one and only return statement makes it hard to understand the function
as a whole.
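
To make the contrast concrete, here is a sketch of the same small task written both ways. The functions copy_early_return and copy_single_exit and their error codes (-1, -2) are invented for this illustration; they are not from anyone's post.

```c
#include <stddef.h>
#include <string.h>

/* Early-return form: leave as soon as an error is found. */
int copy_early_return(char *dst, size_t dstsize, const char *src)
{
    size_t len;

    if (dst == NULL || src == NULL)
        return -1;                  /* bad argument */
    len = strlen(src);
    if (len >= dstsize)
        return -2;                  /* destination too small */
    memcpy(dst, src, len + 1);      /* copy including the terminator */
    return 0;                       /* everything above succeeded */
}

/* Single-exit form: same logic, one return at the bottom. */
int copy_single_exit(char *dst, size_t dstsize, const char *src)
{
    int result = 0;

    if (dst == NULL || src == NULL) {
        result = -1;
    } else {
        size_t len = strlen(src);

        if (len >= dstsize) {
            result = -2;
        } else {
            memcpy(dst, src, len + 1);
        }
    }
    return result;
}
```

Both behave identically; with only two checks the nesting is still manageable, and each further check adds another level to the single-exit form.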

>
> > for(;a[i];i++)
>
> This is fine, a really puristt would suggest initializing "i" here,
> just in case someone ever adds code between the initialization and this
> point and one loses track of wat's in "i".

Don't bother about style. Yes, it makes sense to initialise the control
variables of the loop in the initialiser list. But again, it's mostly
only a style question.

> Also a bubbly-headed coder might suggest "a[i] != EndOfString" or even
> "a[i] IsNot EndOfString". Much clearer when the F133 payroll program
> for the whole Skunk Works Division of Lockheed isnt running (a friend
> of mine was called at 4 AM to fix that).
> Yes, it takes a smidgen more typing, but disk space is not exactly
> expensive these days.

Again: don't bother about style.

> Otherwise okay.
>
To OP: Don't be so miserly with spaces!

e.g.: for ( ; a[i]; i++)

is more readable than yours.

--
Tschau/Bye
Herbert

Visit http://www.ecomstation.de the home of German eComStation
eComStation 1.2 in German is here!

Richard Bos

Aug 24, 2006, 3:38:52 AM

"Ancient_Hacker" <gr...@comcast.net> wrote:

> Just a *few* suggestions:

[ Can you please not snip attributions? Thanks. ]

> > int i=0,apos=0,j;
>
> You're executing TWO assignment statements unecessarily in the case
> where one or both params are NULL.

No, he's not. He's initialising.

> And you neednt set "j" at all here,

And, surprise, he isn't.

> > if(!a || !b) /* basic check*/

> But would it kill you to write clearer code, ala:

>
> if( a == NULL || b == NULL ) ....
>
> Yes, I know !a is a well established C idiom. Even highly regarded
> snippets of code are full of that kind of thing. Pls ponder the
> relative clarity of each, especially when you get a call at 2:21AM to
> fix your program.

How amusing that you use "Pls" while telling the OP not to use !a.

If you can't understand perfectly normal C at two in the morning, don't
answer the phone at two in the morning. !a will be the least of your
worries.

> > return;
>
> "return" is a mighty hand thing, but it's in a sense a amorphous goto.
> Consider having just one exit point, at the bottom of the function.

IOW, consider complicating the layout of your function for the paltry
purpose of pleasing Niklaus Wirth. Bah.

> > for(;a[i];i++)

> Also a bubbly-headed coder might suggest "a[i] != EndOfString" or even
> "a[i] IsNot EndOfString".

That would be a truly phenomenally bubbly-headed coder. If you want to
hack on the Bourne shell, you know where to find it.

Richard

Richard Bos

Aug 24, 2006, 3:41:13 AM

Frederick Gotham <fgot...@SPAM.com> wrote:

> main() posted:
>
> > #include <stdio.h>
> >
> > void fun(char *a,const char *b)
> > {
> > int i=0,apos=0,j;
> > if(!a || !b) /* basic check*/
> > return;
>
> You should decide whether you want the programmer to be able to supply this
> function with null pointers. If you forbid this, then use assertions:
>
> assert(a);
> assert(b);

While developing only! Do not, ever, let a user see the result of an
assertion failure. They get remarkably stroppy when all their work is
dumped in the bit bucket with nothing to show for it but a cryptic error
message, and that for no better reason than that the programmer was too
lazy to use proper error checking instead of assert(). NDEBUG is your
friend, here.
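
As a rough sketch of that division of labour: a condition that bad input can trigger at run time gets a real check and an error return, while assert() is kept for internal invariants and compiles away when NDEBUG is defined. The name squeeze_checked and the 0 / -1 return convention are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 0 on success, -1 if either argument is a null pointer. */
int squeeze_checked(char *s, const char *reject)
{
    size_t i, out = 0;

    if (s == NULL || reject == NULL)
        return -1;              /* runtime check: survives NDEBUG builds */
    for (i = 0; s[i] != '\0'; i++) {
        if (strchr(reject, s[i]) == NULL)
            s[out++] = s[i];
    }
    assert(out <= i);           /* internal invariant: gone under NDEBUG */
    s[out] = '\0';
    return 0;
}
```

The caller can then decide how to report a -1, instead of the program aborting with an assertion message.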

Richard

Frederick Gotham

Aug 24, 2006, 11:33:11 AM

Ian Collins posted:

>> Efficiency and clarity. And efficiency again.
>>
> Can you provide an example?

Firstly, let's take two snippets which achieve the same objective:

Snippet (1):

size_t i;

for(i = 0; i != len; ++i)
{
arr[i] += 77;
}

Snippet (2):

int *p = arr;
int const *const pover = arr + len;

do *p++ += 77;
while(p != pover);

The body of the loop in Snippet (1) is equivalent to:

*(arr + i) += 77;

Upon each iteration, "i" is added to the address of the first element, and
then this address is dereferenced and assigned to.

Looking at the body of the loop in Snippet (2), "p" is dereferenced and
assigned to, and then "p" is incremented.

Snippet (2) is more efficient, and is cleaner in my opinion -- I much prefer
incrementing pointers to playing around with array indexes (but maybe
this is a matter of personal taste).

Of course, an optimising compiler *might* produce the same code for both,
but I prefer Snippet (2), as I like to write efficient code, and I also
think it's cleaner.

--

Frederick Gotham

Aug 24, 2006, 11:34:18 AM

posted:

> Using arrays rather than pointers is often clearer
> and also often produces faster code. If you measure
> you may find yourself giving up some of your ideas
> about which choice is better.

I don't see how array subscripting could be faster. Elsethread, I have
explained how the pointer version is likely to be faster.

Can you give an example of where array subscripting would be faster for
iterating through an array?

--

Frederick Gotham

Aug 24, 2006, 11:35:08 AM

Bill Pursell posted:

> How do you terminate your lists?

I don't understand the question, please be more specific.

--

Frederick Gotham

Aug 24, 2006, 11:38:31 AM

Herbert Rosenau posted:

> Microsoft does so many things absolutely wrong that having M$ as
> excample makes no sense except as bad one.

I particularly loathe their use of "memset" for initialising arrays:

float array1[56];
int *array2[56];

memset(array1,0,sizeof array1);
memset(array2,0,sizeof array2);

Not only is it stupid, it's non-portable too.
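
The portability concern is that memset() sets every byte to zero, and the C standard does not promise that all-bits-zero is the representation of 0.0f or of a null pointer, even though it is on most common platforms. A brace initialiser gives each element the zero value of its own type instead. A minimal sketch, with the checking function arrays_start_zeroed made up for this example:

```c
#include <stddef.h>

/* Returns 1 if every element has the zero value of its type. */
int arrays_start_zeroed(void)
{
    float array1[56] = {0};     /* every element is 0.0f */
    int *array2[56] = {0};      /* every element is a null pointer */
    size_t i;

    for (i = 0; i < sizeof array1 / sizeof array1[0]; i++) {
        if (array1[i] != 0.0f || array2[i] != NULL)
            return 0;
    }
    return 1;
}
```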

> Boa, ey! Not anybody is a beginner like you who can't read and
> understund most common used style. As sayed above: don't bother on
> style.

I tend to get jumped on whenever I bring up that argument.

--

Frederick Gotham

Aug 24, 2006, 11:39:41 AM

Ancient_Hacker posted:

> Pls ponder the relative clarity of each, especially when you get a call
> at 2:21AM to fix your program.

I don't answer the phone after about 10pm -- I find life to be more pleasant
that way.

--

Frederick Gotham

Aug 24, 2006, 11:42:31 AM

Richard Bos posted:

>> assert(a);
>> assert(b);
>
> While developing only! Do not, ever, let a user see the result of an
> assertion failure. They get remarkably stroppy when all their work is
> dumped in the bit bucket with nothing to show for it but a cryptic error
> message, and that for no better reason than that the programmer was too
> lazy to use proper error checking instead of assert(). NDEBUG is your
> friend, here.

I was implying that it would be a compile-time programmer error if either
pointer were null, not a runtime error.

When you provide somebody with a completed program which you've written, all
programming errors should have been remedied, and the user should notice no
difference whether NDEBUG is defined or not (except of course for a minor
speed difference).

--

Frederick Gotham

Bill Pursell

Aug 24, 2006, 12:25:45 PM

Frederick Gotham wrote:
> Bill Pursell posted:
>
> > How do you terminate your lists?
>
>
> I don't understand the question, please be more specific.

It's pretty standard to implement a list like this:

struct foo {
void *data;
struct foo * next;
};

And then traverse the list via:

for ( a_foo = head; a_foo /* != NULL*/; a_foo = a_foo->next)
;

That will only work if the first element inserted into
the list has had the assignment:
a_foo->next = NULL;
(i.e., the tail of the list must point to NULL).
There are other ways to implement this, of course, but
the most natural is to use NULL as the terminator.
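
A small sketch of that arrangement, reusing struct foo from above. The helpers list_push and list_length are hypothetical names introduced only for this example; pushing onto an empty list (head == NULL) is what gives the tail its null next pointer.

```c
#include <stddef.h>

struct foo {
    void *data;
    struct foo *next;
};

/* Pushes node onto the front of the list and returns the new head. */
struct foo *list_push(struct foo *head, struct foo *node)
{
    node->next = head;      /* the first node pushed gets next == NULL */
    return node;
}

/* Counts nodes by walking until the null terminator. */
size_t list_length(const struct foo *head)
{
    const struct foo *a_foo;
    size_t n = 0;

    for (a_foo = head; a_foo != NULL; a_foo = a_foo->next)
        n++;
    return n;
}
```

Writing 0 or omitting the comparison works just as well as spelling out NULL; either way a null pointer value ends the list.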

--
Bill

ena8...@yahoo.com

Aug 24, 2006, 12:55:01 PM

size_t my_strlen_array( const char *s ){
unsigned r = 0;
while (s[r]) r++;
return r;
}

size_t my_strlen_pointer( const char *s ){
unsigned r = 0;
while (*s++) r++;
return r;
}

size_t my_strlen_pointer_2( const char *const s_0 ){
const char *s = s_0;
while (*s) s++;
return s-s_0;
}

My measurements show my_strlen_array is faster than
either of the pointer versions. YMMV, of course.

Flash Gordon

Aug 24, 2006, 12:26:03 PM

Frederick Gotham wrote:
> Ian Collins posted:
>
>>> Efficiency and clarity. And efficiency again.
>>>
>> Can you provide an example?
>
>
> Firstly, let's take two snippets which achieve the same objective:
>
> Snippet (1):
>
> size_t i;
>
> for(i = 0; i != len; ++i)
> {
> arr[i] += 77;
> }
>
> Snippet (2):
>
> int *p = arr;
> int const *const pover = arr + len;
>
> do *p++ += 77;
> while(p != pover);
>
> The body of the loop in Snippet (1) is equivalent to:
>
> *(arr + i) += 77;
>
> Upon each iteration, "i" is added to the address of the first element, and
> then this address is dereferenced and assigned to.

You are assuming that the optimiser does not do anything. Optimisers
have certainly been dealing with things like this for over 10 years
(IIRC I saw this kind of strength reduction mentioned in a manual for a
C compiler in 1995).

> Looking at the body of the loop in Snippet (2), "p" is dereferenced and
> assigned to, and then "p" is incremented.
>
> Snippet (2) is more efficent,

You are assuming that the optimiser does not convert both to the same
machine code. You are also assuming that the processor does not have
index operations which would make the array form require only one
instruction or that said machine code instructions are slower.

> and is cleaner in my opinion -- I much prefer
> incrementing pointers than playing around with array indexes (but maybe
> this is a matter of personal taste).

That, indeed, is true. It is a matter of personal taste.

> Of course, an optimising compiler *might* produce the same code for both,
> but I prefer Snippet (2), as I like to write efficient code, and I also
> think it's cleaner.

It isn't necessarily more efficient even if converted directly into
assembler with no optimisation. Personally I would bin any compiler so
bad it could not reduce array operations to pointer operations if the
pointer version was more efficient. As others have pointed out it can be
*harder* for the compiler to optimise pointer operations because it has
to be able to prove whether aliasing is occurring or not.
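
For readers unfamiliar with the term: two pointers alias when they can refer to the same object. A minimal sketch of why that constrains the optimiser; the function add_twice is invented for this illustration.

```c
/* Adds *src to *dst twice. */
void add_twice(int *dst, const int *src)
{
    *dst += *src;
    *dst += *src;   /* *src must be re-read: the first store may have changed it */
}
```

If dst and src point to different objects the result is *dst + 2 * *src, but if both point to the same int holding 1 the result is 4. Unless the compiler can prove the pointers never alias, it cannot collapse the two additions into one.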

The best rule is to always write what you consider to be clearest, not
what you think might save a fraction off the execution time. Since you
consider the pointer version to be clearer *that* is a valid reason for
preferring the pointer versions, but when someone considers the array
index version to be clearer there is nothing intrinsically wrong with it.
--
Flash Gordon

Frederick Gotham

Aug 24, 2006, 2:34:16 PM

Flash Gordon posted:

> You are assuming that the optimiser does not convert both to the same
> machine code. You are also assuming that the processor does not have
> index operations which would make the array form require only one
> instruction or that said machine code instructions are slower.

Optimisers don't give the programmer freedom to write code however he/she
pleases. Some code is always going to be more efficient than other code.

>> Of course, an optimising compiler *might* produce the same code for
>> both, but I prefer Snippet (2), as I like to write efficient code, and
>> I also think it's cleaner.
>
> It isn't necessarily more efficient even if converted directly in to
> assembler with no optimisation. Personally I would bin any compiler so
> bad it could not reduce array operations to pointer operations if the
> pointer version was more efficient.

You criticise the compiler... Why not criticise the programmer?! The
programmer clearly could have used pointers if they desired.

> As other have pointed out it can be *harder* for the compiler to
> optimise pointer operations because it has to be able to prove whether
> aliasing is occurring or not.

What is aliasing? I've never heard of it. (Not being sarcastic)

> The best rule is to always write what you consider to be clearest, not
> what you think might save a fraction off the execution time. Since you
> consider the pointer version to be clearer *that* is a valid reason for
> preferring the pointer versions, but when someone considers the array
> index version to be clearer there is nothing intrinsically wrong with
> it.

Indeed, both snippets achieve the same objective -- one is inherently more
efficient though.

--

Frederick Gotham

Aug 24, 2006, 2:43:50 PM

> size_t my_strlen_array( const char *s ){
> unsigned r = 0;
> while (s[r]) r++;
> return r;
> }

> size_t my_strlen_pointer( const char *s ){
> unsigned r = 0;
> while (*s++) r++;
> return r;
> }
>
> size_t my_strlen_pointer_2( const char *const s_0 ){
> const char *s = s_0;
> while (*s) s++;
> return s-s_0;
> }
>
> My measurements show my_strlen_array is faster than
> either of the pointer versions. YMMV, of course.

How could that possibly be? I would have thought the fastest was:

#include <assert.h>
#include <stddef.h>

size_t StrLength(char const *const start)
{
int dummy = (assert(start), 0);

char const *p = start;

while(*p++);

return p - start - 1;
}

--

Frederick Gotham

Aug 24, 2006, 2:45:04 PM

Bill Pursell posted:

> There are other ways to implement this, of course, but
> the most natural is to use NULL as the terminator.

I test a pointer as follows:

if(p)
{
/* Stuff */
}

--

Frederick Gotham

Ian Collins

Aug 24, 2006, 3:39:10 PM

Frederick Gotham wrote:
> Ian Collins posted:
>
>
>>>Efficiency and clarity. And efficiency again.
>>>
>>
>>Can you provide an example?
>
> Firstly, let's take two snippets which achieve the same objective:
>
> Snippet (1):
>
> size_t i;
>
> for(i = 0; i != len; ++i)
> {
> arr[i] += 77;
> }
>
> Snippet (2):
>
> int *p = arr;
> int const *const pover = arr + len;
>
> do *p++ += 77;
> while(p != pover);
>
> The body of the loop in Snippet (1) is equivalent to:
>
> *(arr + i) += 77;
>
> Upon each iteration, "i" is added to the address of the first element, and
> then this address is dereferenced and assigned to.
>
> Looking at the body of the loop in Snippet (2), "p" is dereferenced and
> assigned to, and then "p" is incremented.
>
> Snippet (2) is more efficent, and is cleaner in my opinion -- I much prefer
> incrementing pointers than playing around with array indexes (but maybe
> this is a matter of personal taste).
>

You are assuming a very naive compiler. For example, where your first
snippet is f1 and the second f2:

int main(void) {
int arr[10];

f1( arr, 10 );
f2( arr, 10 );
}

I see the call to f1 optimised to:

leal 8(%esp),%edx
movl 8(%esp),%eax
addl $77,%eax
movl %eax,8(%esp)
addl $77,12(%esp)
addl $77,16(%esp)
addl $77,20(%esp)
addl $77,24(%esp)
addl $77,28(%esp)
addl $77,32(%esp)
addl $77,36(%esp)
addl $77,40(%esp)
addl $77,44(%esp)

Which is about as efficient as one can get.
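
For anyone who wants to try this at home, here is a minimal sketch of what f1 and f2 might look like, assuming they simply wrap the two snippets quoted above. The signatures and the empty-array guard in f2 are my assumptions for illustration, not necessarily the code that produced the listing.

```c
#include <stddef.h>

/* Snippet (1): array subscripting. */
void f1(int *arr, size_t len)
{
    size_t i;

    for (i = 0; i != len; ++i) {
        arr[i] += 77;
    }
}

/* Snippet (2): pointer increment. The quoted do/while assumes
   len > 0, so this sketch adds a guard for the empty case. */
void f2(int *arr, size_t len)
{
    int *p = arr;
    int const *const pover = arr + len;

    if (len == 0) {
        return;
    }
    do {
        *p++ += 77;
    } while (p != pover);
}
```

Comparing the generated assembly for the two (for example with the compiler's "emit assembly" option) is the quickest way to see what your own compiler makes of them.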

Another point to note in these days of multiple core CPUs, it is much
easier for a compiler that supports OpenMP to add parallelism to snippet 1.

--
Ian Collins.

Bill Pursell

Aug 24, 2006, 4:06:18 PM

yes, but how do you **set** the pointer?

--
Bill Pursell

Clark S. Cox III

Aug 24, 2006, 4:10:31 PM

Frederick Gotham wrote:

> Flash Gordon posted:

> > As others have pointed out it can be *harder* for the compiler to
> > optimise pointer operations because it has to be able to prove whether
> > aliasing is occurring or not.
>
> What is aliasing? I've never heard of it. (Not being sarcastic)

http://en.wikipedia.org/wiki/Aliasing_%28computing%29
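
In short, two pointers alias when they may refer to the same object. A minimal sketch of why that matters to an optimiser follows; the function names are made up for illustration, and `restrict` needs a C99 compiler.

```c
/* dst and src may point into the same array, so the compiler has to
   assume that a store through dst can change *src, and must reload
   *src on every iteration. */
void add_all(int *dst, const int *src, int n)
{
    int i;

    for (i = 0; i < n; ++i) {
        dst[i] += *src;
    }
}

/* With C99 restrict the caller promises there is no overlap, so the
   compiler is free to load *src once before the loop. */
void add_all_restrict(int *restrict dst, const int *restrict src, int n)
{
    int i;

    for (i = 0; i < n; ++i) {
        dst[i] += *src;
    }
}
```

Calling `add_all(e, &e[0], 3)` is legal and the result depends on the reload: the first store changes the value that is added to the remaining elements. Making the same call through `add_all_restrict` would break the promise and is undefined.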

> > The best rule is to always write what you consider to be clearest, not
> > what you think might save a fraction off the execution time. Since you
> > consider the pointer version to be clearer *that* is a valid reason for
> > preferring the pointer versions, but when someone considers the array
> > index version to be clearer there is nothing intrinsically wrong with
> > it.
>
>
> Indeed, both snippets achieve the same objective -- one is inherently more
> efficient though.

This is simply not true; given the following code:
//BEGIN CODE
#include <stdio.h>
#include <time.h>

void foo1(int arr[], size_t len)
{
for(size_t i = 0; i != len; ++i)
{
arr[i] += 77;
}
}

void foo2(int arr[], size_t len)
{

int *p = arr;
int const *const pover = arr + len;

do
{
*p++ += 77;
}
while(p != pover);
}

int main()
{
static int arr[1000000]; /* static: zero-initialised and kept off the stack */

clock_t start = clock();
for(int i=0; i!= 100; ++i)
{
foo1(arr, sizeof arr / sizeof *arr);
}

clock_t middle = clock();
for(int i=0; i!= 100; ++i)
{
foo2(arr, sizeof arr / sizeof *arr);
}

clock_t end = clock();

printf("array indexing: %f seconds\n", (middle - start) * 1.0 /
CLOCKS_PER_SEC);
printf("pointer arith : %f seconds\n", (end - middle) * 1.0 /
CLOCKS_PER_SEC);

return 0;
}
//END CODE

When I compile and run this on my Intel MacBook (with GCC 4.0.1), I get
the following output from 3 successive runs:

array indexing: 0.240000 seconds
pointer arith : 0.240000 seconds

array indexing: 0.240000 seconds
pointer arith : 0.230000 seconds

However, when I do the same on my PowerMac G5, I get:

array indexing: 0.340000 seconds
pointer arith : 0.360000 seconds

array indexing: 0.340000 seconds
pointer arith : 0.350000 seconds

array indexing: 0.340000 seconds
pointer arith : 0.350000 seconds

So, there you have it, on one platform, the difference seems to be just
noise, and on another the array indexing method is actually faster.

--
Clark S. Cox III
clar...@gmail.com

ena8...@yahoo.com

Aug 24, 2006, 4:58:40 PM

In fact I had tried something like that along
with my other examples -

size_t my_strlen_pointer_3( const char *const s_0 ){

const char *s = s_0;

while (*s++);
return s-1-s_0;
}

It ran slower than all the other versions, so I
didn't put it in.

How can that be? It be!

Frederick Gotham

Aug 24, 2006, 5:10:53 PM

Clark S. Cox III posted:

> So, there you have it, on one platform, the difference seems to be just
> noise, and on another the array indexing method is actually faster.

On my system (Intel Pentium 3, 500 MHz), pointers are 61% faster.

61% is far from negligible.

Try for yourself:

#include <stddef.h>
#include <stdio.h>
#include <time.h>

void Fun1(unsigned *const p, size_t const len)
{
size_t i;

for(i = 0; i != len; ++i)
{
p[i] += 77;
}
}

void Fun2(unsigned *p, size_t const len)
{
unsigned const *const pover = p + len;

do *p++ += 77;
while(pover != p);
}

int main(void)
{
clock_t time_start, time_elapsed1=0, time_elapsed2=0;
size_t i;
unsigned arr[99999] = {0};

puts("Please wait, this shouldn't take longer than 20 seconds...\n");

for(i = 0,time_start = clock(); 1999 != i; ++i)
{
Fun1(arr, sizeof arr / sizeof*arr);
}
time_elapsed1 += clock() - time_start;

for(i = 0,time_start = clock(); 1999 != i; ++i)
{
Fun2(arr, sizeof arr / sizeof*arr);
}
time_elapsed2 += clock() - time_start;

for(i = 0,time_start = clock(); 1999 != i; ++i)
{
Fun1(arr, sizeof arr / sizeof*arr);
}
time_elapsed1 += clock() - time_start;

for(i = 0,time_start = clock(); 1999 != i; ++i)
{
Fun2(arr, sizeof arr / sizeof*arr);
}
time_elapsed2 += clock() - time_start;

printf(" Array Subscripting: %f seconds.\n",
(double)time_elapsed1 / CLOCKS_PER_SEC);
printf("Pointer Incrementation: %f seconds.\n",
(double)time_elapsed2 / CLOCKS_PER_SEC);

if(time_elapsed1 > time_elapsed2)
{
printf("\nPointers are %f%% faster.\n\n",
(double)time_elapsed1/time_elapsed2*100-100);
}
else
{
printf("\nArrays are %f%% faster.\n\n",
(double)time_elapsed2/time_elapsed1*100-100);
}

return 0;
}

--

Frederick Gotham

Aug 24, 2006, 5:12:50 PM

Bill Pursell posted:

Well to be honest, I don't write much C code (mostly C++). Up until very
recently, I always wrote:

p = 0;

, but lately I've taken a liking to "nullptr":

#define nullptr ((void*)0)

p = nullptr;

If I were writing C, I'd probably use NULL instead.

Anyhow, I always check for null pointers with:

if (!p) ...

--

Frederick Gotham

Ian Collins

Aug 24, 2006, 6:18:27 PM

Frederick Gotham wrote:
> Clark S. Cox III posted:
>
>
>>So, there you have it, on one platform, the difference seems to be just
>>noise, and on another the array indexing method is actually faster.
>
>
>
> On my system (Intel Pentium 3, 500 MHz), pointers are 61% faster.
>
> 61% is far from negligible.
>
> Try for yourself:
>

I did, the difference varied between arrays 4% faster and pointers 11%
faster, depending on target and optimisations.

With a little help from OpenMP and an extra CPU, arrays were 37% faster.

--
Ian Collins.

Frederick Gotham

Aug 24, 2006, 7:28:52 PM

Ian Collins posted:

>> Try for yourself:
>>
><snipped code>
>
> I did, the difference varied between arrays 4% faster and pointers 11%
> faster, depending on target and optimisations.
>
> With a little help from OpenMP and an extra CPU, arrays were 37% faster.

I don't know what OpenMP is, plus I've never worked with a computer which
had more than one CPU.

Whatever way you've achieved the 37% faster arrays, I wonder could that
power be channelled to speed up the pointers... ?

I'm still curious as to how the array subscripting could possibly be
faster. It seems to me that it'd be faster to:

(1) Dereference a pointer
(2) Increment the pointer

rather than:

(1) Add an integer to a pointer value
(2) Dereference the pointer
(3) Increment the integer

--

Frederick Gotham

Clark S. Cox III

Aug 24, 2006, 7:49:06 PM

Some CPUs have single instructions for "load the memory at the location
described by a pointer and offset"; on such platforms it's:

(1) Load the indexed memory
(2) Increment the index

My point being that you cannot say that either form is inherently more
efficient than the other. With that in mind, it is usually better to
write readable code first, and then to only optimize once you find a
bottleneck.

Compiler optimizers are much better than you seem to believe. :)

Ian Collins

Aug 24, 2006, 7:50:32 PM

Frederick Gotham wrote:
> Ian Collins posted:
>
>
>>>Try for yourself:
>>>
>>
>><snipped code>
>>
>>I did, the difference varied between arrays 4% faster and pointers 11%
>>faster, depending on target and optimisations.
>>
>>With a little help from OpenMP and an extra CPU, arrays were 37% faster.
>
> I don't know what OpenMP is, plus I've never worked with a computer which
> had more than one CPU.
>

http://www.openmp.org

It's the standard way of adding parallel processing to C, C++ and
Fortran code. The single core has had its day on the desktop.
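
To give an idea of how little it takes, here is a rough sketch of the array loop with an OpenMP directive added. The function name is made up, and this is not necessarily the code behind the 37% figure. The loop uses a signed index and a `<` test because that is the loop form OpenMP has always accepted.

```c
/* When built with OpenMP enabled (for example -fopenmp with GCC), the
   iterations are divided among the available threads. Without it the
   pragma is ignored and the loop runs serially, as before. */
void add77_parallel(int *arr, long len)
{
    long i;

    #pragma omp parallel for
    for (i = 0; i < len; ++i) {
        arr[i] += 77;
    }
}
```

Each iteration touches a different element, which is what makes the loop safe to split up.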

> Whatever way you've achieved the 37% faster arrays, I wonder could that
> power be channelled to speed up the pointers... ?
>

I doubt it, OpenMP works with arrays (or at least that's all I used it for).

> I'm still curious as to how the array subscripting could possibly be
> faster. It seems to me that it'd be faster to:
>
> (1) Dereference a pointer
> (2) Increment the pointer
>
> rather than:
>
> (1) Add an integer to a pointer value
> (2) Dereference the pointer
> (3) Increment the integer
>

Mainly because the compiler finds it easier to do loop unrolling with
arrays. In the case of Sun cc, the array loop was optimised with 5
consecutive assignments.

Either way, it's not wise to make generalisations on performance,
there's inevitably an exception!

--
Ian Collins.

Frederick Gotham

Aug 24, 2006, 9:14:09 PM

posted:

> It ran slower than all the other versions, so I
> didn't put it in.
>
> How can that be? It be!

Try this... it prints the following on my system:

Strlen1: 9.703000 seconds.
Strlen2: 6.870000 seconds.
Strlen3: 9.704000 seconds.
Strlen4: 9.584000 seconds.
strlen: 0.010000 seconds.

I would have thought Strlen3 would be the fastest (except for "strlen" of
course). I'm quite surprised however that strlen is so much faster...

#include <stddef.h>
#include <stdio.h>
#include <time.h>

#include <string.h>

char const str[] =
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz"
"abcdefghijklmnopqrstuvwxyz""abcdefghijklmnopqrstuvwxyz";

typedef struct Stopwatch {
clock_t start_time;
} Stopwatch;

void StopwatchReset(Stopwatch *const p)
{
p->start_time = clock();
}

clock_t StopwatchDuration(Stopwatch *const p)
{
/* Returns time elapsed since
stopwatch was last reset. */

clock_t const now = clock();

return now - p->start_time;
}

size_t Strlen1(char const *const p)
{
size_t i = 0;
while(p[i]) ++i;
return i;
}

size_t Strlen2(char const *p)
{
size_t i = 0;
while(*p++) ++i;
return i;
}

size_t Strlen3(char const *const parg)
{
char const *p = parg;
while (*p++);
return p - parg - 1;
}

size_t Strlen4(char const *const parg)
{
char const *p = parg;
while (*p) ++p;
return p - parg;
}

int main(void)
{
Stopwatch watch;
clock_t dur1,dur2,dur3,dur4,dur5;
size_t i,j=0;

puts("Please wait, this shouldn't take longer than 30 seconds...\n");

for(StopwatchReset(&watch),i = 0;500000 != i;++i)
Strlen1(str);
dur1 = StopwatchDuration(&watch);

for(StopwatchReset(&watch),i = 0;500000 != i;++i)
Strlen2(str);
dur2 = StopwatchDuration(&watch);

for(StopwatchReset(&watch),i = 0;500000 != i;++i)
Strlen3(str);
dur3 = StopwatchDuration(&watch);

for(StopwatchReset(&watch),i = 0;500000 != i;++i)
Strlen4(str);
dur4 = StopwatchDuration(&watch);

for(StopwatchReset(&watch),i = 0;500000 != i;++i)
strlen(str);
dur5 = StopwatchDuration(&watch);

printf("Strlen1: %f seconds.\n",
(double)dur1 / CLOCKS_PER_SEC);
printf("Strlen2: %f seconds.\n",
(double)dur2 / CLOCKS_PER_SEC);
printf("Strlen3: %f seconds.\n",
(double)dur3 / CLOCKS_PER_SEC);
printf("Strlen4: %f seconds.\n",
(double)dur4 / CLOCKS_PER_SEC);
printf(" strlen: %f seconds.\n",
(double)dur5 / CLOCKS_PER_SEC);

return 0;
}

Ian Collins

Aug 24, 2006, 9:39:23 PM

Frederick Gotham wrote:
> posted:
>
>
>>It ran slower than all the other versions, so I
>>didn't put it in.
>>
>>How can that be? It be!
>
>
>
> Try this... it prints the following on my system:
>
> Strlen1: 9.703000 seconds.
> Strlen2: 6.870000 seconds.
> Strlen3: 9.704000 seconds.
> Strlen4: 9.584000 seconds.
> strlen: 0.010000 seconds.
>

Again, it depends:

32 bit:

Strlen1: 0.590000 seconds.
Strlen2: 0.900000 seconds.
Strlen3: 0.890000 seconds.
Strlen4: 0.890000 seconds.
strlen: 0.230000 seconds.

64 bit:

Strlen1: 0.590000 seconds.
Strlen2: 0.600000 seconds.
Strlen3: 0.600000 seconds.
Strlen4: 0.600000 seconds.
strlen: 0.120000 seconds.

Same CPU and compiler optimisations.

> I would have thought Strlen3 would be the fastest (except for "strlen" of
> course). I'm quite surprised however that strlen is so much faster...
>

Some CPUs are slower doing indirect access (*p).

strlen doesn't have to scan character by character.

--
Ian Collins.

Sjouke Burry

Aug 24, 2006, 9:41:50 PM

Might the compiler have optimized the true system routine
out of the loop, because nothing is modified in the loop?
(And then throw out the empty for loop?)
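
One way to rule that out is to make every call count as far as the compiler is concerned. The following is only a sketch under that assumption; `sink` and `time_strlen` are names invented for illustration.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Stores to a volatile object are observable behaviour, so the
   compiler has to perform every one of them. */
volatile size_t sink;

/* Returns the processor time, in seconds, taken by reps calls. */
double time_strlen(const char *s, unsigned long reps)
{
    const char *volatile vs = s;   /* re-read on every iteration */
    unsigned long i;
    clock_t start = clock();

    for (i = 0; i != reps; ++i) {
        sink = strlen(vs);
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Because the pointer is read through a volatile object each time round, the call cannot simply be hoisted out of the loop, and because the result is stored to `sink` the loop cannot be thrown away.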

Frederick Gotham

Aug 24, 2006, 9:54:12 PM

Sjouke Burry posted:

> Might the compiler have optimized the true system routine
> out of the loop, because nothing is modified in the loop?
> (And then throw out the empty for loop?)

I had considered that. I also considered how it's nice to snip quotes.

--

Frederick Gotham

CBFalconer

Aug 25, 2006, 2:43:59 AM

To find out all you have to do is examine the object code.

--
Chuck F (cbfal...@yahoo.com) (cbfal...@maineline.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE maineline address!

Richard Bos

Aug 25, 2006, 8:26:46 AM

Frederick Gotham <fgot...@SPAM.com> wrote:

> Richard Bos posted:
>
> >> assert(a);
> >> assert(b);
> >
> > While developing only! Do not, ever, let a user see the result of an
> > assertion failure. They get remarkably stroppy when all their work is
> > dumped in the bit bucket with nothing to show for it but a cryptic error
> > message, and that for no better reason than that the programmer was too
> > lazy to use proper error checking instead of assert(). NDEBUG is your
> > friend, here.
>
> I was implying that it would be a compile-time programmer error if either
> pointer were null, not a runtime error.

That fails for at least two reasons. One, assert() doesn't work that
way. Two, most null pointer errors cannot be caught at compile time.

Richard

Frederick Gotham

Aug 25, 2006, 10:09:37 AM

Richard Bos posted:

>> I was implying that it would be a compile-time programmer error if either
>> pointer were null, not a runtime error.
>
> That fails for at least two reasons. One, assert() doesn't work that
> way. Two, most null pointer errors cannot be caught at compile time.

I worded that poorly. I'll try again.

Let's take a simple function:

void Func(int *const p)
{
assert(p);

*p = 5;
}

In the finished program, it shouldn't make a difference whether NDEBUG is
defined or not, because Func should never be called with a null pointer.

If it IS called with a null pointer, it's not a runtime error, but rather an
error that the programmer made in writing the function which calls Func. (My
use of "compile-time" was misleading.)

--

Frederick Gotham

Chris Dollin

Aug 25, 2006, 10:29:18 AM

Frederick Gotham wrote:

> Let's take a simple function:
>
> void Func(int *const p)
> {
> assert(p);
>
> *p = 5;
> }
>
> In the finished program, it shouldn't make a difference whether NDEBUG is
> defined or not, because Func should never be called with a null pointer.
>
> If it IS called with a null pointer, it's not a runtime error,

It's certainly a run-time error: it's an error, and it occurs at run-time.

> but rather an error that the programmer made in writing the function
> which calls Func.

The programmer made a mistake in writing the caller (or perhaps the caller's
caller, etc), and that mistake resulted in the error of passing null to
`Func`.

--
Chris "seeker" Dollin
"People are part of the design. It's dangerous to forget that." /Star Cops/

Frederick Gotham

Aug 25, 2006, 12:00:39 PM

Chris Dollin posted:

>> but rather an error that the programmer made in writing the function
>> which calls Func.
>
> The programmer made a mistake in writing the caller (or perhaps the
> caller's caller, etc), and that mistake resulted in the error of passing
> null to `Func`.

Yes.

If this were acceptable, I'd have "Func" throw an exception.

However, if I want to downright condemn it, I place an assert in "Func".

--

Frederick Gotham

pete

Aug 25, 2006, 10:34:16 PM

Flash Gordon wrote:
>
> Frederick Gotham wrote:
> > Flash Gordon posted:
> >

> >> The OP may also find it interesting to look at Richard Heathfield's
> >> implementation which is available at
> >> http://clc-wiki.net/wiki/K%26R2_solutions:Chapter_2:Exercise_4
> >
> > I would frown upon the use of
> > array subscripting rather than pointers.
>
> You might, but I and many others would not. Array subscripting is
> perfectly clear in my opinion.

It is, but I like pointers and
I like to rewrite other people's code.

char *str_squeeze_f(char *s1, const char *s2)
{
char *const p1 = s1;
const char *const p2 = s2;
char *p3 = s1;

while (*s1 != '\0') {
s2 = p2;
while (*s2 != '\0' && *s1 != *s2) {
++s2;
}
if (*s2 == '\0') {
*p3++ = *s1;
}
++s1;
}
*p3 = '\0';
return p1;
}
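
Since the OP asked for other ways to write the same function, here is one more sketch. It is a different technique from the nested loop above: a lookup table indexed by character value, so each character of the first string is tested once. The name `str_squeeze_table` is made up for illustration.

```c
#include <limits.h>
#include <stddef.h>

char *str_squeeze_table(char *s1, const char *s2)
{
    /* drop[c] is nonzero if character c has to be removed. */
    unsigned char drop[UCHAR_MAX + 1] = {0};
    const char *in;
    char *out = s1;

    if (s1 == NULL || s2 == NULL) {
        return s1;
    }
    for (; *s2 != '\0'; ++s2) {
        drop[(unsigned char)*s2] = 1;
    }
    for (in = s1; *in != '\0'; ++in) {
        if (!drop[(unsigned char)*in]) {
            *out++ = *in;
        }
    }
    *out = '\0';
    return s1;
}
```

The casts to unsigned char matter: plain char may be signed, and a negative array index would be undefined.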

--
pete

ena8...@yahoo.com

Aug 26, 2006, 3:31:14 PM

Frederick Gotham wrote:
> posted:
>
> > It ran slower than all the other versions, so I
> > didn't put it in.
> >
> > How can that be? It be!
>
>
> Try this... it prints the following on my system:

> [...]

I hope you realize that my courtesy in posting
the previous results doesn't carry either an
obligation or an interest to run random benchmarks.

Kelsey Bjarnason

Aug 27, 2006, 2:15:48 AM

[snips]

On Thu, 24 Aug 2006 15:38:31 +0000, Frederick Gotham wrote:

> I particularly loathe their use of "memset" for initialising arrays:
>
> float array1[56];
> int *array2[56];
>
> memset(array1,0,sizeof array1);
> memset(array2,0,sizeof array2);
>
> Not only is it stupid, it's non-portable too.

Actually, it's perfectly portable, just needs to be commented properly:

/* Fills array with patent nonsense, quite possibly causing a trap and a
system crash if you forget to put proper values in */
memset(array1,0,sizeof array1);

See? Perfectly portable. :)

Frederick Gotham

Aug 27, 2006, 6:38:28 AM

ena8t8si posted:

>> Try this... it prints the following on my system:
>> [...]
>
> I hope you realize that my courtesy in posting
> the previous results doesn't carry either an
> obligation or an interest to run random benchmarks.

Well Wup Dee Do Da Day! If you don't want to respond then don't respond -- I
didn't hold a gun to your head.

--

Frederick Gotham

ena8...@yahoo.com

Aug 27, 2006, 7:02:40 AM

Frederick Gotham wrote:
> If you don't want to respond then don't respond [...]

Please rest assured, in the future I won't.

Herbert Rosenau

Aug 27, 2006, 2:39:58 PM

On Sun, 27 Aug 2006 06:15:48 UTC, Kelsey Bjarnason
<kbjar...@ncoldns.com> wrote:

> [snips]
>
> On Thu, 24 Aug 2006 15:38:31 +0000, Frederick Gotham wrote:
>
> > I particularly loathe their use of "memset" for initialising arrays:
> >
> > float array1[56];
> > int *array2[56];
> >
> > memset(array1,0,sizeof array1);
> > memset(array2,0,sizeof array2);
> >
> > Not only is it stupid, it's non-portable too.
>
> Actually, it's perfectly portable, just needs to be commented properly:

It's NOT portable. The standard does not guarantee that all-bits-zero
is a valid (non-trap) representation of an int, float or pointer.
Filling an object of one type with zero bytes does not mean that its
value as another type equals zero. So an array of pointers filled with
zero bytes is not guaranteed to hold valid pointers, and a float
filled with zero bytes is not guaranteed to hold a proper
float 0.0.
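
The portable alternative is to assign values rather than bytes. A minimal sketch, reusing the two arrays from the quoted code (the helper name and the ARRAY_LEN constant are made up for illustration):

```c
#include <stddef.h>

enum { ARRAY_LEN = 56 };

float array1[ARRAY_LEN];
int *array2[ARRAY_LEN];

/* 0.0f and NULL are values; the implementation stores whatever bit
   pattern it uses to represent them, all-bits-zero or not. */
void clear_arrays(void)
{
    size_t i;

    for (i = 0; i != ARRAY_LEN; ++i) {
        array1[i] = 0.0f;
        array2[i] = NULL;
    }
}
```

At the point of definition, an initialiser such as `float array1[56] = {0};` does the same job, because the remaining elements are initialised as if they had static storage duration.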

> /* Fills array with patent nonsense, quite possibly causing a trap and a
> system crash if you forget to put proper values in */
> memset(array1,0,sizeof array1);
>
> See? Perfectly portable. :)

As long as you mean that UB is perfectly portable: yes. But that is what
you get by filling multi-byte values with single bytes. That it
works on your machine says nothing, as UB allows anything imaginable.

--
Tschau/Bye
Herbert

Visit http://www.ecomstation.de the home of german eComStation
eComStation 1.2 Deutsch ist da!

Michael Mair

Aug 27, 2006, 3:10:15 PM

Frederick Gotham schrieb:
> ena8t8si posted:

*plonk*

-Michael
--
E-Mail: Mine is an /at/ gmx /dot/ de address.

Richard Bos

Aug 28, 2006, 7:03:41 AM

Frederick Gotham <fgot...@SPAM.com> wrote:

If your user gets to see it, it's a runtime error.
If your user gets to see it, it had better be more than just a cryptic
assertion failure.
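
For illustration, here is a sketch of the earlier Func example with ordinary error checking in place of the assert. The name `set_to_five` and the 0 / -1 return convention are assumptions made for this example only.

```c
#include <stddef.h>

/* Returns 0 on success and -1 if p is a null pointer, so the caller
   can report the problem in its own terms (and save the user's work)
   instead of the program aborting. */
int set_to_five(int *p)
{
    if (p == NULL) {
        return -1;
    }
    *p = 5;
    return 0;
}
```

Unlike assert(), the check is still there when NDEBUG is defined.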

Richard

Kelsey Bjarnason

Aug 29, 2006, 12:23:51 AM

[snips]

On Sun, 27 Aug 2006 18:39:58 +0000, Herbert Rosenau wrote:

>> Actually, it's perfectly portable, just needs to be commented properly:
>
> It's NOT portable.

Missed the comment, did we? :)

Keith Thompson

Aug 30, 2006, 2:25:00 AM

Frederick Gotham <fgot...@SPAM.com> writes:
> Chris Dollin posted:
>>> but rather an error that the programmer made in writing the function
>>> which calls Func.
>>
>> The programmer made a mistake in writing the caller (or perhaps the
>> caller's caller, etc), and that mistake resulted in the error of passing
>> null to `Func`.
>
> Yes.
>
> If this were acceptable, I'd have "Func" throw an exception.

What do you mean by the words "throw" and "exception"? (Please check
the "Newsgroups:" header before answering that.)

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Keith Thompson

Aug 30, 2006, 3:19:31 AM

Frederick Gotham <fgot...@SPAM.com> writes:
> Richard Bos posted:
>
>>> assert(a);
>>> assert(b);
>>
>> While developing only! Do not, ever, let a user see the result of an
>> assertion failure. They get remarkably stroppy when all their work is
>> dumped in the bit bucket with nothing to show for it but a cryptic error
>> message, and that for no better reason than that the programmer was too
>> lazy to use proper error checking instead of assert(). NDEBUG is your
>> friend, here.
>

> I was implying that it would be a compile-time programmer error if either
> pointer were null, not a runtime error.

(Unclear, but you explained that downthread.)

> When you provide somebody with a completed program which you've written, all
> programming errors should have been remedied, and the user should notice no
> difference whether NDEBUG is defined or not (except of course for a minor
> speed difference).

All programming errors should have been remedied? In what universe
can that *ever* be guaranteed? Yes, ideally all software should be
bug-free, but ...

I suggest that assert() calls may be left in production code only to
cover "this can't happen" programming errors. For example, if you're
*sure* that the expression in a switch statement can take on only one
of the values that you've explicitly handled, it might make sense to
have an assert() in a "default:" clause. If there's a logical error
in your program, it's (arguably) better for the assert() to trigger
than for the program to continue executing. The error message, which
will include the file name and line number, won't be particularly
useful to the user, but it should help the developer track down the
bug.
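
A minimal sketch of the kind of thing I mean; the enum and the function are invented for illustration:

```c
#include <assert.h>

enum colour { RED, GREEN, BLUE };

const char *colour_name(enum colour c)
{
    switch (c) {
    case RED:
        return "red";
    case GREEN:
        return "green";
    case BLUE:
        return "blue";
    default:
        /* "This can't happen": stop here rather than carry on
           with a value the program was never meant to see. */
        assert(!"colour_name: unhandled enum colour value");
        return "unknown";   /* reached only if NDEBUG is defined */
    }
}
```

A string literal is never a null pointer, so the asserted expression is always false and the text of the literal shows up in the failure message.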

Richard Bos

Aug 30, 2006, 9:05:40 AM

Keith Thompson <ks...@mib.org> wrote:

> I suggest that assert() calls may be left in production code only to
> cover "this can't happen" programming errors. For example, if you're
> *sure* that the expression in a switch statement can take on only one
> of the values that you've explicitly handled, it might make sense to
> have an assert() in a "default:" clause. If there's a logical error
> in your program, it's (arguably) better for the assert() to trigger
> than for the program to continue executing. The error message, which
> will include the file name and line number, won't be particularly
> useful to the user, but it should help the developer track down the
> bug.

You have a better class of user than I do. If mine noticed the assertion
message at all, they'd report it as "Assertion failure... erm...
[nonsensical-and-certainly-incorrect-name].c... oh, and there were some
numbers. No, sorry, I don't know what the numbers were."

Richard

Charlton Wilbur

Aug 30, 2006, 11:17:15 AM

r...@hoekstra-uitgeverij.nl (Richard Bos) writes:

> Keith Thompson <ks...@mib.org> wrote:
>
> > The error message, which
> > will include the file name and line number, won't be particularly
> > useful to the user, but it should help the developer track down the
> > bug.
>
> You have a better class of user than I do. If mine noticed the assertion
> message at all, they'd report it as "Assertion failure... erm...
> [nonsensical-and-certainly-incorrect-name].c... oh, and there were some
> numbers. No, sorry, I don't know what the numbers were."

And that in itself is better than "it doesn't work! fix it!"

(I had a user helpfully take a screen shot so he could *show* it
wasn't working, and in the interests of making the email smaller he
cropped the screen shot just enough to eliminate all useful
information. I think they work at it, sometimes.)

Charlton

Kenneth Brody

Aug 30, 2006, 1:00:59 PM

Sounds like you have the same clients I've had.

I think I've mentioned here (or maybe it was in a.t-s.r?) about the
person who wanted to know "what does this error mean?" I wish he
had done something "in the interests of making the email smaller",
because he ended up taking a full-screen screen capture (at 1024 by
768 by 24 bits), pasting it (twice!) into an MS-Word document,
saving it in RTF format, and attaching it to an e-mail with BASE64
encoding. The result was a 13MB e-mail, asking what the 100-ish
character message meant, and I still had 28.8K dialup at the time.

As for the "sorry, I don't know what the numbers were", we had
someone posting on a mailing list about the "granular errors" he
would keep getting with our program. Finally, after numerous posts
about "granular errors", without any further description, I finally
asked him flat out "what the heck are 'granular errors'?", and he
finally sent a (text mode -- a whole 2K worth) screen capture. Way
back then, we were using a program called DOS/4GW for running in
32-bit protected mode under MS-DOS, and its equivalent of a *nix
"SEGV" was a large dump of info, including the words "byte granular"
or "page granular" next to the segment registers. So, from half a
screen full of crash information, he picked that word, buried deep
in the middle of the dump, for his description of what happened.

Frederick Gotham

Aug 30, 2006, 1:46:02 PM

Keith Thompson posted:

>> If this were acceptable, I'd have "Func" throw an exception.
>
> What do you mean by the words "throw" and "exception"? (Please check
> the "Newsgroups:" header before answering that.)

Sorry, I make that mistake every once in a while (especially when I have both
newsgroups open in different windows). You should see the abuse I get when I
post "printf" on the other newsgroup ha!

--

Frederick Gotham

CBFalconer

Aug 30, 2006, 4:29:02 PM

Why? That is perfectly standard, as long as you #include <stdio>.

Clark S. Cox III

Aug 30, 2006, 6:02:47 PM

CBFalconer wrote:
> Frederick Gotham wrote:
>> Keith Thompson posted:
>>
>>>> If this were acceptable, I'd have "Func" throw an exception.
>>> What do you mean by the words "throw" and "exception"? (Please
>>> check the "Newsgroups:" header before answering that.)
>> Sorry, I make that mistake every once in a while (especially when
>> I have both newsgroups open in different windows). You should see
>> the abuse I get when I post "printf" on the other newsgroup ha!
>
> Why? That is perfectly standard, as long as you #include <stdio>.

[OT]ITYM <cstdio> [/OT]

--
Clark S. Cox III
clar...@gmail.com

Frederick Gotham

Aug 30, 2006, 6:33:19 PM

CBFalconer posted:

>> Sorry, I make that mistake every once in a while (especially when
>> I have both newsgroups open in different windows). You should see
>> the abuse I get when I post "printf" on the other newsgroup ha!
>
> Why? That is perfectly standard, as long as you #include <stdio>.

Most of the programmers over on comp.lang.c++ are incompetent. They claim
that pointers, null-terminated strings, and arrays are dangerous, and that
things like "vector" and "string" should be used instead. They also claim
that variadic functions are dangerous and should never be used.

It seems the more advanced features you add to a language, the larger the
group of people who will bitch and moan that the "less advanced" features are
dangerous. I just put it down to incompetence.

--

Frederick Gotham

Ian Collins

Aug 30, 2006, 7:20:09 PM

<OT>while others put it down to experience</OT>

--
Ian Collins.

CBFalconer

Aug 30, 2006, 8:47:42 PM

Frederick Gotham wrote:
> CBFalconer posted:
>
>>> Sorry, I make that mistake every once in a while (especially when
>>> I have both newsgroups open in different windows). You should see
>>> the abuse I get when I post "printf" on the other newsgroup ha!
>>
>> Why? That is perfectly standard, as long as you #include <stdio>.
>
> Most of the programmers over on comp.lang.c++ are incompetent. They
> claim that pointers, null-terminated strings, and arrays are
> dangerous, and that things like "vector" and "string" should be used
> instead. They also claim that variadic functions are dangerous and
> should never be used.

As far as variadic functions are concerned, they are right. There
are C alternatives, but you have to build them yourself. Pointers
and arrays have their own dangers, because of the wild
intermixing. Pascal handles them much better. Null-terminated
strings are manageable.

>
> It seems the more advanced features you add to a language, the larger
> the group of people who will bitch and moan that the "less advanced"
> features are dangerous. I just put it down to incompetence.

That may well be true.

--
Some informative links:
news:news.announce.newusers
http://www.geocities.com/nnqweb/
http://www.catb.org/~esr/faqs/smart-questions.html
http://www.caliburn.nl/topposting.html
http://www.netmeister.org/news/learn2quote.html

Clark S. Cox III

Aug 31, 2006, 7:13:05 AM

Frederick Gotham wrote:
> CBFalconer posted:
>
>>> Sorry, I make that mistake every once in a while (especially when
>>> I have both newsgroups open in different windows). You should see
>>> the abuse I get when I post "printf" on the other newsgroup ha!
>> Why? That is perfectly standard, as long as you #include <stdio>.
>
> Most of the programmers over on comp.lang.c++ are incompetent. They claim
> that pointers, null-terminated strings, and arrays are dangerous

NUL-terminated strings and malloc-allocated arrays *can* be dangerous
for beginners. This is just a matter of different languages providing
different mechanisms for doing similar things.

>, and that things like "vector" and "string" should be used instead.

In general, hiding complexity is a good thing; in either language.

> They also claim
> that variadic functions are dangerous and should never be used.

Variadic functions *are* dangerous. Avoiding them is generally a good
idea, even in C.

> It seems the more advanced features you add to a language, the larger the
> group of people who will bitch and moan that the "less advanced" features are
> dangerous. I just put it down to incompetence.

--

Bill Pursell

Aug 31, 2006, 1:16:55 PM

Clark S. Cox III wrote:
>
> Variadic functions *are* dangerous. Avoiding them is generally a good
> idea, even in C.

That's absurd. The very first program anyone learns is:

#include <stdio.h>

int
main(void)
{
    printf("Hello, world!\n");
    return 0;
}

There are certainly people who will argue that it should
be written with puts instead of printf, but printf is generally
introduced right at the start.

If you mean that people shouldn't write variadic functions
until they know how, I would agree. But simply avoiding
them out of hand would be severely restrictive.
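
For readers who have not written one, here is a minimal sketch of a user-written variadic function. The name sum_ints and the convention of passing the argument count first are assumptions made for illustration; nothing in the thread proposes them.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical example: adds up `count` int arguments.
   Nothing checks that the caller really passes `count` ints;
   the function simply trusts the first argument. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    assert(count >= 0);
    va_start(ap, count);
    for (i = 0; i < count; i++) {
        /* Each va_arg call assumes the next argument is an int. */
        total += va_arg(ap, int);
    }
    va_end(ap);
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) returns 6. Calls such as sum_ints(3, 1, 2) or sum_ints(2, 1.5, 2) also compile, but their behavior is undefined, which is the hazard the thread is arguing about.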

CBFalconer

Aug 31, 2006, 1:55:51 PM

Variadic functions are dangerous because there is no way to check
that they are called correctly, especially when the format string
for printf is a variable. Better languages ensure that all
parameters are properly typed and checked. Pascal achieves the
same effect without the vulnerabilities, by specifying a standard
abbreviation for multiple simple calls. Unfortunately the
successors to Pascal, which include Modula and Ada, do not have
this simple and reliable mechanism.
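
The point about a variable format string can be sketched as follows. The helper name copy_text is hypothetical; the idea is only that text from outside the program is passed as an argument to a literal "%s" format, never as the format itself.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies untrusted text into dst through a
   fixed format. Returns the number of characters stored, not
   counting the terminating null character. */
size_t copy_text(char *dst, size_t size, const char *text)
{
    int n;

    assert(dst != NULL && text != NULL && size > 0);
    /* The format is a literal, so any '%' in text is only data.
       The risky form would be snprintf(dst, size, text), where
       text itself is interpreted as the format. */
    n = snprintf(dst, size, "%s", text);
    if (n < 0) {
        dst[0] = '\0';
        return 0;
    }
    /* snprintf truncates to fit, so measure what was stored. */
    return strlen(dst);
}
```

With this form, input such as "100%s done" is copied verbatim. If the same input were used as the format, printf would look for a string argument that was never passed.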

Clark S. Cox III

Aug 31, 2006, 6:58:21 PM

I should have been more specific. Yes, avoiding *writing* them is
generally a good idea. Also note that I said "generally" not "always".

Dave Thompson

Sep 7, 2006, 3:13:54 AM

On Thu, 31 Aug 2006 13:55:51 -0400, CBFalconer <cbfal...@yahoo.com>
wrote:
<snip>

> Variadic functions are dangerous because there is no way to check
> that they are called correctly, especially when the format string
> for printf is a variable. Better languages ensure that all
> parameters are properly typed and checked. Pascal achieves the
> same effect without the vulnerabilities, by specifying a standard

> abbreviation for multiple simple calls. <snip>

For the builtin {read,write}{,ln} it does, but AFAIKaCT there is no
(standard) way to do this for user-written routines.
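
In Pascal, write(a, b) is shorthand for write(a) followed by write(b). A rough C counterpart of "multiple simple calls" is one typed function per argument kind, sketched below. The names out_buf, out_init, out_str and out_int are hypothetical, and the fixed 64-byte buffer is an assumption chosen to keep the example short.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size output buffer, for illustration only. */
struct out_buf {
    char text[64];
    size_t len;
};

void out_init(struct out_buf *ob)
{
    assert(ob != NULL);
    ob->len = 0;
    ob->text[0] = '\0';
}

/* Appends a string; silently truncates when the buffer is full. */
void out_str(struct out_buf *ob, const char *s)
{
    size_t room;
    size_t n;

    assert(ob != NULL && s != NULL);
    room = sizeof ob->text - 1 - ob->len;
    n = strlen(s);
    if (n > room) {
        n = room;
    }
    memcpy(ob->text + ob->len, s, n);
    ob->len += n;
    ob->text[ob->len] = '\0';
}

/* Appends a decimal int, formatted with a literal format string. */
void out_int(struct out_buf *ob, int value)
{
    char digits[32];

    sprintf(digits, "%d", value);
    out_str(ob, digits);
}
```

Calling out_str(&ob, "n = ") and then out_int(&ob, 42) builds "n = 42". Each call has an ordinary prototype, so passing a string to out_int draws a diagnostic from the compiler instead of failing at run time.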

- David.Thompson1 at worldnet.att.net

CBFalconer

Sep 7, 2006, 8:44:15 AM

Dave Thompson wrote:
> CBFalconer <cbfal...@yahoo.com> wrote:
> <snip>
>> Variadic functions are dangerous because there is no way to check
>> that they are called correctly, especially when the format string
>> for printf is a variable. Better languages ensure that all
>> parameters are properly typed and checked. Pascal achieves the
>> same effect without the vulnerabilities, by specifying a standard
>> abbreviation for multiple simple calls. <snip>
>
> For the builtin {read,write}{,ln} it does, but AFAIKaCT there is no
> (standard) way to do this for user-written routines.

Thus avoiding the insecurities of variadic functions. How often do
you really need a user-written variadic function?
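
One common way to do without a user-written variadic function, when all the arguments share a type, is to pass an array and an explicit count. This is a minimal sketch; the name sum_array is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: adds up `count` ints from an array.
   The parameter types are fixed by the prototype, so the
   compiler can check every call. */
int sum_array(const int *values, size_t count)
{
    int total = 0;
    size_t i;

    assert(values != NULL || count == 0);
    for (i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}
```

Given int v[] = {1, 2, 3}, the call sum_array(v, sizeof v / sizeof v[0]) returns 6. The caller can still pass a wrong count, but the element type can no longer be mismatched.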

--
Chuck F (cbfalconer at maineline dot net)