
Reading a string of unknown size


Tonio Cartonio

Nov 25, 2006, 2:33:20 PM
I have to read characters from stdin and save them in a string. The
problem is that I don't know how many characters will be read.

Francesco
--
-------------------------------------

http://www.riscossione.info/

Richard Heathfield

Nov 25, 2006, 2:42:28 PM
Tonio Cartonio said:

> I have to read characters from stdin and save them in a string. The
> problem is that I don't know how much characters will be read.

http://www.cpax.org.uk/prg/writings/fgetdata.php
http://cbfalconer.home.att.net/download/ggets.zip
http://www.iedu.com/mrd/c/getsm.c
http://storm.freeshell.org/anysize.c

Take your pick.

(Coming soon - Eric Sosman's equivalent, if he can remember to attach it to
the email this time...)

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at the above domain, - www.

santosh

Nov 25, 2006, 2:42:32 PM
Tonio Cartonio wrote:
> I have to read characters from stdin and save them in a string. The
> problem is that I don't know how much characters will be read.

You'll have to read the input either a character at a time (using
getc() or fgetc()) or a line at a time (using fgets()), and store it
in a block of memory that you dynamically resize as you read more
input.

It's fairly easy to do this yourself, but if you want an existing
implementation, try CBFalconer's ggets() function. Search the group's
archive; the URL is mentioned quite regularly.
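
As a rough, untested sketch of the line-at-a-time variant described
above (the doubling policy and the name readline() are illustrative
choices only, not any standard routine):

/* readline.c - grow a buffer with fgets() until a full line is read */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *readline(FILE *fp)
{
    size_t cap = 128, len = 0;
    char *buf = malloc(cap), *tmp;

    if (buf == NULL)
        return NULL;
    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {   /* got a whole line */
            buf[len - 1] = '\0';
            return buf;
        }
        tmp = realloc(buf, cap *= 2);            /* line longer than buffer */
        if (tmp == NULL) {
            free(buf);
            return NULL;
        }
        buf = tmp;
    }
    if (len > 0)
        return buf;          /* last line had no newline */
    free(buf);
    return NULL;             /* EOF or read error before any data */
}
/* end readline.c */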

Santosh

Nov 27, 2006, 4:31:41 AM
> I have to read characters from stdin and save them in a string. The
> problem is that I don't know how much characters will be read.

int main()
{
    char *str = NULL, ch ;
    int i = 0 ;
    str = (char*) malloc (2*sizeof(char)) ;
    *str = '\0' ;

    while( (ch=getchar()) != '\n' )
    {
        *(str+i) = ch ;
        i++ ;
        str = (char*) realloc(str, (2*sizeof(char)) + i ) ;
    }
    *(str+i) = '\0' ;

    printf("\n\n %s ", str) ;

    getch() ;
    return 0;
}


:)


--
Regards
Santosh S Nayak
E-Mail - santos...@gmail.com
WebPage -- http://santoshsnayak.googlepages.com

Richard Heathfield

Nov 27, 2006, 4:55:28 AM
Santosh said:

>> I have to read characters from stdin and save them in a string. The
>> problem is that I don't know how much characters will be read.
>
> int main()
> {
> char *str = NULL, ch ;
> int i = 0 ;
> str = (char*) malloc (2*sizeof(char)) ;

Here is your first bug.

> *str = '\0' ;

Here's the second.

>
> while( (ch=getchar()) != '\n' )

Here's the third.

> {
> *(str+i) = ch ;
> i++ ;
> str = (char*) realloc(str, (2*sizeof(char)) + i ) ;

Here's the fourth and fifth, at least.

> }
> *(str+i) = '\0' ;
>
> printf("\n\n %s ", str) ;

Here's your sixth.

>
> getch() ;

And your seventh.

> return 0;
> }
>
>
> :)

In fourteen lines of code (excluding spaces and braces), you managed at
least seven bugs. What are you smiling about?

Santosh

Nov 27, 2006, 5:43:43 AM
> > :)
>
> In fourteen lines of code (excluding spaces and braces), you managed at
> least seven bugs. What are you smiling about?

The program works correctly.

No offence pal, but make sure you are at least 10% right before you
reply to any post.

Richard Heathfield

Nov 27, 2006, 6:00:41 AM
Santosh said:

>> > :)
>>
>> In fourteen lines of code (excluding spaces and braces), you managed at
>> least seven bugs. What are you smiling about?
>
> The program works correctly.

Really? Let's explore that, shall we?

foo.c:2: warning: function declaration isn't a prototype
foo.c: In function `main':
foo.c:3: `NULL' undeclared (first use in this function)
foo.c:3: (Each undeclared identifier is reported only once
foo.c:3: for each function it appears in.)
foo.c:5: warning: implicit declaration of function `malloc'
foo.c:5: warning: cast does not match function type
foo.c:8: warning: implicit declaration of function `getchar'
foo.c:12: warning: implicit declaration of function `realloc'
foo.c:12: warning: cast does not match function type
foo.c:16: warning: implicit declaration of function `printf'
foo.c:18: warning: implicit declaration of function `getch'
make: *** [foo.o] Error 1

Oh, look - it doesn't even compile.

> No offence pal, but make sure you are at least 10% right before you
> reply to any post.

I was 100% right that your program was a good 50% wrong (in terms of bugs
per line). And it doesn't compile on my system. If it compiles on yours
without generating at least one diagnostic message, then your compiler is
broken.

Furthermore, the OP did not indicate his platform (nor was there any need
for him to do that), let alone his implementation, so you can't just claim
"it works on *my* system", since there is no indication whatsoever that the
OP's system is the same as your system.

santosh

Nov 27, 2006, 6:08:46 AM
Santosh wrote:
> > I have to read characters from stdin and save them in a string. The
> > problem is that I don't know how much characters will be read.
>

First include necessary headers: stdio.h, stdlib.h

> int main()

Better yet, replace above with int main(void)

> {
> char *str = NULL, ch ;
> int i = 0 ;
> str = (char*) malloc (2*sizeof(char)) ;

Don't cast the return value of malloc() in C. It can hide the non-inclusion
of its prototype, (by way of failure to include stdlib.h), and, on
some implementations, can result in nasty crashes during runtime.

Since sizeof(char) is by definition 1, you can omit that and instead do
'2 * sizeof *str'. This has the advantage of becoming automatically
updated when you later happen to change the type of *str.

> *str = '\0' ;

And now you're possibly writing to a random area of memory, since you
failed to check the return value of malloc() above for failure.

> while( (ch=getchar()) != '\n' )

Check for EOF, not newline. Moreover, getchar() returns an int value
which you're storing in a char variable, so the EOF value may not be
detected reliably when end-of-file is encountered.

> {
> *(str+i) = ch ;

You've overwritten your earlier nul character.

> i++ ;
> str = (char*) realloc(str, (2*sizeof(char)) + i ) ;

Again, _don't_ cast the return value of XXalloc() functions in C, and
check the call for failure before proceeding further. Also change
sizeof(char) to sizeof *str.

Anyway, your allocation strategy is very inefficient. You're calling
realloc() once every iteration of the loop. This could result in
fragmentation of the C library's memory pool. Why not allocate in terms
of fixed-size or dynamically growing blocks, say 128 bytes or so to
start with?

> }
> *(str+i) = '\0' ;
>
> printf("\n\n %s ", str) ;

Unless you terminate the output with a newline character, it's not
guaranteed to show up on screen, or wherever stdout happens to be
directed to.

> getch() ;

Non-standard, unportable and unnecessary function. Just get rid of it.
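
Putting those fixes together, one possible corrected version might look
roughly like this (an untested sketch; the initial size of 16 and the
doubling policy are arbitrary illustrative choices):

/* fixed.c - read one line of unknown length from stdin */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    size_t cap = 16, len = 0;
    char *str = malloc(cap), *tmp;
    int ch;

    if (str == NULL)
        return EXIT_FAILURE;

    while ((ch = getchar()) != EOF && ch != '\n') {
        if (len + 1 >= cap) {                 /* keep room for the '\0' */
            tmp = realloc(str, cap *= 2);
            if (tmp == NULL) {
                free(str);
                return EXIT_FAILURE;
            }
            str = tmp;
        }
        str[len++] = (char)ch;
    }
    str[len] = '\0';

    printf("%s\n", str);
    free(str);
    return 0;
}
/* end fixed.c */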

Santosh

Nov 27, 2006, 6:16:50 AM
> per line). And it doesn't compile on my system. If it compiles on yours
> without generating at least one diagnostic message, then your compiler is
> broken.
>
> Furthermore, the OP did not indicate his platform (nor was there any need
> for him to do that), let alone his implementation, so you can't just claim
> "it works on *my* system", since there is no indication whatsoever that the
> OP's system is the same as your system.

Do not get too excited and offensive; you forgot to include the
following files:
#include <stdio.h>
#include <conio.h>

If you are using a Unix system:
#include <stdio.h>
#include <sys/types.h> OR #include <system.h>

Richard Heathfield

Nov 27, 2006, 6:22:53 AM
Santosh said:

>> per line). And it doesn't compile on my system. If it compiles on yours
>> without generating at least one diagnostic message, then your compiler is
>> broken.
>>
>> Furthermore, the OP did not indicate his platform (nor was there any need
>> for him to do that), let alone his implementation, so you can't just
>> claim "it works on *my* system", since there is no indication whatsoever
>> that the OP's system is the same as your system.
>
> Do not get too excited and offensive,

I am neither excited nor offensive.

> you forgot to include the
> following files:
> #include <stdio.h>
> #include <conio.h>

No, I didn't forget: you did.

And C offers no header named <conio.h>

> If you are using a Unix system:
> #include <stdio.h>
> #include <sys/types.h> OR #include <system.h>

Wrong. That wouldn't help your program compile.

Furthermore, even with the inclusion of those headers, your program is still
broken in several important ways.

I suggest you stop defending and start thinking.

The Other Santosh has given a good analysis of some of the problems with
your program. If you don't understand that analysis, start asking
intelligent questions instead of acting all defensively.

santosh

Nov 27, 2006, 6:21:34 AM

Huh!?

It fails to even compile on my Linux system.

After culling the proprietary getch() call, it compiles with several
warnings and runs successfully by sheer luck, making several unwarranted
assumptions which, though they may hold now, are prone to disastrous
failure at any time, especially if the target machine is anything other
than a standard 32-bit PC.

Looks like you should apply your last statement above to yourself,
before rushing to dismiss others, who may well be more knowledgeable
about C than you.

Santosh

Nov 27, 2006, 6:26:58 AM
> Anyway, your allocation strategy is very inefficient. Your calling
> realloc() once every iteration of the loop. This could result in
> fragmentation of the C library's memory pool. Why not allocate in terms
> of fixed sized or dynamically growing blocks, say 128 bytes or so to
> start with?

Very True, I agree with you.
I just wanted to give "Tonio Cartonio" the idea of how to do it.

CBFalconer

Nov 27, 2006, 6:16:20 AM
Santosh wrote: (and failed to maintain attributions)

>
>> I have to read characters from stdin and save them in a string. The
>> problem is that I don't know how much characters will be read.

Piggy backing. OP's post not available.

Just download and use ggets. Available at:

<http://cbfalconer.home.att.net/download/>

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>

Richard Heathfield

Nov 27, 2006, 6:48:21 AM
CBFalconer said:

> Santosh wrote: (and failed to maintain attributions)
>>
>>> I have to read characters from stdin and save them in a string. The
>>> problem is that I don't know how much characters will be read.
>
> Piggy backing. OP's post not available.
>
> Just download and use ggets. Available at:
>
> <http://cbfalconer.home.att.net/download/>

I already plugged it, in this very thread. (Along with a bunch of
alternatives.)

Richard Heathfield

Nov 27, 2006, 6:49:12 AM
Santosh said:

What you actually gave him was a very good demonstration of how not to do
it.

Spiros Bousbouras

Nov 27, 2006, 3:53:30 PM
Is Santosh a common name? Or was Santosh
trying to impersonate santosh? Personally
I didn't realize that Santosh was different from
santosh until santosh's first post, and in fact I
was surprised at the way he was conducting himself.
(Before I realized they were different people, that is.)

Richard Heathfield

Nov 27, 2006, 5:50:00 PM
Spiros Bousbouras said:

> Is Santosh a common name ?

I couldn't say, but I once worked with someone of that name, a few years
ago. I doubt whether it's particularly rare.

<snip>

Keith Thompson

Nov 27, 2006, 8:38:51 PM
Richard Heathfield <r...@see.sig.invalid> writes:
> Spiros Bousbouras said:
>> Is Santosh a common name ?
>
> I couldn't say, but I once worked with someone of that name, a few years
> ago. I doubt whether it's particularly rare.

It seems to be fairly common, judging by the results of a Google
search.

I suggest that one or both of the [Ss]antosh's currently posting here
consider using their full name to avoid confusion. (Neither of them,
of course, is obligated to follow my advice.)

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Santosh Nayak

Nov 27, 2006, 11:15:10 PM
> I suggest that one or both of the [Ss]antosh's currently posting here
> consider using their full name to avoid confusion. (Neither of them,
> of course, is obligated to follow my advice.)

Sounds Logical !


--
Santosh Nayak

Sundar

Nov 27, 2006, 11:46:32 PM


Can u please further explain the following statement....

"Don't cast return value of malloc() in C. It can hide the
non-inclusion
of it's prototype, (by way of failure to include stdlib.h), and, on
some implementations, can result in nasty crashes during runtime."

I am unable to understand the intricacies of the above statement.

james of tucson

Nov 28, 2006, 12:01:09 AM
Sundar wrote:

> I am unable to understand the intricacies of the above statement.

Then understand a couple of simpler statements:

1. Do not cast the return value of malloc().

2. If you use malloc() and/or free(), be sure to #include stdlib.h


No intricacies needed :-)

Sundar

Nov 28, 2006, 1:33:39 AM

I mean that the definition of malloc says that it returns a void
pointer to the allocated space. Why should u not typecast ur return
void pointer into ur required char or int pointer? And i think that you
dont understand the term "intricacies" else i would not have had to
post my query again.

Richard Heathfield

Nov 28, 2006, 1:41:35 AM
Sundar said:

<snip>

> [...] the definition of malloc says that it returns a void
> pointer to the allocated space. Why should u not typecast ur return
> void pointer into ur required char or int pointer?

(a) Why would you need to cast? All code should either Do Something Good or
Stop Something Bad Happening. An automatic conversion is supplied, and
that's guaranteed by the Standard, so a cast adds neither value nor
protection.
(b) Without the cast, if you omit <stdlib.h> the compiler is required to
issue a diagnostic message for the mismatch between int and pointer, but
the cast removes this requirement (without fixing the bug that the
diagnostic message is reporting to you), thus giving you a silent bug
instead of a noisy bug.

For a fuller answer, see http://www.cpax.org.uk/prg/writings/casting.php
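
A tiny sketch of point (b), assuming a C89 compiler (the type and the
size are arbitrary choices for illustration):

/* castdemo.c */
#include <stdlib.h>   /* comment this line out to see the difference */

int main(void)
{
    double *p;

    /* No cast: if <stdlib.h> is missing, malloc is implicitly declared
       as returning int, and assigning that int to a pointer is a
       constraint violation, so the compiler must diagnose it. */
    p = malloc(100 * sizeof *p);

    /* With a cast, e.g. p = (double *) malloc(100 * sizeof *p); the
       same mistake can compile quietly on many C89 compilers, and the
       bug is hidden instead of reported. */

    free(p);
    return 0;
}
/* end castdemo.c */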

Keith Thompson

Nov 28, 2006, 2:04:27 AM
"Santosh Nayak" <santos...@gmail.com> writes:
>> I suggest that one or both of the [Ss]antosh's currently posting here
>> consider using their full name to avoid confusion. (Neither of them,
>> of course, is obligated to follow my advice.)
>
> Sounds Logical !

Thanks. If I have this straight, you've been signing yourself as
"Santosh", and you started posting here recently; the other Santosh
has been signing himself as "santosh" and has been posting here for
some time. Is that right?

santosh

Nov 28, 2006, 2:48:38 AM
Spiros Bousbouras wrote:
> Is Santosh a common name ?

Depends. It's more common in N.India than the south. But it isn't a
particularly common name.

For a split second before I noticed the capitalisation, I thought I was
looking at one of my previous posts. :)

Santosh Nayak

Nov 28, 2006, 5:13:43 AM
> If I have this straight, you've been signing yourself as
> "Santosh", and you started posting here recently; the other Santosh
> has been signing himself as "santosh" and has been posting here for
> some time. Is that right?

Yes, that is right. Thanks for the suggestion.

--
Santosh Nayak
E-Mail - santoshsna...@gmail.com
WebPage -- http://santoshsnayak.googlepages.com

Santosh Nayak

Nov 28, 2006, 5:21:02 AM
> > Is Santosh a common name ?
> Depends. It's more common in N.India than the south. But it isn't a
> particularly common name.

Santosh is also a common name in South India (where I belong).

santosh

Nov 28, 2006, 5:32:57 AM
Santosh Nayak wrote:
> > > Is Santosh a common name ?
> > Depends. It's more common in N.India than the south. But it isn't a
> > particularly common name.
>
> Santosh is also a common name in South Indian (where i belong).

If you say so, though that's contrary to my experience. Anyway let's
stop this discussion, it's getting way off-topic.

webs...@gmail.com

Nov 28, 2006, 6:58:22 AM
santosh wrote:
> Santosh wrote:
> > > I have to read characters from stdin and save them in a string. The
> > > problem is that I don't know how much characters will be read.
> First include necessary headers: stdio.h, stdlib.h
>
> > int main()
>
> Better yet, replace above with int main(void)
>
> > {
> > char *str = NULL, ch ;
> > int i = 0 ;
> > str = (char*) malloc (2*sizeof(char)) ;
>
> Don't cast return value of malloc() in C.

This is not a bug. Note that without some sort of cast, there is no
type checking, which is the biggest risk when dealing with void *
pointers.

> [...] It can hide the non-inclusion of it's prototype,

On *SOME* older generation compilers. No modern compiler fails to give
a warning about this regardless of the cast.

> [...] (by way of failure to include stdlib.h), and, on
> some implementations, can result in nasty crashes during runtime.
>
> Since sizeof(char) is by definition 1, you can omit that and instead do
> '2 * sizeof *str'. This has the advantage of becoming automatically
> updated when you later on happen to change the type of *str.

If you want automatic type safety you should do this:

#define safeMallocStr(p,n,type) do { (p) = (type *) malloc ((n)*sizeof (type)); } while (0);

and you get type checking, and correct semantics. So if you change the
type of the variable you are using, your compiler will issue warnings
if you mismatch here. Any variation you do in which you omit the cast
outside of malloc will fail to catch this "change the definition of the
pointer" scenario.

> > i++ ;
> > str = (char*) realloc(str, (2*sizeof(char)) + i ) ;
>

> [...]

> Anyway, your allocation strategy is very inefficient. Your calling
> realloc() once every iteration of the loop. This could result in
> fragmentation of the C library's memory pool. Why not allocate in terms
> of fixed sized

Because your memory pool will *still* fragment. It is also O(n^2)
rather than O(n) because of the implicit copying performed in
realloc().

> [...] or dynamically growing blocks, say 128 bytes or so to
> start with?

You have to do this in exponential steps (doubling is the most
obvious way to do this); otherwise, weak malloc strategies will
inevitably be exposed. It's also going to be very slow for large
inputs (it really *IS* O(n^2)). So are you going to trade a buffer
overflow in just to be handed back a denial of service?

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

webs...@gmail.com

Nov 28, 2006, 7:26:45 AM
Sundar wrote:
> Can u please further explain the following statement....
>
> "Don't cast return value of malloc() in C. It can hide the
> non-inclusion
> of it's prototype, (by way of failure to include stdlib.h), and, on
> some implementations, can result in nasty crashes during runtime."

On really old compilers, if you forget to explicitly declare the
prototype of a function before you use it, the compiler can assume that
it has a prototype like:

extern int malloc ( /* this is not void; its empty. */ );

But what it is supposed to be is:

void * malloc (size_t);

The compiler may emit different callsite code based on these. This is
usually an issue if void * and int are of different sizes, or target
different machine registers or whatever. The reasoning behind leaving
off the cast is that even old compilers will crap out if you try to
assign an int to a pointer. I.e., they are using a completely
different error to assist you on what is really another kind of error
altogether.

This "automatic prototype" thing has been deprecated in the latest ANSI
C standards and there is basically no moden compiler in existence which
will not issue at least a warning as soon as it detects this scenario.
I personally always compile with "warning equals errors", and warnings
set to max or near max (of course the vendors own header files rarely
compile at the highest warning level -- *sigh*). So its not a real
issue for me, and it shouldn't be an issue for anyone who pays
attention to compiler warnings.

The cast allows for much more serious type checking. If you are using
the pattern:

<var> = (<type> *) malloc (<count> * sizeof (<type>));

and find a way of enforcing that the exact type is repeated precisely
(i.e., use a macro) then this usually works out better. If you change
<type> or the type of <var> the compiler will issue a warning unless
they are synchronized -- if you leave off this cast, you get no
assistance from the compiler, and you just have to get it correct.

So the whole idea of leaving off the cast is to help you because you
might make the mistake of omitting a header (a non-issue on modern
compilers which will give you this warning without this mechanism) but
it sacrifices more detailed type checking like making sure pointers
have the right base type which, for some reason, is not considered (by
some) as relevant a kind of mistake.

The whole "omit the cast" thing is a meme that exists amongst certain
people who have a very skewed idea of what it means to be C programmer.
If they are honest, they think that *development* of your code (not a
port-facto port) may, at any time, be forced onto some old C compiler
where their imagined scenario is an actual issue. More likely they
want to create an intention gratuitous incompatibility with C++ (where
the cast is mandatory.)

Richard Heathfield

Nov 28, 2006, 7:55:08 AM
webs...@gmail.com said:

> santosh wrote:
>> Santosh wrote:
>> > > I have to read characters from stdin and save them in a string. The
>> > > problem is that I don't know how much characters will be read.
>> First include necessary headers: stdio.h, stdlib.h
>>
>> > int main()
>>
>> Better yet, replace above with int main(void)
>>
>> > {
>> > char *str = NULL, ch ;
>> > int i = 0 ;
>> > str = (char*) malloc (2*sizeof(char)) ;
>>
>> Don't cast return value of malloc() in C.
>
> This is not a bug.

It merely hides one.

> Note that without some sort of cast, there is no
> type checking, which is the biggest risk when dealing with void *
> pointers.

No, there is an even bigger risk - cargo cult programming, which is what
most people are doing when they cast malloc.

>> [...] It can hide the non-inclusion of it's prototype,
>
> On *SOME* older generation compilers. No modern compiler fails to give
> a warning about this regardless of the cast.

So if anyone comes up with a counter-example, you can simply claim that it's
not a "modern" compiler. ("True Scotsman" argument.) Furthermore, do not
forget that some organisations are remarkably conservative, and will not
change software that they know to work - especially if that software is
mission-critical, as compilers easily can be.

>> [...] (by way of failure to include stdlib.h), and, on
>> some implementations, can result in nasty crashes during runtime.
>>
>> Since sizeof(char) is by definition 1, you can omit that and instead do
>> '2 * sizeof *str'. This has the advantage of becoming automatically
>> updated when you later on happen to change the type of *str.
>
> If you want automatic type safety you should do this:
>
> #define safeMallocStr(p,n,type) do { (p) = (type *) malloc ((n)*sizeof (type)); } while (0);

That doesn't look very type-safe to me.

void *p;
safeMallocStr(p, n, void); /* requires a diagnostic */

void *q;
safeMallocStr(q, n, char);
int *r = q; /* so much for type safety */

> and you get type checking, and correct semantics. So if you change the
> type of the variable you are using, your compiler will issue warnings
> if you mismatch here.

Why not just remove the mismatch risk completely?

> Any variation you do in which you omit the cast
> outside of malloc will fail to catch this "change the definition of the
> pointer" scenario.

Wrong.

T *p;

p = malloc(n * sizeof *p);

Now change p's type to U *. The malloc is still correct, and does not need
an extra, potentially error-prone, edit to a spurious macro call.

CBFalconer

Nov 28, 2006, 7:41:21 AM
Keith Thompson wrote:
> "Santosh Nayak" <santos...@gmail.com> writes:
>
>>> I suggest that one or both of the [Ss]antosh's currently posting
>>> here consider using their full name to avoid confusion. (Neither
>>> of them, of course, is obligated to follow my advice.)
>>
>> Sounds Logical !
>
> Thanks. If I have this straight, you've been signing yourself as
> "Santosh", and you started posting here recently; the other Santosh
> has been signing himself as "santosh" and has been posting here for
> some time. Is that right?

I think it's the other way around. The 'good' santosh has been
using lower case s.

CBFalconer

Nov 28, 2006, 7:43:31 AM
Sundar wrote:
>
... snip ...

>
> Can u please further explain the following statement....
>
> "Don't cast return value of malloc() in C. It can hide the
> non-inclusion of it's prototype, (by way of failure to include
> stdlib.h), and, on some implementations, can result in nasty
> crashes during runtime."
>
> I am unable to understand the intricacies of the above statement.

It's in the FAQ. Don't use silly abbreviations such as 'u'.

CBFalconer

Nov 28, 2006, 10:55:28 AM
webs...@gmail.com wrote:
>
... snip ...

>
> The cast allows for much more serious type checking. If you are
> using the pattern:
>
> <var> = (<type> *) malloc (<count> * sizeof (<type>));
>
> and find a way of enforcing that the exact type is repeated
> precisely (i.e., use a macro) then this usually works out better.
> If you change <type> or the type of <var> the compiler will issue
> a warning unless they are synchronized -- if you leave off this
> cast, you get no assistance from the compiler, and you just have
> to get it correct.

If you use the recommended:

<var> = malloc(<count> * sizeof *<var>);

you need no casts, and the exact type is enforced without any
concealment behind obfuscating macros or whatever.
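
A concrete instance of that form, with an arbitrary struct type chosen
purely for illustration:

/* idiom.c - the sizeof *var allocation idiom */
#include <stdlib.h>

struct node {
    struct node *next;
    int value;
};

int main(void)
{
    struct node *list;
    size_t count = 64;

    list = malloc(count * sizeof *list);   /* no cast, no type repeated */
    if (list == NULL)
        return EXIT_FAILURE;

    /* If 'list' later becomes a pointer to some other type, this
       allocation line is still correct without being edited. */

    free(list);
    return 0;
}
/* end idiom.c */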

santosh

Nov 28, 2006, 12:40:32 PM
webs...@gmail.com wrote:
> santosh wrote:
> > Santosh wrote:
> > > > I have to read characters from stdin and save them in a string. The
> > > > problem is that I don't know how much characters will be read.
<snip>

> > > i++ ;
> > > str = (char*) realloc(str, (2*sizeof(char)) + i ) ;
> >
> > [...]
>
> > Anyway, your allocation strategy is very inefficient. Your calling
> > realloc() once every iteration of the loop. This could result in
> > fragmentation of the C library's memory pool. Why not allocate in terms
> > of fixed sized
>
> Because your memory pool will *still* fragment. It is also O(n^2)
> rather than O(n) because of the implicit copying performed in
> realloc().

The implicit copying will be performed in both cases, if realloc() runs
out of contiguous space. In addition calling realloc() for every byte
read adds significant avoidable overhead. Note the test below:

/* t1.c - realloc() once every byte */
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    unsigned char *buf = NULL, *stash = NULL;
    size_t read = 0, bsize = 0;
    int ch;

    while((ch = fgetc(stdin)) != EOF) {
        if(read == bsize) {
            if((buf = realloc(buf, ++bsize)) == NULL) {
                if(stash) free(stash);
                return EXIT_FAILURE;
            }
            else stash = buf;
        }
        buf[read++] = ch;
    }
    free(stash);
    return EXIT_SUCCESS;
}
/* end t1.c */

/* t2.c - buffer doubles upon each reallocation */
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    unsigned char *buf = NULL, *stash = NULL;
    size_t read = 1, bsize = 1;
    int ch;

    while((ch = fgetc(stdin)) != EOF) {
        if(read == bsize) {
            if((buf = realloc(buf, bsize <<= 1)) == NULL) {
                free(stash);
                return EXIT_FAILURE;
            }
            else stash = buf;
        }
        buf[read - 1] = ch;
        read++;
    }
    free(stash);
    return EXIT_SUCCESS;
}
/* end t2.c */

$ gcc -Wall -Wextra -ansi -pedantic -o t1 t1.c
$ gcc -Wall -Wextra -ansi -pedantic -o t2 t2.c
$ du -b t1
7315 t1
$ du -b t2
7299 t2
$ du -sh /cdrom/boot/isolinux/linux
1.7M /cdrom/boot/isolinux/linux
$ time ./t1 < /cdrom/boot/isolinux/linux
real 0m1.483s
user 0m0.600s
sys 0m0.880s
$ time ./t2 < /cdrom/boot/isolinux/linux
real 0m0.131s
user 0m0.104s
sys 0m0.028s
$

Well, apparently, the exponential allocation scheme is significantly
faster than the linear one.

> > [...] or dynamically growing blocks, say 128 bytes or so to
> > start with?
>
> You have to do thing in exponential steps (doublings is the most
> obvious was to do this) otherwise, weak malloc strategies will
> inevitably be explosed. Its also going to be very slow for large
> inputs (it really *IS* O(n^2)).

I'm sorry but I don't quite understand how it's going to be any _slower_
than calling realloc() for every byte. I guess for large files, (or any
file for that matter), the disk overhead is going to swamp that of
realloc(), whether or not it's called every byte, since they won't be
buffered by the operating system.

> So are you going to trade a buffer
> overflow in just to be handed back a denial of service?

You can possibly recover from a denial of service situation but a
buffer overflow could mean a hard crash.

webs...@gmail.com

Nov 28, 2006, 2:42:44 PM
santosh wrote:
> webs...@gmail.com wrote:
> > santosh wrote:
> > > Santosh wrote:
> > > > > I have to read characters from stdin and save them in a string. The
> > > > > problem is that I don't know how much characters will be read.

There is a reason for this. Let's do the math. The exponential
realloc strategy means you will call realloc() log_2(n) times, each
with a cost of 1, 2, 4, ..., O(N) in the event of a required memcpy().
Sum that up and you will see that it's O(N). The constant factor is
basically 2 writes per byte.

If you do it linearly, k bytes at a time, then you will do O(N/k)
reallocs, with an average potential memcpy penalty of N/2. So that's a
cost of O(N*N/(2*k)), which is O(N^2).

Changing from "every byte" to some linear block of length k is
basically trying to decrease the constant factor outside of the O(N^2).
Of course that will work for small N, but that's pointless and
unnecessary, and ineffective for large N.
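
As a back-of-the-envelope check of that arithmetic, here is an
illustrative tally (n and k are arbitrary, and it assumes the worst
case where every realloc() has to move the block):

/* tally.c - bytes copied under the two growth strategies */
#include <stdio.h>

int main(void)
{
    unsigned long n = 1000000UL;   /* final buffer size */
    unsigned long k = 128UL;       /* fixed-step increment */
    unsigned long copied, cap;

    /* doubling: 1 + 2 + 4 + ... < 2n bytes copied in total, i.e. O(n) */
    copied = 0;
    for (cap = 1; cap < n; cap *= 2)
        copied += cap;
    printf("doubling:   %lu bytes copied\n", copied);

    /* fixed steps of k: k + 2k + 3k + ... ~ n*n/(2k) bytes, i.e. O(n^2) */
    copied = 0;
    for (cap = k; cap < n; cap += k)
        copied += cap;
    printf("fixed step: %lu bytes copied\n", copied);

    return 0;
}
/* end tally.c */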

This analysis is critically important, since it affects what you say
next:

> > > [...] or dynamically growing blocks, say 128 bytes or so to
> > > start with?
> >
> > You have to do thing in exponential steps (doublings is the most
> > obvious was to do this) otherwise, weak malloc strategies will
> > inevitably be explosed. Its also going to be very slow for large
> > inputs (it really *IS* O(n^2)).
>
> I'm sorry but I don't quite understand how it's going to any _slower_
> than calling realloc() every one byte.

Who said this?

> [...] I guess for large files, (or any
> file for that matter), the disk overhead is going to swamp that of
> realloc(), whether or not it's called every byte, since they won't be
> buffered by the operating system.

If you are swapping to disk, then the inherent memcpy() performance hit
is still there and is significantly worsened. In fact I would suggest
that any scenario that leads to disk swapping in the linear growth
method would necessarily turn into a kind of DOS (whether intentional
or not), while a robust enough system using the exponential strategy
is likely to continue to function even if it does swap.

> > So are you going to trade a buffer
> > overflow in just to be handed back a denial of service?
>
> You can possibly recover from a denial of service situation but a
> buffer overflow could mean a hard crash.

So you are saying it's OK to suffer DOSes when there is no good reason
to do so?

webs...@gmail.com

Nov 28, 2006, 2:54:42 PM
CBFalconer wrote:
> webs...@gmail.com wrote:
> >
> ... snip ...
> >
> > The cast allows for much more serious type checking. If you are
> > using the pattern:
> >
> > <var> = (<type> *) malloc (<count> * sizeof (<type>));
> >
> > and find a way of enforcing that the exact type is repeated
> > precisely (i.e., use a macro) then this usually works out better.
> > If you change <type> or the type of <var> the compiler will issue
> > a warning unless they are synchronized -- if you leave off this
> > cast, you get no assistance from the compiler, and you just have
> > to get it correct.
>
> If you use the recommended:
>
> <var> = malloc(<count> * sizeof *<var>);
>
> you need no casts, and the exact type is enforced without any
> concealment behind obfuscating macros or whatever.

Well, this still has the potential for cut and paste errors unless you
macrofy the whole line. If you don't macrofy, then you risk error no
matter what.

So let us take a more serious approach and compare macros which prevent
any mismatch errors:

#define scaredOfCPlusPlus(var,count) var = malloc(count*sizeof *var)
#define newThing(type,count) (type *) malloc (count * sizeof (type))

So you can say var = newThing(char *, 512), and if the type is wrong,
the compiler tells you. Furthermore you can pass newThing(,) as a
parameter to a function. The scaredOfCPlusPlus(,) macro works fine,
but doesn't look familiar, and can't be passed as a parameter to a
function.

And, of course, the real difference is that the first compiles straight
in C++, and the second is just an error. I have found that in general
the C++ optimizers and warnings are better for the C++ mode of my
compilers than the C mode.

Chris Torek

Nov 28, 2006, 3:05:59 PM
In article <1164743682.7...@80g2000cwy.googlegroups.com>,

<webs...@gmail.com> wrote:
>And, of course, the real difference is that the first compiles straight
>in C++, and the second is just an error. I have found that in general
>the C++ optimizers and warnings are better for the C++ mode of my
>compilers than the C mode.

Fortran compilers are usually even better than that. Obviously
you should write your C code so that it compiles with Fortran compilers.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

Eric Sosman

Nov 28, 2006, 3:36:37 PM

webs...@gmail.com wrote On 11/28/06 14:54,:


> CBFalconer wrote:
>
>>webs...@gmail.com wrote:
>>
>>... snip ...
>>
>>>The cast allows for much more serious type checking. If you are
>>>using the pattern:
>>>
>>> <var> = (<type> *) malloc (<count> * sizeof (<type>));
>>>
>>>and find a way of enforcing that the exact type is repeated
>>>precisely (i.e., use a macro) then this usually works out better.
>>>If you change <type> or the type of <var> the compiler will issue
>>>a warning unless they are synchronized -- if you leave off this
>>>cast, you get no assistance from the compiler, and you just have
>>>to get it correct.
>>
>>If you use the recommended:
>>
>> <var> = malloc(<count> * sizeof *<var>);
>>
>>you need no casts, and the exact type is enforced without any
>>concealment behind obfuscating macros or whatever.
>
>
> Well, this still has the potential for cut and paste errors unless you
> macrofy the whole line. If you don't macrofy, then you risk error no
> matter what.

(Macros cure errors? News to me ...)

The principal advantage of the recommended form is that a
visual inspection of the line *in isolation* tells you whether
it's correct or incorrect. If the l.h.s. agrees with the sizeof
operand, the allocation is correct (unless <var> is a float or
an int or some other non-pointer thing, in which case the
compiler will squawk anyhow).

You do not need to go hunting for the declaration of <var>,
nor do you need to rummage around for the definition of some
macro and try to figure out its expansion. You can verify the
correctness of the code with a "local" inspection, without
digging through a pile of headers, hoping you've found the
right version and haven't been fooled by a morass of conditional
compilation.

> So let us take a more serious approach compare macros which prevent any
> mismatch errors:
>
> #define scaredOfCPlusPlus(var,count) var = malloc(count*sizeof *var)
> #define newThing(type,count) (type *) malloc (count * sizeof (type))
>
> So you can say var = newThing(char *, 512), and if the type is wrong,
> the compiler tells you. Furthermore you can pass newThing(,) as a
> parameter to a function. The scaredOfCPlusPlus(,) macro works fine,
> but doesn't look familliar, and can't be passed as a parameter to a
> function.

Why not? I don't see any advantage in writing such a macro,
but if you chose to do so, the expression it generated would be
perfectly good as an argument to a function.

> And, of course, the real difference is that the first compiles straight
> in C++, and the second is just an error. I have found that in general
> the C++ optimizers and warnings are better for the C++ mode of my
> compilers than the C mode.

(Have you interchanged "first" and "second" here?)

It doesn't seem to me that one's C style should be twisted
in deference to the practices of C++, or of Java, or of COBOL.
Nor vice versa, of course. Speak English or speak Spanish,
but don't lapse into Spanglish.

--
Eric....@sun.com

Keith Thompson

Nov 28, 2006, 4:20:55 PM
"Santosh Nayak" <santos...@gmail.com> writes:
>> If I have this straight, you've been signing yourself as
>> "Santosh", and you started posting here recently; the other Santosh
>> has been signing himself as "santosh" and has been posting here for
>> some time. Is that right?
>
> Yes, that is right. Thanks for the suggestion.

You're welcome.

And please don't snip attribution lines. I wrote the text above
starting with "If I have this straight". There should have been a
line at the start of the quoted text identifying the author, something
like "Keith Thompson <ks...@mib.org> writes:". The Google interface
automatically provides such a line; please don't delete it.

Keith Thompson

Nov 28, 2006, 4:26:45 PM
CBFalconer <cbfal...@yahoo.com> writes:
> Keith Thompson wrote:
>> "Santosh Nayak" <santos...@gmail.com> writes:
>>
>>>> I suggest that one or both of the [Ss]antosh's currently posting
>>>> here consider using their full name to avoid confusion. (Neither
>>>> of them, of course, is obligated to follow my advice.)
>>>
>>> Sounds Logical !
>>
>> Thanks. If I have this straight, you've been signing yourself as
>> "Santosh", and you started posting here recently; the other Santosh
>> has been signing himself as "santosh" and has been posting here for
>> some time. Is that right?
>
> I think it's the other way around. The 'good' santosh has been
> using lower case s.

That's what I said. Assuming that "the 'good' santosh" refers to
the one who's been posting here for a long time (I'm not going to
make any value judgements), he's the one who has been posting with a
lower case s; Santosh Nayak is the relative newcomer, and he's been
posting with an upper case S. (And Santosh Nayak confirmed this,
in a followup that you probably didn't see before posting yours.)

So, "santosh" (no last name, lower case s) has been posting here for a
long time, and "Santosh Nayak" is a relative newcomer, who briefly
posted as "Santosh".

Richard Tobin

Nov 28, 2006, 7:14:27 PM
In article <eki4r...@news3.newsguy.com>,
Chris Torek <nos...@torek.net> wrote:

>>And, of course, the real difference is that the first compiles straight
>>in C++, and the second is just an error. I have found that in general
>>the C++ optimizers and warnings are better for the C++ mode of my
>>compilers than the C mode.

>Fortran compilers are usually even better than that. Obviously
>you should write your C code so that it compiles with Fortran compilers.

Or use IBM's PL/1 checkout compiler. It will give you lots of
warnings, but will probably translate your code into PL/1 as a bonus.

-- Richard
--
"Consideration shall be given to the need for as many as 32 characters
in some alphabets" - X3.4, 1963.

Barry Schwarz

Nov 28, 2006, 10:08:52 PM
On 27 Nov 2006 20:46:32 -0800, "Sundar" <sunde...@gmail.com> wrote:

>
>santosh wrote:
>Can u please further explain the following statement....
>
>"Don't cast return value of malloc() in C. It can hide the
>non-inclusion
>of it's prototype, (by way of failure to include stdlib.h), and, on
>some implementations, can result in nasty crashes during runtime."
>
>I am unable to understand the intricacies of the above statement.

Under the C89 standard, an undeclared function is presumed to return
an int. It is entirely possible for an int to be returned by a
function using a different method than that used to return a pointer.

If there is no prototype in scope for malloc, but there is a cast,
then the usual warning about converting from an int to a pointer is
suppressed because the cast says to the compiler "I know what I'm
doing". The code that is generated will convert the supposed int to a
pointer. Since malloc really returned a pointer, the generated code
is processing a non-existent int.

Furthermore, malloc requires a size_t argument. There is an awful lot
of code where the argument expression has type int. If a prototype is
in scope, the argument will be converted automatically. If not, the
compiler will happily generate an int and pass it to malloc. If
size_t is not an (unsigned) int but a larger type, who knows what data
malloc will consider as the argument.

Both cases are examples of undefined behavior which, according to
Murphy's law, will appear to work until it is most damaging not to
(e.g., demonstration in front of very important customer).

On the other hand, if you omit the cast, the diagnostic regarding the
conversion of an implied int to a pointer should provide enough
incentive to include stdlib.h, which will automatically eliminate the
possibility of either problem occurring.

Since there is a well-defined implied conversion between void* and any
other object pointer type in either direction, the cast gains you
nothing but prevents the compiler from helping you in the case where you
forgot to include the header. What some might call a lose/no-win
situation.

The oft-recommended position in clc is to cast only when you really know
why you are casting.


Remove del for email

webs...@gmail.com

Nov 28, 2006, 10:59:14 PM
Eric Sosman wrote:
> webs...@gmail.com wrote On 11/28/06 14:54,:
> > CBFalconer wrote:
> >>webs...@gmail.com wrote:
> >>
> >>... snip ...
> >>
> >>>The cast allows for much more serious type checking. If you are
> >>>using the pattern:
> >>>
> >>> <var> = (<type> *) malloc (<count> * sizeof (<type>));
> >>>
> >>>and find a way of enforcing that the exact type is repeated
> >>>precisely (i.e., use a macro) then this usually works out better.
> >>>If you change <type> or the type of <var> the compiler will issue
> >>>a warning unless they are synchronized -- if you leave off this
> >>>cast, you get no assistance from the compiler, and you just have
> >>>to get it correct.
> >>
> >>If you use the recommended:
> >>
> >> <var> = malloc(<count> * sizeof *<var>);
> >>
> >>you need no casts, and the exact type is enforced without any
> >>concealment behind obfuscating macros or whatever.
> >
> >
> > Well, this still has the potential for cut and paste errors unless you
> > macrofy the whole line. If you don't macrofy, then you risk error no
> > matter what.
>
> (Macros cure errors? News to me ...)

I'm sure a lot of obvious things are news to you. What's not news to
me is that you would take a narrow statement I've made and
intentionally pretend I said something more general.

> The principal advantage of the recommended form is that a
> visual inspection of the line *in isolation* tells you whether
> it's correct or incorrect.

Which means what, in terms of coding safety? You are trading compiler
enforced type checking for manual based safety checking. You don't see
the inherent flaw in this? In your world people get blamed for
mistakes that in my world cannot even be made.

> [...] If the l.h.s. agrees with the sizeof
> operand, the allocation is correct (unless <var> is a float or
> an int or some other non-pointer thing, in which case the
> compiler will squawk anyhow).
>
> You do not need to go hunting for the declaration of <var>,

My compiler tells me the two types that are in conflict at the
diagnostic line (as does yours, probably) so I don't need to "hunt" for
anything with my solution either.

> nor do you need to rummage around for the definition of some
> macro and try to figure out its expansion. You can verify the
> correctness of the code with a "local" inspection, without
> digging through a pile of headers, hoping you've found the
> right version and haven't been fooled by a morass of conditional
> compilation.

The risk of error is simply not balanced by this. You can name your
macro genericArrayAlloc(,) and I don't think people will worry too much
about how the macro expands.

If your code is hundreds of thousands of lines, or if it's been
substantially written by someone else, then manual inspection of all
your code is not a feasible option. Wherever possible, the tools and
compilers themselves should be enlisted to find as many "obvious once
you look at it" kinds of bugs as possible automatically.

> > So let us take a more serious approach compare macros which prevent any
> > mismatch errors:
> >
> > #define scaredOfCPlusPlus(var,count) var = malloc(count*sizeof *var)
> > #define newThing(type,count) (type *) malloc (count * sizeof (type))
> >
> > So you can say var = newThing(char *, 512), and if the type is wrong,
> > the compiler tells you. Furthermore you can pass newThing(,) as a
> > parameter to a function. The scaredOfCPlusPlus(,) macro works fine,
> > but doesn't look familliar, and can't be passed as a parameter to a
> > function.
>
> Why not? I don't see any advantage in writing such a macro,
> but if you chose to do so the expression it generated would be
> perfectly good as an argument to function.

It requires an additional variable declaration that may be superfluous.
Hiding the "=" operator or anything as complex in macros is the kind
of thing that eventually leads to problems.

> > And, of course, the real difference is that the first compiles straight
> > in C++, and the second is just an error. I have found that in general
> > the C++ optimizers and warnings are better for the C++ mode of my
> > compilers than the C mode.
>
> (Have you interchanged "first" and "second" here?)

(Yes)

> It doesn't seem to me that one's C style should be twisted
> in deference to the practices of C++, or of Java, or of COBOL.

It isn't a matter of style. It's a matter of gaining access to better
tools. My C compilers simply don't emit a comparable number of
diagnostics, nor the quality of code, that my C++ compilers do (even
when they are made by the very same vendor, packaged in the very same
tool).

Making your C code compilable in C++ is a net gain in productivity, and
leads to a drammatic improvement in correctness, and a measurable
improvement in object code performance. Java and COBOL are
non-sequitor in this discussion.

webs...@gmail.com

Nov 28, 2006, 11:05:07 PM
Chris Torek wrote:
> webs...@gmail.com wrote:
> >And, of course, the real difference is that the first compiles straight
> >in C++, and the second is just an error. I have found that in general
> >the C++ optimizers and warnings are better for the C++ mode of my
> >compilers than the C mode.
>
> Fortran compilers are usually even better than that.

Uhhh ... no they are not. Fortran compilers only compare favorably to
some C compilers in very narrow situations where the C compiler is
unable to prove that some pointers are non-aliasing ("restrict" was
added to C99 to eliminate this scenario completely). The real
challenge is to implement x << y in Fortran so that it's no worse than 4
times slower than the comparable C output.

The point is, of course, that you almost certainly already know this
...

> [...] Obviously
> you should write your C code so that it compiles with Fortran compilers.

... which makes this at best a poor attempt at sarcasm, but more likely
an intentional deception.

Keith Thompson

Nov 28, 2006, 11:19:29 PM
webs...@gmail.com writes:
> Chris Torek wrote:
[...]

>> [...] Obviously
>> you should write your C code so that it compiles with Fortran compilers.
>
> ... which makes this at best a poor attempt at sarcasm, but more likely
> an intentional deception.

Chris's comment was obviously sarcastic (I won't comment on whether it
was a "poor attempt"). If you seriously believe that Chris Torek is
intentionally trying to deceive people into believing that programmers
should write C code so it compiles with Fortran compilers, you have a
*severe* misunderstanding.

webs...@gmail.com

Nov 28, 2006, 11:26:42 PM
Richard Heathfield wrote:
> webs...@gmail.com said:
> > santosh wrote:
> >> Santosh wrote:
> >> > > I have to read characters from stdin and save them in a string. The
> >> > > problem is that I don't know how much characters will be read.
> >> First include necessary headers: stdio.h, stdlib.h
> >>
> >> > int main()
> >>
> >> Better yet, replace above with int main(void)
> >>
> >> > {
> >> > char *str = NULL, ch ;
> >> > int i = 0 ;
> >> > str = (char*) malloc (2*sizeof(char)) ;
> >>
> >> Don't cast return value of malloc() in C.
> >
> > This is not a bug.
>
> It merely hides one.
>
> > Note that without some sort of cast, there is no
> > type checking, which is the biggest risk when dealing with void *
> > pointers.
>
> No, there is an even bigger risk - cargo cult programming, which is what
> most people are doing when they cast malloc.

Uhh ... OK, but which has the worse outcome? Superfluous structure that
your compiler is going to strip out of the object code anyway has no
negative impact on correctness or performance. Messing up a void *
pointer will cause truly arbitrary action. The two are not comparable
by outcome.

> >> [...] It can hide the non-inclusion of it's prototype,
> >
> > On *SOME* older generation compilers. No modern compiler fails to give
> > a warning about this regardless of the cast.
>
> So if anyone comes up with a counter-example, you can simply claim that it's
> not a "modern" compiler. ("True Scotsman" argument.)

For development? Are you going to use a digital watch to run your
compiler? You can demand minimum standards for your development
platform -- and numerous free compilers exist that behave as I suggest.

> [...] Furthermore, do not
> forget that some organisations are remarkably conservative, and will not
> change software that they know to work - especially if that software is
> mission-critical, as compilers easily can be.

Right -- but those same organizations are unlikely to be developing
lots of new code anyway. I don't look to such organizations for
leadership on how I should program. I only suffer their nonsense if
they are handing me a paycheck.

> >> [...] (by way of failure to include stdlib.h), and, on
> >> some implementations, can result in nasty crashes during runtime.
> >>
> >> Since sizeof(char) is by definition 1, you can omit that and instead do
> >> '2 * sizeof *str'. This has the advantage of becoming automatically
> >> updated when you later on happen to change the type of *str.
> >
> > If you want automatic type safety you should do this:
> >
> > #define safeMallocStr(p,n,type) do { (p) = (type *) malloc ((n)*sizeof (type)); } while (0);
>
> That doesn't look very type-safe to me.
>
> void *p;
> safeMallocStr(p, n, void); /* requires a diagnostic */

My compiler barfs on sizeof(void). So the error is caught.

> void *q;
> safeMallocStr(q, n, char);
> int *r = q; /* so much for type safety */

That's ridiculous. Use of void * is never type safe. Using the
non-casting style of malloc usage doesn't change the above scenario in
any relevant way. Ironically, the correct solution is to use a C++
compiler which would spit errors at you for the last line.

> > and you get type checking, and correct semantics. So if you change the
> > type of the variable you are using, your compiler will issue warnings
> > if you mismatch here.
>
> Why not just remove the mismatch risk completely?

What risk are you talking about? Outside of gratuitous use of void *
pointers (which is tantamount to using gets() in "controlled
environments") there is no risk.

> > Any variation you do in which you omit the cast
> > outside of malloc will fail to catch this "change the definition of the
> > pointer" scenario.
>
> Wrong.
>
> T *p;
>
> p = malloc(n * sizeof *p);
>
> Now change p's type to U *. The malloc is still correct, and does not need
> an extra, potentially error-prone, edit to a spurious macro call.

The macro is potentially error-prone, but mismatching the variable and
the thing you are taking sizeof is not error-prone? First of all, you
have not explained how the macro is error-prone, while the above is so
obviously susceptible to cut-and-paste errors.

Eric Sosman

Nov 28, 2006, 11:31:47 PM
webs...@gmail.com wrote:

> Eric Sosman wrote:
>
>>webs...@gmail.com wrote On 11/28/06 14:54,:

>>>[...]


>>>Well, this still has the potential for cut and paste errors unless you
>>>macrofy the whole line. If you don't macrofy, then you risk error no
>>>matter what.
>>
>> (Macros cure errors? News to me ...)
>
> I'm sure a lot of obvious things are news to you.

I'm sure you're right, which means I must have missed
something obvious.

>> The principal advantage of the recommended form is that a
>>visual inspection of the line *in isolation* tells you whether
>>it's correct or incorrect.
>
> Which means what, in terms of coding safety? You are trading compiler
> enforced type checking for manual based safety checking. You don't see
> the inherent flaw in this? In your world people get blamed for
> mistakes that in my world cannot even be made.

No, I'm considering the poor sod who's trying to track down
a bug in a big hairy intertwined mess of code. If he's reading
along and he sees the Recommended Form, he can tell at once that
it's correct and not the cause of his problem (barring a mis-
computation of the number of items to be allocated, which he
needs to check in either formulation). He is not distracted by
the need to go haring off to other parts of the code base to find
out what the Dickens this macro is, or whether its customary
definition has been overridden by some "clever" use of #ifdef in
a well-concealed header file. (If you haven't encountered such
things, you haven't been around long enough.) He can read, verify,
and move along, all without scrolling the screen. That means his
attention is not diverted away from whatever the bug is; blind
alleys are eliminated without strolling down them.

> The risk of error is simply not balanced by this. You can name your
> macro genericArrayAlloc(,) and I don't think people will worry too much
> about how the macro expands.

If they don't, they should. Quickly, now: From the evidence
at hand, which argument is the type and which is the count? Or
is one of the arguments supposed to be the l.h.s. variable name
and not a type name at all? You can stare all day at the point
of invocation and not know what the macro expands to -- and you
can hunt all day through mazes of header files to find half a
dozen different conflicting definitions of the macro, and waste
time trying to figure out which is in force at the point of interest.

> If your code is hundreds of thousands of lines, or if its been
> substantially written by someone else, then manual inspection of all
> your code is not a feasible option. Wherever possible, the tools and
> compilers themselves should be enlisted to find as many "obvious once
> you look at it" kinds of bugs automatically.

The largest program I have worked on myself was only about three
million lines, roughly 2.5 million of C and 0.5 million of Lisp.
It was written and rewritten and re-rewritten over a period of about
fifteen years by a programming team that started as half-a-dozen
crazy zealots and grew (irregularly) to perhaps ninety or a hundred
people. I was one of them for eleven years, and have (I think) the
bare beginnings of an idea of what it must be like to work on a
big software project. (No, three million lines isn't "big" by any
standard. All I'm saying is that it's "big enough" to exceed a
human's ability for direct comprehension and to require the use of
conventions and suchlike formalisms as aids to understanding.)

And in light of what I've experienced, I stand by my opinion.

>>>So you can say var = newThing(char *, 512), and if the type is wrong,
>>>the compiler tells you. Furthermore you can pass newThing(,) as a
>>>parameter to a function. The scaredOfCPlusPlus(,) macro works fine,
>>>but doesn't look familiar, and can't be passed as a parameter to a
>>>function.
>>
>> Why not? I don't see any advantage in writing such a macro,
>>but if you chose to do so the expression it generated would be
>>perfectly good as an argument to function.
>
> It requires an additional variable declaration that may be superfluous.
> Hiding the "=" operator or anything as complex in macros is the kind
> of thing that eventually leads to problems.

Straw man: It was your decision, not mine, to hide an assignment
inside the macro. You are criticizing your own macro, not the form
it distorts.

> Making your C code compilable in C++ is a net gain in productivity,
> and leads to a dramatic improvement in correctness, and a measurable
> improvement in object code performance. Java and COBOL are
> non-sequitor in this discussion.

"Sequitur." It's Latin, like C an old-fashioned language. If
you prefer Italian and C++ by all means speak them, but don't try
to give advice to the classicists.

--
Eric Sosman
eso...@acm-dot-org.invalid

webs...@gmail.com

unread,
Nov 28, 2006, 11:49:52 PM11/28/06
to
Keith Thompson wrote:
> webs...@gmail.com writes:
> > Chris Torek wrote:
> [...]
> >> [...] Obviously
> >> you should write your C code so that it compiles with Fortran compilers.
> >
> > ... which makes this at best a poor attempt at sarcasm, but more likely
> > an intentional deception.
>
> Chris's comment was obviously sarcastic (I won't comment on whether it
> was a "poor attempt"). If you seriously believe that Chris Torek is
> intentionally trying to deceive people into believing that programmers
> should write C code so it compiles with Fortran compilers, you have a
> *severe* misunderstanding.

Typical shallow point of view on your part.

He's deceiving people by 1) intentionally reading my argument as being
that you should make your C code work in any language that might output
faster code (clearly I am picking C++ for the obvious reasons) and 2)
he's propagating the idea that Fortran is faster than C as a blanket
statement (this isn't true).

For sarcasm to be effective, you normally represent the opposing idea
fairly (not done here) and then show it at its extreme. He starts by
encoding two deceptions (which is a very typical propagandist trick; if
you unravel one deception, you may still fail to unwrap the other), then
applies his sarcasm in this case as a means of misdirection. Then the
sheep just ignore the deceptions and assess it for its sarcastic value.

People who watch Fox News critically will recognize this as a very
standard trick they engage in all the time (host: "Do the democrats
support terrorism by suggesting we withdraw our troops?" fake-liberal:
"No! We are not cut and runners").

The only question here: Is Chris a liar or is he stupid? I don't think
he's stupid.

Richard Heathfield

unread,
Nov 29, 2006, 2:00:16 AM11/29/06
to
webs...@gmail.com said:

<snip>


> he's propagating the idea that Fortran is faster than C as a blanket
> statement (this isn't true).

He said no such thing.

> The only question here: Is Chris a liar or is he stupid? I don't think
> he's stupid.

Nor is he a liar. And to ask the only *other* question remaining, I don't
think you're a liar either. So that's settled, then.

Richard Heathfield

unread,
Nov 29, 2006, 2:19:02 AM11/29/06
to
webs...@gmail.com said:

> Richard Heathfield wrote:
>> webs...@gmail.com said:
>> > santosh wrote:
>> >> Santosh wrote:
>> >> > > I have to read characters from stdin and save them in a string.
>> >> > > The problem is that I don't know how much characters will be read.
>> >> First include necessary headers: stdio.h, stdlib.h
>> >>
>> >> > int main()
>> >>
>> >> Better yet, replace above with int main(void)
>> >>
>> >> > {
>> >> > char *str = NULL, ch ;
>> >> > int i = 0 ;
>> >> > str = (char*) malloc (2*sizeof(char)) ;
>> >>
>> >> Don't cast return value of malloc() in C.
>> >
>> > This is not a bug.
>>
>> It merely hides one.
>>
>> > Note that without some sort of cast, there is no
>> > type checking, which is the biggest risk when dealing with void *
>> > pointers.
>>
>> No, there is an even bigger risk - cargo cult programming, which is what
>> most people are doing when they cast malloc.
>
> Uhh ... ok, but which has worse outcome? Superfluous structure that
> your compiler is going to strip out of the object code anyways has no
> negative impact on correctness or performance.

If it's superfluous (your word, not mine, but I agree that it is appropriate
here), you might as well leave it out.

> Messing up a void *
> pointer will cause truly arbitrary action. The two are not comparable
> by outcome.

Have you ever tried *not* messing up a void * pointer? I have. It works just
fine.

>> >> [...] It can hide the non-inclusion of it's prototype,
>> >
>> > On *SOME* older generation compilers. No modern compiler fails to give
>> > a warning about this regardless of the cast.
>>
>> So if anyone comes up with a counter-example, you can simply claim that
>> it's not a "modern" compiler. ("True Scotsman" argument.)
>
> For development?

Sure. Just because an implementation doesn't give one particular diagnostic
message that Paul Hsieh thinks it should, that doesn't mean it's a Bad
Compiler.

> Are you going to use a digital watch to run your
> compiler?

Is it your contention, then, that only compilers that run on digital watches
do not issue such warnings?

> You can demand minimum standards for your development
> platform -- and numerous free compilers exist that behave as I suggest.

Paul Hsieh's suggestions on compiler behaviour are non-normative.

>> [...] Furthermore, do not
>> forget that some organisations are remarkably conservative, and will not
>> change software that they know to work - especially if that software is
>> mission-critical, as compilers easily can be.
>
> Right -- but those same organizations are unlikely to be developing
> lots of new code anyways.

That has not been true of several such organisations of which I have
personal experience.

> I don't look to such organizations to
> leadership on how I should program. I only suffer their nonsense if
> they are handing me a paycheck.

Bingo.

>> > If you want automatic type safety you should do this:
>> >
>> > #define safeMallocStr(p,n,type) do { (p) = (type *) malloc
>> > ((n)*sizeof (type)); } while (0);
>>
>> That doesn't look very type-safe to me.
>>
>> void *p;
>> safeMallocStr(p, n, void); /* requires a diagnostic */
>
> My compiler barfs on sizeof(void). So the error is caught.

Yes, but you now have your maintenance programmer wondering why the heck he
can't put void there - it worked all right for char, so why not void? He
has to dig out the macro to find out, which means pushing his context and
digging out the header. What a waste of time.


>> void *q;
>> safeMallocStr(q, n, char);
>> int *r = q; /* so much for type safety */
>
> That's ridiculous.

Yes, but then it uses a ridiculous macro.

> Use of void * is never type safe.

And it's not only legal but even idiomatic C. So trying to make C type safe
is a bit like trying to make Ook! object-oriented.

> Using the
> non-casting style of malloc usage doesn't change the above scenario in
> any relevant way.

I agree entirely, but my point was only that your macro doesn't magically
introduce type safety into a language that I prefer to think of as "type
aware". To some people, type safety is a straitjacket.

> Ironically, the correct solution is to use a C++
> compiler which would spit errors at you for the last line.

Ironically, by introducing C++ into this argument you just explained why
your suggestions about C should be treated with a pinch of salt.

>> > Any variation you do in which you omit the cast
>> > outside of malloc will fail to catch this "change the definition of the
>> > pointer" scenario.
>>
>> Wrong.
>>
>> T *p;
>>
>> p = malloc(n * sizeof *p);
>>
>> Now change p's type to U *. The malloc is still correct, and does not
>> need an extra, potentially error-prone, edit to a spurious macro call.
>
> The macro is potentially error-prone,

Yes. The macro needs to be told the type, and you can get the type *wrong*.

p = malloc(n * sizeof *p); does not need to be told the type, so you can't
get the type wrong.

> but mismatching the variable and
> the thing you are taking sizeof is not error-prone?

There is no such mismatch in the canonical form.
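
A small sketch of that point, with types chosen only for illustration:

#include <stdlib.h>

struct old_rec { int id; };
struct new_rec { int id; double history[64]; };

static void allocate_records(size_t n)
{
    /* The declaration was changed from struct old_rec * to: */
    struct new_rec *p;

    p = malloc(n * sizeof *p);   /* still allocates the right amount */
    /* p = malloc(n * sizeof(struct old_rec)); would now silently
       under-allocate -- the hazard carried by any form that repeats
       the type by hand. */
    free(p);
}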

webs...@gmail.com

unread,
Nov 29, 2006, 2:27:43 AM11/29/06
to
Richard Heathfield wrote:
> webs...@gmail.com said:
> <snip>
> > he's propagating the idea that Fortran is faster than C as a blanket
> > statement (this isn't true).
>
> He said no such thing.

Well, this is what he said: "Fortran compilers are usually even better
than that." (responding to my statement about compiler optimizers and
warnings). You see, his *attempt* at sarcasm doesn't work unless he is
able to establish this premise. Otherwise, his statement doesn't make
any sense at all.

> > The only question here: Is Chris a liar or is he stupid? I don't think
> > he's stupid.
>
> Nor is he a liar.

Ok, he's made two gross mistakes in premise. But he clearly has a lot
of experience, and at least some skill as far as I can tell. Neither
mistake really makes sense in light of where you would expect his level
to be at. Well, ok maybe he was drunk or something -- you could argue
that's a form of temporary stupidity I guess.

But ignoring the more bizarre possibilities, what are we left with?

> [...] And to ask the only *other* question remaining, I don't


> think you're a liar either.

Ok, so upon what do you base this question?

> [...] So that's settled, then.

Right, because that's how usenet discussions are always settled.

Richard Heathfield

unread,
Nov 29, 2006, 2:57:38 AM11/29/06
to
webs...@gmail.com said:

<snip>
>
> Ok, [Chris Torek has] made two gross mistakes in premise. But he


> clearly has a lot of experience, and at least some skill as far as I can
> tell.

As far as I can tell, he has more of both than you do. And that's likely to
be the perception amongst others here too. That doesn't mean he's
infallible. But if you and he disagree over something, common sense and
experience will lead me to assume that he's right and you're wrong, unless
you can come up with some extraordinarily convincing counter-evidence. So
far, you have not done so.

> Neither
> mistake really makes sense in light of where you would expect his level
> to be at. Well, ok maybe he was drunk or something -- you could argue
> that's a form of temporary stupidity I guess.
>
> But ignoring the more bizarre possibilities, what are we left with?

The possibility that he's right and you're either wrong or misinterpreting
what he's said.

>> [...] And to ask the only *other* question remaining, I don't
>> think you're a liar either.
>
> Ok, so upon what do you base this question?

You said: "The only question here: Is Chris a liar or is he stupid? I don't
think he's stupid." In so doing, you called Chris's integrity into
question. And so either you were stupid enough to believe that Chris was
lying or you were lying because you knew he wasn't but were trying to
deceive people into believing he was. And I don't think you're a liar.

Attacking generous-hearted and much-loved old-timer experts like Chris Torek
is a risky strategy in comp.lang.c. If he's actually *wrong* about
something (and apparently there was a time back in 1974...), then sure,
let's put it right. But "either you're stupid or you're lying" is an
attack, plain and simple. You might turn those guns on a troll without
anyone batting an eyelid - but Chris Torek? Forget it.


>> [...] So that's settled, then.
>
> Right, because that's how usenet discussions are always settled.

No, sometimes logic prevails. It just doesn't happen very often.

webs...@gmail.com

unread,
Nov 29, 2006, 3:21:47 AM11/29/06
to
Eric Sosman wrote:
> webs...@gmail.com wrote:
> > Eric Sosman wrote:
> >>webs...@gmail.com wrote On 11/28/06 14:54,:
> >>>[...]
> >>>Well, this still has the potential for cut and paste errors unless you
> >>>macrofy the whole line. If you don't macrofy, then you risk error no
> >>>matter what.
> >>
> >> (Macros cure errors? News to me ...)
> >
> > I'm sure a lot of obvious things are news to you.
>
> I'm sure you're right, which means I must have missed
> something obvious.

Ok, boys and girls, can you spot two errors in logic in a row by Mr.
Sosman in these quotes above?

> >> The principal advantage of the recommended form is that a
> >>visual inspection of the line *in isolation* tells you whether
> >>it's correct or incorrect.
> >
> > Which means what, in terms of coding safety? You are trading compiler
> > enforced type checking for manual based safety checking. You don't see
> > the inherent flaw in this? In your world people get blamed for
> > mistakes that in my world cannot even be made.
>
> No, I'm considering the poor sod who's trying to track down
> a bug in a big hairy intertwined mess of code.

Ok, but you are just injecting your paranoia into the equation here, as
a poor cover for this groundless idea.

> [...] If he's reading


> along and he sees the Recommended Form,

Of course, capital letters -- I guess that must make it an authoritative
dictum that cannot be questioned.

> [...] he can tell at once that


> it's correct and not the cause of his problem (barring a mis-
> computation of the number of items to be allocated, which he
> needs to check in either formulation).

This is only relevant if you can assume that he can read every line of
code in the program. A silent mismatch on types via void * will not
rear its head in obvious ways. You have to mechanically prevent it,
otherwise it will just lead to problems at some realized constant rate.

> [...] He is not distracted by


> the need to go haring off to other parts of the code base to find
> out what the Dickens this macro is, or whether its customary
> definition has been overridden by some "clever" use of #ifdef in
> a well-concealed header file. (If you haven't encountered such
> things, you haven't been around long enough.)

Debugging is of arbitrary complexity. There's little you can do about
that. Blanket conceptions like "don't use macros" are basically no
help, especially when you can actually use macros to increase safety.

> [...] He can read, verify,


> and move along, all without scrolling the screen.

Yes but under your premise he has to repeat this a million times with
perfect precision otherwise this process is not really any good to
anyone.

> [...] That means his


> attention is not diverted away from whatever the bug is; blind
> alleys are eliminated without strolling down them.

You are mixing two things up here. You are trying to decrease the cost
of debugging (which in reality you really aren't) at the expense of
up-front safety. It's always easier and faster to not have to deal with
bugs than it is to debug them.

> > The risk of error is simply not balanced by this. You can name your
> > macro genericArrayAlloc(,) and I don't think people will worry too much
> > about how the macro expands.
>
> If they don't, they should. Quickly, now: From the evidence
> at hand, which argument is the type and which is the count?

The compiler compiled it. (I came up with this answer in about 0.2
seconds. Quick enough for you?) So the type and the count will
correspond to whatever correctly compiled.

Now just as quickly: if you redefined "sizeof", can the compiler think
the argument of malloc(n*sizeof*p) is a multiplication of 3 values?

> [...] Or


> is one of the arguments supposed to be the l.h.s. variable name
> and not a type name at all? You can stare all day at the point
> of invocation and not know what the macro expands to -- and you
> can hunt all day through mazes of header files to find half a
> dozen different conflicting definitions of the macro, and waste
> time trying to figure out which is in force at the point of interest.

You could, or you could step through it with a debugger, see that it's
correct, and move on.

> > If your code is hundreds of thousands of lines, or if its been
> > substantially written by someone else, then manual inspection of all
> > your code is not a feasible option. Wherever possible, the tools and
> > compilers themselves should be enlisted to find as many "obvious once
> > you look at it" kinds of bugs automatically.
>
> The largest program I have worked on myself was only about three
> million lines, roughly 2.5 million of C and 0.5 million of Lisp.
> It was written and rewritten and re-rewritten over a period of about
> fifteen years by a programming team that started as half-a-dozen
> crazy zealots and grew (irregularly) to perhaps ninety or a hundred
> people. I was one of them for eleven years, and have (I think) the
> bare beginnings of an idea of what it must be like to work on a
> big software project. (No, three million lines isn't "big" by any
> standard. All I'm saying is that it's "big enough" to exceed a
> human's ability for direct comprehension and to require the use of
> conventions and suchlike formalisms as aids to understanding.)

I don't see this as testimony in favor of your approach. With a light
macro, the compiler will keep you in check. Without it, you have only
your wits and a hope that your coding conventions will be followed. My
experience is that in large enough groups, coding conventions erode
over time.

> And in light of what I've experienced, I stand by my opinion.

Right. Most of the rest of the industry stands by entirely different
opinions. Entire programming languages were created because of these
sorts of inanities in C.

> >>>So you can say var = newThing(char *, 512), and if the type is wrong,
> >>>the compiler tells you. Furthermore you can pass newThing(,) as a
> >>>parameter to a function. The scaredOfCPlusPlus(,) macro works fine,
> >>>but doesn't look familiar, and can't be passed as a parameter to a
> >>>function.
> >>
> >> Why not? I don't see any advantage in writing such a macro,
> >>but if you chose to do so the expression it generated would be
> >>perfectly good as an argument to function.
> >
> > It requires an additional variable declaration that may be superfluous.
> > Hiding the "=" operator or anything as complex in macros is the kind
> > of thing that eventually leads to problems.
>
> Straw man: It was your decision, not mine, to hide an assignment
> inside the macro. You are criticizing your own macro, not the form
> it distorts.

Well but you are proposing not using a macro at all, and I am not going
to address the safety of doing that other than to say that it becomes a
cut and paste silent error magnet. So I'm pointing out that the
no-cast method is going to end up sub-optimal no matter what you do.

Richard Heathfield

unread,
Nov 29, 2006, 3:32:38 AM11/29/06
to
webs...@gmail.com said:

<snip>



> Now just as quickly: if you redefined "sizeof", can the compiler think
> the argument of malloc(n*sizeof*p) is a multiplication of 3 values?

If you redefined sizeof you wouldn't be programming in C any more. When you
have a sensible argument, wake me up.

webs...@gmail.com

unread,
Nov 29, 2006, 3:54:33 AM11/29/06
to
Richard Heathfield wrote:
> webs...@gmail.com said:
> <snip>
> > Ok, [Chris Torek has] made two gross mistakes in premise. But he
> > clearly has a lot of experience, and at least some skill as far as I can
> > tell.
>
> As far as I can tell, he has more of both than you do. And that's likely to
> be the perception amongst others here too. That doesn't mean he's
> infallible. But if you and he disagree over something, common sense and
> experience will lead me to assume that he's right and you're wrong, unless
> you can come up with some extraordinarily convincing counter-evidence. So
> far, you have not done so.

Given a previous discussion the two of us were in, I have to resist
making the ridiculously obvious retort at this point. Let me put it in
the most polite way possible -- do you really think that your
personal deference to him is somehow supposed to mean something to me?
If it helps, let me tell you that I am a naturally anti-authoritarian
person. I strongly feel that pedestals are meant to be knocked over
(they serve no other real purpose).

> > Neither
> > mistake really makes sense in light of where you would expect his level
> > to be at. Well, ok maybe he was drunk or something -- you could argue
> > that's a form of temporary stupidity I guess.
> >
> > But ignoring the more bizarre possibilities, what are we left with?
>
> The possibility that he's right and you're either wrong or misinterpreting
> what he's said.

Uh ... he said Fortran was better than C (at optimization and/or
diagnostics). No matter how much you delete this quote, he still said
it. If he can make a strong case for the diagnostics, then I will
concede that he just wasn't being clear (but bizarrely intentionally
so). As for the optimizations, he's barking up the wrong tree if he
wants to try to present that case to me.

Ok, then he implied that I said you should consider using a C++
compiler to compile your C code, solely because it has better
optimizations and diagnostics. Obviously I only suggest this because
its just so easy to make your code both C and C++ compatible, so there
is a lot up side benefit (better optimized, better diagnostics) from
making your code C++ compatible with relatively little cost. Even if
Fortran were faster on average (again, it is not, and everyone knows
this) I am in no way implying that you should try to make a C/Fortran
polyglot or switch to Fortran or anything like that.

I'm trying to figure out what I'm misinterpreting or getting wrong
in all of this. I just can't see it. Certainly your evidence-free
claims about this misinterpretation are of no help.

> >> [...] And to ask the only *other* question remaining, I don't
> >> think you're a liar either.
> >
> > Ok, so upon what do you base this question?
>
> You said: "The only question here: Is Chris a liar or is he stupid? I don't
> think he's stupid."

That is a conclusion after some build-up yes. I obviously didn't say
that in isolation.

> [...] In so doing, you called Chris's integrity into
> question.

Ok, well he's publicly twisted my words and made a sarcastic remark
in order to make the point that I am either advocating something
ludicrous (making your code into polyglots) or have made some kind of
error in reasoning that ultimately leads to that. Chris ordinarily
commands some sort of respect in this group, so I wonder who is calling
whose integrity into question here?

> [...] And so either you were stupid enough to believe that Chris was


> lying or you were lying because you knew he wasn't but were trying to
> deceive people into believing he was. And I don't think you're a liar.

Excuse me? The *EFFECT* of what he wrote *DOES* deceive. This is not
credibly in dispute. He claims I am saying and implying things I
clearly am not, and has added a clearly mistaken claim to this. He has
put his name to these erroneous words. When this happens, the ordinary
options are 1) malice, 2) error, 3) incompetence. I've ruled out <2)
error> simply because his track records suggests he couldn't make two
simultaneous errors of that kind at once.

I call BS on the both of you.

webs...@gmail.com

unread,
Nov 29, 2006, 4:24:58 AM11/29/06
to

You mean like you should never comment your code because comments are
superfluous? (Certainly, their value is at best subjective.)
Sometimes redundancy is useful -- this is the lesson of structured
programming.

> > Messing up a void *
> > pointer will cause truly arbitrary action. The two are not comparable
> > by outcome.
>
> Have you ever tried *not* messing up a void * pointer? I have. It works just
> fine.

Sure. Self-modifying code works fine too. You know, I've written my
own fully functional co-routine library that only works in the
"register mode" of WATCOM C/C++ and a full multithreading library that
only works in DOS. I've written a compile-on-the-fly image scaler that
supports three different compilers and targets x86 CPUs that support
MMX. But taking a step back, it occurs to me that "it works fine" is
a fairly low standard for writing code.

On the counter point, have you ever tried to debug a messed up type
coercion hidden by a void *? The real problem with it is that you
don't even realize that's what's gone wrong until you get deeply into
debugging it. And the debugger will typically be far less helpful than
you wished.

> >> >> [...] It can hide the non-inclusion of it's prototype,
> >> >
> >> > On *SOME* older generation compilers. No modern compiler fails to give
> >> > a warning about this regardless of the cast.
> >>
> >> So if anyone comes up with a counter-example, you can simply claim that
> >> it's not a "modern" compiler. ("True Scotsman" argument.)
> >
> > For development?
>
> Sure. Just because an implementation doesn't give one particular diagnostic
> message that Paul Hsieh thinks it should, that doesn't mean it's a Bad
> Compiler.

Right ... but we're several posts into this, and you couldn't even come
up with one? Does the Diep compiler do this? Some UNIX cc that I have
not encountered? Green-Hills compiler? I'm just listing the ones I
know about, but have never used to verify whether or not they warn you
about missing prototypes.

> > Are you going to use a digital watch to run your
> > compiler?
>
> Is it your contention, then, that only compilers that run on digital watches
> do not issue such warnings?

Well, old compilers don't issue the warning, because it used to be
considered valid C code without question. I've already implicitly
conceded this.

What? void is not a thing. If a maintenance programmer wants to sort
a 0-sized list, or return auto-declared arrays, he can do that too.
The difference is that stuffing void in there is completely
unmotivated, and the *compiler* tells you about the error. You have a
strange notion of what the true cost of development is.

> >> void *q;
> >> safeMallocStr(q, n, char);
> >> int *r = q; /* so much for type safety */
> >
> > That's ridiculous.
>
> Yes, but then it uses a ridiculous macro.
>
> > Use of void * is never type safe.
>
> And it's not only legal but even idiomatic C.

s/m/t/

> [...] So trying to make C type safe is a bit like


> trying to make Ook! object-oriented.

Interesting observation. If you intersect C and C++, what are you left
with? It's like C without some of the marginal constructs, and it has
the type safety of C++.

> > Using the
> > non-casting style of malloc usage doesn't change the above scenario in
> > any relevant way.
>
> I agree entirely, but my point was only that your macro doesn't magically
> introduce type safety into a language that I prefer to think of as "type
> aware". To some people, type safety is a straitjacket.

Well to some people type safety is free automated assistance.

> > Ironically, the correct solution is to use a C++
> > compiler which would spit errors at you for the last line.
>
> Ironically, by introducing C++ into this argument you just explained why
> your suggestions about C should be treated with a pinch of salt.

Do I smell a fundamentalist ideology?

> >> > Any variation you do in which you omit the cast
> >> > outside of malloc will fail to catch this "change the definition of the
> >> > pointer" scenario.
> >>
> >> Wrong.
> >>
> >> T *p;
> >>
> >> p = malloc(n * sizeof *p);
> >>
> >> Now change p's type to U *. The malloc is still correct, and does not
> >> need an extra, potentially error-prone, edit to a spurious macro call.
> >
> > The macro is potentially error-prone,
>
> Yes. The macro needs to be told the type, and you can get the type *wrong*.

But the compiler won't allow it to compile. Compile time errors are
basically zero cost. You may perceive the cost of development to be
"typing code in". I lean towards the idea that safety and debugging
costs more than typing code in, and debugging is far more costly than
up-front safety.

> p = malloc(n * sizeof *p); does not need to be told the type, so you can't
> get the type wrong.
>
> > but mismatching the variable and
> > the thing you are taking sizeof is not error-prone?
>
> There is no such mismatch in the canonical form.

I don't know what you are talking about. You cut and paste, you change
the target variable and miss the sizeof variable. Ok, you've just
introduced a silent error, that's many hours of debugging waiting to
happen. In my case, you change a variable declaration, and the
compiler then lists all the places where you have created a type
mismatch. A few minutes maybe, even if you have to write an awk
script. As soon as you make it compile-worthy, you are set.
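
The cut-and-paste scenario being described, spelled out with
illustrative names (under the no-cast form the mismatch compiles without
complaint; with an explicit element type and cast the pasted line would
draw a diagnostic for the pointer mismatch):

#include <stdlib.h>

static void pasted(size_t n)
{
    int    *counts;
    double *totals;

    counts = malloc(n * sizeof *counts);  /* original line */
    totals = malloc(n * sizeof *counts);  /* pasted copy: the target was
                                             edited but the sizeof
                                             operand was missed; this
                                             compiles silently and
                                             under-allocates wherever
                                             double is wider than int */
    free(counts);
    free(totals);
}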

Richard Heathfield

unread,
Nov 29, 2006, 5:32:33 AM11/29/06
to
webs...@gmail.com said:

> Richard Heathfield wrote:
>> webs...@gmail.com said:

<snip>

>> > Superfluous structure that
>> > your compiler is going to strip out of the object code anyways has no
>> > negative impact on correctness or performance.
>>
>> If it's superfluous (your word, not mine, but I agree that it is
>> appropriate here), you might as well leave it out.
>
> You mean like you should never comment your code because comments are
> superfluous?

I don't agree that comments are superfluous. *You* said the structure was
superfluous, and "superfluous" means "redundant, unnecessary" (look in a
dictionary if you don't believe me).

> (Certainly, their value is at best subjective.)
> Sometimes redundancy is useful -- this is the lesson of structured
> programming.

If it's useful, how can it be redundant?

>> > Messing up a void *
>> > pointer will cause truly arbitrary action. The two are not comparable
>> > by outcome.
>>
>> Have you ever tried *not* messing up a void * pointer? I have. It works
>> just fine.
>
> Sure. Self-modifying code works fine too.

In my experience, such code nails you pretty firmly to a particular platform
(or group of closely-related platforms).

<snip>

> But taking a step back, it occurrs to me that "it works fine" is
> a fairly low standard for writing code.

Taking a step forward again, "it works fine" is a fabulous *baseline* for
writing code. For example, fopen works fine, and I use it quite happily,
without worrying that my code is somehow of a low standard just because it
uses something that works fine. Using void pointers correctly does not
imply fragile code. Using void pointers incorrectly is a losing strategy.
But so is using casts incorrectly. So is using fopen incorrectly. So is
using *anything* incorrectly.

> On the counter point, have you ever tried to debug a messed up type
> coercion hidden by a void *?

Yes - in fact I've almost certainly done so right here in comp.lang.c. And
I've debugged screwed-up macro calls, too. So?

> The real problem with it is that you
> don't even realize that's what's gone wrong until you get deeply into
> debugging it. And the debugger will typically be far less helpful than
> you wished.

"Think first, compute later" is always a good plan. I generally don't bother
too much with debuggers nowadays. They are occasional ports in a storm,
that's all.


>> >> >> [...] It can hide the non-inclusion of it's prototype,
>> >> >
>> >> > On *SOME* older generation compilers. No modern compiler fails to
>> >> > give a warning about this regardless of the cast.
>> >>
>> >> So if anyone comes up with a counter-example, you can simply claim
>> >> that it's not a "modern" compiler. ("True Scotsman" argument.)
>> >
>> > For development?
>>
>> Sure. Just because an implementation doesn't give one particular
>> diagnostic message that Paul Hsieh thinks it should, that doesn't mean
>> it's a Bad Compiler.
>
> Right ... but we're several posts into this, and you couldn't even come
> up with one?

Why bother? The diagnostic message is not required by the Standard, so it
makes no sense to me to insist to compiler-writers that they provide it. In
general, I use the compiler I'm given when on client sites. If I were to
say to a client, "Paul Hsieh suggests we use a different compiler to the
one you've been using quite happily for this whole project and many others
before, because this one doesn't diagnose <foo>, which it isn't required to
by the Standard", he'd laugh in my face, and rightly so.

<snip>



>> > Use of void * is never type safe.
>>
>> And it's not only legal but even idiomatic C.
>
> s/m/t/

"idiotatic"? Okay, let's assume you mean "idiotic". It is your right to hold
that opinion, but your saying that use of void * is idiotic doesn't make it
so.

>> [...] So trying to make C type safe is a bit like
>> trying to make Ook! object-oriented.
>
> Interesting observation. If you intersect C and C++ what are you left
> with?

Either poor C, poor C++, or syntax errors.

>> > Using the
>> > non-casting style of malloc usage doesn't change the above scenario in
>> > any relevant way.
>>
>> I agree entirely, but my point was only that your macro doesn't magically
>> introduce type safety into a language that I prefer to think of as "type
>> aware". To some people, type safety is a straitjacket.
>
> Well to some people type safety is free automated assistance.

I have no problem with free automated assistance, but free automated
dictation is another matter. Whether a pointer of type <foo> is meaningful
when interpreted as if it were a pointer of type <bar> is something that
I'll judge for myself.

>> > Ironically, the correct solution is to use a C++
>> > compiler which would spit errors at you for the last line.
>>
>> Ironically, by introducing C++ into this argument you just explained why
>> your suggestions about C should be treated with a pinch of salt.
>
> Do I smell a fundamentalist ideology?

No, you smell comp.lang.c, which is about C, not C++. If you want to discuss
C++, there's a whole nother newsgroup for that. And if you want to discuss
programming in general, there's a newsgroup for that, too.

<snip>

>> p = malloc(n * sizeof *p); does not need to be told the type, so you
>> can't get the type wrong.
>>
>> > but mismatching the variable and
>> > the thing you are taking sizeof is not error-prone?
>>
>> There is no such mismatch in the canonical form.
>
> I don't know what you are talking about. You cut and paste, you change
> the target variable and miss the sizeof variable.

Oh, okay, I see what you mean. I thought you were talking about types, not
objects. The reason I didn't "get it" immediately is probably because I
find it quicker to type p = malloc(n * sizeof *p); than to invoke a copy
operation, a move operation, a paste operation, and two edits. Copy-paste
is expensive compared to typing when the amount to be copied is low, and
silly when the amount to be copied is high (because you're missing an
opportunity for re-factoring).

> Ok, you've just
> introduced a silent error, that's many hours of debugging waiting to
> happen.

Perhaps I need more practice. I generally don't manage to make debugging
last more than a few minutes.

Al Balmer

unread,
Nov 29, 2006, 10:42:23 AM11/29/06
to
On 28 Nov 2006 20:26:42 -0800, webs...@gmail.com wrote:

>Fiirst of all, you
>have not explained how the macro is error prone, while the above is so
>obviously susceptable to cut-and-paste errors.

I've found that all cut-and-paste errors can be eliminated by avoiding
cut-and-paste.

--
Al Balmer
Sun City, AZ

CBFalconer

unread,
Nov 29, 2006, 9:40:05 AM11/29/06
to
Richard Heathfield wrote:
> webs...@gmail.com said:
>
> <snip>
>
>> Now just as quickly: if you redefined "sizeof", can the compiler think
>> the argument of malloc(n*sizeof*p) is a multiplication of 3 values?
>
> If you redefined sizeof you wouldn't be programming in C any more.
> When you have a sensible argument, wake me up.

You might as well give it up and save the bandwidth. Websnarl is
never going to write portable conforming code anyhow. He also
thinks that all C systems have 32 bit integers, for example. Just
point out the errors for the benefit of others.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>


webs...@gmail.com

unread,
Nov 29, 2006, 2:32:54 PM11/29/06
to

Yes, and in fact all programming errors can be eliminated by avoiding
programming.

webs...@gmail.com

unread,
Nov 29, 2006, 2:58:42 PM11/29/06
to
Richard Heathfield wrote:
> webs...@gmail.com said:
> > Richard Heathfield wrote:
> >> webs...@gmail.com said:
> <snip>
> >> > Superfluous structure that
> >> > your compiler is going to strip out of the object code anyways has no
> >> > negative impact on correctness or performance.
> >>
> >> If it's superfluous (your word, not mine, but I agree that it is
> >> appropriate here), you might as well leave it out.
> >
> > You mean like you should never comment your code because comments are
> > superfluous?
>
> I don't agree that comments are superfluous. *You* said the structure was
> superfluous, and "superfluous" means "redundant, unnecessary" (look in a
> dictionary if you don't believe me).

Huh? I don't have a problem with the definition. Comments describe
what the code is doing (redundant, since the source itself does that)
and are ignored by the compiler (unnecessary). So what is your
problem? (Note that this does not imply that Comments are a bad
thing.)

> > (Certainly, their value is at best subjective.)
> > Sometimes redundancy is useful -- this is the lesson of structured
> > programming.
>
> If it's useful, how can it be redundant?

Well let's just pause and think about this for a second.

Most CPU caches have parity bits or ECC, which are *REDUNDANT* to the
raw data they're already carrying. So the question is, how can parity
bits or ECC be useful? Perhaps we need a research project to figure
that out. You'll find the same thing on hard disks and CD-ROMs.

TCP/IP uses a ones' complement checksum on its packets. This checksum
is obviously derivable from the rest of the data it's delivering.

There are well known algorithms, in fact called "Cyclic Redundancy
Check". Interesting that people would waste time developing these
algorithms if they weren't useful. Do you think these algorithms
belong in the same category as bogosort or the brainf---- computer
language?

In C or Pascal, you usually have a { or begin that matches the start of
a program block. Without any loss of correct grammar parsing, you
could obviously just drop those. So they should be considered
redundant.

When you teach a grade school student to spell, or some rules of
arithmetic or whatever, you commonly do so through the process of
repetition. This act is of course redundant, since you aren't doing
anything one time you didn't do an earlier time. So is it useful for
students to repeat exercises like this even though it is redundant?
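
As a concrete instance of useful redundancy, here is a minimal sketch of
a 16-bit ones' complement checksum of the kind TCP/IP uses (the function
name and interface are illustrative):

#include <stddef.h>

static unsigned int checksum16(const unsigned char *data, size_t len)
{
    unsigned long sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)          /* sum the 16-bit words */
        sum += ((unsigned long) data[i] << 8) | data[i + 1];
    if (len % 2 != 0)                         /* pad a trailing odd byte */
        sum += (unsigned long) data[len - 1] << 8;
    while (sum >> 16)                         /* fold carries back in */
        sum = (sum & 0xFFFFUL) + (sum >> 16);
    return (unsigned int) (~sum & 0xFFFFUL);  /* final complement */
}

The receiver recomputes the same value; a mismatch means the redundant
sixteen bits have caught corruption in the data they duplicate.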

> >> > Messing up a void *
> >> > pointer will cause truly arbitrary action. The two are not comparable
> >> > by outcome.
> >>
> >> Have you ever tried *not* messing up a void * pointer? I have. It works
> >> just fine.
> >
> > Sure. Self-modifying code works fine too.

<snip>

> > But taking a step back, it occurrs to me that "it works fine" is
> > a fairly low standard for writing code.
>
> Taking a step forward again, "it works fine" is a fabulous *baseline* for
> writing code.

Bare minimum requirements are fabulous?

> [...] For example, fopen works fine, and I use it quite happily,


> without worrying that my code is somehow of a low standard just because it
> uses something that works fine.

But obfuscated code works fine too.

> [...] Using void pointers correctly does not
> imply fragile code.

Well, that's not exactly what I was saying. It becomes a place where
an error can *hide*. Fragile code is usually much better because it
breaks at the drop of a hat, and so you can isolate and debug it
easily, or with moderate testing.

With a void * pointer, you can allocate the wrong thing to it. If you
inadvertently under-allocate and you have good heap-debugging
facilities, then maybe you can get away with a short debug session. But
if you
*over* allocate and this causes you to run out of memory -- now what
are you going to do? The bug may have happened many *days* earlier
during a long run of something.
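
The kind of silent mismatch being described, in a minimal sketch (names
and sizes are illustrative):

#include <stdlib.h>

static void hidden_mismatch(size_t n)
{
    void *scratch = malloc(n * sizeof(float));  /* meant to hold doubles */
    double *d = scratch;   /* legal C: no cast needed, no diagnostic; the
                              block is too small, and nothing complains
                              until d[0..n-1] is written past the end of
                              the allocation */
    /* ... much later, the heap corruption surfaces somewhere else ... */
    free(d);
}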

> [...] Using void pointers incorrectly is a losing strategy.


> But so is using casts incorrectly. So is using fopen incorrectly. So is
> using *anything* incorrectly.

So you live in such a dichotomous universe that you can't see anything
other than black and white? Either something is wrong or it isn't, and
you are not even concerned at all with the path you take in getting
from wrong to right?

> > On the counter point, have you ever tried to debug a messed up type
> > coercion hidden by a void *?
>
> Yes - in fact I've almost certainly done so right here in comp.lang.c. And
> I've debugged screwed-up macro calls, too. So?
>
> > The real problem with it is that you
> > don't even realize that's what's gone wrong until you get deeply into
> > debugging it. And the debugger will typically be far less helpful than
> > you wished.
>
> "Think first, compute later" is always a good plan.

No, that's not a plan at all. It's a mantra, and an unachievable ideal.
People do not think with perfection, so this will inevitably lead to
manifestations of thoughtless code. How about a more serious plan: "Use
your tools to assist you in ferreting out bugs before they happen to
the greatest degree possible".

> [...] I generally don't bother


> too much with debuggers nowadays. They are occasional ports in a storm,
> that's all.

So you don't deal with large amounts of other people's code?

> >> >> >> [...] It can hide the non-inclusion of it's prototype,
> >> >> >
> >> >> > On *SOME* older generation compilers. No modern compiler fails to
> >> >> > give a warning about this regardless of the cast.
> >> >>
> >> >> So if anyone comes up with a counter-example, you can simply claim
> >> >> that it's not a "modern" compiler. ("True Scotsman" argument.)
> >> >
> >> > For development?
> >>
> >> Sure. Just because an implementation doesn't give one particular
> >> diagnostic message that Paul Hsieh thinks it should, that doesn't mean
> >> it's a Bad Compiler.
> >
> > Right ... but we're several posts into this, and you couldn't even come
> > up with one?
>
> Why bother?

Indeed.

> [...] The diagnostic message is not required by the Standard, so it


> makes no sense to me to insist to compiler-writers that they provide it.

Right. Unfortunately, I've never been able to successfully compile
anything using the standard. I usually use a compiler.

> >> [...] So trying to make C type safe is a bit like
> >> trying to make Ook! object-oriented.
> >
> > Interesting observation. If you intersect C and C++ what are you left
> > with?
>
> Either poor C, poor C++, or syntax errors.

There's some intellectual honesty for you. Incidentally, the answer is
a syntactical subset of C (but functionally equivalent to C itself).

> >> > Using the
> >> > non-casting style of malloc usage doesn't change the above scenario in
> >> > any relevant way.
> >>
> >> I agree entirely, but my point was only that your macro doesn't magically
> >> introduce type safety into a language that I prefer to think of as "type
> >> aware". To some people, type safety is a straitjacket.
> >
> > Well to some people type safety is free automated assistance.
>
> I have no problem with free automated assistance, but free automated
> dictation is another matter. Whether a pointer of type <foo> is meaningful
> when interpreted as if it were a pointer of type <bar> is something that
> I'll judge for myself.

So why have any type safety at all?

> >> > Ironically, the correct solution is to use a C++
> >> > compiler which would spit errors at you for the last line.
> >>
> >> Ironically, by introducing C++ into this argument you just explained why
> >> your suggestions about C should be treated with a pinch of salt.
> >
> > Do I smell a fundamentalist ideology?
>
> No, you smell comp.lang.c, which is about C, not C++. If you want to discuss
> C++, there's a whole nother newsgroup for that.

When did I say I wanted to discuss C++? When did I imply this? What
is leading you to this ridiculous statement? You can't read complete
sentences, that you didn't even snip out. That's a blindness very
common to fundamentalism.

> [...] And if you want to discuss


> programming in general, there's a newsgroup for that, too.

I'm not taking direction from you or anyone about where I post.

> >> p = malloc(n * sizeof *p); does not need to be told the type, so you
> >> can't get the type wrong.
> >>
> >> > but mismatching the variable and
> >> > the thing you are taking sizeof is not error-prone?
> >>
> >> There is no such mismatch in the canonical form.
> >
> > I don't know what you are talking about. You cut and paste, you change
> > the target variable and miss the sizeof variable.
>
> Oh, okay, I see what you mean.

It took this many posts?

> [...] I thought you were talking about types, not


> objects. The reason I didn't "get it" immediately is probably because I
> find it quicker to type p = malloc(n * sizeof *p); than to invoke a copy
> operation, a move operation, a paste operation, and two edits. Copy-paste
> is expensive compared to typing when the amount to be copied is low, and
> silly when the amount to be copied is high (because you're missing an
> opportunity for re-factoring).

Enter the bizarro world of Richard Heathfield's editing mind. You
usually end up doing this when you copy entire routines that are
similar in nature, but need to rework the insides a bit to match
different signatures and types. C doesn't have templates, you know.

> > Ok, you've just
> > introduced a silent error, that's many hours of debugging waiting to
> > happen.
>
> Perhaps I need more practice. I generally don't manage to make debugging
> last more than a few minutes.

Yeah, it only takes you days to understand what is meant by a
copy-paste error. You'll excuse me if I am skeptical of your claim.

Clever Monkey

unread,
Nov 29, 2006, 3:52:51 PM11/29/06
to
webs...@gmail.com wrote:
> Richard Heathfield wrote:
>> webs...@gmail.com said:
>> <snip>
>>> he's propagating the idea that Fortran is faster than C as a blanket
>>> statement (this isn't true).
>> He said no such thing.
>
> Well this is what he said: "Fortran compilers are usually even better
> than that." (responding my statement about compiler optimizers and
> warnings). You see his *attempt* at sarcasm doesn't work unless he is
> able to establish this premise. Otherwise, his statement doesn't make
> any sense at all.
>
Well, among high-performance computational folks, Fortran *is*
considered the better tool, and not just because Fortran compilers can
optimize far better than many C compilers.

There are a lot of spilt pixels out there comparing benchmarks for
typical heavy computational work.

Clever Monkey

unread,
Nov 29, 2006, 4:14:14 PM11/29/06
to
Uh, again, Fortran is a better tool for some sorts of work. This is in
part because the compilers can optimize better (i.e., due to the way
explicit pointers are implemented) and the diagnostics are more robust.

I certainly am not getting into a holy war over this, but in general it
is well accepted that Fortran emits much faster code, especially for
some sorts of computational work. And it is also generally accepted
that it is easier to make fast, optimized code without resorting to
special compilers, optimized libraries or clever optimization techniques.

There are plenty of spilt pixels out there comparing benchmarks and
discussing this stuff.

The details can be argued /ad infinitum/, but simply asserting that
Fortran might be better than C in terms of automatic optimizations and
robust diagnostics is not some crazy unfounded assumption. It reflects
a fair amount of scholarly evidence and years of experience.

Richard Heathfield

unread,
Nov 29, 2006, 4:23:45 PM11/29/06
to
webs...@gmail.com said:

> Richard Heathfield wrote:
>> webs...@gmail.com said:
>> > Richard Heathfield wrote:
>> >> webs...@gmail.com said:
>> <snip>
>> >> > Superfluous structure that
>> >> > your compiler is going to strip out of the object code anyways has
>> >> > no negative impact on correctness or performance.
>> >>
>> >> If it's superfluous (your word, not mine, but I agree that it is
>> >> appropriate here), you might as well leave it out.
>> >
>> > You mean like you should never comment your code because comments are
>> > superfluous?
>>
>> I don't agree that comments are superfluous. *You* said the structure was
>> superfluous, and "superfluous" means "redundant, unnecessary" (look in a
>> dictionary if you don't believe me).
>
> Huh? I don't have a problem with the definition. Comments describe
> what the code is doing (redundant, since the source itself does that)

Good comments do more than merely describe what the code is doing - they
describe /why/ the code is doing it. They summarise, explain, and inform,
at a level that is not constrained by syntax rules. They also record other
useful information (e.g. algorithm sources, author info, and the like) that
cannot reasonably be shoehorned into the C code itself.

> and are ignored by the compiler (unnecessary).

"Ignored by the compiler" and "unnecessary" are two very different concepts.
The one does not imply the other.

> So what is your problem?

I'm not the one with the problem.

> (Note that this does not imply that Comments are a bad thing.)

Noted.

>> > (Certainly, their value is at best subjective.)
>> > Sometimes redundancy is useful -- this is the lesson of structured
>> > programming.
>>
>> If it's useful, how can it be redundant?
>
> Well let's just pause and think about this for a second.
>
> Most CPU caches have parity bits, or ECC which are *REDUNDANT* to the
> raw data its already carrying. So the question is, how can parity bits
> or ECC be useful? Perhaps we need a research project to figure that
> out. You'll find the same thing on hard disks and CD Roms.

Parity bits are not redundant. They act as a check on the integrity of the
data.

> TCP/IP uses a ones completement checksum on its packets. This checksum
> is obviously derivable from the rest of the data its delivering.

Same example, dressed in different clothes. Same answer.

> There are well known algorithms, in fact called "Cyclic Redundancy
> Check".

Sounds like a misnomer to me.

<snip>



> In C or Pascal, you usually have a { or begin that matches the start of
> a program block. Without any loss of correct grammar parsing, you
> could obviously just drop those. So they should be considered
> redundant.

From another perspective, however, the language requires them to be present
in a correct program, and so they are far from redundant.

<Lots of silly stuff snipped - so silly that no comment other than this is
necessary>

>> [...] Using void pointers correctly does not
>> imply fragile code.
>
> Well that's not exactly what I was saying. It becomes a place were an
> error can *hide*. Fragile code is usually much better because it
> breaks at the drop of a hat, and so you can isolate and debug it
> easily, or with moderate testing.

If you prefer fragile code, that's up to you. I prefer a bit more
robustness.

>> [...] Using void pointers incorrectly is a losing strategy.
>> But so is using casts incorrectly. So is using fopen incorrectly. So is
>> using *anything* incorrectly.
>
> So you live in such a dichometric universe that you can't see anything
> other than black and white? Either something is wrong or it isn't, and
> you are not even concerned at all with the path you take in getting
> from wrong to right?

I can see things in many colours, but sometimes things /are/ black and
white. Now, the whole void * thing is not actually black and white, because
there are many people who aren't necessarily going to use them properly,
and perhaps such people - if they're not prepared to learn how to use them
properly - would be better off avoiding them. Personally, I think it's
better to learn how to use them properly.

>> > The real problem with it is that you
>> > don't even realize that's what's gone wrong until you get deeply into
>> > debugging it. And the debugger will typically be far less helpful than
>> > you wished.
>>
>> "Think first, compute later" is always a good plan.
>
> No, that's not a plan at all. Its a mantra, and an unachievable ideal.

Thinking before computing is unachievable? I cannot agree with that.

> People do not think with perfection, so this will inevitably lead to
> manifestions of thoughtless code.

Now who's thinking in black and white? No, the imperfection of people's
thought will not inevitably lead to manifestations of thoughtless code, but
rather to manifestations of code written by a less than perfect thinker.

> How about a more serious plan: "Use
> your tools to assist you in ferreting out bugs before they happen to
> the greatest degree possible".

Provided they don't get in my way, sure. But that means dropping the "to the
greatest degree possible" bit. The greatest degree possible is "don't write
the program", which is a good indication of where an extreme will take you.

>
>> [...] I generally don't bother
>> too much with debuggers nowadays. They are occasional ports in a storm,
>> that's all.
>
> So you don't deal with large amounts of other people's code?

ROTFL! Yes, I deal with large amounts of other people's code. No, I don't
often use a debugger when doing so. Sometimes, yes, but usually, no.

<snip>

>> >> [...] To some people, type safety is a straitjacket.


>> >
>> > Well to some people type safety is free automated assistance.
>>
>> I have no problem with free automated assistance, but free automated
>> dictation is another matter. Whether a pointer of type <foo> is
>> meaningful when interpreted as if it were a pointer of type <bar> is
>> something that I'll judge for myself.
>
> So why have any type safety at all?

I view type safety as a guide, rather than a dictator. Guides can be useful.

>> >> > Ironically, the correct solution is to use a C++
>> >> > compiler which would spit errors at you for the last line.
>> >>
>> >> Ironically, by introducing C++ into this argument you just explained
>> >> why your suggestions about C should be treated with a pinch of salt.
>> >
>> > Do I smell a fundamentalist ideology?
>>
>> No, you smell comp.lang.c, which is about C, not C++. If you want to
>> discuss C++, there's a whole nother newsgroup for that.
>
> When did I say I wanted to discuss C++? When did I imply this? What
> is leading you to this ridiculous statement?

Your words:

>> >> > Ironically, the correct solution is to use a C++
>> >> > compiler which would spit errors at you for the last line.

> You can't read complete


> sentences, that you didn't even snip out.

See above.

> That's a blindness very common to fundamentalism.

Who is the fundamentalist here?

>> [...] And if you want to discuss
>> programming in general, there's a newsgroup for that, too.
>
> I'm not taking direction from you or anyone about where I post.

Evidently.

>> >> p = malloc(n * sizeof *p); does not need to be told the type, so you
>> >> can't get the type wrong.
>> >>
>> >> > but mismatching the variable and
>> >> > the thing you are taking sizeof is not error-prone?
>> >>
>> >> There is no such mismatch in the canonical form.
>> >
>> > I don't know what you are talking about. You cut and paste, you change
>> > the target variable and miss the sizeof variable.
>>
>> Oh, okay, I see what you mean.
>
> It took this many posts?

It'll take a great many more, it seems, before *you* see what *I* mean, so I
guess I'm ahead of the game.

Malcolm

unread,
Nov 29, 2006, 4:24:40 PM11/29/06
to
<webs...@gmail.com> wrote in message

> Enter the bizzaro world of Richard Heathfield's editting mind. You
> usually end up doing this when you copy entire routines that are
> similar in nature, but need to rework the insides of it a bit to match
> different signatures and types. C doesn't have templates you know.
>
Join my campaign for 64-bit ints.
Then there will be no need for templates, since all numbers (well, integers)
will be represented in the same way.
--
www.personal.leeds.ac.uk/~bgy1mm
freeware games to download.

Richard Heathfield

unread,
Nov 29, 2006, 4:34:00 PM11/29/06
to
Malcolm said:

> Join my campaign for 64-bit ints.
> Then there will be no need for templates, since all numbers (well,
> integers) will be represented in the same way.

Even 18446744073709551617 ?

webs...@gmail.com

unread,
Nov 29, 2006, 4:53:13 PM11/29/06
to
Clever Monkey wrote:
> webs...@gmail.com wrote:
> > Richard Heathfield wrote:
> >> The possibility that he's right and you're either wrong or misinterpreting
> >> what he's said.
> >
> > Uh ... he said Fortran was better than C (at optimization and/or
> > diagnostics). No matter how much you delete this quote, he still said
> > it. If he can make a strong case for the diagnostics, then I will
> > concede that he just wasn't being clear (but bizarrely intentionally
> > so). As for the optimizations, he's barking up the wrong tree if he
> > wants to try to present that case to me.
>
> Uh, again, Fortran is a better tool for some sorts of work. This is in
> part because the compilers can optimize better (i.e., due to the way
> explicit pointers are implemented) and the diagnostics are more robust.

If you can make a strong case for the diagnostics, then fine. Fortran
doesn't have type-specific problems, so what precisely is Fortran
bringing to the table in terms of diagnostics? Does it detect and
notify the programmer about numerically unstable code, or what? (I am
familiar with Fortran compilers only through the code they generate; I'm
not much of a practitioner of the language itself.)

> I certainly am not getting into a holy war over this, but in general it
> is well accepted that Fortran emits much faster code, especially for
> some sorts of computational work. And it is also generally accepted
> that it is easier to make fast, optimized code without resorting to
> special compilers, optimized libraries or clever optimization techniques.

Both Intel and Microsoft now support auto-vectorizers in their C
compilers. Both compilers support a "no aliasing" flag and Intel
supports restrict. These compilers are not special (they are both
mainstream) and there are no special techniques. The code should
appear comparable to the equivalent Fortran code.
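
For illustration, here is a minimal sketch of the kind of loop this is
about, assuming a C99 compiler; the function name and signature are
invented for the example. With the restrict qualifiers (or a compiler's
"assume no aliasing" switch) an auto-vectorizer may treat the arrays as
non-overlapping, which is the guarantee a Fortran compiler gets for free
on its dummy array arguments:

#include <stddef.h>

/* saxpy-style loop: y[i] += a * x[i].  The restrict qualifiers promise
 * that x and y do not overlap, so the loop can be vectorized freely. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    size_t i;

    for (i = 0; i < n; i++) {
        y[i] += a * x[i];
    }
}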

> There are plenty of split pixels out there comparing benchmarks and
> discussing this stuff.
>
> The details can be argued /ad infinitum/, but simply asserting that
> Fortran might be better than C in terms of automatic optimizations and
> robust diagnostics is not some crazy unfounded assumption. It reflects
> a fair amount of scholarly evidence and years of experience.

This represents obsolete experience. The clearest example of this is
the x86 platform. The fastest Fortran compilers come from Intel (or
near fastest; I don't know exactly where the latest Lahey or PathScale
compilers stand). However, this compiler uses the same back end for
compiling both C and Fortran. It is crucial to observe that the Intel
compiler uses the identical intermediate => vector translators for both
languages. From a language point of view, the only feature C lacks is a
built-in answer to the aliasing problem. However, Intel includes both a
"no alias" flag as well as the "restrict" keyword from C99. But once
again, both languages eventually translate the "no alias" guarantee
equivalently down to the intermediate language. Thus this leaves no
room for Fortran to outperform C; you can always write your C code so
that it can leverage any optimization technique available to the
Fortran front end.

Now on the flip side, things are not equal. Fortran has a very
different interface to its integer semantics. For example, its integer
shift is highly generalized to include negative and positive shifting
(and I believe it has to correctly saturate as well). The Fortran
compiler has no opportunity to assume whether the shift count variable
is positive or negative. Thus what is rendered as a single and very
parallelizable instruction in C ends up taking about 8 instructions in
Fortran. Fortran also does not have a concept of pointers or unions.
Thus for certain data structures where those are the optimal ways of
implementing them, in Fortran you are forced to implement
"work-a-likes" (i.e., pretend an array is a heap, and just separate
variables by storage even if they never have overlapping lifetimes)
that the compiler is unlikely to be able to simplify down to the C
equivalent.
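
As a rough illustration of the shift point (a sketch only; the function
names are invented, and the emulation assumes the usual Fortran ISHFT
convention that a negative count means a right shift with zero fill),
the plain C operator maps onto a single machine shift, while matching
the Fortran semantics requires a run-time test on the sign:

#include <limits.h>

/* Plain C shift: the count is assumed to be non-negative and within
 * range, so this is typically a single machine instruction. */
unsigned int shift_left(unsigned int v, unsigned int count)
{
    return v << count;
}

/* Sketch of Fortran's ISHFT semantics for an unsigned value: a positive
 * count shifts left, a negative count shifts right, zeros are shifted
 * in.  The sign test and the guard against shifting by the full width
 * are the extra work described above.  Assumes |count| is no larger
 * than the width, as Fortran itself requires. */
unsigned int ishft_emulated(unsigned int v, int count)
{
    int width = (int)(sizeof v * CHAR_BIT);

    if (count >= 0) {
        return count >= width ? 0 : v << count;
    }
    return -count >= width ? 0 : v >> -count;
}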

With any objective analysis, the details don't look too good for
Fortran. The places where it had an advantage in the past were of a
historical and engineering-effort nature. Vectorizers have been ported
to C compilers nowadays, so the major ace up Fortran's sleeve is gone.


Realistically one cannot support a case that suggests that Fortran
continues to be a language that is faster than C. It used to be true
for pure array/floating point, and it was never true for integer or
data structure code. Now it's no longer true for array/floating point
code.

Old Wolf

unread,
Nov 29, 2006, 5:48:32 PM11/29/06
to
webs...@gmail.com wrote:
> > If it's useful, how can it be redundant?
>
> Well let's just pause and think about this for a second.

The problem is that Paul Hsieh thinks "redundant" means "useless".

In fact, the English word "redundant" means exactly that, in the
minds of many people.

But in computer science, "redundant" means that it performs a
function which is already being performed by something else.
This may or may not be useless.

For example, companies pay for expensive servers which are
entirely redundant. This is so that if the main server dies then
the redundant one can become the main one.

(I'm sure you know all this, but it seems to have spawned
several long messages trying to resolve it).

Keith Thompson

unread,
Nov 29, 2006, 8:17:28 PM11/29/06
to
"Malcolm" <regn...@btinternet.com> writes:
[...]

> Join my campaign for 64-bit ints.
> Then there will be no need for templates, since all numbers (well, integers)
> will be represented in the same way.

No.

Beliavsky

unread,
Nov 29, 2006, 9:07:51 PM11/29/06
to

webs...@gmail.com wrote:

<snip>

> Now on the flip side, things are not equal. Fortran has a very
> different interface to its integer semantics. For example, its integer
> shift is highly generalized to include negative and positive shifting
> (and I believe it has to correctly saturate as well). The Fortran
> compiler has no opportunity to assume whether the shift count
> variable is positive or negative. Thus what is rendered as a
> single and very parallelizable instruction in C ends up taking about 8
> instructions in Fortran. Fortran also does not have a concept of
> pointers or unions.

Fortran 90 and later versions do have pointers, but they differ from
those of C.

<snip>

> WIth any objective analysis, the details don't look too good for
> Fortran. The places where it had an advantage in the past, were of a
> historical and engineering effort nature. Vectorizors have been ported
> to C compilers nowadays, so the major ace up Fortran's sleeve is gone.

Fortran 2003 is a higher-level language than C (this does not
necessarily mean better), especially in its handling of
multidimensional arrays, and I think its real competition is C++ among
compiled programming languages and Matlab/Octave/Scilab and
Python+Numpy among interpreted languages.

webs...@gmail.com

unread,
Nov 29, 2006, 9:13:11 PM11/29/06
to
Old Wolf wrote:
> webs...@gmail.com wrote:
> > > If it's useful, how can it be redundant?
> >
> > Well let's just pause and think about this for a second.
>
> The problem is that Paul Hsieh thinks "redundant" means "useless".

Please read the thread attributions more carefully. You are targeting
the wrong person.

> In fact, the English word "redundant" means exactly that, in the
> minds of many people.
>
> But in computer science, "redundant" means that it performs a
> function which is already being performed by something else.
> This may or may not be useless.

This is precisely the point I am making. Exactly. You have just read
the attributions of the threads incorrectly. Redundancy is, in fact, a
feature. People often pay an extremely high premium for it. Go tell
that to the other guy.

webs...@gmail.com

unread,
Nov 29, 2006, 9:51:05 PM11/29/06
to
Richard Heathfield wrote:
> webs...@gmail.com said:
> > Richard Heathfield wrote:
> >> webs...@gmail.com said:
> >> > Richard Heathfield wrote:
> >> >> webs...@gmail.com said:
> >> <snip>
> >> >> > Superfluous structure that
> >> >> > your compiler is going to strip out of the object code anyways has
> >> >> > no negative impact on correctness or performance.
> >> >>
> >> >> If it's superfluous (your word, not mine, but I agree that it is
> >> >> appropriate here), you might as well leave it out.
> >> >
> >> > You mean like you should never comment your code because comments are
> >> > superfluous?
> >>
> >> I don't agree that comments are superfluous. *You* said the structure was
> >> superfluous, and "superfluous" means "redundant, unnecessary" (look in a
> >> dictionary if you don't believe me).
> >
> > Huh? I don't have a problem with the definition. Comments describe
> > what the code is doing (redundant, since the source itself does that)
>
> Good comments do more than merely describe what the code is doing - they
> describe /why/ the code is doing it. They summarise, explain, and inform,
> at a level that is not constrained by syntax rules. They also record other
> useful information (e.g. algorithm sources, author info, and the like) that
> cannot reasonably be shoehorned into the C code itself.

In other words, they are a redundant and unnecessary re-expression of the
algorithm (i.e., technically superfluous) which happens to also serve
another purpose. Just because you happen to have another purpose for
them doesn't relieve them of their redundancy status.

> >> > (Certainly, their value is at best subjective.)
> >> > Sometimes redundancy is useful -- this is the lesson of structured
> >> > programming.
> >>
> >> If it's useful, how can it be redundant?
> >
> > Well let's just pause and think about this for a second.
> >
> > Most CPU caches have parity bits, or ECC, which are *REDUNDANT* to the
> > raw data they're already carrying. So the question is, how can parity bits
> > or ECC be useful? Perhaps we need a research project to figure that
> > out. You'll find the same thing on hard disks and CD-ROMs.
>
> Parity bits are not redundant. They act as a check on the integrity of the
> data.

Straight from the school of Frank Luntz and his ilk. They check the
integrity of the data *AND* they are redundant. That's the whole
fricking point! The redundancy itself is the feature. If they weren't
redundant, then they wouldn't be serving their intended purpose.

> > TCP/IP uses a ones' complement checksum on its packets. This checksum
> > is obviously derivable from the rest of the data it's delivering.
>
> Same example, dressed in different clothes. Same answer.
>
> > There are well known algorithms, in fact called "Cyclic Redundancy
> > Check".
>
> Sounds like a misnomer to me.

You think it's misnamed? Are you crazy? It's a very precise description
of what it is.
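
(For concreteness, here is a rough sketch of the ones' complement
checksum idea mentioned above, in the style of the Internet checksum of
RFC 1071; the function name is invented for the example. The result
carries no information that is not already derivable from the data,
i.e. it is redundant by construction, and that redundancy is exactly
what lets a receiver detect corruption.)

#include <stddef.h>
#include <stdint.h>

/* Ones' complement sum of the data taken as 16-bit big-endian words.
 * The checksum duplicates information already in the buffer -- pure
 * redundancy -- which is what makes corruption detectable. */
uint16_t inet_checksum(const unsigned char *data, size_t len)
{
    uint32_t sum = 0;

    while (len > 1) {
        sum += (uint32_t)data[0] << 8 | data[1];
        data += 2;
        len -= 2;
    }
    if (len == 1) {              /* pad an odd trailing byte with zero */
        sum += (uint32_t)data[0] << 8;
    }
    while (sum >> 16) {          /* fold the carries back in */
        sum = (sum & 0xFFFF) + (sum >> 16);
    }
    return (uint16_t)~sum;
}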

> <snip>
>
> > In C or Pascal, you usually have a { or begin that matches the start of
> > a program block. Without any loss of correct grammar parsing, you
> > could obviously just drop those. So they should be considered
> > redundant.
>
> From another perspective, however, the language requires them to be present
> in a correct program, and so they are far from redundant.
>
> <Lots of silly stuff snipped - so silly that no comment other than this is
> necessary>

You of course snipped the different examples, instead of the ones that
you said were repeats.

> >> [...] Using void pointers correctly does not
> >> imply fragile code.
> >
Well that's not exactly what I was saying. It becomes a place where an
> > error can *hide*. Fragile code is usually much better because it
> > breaks at the drop of a hat, and so you can isolate and debug it
> > easily, or with moderate testing.
>
> If you prefer fragile code, that's up to you. I prefer a bit more
> robustness.

I prefer code with perfect robustness. If code is fragile, I will get
to that goal faster. If code has hidden errors, then even though you
could technically call it more robust than fragile code, that's worse.
Any serious engineer will always prefer the nice hard crash at the
least provocation to heisenbugs or errors encoded in pseudo-correct
code that is just dying on some inadvertent semantic.

> >> > The real problem with it is that you
> >> > don't even realize that's what's gone wrong until you get deeply into
> >> > debugging in. And the debugger will typically be far less helpful than
> >> > you wished.
> >>
> >> "Think first, compute later" is always a good plan.
> >
> > No, that's not a plan at all. Its a mantra, and an unachievable ideal.
>
> Thinking before computing is unachievable? I cannot agree with that.

It's not practically achievable to have every line of code thoughtfully
considered, unless you are OK with 3 lines of code produced a day or
something like that. And this sort of thing does not mask the greater
issue of normal error rates.

> > People do not think with perfection, so this will inevitably lead to
> > manifestions of thoughtless code.
>
> Now who's thinking in black and white? No, the imperfection of people's
> thought will not inevitably lead to manifestations of thoughtless code, but
> rather to manifestations of code written by a less than perfect thinker.

From a code production point of view, this is not a distinction with
any relevance. How a bug gets into your code is far less important
than *if* the bug gets in there.

> > How about a more serious plan: "Use
> > your tools to assist you in ferreting out bugs before they happen to
> > the greatest degree possible".
>
> Provided they don't get in my way, sure.

Yeah, and your mindless obstinacy is really conducive to this.

> [...] But that means dropping the "to the
> greatest degree possible" bit. The greatest degree possible is "don't write
> the program", which is a good indication of where an extreme will take you.

Anything to twist words to mean things I clearly cannot possibly mean.
Casting malloc doesn't inhibit your ability to program.

> >> >> > Ironically, the correct solution is to use a C++
> >> >> > compiler which would spit errors at you for the last line.
> >> >>
> >> >> Ironically, by introducing C++ into this argument you just explained
> >> >> why your suggestions about C should be treated with a pinch of salt.
> >> >
> >> > Do I smell a fundamentalist ideology?
> >>
> >> No, you smell comp.lang.c, which is about C, not C++. If you want to
> >> discuss C++, there's a whole nother newsgroup for that.
> >
> > When did I say I wanted to discuss C++? When did I imply this? What
> > is leading you to this ridiculous statement?
>
> Your words:
>
> >> >> > Ironically, the correct solution is to use a C++
> >> >> > compiler which would spit errors at you for the last line.

Ok, I don't see the part where I say I want to discuss C++. Nor is it
implied.

> > You can't read complete
> > sentences, that you didn't even snip out.
>
> See above.

I see it -- I wrote it, and I remember what I wrote. I have not
suggested the discussion of C++ here.

> > That's a blindness very common to fundamentalism.
>
> Who is the fundamentalist here?

The one that puts forth ideas that don't match ordinary parsing of
facts. You suggested redundancy means something other than repetition
(you suggested it meant non-uselessness), and you don't see a
distinction between C++ compilers and the C++ language.

pete

unread,
Nov 30, 2006, 1:31:40 AM11/30/06
to
Tonio Cartonio wrote:
>
> I have to read characters from stdin and save them in a string. The
> problem is that I don't know how much characters will be read.

/* BEGIN line_to_string.c */

#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
#include <string.h>

struct list_node {
    struct list_node *next;
    void *data;
};

int line_to_string(FILE *fp, char **line, size_t *size);
int list_fputs(FILE *stream, struct list_node *node);
void list_free(struct list_node *node, void (*free_data)(void *));
struct list_node *string_node(struct list_node **head,
                              struct list_node *tail,
                              char *data);

int main(void)
{
    struct list_node *head, *tail;
    int rc;
    char *buff_ptr;
    size_t buff_size;
    long unsigned line_count;

#if 1
    buff_size = 0;
    buff_ptr = NULL;
#else
    buff_size = 100;
    buff_ptr = malloc(buff_size);
    if (buff_ptr == NULL) {
        puts("malloc trouble!");
        exit(EXIT_FAILURE);
    }
#endif

    tail = head = NULL;
    line_count = 0;
    puts(
        "\nThis program makes and prints a list of all the lines\n"
        "of text entered from standard input.\n"
        "Just hit the Enter key to end,\n"
        "or enter any line of characters to continue."
    );
    while ((rc = line_to_string(stdin, &buff_ptr, &buff_size)) > 1) {
        ++line_count;
        tail = string_node(&head, tail, buff_ptr);
        if (tail == NULL) {
            break;
        }
        puts(
            "\nJust hit the Enter key to end,\n"
            "or enter any other line of characters to continue."
        );
    }
    switch (rc) {
    case EOF:
        if (buff_ptr != NULL && strlen(buff_ptr) > 0) {
            puts("rc equals EOF\nThe string in buff_ptr is:");
            puts(buff_ptr);
            ++line_count;
            tail = string_node(&head, tail, buff_ptr);
        }
        break;
    case 0:
        puts("realloc returned a null pointer value");
        if (buff_size > 1) {
            puts("rc equals 0\nThe string in buff_ptr is:");
            puts(buff_ptr);
            ++line_count;
            tail = string_node(&head, tail, buff_ptr);
        }
        break;
    default:
        break;
    }
    if (line_count != 0 && tail == NULL) {
        puts("Node allocation failed.");
        puts("The last line entered didn't make it onto the list:");
        puts(buff_ptr);
    }
    free(buff_ptr);
    puts("\nThe line buffer has been freed.\n");
    printf("%lu lines of text were entered.\n", line_count);
    puts("They are:\n");
    list_fputs(stdout, head);
    list_free(head, free);
    puts("\nThe list has been freed.\n");
    return 0;
}

/*
 * Read one line from fp into *line, growing the buffer with realloc as
 * needed.  Returns EOF at end of file, 0 if realloc fails, and
 * otherwise the number of characters read, counting the newline (so a
 * bare newline returns 1, which is why the caller loops while > 1).
 */
int line_to_string(FILE *fp, char **line, size_t *size)
{
    int rc;
    void *p;
    size_t count;

    count = 0;
    while ((rc = getc(fp)) != EOF) {
        ++count;
        if (count + 2 > *size) {        /* grow to hold this char + '\0' */
            p = realloc(*line, count + 2);
            if (p == NULL) {            /* out of memory */
                if (*size > count) {    /* keep the char if it fits */
                    (*line)[count] = '\0';
                    (*line)[count - 1] = (char)rc;
                } else {                /* otherwise push it back */
                    ungetc(rc, fp);
                }
                count = 0;
                break;
            }
            *line = p;
            *size = count + 2;
        }
        if (rc == '\n') {
            (*line)[count - 1] = '\0';
            break;
        }
        (*line)[count - 1] = (char)rc;
    }
    if (rc != EOF) {
        rc = count > INT_MAX ? INT_MAX : count;
    } else {
        if (*size > count) {
            (*line)[count] = '\0';
        }
    }
    return rc;
}

void list_free(struct list_node *node, void (*free_data)(void *))
{
    struct list_node *next_node;

    while (node != NULL) {
        next_node = node -> next;
        free_data(node -> data);
        free(node);
        node = next_node;
    }
}

int list_fputs(FILE *stream, struct list_node *node)
{
    while (node != NULL) {
        if (fputs(node -> data, stream) == EOF
            || putc('\n', stream) == EOF)
        {
            break;
        }
        node = node -> next;
    }
    return node == NULL ? '\n' : EOF;
}

struct list_node *string_node(struct list_node **head,
                              struct list_node *tail,
                              char *data)
{
    struct list_node *node;

    node = malloc(sizeof *node);
    if (node != NULL) {
        node -> next = NULL;
        node -> data = malloc(strlen(data) + 1);
        if (node -> data != NULL) {
            if (*head == NULL) {
                *head = node;
            } else {
                tail -> next = node;
            }
            strcpy(node -> data, data);
        } else {
            free(node);
            node = NULL;
        }
    }
    return node;
}

/* END line_to_string.c */


--
pete

Richard Heathfield

unread,
Nov 30, 2006, 1:52:14 AM11/30/06
to
webs...@gmail.com said:
> Richard Heathfield wrote:
>> webs...@gmail.com said:
>> >
>> > Huh? I don't have a problem with the definition. Comments describe
>> > what the code is doing (redundant, since the source itself does that)
>>
>> Good comments do more than merely describe what the code is doing - they
>> describe /why/ the code is doing it. They summarise, explain, and inform,
>> at a level that is not constrained by syntax rules. They also record
>> other useful information (e.g. algorithm sources, author info, and the
>> like) that cannot reasonably be shoehorned into the C code itself.
>
> In other words they are a redundant and unnecessary reexpression of the
> algorithm

Learn to read. Good day, sir.

Richard Bos

unread,
Nov 30, 2006, 2:07:24 AM11/30/06
to
"Old Wolf" <old...@inspire.net.nz> wrote:

> webs...@gmail.com wrote:
> > > If it's useful, how can it be redundant?
> >
> > Well let's just pause and think about this for a second.
>
> The problem is that Paul Hsieh thinks "redundant" means "useless".
>
> In fact, the English word "redundant" means exactly that, in the
> minds of many people.

Those many people are just as wrong as Paul, then.

Richard

CBFalconer

unread,
Nov 30, 2006, 2:58:33 AM11/30/06
to
pete wrote:
> Tonio Cartonio wrote:
>
>> I have to read characters from stdin and save them in a string. The
>> problem is that I don't know how much characters will be read.
>
> /* BEGIN line_to_string.c */
>
... snip 180 lines of code ...

I think the heart code of ggets is somewhat simpler. See:

<http://cbfalconer.home.att.net/download/>

#include <stdio.h>
#include <stdlib.h>
#include "ggets.h"

#define INITSIZE 112 /* power of 2 minus 16, helps malloc */
#define DELTASIZE (INITSIZE + 16)

enum {OK = 0, NOMEM};

int fggets(char* *ln, FILE *f)
{
    int cursize, ch, ix;
    char *buffer, *temp;

    *ln = NULL; /* default */
    if (NULL == (buffer = malloc(INITSIZE))) return NOMEM;
    cursize = INITSIZE;

    ix = 0;
    while ((EOF != (ch = getc(f))) && ('\n' != ch)) {
        if (ix >= (cursize - 1)) { /* extend buffer */
            cursize += DELTASIZE;
            if (NULL == (temp = realloc(buffer, (size_t)cursize))) {
                /* ran out of memory, return partial line */
                buffer[ix] = '\0';
                *ln = buffer;
                return NOMEM;
            }
            buffer = temp;
        }
        buffer[ix++] = ch;
    }
    if ((EOF == ch) && (0 == ix)) {
        free(buffer);
        return EOF;
    }

    buffer[ix] = '\0';
    if (NULL == (temp = realloc(buffer, (size_t)ix + 1))) {
        *ln = buffer; /* without reducing it */
    }
    else *ln = temp;
    return OK;
} /* fggets */

Richard Bos

unread,
Nov 30, 2006, 7:25:01 AM11/30/06
to
webs...@gmail.com wrote:

> CBFalconer wrote:
> > If you use the recommended:
> >
> > <var> = malloc(<count> * sizeof *<var>);
> >
> > you need no casts, and the exact type is enforced without any
> > concealment behind obfuscating macros or whatever.


>
> Well, this still has the potential for cut and paste errors unless you
> macrofy the whole line.

Unless, of course, you're clever enough not to cut and paste.

> If you don't macrofy, then you risk error no matter what.

Heh. The typical complaint of a sorry typist.

> So let us take a more serious approach compare macros which prevent any
> mismatch errors:
>
> #define scaredOfCPlusPlus(var,count) var = malloc(count*sizeof *var)

Gosh, what a sane name for a macro. You must be a popular cow-orker.

> #define newThing(type,count) (type *) malloc (count * sizeof (type))


>
> So you can say var = newThing(char *, 512), and if the type is wrong,
> the compiler tells you.

Right. But now change the type of your pointer. Say, from a LinkedList *
to a Binary_Tree *. Happens, you know. Programs evolve. So do data sets.
Some programmers, apparently, never, alas. But the clever ones program
for maintainability, not for not-having-to-think-up-frontness.
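
To make that maintenance scenario concrete, here is a small sketch using
the two macros quoted above (reproduced with parentheses added; the
struct names are invented for the example):

#include <stdlib.h>

#define scaredOfCPlusPlus(var, count) ((var) = malloc((count) * sizeof *(var)))
#define newThing(type, count)         ((type *)malloc((count) * sizeof(type)))

struct binary_tree {
    struct binary_tree *left, *right;
    int value;
};

int main(void)
{
    /* Suppose this declaration used to read "struct linked_list *nodes;"
     * and was later changed to a tree. */
    struct binary_tree *nodes;

    /* Tracks whatever "nodes" is declared as; this line needed no edit
     * when the declaration changed. */
    scaredOfCPlusPlus(nodes, 512);
    free(nodes);

    /* Has to be edited in step with the declaration; if it still named
     * the old type, the compiler would complain about the pointer
     * assignment, and you would have to touch every such call site. */
    nodes = newThing(struct binary_tree, 512);
    free(nodes);

    return 0;
}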

> The scaredOfCPlusPlus(,) macro works fine, but doesn't look familiar,

And this would be why, again? Oh, right, because good programmers don't
abuse the preprocessor like that.

> And, of course, the real difference is that the first compiles straight
> in C++, and the second is just an error.

That's a pretty good argument in comp.lang.c++. Guess where we're not?

> I have found that in general the C++ optimizers and warnings are better
> for the C++ mode of my compilers than the C mode.

Gosh, you think? C++ better at C++ than other languages - film at 11:00.


All your arguments make perfect sense _if_, and only if, you start by
accepting your premises that C++ is a better language than C, that all
programmers are stupid, and that getting a program to compile is more
important than getting it to work. Wise programmers accept none of those
premises.

Richard

Eric Sosman

unread,
Nov 30, 2006, 9:25:42 AM11/30/06
to
Richard Bos wrote:

<off-topic distance="extreme">

No, they're just four centuries behind the times. "Redundant"
did not always carry the implication of "unnecessary," but only of
"repeated," or more prosaically "iterated." John Milton described
the serpent of Eden as a dazzlingly beautiful creature (how else
could it have tempted Eve?), with its coils "floating redundant" in
glistening display. He did not mean by this that the serpent had
too many coils, or more coils than it needed, but only that it had
lots of coils.

(This factoid came to me by way of the story "The Djinn in the
Nightingale's Eye" by A.S. Byatt, a tale as delightfully beautiful
as the serpent it mentions. Much more pleasant to read than the
Standard, I promise.)

</off-topic>

--
Eric Sosman
eso...@acm-dot-org.invalid

pete

unread,
Nov 30, 2006, 10:29:40 AM11/30/06
to
CBFalconer wrote:
>
> pete wrote:
> > Tonio Cartonio wrote:
> >
> >> I have to read characters from stdin and save them in a string. The
> >> problem is that I don't know how much characters will be read.
> >
> > /* BEGIN line_to_string.c */
> >
> ... snip 180 lines of code ...
>
> I think the heart code of ggets is somewhat simpler. See:
>
> <http://cbfalconer.home.att.net/download/>

Maybe, but they work differently.
ggets allocates a new buffer every time that it's called.

while (0 == fggets(&line, infile)) {
    fprintf(stderr, "%4d %4d\n", ++cnt, (int)strlen(line));
    (void)puts(line);
    free(line);
}

line_to_string was designed to be called from a loop.
The number of allocation calls made within line_to_string
while reading a text file,
is a function of the length of the longest line of the text file
and completely independent of how many lines are in the file.

while ((rc = line_to_string(stdin, &buff_ptr, &buff_size)) > 1) {
    ++line_count;
    tail = string_node(&head, tail, buff_ptr);
    if (tail == NULL) {
        break;
    }
    puts(
        "\nJust hit the Enter key to end,\n"
        "or enter any other line of characters to continue."
    );
}

If there are a zillion lines in a text file
and the longest line is only 100 bytes,
then line_to_string will only call realloc 100 times,
if the initial values of buff_ptr and buff_size
are NULL and 0.

I've rewritten main().
If INITIAL_BUFFER_SIZE were to be defined as 100,
then to read the same zillion line text file mentioned above,
malloc would be called only once,
and realloc would not be called at all.

#define INITIAL_BUFFER_SIZE 0 /* Can be any number */

int main(void)
{
    struct list_node *head, *tail;
    int rc;
    char *buff_ptr;
    size_t buff_size;
    long unsigned line_count;

    buff_size = INITIAL_BUFFER_SIZE;
    buff_ptr = malloc(buff_size);
    if (buff_ptr == NULL && buff_size != 0) {
        puts("malloc trouble!");
        exit(EXIT_FAILURE);
    }

    tail = head = NULL;
    line_count = 0;
    puts(
        "\nThis program makes and prints a list of all the lines\n"
        "of text entered from standard input.\n"
        "Just hit the Enter key to end,\n"
        "or enter any line of characters to continue."
    );
    while ((rc = line_to_string(stdin, &buff_ptr, &buff_size)) > 1) {

--
pete

pete

unread,
Nov 30, 2006, 10:55:17 AM11/30/06
to
Eric Sosman wrote:

> "Redundant" did not always carry the implication of
> "unnecessary," but only of
> "repeated," or more prosaically "iterated."

http://www.google.com/search?hl=en&ie=ISO-8859-1&q=%22redundant+backup+systems%22

Results 1 - 10 of about 959 for "redundant backup systems". (0.19
seconds)

--
pete

Clever Monkey

unread,
Nov 30, 2006, 11:34:53 AM11/30/06
to
"... without resorting to special compilers, optimized libraries or
clever optimization techniques."

They are special in the sense that they are specific implementations
intended for a specific audience. The point of Fortran was that
ordinary code written in a portable fashion should perform reasonably
well under most implementations.

I'm not getting involved in this holy-war. Use the best tool for the
job. If you need the kind of grunt required to run non-trivial math
over the course of days or weeks, do your own benchmarks.

My only point was that it is not crazy to make the statement that
Fortran may emit code that performs better than the equivalent algorithm
implemented in C. In general, this has been true. Whether or not you
can find the right implementation, library or technique to find a case
where this general trend is reversed is not all that important.

Specific comparisons between specific implementations are important
considerations, and there are some modern benchmarks posted (I can't
find the link, sorry, but Google should have it) comparing Intel's C
compiler, a recent gcc implementation and Fortran-90. Given a variety
of hard problems, Fortran consistently came up much faster with default
code and no explicit optimizations.

Your other comments were addressed else-thread, I think.

Chris Torek

unread,
Nov 30, 2006, 4:20:01 PM11/30/06
to
Before I add to this, let me say that my earlier posting in the
thread was indeed sarcastic/flip. I probably should not have
posted it.

(There was a real point to it, mostly being: "If you use a C++
compiler, you are compiling C++ code. It may also happen to be C
code, and it may even have the same semantics in both languages,
but it is still C++ code." Note that "having the same semantics"
is not as common as "compiles in both languages".

Personally, I think if one intends to compile with C++ compilers,
one might as well make use of various C++ constructs. For instance,
templates are actually quite valuable, in spite of their horrific
syntax. :-) )

In article <NKDbh.46450$43.1...@nnrp.ca.mci.com!nnrp1.uunet.ca>


Clever Monkey <clvrmnky...@hotmail.com.invalid> wrote:
>My only point was that it is not crazy to make the statement that
>Fortran may emit code that performs better than the equivalent algorithm
>implemented in C. In general, this has been true.

Indeed. It may -- and some may hope that it does -- become less
true, especially now that C99 has "restrict". But traditionally
it seems to have been the case. (One possible reason I offer here
is that, on many machines, particularly the mini- and micro-computers
commonly used in the 1980s and early 1990s, it is easy to compile
C code to "relatively OK" machine code without bothering with much
if any optimization, and at the same time, C's aliasing rules often
make it hard to do a great deal of optimization. The same does
not hold for the Fortran of the time -- F77 -- so compiler-writers
*had* to put in *some* optimization, and then had no barriers to
putting in more optimization.)
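
(As a small illustration of that aliasing point, a sketch with invented
function names: in the first version the compiler must allow for dst
and count pointing at the same object, so the loop bound is in effect
re-read from memory on every iteration; hoisting it into a local, as in
the second version, is the kind of guarantee the C programmer has to
state explicitly, while Fortran's argument rules let the compiler
assume it.)

/* The compiler cannot assume dst and count do not alias, so *count
 * must be reloaded after every store through dst. */
void fill_reloaded(int *dst, int *count)
{
    int i;

    for (i = 0; i < *count; i++) {
        dst[i] = 0;
    }
}

/* Copying the bound into a local tells the compiler that stores
 * through dst cannot change it. */
void fill_hoisted(int *dst, int *count)
{
    int n = *count;
    int i;

    for (i = 0; i < n; i++) {
        dst[i] = 0;
    }
}
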
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

Eric Sosman

unread,
Nov 30, 2006, 10:11:15 PM11/30/06
to

Not 100% sure what point you're making, but in case it's "lots
of programmers use `redundant' without meaning `unnecessary'," let
me point out that lots of programmers use "kilo" as if it meant 1024.

--
Eric Sosman
eso...@acm-dot-org.invalid

pete

unread,
Dec 1, 2006, 1:25:22 AM12/1/06
to

It's not just programmers.
"Redundant backup systems" is an engineering term.

http://www.google.com/search?hl=en&lr=&ie=ISO-8859-1&q=%22redundant+backup+systems%22+aviation
Results 1 - 10 of about 94 for "redundant backup systems" aviation.

http://www.google.com/search?hl=en&lr=&ie=ISO-8859-1&q=%22redundant+backup+systems%22+steam
Results 1 - 10 of about 54 for "redundant backup systems" steam.

http://www.google.com/search?hl=en&lr=&ie=ISO-8859-1&q=%22redundant+backup+systems%22+nuclear
Results 1 - 10 of about 122 for "redundant backup systems" nuclear.

--
pete

Richard Bos

unread,
Dec 1, 2006, 5:39:56 AM12/1/06
to
Eric Sosman <eso...@acm-dot-org.invalid> wrote:

> Richard Bos wrote:
>
> > "Old Wolf" <old...@inspire.net.nz> wrote:
> >
> >>webs...@gmail.com wrote:
> >>
> >>>>If it's useful, how can it be redundant?
> >>>
> >>>Well let's just pause and think about this for a second.
> >>
> >>The problem is that Paul Hsieh thinks "redundant" means "useless".
> >>
> >>In fact, the English word "redundant" means exactly that, in the
> >>minds of many people.
> >
> > Those many people are just as wrong as Paul, then.
>
> <off-topic distance="extreme">
>
> No, they're just four centuries behind the times.

YM ahead.

> "Redundant" did not always carry the implication of "unnecessary,"
> but only of "repeated," or more prosaically "iterated."

And even then it never did, and it still does not, mean "useless".
Something that is repeated, even something that is repeated
unnecessarily, may well be repeated usefully. People who use "redundant"
to mean "useless" are wrong now, just as they would have been back then.

Richard

webs...@gmail.com

unread,
Dec 1, 2006, 7:12:38 AM12/1/06
to

Exactly what *specific* audience do you think Microsoft's C compiler is
for? It's the default compiler for anyone developing applications on or
for a Windows machine. In terms of developer audience, there couldn't
possibly be even a close second with the exception of gcc (which may in
fact have a wider audience; I really don't know how they compare in
that sense). And Intel C/C++ started as a specialist (for video games,
and specific applications where Intel wanted to look good on a
benchmark) compiler, but certainly by today it's a totally mainstream
compiler whose target audience is basically just anyone who is looking
for a high quality and high performance C compiler. There's nothing
special at all about their audiences, except that MS is tied to Windows
(most of the Unix cc's are in the same boat). (Intel's compiler runs
on Linux, Windows and the recent Mac OS Xs.)

> [...] The point of Fortran was that
> ordinary code written in a portable fashion should perform reasonably
> well under most implementations.

That may be your point (and is only true if by ordinary code you mean
algorithms that use only arrays of floating point numbers, and
targeting compilers from half a decade ago). But that's not the
statement Chris made.

> I'm not getting involved in this holy-war. Use the best tool for the
> job. If you need the kind of grunt required to run non-trivial math
> over the course of days or weeks, do your own benchmarks.

Been there, done that. By modern standards Fortran no longer offers
anything that C doesn't (except being a simpler language.)

> My only point was that it is not crazy to make the statement that
> Fortran may emit code that performs better than the equivalent algorithm
> implemented in C.

That's fine if that was the point originally made. In fact it's hard to
contend with this except for the special example of the Intel compiler,
because of its common back-end for the two languages coupled with its
benchmark leadership in both languages.

But that's *NOT* the point that was made. Chris just said that "Fortran
was even faster". And that's just utter nonsense (as a blanket
statement, that's basically never been true, and by today's standards
you cannot put together a fair case).

> [...] In general, this has been true.

With an emphasis on *HAS BEEN*. It's basically no longer true.

> [...] Whether or not you
> can find the right implementation, library or technique to find a case
> where this general trend is reversed is not all that important.

Like picking a modern compiler and turning on a switch? (The "no
aliasing" switches have been sitting in C compilers since the early
90s.)

> Specific comparisons between specific implementations are important
> considerations, and there are some modern benchmarks posted (I can't
> find the link, sorry, but Google should have it) comparing Intel's C
> compiler, a recent gcc implementation and Fortran-90. Given a variety
> of hard problems, Fortran consistently came up much faster with default
> code and no explicit optimizations.

This is what google returned to me:

http://shootout.alioth.debian.org/gp4/fortran.php (Fortran is way
slower)

Obviously using "g95" is highly suboptimal, but more googling didn't
reveal anything else of relevance to me. My understanding comes from
direct analysis of the compiler output and matching the language's
capabilities to it.

webs...@gmail.com

unread,
Dec 1, 2006, 7:35:17 AM12/1/06
to
Chris Torek wrote:
> Before I add to this, let me say that my earlier posting in the
> thread was indeed sarcastic/flip. I probably should not have
> posted it.

That isn't the point of contention. It's that you used this sarcastic
guise to cover up two blatant deceptions.

> (There was a real point to it, mostly being: "If you use a C++
> compiler, you are compiling C++ code. It may also happen to be C
> code, and it may even have the same semantics in both languages,
> but it is still C++ code." Note that "having the same semantics"
> is not as common as "compiles in both languages".

That couldn't be your point, because you didn't say anything remotely
similar to that.

> Personally, I think if one intends to compile with C++ compilers,
> one might as well make use of various C++ constructs.

Perhaps you would like to discuss this with Richard Heathfield. He
apparently has a very strong opinion about discussion of C++ in this
newsgroup. You and he are the only people in this thread who have
brought up the discussion of the C++ language here. (Personally, I
just notice that there are other C++ newsgroups, and that C++ experts
don't tend to hang around in this newsgroup, so why would I try to
discuss C++ here?)

There is a very big difference between using a C++ compiler, and using
the C++ language. This is a very special case because of the very
large intersection of C and C++. I made the very clear point that the
better C compilers are, in fact, C++ compilers (both from an object
code output and a diagnostics point of view). This is the most obvious
thing in the world, and clearly what I was talking about. Any imagined
discussions about the C++ language here are from people *OTHER* than
myself.

> [...] For instance,


> templates are actually quite valuable, in spite of their horrific
> syntax. :-) )

You are being off topic for this newsgroup.

Richard Heathfield

unread,
Dec 1, 2006, 7:53:25 AM12/1/06
to
webs...@gmail.com said:

> Chris Torek wrote:

<nonsense from websnarf snipped>


>
>> Personally, I think if one intends to compile with C++ compilers,
>> one might as well make use of various C++ constructs.
>
> Perhaps you would like to discuss this with Richard Heathfield. He
> apparently has a very strong opinion about discussion of C++ in this
> newsgroup. You and he are the only people in this thread who have
> brought up the discussion of the C++ language here.

The first message in this thread that I can find that talks about C++ is
Message-ID: <1164743682.7...@80g2000cwy.googlegroups.com>

"And, of course, the real difference is that the first compiles straight

in C++, and the second is just an error. I have found that in general


the C++ optimizers and warnings are better for the C++ mode of my
compilers than the C mode."

And you posted it.

> There is a very big difference between using a C++ compiler, and using
> the C++ language. This is a very special case because of the very
> large intersection of C and C++. I made the very clear point that the
> better C compilers are, in fact, C++ compilers (both from an object
> code output and a diagnostics point of view).

If you invoke a C++ compiler, it will interpret your source according to the
rules of C++, not C. If that's what you want to do, that's fine, but
discussions of C++ compilations belong elseNet, not in comp.lang.c.

> This is the most obvious
> thing in the world, and clearly was I was talking about.

It is also obviously wrong. Trivial examples (e.g. int new;) easily disprove
your point, so there is no particular need to find complicated examples.

> Any imagined
> discussions about the C++ language here are from people *OTHER* than
> myself.

See above quote.

>> [...] For instance,
>> templates are actually quite valuable, in spite of their horrific
>> syntax. :-) )
>
> You are being off topic for this newsgroup.

Indeed he is. And so were you.

Ian Collins

unread,
Dec 1, 2006, 4:39:18 PM12/1/06
to
Chris Torek wrote:
> Before I add to this, let me say that my earlier posting in the
> thread was indeed sarcastic/flip. I probably should not have
> posted it.
>
> (There was a real point to it, mostly being: "If you use a C++
> compiler, you are compiling C++ code. It may also happen to be C
> code, and it may even have the same semantics in both languages,
> but it is still C++ code." Note that "having the same semantics"
> is not as common as "compiles in both languages".
>
> Personally, I think if one intends to compile with C++ compilers,
> one might as well make use of various C++ constructs.
>
Another situation is where one uses C++ constructs to instrument an
application under test, or to add testing functionality that cannot be
accomplished in C without intrusive test code. Provided one is happy
to live with the constraints of the common subset of the two languages,
this can be a powerful development tool.

> For instance,
> templates are actually quite valuable, in spite of their horrific
> syntax. :-) )

Thank goodness for typedefs!

--
Ian Collins.

webs...@gmail.com

unread,
Dec 2, 2006, 3:38:17 PM12/2/06
to
Richard Heathfield wrote:
> webs...@gmail.com said:
> > Chris Torek wrote:
> <nonsense from websnarf snipped>
> >
> >> Personally, I think if one intends to compile with C++ compilers,
> >> one might as well make use of various C++ constructs.
> >
> > Perhaps you would like to discuss this with Richard Heathfield. He
> > apparently has a very strong opinion about discussion of C++ in this
> > newsgroup. You and he are the only people in this thread who have
> > brought up the discussion of the C++ language here.
>
> The first message in this thread that I can find that talks about C++ is
> Message-ID: <1164743682.7...@80g2000cwy.googlegroups.com>
>
> "And, of course, the real difference is that the first compiles straight
> in C++, and the second is just an error. I have found that in general
> the C++ optimizers and warnings are better for the C++ mode of my
> compilers than the C mode."
>
> And you posted it.

Just as a person cannot be described by the color of their toenail, I
don't see this as a discussion of the C++ language. My bringing this
up is obviously narrowly focussed on the usage of a C++ compiler as a
tool to compile C code. I mean C++ is a language with really a lot of
features; using the compilers for generating better output for C code
is not something anyone thinks of as a language feature.

> > There is a very big difference between using a C++ compiler, and using
> > the C++ language. This is a very special case because of the very
> > large intersection of C and C++. I made the very clear point that the
> > better C compilers are, in fact, C++ compilers (both from an object
> > code output and a diagnostics point of view).
>
> If you invoke a C++ compiler, it will interpret your source according to the
> rules of C++, not C.

Yes, but this is just a natural characteristic of the tool. It
doesn't, by itself, make your code into C++ code.

> [...] If that's what you want to do, that's fine, but
> discussions of C++ compilations belong elseNet, not in comp.lang.c.

But it's compiling C, just using a different tool. Are you suggesting
then, that discussion of the usage of LINT is off topic for this
newsgroup as well?

> > This is the most obvious
> > thing in the world, and clearly was I was talking about.
>
> It is also obviously wrong. Trivial examples (e.g. int new;) easily disprove
> your point, so there is no particular need to find complicated examples.

What the hell are you talking about? That example (or more complicated
ones which invoke those sorts of anomalies) would not be in the
intersection of C and C++. And clearly I am not advocating the
creation of polyglots with different semantics from different
languages.

> > Any imagined
> > discussions about the C++ language here are from people *OTHER* than
> > myself.
>
> See above quote.

The quote makes no mention or implication about any C++ language
content.

> >> [...] For instance,
> >> templates are actually quite valuable, in spite of their horrific
> >> syntax. :-) )
> >
> > You are being off topic for this newsgroup.
>
> Indeed he is. And so were you.

You have provided no evidence of this claim.

Old Wolf

unread,
Dec 3, 2006, 4:11:22 AM12/3/06
to
webs...@gmail.com wrote:
> Old Wolf wrote:
> > webs...@gmail.com wrote:
> > > > If it's useful, how can it be redundant?
> > >
> > > Well let's just pause and think about this for a second.
> >
> > The problem is that Paul Hsieh thinks "redundant" means "useless".
>
> Please read the thread attributions more carefully. You are targeting
> the wrong person.

Sorry, you're right -- it was in fact Richard Heathfield who
made that comment. I transfer my pox to him.

Old Wolf

unread,
Dec 3, 2006, 5:45:33 AM12/3/06
to
webs...@gmail.com wrote:
> I made the very clear point that the better C compilers are, in fact,
> C++ compilers. This is the most obvious thing in the world

You're on a different world to the rest of us. Since there exist
C programs with identical source to C++ programs, but
different semantics, it follows that a C++ compiler cannot
simultaneously be a C compiler, as you are claiming.

Are you trying to make the point that the developers of the
better C compilers, also develop C++ compilers? If so, then
that isn't even relevant to the discussion.

Keith Thompson

unread,
Dec 3, 2006, 6:04:25 AM12/3/06
to

It's possible that a compiler could act as either a C compiler or a
C++ compiler depending on how it's invoked. gcc does this, but I
don't know how much code is shared between the C and C++ modes.

Richard Heathfield

unread,
Dec 3, 2006, 9:08:41 AM12/3/06
to
Old Wolf said:

Keep it, Old Wolf. You might need it some day. If you look more closely at
the original discussion, you'll see that it initially centred around the
word "superfluous", which websnarf introduced to describe his own code.

Chris Torek

unread,
Dec 3, 2006, 1:34:37 PM12/3/06
to
In article <lnhcwdz...@nuthaus.mib.org>

Keith Thompson <ks...@mib.org> wrote:
>It's possible that a compiler could act as either a C compiler or a
>C++ compiler depending on how it's invoked. gcc does this, but I
>don't know how much code is shared between the C and C++ modes.

The preprocessing and code-generation/optimization phases are
shared. The code for building the parse trees, i.e., assigning
semantics based on syntax, is (unsurprisingly) not shared.

(Saying that the code generation is shared may be a little bit of
an overstatement. Without getting into details, it is difficult
to describe the process and the shared vs separate parts. There
is only one "machine description" per target, however, and it
includes all the code-matching/generation patterns, even if some
are never actually used from the C compiler -- e.g., there is no
need to emit "exception handler frames" for C code.)

Old Wolf

unread,
Dec 3, 2006, 3:28:04 PM12/3/06
to
Richard Heathfield wrote:
>>>>>> If it's useful, how can it be redundant?
> If you look more closely at the original discussion, you'll see that it
> initially centred around the word "superfluous", which websnarf
> introduced to describe his own code.

You wrote earlier:


> I don't agree that comments are superfluous. *You* said the structure was
> superfluous, and "superfluous" means "redundant, unnecessary" (look in a
> dictionary if you don't believe me).

which is certainly true, if you take one of the many meanings of
"redundant". Note that you introduced this usage of "redundant".
You then wrote, in response to Paul Hsieh:

>Paul> Sometimes redundancy is useful


> If it's useful, how can it be redundant?

Clearly he is referring to one of the other meanings of "redundant",
in particular, one in which redundant things can be useful.

Old Wolf

unread,
Dec 3, 2006, 3:39:27 PM12/3/06
to
Richard Bos wrote:
> "Old Wolf" <old...@inspire.net.nz> wrote:
>> The problem is that Paul Hsieh thinks "redundant" means "useless".
>>
>> In fact, the English word "redundant" means exactly that, in the
>> minds of many people.
>
> Those many people are just as wrong as Paul, then.

[Note - that was in fact a misattribution; that statement wasn't
made by Paul]

If many people think a word means something, then they are
correct by definition. The meaning of words isn't set by some
authority. Instead, dictionaries try to reflect actual usage.

There are thousands of words (probably more) in current usage
today that had different meanings decades ago. It's called
language evolution.

FWIW, from dictionary.com:
re·dun·dant /rɪˈdʌndənt/
–adjective
1. characterized by verbosity or unnecessary repetition [....]

Keith Thompson

unread,
Dec 3, 2006, 5:08:41 PM12/3/06
to

I'm not going to take sides on that issue, but I'll mention that it's
controversial; detailed discussions about the meanings of English
words are welcome in some other newsgroup.

But I will point out that, even as a new meaning for a word becomes
common, it is a fact that some people will continue to use it with its
old meaning. There are plenty of uses of the word "redundant" that do
not imply that something is unnecessary; see "redundant backup
systems". See also "byte".

Mark McIntyre

unread,
Dec 4, 2006, 5:53:11 AM12/4/06
to
On 3 Dec 2006 12:39:27 -0800, in comp.lang.c , "Old Wolf"
<old...@inspire.net.nz> wrote:

>If many people think a word means something, then they are
>correct by definition.

This is a false conclusion. Just because thousands of ignoramuses think
"enormity" is a synonym for "huge" doesn't make it so.

>The meaning of words isn't set by some
>authority. Instead, dictionaries try to reflect actual usage.

to an extent, but only to an extent, and always in broad terms not
colloquial ones.

>There are thousands of words (probably more) in current usage
>today that had different meanings decades ago. It's called
>language evolution.

true

>FWIW, from dictionary.com:
> re·dun·dant /rɪˈdʌndənt/
> –adjective
> 1. characterized by verbosity or unnecessary repetition [....]

Note the word "or". Not "and". So far as I'm aware these are not (yet)
synonyms.
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
