Why doesn't strrstr() exist?

Christopher Benson-Manica

Aug 25, 2005, 10:07:01 AM
(Followups set to comp.std.c. Apologies if the crosspost is unwelcome.)

strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
isn't part of the standard. Why not?

--
Christopher Benson-Manica | I *should* know what I'm talking about - if I
ataru(at)cyberspace.org | don't, I need to know. Flames welcome.

SM Ryan

Aug 25, 2005, 1:47:50 PM
Christopher Benson-Manica <at...@nospam.cyberspace.org> wrote:
# (Followups set to comp.std.c. Apologies if the crosspost is unwelcome.)
#
# strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
# isn't part of the standard. Why not?

char *strrstr(char *x,char *y) {
    int m = strlen(x);
    int n = strlen(y);
    char *X = malloc(m+1);
    char *Y = malloc(n+1);
    int i;
    for (i=0; i<m; i++) X[m-1-i] = x[i]; X[m] = 0;
    for (i=0; i<n; i++) Y[n-1-i] = y[i]; Y[n] = 0;
    char *Z = strstr(X,Y);
    if (Z) {
        int ro = Z-X;
        int lo = ro+n-1;
        int ol = m-1-lo;
        Z = x+ol;
    }
    free(X); free(Y);
    return Z;
}

--
SM Ryan http://www.rawbw.com/~wyrmwif/
If your job was as meaningless as theirs, wouldn't you go crazy too?

Walter Roberson

Aug 25, 2005, 2:09:13 PM
In article <11gs126...@corp.supernews.com>,
SM Ryan <wyr...@tango-sierra-oscar-foxtrot-tango.fake.org> wrote:

>char *strrstr(char *x,char *y) {
> int m = strlen(x);
> int n = strlen(y);
> char *X = malloc(m+1);
> char *Y = malloc(n+1);

Small changes: strlen has a result type of size_t, not int, and
malloc() takes a parameter of type size_t, not int. A small change to
the declarations of m and n fixes both issues.

> int i;
> for (i=0; i<m; i++) X[m-1-i] = x[i]; X[m] = 0;
> for (i=0; i<n; i++) Y[n-1-i] = y[i]; Y[n] = 0;

As per the above, m and n are size_t not int, so i needs to be size_t
as well.

Also, you don't check to see whether the malloc() returned NULL.

> char *Z = strstr(X,Y);
> if (Z) {
> int ro = Z-X;
> int lo = ro+n-1;
> int ol = m-1-lo;
> Z = x+ol;

This starts to get into murky waters. Z-X is a subtraction
of pointers, the result of which is ptrdiff_t, which is a signed
integral type. Logically, though, Z-X could be of size_t, which
is unsigned. This difference has probably been discussed in the past,
but I have not happened to see the discussion of what happens with
pointer subtraction if the object size would fit in the unsigned
type but not in the signed type. Anyhow, ro, lo, ol should not be int.

> }
> free(X); free(Y);
> return Z;
>}
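
Putting those small changes together, the whole thing might look something
like this (just an untested sketch; it keeps the reverse-and-strstr approach
of the original, with const parameters and size_t throughout):

#include <stdlib.h>
#include <string.h>

char *strrstr(const char *x, const char *y) {
    size_t m = strlen(x);
    size_t n = strlen(y);
    size_t i;
    char *X = malloc(m+1);
    char *Y = malloc(n+1);
    char *Z = NULL;

    if (X == NULL || Y == NULL) {
        free(X); free(Y);
        return NULL;               /* or report the failure some other way */
    }
    for (i=0; i<m; i++) X[m-1-i] = x[i];
    X[m] = 0;
    for (i=0; i<n; i++) Y[n-1-i] = y[i];
    Y[n] = 0;
    Z = strstr(X,Y);
    if (Z) {
        size_t ro = Z-X;           /* start of the match in the reversed x */
        size_t lo = ro+n-1;        /* end of the match in the reversed x   */
        size_t ol = m-1-lo;        /* start of the match in the original x */
        Z = (char *)x + ol;
    }
    free(X); free(Y);
    return Z;
}
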
--

Look out, there are llamas!

Eric Sosman

Aug 25, 2005, 2:33:00 PM

SM Ryan wrote:
> Christopher Benson-Manica <at...@nospam.cyberspace.org> wrote:
> # (Followups set to comp.std.c. Apologies if the crosspost is unwelcome.)
> #
> # strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
> # isn't part of the standard. Why not?
>
> char *strrstr(char *x,char *y) {
> int m = strlen(x);
> int n = strlen(y);

ITYM size_t, here and throughout.

> char *X = malloc(m+1);
> char *Y = malloc(n+1);

if (X == NULL || Y == NULL) ...?

> int i;
> for (i=0; i<m; i++) X[m-1-i] = x[i]; X[m] = 0;
> for (i=0; i<n; i++) Y[n-1-i] = y[i]; Y[n] = 0;
> char *Z = strstr(X,Y);
> if (Z) {
> int ro = Z-X;
> int lo = ro+n-1;
> int ol = m-1-lo;
> Z = x+ol;
> }
> free(X); free(Y);
> return Z;
> }

Untested:

#include <string.h>
/* @NOPEDANTRY: ignore use of reserved identifier */
char *strrstr(const char *x, const char *y) {
    char *prev = NULL;
    char *next;
    if (*y == '\0')
        return strchr(x, '\0');
    while ((next = strstr(x, y)) != NULL) {
        prev = next;
        x = next + 1;
    }
    return prev;
}

The behavior when y is empty is a matter of taste
and/or debate. The code above takes the view that the
rightmost occurrence in x of the empty string is the
one that appears (if that's the right word) just prior
to x's terminating zero; other conventions are surely
possible and might turn out to be better.

Note that simply omitting the test on y would be
an error: an empty y would then cause the while loop
to run off the end of x.
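
A quick usage sketch of that convention, assuming the function above is in
scope (untested):

#include <stdio.h>

char *strrstr(const char *x, const char *y);    /* the function above */

int main(void) {
    const char *s = "abcabc";
    printf("%s\n", strrstr(s, "bc"));           /* prints "bc" -- the rightmost one */
    printf("%d\n", (int)(strrstr(s, "") - s));  /* prints 6, the position of the '\0' */
    return 0;
}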

--
Eric....@sun.com

Douglas A. Gwyn

Aug 25, 2005, 2:18:16 PM
SM Ryan wrote:
> Christopher Benson-Manica <at...@nospam.cyberspace.org> wrote:
> # strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
> # isn't part of the standard. Why not?
> char *strrstr(char *x,char *y) {
> int m = strlen(x);
> int n = strlen(y);
> char *X = malloc(m+1);
> char *Y = malloc(n+1);
> ...

If one really wanted to use the function, that implementation
would be problematic.

I think the real answer is that there were lots of uses for
strstr() and few if any requests for strrstr() functionality.
Why specify/require it if it won't be used?

Also note that if you want to implement such a function you
might benefit from reading my chapter on string searching in
"Software Solutions in C" (ed. Dale Schumacher).

Douglas A. Gwyn

Aug 25, 2005, 4:15:55 PM
Walter Roberson wrote:
> This starts to get into murky waters. Z-X is a subtraction
> of pointers, the result of which is ptrdiff_t, which is a signed
> integral type. Logically, though, Z-X could be of size_t, which
> is unsigned. This difference has probably been discussed in the past,
> but I have not happened to see the discussion of what happens with
> pointer subtraction if the object size would fit in the unsigned
> type but not in the signed type. Anyhow, ro, lo, ol should not be int.

ptrdiff_t is supposed to be defined as a type wide enough to
accommodate *any* possible result of a valid subtraction of
pointers to objects. If an implementation doesn't *have* a
suitable integer type, that is a deficiency.

Anyway, when you know which pointer is less than the other,
you can always subtract the lesser from the greater and the
result will then always be appropriately represented using
size_t. If you really had to worry about these limits in
some situation, you could first test which is lesser, then
use two branches in the code with size_t in each one.
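
A minimal sketch of that two-branch idea (untested; p and q are assumed to
point into the same object):

#include <stddef.h>

size_t ptr_distance(const char *p, const char *q) {
    /* subtract the lesser pointer from the greater, so the mathematical
       result is non-negative and is representable as a size_t */
    return (p < q) ? (size_t)(q - p) : (size_t)(p - q);
}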

Walter Roberson

Aug 25, 2005, 5:21:20 PM
In article <430E26FB...@null.net>,

Douglas A. Gwyn <DAG...@null.net> wrote:
>Anyway, when you know which pointer is less than the other,
>you can always subtract the lesser from the greater and the
>result will then always be appropriately represented using
>size_t. If you really had to worry about these limits in
>some situation, you could first test which is lesser, then
>use two branches in the code with size_t in each one.

It seems to me that you are implying that the maximum
object size that a C implementation may support, is only
half of the memory addressible in that address mode --
e.g., maximum 2 Gb object on a 32 bit (4 Gb span)
pointer machine. This limitation being necessary so that
the maximum object size would fit in a signed storage
location, just in case you wanted to do something like

(object + sizeof object) - object

"logically" the result would be sizeof object, an
unsigned type, but the pointer subtraction is defined
as returning a signed value, so the maximum
magnitude of the signed value would have to be at least
as great as the maximum magnitude of the unsigned value...

number_of_usable_bits(size_t) < number_of_usable_bits(ptrdiff_t)

[provided, that is, that one is not using a separate-sign-bit
machine.]


The machines I use most often -happen- to have that property
anyhow, because the high-bit on a pointer is reserved for
indicating kernel memory space, but I wonder about the extent
to which this is true on other machines?
--
Ceci, ce n'est pas une idée. [This is not an idea.]

pete

Aug 25, 2005, 6:25:12 PM
Douglas A. Gwyn wrote:

> ptrdiff_t is supposed to be defined as a type wide enough to
> accommodate *any* possible result of a valid subtraction of
> pointers to objects.

What are you talking about?

Is your point that ptrdiff_t is actually defined
opposite of the way that it's supposed to be?

"If the result is not representable in an object of that type,
the behavior is undefined.
In other words, if the expressions P and Q point to,
respectively, the i-th and j-th elements of an array object,
the expression (P)-(Q) has the value i-j
provided the value fits in an object of type ptrdiff_t."

--
pete

Old Wolf

Aug 25, 2005, 6:52:19 PM
SM Ryan wrote:
> #
> # strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
> # isn't part of the standard. Why not?
>
> char *strrstr(char *x,char *y) {
> int m = strlen(x);
> int n = strlen(y);
> char *X = malloc(m+1);
> char *Y = malloc(n+1);

Using dynamic allocation for this function? You have got
to be kidding.

> int i;
> for (i=0; i<m; i++) X[m-1-i] = x[i]; X[m] = 0;
> for (i=0; i<n; i++) Y[n-1-i] = y[i]; Y[n] = 0;
> char *Z = strstr(X,Y);
> if (Z) {
> int ro = Z-X;
> int lo = ro+n-1;
> int ol = m-1-lo;
> Z = x+ol;

I don't know which is more obfuscated -- your code, or your
quote marker.

> }
> free(X); free(Y);
> return Z;
> }

Keith Thompson

Aug 25, 2005, 7:20:20 PM
Christopher Benson-Manica <at...@nospam.cyberspace.org> writes:
> strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
> isn't part of the standard. Why not?

I don't think anyone has posted the real reason: it's arbitrary. The
C standard library isn't a coherently designed entity. It's a
collection of functionality from historical implementations,
consisting largely of whatever seemed like a good idea at the time,
filtered through the standards committee. Just look at the continuing
existence of gets(), or the design of <time.h>.

It's remarkable (and a tribute to the original authors and to the
committee) that the whole thing works as well as it does.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

SM Ryan

Aug 25, 2005, 8:06:19 PM
Eric Sosman <eric....@sun.com> wrote:
#
#
# SM Ryan wrote:
# > Christopher Benson-Manica <at...@nospam.cyberspace.org> wrote:
# > # (Followups set to comp.std.c. Apologies if the crosspost is unwelcome.)

# > #
# > # strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
# > # isn't part of the standard. Why not?
# >
# > char *strrstr(char *x,char *y) {
# > int m = strlen(x);
# > int n = strlen(y);

Time complexity can be O(m+n), since strstr can be O(m+n)
and O(2m+2n) = O(m+n).

# char *strrstr(const char *x, const char *y) {
# char *prev = NULL;
# char *next;
# if (*y == '\0')
# return strchr(x, '\0');
# while ((next = strstr(x, y)) != NULL) {
# prev = next;
# x = next + 1;
# }
# return prev;
# }

Potentially O(m*n), depending on how often characters repeat in y.
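
For a concrete picture of that worst case (an untested sketch; it assumes
the loop version above and a strstr that confirms matches character by
character):

#include <stdio.h>
#include <string.h>

char *strrstr(const char *x, const char *y);    /* the loop version above */

int main(void) {
    static char x[1001], y[501];                /* zero-filled, so terminated */
    memset(x, 'a', 1000);                       /* m = 1000 */
    memset(y, 'a', 500);                        /* n = 500  */
    /* every restart position matches immediately, and each strstr call
       still compares about n characters: roughly m*n work in total */
    printf("%d\n", (int)(strrstr(x, y) - x));   /* prints 500 */
    return 0;
}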

No pleasure, no rapture, no exquisite sin greater than central air.

Antoine Leca

Aug 26, 2005, 7:02:33 AM
In <news:430E0B68...@null.net>, Douglas A. Gwyn wrote:

> Also note that if you want to implement such a function you
> might benefit from reading my chapter on string searching in
> "Software Solutions in C" (ed. Dale Schumacher).

The straightforward idea (using strstr() in a loop and returning the last
non-NULL answer, as strrchr() usually does) -- wouldn't that be a good one?
At least it would benefit from the optimized form of strstr() often
found (several people reported here that the shipped strstr()'s regularly
outperform crafted algorithms like Boyer-Moore.)

Not that I see any use for strrstr(), except perhaps to do the same as
strrchr() when c happens to be a multibyte character in a stateless
encoding.


Antoine

Douglas A. Gwyn

Aug 26, 2005, 4:48:07 PM
Walter Roberson wrote:
> It seems to me that you are implying that the maximum
> object size that a C implementation may support, is only
> half of the memory addressible in that address mode --

No, I was saying that *if* a C implementation doesn't
support some integer type with more bits than are needed
to represent an address, *and if* the compiler supports
objects larger than half the available address space,
*then* the definition of ptrdiff_t becomes
problematic. Note all the conditions.

> The machines I use most often -happen- to have that property
> anyhow, because the high-bit on a pointer is reserved for
> indicating kernel memory space, but I wonder about the extent
> to which this is true on other machines?

Now that 64-bit integer support is required for C
conformance, there should be a suitable ptrdiff_t type
available except on systems that support processes with
data sizes greater than 2^63 bytes. I don't know of
many systems like that.

Douglas A. Gwyn

Aug 26, 2005, 4:58:30 PM
Antoine Leca wrote:
> The straightforward idea (using strstr() in a loop and returning the last
> not-NULL answer, as strrchr() usually does) won't be a good one?

Well, it won't be optimal, since it searches the entire string
even when a match could have been found immediately if the
scan progressed from the end of the string. Finding the end
of the string initially has relatively high overhead, alas,
due to the representation of C strings. It isn't immediately
obvious just what the trade-off is between starting at the end
and scanning backward vs. the algorithm you suggested. Probably,
unless strrstr() is a bottleneck in the app, what you suggested
will be good enough.

> At least it would take profit from the optimized form of strstr()

Yes, that is useful.

What I was actually concerned about was that people might
implement the naive "brute-force" method of attempting matches
at each incremental (decremental?) position, which is okay for
occasional use but certainly not nearly the fastest method.
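
For reference, the naive decremental scan looks something like this (an
untested sketch of the simple method, not one of the faster algorithms):

#include <string.h>

char *strrstr_naive(const char *x, const char *y) {
    size_t m = strlen(x);
    size_t n = strlen(y);
    size_t i;

    if (n > m)
        return NULL;
    /* try each candidate starting position, rightmost first */
    for (i = m - n + 1; i-- > 0; )
        if (memcmp(x + i, y, n) == 0)
            return (char *)(x + i);
    return NULL;
}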

> (several people reported here that the shipped strstr()'s
> regularly outperform crafted algorithms like Boyer-Moore.)

I compared various algorithms in the book to which I referred.

> Not that I see any use for strrstr(), except perhaps to do the same as
> strrchr() when c happens to be a multibyte character in a stateless
> encoding.

Even then it's problematic, because the search would not respect
alignment with boundaries between character encodings.

Douglas A. Gwyn

Aug 26, 2005, 5:04:59 PM
Keith Thompson wrote:
> I don't think anyone has posted the real reason: it's arbitrary. The
> C standard library isn't a coherently designed entity. It's a
> collection of functionality from historical implementations,
> consisting largely of whatever seemed like a good idea at the time,
> filtered through the standards committee. ...

That is far from arbitrary. The evolution of C library
functions was substantially influenced by the demands of
practical programming, and many of the interfaces went
through several iterations in the early years of C, as
deficiencies in earlier versions were identified. The C
standards committee quite reasonably chose to standardize
existing interfaces rather than try to design totally new
ones. Many of the standard interfaces are not at all
what we would come up with in a new design.

webs...@gmail.com

Aug 27, 2005, 6:30:03 PM
Keith Thompson wrote:
> Christopher Benson-Manica <at...@nospam.cyberspace.org> writes:
> > strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
> > isn't part of the standard. Why not?
>
> I don't think anyone has posted the real reason: it's arbitrary. The
> C standard library isn't a coherently designed entity. It's a
> collection of functionality from historical implementations,
> consisting largely of whatever seemed like a good idea at the time,
> filtered through the standards committee. Just look at the continuing
> existence of gets(), or the design of <time.h>.
>
> It's remarkable (and a tribute to the original authors and to the
> committee) that the whole thing works as well as it does.

When you look at the world through rose color glasses ...

Remember that almost every virus, buffer overflow exploit, core
dump/GPF/etc is basically due to some undefined situation in the ANSI C
standard. I consider the ANSI C standard committee basically coauthors
of every one of these problems.

---
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

Randy Howard

Aug 27, 2005, 7:21:08 PM
webs...@gmail.com wrote
(in article
<1125181803....@g49g2000cwa.googlegroups.com>):

> Keith Thompson wrote:
>> Christopher Benson-Manica <at...@nospam.cyberspace.org> writes:
>>> strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
>>> isn't part of the standard. Why not?
>>
>> I don't think anyone has posted the real reason: it's arbitrary. The
>> C standard library isn't a coherently designed entity. It's a
>> collection of functionality from historical implementations,
>> consisting largely of whatever seemed like a good idea at the time,
>> filtered through the standards committee. Just look at the continuing
>> existence of gets(), or the design of <time.h>.
>>
>> It's remarkable (and a tribute to the original authors and to the
>> committee) that the whole thing works as well as it does.
>
> When you look at the world through rose color glasses ...

Well, at least some seem to have their eyes fully open.

> Remember that almost every virus, buffer overflow exploit, core
> dump/GPF/etc is basically due to some undefined situation in the ANSI C
> standard.

Not really. Those that defined early C, and later standard C
are not responsible for bad programming. If a programmer has
access to the standard (which they do), and they decide to do
something which 'invokes undefined behavior', then it is their
fault. The standard says do not do that, and they did it
anyway.

> I consider the ANSI C standard committee basically coauthors
> of every one of these problems.

I couldn't disagree more. If programmers themselves were held
responsible for their mistakes, instead of trying to blame it on
loopholes or missing words in a huge document, we would be much
better off. If you could be fined or perhaps even jailed for
gross negligence in software development the way doctors can be
today, I suspect the problem would be all but nonexistent.


--
Randy Howard (2reply remove FOOBAR)

webs...@gmail.com

Aug 27, 2005, 8:49:45 PM
Randy Howard wrote:

> webs...@gmail.com wrote:
> > Keith Thompson wrote:
> >> Christopher Benson-Manica <at...@nospam.cyberspace.org> writes:
> >>> strchr() is to strrchr() as strstr() is to strrstr(), but strrstr()
> >>> isn't part of the standard. Why not?
> >>
> >> I don't think anyone has posted the real reason: it's arbitrary. The
> >> C standard library isn't a coherently designed entity. It's a
> >> collection of functionality from historical implementations,
> >> consisting largely of whatever seemed like a good idea at the time,
> >> filtered through the standards committee. Just look at the continuing
> >> existence of gets(), or the design of <time.h>.
> >>
> >> It's remarkable (and a tribute to the original authors and to the
> >> committee) that the whole thing works as well as it does.
> >
> > When you look at the world through rose color glasses ...
>
> Well, at least some seem to have their eyes fully open.
>
> > Remember that almost every virus, buffer overflow exploit, core
> > dump/GPF/etc is basically due to some undefined situation in the ANSI C
> > standard.
>
> Not really. Those that defined early C, and later standard C
> are not responsible for bad programming.

Bad programming + good programming language does not allow for buffer
overflow exploits. You still need a bad programming language to
facilitate the manifestation of these worst case scenarios.

> [...] If a programmer has access to the standard (which they
> do), and they decide to do something which 'invokes undefined
> behavior', then it is their fault. The standard says do not
> do that, and they did it anyway.

Ok, this is what I was talking about when I mentioned rose colored
glasses. If programmers are perfect, then what you are saying is fine,
because you can expect perfection. But real people are not. And I
think the expectation of perfection in programming is really nonsensical.

Remember NASA put a priority inversion (a truly nasty bug to deal with)
in the Mars Pathfinder. The Ariane rocket blew up because of an
overflow triggering an interrupt handler that was faulty. You think
the programmers for these projects were not trying their best to do a
good job? Perfect programmers/programming is a pipedream. There is a
reason we paint lines on the roads, wear seatbelts, put guardrails on
stairs and bridges.

The problem of programmer safety can be attacked quite successfully at
the level of the programming language itself. There isn't actually a
downside to removing gets() and deprecating strtok and strnc??. (Hint:
Legacy code uses legacy compilers.)

> > I consider the ANSI C standard committee basically coauthors
> > of every one of these problems.
>
> I couldn't disagree more. If programmers themselves were held
> responsible for their mistakes, instead of trying to blame it on
> loopholes or missing words in a huge document, we would be much
> better off.

And what if it's not the programmer's fault? What if the programmer is
being worked to death? What if he's in a dispute with someone else
about how something should be done and lost the argument and was forced
to do things badly?

> [...] If you could be fined or perhaps even jailed for
> gross neglicence in software development the way doctors can be
> today, I suspect the problem would be all but nonexistent.

Ok, that's just vindictive nonsense. Programmers are generally not
aware of the liability of their mistakes. And mistakes are not
completely removable -- and there's a real question as to whether the
rate can even be reduced.

But if you were to truly enforce such an idea, I believe both C and C++
as programming languages would instantly disappear. Nobody in their
right mind, other than the most irresponsible daredevils would program
in these languages if they were held liable for their mistakes.

--

Randy Howard

Aug 27, 2005, 10:32:08 PM
webs...@gmail.com wrote
(in article
<1125190185....@o13g2000cwo.googlegroups.com>):

>>> Remember that almost every virus, buffer overflow exploit, core
>>> dump/GPF/etc is basically due to some undefined situation in the ANSI C
>>> standard.
>>
>> Not really. Those that defined early C, and later standard C
>> are not responsible for bad programming.
>
> Bad programming + good programming language does not allow for buffer
> overflow exploits.

For suitably high-level languages that might be true (and
provable). Let us not forget that C is *not* a high-level
language. It's not an accident that it is called high-level
assembler.

I'd love for you to explain to us, by way of example, how you
could guarantee that assembly programmers can not be allowed to
code in a way that allows buffer overflows.

> You still need a bad programming language to
> facilitate the manifestation of these worst case scenarios.

If you wish to argue that low-level languages are 'bad', I will
have to disagree. If you want to argue that too many people
write code in C when their skill level is more appropriate to a
language with more seatbelts, I won't disagree. The trick is
deciding who gets to make the rules.

>> [...] If a programmer has access to the standard (which they
>> do), and they decide to do something which 'invokes undefined
>> behavior', then it is their fault. The standard says do not
>> do that, and they did it anyway.
>
> Ok, this is what I was talking about when I mentioned rose colored
> glasses. If programmers are perfect, then what you are saying is fine,
> because you can expect perfection. But real people are not. And I
> think expectations of perfection in programming is really nonsensical.

/Exactly/ Expecting zero buffer overruns is nonsensical.

> Remember NASA put a priority inversion (a truly nasty bug to deal with)
> in the mars pathfinder. The Arianne rocket blew up because of an
> overflow triggering an interrupt handler that was faulty. You think
> the programmers for these projects were not trying their best to do a
> good job?

No, I do not. I expect things to go wrong, because humans are
not infallible. Especially in something as inherently difficult
as space travel. It's not like you can test it (for real)
before you try it for all the marbles. You can't just hire an
army of monkeys to sit in a lab beating on the keyboard all day
like an application company.

Anyway, a language so restrictive as to guarantee that nothing
can go wrong will probably never be used for any real-world
project.

> Perfect programmers/programming is a pipedream.

So is the idea of a 'perfect language'.

> There is a
> reason we paint lines on the roads, wear seatbelts, put guardrails on
> stairs and bridges.

Yes. And we require licenses for dangerous activities
elsewhere, but anyone can pick up a compiler and start playing
around.

> The problem of programmer safety can be attacked quite successfully at
> the level of the programming language itself.

It's quite easy to simply make the use of gets() and friends
illegal for your code development. Most of us have already done
so, without a standard body telling us to do it.

> There isn't actually a downside to removing gets() and deprecating
> strtok and strnc??. (Hint: Legacy code uses legacy compilers.)

Hint: Legacy code doesn't have to stay on the original platform.
Even so, anyone dusting off an old program that doesn't go
sifting through looking for the usual suspects is a fool.

I don't have a problem with taking gets() out of modern
compilers, but as you already pointed out, this doesn't
guarantee anything. People can still fire up an old compiler
and use it. I don't see a realistic way for the C standard to
enforce such things.

>>> I consider the ANSI C standard committee basically coauthors
>>> of every one of these problems.
>>
>> I couldn't disagree more. If programmers themselves were held
>> responsible for their mistakes, instead of trying to blame it on
>> loopholes or missing words in a huge document, we would be much
>> better off.
>
> And what if its not the programmer's fault?

It is the fault of the development team, comprised of whoever
that involves for a given project. If the programmer feels like
his boss screwed him over, let him refuse to continue, swear out
an affidavit and have it notarized that the bad software was
knowingly shipped and that he refuses to endorse it.

> What if the programmer is being worked to death?

That would be interesting, because although I have worked way
more than my fair share of 120 hour weeks, I never died, and
never heard of anyone dying. I have heard of a few losing it
and checking themselves into psycho wards, but still. If you
are being overworked, you can either keep doing it, or you can
quit, or you can convince your boss to lighten up. ESPECIALLY
in this case, the C standard folks are not to blame.

> What if he's in a dispute with someone else
> about how something should be done and lost the argument and
> was forced to do things badly?

Try and force me to write something in a way that I know is
wrong. Go ahead, it'll be a short argument, because I will
resign first.

Try and force a brain surgeon to operate on your head with a
chainsaw. Good luck.

>> [...] If you could be fined or perhaps even jailed for
>> gross neglicence in software development the way doctors can be
>> today, I suspect the problem would be all but nonexistent.
>
> Ok, that's just vindictive nonsense.

Why? We expect architects, doctors, lawyers, pretty much all
other real 'professions' to meet and typically exceed a higher
standard, and those that do not are punished, fined, or stripped
of their license to practice in the field. Why should
programmers get a pass? Is it because you do not feel it is a
professional position?

We don't let just anyone who wants to prescribe medicine do so, so why
should we let anyone who wants to put software up for download which
could compromise system security?

> Programmers are generally not aware of the liability of
> their mistakes.

Then those you refer to must be generally incompetent. Those
that are good certainly are aware, especially when the software
is of a critical nature.

> And mistakes are not completely removable --

Correct. It's also not possible to completely remove medical
malpractice, but it gets punished anyway. It's called a
deterrent.

> and there's a real question as to whether the rate can even be reduced.

As long as there is no risk of failure, it almost certainly will
not be reduced by magic or wishing.

> But if you were to truly enforce such an idea, I believe both C and C++
> as programming languages would instantly disappear.

I highly doubt that. Low-level language programmers would be
the cream of the crop, not 'the lowest bidder' as is the case
today. You would not be hired to work based upon price, but on
skill. Much as I would go look for the most expensive attorney
I could find if I was on trial, I would look for the most highly
skilled programmers I could find to work on a nuclear reactor.

Taking bids and outsourcing to some sweatshop in a jungle
somewhere would not be on the list of options.

> Nobody in their right mind, other than the most irresponsible
> daredevils would program in these langauges if they were held
> liable for their mistakes.

I guess all the professionals in other fields where they are
held up to scrutiny must be irresponsible daredevils too. For
example, there are operations that have very low success rates,
yet there are doctors that specialize in them anyway, despite
the low odds.

If you don't want to take the risk, then go write in visual
whatever#.net and leave it to those that are.

Chris McDonald

Aug 27, 2005, 10:44:26 PM
Randy Howard <randy...@FOOverizonBAR.net> writes:

<getting-way-OT>

>I'd love for you to explain to us, by way of example, how you
>could guarantee that assembly programmers can not be allowed to
>code in a way that allows buffer overflows.
>
> ......
>
>/Exactly/ Expecting zero buffer overruns is nonsensical.
>
> ......
>
>Anyway, a language so restrictive as to guarantee that nothing
>can go wrong will probably never be used for any real-world
>project.


I struggle to parse your first sentence, but what if assembly language
programmers were "required" to program in an assembly language whose
program structure could be strongly verified at runtime (aka JVM bytecodes)?

Or would that be against the spirit of an assembly language, and the
discussion?

</getting-way-OT>


--
Chris.

webs...@gmail.com

Aug 28, 2005, 2:14:14 AM
Randy Howard wrote:
> webs...@gmail.com wrote:
> >>> Remember that almost every virus, buffer overflow exploit, core
> >>> dump/GPF/etc is basically due to some undefined situation in the ANSI C
> >>> standard.
> >>
> >> Not really. Those that defined early C, and later standard C
> >> are not responsible for bad programming.
> >
> > Bad programming + good programming language does not allow for buffer
> > overflow exploits.
>
> For suitably high-level languages that might be true (and
> provable). Let us not forget that C is *not* a high-level
> language. It's not an accident that it is called high-level
> assembler.

Right. If you're not with us, you are with the terrorists.

Why does being a low-level language mean you have to present a programming
interface surrounded by landmines? Exposing a sufficiently low-level
interface may require that you expose some dangerous semantics, but why
expose them up front right in the most natural paths of usage?

> I'd love for you to explain to us, by way of example, how you
> could guarantee that assembly programmers can not be allowed to
> code in a way that allows buffer overflows.

Ok, the halting problem means basically nobody guarantees anything
about computer programming.

But it's interesting that you bring up the question of assembly
language. If you peruse the x86 assembly USENET newsgroups, you will
see that many people are very interested in expanding the power and
syntax for assembly language (examples include HLA, RosAsm, and
others). A recent post talked about writing a good string library for
assembly, and there was a strong endorsement for the length prefixed
style of strings, including one direct reference to Bstrlib as a design
worth following (not posted by me!).

So, while assembly clearly isn't an inherently safe language, it seems
quite possible that some assembly efforts will have a much safer (and
much faster) string interface than C does.
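
For anyone who hasn't seen the idea, here is a bare-bones sketch of a
length-prefixed string in C (illustration only; Bstrlib's actual layout and
API are different):

#include <stdlib.h>
#include <string.h>

struct lpstr {
    size_t len;      /* number of bytes in data */
    char  *data;     /* kept '\0'-terminated only as a convenience */
};

/* concatenation works from the stored lengths, so it never scans for a
   terminator and cannot run off the end the way strcat() can */
int lpstr_cat(struct lpstr *dst, const struct lpstr *src) {
    char *p = realloc(dst->data, dst->len + src->len + 1);
    if (p == NULL)
        return -1;
    memcpy(p + dst->len, src->data, src->len);
    dst->data = p;
    dst->len += src->len;
    dst->data[dst->len] = '\0';
    return 0;
}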

> > You still need a bad programming language to facilitate the
> > manifestation of these worst case scenarios.
>
> If you wish to argue that low-level languages are 'bad', I will
> have to disagree.

So why put those words in my mouth?

> [...] If you want to argue that too many people
> write code in C when their skill level is more appropriate to a
> language with more seatbelts, I won't disagree. The trick is
> deciding who gets to make the rules.

But I'm not arguing that either. I am saying C is to a large degree
just capriciously and unnecessarily unsafe (and slow, and powerless,
and unportable etc., etc).

> >> [...] If a programmer has access to the standard (which they
> >> do), and they decide to do something which 'invokes undefined
> >> behavior', then it is their fault. The standard says do not
> >> do that, and they did it anyway.
> >
> > Ok, this is what I was talking about when I mentioned rose colored
> > glasses. If programmers are perfect, then what you are saying is fine,
> > because you can expect perfection. But real people are not. And I
> > think expectations of perfection in programming is really nonsensical.
>
> /Exactly/ Expecting zero buffer overruns is nonsensical.

Well, not exactly. If you're not using C or C++, then buffer overflows
usually at worst lead to a runtime exception; in C or C++, exploits are
typically designed to gain shell access in the context of the erroneous
program. It's like honey for bees -- people attack C/C++ programs
because they have this weakness. In other, safer programming languages,
even if you had a buffer overflow, allowing a control-flow
zombification of the program is typically not going to be possible.

> > Remember NASA put a priority inversion (a truly nasty bug to deal with)
> > in the mars pathfinder. The Arianne rocket blew up because of an
> > overflow triggering an interrupt handler that was faulty. You think
> > the programmers for these projects were not trying their best to do a
> > good job?
>
> No, I do not. I expect things to go wrong, because humans are
> not infallible. Especially in something as inherently difficult
> as space travel.

Space travel itself was not the issue, and it wasn't any more
complicated than any kind of industrial device manager (as you might
find in an automated assembly line.) The real problem is that priority
inversions are *nasty*. Each component can be unit tested and
validated to work properly in isolation -- the problem only shows up when you
put them together and they encounter a specific scenario. It's just a
very sophisticated deadlock.

> [...] It's not like you can test it (for real)
> before you try it for all the marbles. You can't just hire an
> army of monkey to sit in a lab beating on the keyboarrd all day
> like an application company.

Hmm ... I don't think that's quite it. The problem is that the
scenario, which I don't recall all the details of, was something that
was simply unaccounted for in their testing. This is a problem in
testing in general. Line by line coverage, unit testing, and other
forms of typical testing really only find the most obvious bugs.

They were able to save the pathfinder, because VxWorks allows you to
reboot into a shell or debug mode, and they were able to patch the code
remotely. The point of this being that in the end they were lucky to
have very sophisticated 3rd party support that is well beyond anything
that the C standard delivers.

> Anyway, a language so restrictive as to guarantee that nothing
> can go wrong will probably never be used for any real-world
> project.

How about a simpler language that is more powerful, demonstrably faster,
more portable (dictionary definition), obviously safer and still just
as low level? Just take the C standard, deprecate the garbage, replace
a few things, genericize some of the APIs, well define some of the
scenarios which are currently described as undefined, make some of the
ambiguous syntaxes that lead to undefined behavior illegal, and you're
immediately there. If these steps seem too radical, just draw a line
from where you are and where you need to go, and pick an acceptable
point in between.

Your problem is that you assume making C safer (or faster, or more
portable, or whatever) will take something useful away from C that it
currently has. Think about that for a minute. How is it possible that
your mind can be in that state?

> > Perfect programmers/programming is a pipedream.
>
> So is the idea of a 'perfect language'.

But I was not advocating that. You want punishment -- so you
implicitly are *demanding* programmer perfection.

> > There is a
> > reason we paint lines on the roads, wear seatbelts, put guardrails on
> > stairs and bridges.
>
> Yes. And we require licenses for dangerous activities
> elsewhere, but anyone can pick up a compiler and start playing
> around.
>
> > The problem of programmer safety can be attacked quite successfully at
> > the level of the programming language itself.
>
> It's quite easy to simply make the use of gets() and friends
> illegal for your code development. Most of us have already done
> so, without a standard body telling us to do it.

So, estimate the time taken to absorb this information per programmer,
multiply it by the average wage of that programmer, multiply that by
the number of programmers that follow that and there you get the cost
of doing it correctly. Add to that the cost of downtime for those that
get it wrong. (These are costs per year, of course -- since it's an
ongoing problem, the total cost would really be infinite.)

The standards body just needs to remove it and those costs go away.
Vendors and legacy defenders and pure idiot programmers might get their
panties in a bunch, but no matter how you slice it, the cost of doing
this is clearly finite.

> > There isn't actually a downside to removing gets() and deprecating
> > strtok and strnc??. (Hint: Legacy code uses legacy compilers.)
>
> Hint: Legacy code doesn't have to stay on the original platform.

Hint: moving code *ALWAYS* incurs costs. As I said above, it's a
*finite* cost. You don't think people who move code around with calls
to gets() in it should remove them?

> Even so, anyone dusting off an old program that doesn't go
> sifting through looking for the usual suspects is a fool.

And an old million line program? I think this process should be
automated. In fact, I think it should be automated in your compiler.
In fact I think your compiler should just reject these nonsensical
functions out of hand and issue errors complaining about them. Hey! I
have an idea! Why not remove them from the standard?

> I don't have a problem with taking gets() out of modern
> compilers, but as you already pointed out, this doesn't
> guarantee anything. People can still fire up an old compiler
> and use it. I don't see a realistic way for the C standard to
> enforce such things.

Interesting -- because I do. You make gets a reserved word, not
redefinable by the preprocessor, and have it always lead to a syntax
error. This forces legacy code owners to either remove it, or stay
away from new compilers.

This has value because developers can claim to be "C 2010 compliant"
or whatever, and this can tell you that you know it doesn't have gets()
or any other wart that you decided to get rid of. This would in turn
put pressure on the legacy code owners to remove the offending calls,
in an effort that's certainly no worse than the Y2K issue (without the
looming deadline hanging over their heads).

> >>> I consider the ANSI C standard committee basically coauthors
> >>> of every one of these problems.
> >>
> >> I couldn't disagree more. If programmers themselves were held
> >> responsible for their mistakes, instead of trying to blame it on
> >> loopholes or missing words in a huge document, we would be much
> >> better off.
> >
> > And what if its not the programmer's fault?
>
> It is the fault of the development team, comprised of whoever
> that involves for a given project. If the programmer feels like
> his boss screwed him over, let him refuse to continue, swear out
> an affidavit and have it notarized the bad software was
> knowingly shipped, and that you refuse to endorse it.

Oh I see. So, which socialist totally unionized company do you work as
a programmer for? I'd like to apply!

> > What if the programmer is being worked to death?
>
> That would be interesting, because although I have worked way
> more than my fair share of 120 hour weeks, I never died, and
> never heard of anyone dying. I have heard of a few losing it
> and checking themselves into psycho wards, but still.

Well ... they usually put in buffer overflows, backdoors, or otherwise
sloppy code before they check into these places.

> [...] If you
> are being overworked, you can either keep doing it, or you can
> quit, or you can convince your boss to lighten up.

Hmmm ... so you live in India? I'm trying to guess where it is in this
day and age that you can just quit your job solely because you don't
like the pressures coming from management.

> [...] ESPECIALLY in this case, the C standard folks are not to blame.

But if the same issue happens and you are using a safer language, the
same kinds of issues don't come up. Your code might be wrong, but it
won't allow buffer overflow exploits.

> > What if he's in a dispute with someone else
> > about how something should be done and lost the argument and
> > was forced to do things badly?
>
> Try and force me to write something in a way that I know is
> wrong. Go ahead, it'll be a short argument, because I will
> resign first.

That's a nice bubble you live in. Or is it just in your mind?

> Try and force a brain surgeon to operate on your head with a
> chainsaw. good luck.
>
> >> [...] If you could be fined or perhaps even jailed for
> >> gross neglicence in software development the way doctors can be
> >> today, I suspect the problem would be all but nonexistent.
> >
> > Ok, that's just vindictive nonsense.
>
> Why? We expect architects, doctors, lawyers, pretty much all
> other real 'professions' to meet and typically exceed a higher
> standard, and those that do not are punished, fined, or stripped
> of their license to practice in the field. Why should
> programmers get a pass? Is it because you do not feel it is a
> professional position?

Because it's not as structured, and that's simply not practical.
Doctors have training, internships, etc. Lawyers have to pass a bar
exam, etc. There's no such analogue for computer programmers. The
most successful programmers are always the ones that are able to think
outside the box, while the bar for average programmers is pretty low --
but both can make a contribution, and neither can guarantee perfect
code.

> We don't let anyone that wants to prescribe medicine, why should
> we let anyone that wants to put software up for download which
> could compromise system security?
>
> > Programmers are generally not aware of the liability of
> > their mistakes.
>
> Then those you refer to must be generally incompetent.

Dennis Ritchie had no idea that NASA would put a priority inversion in
their pathfinder code. Linus Torvalds had no idea that the NSA would
take his code and use it for a security based platform. My point is
that programmers don't know what the liability of their code is,
because they are not always in control of when or where or for what it
might be used.

The recent JPEG parsing buffer overflow exploit, for example, came from
failed sample code from the JPEG website itself. You think we should
hunt down Tom Lane and lynch him?

> [...] Those that are good certainly are aware, especially when
> the software is of a critical nature.
>
> > And mistakes are not completely removable --
>
> Correct. It's also not possible to completely remove medical
> malpractice, but it gets punished anyway. It's called a
> deterrent.

You don't think medical practitioners use the latest and safest
technology available to practice their medicine?

> > and there's a real question as to whether the rate can even be reduced.
>
> As long as there is no risk of failure, it almost certainly will
> not be reduced by magic or wishing.

This is utter nonsense. The reason for the success of languages like
Java and Python is not their speed, you know.

> > But if you were to truly enforce such an idea, I believe both C and C++
> > as programming languages would instantly disappear.
>
> I highly doubt that. Low-level language programmers would be
> the cream of the crop, not 'the lowest bidder' as is the case
> today.

You still don't get it. You, I, or anyone you know will produce errors
if pushed. There's no such thing as a 0 error rate for programming.
Just measuring first-compile error rates, myself, I score roughly
one syntax error per 300 lines of code. I take this as an indicator
for the likely number of hidden bugs I just don't know about in my
code. Unless my first-compile error rate were 0, I just can't have any
confidence that my hidden bug rate is 0 either. I know that
since using my own Bstrlib library, and other similar mechanisms, my
rate is probably far lower now than it's ever been. But it's still not 0.

Go measure your own first-compile error rate and tell me you are
confident in your own ability to avoid hidden bugs. If you still think
you can achieve a 0 or near 0 hidden bug rate, go look up "priority
inversion". No syntax checker and no run time debugger can tell you
about this sort of error. Your only chance of avoiding these sorts of
errors is having a very thoroughly vetted high level design.

> [...] You would not be hired to work based upon price, but on
> skill. Much as I would go look for the most expensive attorney
> I could find if I was on trial, I would look for the most highly
> skilled programmers I could find to work on a nuclear reactor.
>
> Taking bids and outsourcing to some sweatshop in a jungle
> somewhere would not be on the list of options.

For a nuclear reactor, I would also include the requirement that they
use a safer programming language like Ada. Personally I would be
shocked to know that *ANY* nuclear reactor control mechanism was
written in C. Maybe a low level I/O driver library, that was
thoroughly vetted (because you probably can't do that in Ada), but
that's it.

> > Nobody in their right mind, other than the most irresponsible
> > daredevils would program in these langauges if they were held
> > liable for their mistakes.
>
> I guess all the professionals in other fields where they are
> held up to scrutiny must be irresponsible daredevils too.

No -- they have great assistance and controlled environments that allow
them to perform under such conditions. Something akin to using a
better programming language.

> [...] For
> example, there are operations that have very low success rates,
> yet there are doctors that specialize in them anyway, despite
> the low odds.

Well, your analogy only makes some sense if you are talking about
surgeons in developing countries who simply don't have access to the
necessary anesthetic, support staff or even the proper education to do
the operation correctly. In those cases, there is little choice, so
you make do with what you have. But obviously it's a situation you just
want to move away from -- the way you solve it is you give them
access to the safer and better ways to practice medicine.

> If you don't want to take the risk, then go write in visual
> whatever#.net and leave it to those that are.

So you want some people to stay away from C because the language is too
dangerous, while I want the language to be fixed so that most people
don't trigger the landmines in the language so easily. If you think
about it, my solution actually *costs* less.

Magnus Wibeck

Aug 28, 2005, 5:44:11 AM
webs...@gmail.com wrote:
> The point of this being that in the end they were lucky to
> have very sophisticated 3rd party support that is well beyond anything
> that the C standard delivers.

You surely cannot be comparing "3rd party support" from a commercial
company to a language standard? They have totally different purposes.
That's like comparing a specification of a car to a taxi company,
and complaining that if you sit on the specification it doesn't get you
anywhere, but if you call the taxi company they get you where you tell them to.

> For a nuclear reactor, I would also include the requirement that they
> use a safer programming language like Ada.

The Ariane software module that caused the problem was written in Ada.
http://sunnyday.mit.edu/accidents/Ariane5accidentreport.html
Had it been written in C, the actual cause (integer overflow) probably would not
have caused an exception. I'm not saying that it would have been better in
C, but you *cannot* blame the C standard for what happened there.

Also, this "priority inversion" you speak of - doesn't that imply processes
or threads? C does not have that AFAIK. So you cannot blame the C standard
for allowing priority inversion bugs to occur. It neither allows nor disallows
them, because C has no notion of priorities.

/Magnus

Richard Kettlewell

Aug 28, 2005, 6:36:12 AM
webs...@gmail.com writes:
> Randy Howard wrote:
>> webs...@gmail.com wrote:

>>> Remember that almost every virus, buffer overflow exploit, core
>>> dump/GPF/etc is basically due to some undefined situation in the
>>> ANSI C standard.
>>
>> Not really. Those that defined early C, and later standard C are
>> not responsible for bad programming.
>
> Bad programming + good programming language does not allow for
> buffer overflow exploits. You still need a bad programming language
> to facilitate the manifestation of these worst case scenarios.

Exploits that rely on C undefined behaviour are not the only kind of
problem in reality. Programs not written in C sometimes have serious
security problems too.

For example lots of software has had various kinds of quoting and
validation bugs - SQL injection, cross-site scripting, inadequate
shell quoting - for many years, and this is a consequence purely of
the program, and cannot be pinned on the language it is written in.

You won't spot these bugs with tools such as Valgrind or Purify,
either.

--
http://www.greenend.org.uk/rjk/

webs...@gmail.com

Aug 28, 2005, 6:45:23 AM
Magnus Wibeck wrote:
> webs...@gmail.com wrote:
> > The point of this being that in the end they were lucky to
> > have very sophisticated 3rd party support that is well beyond anything
> > that the C standard delivers.
>
> You surely cannot be comparing "3rd party support" from a commercial
> company to a language standard?

Originally I was making a point about the mistake rate of programmers.
But more generally, the C language probably has more "problem support
tools" than any language in existence, and this will probably continue
to be true for the future regardless of language mindshare.

> [...] They have totally different purposes.
> That's like comparing a specification of a car to a taxi company,
> and complaining that if you sit on the specification it doesn't get you
> anywhere, but if you call the taxi company they get you where you tell them
> to.

Hmmm ... I'm not sure it's the same thing. For example let's say C
added a function: numallocs(), that counted the number of memory
allocations that are outstanding (or the maximum number that could be
legally freed or whatever.) Similarly, if the Boehm garbage collector
were adopted as part of the C standard (not that I'm advocating that.)
If the C library were to basically abandon its string functions and use
something like Bstrlib, for example, then David Wagner's (and many
other) buffer overflow security analysis tools would be obsolete.
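
Just to make the numallocs() idea concrete, here is one way it could be
approximated today with wrappers (a sketch only; the name and the idea of a
standard version are hypothetical):

#include <stdlib.h>

static size_t live_allocations;     /* allocations not yet freed */

void *counting_malloc(size_t n) {
    void *p = malloc(n);
    if (p != NULL)
        live_allocations++;
    return p;
}

void counting_free(void *p) {
    if (p != NULL)
        live_allocations--;
    free(p);
}

size_t numallocs(void) {            /* the proposed query */
    return live_allocations;
}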

> > For a nuclear reactor, I would also include the requirement that they
> > use a safer programming language like Ada.
>
> The Ariane software module that caused the problem was written in Ada.
> http://sunnyday.mit.edu/accidents/Ariane5accidentreport.html
> Had it been written in C, the actual cause (integer overflow) probably would
> not have caused an exception. I'm not saying that it would have been better
> in C, but you *cannot* blame the C standard for what happened there.

You are right, I cannot blame C for bugs that happen in other
languages. This is the most famous one from Ada. If you would like a
short list of infamous bugs for C just go through the CERT advisories
-- they are basically almost entirely C related.

See, the thing is, with Ada bugs, you can clearly blame the programmer
for most kinds of failures. With C you can go either way. But nearly
every software design house that writes lots of software in C just gets
bit by bugs from all sorts of edges of the language.

> Also, this "priority inversion" you speak of - doesn't that imply processes
> or threads? C does not have that AFAIK. So you cannot blame the C standard
> for allowing priority inversion bugs to occurr. It neither allows or
> disallows them, because C has no notion of priorities.

The programmer used priority based threading because that's what he had
available to him. Suppose, however, that C had implemented co-routines
(they require only barely more support than setjmp()/longjmp()). It
turns out that using coroutines alone, you can solve a lot of
multitasking problems. Maybe the Pathfinder code would have more
coroutines, and fewer threads, and may have avoided the problem
altogether (I am not privy to their source, so I really don't know).
This isn't just some weird snake oil style solution -- by their very
nature, coroutines do not have priorities, do not in and of themselves make
race conditions possible, and generally consume fewer resources than
threads.

Coroutines are one of those "perfect compromises": you can
easily specify a portable interface that is very likely to be widely
supportable, they are actually tremendously faster than threading in
many cases, and all without adding *any* undefined behavior or
implementation-defined behavior scenarios (other than a potential
inability to allocate new stacks). Full-blown multithreading, such as
in POSIX, is notoriously platform specific, and it should not surprise
anyone that only a few non-UNIX platforms support full-blown POSIX
threads. This fact has been noticed and adopted by those languages
where serious development is happening (Lua, Perl, Python). I don't
know if the C standards committee would be open to this -- I highly
doubt it.
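
Purely to illustrate how small such an interface could be, here is a
declaration-only sketch (every name is invented for the example; this is
not an existing API or an actual proposal):

#include <stddef.h>

typedef struct costate costate;      /* opaque coroutine handle */

/* create a coroutine with its own stack; returns NULL on failure */
costate *co_create(void (*fn)(costate *self, void *arg),
                   void *arg, size_t stack_size);
void co_resume(costate *co);          /* run it until it yields or returns */
void co_yield(costate *self);         /* hand control back to the resumer */
void co_delete(costate *co);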

Chris Hills

Aug 28, 2005, 7:03:46 AM
In article <1125225923.8...@g47g2000cwa.googlegroups.com>,
webs...@gmail.com writes

>Magnus Wibeck wrote:
>> webs...@gmail.com wrote:
>> > The point of this being that in the end they were lucky to
>> > have very sophisticated 3rd party support that is well beyond anything
>> > that the C standard delivers.
>>
>> You surely cannot be comparing "3rd party support" from a commercial
>> company to a language standard?

why not?

>
>> > For a nuclear reactor, I would also include the requirement that they
>> > use a safer programming language like Ada.
>>
>> The Ariane software module that caused the problem was written in Ada.
>> http://sunnyday.mit.edu/accidents/Ariane5accidentreport.html
>> Had it been written in C, the actual cause (integer overflow) probably would
>> not have caused an exception. I'm not saying that it would have been better
>> in C, but you *cannot* blame the C standard for what happened there.
>
>You are right, I cannot blame C for bugs that happen in other
>languages. This is the most famous one from Ada. If you would like a
>short list of infamous bugs for C just go through the CERT advisories
>-- they are basically almost entirely C related.


Possibly because C is more widely and less rigorously used? I would
expect that most Ada projects are high integrity and developed as such.
C is often not used (and certainly not taught) in a high-integrity
environment.

>See, the thing is, with Ada bugs, you can clearly blame the programmer
>for most kinds of failures.

AFAIK the Ariane problem was one of project management

> With C you can go either way. But nearly
>every software design house that writes lots of software in C just gets
>bit by bugs from all sorts of edges of the language.

So use a subset? Many industries do.


--
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
\/\/\/\/\ Chris Hills Staffs England /\/\/\/\/
/\/\/ ch...@phaedsys.org www.phaedsys.org \/\/\
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/

kuy...@wizard.net

unread,
Aug 28, 2005, 9:26:51 AM8/28/05
to
webs...@gmail.com wrote:
> Randy Howard wrote:
> > webs...@gmail.com wrote:
...

> > > You still need a bad programming language to facilitate the
> > > manifestation of these worst case scenarios.
> >
> > If you wish to argue that low-level languages are 'bad', I will
> > have to disagree.
>
> So why put those words in my mouth?

He didn't - he's just pointing out that the characteristics you deplore
in C are inherent in C being a low-level language. Therefore, any
criticism of C for possessing those characteristics implies a criticism
of all low-level languages. You didn't actually make such a criticism,
but it was implied by the criticism you did make.

...


> Your problem is that you assume making C safer (or faster, or more
> portable, or whatever) will take something useful away from C that it
> currently has. Think about that for a minute. How is possible that
> your mind can be in that state?

Possibly, possession of a certain minimal state of awareness of
reality? No one wants C to be unsafe, slow, or unportable. As a general
rule, the cost-free ways of making it safer, faster, and more portable
have already been fully exploited. Therefore, the remaining ways are
disproportionately likely to carry a significant cost.

This is simple economics: cost-free or negative-cost ways of improving
anything are usually implemented quickly. With any reasonably mature
system, the ways of improving the system that haven't been implemented
yet are disproportionately likely to carry a significant cost.

...


> > So is the idea of a 'perfect language'.
>
> But I was not advocating that. You want punishment -- so you
> implicitely are *demanding* programmer perfection.

By that logic, requiring punishment for theft implicitly demands human
perfection?

...


> get it wrong. (These are costs per year, of course -- since its an on
> going problem, the total cost would really be infinite.)

You're failing to take into consideration the cost of capital. Costs
that take place in the future are less expensive in present-day dollars
than costs that take place in the present. The net present value of a
steady annual cost is finite, so long as the cost of capital is
positive.
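
(A rough sketch of the arithmetic, with made-up numbers: a steady cost
of C dollars per year, discounted at a cost of capital r, has a net
present value of C/r. At $1 million per year and r = 10%, that comes to
about $10 million in today's dollars -- large, but nowhere near
infinite.)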

...


> The standards body, just needs to remove it and those costs go away.
> Vendors and legacy defenders and pure idiot programmers might get their
> panties in a bunch, but no matter how you slice it, the cost of doing
> this is clearly finite.

You're assuming that those programmers are idiots, instead of being
intelligent people who are actually aware of what the ongoing (i.e. by
your way of calculating things, infinite) costs of such a change will
be.

> > I don't have a problem with taking gets() out of modern
> > compilers, but as you already pointed out, this doesn't
> > guarantee anything. People can still fire up an old compiler

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > and use it. ...
^^^^^^^^^^^

> > ... I don't see a realistic way for the C standard to


> > enforce such things.
>
> Interesting -- because I do. You make gets a reserved word, not
> redefinable by the preprocessor, and have it always lead to a syntax
> error. This forces legacy code owners to either remove it, or stay
> away from new compilers.

How in the world does changing new compilers have any effect on people
who "fire up an old compiler and use it"?

...


> > It is the fault of the development team, comprised of whoever
> > that involves for a given project. If the programmer feels like
> > his boss screwed him over, let him refuse to continue, swear out
> > an affidavit and have it notarized the bad software was
> > knowingly shipped, and that you refuse to endorse it.
>
> Oh I see. So, which socialist totally unionized company do you work as
> a programmer for? I'd like to apply!

What do socialism and unionism have to do with workers accepting full
responsibility for the quality of their product?

> > That would be interesting, because although I have worked way
> > more than my fair share of 120 hour weeks, I never died, and
> > never heard of anyone dying. I have heard of a few losing it
> > and checking themselves into psycho wards, but still.
>
> Well ... they usually put in buffer overflows, backdoors, or otherwise
> sloppy code before they check into these places.

Backdoors are, by definition, installed deliberately. I suppose you
might have intended to imply that overworked programmers would install
backdoors as a way of getting revenge for being overworked, but if so,
you didn't express that idea properly.

> > [...] If you
> > are being overworked, you can either keep doing it, or you can
> > quit, or you can convince your boss to lighten up.
>
> Hmmm ... so you live in India? I'm trying to guess where it is in this
> day and age that you can just quit your job solely because you don't
> like the pressures coming from management.

I'm curious - what part of the world do you live in where you are
prohibited from quitting your job? I don't understand your reference to
India - are you suggesting that it is the only place in the world where
workers aren't slaves?

...


> > Try and force me to write something in a way that I know is
> > wrong. Go ahead, it'll be a short argument, because I will
> > resign first.
>
> That's a nice bubble you live in. Or is it just in your mind?

I live in that same bubble. I'm free to quit my job for any reasons I
want to, at any time I want to. I would stop being paid, I'd have to
start searching for a new job at a better employer, and I'd have to pay
full price if I decided to use the COBRA option to continue receiving
the insurance benefits that my employer currently subsidizes, but those
are just consequences of my decision, not things that would prevent me
from making it. If I decide to obey orders to produce defective code, I
have to accept the consequences of being responsible for bad code. If I
prefer the consequences of having to look for a new job at a better
employer, that's precisely what I'll do. Wouldn't you?

...


> Dennis Ritchie had no idea that NASA would put a priority inversion in
> their pathfinder code. Linus Torvalds had no idea that the NSA would
> take his code and use it for a security based platform. My point is
> that programmers don't know what the liability of their code is,
> because they are not always in control of when or where or for what it
> might be used.

When you take someone else's code and use it in a context that it
wasn't designed for, the responsibility for adapting it to be suitable
for use in the new context is yours, not the original author's.

> > > But if you were to truly enforce such an idea, I believe both C and C++

> > [...] For
> > example, there are operations that have very low success rates,
> > yet there are doctors that specialize in them anyway, despite
> > the low odds.
>
> Well, your analogy only makes some sense if you are talking about
> surgeons in developing countries who simply don't have access to the
> necessary anesthetic, support staff or even the proper education to do
> the operation correctly.

Which would you prefer: a life expectancy of three months, or a 30%
chance of increasing your life expectancy to 20 years, inextricably
linked with a 70% chance of dying in the operating room tomorrow? There
are real-life situations where the best doctors in the world, with the
best equipment in the world, can't offer you a choice that's any more
attractive than that one.

> ... In those cases, there is little choice, so


> you make do with what you have. But obviously its a situation you just
> want to move away from -- they way you solve it, is you give them
> access to the safer, and better ways to practice medicine.

I suspect that no matter how advanced our medicine gets, there will
always be conditions that it's just barely able to deal with. The
longer we live, the harder it is to keep us living; that's pretty much
unavoidable.

Hallvard B Furuseth

unread,
Aug 28, 2005, 10:38:02 AM8/28/05
to
Paul Hsieh writes:

> Remember that almost every virus, buffer overflow exploit, core
> dump/GPF/etc is basically due to some undefined situation in the
> ANSI C standard. I consider the ANSI C standard committee
> basically coauthors of every one of these problems.

So it's partly their fault? What should they have done -
refrained from standardizing the already existing C language?
That would not have helped: K&R C was already widely used, and
people were cooperating anyway to get some sort of portability out
of it.

Or should they have removed every undefined situation from the
language? Bye bye free() and realloc() - require a garbage
collector instead. To catch all bad pointer usage, insert
type/range information in both pointers and data. Those two
changes alone in the standard would change the C runtime
implementation so much that it's practically another language.
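
For concreteness, the pointer half of that change amounts to something
like the following "fat pointer" sketch -- names invented here, no real
implementation is being quoted -- where every pointer grows to three
words and every dereference pays for a check:

#include <stdlib.h>

struct checked_ptr {
    char   *base;      /* start of the underlying object */
    size_t  len;       /* size of that object in bytes   */
    size_t  offset;    /* current position within it     */
};

static char checked_deref(struct checked_ptr p)
{
    if (p.offset >= p.len)
        abort();       /* out of bounds: trap instead of corrupting memory */
    return p.base[p.offset];
}

That cost on every pointer and on every access is what makes it
"practically another language".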

An _implementation_ which catches such things can be nice when you
already have a C program which you want to run safely. But if the
language standard itself made such requirements, a number of the
reasons that exist to choose C for a project would not be there.

If one is going to use another language than C, it's better to use
a language which takes advantage of not being C, instead of a
language which pretends to be C but isn't.

--
Hallvard

Chris Torek

unread,
Aug 28, 2005, 1:58:59 PM8/28/05
to
(Again, quite off-topic, but ...)

[Ariane rocket example]

In article <1125225923.8...@g47g2000cwa.googlegroups.com>


<webs...@gmail.com> wrote:
>You are right, I cannot blame C for bugs that happen in other

>languages. This is the most famous one from Ada. ...


>See, the thing is, with Ada bugs, you can clearly blame the programmer
>for most kinds of failures.

I am reminded of a line from a novel and movie:

"*We* fix the blame. *They* fix the problem. Their way's better."

[Pathfinder example]


>The programmer used priority based threading because that's what he had
>available to him.

Actually, the Pathfinder used vxWorks, a system with which I am
now somewhat familiar. (Not that I know much about versions
predating 6.0, but this particular item has been this way "forever",
or long enough anyway.)

The vxWorks system offers "mutex semaphores" as one of its several
flavors of data-protection between threads. The mutex creation
call, semMCreate(), takes several flag parameters. One of these
flags controls "task" (thread, process, whatever moniker you prefer)
priority behavior when the task blocks on the mutex.

The programmer *chose* this behavior, because vxWorks does offer
priority inheritance. (Admittedly, vxWorks priority inheritance
has a flaw, but that is a different problem.)

Thus, your premise -- that the programmer used priority based
scheduling (without inheritance) that led to the priority inversion
problem "because that's what he had available" is incorrect: he
could have chosen to make all the threads the same priority, and/or
used priority inheritance, all with simple parameters to the various
calls (taskSpawn(), semMCreate(), and so on).
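
From memory -- so treat this as a sketch of the vxWorks API rather than
anything resembling the Pathfinder source -- the difference really is a
single flag at mutex creation time:

#include <vxWorks.h>
#include <semLib.h>

SEM_ID dataMutex;

void initLocks(void)
{
    /* Priority queuing WITHOUT inversion protection -- the choice that
       bites when a low-priority holder gets preempted by a medium task:
       dataMutex = semMCreate(SEM_Q_PRIORITY);                          */

    /* Same call, priority inheritance enabled: */
    dataMutex = semMCreate(SEM_Q_PRIORITY | SEM_INVERSION_SAFE);
}

void touchSharedData(void)
{
    semTake(dataMutex, WAIT_FOREVER);  /* holder inherits blocker's priority */
    /* ... update shared state ... */
    semGive(dataMutex);
}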

>Coroutines are one of those "perfect compromises" ...

Coroutines are hardly perfect. However, if you like them, I suggest
you investigate the Icon programming language, for instance.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

Keith Thompson

unread,
Aug 28, 2005, 2:53:09 PM8/28/05
to
Magnus Wibeck <magnus.wib...@telia.com> writes:
[...]

> The Ariane software module that caused the problem was written in
> Ada. http://sunnyday.mit.edu/accidents/Ariane5accidentreport.html
> Had it been written in C, the actual cause (integer overflow)
> probably would not have caused an exception. I'm not saying that it
> would have been better in C, but you *cannot* blame the C standard
> for what happened there.

Nor can it be blamed on Ada. (Not that you did so, I just wanted to
clarify that point.)

The details are off-topic but easily Googlable.

Magnus Wibeck

unread,
Aug 28, 2005, 6:08:40 PM8/28/05
to
Chris Hills wrote:
[..]

>>Magnus Wibeck wrote:
>>>You surely cannot be comparing "3rd party support" from a commercial
>>>company to a language standard?
>
> why not?

I find such a comparison void of meaning. Like comparing apples and anxiety.
A language standard is a passive item that describes one specific thing.
You cannot pay it money to do what you want.
A support department in a commercial company (should) bend over backwards
to help its customers getting issues with the company's products sorted out.

I missed the point websnarf was making, which, if I understand it correctly,
is that the fact that there is (lots of) support for products that use C
somehow implies that C is an unsafe language.

If that is the point websnarf was trying to make, the comparison should
be the frequency of 3rd party support contacts made regarding C casued
problems compared to other languages. Obviously numbers that are darn near
impossible to gather.

I'm not getting into the "C is unsafe" discussion, I just saw a few,
as I see it, flawed, deductions about C and the "unsafeness" of it,
and tried to address them.

/Magnus

webs...@gmail.com

unread,
Aug 28, 2005, 8:24:19 PM8/28/05
to
Hallvard B Furuseth wrote:
> Paul Hsieh writes:
>
> > Remember that almost every virus, buffer overflow exploit, core
> > dump/GPF/etc is basically due to some undefined situation in the
> > ANSI C standard. I consider the ANSI C standard committee
> > basically coauthors of every one of these problems.
>
> So it's partly their fault? What should they have done -
> refrained from standardizing the already existing C language?

The ANSI C standard was good enough for 1989, when the computing
industry was still in its growing stages. It served the basic purpose
of standardizing everyone behind a common standard.

It's the standards that came *AFTER* that where the problem is. The
problem of "buffer overflows" and similar problems was well documented
and even then were making the news. And look at the near unilateral
ambivalence to the C99 standard by compiler vendors. The point is that
the "coming together" has already been achieved -- the vendors have
already gotten the value out of a unified standard from the 1989
standard. The C99 standard doesn't solve any crucial problems of a
similar nature.

But suppose the C99 standard (or C94, or some future standard) included
numerous changes for the purposes of security, that broke backwards
compatibility. If there are vendors who are concerned about backward
compatibility they would just stick with the older standard (which is
what they are doing right now anyways) and if they felt security was
more important then they would move towards the new standard.

The point being that the real reason there has been so little C99
adoption is because there is little *value* in it. The foremost thing
it delivers is backwards compatibility -- but it's something the
compiler vendors *ALREADY HAVE* by sticking with the previous
standards.

Because C99 has so little value add over C89, there is no demand for
it. And it fundamentally means the language really only solves the
same problems that it did in 1989. Even for me, restrict was really
the only language feature I was remotely interested in, and <stdint.h>
the only other thing in the standard that has any real value in it.
But since the vendors I use are not interested in implementing C99, I
have lived with "assume no aliasing" compiler switches and I have
fashioned my very own stdint.h. It turns out that, in practice, this
completely covers my C99 needs -- and I'm sure it solves most people's
C99 needs. And this is just using 1989 C compiler technology.
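
For the curious, a home-grown <stdint.h> substitute of that sort is only
a page of typedefs. A minimal sketch, assuming a typical ILP32 compiler
(a real version needs per-platform #ifdefs):

#ifndef MY_STDINT_H
#define MY_STDINT_H

typedef signed char      int8_t;
typedef unsigned char    uint8_t;
typedef short            int16_t;
typedef unsigned short   uint16_t;
typedef int              int32_t;
typedef unsigned int     uint32_t;

/* C89-style compile-time checks: a negative array size on mismatch. */
typedef char assert_int16_is_2_bytes[(sizeof(int16_t) == 2) ? 1 : -1];
typedef char assert_int32_is_4_bytes[(sizeof(int32_t) == 4) ? 1 : -1];

#endif /* MY_STDINT_H */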

If the C99 standard had solved *important* problems that are plaguing
programmers today, then I think there would be more demand. You might
cause a fracture in the C community, but at least there would be some
degree of keeping up with the needs of the C community. And the reason
*WHY* such things should be solved in the C standard and not just other
languages is because this is where the largest problems are, and where
the effect can be leveraged to the greatest degree.

> That would not have helped: K&R C was already widely used, and
> people were cooperating anyway to get some sort of portability out
> of it.

Right. It would not have helped in 1989. But C99 doesn't help anyone
today. That's the key point.

> Or should they have removed every undefined situation from the
> language?

No, just the worst offenders.

> [...] Bye bye free() and realloc() - require a garbage


> collector instead. To catch all bad pointer usage, insert
> type/range information in both pointers and data. Those two
> changes alone in the standard would change the C runtime
> implementation so much that it's practically another language.

Right. That's not what I am advocating.

Tell me, if you removed gets, strtok, and strn??, would you also have
practically another language?

--

Randy Howard

unread,
Aug 29, 2005, 12:02:54 AM8/29/05
to
webs...@gmail.com wrote
(in article
<1125209654.0...@o13g2000cwo.googlegroups.com>):

> Randy Howard wrote:

>>> Bad programming + good programming language does not allow for buffer
>>> overflow exploits.
>>
>> For suitably high-level languages that might be true (and
>> provable). Let us not forget that C is *not* a high-level
>> language. It's not an accident that it is called high-level
>> assembler.
>
> Right. If you're not with us, you are with the terrorists.

Excuse me?

> Why does being a low-level language mean you have to present a programming
> interface surrounded by landmines?

If you have access to any sequence of opcodes available on the
target processor, how can it not be?

> Exposing a sufficiently low level
> interface may require that you expose some dangerous semantics, but why
> expose them up front right in the most natural paths of usage?

Do you feel that 'gets()' is part of the most natural path in C?


>> I'd love for you to explain to us, by way of example, how you
>> could guarantee that assembly programmers can not be allowed to
>> code in a way that allows buffer overflows.
>
> Ok, the halting problem means basically nobody guarantees anything
> about computer programming.

Fair enough, but you're just dodging the underlying question.

> But its interesting that you bring up the questions of assembly
> language. If you persuse the x86 assembly USENET newsgroups, you will
> see that many people are very interested in expanding the power and
> syntax for assembly language (examples include HLA, RosAsm, and
> others).

For a suitably generous definition of 'many', perhaps.

> A recent post talked about writing a good string library for
> assembly, and there was a strong endorsement for the length prefixed
> style of strings, including one direct reference to Bstrlib as a design
> worth following (not posted by me!).

I would have been shocked if you had not figured out a way to
bring your package up. :-)

> So, while assembly clearly isn't an inherently safe language, it seems
> quite possible that some assembly efforts will have a much safer (and
> much faster) string interface than C does.

Which does absolutely nothing to prevent the possibility of
developing insecure software in assembler. It may offer some
advantages for string handling, but that closes at best only one
of a thousand doors.

>> [...] If you want to argue that too many people
>> write code in C when their skill level is more appropriate to a
>> language with more seatbelts, I won't disagree. The trick is
>> deciding who gets to make the rules.
>
> But I'm not arguing that either. I am saying C is to a large degree
> just capriciously and unnecessarily unsafe (and slow, and powerless,
> and unportable etc., etc).

Slow? Yes, I keep forgetting how much better performance one
achieves when using Ruby or Python. Yeah, right.

Powerless? How so? It seems to be the only language other than
assembler which has been used successfully for operating system
development.

Unportable? You have got to be kidding. I must be
hallucinating when I see my C source compiled and executing on
Windows, Linux, NetWare, OS X, Solaris, *bsd, and a host of
other UNIX-like platforms, on x86, x86-64, PPC, Sparc, etc.

>>> Ok, this is what I was talking about when I mentioned rose colored
>>> glasses. If programmers are perfect, then what you are saying is fine,
>>> because you can expect perfection. But real people are not. And I
>>> think expectations of perfection in programming is really nonsensical.
>>
>> /Exactly/ Expecting zero buffer overruns is nonsensical.
>
> Well, not exactly. If you're not using C or C++, then buffer overflows
> usually at worst lead to a runtime exception; in C or C++, exploits are
> typically designed to gain shell access in the context of the erroneous
> program. It's like honey for bees -- people attack C/C++ programs
> because they have this weakness. In other safer programming languages,
> even if you had a buffer overflow, allowing a control flow
> zombification of the program is typically not going to be possible.

That is all true, and it does nothing to address the point that
C is still going to be used for a lot of development work. The
cost of the runtime error handling is nonzero. Sure, there are
a lot of applications today where they do not need the raw speed
and can afford to use something else. That is not always the
case. People are still writing a lot of inline assembly even
when approaching 4GHz clock speeds.

>> Anyway, a language so restrictive as to guarantee that nothing
>> can go wrong will probably never be used for any real-world
>> project.
>
> How about simpler language that is more powerful, demonstrably faster,
> more portable (dictionary definition), obviously safer and still just
> as low level?

That would be nice.

> Just take the C standard, deprecate the garbage, replace
> a few things, genericize some of the APIs, well define some of the
> scenarios which are currently described as undefined, make some of the
> ambiguous syntaxes that lead to undefined behavior illegal, and you're
> immediately there.

I don't immediately see how this will be demonstrably faster,
but you are free to invent such a language tomorrow afternoon.
Do it, back up your claims, and no doubt the world will beat a
path to your website. Right? "D" is already taken, what will
you call it?

> Your problem is that you assume making C safer (or faster, or more
> portable, or whatever) will take something useful away from C that it
> currently has. Think about that for a minute. How is possible that
> your mind can be in that state?

It isn't possible. What is possible is for you to make gross
assumptions about what 'my problem' is based up the post you are
replying to here. I do not assume that C can not be made safer.
What I said, since you seem to have missed it, is that the
authors of the C standard are not responsible for programmer
bugs.

>> So is the idea of a 'perfect language'.
>
> But I was not advocating that. You want punishment -- so you
> implicitely are *demanding* programmer perfection.

No, I am not. I do not demand that doctors are perfect, but I
expect them to be highly motivated to attempt to be perfect.

>> It's quite easy to simply make the use of gets() and friends
>> illegal for your code development. Most of us have already done
>> so, without a standard body telling us to do it.
>
> So, estimate the time taken to absorb this information per programmer,
> multiply it by the average wage of that programmer, multiply that by
> the number of programmers that follow that and there you get the cost
> of doing it correctly.

What cost? Some 'world-wide rolled-up cost'? For me, it cost
me almost nothing at all. I first discovered gets() was
problematic at least a decade ago, probably even earlier, but I
don't keep notes on such things. It hasn't cost me anything
since. If I hire a programmer, this has all been settled to my
satisfaction before they get an offer letter. It hasn't been a
problem and I do not expect it to be one in the future.

> The standards body, just needs to remove it and those costs go away.

They do not. As we have already seen, it takes years, if not
decades for a compiler supporting a standard to land in
programmer hands. With the stunningly poor adoption of C99, we
could not possibly hope to own or obtain an open source C0x
compiler prior to 2020-something, if ever. In the mean time,
those that are serious solved the problem years ago.

> You don't think people who move code around with calls
> to gets() in it should remove them?

Of course I do. In fact, I say so, which you conveniently
quoted just below...

>> Even so, anyone dusting off an old program that doesn't go
>> sifting through looking for the usual suspects is a fool.
>
> And an old million line program?

Didn't /you/ just say that they should be removed?

> I think this process should be
> automated. In fact, I think it should be automated in your compiler.
> In fact I think your compiler should just reject these nonsensical
> functions out of hand and issue errors complaining about them.

Make up your mind. Fixing them in the compiler, as I would
expect an 'automated' solution to do, and rejecting the
offending lines are completely different approaches.

> Hey! I have an idea! Why not remove them from the standard?

Great idea. 15 years from now that will have some value.

A better idea. Patch gcc to bitch about them TODAY, regardless
of the standard.

>> I don't have a problem with taking gets() out of modern
>> compilers, but as you already pointed out, this doesn't
>> guarantee anything. People can still fire up an old compiler
>> and use it. I don't see a realistic way for the C standard to
>> enforce such things.
>
> Interesting -- because I do. You make gets a reserved word, not
> redefinable by the preprocessor, and have it always lead to a syntax
> error.

What part of 'people can still fire up and old compiler' did you
fail to read and/or understand?

> This has value because, developers can claim to be "C 2010 compliant"
> or whatever, and this can tell you that you know it doesn't have gets()
> or any other wart that you decided to get rid of.

They could also simply claim "we are smarter than the average
bear, and we know better to use any of the following offensive
legacy functions, such as gets(), ..."

To clarify, since it didn't soak in the first time, I am not
opposed to them being removed. I simply don't see this as a magic
bullet, and certainly not in the sense that it takes far too
long for the compilers to catch up with it. I would much rather
see compilers modified to deny gets() and its ilk by default,
and require a special command line option to bypass it, /if at
all/. However, the warning message should be far more useful
than
gets.c: 325: error: gets() has been deprecated.

That's just oh so useful, especially to newbies. I wouldn't
care if it dumped a page and a half of explanation, along with a
detailed example of how to replace such calls with something
safer. After all, good code doesn't have it in them anyway, and
it won't annoy anyone that is competent.
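
As it happens, gcc users can get most of that effect today without
waiting for any standard. A sketch -- the poison pragma is a GCC
extension, and the fallback macro is a crude hack that treads on a
reserved name, so neither is offered as a portable answer:

#include <stdio.h>

#ifdef __GNUC__
#pragma GCC poison gets    /* any later use of gets is a hard error */
#else
#define gets(s) do_not_use_gets__see_fgets(s)  /* forces a compile/link error */
#endif

int main(void)
{
    char buf[80];
    /* gets(buf); */       /* would now be rejected */
    if (fgets(buf, sizeof buf, stdin) != NULL)
        printf("%s", buf);
    return 0;
}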

> This would in turn
> put pressure of the legacy code owners to remove the offending calls,
> in an effort that's certainly no worse than the Y2K issue (without the
> looming deadline hanging over their heads).

If, and only if, they use a compiler with such changes. We
still see posts on a regular basis with people using old 16-bit
Borland compilers to write new software.

>>> And what if its not the programmer's fault?
>>
>> It is the fault of the development team, comprised of whoever
>> that involves for a given project. If the programmer feels like
>> his boss screwed him over, let him refuse to continue, swear out
>> an affidavit and have it notarized the bad software was
>> knowingly shipped, and that you refuse to endorse it.
>
> Oh I see. So, which socialist totally unionized company do you work as
> a programmer for? I'd like to apply!

I don't think you understood me. I know of no company that has
a policy for this. However, if I was working on something and
felt that something was being done that could be inherently
dangerous, and it was going to ship anyway, I would take some
form of legal action, if for no other reason than to be able to
disassociate myself from the impending lawsuits.

I would much rather go look for work than participate in
something that might wind up with people dying over the actions
of some meddling manager.

>> [...] If you
>> are being overworked, you can either keep doing it, or you can
>> quit, or you can convince your boss to lighten up.
>
> Hmmm ... so you live in India?

Why would you think so?

> I'm trying to guess where it is in this
> day and age that you can just quit your job solely because you don't
> like the pressures coming from management.

Where do you live? Because I am trying to guess where on the
planet you would /not/ have the right to quit your job.
Indentured servitude is not widely practiced anymore, AFAIK.

>> [...] ESPECIALLY in this case, the C standard folks are not to blame.
>
> But if the same issue happens and you are using a safer language, the
> same kinds of issues don't come up. Your code might be wrong, but it
> won't allow buffer overflow exploits.

You can have 10 dozen other forms of security failure, that have
nothing to do with buffer overflows. It isn't a panacea. When
one form of attack is removed, another one shows up.

For example, the last straw the sent Microsoft windows off my
network for eternity happened recently. A computer system
running XP, SP2, all the patches, automatic Windows updates
daily, virus software with automatic updates and real-time
protection, email-virus scanning software, two different brands
of spyware protection, also with automatic updates enabled, and
both a hardware firewall and software firewall installed, got
covered up in viruses after 2 hours of letting my kids use it to
go play some stupid online kids game on disney.com or
nickelodeon.com (not sure which, since they went to both, and I
didn't want to replicate it). Suddenly, when I come back to
look at it, it has 3 or 4 new taskbar icons showing downloads in
progress of I know not what, task manager shows a bunch of extra
processes that shouldn't be there, the registry run keys are
stuffed full of malware, and it's pushing stuff out the network
of I know not what. I pull the cable, start trying to delete
files, which Windows wants to tell me I don't have permission to
do, scanning, the browser cache directories are filled with .exe
and .dll files, it's out of control.

A few expletives later, and I was installing a new Linux distro
that I had been meaning to try out for a while.

I had done just about everything I could imagine to lock the
system down, and it still got out of control in 2 hours letting
a 12-yr-old browse a website and play some games.

Of course, if enough people do the same thing, the bad guys will
figure out how to do this on Linux boxes as well. But for now,
the OS X and Linux systems have been causing me (and the kids)
zero pain and I'm loving it.

>> Try and force me to write something in a way that I know is
>> wrong. Go ahead, it'll be a short argument, because I will
>> resign first.
>
> That's a nice bubble you live in. Or is it just in your mind?

No, I'm just not a spineless jellyfish. It's rather
disappointing that it surprises you, it doesn't say much for
your own backbone that you would just roll over when faced with
this sort of thing.

>> We expect architects, doctors, lawyers, pretty much all
>> other real 'professions' to meet and typically exceed a higher
>> standard, and those that do not are punished, fined, or stripped
>> of their license to practice in the field. Why should
>> programmers get a pass? Is it because you do not feel it is a
>> professional position?
>
> Because its not as structured, and that's simply not practical.
> Doctors have training, internships, etc. Lawyers have to pass a bar
> exam, etc. There's no such analogue for computer programmers.

Thank you. You get it now. That is exactly what is missing.

> Because the most successful programmers are always ones that are
> able to think outside the box,

Then they should have zero problems passing a rigorous training
program and examinations.

> but the bar for average programmers is pretty low --

Fine. If you don't have your cert, you can be a 'nurse', you
can write scripts, or use uber-safe languages certified for
those not willing to prove themselves worthy through formal
certification.

> but both can make a contribution, and neither can guarantee
> perfect code.

And no doctor can guarantee that you won't die on the operating
table. But, they have to prove that they are competent anyway,
despite the lack of a guarantee of perfection. Would you like
it if they didn't have to do so?

>>> Programmers are generally not aware of the liability of
>>> their mistakes.
>>
>> Then those you refer to must be generally incompetent.
>
> Dennis Ritchie had no idea that NASA would put a priority inversion in
> their pathfinder code.

Are you implying that Dennis Ritchie is responsible for some bad
code in the pathfinder project?

> Linus Torvalds had no idea that the NSA would
> take his code and use it for a security based platform.

Is there any evidence that the NSA chose his code because it was
not worth fooling with? What is your point? Oh, you're going
to tell us...

> My point is
> that programmers don't know what the liability of their code is,
> because they are not always in control of when or where or for what it
> might be used.

Wow, that is tortured at best. Presumably Ritchie is in your
list because of C or UNIX? How could he be 'liable' for an
application or driver written by somebody else 30 years later?

Are the contributors to gcc responsible for every bad piece of
software compiled with it?

If someone writes a denial-of-service attack program that sits
on a Linux host, is that Torvald's fault? I've heard of people
trying to shift blame before, but not that far. Maybe you might
want to blame Linus' parents too, since if they hadn't conceived
him, Linux wouldn't be around for evil programmers to write code
upon. Furrfu.

> The recent JPEG parsing buffer overflow exploit, for example, came from
> failed sample code from the JPEG website itself. You think we should
> hunt down Tom Lane and lynch him?

Nope. If you take sample code and don't investigate it fully
before putting it into production use, that's /your/ problem.
You think a doctor would take a sample of medicine he found
laying on a shelf in 7-11 and administer it to a patient in the
hopes that it would work? Downloading source off the web and
using it without reading and understanding it is similarly
irresponsible, although with perhaps less chance (although no
guarantee) of it killing someone.

>> I highly doubt that. Low-level language programmers would be
>> the cream of the crop, not 'the lowest bidder' as is the case
>> today.
>
> You still don't get it. You, I or anyone you know, will produce errors
> if pushed. There's no such thing as a 0 error rate for programming.

Then I do get it, because I agree with you. Let me know when I
can write a device driver in Python.

> Just measuring first time compile error rates, myself, I score roughly
> one syntax error per 300 lines of code. I take this as an indicator
> for the likely number of hidden bugs I just don't know about in my
> code. Unless my first-compile error rate was 0, I just can't have any
> confidence that I also have a 0 hidden bug rate.

Strange logic, or lack thereof. Having no first-compile errors
doesn't provide ANY confidence that you don't have hidden bugs.

> Go measure your own first-compile error rate and tell me you are
> confident in your own ability to avoid hidden bugs.

That would be pointless, since measuring first-compile error
rate proves zilch about overall bug rates. If you want to avoid
hidden bugs, you have to actively look for them, test for them,
and code explicitly to avoid them, regardless of how often your
compiler detects a problem.

> If you still think you can achieve a 0 or near 0 hidden bug rate,

[snip, no sense following a false premise]

> For a nuclear reactor, I would also include the requirement that they
> use a safer programming language like Ada. Personally I would be
> shocked to know that *ANY* nuclear reactor control mechanism was
> written in C. Maybe a low level I/O driver library, that was
> thoroughly vetted (because you probably can't do that in Ada), but
> that's it.

Well gee, there you have it. It seems that there are some
places were C is almost unavoidable. What a shock. Who's
wearing those rose-colored glasses now?

>> [...] For
>> example, there are operations that have very low success rates,
>> yet there are doctors that specialize in them anyway, despite
>> the low odds.
>
> Well, your analogy only makes some sense if you are talking about
> surgeons in developing countries who simply don't have access to the
> necessary anesthetic, support staff or even the proper education to do
> the operation correctly. In those cases, there is little choice, so
> you make do with what you have. But obviously its a situation you just
> want to move away from -- they way you solve it, is you give them
> access to the safer, and better ways to practice medicine.

You seem to ignore the /fact/ that even in the finest medical
facilities on the planet (argue where they are elsewhere) there
are medical operations that have very low success rates, yet
they are still attempted, usually because the alternative is
certain death. A 20% chance is better than zero.

>> If you don't want to take the risk, then go write in visual
>> whatever#.net and leave it to those that are.
>
> So you want some people to stay away from C because the language is too
> dangerous.

So are chainsaws, but I don't want chainsaws to be illegal, they
come in handy. So are steak knives, and despite them being illegal
on airplanes, being stuck with plastic 'sporks' instead, I still
like them when cutting into a t-bone. You can not eliminate all
risk.

Do you really think you can do anything to a language that
allows you to touch hardware that will prevent people from
misusing it? Not all development work is for use inside a VM or
other sandbox.

> While I want the language be fixed so that most people
> don't trigger the landmines in the language so easily.

I am not opposed to the language removing provably faulty
interfaces, but I do not want its capabilities removed in other
ways. Even so, there is no likelihood of any short-term
benefits, due to the propagation delay of standard changes into
compilers, and no proof that it will even be beneficial
longer-term.

It would probably be a better idea for you to finish your
completely new "better C compiler" (keeping to your string
library naming) and make it so popular that C withers on the
vine. It's been so successful for you already, replacing all
those evil null-terminated strings all over the globe, I quiver
in anticipation of your next earth-shattering achievement.

Randy Howard

unread,
Aug 29, 2005, 12:15:54 AM8/29/05
to
webs...@gmail.com wrote
(in article
<1125225923.8...@g47g2000cwa.googlegroups.com>):

>>> For a nuclear reactor, I would also include the requirement that they
>>> use a safer programming language like Ada.
>>
>> The Ariane software module that caused the problem was written in Ada.
>> http://sunnyday.mit.edu/accidents/Ariane5accidentreport.html
>> Had it been written in C, the actual cause (integer overflow) probably would
>> not have caused an exception. I'm not saying that it would have been better
>> in C, but you *cannot* blame the C standard for what happened there.
>
> You are right, I cannot blame C for bugs that happen in other
> languages. This is the most famous one from Ada.

You just got done telling me that Ada would avoid problems.

> See, the thing is, with Ada bugs, you can clearly blame the programmer
> for most kinds of failures.

Oh my, SURELY the Ada standard should not allow such things to
happen. Those thoughtless bastards, how could this be? ;-)

>> Also, this "priority inversion" you speak of - doesn't that imply processes
>> or threads? C does not have that AFAIK. So you cannot blame the C standard
>> for allowing priority inversion bugs to occur. It neither allows nor
>> disallows them, because C has no notion of priorities.
>
> The programmer used priority based threading because that's what he had
> available to him.

He used something that does not even exist in standard C, and
got bit in the ass. Gee, and to think that you want to hold the
standard committee (and judging by another post of yours Ritchie
himself) responsible when people do things like this. Wow.
Let's read on and see what sort of hole you choose to dig...

> Suppose, however, that C had implemented co-routines

Suppose that you hadn't blamed standard C for something not
written in standard C. That one had a much higher chance of
being true until just recently.

> Maybe the Pathfinder code would have more
> coroutines, and fewer threads, and may have avoided the problem
> altogether (I am not privy to their source, so I really don't know).

That didn't stop you from blaming it on standard C, why stop
now?

> Coroutines are one of those "perfect compromises", because you can
> easily specify a portable interface, that is very likely to be widely
> supportable, they are actually tremendously faster than threading in
> many cases, and all without adding *any* undefined behavior or
> implementation defined behavior scenarios (other than a potential
> inability to allocate new stacks.)

How strange that they are so wildly popular, whereas threads are
never used. *cough*

> Full blown multithreading, such as
> in POSIX is notoriously platform specific, and it should not surprise
> anyone that only few non-UNIX platforms support full blowns POSIX
> threads.

That's interesting, because I have used the pthreads interfaces
for code on Windows (pthreads-win32), Linux, OS X, solaris, and
even Novell NetWare (libc, since they started supporting them
several years ago). I didn't realize they didn't work, because
for some strange reason, they do work for me. Maybe I'm just
lucky, or maybe you're too fond of spouting off about things you
have 'heard' but don't actually know to be true.

Have there been bugs in pthread libraries? Yes. Have there
been bugs in almost every library ever used in software
development? Yes. Were they impossible to fix? No.

> This fact has been noticed and adopted by those languages
> where serious development is happening (Lua, Perl, Python). I don't
> know if the C standards committee would be open to this -- I highly
> doubt it.

Feel free to propose a complete coroutine implementation.

Antoine Leca

unread,
Aug 29, 2005, 5:13:00 AM8/29/05
to
In <news:430F8276...@null.net>, Douglas A. Gwyn wrote:
> Antoine Leca wrote:

>> Not that I see any use for strrstr(), except perhaps to do the same
>> as strrchr() when c happens to be a multibyte character in a
>> stateless encoding.
>
> Even then it's problematic, because the search would not respect
> alignment with boundaries between character encodings.

Good point, you are quite right, and this is an often overlooked problem.
It will only work with self-synchronizing encodings (UTF-8 comes to mind,
but the only others I know of are using SS2/SS3, the single shifts,
_without_ using LSx/SI/SO, the locking shifts, and they are NOT very common
;-)).
Quite narrow application for a general library function.


Antoine

Antoine Leca

unread,
Aug 29, 2005, 6:03:55 AM8/29/05
to
In <news:1125181803....@g49g2000cwa.googlegroups.com>,
Paul Hsieh wrote:

> Remember that almost every virus, buffer overflow exploit, core
> dump/GPF/etc is basically due to some undefined situation in the ANSI
> C standard.

<OT>
The worst exploit I've seen so far was because a library dealing with
Unicode was not checking for malformed, overlong UTF-8 sequences, and
allowed walking through the filesystem, including in places where webusers
are not supposed to go. AFAIK, the library is written in C++ (it could
equally have been written in C; that won't change the point.)
And the exploit was successful because some key directories had bad default
permissions as factory setup.
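
For readers who have not met this class of bug: an overlong sequence
encodes a code point in more bytes than necessary -- 0xC0 0xAF for '/'
is the classic example -- and a lax decoder that accepts it lets "../"
slip past a path filter. A strict decoder has to reject such forms
explicitly. A minimal sketch (my own, not the library in question):

#include <stddef.h>

/* Decode one UTF-8 sequence; return bytes consumed, or 0 if it is
   malformed or overlong. */
static size_t utf8_decode_strict(const unsigned char *s, size_t len,
                                 unsigned long *out)
{
    static const unsigned long min_value[] = { 0, 0, 0x80, 0x800, 0x10000 };
    size_t n, i;
    unsigned long cp;

    if (len == 0) return 0;
    if (s[0] < 0x80) { *out = s[0]; return 1; }           /* plain ASCII   */
    else if ((s[0] & 0xE0) == 0xC0) { n = 2; cp = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { n = 3; cp = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { n = 4; cp = s[0] & 0x07; }
    else return 0;                                        /* bad lead byte */

    if (len < n) return 0;
    for (i = 1; i < n; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;              /* bad trail byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    if (cp < min_value[n]) return 0;                      /* overlong: reject */
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return 0;
    *out = cp;
    return n;
}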

Another quite successful one was based on a broken API for address books;
the API can be accessed from (not strictly conforming) C code, but that is
not how it is usually used. And the way the API is accessed through C
purposely avoids possible buffer overflows.

The most sticky virus I had to deal with was a bootsector virus. PC
bootsectors are not known to be written in C, rather in assembly language.


Granted, all these behaviours are _not_ defined by the ANSI C standard.
</OT>

Just because C is very much used means that statistically it will show up
more often in exploit or core dump or GPF cases. This only shows it is a
successful language; there might be reasons for that; in fact, the ANSI C
Standard is a big reason for its prolonged life as a successful (= widely
used) language: I mean, had it not happened, C would probably be superseded
nowadays (same for FIV then F77; or x86/PC.)


Antoine

Douglas A. Gwyn

unread,
Aug 29, 2005, 5:45:01 PM8/29/05
to
webs...@gmail.com wrote:
> Remember that almost every virus, buffer overflow exploit, core
> dump/GPF/etc is basically due to some undefined situation in the ANSI C
> standard.

That's misplaced blame. I use the same standard and don't have
such problems.

webs...@gmail.com

unread,
Aug 29, 2005, 6:53:16 PM8/29/05
to
Antoine Leca wrote:
> Paul Hsieh wrote:
> > Remember that almost every virus, buffer overflow exploit, core
> > dump/GPF/etc is basically due to some undefined situation in the ANSI
> > C standard.
>
> <OT>
> The worst exploit I've seen so far was because a library dealing with
> Unicode was not checking about malformed, overlong, UTF-8 sequences, and
> allowed to walk though the filesystem, including in places where webusers
> are not supposed to go. AFAIK, the library is written in C++ (it could
> equally been written in C, that won't change the point.)
> And the exploit was successful because some key directories had bad default
> permissions as factory setup.

This is the worst? Are you sure silent zombification of your machine
isn't worse?

In any event, compare this to Java, where Unicode is actually the
standard encoding for string data. It's not really possible to have
"unicode parsing problems" in Java, since all this stuff has been
specified in the core of the language. Compare this to ANSI C, which
uses wchar, which literally doesn't *specify* anything useful. So
technically the only reason one is writing Unicode parsers in C is
because the standard doesn't give you one.
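
A two-line illustration of what I mean by "doesn't specify anything
useful" -- both the width and the encoding of wchar_t are
implementation-defined, so portable code can assume neither Unicode nor
any particular size:

#include <stdio.h>
#include <wchar.h>

int main(void)
{
    printf("sizeof(wchar_t) = %lu\n", (unsigned long) sizeof(wchar_t));
    /* Commonly 2 on Windows and 4 on Unix; other widths and
       non-Unicode encodings are equally conforming. */
    return 0;
}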

> Another one quite successful was based on an brocken API for address books;
> the API can be accessed from (not strictly conforming) C code, but that is
> not how it is used usually. And the way the API is accessed though C
> purposely avoid possible buffer overflows.
>
> The most sticky virus I had to deal with was a bootsector virus. PC
> bootsectors are not known to be written in C, rather in assembly language.

So you've been in a time machine and just recently joined us in the
next millennium? Bootsector viruses are so 80s. Boot to a DOS disk and
type "fdisk /fixmbr" and usually you are set.

> Granted, all these behaviours are _not_ defined by the ANSI C standard.
> </OT>
>
> Just because C is very much used means that statistically it will show up
> more often in exploit or core dump or GPF cases.

Ok, so normalize the measures based on usages of the language. Do you
think C and C++ still won't be the worst by a country mile?

> [...] This only shows it is a


> successful language; there might be reasons for that; in fact, the ANSI C
> Standard is a big reason for its prolongated life as a successful (= widely
> used) language: I mean, had it not happen, C would probably be superceeded
> nowadays (same for FIV then F77; or x86/PC.)

The ANSI *C89* standard is the reason for its long life and success. But
that ducks the point that it also is the fundamental source for these
problems, exploits, etc.

--

webs...@gmail.com

unread,
Aug 29, 2005, 10:29:39 PM8/29/05
to
Randy Howard wrote:
> webs...@gmail.com wrote:
> >>> For a nuclear reactor, I would also include the requirement that they
> >>> use a safer programming language like Ada.
> >>
> >> The Ariane software module that caused the problem was written in Ada.
> >> http://sunnyday.mit.edu/accidents/Ariane5accidentreport.html
> >> Had it been written in C, the actual cause (integer overflow) probably
> >> would not have caused an exception. I'm not saying that it would have
> >> been better in C, but you *cannot* blame the C standard for what happened
> >> there.
> >
> > You are right, I cannot blame C for bugs that happen in other
> > languages. This is the most famous one from Ada.
>
> You just got done telling me that Ada would avoid problems.

It avoids buffer overflows, and other sorts of problems. It's not my
intention to defend Ada. Once again, I am not saying that either you
are with us or you are with the terrorists.

This was a point to demonstrate that programmers are not perfect, no
matter what you do. So this idea that you should just blame
programmers is just pointless.

> > See, the thing is, with Ada bugs, you can clearly blame the programmer
> > for most kinds of failures.
>
> Oh my, SURELY the Ada standard should not allow such things to
> happen. Those thoughtless bastards, how could this be? ;-)
>
> >> Also, this "priority inversion" you speak of - doesn't that imply
> >> processes or threads? C does not have that AFAIK. So you cannot blame the
> >> C standard for allowing priority inversion bugs to occurr. It neither
> >> allows or disallows them, because C has no notion of priorities.
> >
> > The programmer used priority based threading because that's what he had
> > available to him.
>
> He used something that does not even exist in standard C, and
> got bit in the ass. Gee, and to think that you want to hold the
> standard committee (and judging by another post of yours Ritchie
> himself) responsible when people do things like this.

Well, Ritchie, AFAIK, did not push for the standardization, or
recommend that everyone actually use C as a real application
development language. So I blame him for the very narrow problem of
making a language with lots of silly unnecessary problems, but not for
the fact that everyone decided to use it. The actual ANSI C committee
is different -- they knew exactly what role C was taking. They have
the ability to fix the warts in the language.

> [...] Wow. Let's read on and see what sort of hole you choose to dig...

Or what words you will put in my mouth, or what false dichotomies you
will draw ...

> > Suppose, however, that C had implemented co-routines
>
> Suppose that you hadn't blamed standard C for something not
> written in standard C. That one had a much higher chance of
> being true until just recently.
>
> > Maybe the Pathfinder code would have more
> > coroutines, and fewer threads, and may have avoided the problem
> > altogether (I am not privy to their source, so I really don't know).
>
> That didn't stop you from blaming it on standard C, why stop
> now?

First of all, the standard doesn't *have* coroutines while other
languages do. And I never *did* blame the pathfinder bug on the C
standard. I see you have the CBFalconer disease of reading whatever
the hell you want from text that simply doesn't contain the content you
think it does.

> > Coroutines are one of those "perfect compromises", because you can
> > easily specify a portable interface, that is very likely to be widely
> > supportable, they are actually tremendously faster than threading in
> > many cases, and all without adding *any* undefined behavior or
> > implementation defined behavior scenarios (other than a potential
> > inability to allocate new stacks.)
>
> How strange that they are so wildly popular, whereas threads are
> never used. *cough*

Coroutines are not very widely *deployed*. So popularity is how you
judge the power and utility of a programming mechanism? Why don't you
try to add something substantive here rather than leading with ignorance?
Can you give a serious pro-con argument for full threads versus
coroutines? Because I can.

> > Full blown multithreading, such as
> > in POSIX is notoriously platform specific, and it should not surprise

> > anyone that only few non-UNIX platforms support full blown POSIX


> > threads.
>
> That's interesting, because I have used the pthreads interfaces
> for code on Windows (pthreads-win32), Linux, OS X, solaris, and
> even Novell NetWare (libc, since they started supporting them
> several years ago).

You understand that those are all mostly UNIX, right? Even the Windows
thing is really an implementation or emulation of pthreads on top of
Windows multithreading. Show me pthreads in an RTOS.

> [...] I didn't realize they didn't work, because


> for some strange reason, they do work for me. Maybe I'm just
> lucky, or maybe you're too fond of spouting off about things you
> have 'heard' but don't actually know to be true.
>
> Have there been bugs in pthread libraries? Yes. Have there
> been bugs in almost every library ever used in software
> development? Yes. Were they impossible to fix? No.

Right. And have they fixed the generic problem of race conditions?
Race conditions are just the multitasking equivalent of
buffer-overflows. Except, as you know, they are *much* harder to
debug, and you cannot use tools, compiler warnings or other simple
mechanisms to help you avoid them. This is the real benefit of
coroutines over full threading. You can't have race conditions using
coroutines.

> > This fact has been noticed and adopted by those languages
> > where serious development is happening (Lua, Perl, Python). I don't
> > know if the C standards committee would be open to this -- I highly
> > doubt it.
>
> Feel free to propose a complete coroutine implementation.

I would if I thought there was an audience for it. These things take
effort, and a brief perusal of comp.std.c leads me to believe that the
ANSI committee is extremely capricious.

Think about it. You want me to propose something actually useful,
powerful and which would improve the language to a committee that
continues to rubber stamp gets().

Is your real point that I am supposed to do this to waste my time and
energy, obviously get rejected because the ANSI C committee has no
interested in improving the language, and this will be proof that I am
wrong?

Tell me, when is the last time the C language committee considered a
change in the language that made it truly more powerful that wasn't
already implemented in many compilers as extensions? Can you give me
at least a plausibility argument that I wouldn't be wasting my time by
doing such a thing?

webs...@gmail.com

unread,
Aug 30, 2005, 2:32:03 AM8/30/05
to
Chris Torek wrote:
> (Again, quite off-topic, but ...)
>
> [Ariane rocket example]
>
> In article <1125225923.8...@g47g2000cwa.googlegroups.com>
> <webs...@gmail.com> wrote:
> >You are right, I cannot blame C for bugs that happen in other
> >languages. This is the most famous one from Ada. ...
> >See, the thing is, with Ada bugs, you can clearly blame the programmer
> >for most kinds of failures.
>
> I am reminded of a line from a novel and movie:
>
> "*We* fix the blame. *They* fix the problem. Their way's better."

So in this case, how do we "fix" the problem of buffer overflows in C
programs? Shall we teach every bad programmer how not to do it, and
zap them if they get it wrong (a la Randy Howard style)? Or do you
perform some C library modifications so that the problem is
substantially mitigated?

> [Pathfinder example]
> >The programmer used priority based threading because that's what he had
> >available to him.
>
> Actually, the Pathfinder used vxWorks, a system with which I am
> now somewhat familiar. (Not that I know much about versions
> predating 6.0, but this particular item has been this way "forever",
> or long enough anyway.)
>
> The vxWorks system offers "mutex semaphores" as one of its several
> flavors of data-protection between threads. The mutex creation
> call, semMCreate(), takes several flag parameters. One of these
> flags controls "task" (thread, process, whatever moniker you prefer)
> priority behavior when the task blocks on the mutex.
>
> The programmer *chose* this behavior, because vxWorks does offer
> priority inheritance. (Admittedly, vxWorks priority inheritance
> has a flaw, but that is a different problem.)
>
> Thus, your premise -- that the programmer used priority based
> scheduling (without inheritance) that led to the priority inversion
> problem "because that's what he had available" is incorrect: he
> could have chosen to make all the threads the same priority, and/or
> used priority inheritance, all with simple parameters to the various
> calls (taskSpawn(), semMCreate(), and so on).
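
(For reference, a rough sketch of the choice being described, using the
documented semMCreate() options; exact flags and behavior may vary by
vxWorks version.)

#include <semLib.h>

/* Without SEM_INVERSION_SAFE a low-priority task holding the mutex can
   be preempted indefinitely by medium-priority work -- the Pathfinder
   scenario.  With it, the holder inherits the priority of the
   highest-priority task blocked on the mutex. */
SEM_ID make_inheriting_mutex(void)
{
    return semMCreate(SEM_Q_PRIORITY | SEM_INVERSION_SAFE);
}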

I'm confused as to how you know that the programmer who made the mistake
was in charge of what priority his task ran at. And in any event I
don't claim it was the *only* choice of the programmer (or designer, or
whoever was really at fault). It's just an example of a land mine being
there, and of it being stepped on.

The reality is that VxWorks "solves" this problem and others by having a
very sophisticated built-in debugging environment. They just had to
look at the task list and see that the high priority task was blocked.

> >Coroutines are one of those "perfect compromises" ...
>
> Coroutines are hardly perfect.

You, of course, missed the word that I typed immediately following the
word perfect.

> [...] However, if you like them, I suggest you investigate the Icon
> programming language, for instance.

I looked briefly at it. The impression I had is that it supported too
many modes of coroutines which made them unnecessarily complicated to
use (I may be thinking of another language.)

In any event, I have *studied* coroutines in university, and can
revisit them in far more mainstream languages like Python and Lua. My
understanding of them is not the point (I think I understand them -- or
at least the "one shot continuation" kind); my point is that they should
probably be part of the C standard.

Chris Hills

unread,
Aug 30, 2005, 2:43:41 AM8/30/05
to
In article <1125368979.7...@g47g2000cwa.googlegroups.com>,
webs...@gmail.com writes

>Randy Howard wrote:
>> webs...@gmail.com wrote:
>> He used something that does not even exist in standard C, and
>> got bit in the ass. Gee, and to think that you want to hold the
>> standard committee (and judging by another post of yours Ritchie
>> himself) responsible when people do things like this.
>
>Well, Ritchie, AFAIK, did not push for the standardization, or
>recommend that everyone actually use C as a real application
>development language. So I blame him for the very narrow problem of
>making a language with lots of silly unnecessary problems, but not for
>the fact that everyone decided to use it. The actual ANSI C committee
>is different -- they knew exactly what role C was taking. They have
>the ability to fix the warts in the language.
>
>> Feel free to propose a complete coroutine implementation.
>
>I would if I thought there was an audience for it. These things take
>effort, and a brief perusal of comp.std.c leads me to believe that the
>ANSI committee is extremely capricious.
>

It's NOT down to the ANSI committee... it is down to WG14, an ISO
committee of which ANSI is but one part. Since 1990 C has been handled
by ISO as an international standard. There are committees from many
countries involved. ANSI gets one vote like all the rest, so don't blame
ANSI for all of it.

webs...@gmail.com

unread,
Aug 30, 2005, 5:56:17 AM8/30/05
to
Randy Howard wrote:
> webs...@gmail.com wrote:
> > Randy Howard wrote:
> >>> Bad programming + good programming language does not allow for buffer
> >>> overflow exploits.
> >>
> >> For suitably high-level languages that might be true (and
> >> provable). Let us not forget that C is *not* a high-level
> >> language. It's not an accident that it is called high-level
> >> assembler.
> >
> > Right. If you're not with us, you are with the terrorists.
>
> Excuse me?

"False dichotomy". Look it up. I never mentioned high or low level
language, and don't consider it relevant to the discussion. It's a
false dichotomy because you immediately dismiss the possibility of a
safe low-level language.

> > Why does being a low language mean you have to present a programming
> > interface surrounded by landmines?
>
> If you have access to any sequence of opcodes available on the
> target processor, how can it not be?

C gives you access to a sequence of opcodes in ways that other
languages do not? What exactly are you saying here? I don't
understand.

> > Exposing a sufficiently low level
> > interface may require that you expose some danergous semantics, but why
> > expose them up front right in the most natural paths of usage?
>
> Do you feel that 'gets()' is part of the most natural path in C?

Yes of course! When people learn a new language they learn what it
*CAN* do before they learn what it should not do. It means anyone that
learns C first learns to use gets() before they learn not to use
gets().

> >> I'd love for you to explain to us, by way of example, how you
> >> could guarantee that assembly programmers can not be allowed to
> >> code in a way that allows buffer overflows.
> >
> > Ok, the halting problem means basically nobody guarantees anything
> > about computer programming.
>
> Fair enough, but you're just dodging the underlying question.

I am dodging the false dichotomy. Yes. You are suggesting that making
C safer is equivalent to removing buffer overflows from assembly. The
two have nothing to do with each other.

> > But its interesting that you bring up the questions of assembly
> > language. If you persuse the x86 assembly USENET newsgroups, you will
> > see that many people are very interested in expanding the power and
> > syntax for assembly language (examples include HLA, RosAsm, and
> > others).
>
> For a suitably generous definition of 'many', perhaps.

Terse, HLA, Rosasm, LuxAsm -- this is all for *one* assembly language.

> > A recent post talked about writing a good string library for
> > assembly, and there was a strong endorsement for the length prefixed
> > style of strings, including one direct reference to Bstrlib as a design
> > worth following (not posted by me!).
>
> I would have been shocked if you had not figured out a way to
> bring your package up. :-)

Oh by the way there is a new version! It incorporates a new secure
non data-leaking input function! Soon to reach 5000 downloads and
80000 webpage hits! Come join the string library revolution and visit:
http://bstring.sf.net/ to see all the tasty goodness!

> > So, while assembly clearly isn't an inherently safe language, it seems
> > quite possible that some assembly efforts will have a much safer (and
> > much faster) string interface than C does.
>
> Which does absolutely nothing to prevent the possibility of
> developing insecure software in assembler. It may offer some
> advantages for string handling, but that closes at best only one
> of a thousand doors.

You mean it closes the most obvious and well trodden thousand doors out
of a million doors.

Assembly is not a real application development language no matter how
you slice it. So I would be loath to make any point about whether or
not you should expect applications to become safer because they are
writing them in assembly language using Bstrlib-like philosophies. But
maybe those guys would beg to differ -- who knows.

As I recall this was just a point about low level languages adopting
safer interfaces. Though in this case, the performance improvements
probably drive their interest in it.

> >> [...] If you want to argue that too many people
> >> write code in C when their skill level is more appropriate to a
> >> language with more seatbelts, I won't disagree. The trick is
> >> deciding who gets to make the rules.
> >
> > But I'm not arguing that either. I am saying C is to a large degree
> > just capriciously and unnecessarily unsafe (and slow, and powerless,
> > and unportable etc., etc).
>
> Slow? Yes, I keep forgetting how much better performance one
> achieves when using Ruby or Python. Yeah, right.

I never put those languages up as alternatives for speed. The false
dichotomy yet again.

> Powerless? How so?

No introspection capabilities. I cannot write truly general
autogenerated code from the preprocessor, so I don't get even the most
basic "fake introspection" that's should otherwise be so trivial to do.
No coroutines (Lua and Python have them) -- which truly closes doors
for certain kinds of programming (think parsers, simple incremental
chess program legal move generators, and so on). No multiple heaps with
a freeall(), so that you can write "garbage-collection style" programs,
without incurring the cost of garbage collection -- again there are
real applications where this kind of thing is *really* useful.
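
Since C has nothing like that last item, here is a purely hypothetical
sketch of the sort of interface I mean -- none of these names exist in
any standard; the point is only that per-heap allocation plus a
freeall() releases everything in one call, giving the "allocate freely,
free once" style without a garbage collector:

#include <stdlib.h>

/* One linked list of blocks per heap; the union forces payload alignment. */
struct heap_block {
    struct heap_block *next;
    union { long double ld; void *vp; long lng; } pad;
};
typedef struct { struct heap_block *blocks; } heap;

void *heap_alloc(heap *h, size_t n)
{
    struct heap_block *b = malloc(sizeof *b + n);
    if (b == NULL) return NULL;
    b->next = h->blocks;
    h->blocks = b;
    return b + 1;                  /* payload follows the header */
}

void heap_freeall(heap *h)         /* release everything owned by this heap */
{
    struct heap_block *b = h->blocks;
    while (b != NULL) {
        struct heap_block *next = b->next;
        free(b);
        b = next;
    }
    h->blocks = NULL;
}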

> [...] It seems to be the only language other than
> assembler which has been used successfully for operating system
> development.

The power I am talking about is power to program. Not the power to
access the OS.

> Unportable? You have got to be kidding. I must be
> hallucinating when I see my C source compiled and executing on
> Windows, Linux, NetWare, OS X, Solaris, *bsd, and a host of
> other UNIX-like platforms, on x86, x86-64, PPC, Sparc, etc.

Right. Because you write every piece of C code that's ever been
written right?

> >>> Ok, this is what I was talking about when I mentioned rose colored
> >>> glasses. If programmers are perfect, then what you are saying is fine,
> >>> because you can expect perfection. But real people are not. And I
> >>> think expectations of perfection in programming is really nonsensical.
> >>
> >> /Exactly/ Expecting zero buffer overruns is nonsensical.
> >
> > Well, not exactly. If you're not using C or C++, then buffer overflows
> > usually at worse lead to a runtime exception; in C or C++, exploits are
> > typically designed to gain shell access in the context of the erroneous
> > program. Its like honey for bees -- people attack C/C++ programs
> > because they have this weakness. In other safer programming languages,
> > even if you had a buffer overflow, allowing a control flow
> > zombification of the program is typically not going to be possible.
>
> That is all true, and it does nothing to address the point that
> C is still going to be used for a lot of development work. The
> cost of the runtime error handling is nonzero. Sure, there are
> a lot of applications today where they do not need the raw speed
> and can afford to use something else. That is not always the
> case. People are still writing a lot of inline assembly even
> when approaching 4GHz clock speeds.

Ok, first of all runtime error handling is not the only path. In fact,
I don't recommend that as your sole approach. You always want error
detection to happen as early in the development process as possible,
and that means bringing errors to compile time. In this case, the most
obvious solution is to have better and safer APIs.

Second of all, remember, I *BEAT* the performance of C's strings across
the board on multiple platforms with a combination of run time and API
design in Bstrlib. The idea that error checking always costs
performance is false. Performance is about design, not what you do about
safety.
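
For what it's worth, this is what the safer interface looks like in
practice, using Bstrlib's own documented calls -- bfromcstr(),
bcatcstr(), bdestroy(); the buffer grows as needed, so there is nothing
to overflow:

#include <stdio.h>
#include "bstrlib.h"

int main(void)
{
    bstring b = bfromcstr("hello");
    if (b == NULL) return 1;

    bcatcstr(b, ", world");              /* concatenation grows the buffer */
    printf("%s (%d bytes)\n", (char *) b->data, b->slen);

    bdestroy(b);
    return 0;
}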

> >> Anyway, a language so restrictive as to guarantee that nothing
> >> can go wrong will probably never be used for any real-world
> >> project.
> >
> > How about simpler language that is more powerful, demonstrably faster,
> > more portable (dictionary definition), obviously safer and still just
> > as low level?
>
> That would be nice.
>
> > Just take the C standard, deprecate the garbage, replace
> > a few things, genericize some of the APIs, well define some of the
> > scenarios which are currently described as undefined, make some of the
> > ambiguous syntaxes that lead to undefined behavior illegal, and you're
> > immediately there.
>
> I don't immediately see how this will be demonstrably faster,
> but you are free to invent such a language tomorrow afternoon.

Well just a demonstration candidate, we could take the C standard, add
in Bstrlib, remove the C string functions listed in the bsafe.c module,
remove gets and you are done (actually you could just remove the C
string functions listed as redundant in the documentation). Of course
there are many other simple changes, like abstracted heaps that include
a freeall() function that I have demonstrated which can also lead to
enormous performance improvements. This would immediately make the
language technically safer and faster.

> Do it, back up your claims, and no doubt the world will beat a
> path to your website. Right?

Uhh ... actually no. People like my Bstrlib because it's *safe* and
*powerful*. They tend not to notice or realize they are getting a
major performance boost for free as well (they *would* notice if it was
slower, of course). But my optimization and low level web pages
actually do have quite a bit of traffic -- a lot more than my pages
critical of Apple or Microsoft, for example.

It's not hard to beat compiler performance, even based fundamentally on
weakness in the standard (I have a web page practically dedicated to
doing just that; it also gets a lot of traffic). But by itself, that's
insufficient to gain enough interest in building a language for
everyday use that people would be interested in.

> [...] "D" is already taken, what will you call it?

How about "C"?

> > Your problem is that you assume making C safer (or faster, or more
> > portable, or whatever) will take something useful away from C that it
> > currently has. Think about that for a minute. How is possible that
> > your mind can be in that state?
>
> It isn't possible. What is possible is for you to make gross
> assumptions about what 'my problem' is based up the post you are
> replying to here. I do not assume that C can not be made safer.
> What I said, since you seem to have missed it, is that the
> authors of the C standard are not responsible for programmer
> bugs.

Ok, well then we have an honest point of disagreement. I firmly
believe that the current scourge of bugs that lead to CERT advisories
will not ever be solved unless people abandon the current C and C++
languages. I think there is great consensus on this. The reason why I
blame the ANSI C committee is because, although they are active, they
are completely blind to this problem, and haven't given one iota of
consideration to it. Even though they clearly are in the *best*
position to do something about it. And it's them and only them -- the
only alternative is to abandon C (and C++) which is a very painful and
expensive solution; but you can see that people are doing exactly that.
Not a lot of Java in those CERT advisories.

> >> So is the idea of a 'perfect language'.
> >
> > But I was not advocating that. You want punishment -- so you
> > implicitely are *demanding* programmer perfection.
>
> No, I am not. I do not demand that doctors are perfect, but I
> expect them to be highly motivated to attempt to be perfect.

Ok, you demand that they *try* to be perfect. I'm not advocating that
the language be perfect or *try* to be perfect. I only want it not to be
thoroughly incompetent.

> >> It's quite easy to simply make the use of gets() and friends
> >> illegal for your code development. Most of us have already done
> >> so, without a standard body telling us to do it.
> >
> > So, estimate the time taken to absorb this information per programmer,
> > multiply it by the average wage of that programmer, multiply that by
> > the number of programmers that follow that and there you get the cost
> > of doing it correctly.
>
> What cost? Some 'world-wide rolled-up cost'? For me, it cost
> me almost nothing at all. I first discovered gets() was
> problematic at least a decade ago, probably even earlier, but I
> don't keep notes on such things. It hasn't cost me anything
> since.

And so are you saying it didn't cost you anything when you first
learned it? And that it won't cost the next generation of programmers,
or anyone else who learns C for the first time?

> [...] If I hire a programmer, this has all been settled to my
> satisfaction before they get an offer letter. It hasn't been a
> problem and I do not expect it to be one in the future.

But the cost is there. So the cost is ongoing.

> > The standards body, just needs to remove it and those costs go away.
>
> They do not. As we have already seen, it takes years, if not
> decades for a compiler supporting a standard to land in
> programmer hands. With the stunningly poor adoption of C99, we
> could not possibly hope to own or obtain an open source C0x
> compiler prior to 2020-something, if ever. In the mean time,
> those that are serious solved the problem years ago.

C99 is not being adopted because there is no *demand* from the users or
development houses for it. If the standard had been less dramatic,
and solved more real world problems, like safety, for example, I am
sure that this would not be the case. You also ignore the fact that
the C++ folks typically pick up the changes in the C standard for their
own. So the effect of the standard actually *is* eventually
propagated.

The fact that it would take a long time for a gets() removal in the
standard to be propagated to compilers, I do not find to be a credible
argument.

Also note that C89 had very fast adoption. It took a long time for
near-perfect and pervasive adoption, but you had most vendors more than
90% of the way there within a very few years.

> > You don't think people who move code around with calls
> > to gets() in it should remove them?
>
> Of course I do. In fact, I say so, which you conveniently
> quoted just below...

A compiler error telling the user that it's wrong (for new platform
compilers) is the best and simplest way to do this.

> >> Even so, anyone dusting off an old program that doesn't go
> >> sifting through looking for the usual suspects is a fool.
> >
> > And an old million line program?
>
> Didn't /you/ just say that they should be removed?

I am saying the process of manual removal, hoping that your programmers
are disciplined enough to do it, is not necessarily going to happen in
practice.

> > I think this process should be
> > automated. In fact, I think it should be automated in your compiler.
> > In fact I think your compiler should just reject these nonsensical
> > functions out of hand and issue errors complaining about them.
>
> Make up your mind. Fixing them in the the compiler, as I would
> expect an 'automated' solution to do, and rejecting the
> offending lines are completely different approaches.
>
> > Hey! I have an idea! Why not remove them from the standard?
>
> Great idea. 15 years from now that will have some value.

Uh ... but you see that it's still better than nothing, right? You think
programming will suddenly stop in 15 years? Do you think there will be
fewer programmers *after* this 15-year mark than there have been before
it?

Or, like me, do you think C will just become COBOL in 15 years?

> A better idea. Patch gcc to bitch about them TODAY, regardless
> of the standard.

The GNU linker already does this. But it's perceived as
a warning. People do not always listen to warnings.
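
For what it's worth, gcc can already be told to make it a hard error
rather than a warning, via its poison pragma -- a GCC extension, not
standard C, but it shows how cheap the check is:

#include <stdio.h>
#pragma GCC poison gets            /* any later mention of gets() is a compile error */

int main(void)
{
    char buf[64];

    /* gets(buf);                     <- would now fail to compile */
    if (fgets(buf, sizeof buf, stdin) != NULL)
        fputs(buf, stdout);
    return 0;
}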

> >> I don't have a problem with taking gets() out of modern
> >> compilers, but as you already pointed out, this doesn't
> >> guarantee anything. People can still fire up an old compiler
> >> and use it. I don't see a realistic way for the C standard to
> >> enforce such things.
> >
> > Interesting -- because I do. You make gets a reserved word, not
> > redefinable by the preprocessor, and have it always lead to a syntax
> > error.
>
> What part of 'people can still fire up and old compiler' did you
> fail to read and/or understand?

Use of old compilers is not the problem. The piles of CERT advisories
and news stories about exploits are generally directed at systems that
are constantly being updated with well supported compilers.

> > This has value because, developers can claim to be "C 2010 compliant"
> > or whatever, and this can tell you that you know it doesn't have gets()
> > or any other wart that you decided to get rid of.
>
> They could also simply claim "we are smarter than the average
> bear, and we know better to use any of the following offensive
> legacy functions, such as gets(), ..."

But nobody would believe your claim. My claim could be audited, and a
company would actually worry about being sued for making a false claim
of the sort I am advocating unless it were true.

> To clarify, since it didn't soak in the first time, I am not
> opposed to them being removed. I simply don't this as a magic
> bullet, and certainly not in the sense that it takes far too
> long for the compilers to catch up with it. I would much rather
> see compilers modified to deny gets() and its ilk by default,
> and require a special command line option to bypass it, /if at
> all/. However, the warning message should be far more useful
> than
> gets.c: 325: error: gets() has been deprecated.

Did I misspeak and ask for deprecation? Or are you misrepresenting my
position as usual? I'm pretty sure I explicitly said "non-redefinable
in the preprocessor and always leads to an error" to specifically
prevent people from working around its removal.

> That's just oh so useful, especially to newbies. I wouldn't
> care if it dumped a page and a half of explanation, along with a
> detailed example of how to replace such calls with something
> safer. After all, good code doesn't have it in them anyway, and
> it won't annoy anyone that is competent.

Well, then, our positions are not so different, since my solution would
cause the developer to go to the manuals, which would hopefully explain
the situation in the way you would expect.

> > This would in turn
> > put pressure of the legacy code owners to remove the offending calls,
> > in an effort that's certainly no worse than the Y2K issue (without the
> > looming deadline hanging over their heads).
>
> If, and only if, they use a compiler with such changes. We
> still see posts on a regular basis with people using old 16-bit
> Borland compilers to write new software.

... And you think there will be lots of CERT advisories on such
products? Perhaps you could point me to a few examples of such
advisories which are new, but which use old compilers such as Borland
C.

We can't do anything about legacy compilers -- and we don't *NEED TO*.
That's not the point. The "software crisis" is directed at development
that usually uses fairly well maintained compilers.

> >>> And what if its not the programmer's fault?
> >>
> >> It is the fault of the development team, comprised of whoever
> >> that involves for a given project. If the programmer feels like
> >> his boss screwed him over, let him refuse to continue, swear out
> >> an affidavit and have it notarized the bad software was
> >> knowingly shipped, and that you refuse to endorse it.
> >
> > Oh I see. So, which socialist totally unionized company do you work as
> > a programmer for? I'd like to apply!
>
> I don't think you understood me. I know of no company that has
> a policy for this. However, if I was working on something and
> felt that something was being done that could be inherently
> dangerous, and it was going to ship anyway, I would take some
> form of legal action, if for no other reason than to be able to
> disassociate myself from the impending lawsuits.

Ok ... that's interesting, but this is ridiculous. As I said above,
you do not write every piece of software in the world. And we are well
aware of about 10,000 programmers living in the Pacific Northwest who
we know do *NOT* share your attitude.

And your defence of the situation is that you assume every gainfully
employed programmer should be willing to quit the moment they see that
their process of programming is not likely to yield the highest
possible quality in software engineering.

> I would much rather go look for work than participate in
> something that might wind up with people dying over the actions
> of some meddling manager.

That's nice for you. That's not going to be a choice for lots of other
people.

> >> [...] If you
> >> are being overworked, you can either keep doing it, or you can
> >> quit, or you can convince your boss to lighten up.
> >
> > Hmmm ... so you live in India?
>
> Why would you think so?

Wild guess.

> > I'm trying to guess where it is in this
> > day and age that you can just quit your job solely because you don't
> > like the pressures coming from management.
>
> Where do you live? Because I am trying to guess where on the
> planet you would /not/ have the right to quit your job.
> Indentured servitude is not widely practiced anymore, AFAIK.

That isn't what I am saying. People's ability to quit or work at will
is often not related to things like programming philosophy or idealism
about their job. And software is and always will be created by
developers who have considerations other than the process of creating
perfect software.

> >> [...] ESPECIALLY in this case, the C standard folks are not to blame.
> >
> > But if the same issue happens and you are using a safer language, the
> > same kinds of issues don't come up. Your code might be wrong, but it
> > won't allow buffer overflow exploits.
>
> You can have 10 dozen other forms of security failure, that have
> nothing to do with buffer overflows.

I implore you -- read the CERT advisories. Buffer Overflows are #1 by
a LARGE margin. Its gotten to the point where its so embarassing to
Microsoft that they now try to disguise the fact that they have buffer
overlows through convoluted language. (You can still figure it out
though, when they say things like "hostile input leading to running of
arbitrary code ...")

> [...] It isn't a panacea. When one form of attack is removed, another
> one shows up.

If you remove buffer overflows, it doesn't mean that other kinds of
bugs will suddenly increase in absolute occurrence. Unless you've got
your head in the sand, you've got to know that *SPECIFICALLY* buffer
overflows are *BY THEMSELVES* the biggest and most solvable, and
therefore most important safety problem in programming.

I'm not sure how this is an argument that Buffer Overflows aren't the
worst safety problem in programming by a large margin.

None of those problems actually have anything to do with programmer
abilities, or language capabilities. They have to do with corporate
direction, mismanagement, and incompetent program architecture. That's
a completely separate issue.

> [... analogy taken too far, as usual, snipped ...]

> >>> Programmers are generally not aware of the liability of
> >>> their mistakes.
> >>
> >> Then those you refer to must be generally incompetent.
> >
> > Dennis Ritchie had no idea that NASA would put a priority inversion in
> > their pathfinder code.
>
> Are you implying that Dennis Ritchie is responsible for some bad
> code in the pathfinder project?

Uh ... no *you* are. My point was that he *COULDN'T* be.

Just like I can't be responsible if some bank used an old version of
Bstrlib to input passwords, not realizing that longer passwords might be
leaked back to the heap, and that some other flaw in their program which
exposed the heap could cause some passwords to become visible.

Sometimes you are *NOT AWARE* of your liability, and you don't *KNOW*
the situations where your software might be used.

> > Linus Torvalds had no idea that the NSA would
> > take his code and use it for a security based platform.
>
> Is there any evidence that the NSA chose his code because it was
> not worth fooling with?

What? They *DID* fool with his code -- they created something called
"Security Enhanced Linux" and suggested people look at it. As I
recall, the changes were a little too drastic, but there are
alternatives that people have been working on that provide similar
functionality that the main branch *is* adopting (and you cannot deny
this was motivated by the NSA's little project, which by itself has
some usage).

So the question is, does Linus himself become liable for the potential
security flaws or failures that such "security enhancements" might not
deliver? (Keeping in mind that Linus still does personally accept or
reject the changes proposed to the Linux kernel.)

> [...] What is your point? Oh, you're going to tell us...
>
> > My point is
> > that programmers don't know what the liability of their code is,
> > because they are not always in control of when or where or for what it
> > might be used.
>
> Wow, that is tortured at best. Presumably Ritchie is in your
> list because of C or UNIX? How could he be 'liable' for an
> application or driver written by somebody else 30 years later?

That was *my* point. Remember you are claiming that you want to pin
responsibility and liability for code to people so that you can dish
out punishment to them. I see a direct line of responsibility from
weakness in the C library back to him (or maybe it was Thompson or
Kernighan). And remember, you want to punish people.

> Are the contributors to gcc responsible for every bad piece of
> software compiled with it?

Well no, but you can argue that they are responsible for the bugs they
introduce into their compilers. I've certainly stepped on a few of
them myself, for example. So if a bug in my software came down to a
bug in their compiler, do you punish me for not being aware of the bug,
or them for putting the bug in there in the first place?

> If someone writes a denial-of-service attack program that sits
> on a Linux host, is that Torvald's fault? I've heard of people
> trying to shift blame before, but not that far. Maybe you might
> want to blame Linus' parents too, since if they hadn't conceived
> him, Linux wouldn't be around for evil programmers to write code
> upon. Furrfu.

Steve Gibson famously railed on Microsoft for enabling "raw sockets" in
Windows XP. This allows for easy DDOS attacks, once the machines have
been zombified. Microsoft marketing, just like you, of course
dismissed any possibility that they should accept any blame whatsoever.

With the latest service pack, the engineers took control (and
responsibility) and turned off raw sockets by default in Windows XP.
There *IS* a liability chain, and yes it *DOES* reach back that far,
even if marketing people try to convince you otherwise.

> > The recent JPEG parsing buffer overflow exploit, for example, came from
> > failed sample code from the JPEG website itself. You think we should
> > hunt down Tom Lane and linch him?
>
> Nope. If you take sample code and don't investigate it fully
> before putting it into production use, that's /your/ problem.

Oh I see. So you just want to punish IBM, Microsoft, Unisys, JASC
Software, Adobe, Apple, ... etc. NOBODY caught the bug for about *10
years*, dude. Everyone was using that sample code, including *myself*.
And it quite likely just traces back to Tom Lane, or someone that
was working with him.

> [... more appealing to analogies removed ...]

> >> I highly doubt that. Low-level language programmers would be
> >> the cream of the crop, not 'the lowest bidder' as is the case
> >> today.
> >
> > You still don't get it. You, I or anyone you know, will produce errors
> > if pushed. There's no such thing as a 0 error rate for programming.
>
> Then I do get it, because I agree with you. Let me know when I
> can write a device driver in Python.

False dichotomy ...

> > Just measuring first time compile error rates, myself, I score roughly
> > one syntax error per 300 lines of code. I take this as an indicator
> > for the likely number of hidden bugs I just don't know about in my
> > code. Unless my first-compile error rate was 0, I just can't have any
> > confidence that I don't also have a 0 hidden bug rate.
>
> Strange logic, or lack thereof. Having no first-compile errors
> doesn't provide ANY confidence that you don't have hidden bugs.

Speaking of lack of logic ... it's the *REVERSE* that I am talking
about. It's because I *don't* have a 0 first-compile error rate that I
feel that my hidden error rate can't possibly be 0.

> > Go measure your own first-compile error rate and tell me you are
> > confident in your own ability to avoid hidden bugs.
>
> That would be pointless, since measuring first-compile error
> rate proves zilch about overall bug rates. If you want to avoid
> hidden bugs, you have to actively look for them, test for them,
> and code explicitly to avoid them, regardless of how often your
> compiler detects a problem.

You miss my argument. First-compile error rates are not a big deal --
the compiler catches them, you fix them. But they are indicative of
natural blind spots. The same thing must be true, to some degree or
another, of bugs which don't lead to compiler errors.

Testing and structured walkthroughs/inspections are just imperfect
processes for trying to find hidden bugs. Sure they reduce them, but
you can't believe that they would get all of them -- they don't! So in
the end you are still left with *some* bug rate. So write enough code
and you will produce an arbitrary number of hidden bugs.

> > For a nuclear reactor, I would also include the requirement that they
> > use a safer programming language like Ada. Personally I would be
> > shocked to know that *ANY* nuclear reactor control mechanism was
> > written in C. Maybe a low level I/O driver library, that was
> > thoroughly vetted (because you probably can't do that in Ada), but
> > that's it.
>
> Well gee, there you have it. It seems that there are some
> places were C is almost unavoidable. What a shock. Who's
> wearing those rose-colored glasses now?

None of those sentences have any connection to each other.

First of all, you missed the "maybe" in there. Assembly would be an
equally good choice, or enhanced versions of HLL compilers.

> [... more analogy snipped ...]

> Do you really think you can do anything to a language that
> allows you to touch hardware that will prevent people from
> misusing it?

When did I suggest or imply this?

> [...] Not all development work is for use inside a VM or
> other sandbox.

Again putting words in my mouth.

> > While I want the language be fixed so that most people
> > don't trigger the landmines in the language so easily.
>
> I am not opposed to the language removing provably faulty
> interfaces, but I do not want its capabilities removed in other
> ways. Even so, there is no likelihood of any short-term
> benefits, due to the propagation delay of standard changes into
> compilers, and no proof that it will even be beneficial
> longer-term.
>
> It would probably be a better idea for you to finish your
> completely new "better C compiler" (keeping to your string
> library naming) and make it so popular that C withers on the
> vine.

When did I suggest that I was doing such a thing? Can you find the
relevant quote?

> [...] It's been so successful for you already, replacing all
> those evil null-terminated strings all over the globe, I quiver
> in anticipation of your next earth-shattering achievement.

Actually, my strings are also NUL terminated. That's why people who
use it like it -- it's truly a no-lose scenario. You really have to try
using it to understand it. If my library isn't *more* popular, it's
probably just because I don't know how to advertise it. Or maybe it's
just one of those things that's hard to get people excited about. I
haven't received any negative feedback from anyone who's actually used
it -- just suggestions for improvements.

kuy...@wizard.net

unread,
Aug 30, 2005, 10:09:48 AM8/30/05
to
webs...@gmail.com wrote:
> Randy Howard wrote:
> > webs...@gmail.com wrote:
...

> > > Coroutines are one of those "perfect compromises", because you can
> > > easily specify a portable interface, that is very likely to be widely
> > > supportable, they are actually tremendously faster than threading in
> > > many cases, and all without adding *any* undefined behavior or
> > > implementation defined behavior scenarios (other than a potential
> > > inability to allocate new stacks.)
> >
> > How strange that they are so wildly popular, whereas threads are
> > never used. *cough*
>
> Coroutines are not very widely *deployed*. So popularity is how you
> judge the power and utility of a programming mechanism?

Powerful and useful programming mechanisms tend to become popular. If
they've been around for a reasonable length of time without becoming
popular, if they're not yet widely deployed, then it's reasonable to
guess that this might be because they're not as powerful or as useful
as you think they are, or that they might have disadvantages that are
more than sufficient to compensate for those advantages. The
implication of his comment is that he's not merely guessing; he knows
reasons that he considers sufficient to justify their lack of
popularity.

...


> > That's interesting, because I have used the pthreads interfaces
> > for code on Windows (pthreads-win32), Linux, OS X, solaris, and
> > even Novell NetWare (libc, since they started supporting them
> > several years ago).
>
> You understand that those are all mostly UNIX, right?

No, Windows conforms to POSIX (though from what I've heard, it doesn't
conform very well; I'm happy to say I know nothing about the details),
but that doesn't make it UNIX.

> Tell me, when is the last time the C language committee considered a
> change in the language that made it truly more powerful and that wasn't
> already implemented in many compilers as extensions? Can you give me
> at least a plausibility argument that I wouldn't be wasting my time by
> doing such a thing?

You've answered your own question: the single best way to get something
standardized is to convince someone to implement it as an extension,
develop experience actually using it, show people that it's useful, and
have it become so popular that the committee has no choice but to
standardize it. Why should the committee waste its very limited amount
of time standardizing something that isn't a sufficiently good idea to
have already become a widely popular extension? If something good is so
new that it hasn't become popular yet, it's too early to standardize
it. Standardizing something freezes it; the need to maintain backwards
compatibility means that it's almost impossible to make substantive
changes to something once it's been standardized. Nothing should be
standardized until people have gained enough experience with it that it
seems unlikely that it will ever again need significant modification.

And, before you bring it up: yes, the committee has occasionally
ignored this advice; however, the results of doing so have often been
regrettable.

kuy...@wizard.net

unread,
Aug 30, 2005, 10:54:44 AM8/30/05
to
webs...@gmail.com wrote:
> Randy Howard wrote:
> > webs...@gmail.com wrote:
> > > Randy Howard wrote:
...

> "False dichotomy". Look it up. I never mentioned high or low level
> language, and don't consider it relevant to the discussion. Its a
> false dichotomoy because you immediately dismiss the possibility of a
> safe low-level language.

No, it's not an immediate dismissal. It's also not a dichotomy:
low-level languages are inherently unsafe, but high-level languages are
not inherently safe. If it's low-level, by definition it gives you
unprotected access to dangerous features of the machine
you're writing for. If it protected your access to those features, that
protection (regardless of what form it takes) would make it a
high-level language.

...


> C gives you access to a sequence of opcodes in ways that other
> languages do not? What exactly are you saying here? I don't
> understand.

Yes, you can access things more directly in C than in other higher
level languages. That's what makes them higher-level languages. One of
the most dangerous features of C is that it has pointers, which is a
concept only one layer of abstraction removed from the concept of
machine addresses. Most of the "safer" high level languages provide
little or no access to machine addresses; that's part of what makes
them safer.

> I am dodging the false dichotomy. Yes. You are suggesting that making
> C safer is equivalent to removing buffer overflows from assembly. The
> two have nothing to do with each other.

You can't remove buffer overflows from C without moving it at least a
little bit farther away from assembly, for precisely the same reason
why you can't remove buffer overflows from assembly without making it
less of an assembly language.

> As I recall this was just a point about low level languages adopting
> safer interfaces. Though in this case, the performance improvements
> probably drive their interest in it.
>
> > >> [...] If you want to argue that too many people
> > >> write code in C when their skill level is more appropriate to a
> > >> language with more seatbelts, I won't disagree. The trick is
> > >> deciding who gets to make the rules.
> > >
> > > But I'm not arguing that either. I am saying C is to a large degree
> > > just capriciously and unnecessarily unsafe (and slow, and powerless,
> > > and unportable etc., etc).
> >
> > Slow? Yes, I keep forgetting how much better performance one
> > achieves when using Ruby or Python. Yeah, right.
>
> I never put those languages up as alternatives for speed. The false
> dichotomy yet again.

A more useful response would have been to identify these
safer-and-speedier-than-C languages that you're referring to.

> > Unportable? You have got to be kidding. I must be
> > hallucinating when I see my C source compiled and executing on
> > Windows, Linux, NetWare, OS X, Solaris, *bsd, and a host of
> > other UNIX-like platforms, on x86, x86-64, PPC, Sparc, etc.
>
> Right. Because you write every piece of C code that's ever been
> written right?

His comment says nothing to suggest that he's ported any specific
number of programs to those platforms. It could be a single program, it
could be a million. Why are you interpreting his claim as suggesting that
he's ported many different programs to those platforms?

> > I don't think you understood me. I know of no company that has
> > a policy for this. However, if I was working on something and
> > felt that something was being done that could be inherently
> > dangerous, and it was going to ship anyway, I would take some
> > form of legal action, if for no other reason than to be able to
> > disassociate myself from the impending lawsuits.
>
> Ok ... that's interesting, but this is ridiculous. As I said above,
> you do not write every piece of software in the world. And we are well
> aware of about 10,000 programmers living in the Pacific Northwest who
> we know do *NOT* share your attitude.

Well, that's their fault, and their liability. That doesn't make the
attitude wrong.

> And your defence of the situation is that you assume every gainfully
> employed programmer should be willing to quit the moment they see that
> their process of programming is not likely to yield the highest
> possible quality in software engineering.

No, they should be willing to quit if deliberately ordered to ship
seriously defective products. There's a huge middle ground between
"seriously defective" and "highest possible quality". In that huge
middle ground, they should argue and strive for better quality, but not
necessarily threaten to quit over it.

> > I would much rather go look for work than participate in
> > something that might wind up with people dying over the actions
> > of some meddling manager.
>
> That's nice for you. That's not going to be a choice for lots of other
> people.

That's a choice every employed person has. If they choose not to take
it, that's their fault - literally, in the sense that they can and
should be held personally liable for the deaths caused by their
defective choice.

...


> > > Hmmm ... so you live in India?
> >
> > Why would you think so?
>
> Wild guess.

Why are you making wild guesses? Why are you making guesses that have
no discernible connection to the topic at hand?
...


> That was *my* point. Remember you are claiming that you want to pin
> responsibility and liability for code to people so that you can dish
> out punishment to them. I see a direct line of responsibility from
> weakness in the C library back to him (or maybe it was Thompson or
> Kernighan). And remember, you want to punish people.

Yes, people should be held responsible for things they're actually
responsible for. Ritchie isn't responsible for misuse of the things
he's created.

Randy Howard

unread,
Aug 30, 2005, 11:11:11 AM8/30/05
to
webs...@gmail.com wrote
(in article
<1125395777.6...@g44g2000cwa.googlegroups.com>):

>>> Why does being a low language mean you have to present a programming
>>> interface surrounded by landmines?
>>
>> If you have access to any sequence of opcodes available on the
>> target processor, how can it not be?
>
> C gives you access to a sequence of opcodes in ways that other
> languages do not? What exactly are you saying here? I don't
> understand.

asm( character-string-literal ); springs to mind. I do not
believe all languages have such abilities. Having that kind of
capability alone, never mind pointers and all of the subtle and
not so subtle tricks you can do with them in C, makes it capable
of low-level work, like OS internals. There are lots of
landmines there, as you are probably already aware.
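
A trivial illustration (asm() is only listed among the common
extensions in C99's Annex J, and the exact spelling varies by
compiler -- this is the gcc form on x86):

void spin_hint(void)
{
    __asm__ volatile ("pause");    /* drops straight to a single machine opcode */
}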

>>> Exposing a sufficiently low level
>>> interface may require that you expose some danergous semantics, but why
>>> expose them up front right in the most natural paths of usage?
>>
>> Do you feel that 'gets()' is part of the most natural path in C?
>
> Yes of course! When people learn a new language they learn what it
> *CAN* do before they learn what it should not do. It means anyone that
> learns C first learns to use gets() before they learn not to use
> gets().

Strange, it has been years since I have picked up a book on C
that uses gets(), even in the first few chapters. I have seen a
few that mention it, snidely, and warn against it though.

The man page for gets() on this system has the following to say:
SECURITY CONSIDERATIONS
The gets() function cannot be used securely. Because of its
lack of bounds checking, and the inability for the calling
program to reliably determine the length of the next incoming
line, the use of this function enables malicious users to
arbitrarily change a running program's functionality through a
buffer overflow attack. It is strongly suggested that the
fgets() function be used in all cases.

[end of man page]

I don't know about you, but I suspect the phrase "cannot be used
securely" might slow quite a few people down. It would be even
better if they showed an example of proper use of fgets(), but I
think all man pages for programming interfaces would be improved
by doing that.
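
Something along these lines, say (just a sketch of what such a man page
example might show):

#include <stdio.h>
#include <string.h>

int main(void)
{
    char line[256];

    if (fgets(line, sizeof line, stdin) != NULL) {
        line[strcspn(line, "\n")] = '\0';    /* strip the newline fgets() keeps */
        printf("read: \"%s\"\n", line);
    }
    return 0;
}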

> You are suggesting that making C safer is equivalent to removing
> buffer overflows from assembly. The two have nothing to do with each other.

Not equivalent, but difficult. Both languages are very powerful
in terms of what they will 'allow' the programmer to attempt.
There is little or no hand-holding. If you step off the edge,
you get your head chopped off. It's not like you can make some
simple little tweak and take that property away, without
removing a lot of the capabilities overall. Yes, taking gets()
completely out of libc (and its equivalents) would be a good
start, but it wouldn't put a dent in the ability of programmers
to make many more mistakes, also of a serious nature with the
language.

Just as I can appreciate the differences between a squirt gun
and a Robar SR-90, I can appreciate the differences between
Python and C, or any other 'safer' language and assembler.

>>> But its interesting that you bring up the questions of assembly
>>> language. If you persuse the x86 assembly USENET newsgroups, you will
>>> see that many people are very interested in expanding the power and
>>> syntax for assembly language (examples include HLA, RosAsm, and
>>> others).
>>
>> For a suitably generous definition of 'many', perhaps.
>
> Terse, HLA, Rosasm, LuxAsm -- this is all for *one* assembly language.

I was referring to your use of 'many people', which is, unless I
am mistaken, the only use of 'many' above.

>> I would have been shocked if you had not figured out a way to
>> bring your package up. :-)
>
> Oh by the way there is a new version! It incorporates a new secure
> non data-leaking input function!

You mean it wasn't secure from day one? tsk, tsk. That C stuff
sure is tricky. :-)

> Soon to reach 5000 downloads and
> 80000 webpage hits! Come join the string library revolution and visit:
> http://bstring.sf.net/ to see all the tasty goodness!

LOL.

>> Which does absolutely nothing to prevent the possibility of
>> developing insecure software in assembler. It may offer some
>> advantages for string handling, but that closes at best only one
>> of a thousand doors.
>
> You mean it closes the most obvious and well trodden thousand doors out
> of a million doors.

Both work out to .001. Hmmm.

> Assembly is not a real application development language no matter how
> you slice it.

I hope the HLA people don't hear you saying that. They might
get riotous.

> So I would be loath to make any point about whether or
> not you should expect applications to become safer because they are
> writing them in assembly language using Bstrlib-like philosophies. But
> maybe those guys would beg to differ -- who knows.

Yes.

> As I recall this was just a point about low level languages adopting
> safer interfaces. Though in this case, the performance improvements
> probably drive their interest in it.

Exactly. C has performance benefits that drive interest in it
as well. If there was a language that would generate faster
code (without resorting to hand-tuned assembly), people would be
using it for OS internals.

I don't think it should have been used for some things, like
taking what should be a simple shell script and making a binary
out of it (for copyright/copy protection purposes) as is done
so often. Many of the tiny binaries from a C compiler on a lot
of systems could be replaced with simple scripts with little or
no loss of performance. But, somebody wanted to hide their
work, or charge for it, and doesn't like scripting languages for
that reason. People even sell tools to mangle interpreted
languages to help with this. That is not the fault of the C
standard body (as you originally implied, and lest we forget
what led me down this path with you), but the use of C for
things for which it really isn't best suited. For many simple
problems, and indeed some complicated ones, C is not the best
answer, yet it is the one chosen anyway.

>>> But I'm not arguing that either. I am saying C is to a large degree
>>> just capriciously and unnecessarily unsafe (and slow, and powerless,
>>> and unportable etc., etc).
>>
>> Slow? Yes, I keep forgetting how much better performance one
>> achieves when using Ruby or Python. Yeah, right.
>
> I never put those languages up as alternatives for speed. The false
> dichotomy yet again.

Then enlighten us. I am familiar with Fortran for a narrow
class of problems of course, and I am also familiar with its
declining use even in those areas.

>> Powerless? How so?
>
> No introspection capabilities. I cannot write truly general
> autogenerated code from the preprocessor, so I don't get even the most
> basic "fake introspection" that's should otherwise be so trivial to do.
> No coroutines (Lua and Python have them) -- which truly closes doors
> for certain kinds of programming (think parsers, simple incremental
> chess program legal move generators, and so on). No multiple heaps with
> a freeall(), so that you can write "garbage-collection style" programs,
> without incurring the cost of garbage collection -- again there are
> real applications where this kind of thing is *really* useful.

Then by all means use alternatives for those problem types. As
I said a way up there, C is not the best answer for everything,
it just seems to be the default choice for many people, unless
an obvious advantage is gained by using something else.

>> [...] It seems to be the only language other than
>> assembler which has been used successfully for operating system
>> development.
>
> The power I am talking about is power to program. Not the power to
> access the OS.

So we agree on this much then?

>> Unportable? You have got to be kidding. I must be
>> hallucinating when I see my C source compiled and executing on
>> Windows, Linux, NetWare, OS X, Solaris, *bsd, and a host of
>> other UNIX-like platforms, on x86, x86-64, PPC, Sparc, etc.
>
> Right. Because you write every piece of C code that's ever been
> written right?

Thankfully, no. The point, which I am sure you realize, is that
C can be, and often is, used for portable programs. Can it be used
(in non-standard form most of the time btw), for writing
inherently unportable programs? Of course. For example, I
could absolutely insist upon the existence of certain entries in
/proc for my program to run. That might be useful for a certain
utility that only makes sense on a platform that includes those
entries, but it would make very little sense to look for them in
a general purpose program, yet there are people that do that
sort of silly thing every day. I do not blame Ritchie or the C
standards bodies for that problem.

>> That is all true, and it does nothing to address the point that
>> C is still going to be used for a lot of development work. The
>> cost of the runtime error handling is nonzero. Sure, there are
>> a lot of applications today where they do not need the raw speed
>> and can afford to use something else. That is not always the
>> case. People are still writing a lot of inline assembly even
>> when approaching 4GHz clock speeds.
>
> Ok, first of all runtime error handling is not the only path.

Quite. I wasn't trying to enumerate every possible reason that
C would continue to be used despite its 'danger'.

>>> Just take the C standard, deprecate the garbage, replace
>>> a few things, genericize some of the APIs, well define some of the
>>> scenarios which are currently described as undefined, make some of the
>>> ambiguous syntaxes that lead to undefined behavior illegal, and you're
>>> immediately there.
>>
>> I don't immediately see how this will be demonstrably faster,
>> but you are free to invent such a language tomorrow afternoon.
>
> Well just a demonstration candidate, we could take the C standard, add
> in Bstrlib, remove the C string functions listed in the bsafe.c module,
> remove gets and you are done (actually you could just remove the C
> string functions listed as redundant in the documentation).

What you propose is in some ways very similar to the MISRA-C
effort, in that you are attempting to make the language simpler
by carving out a subset of it. It's different in that you also
add some new functionality. I don't wish to argue any more
about whether MISRA was good or bad, but I think the comparison
is somewhat appropriate. You could write a tome, entitled
something like "HSIEH-2005, A method of providing more secure
applications in a restricted variant of C" and perhaps it would
enjoy success, particularly amongst people starting fresh
without a lot of legacy code to worry about. Expecting the
entire C community to come on board would be about as naive as
expecting everyone to adopt MISRA. It's just not going to
happen, regardless of any real or perceived benefits.

>> Do it, back up your claims, and no doubt the world will beat a
>> path to your website. Right?
>
> Uhh ... actually no. People like my Bstrlib because it's *safe* and
> *powerful*. They tend not to notice or realize they are getting a
> major performance boost for free as well (they *would* notice if it was
> slower, of course). But my optimization and low level web pages
> actually do have quite a bit of traffic -- a lot more than my pages
> critical of Apple or Microsoft, for example.

So you are already enjoying some success then in getting your
message across.

> It's not hard to beat compiler performance, even based fundamentally on
> weakness in the standard (I have a web page practically dedicated to
> doing just that; it also gets a lot of traffic). But by itself, that's
> insufficient to gain enough interest in building a language for
> everyday use that people would be interested in.

Indeed.

> [...] "D" is already taken, what will you call it?
>
> How about "C"?

Well, all you need to do is get elected ISO Dictator, and all
your problems will be solved. :-)

>>> Your problem is that you assume making C safer (or faster, or more
>>> portable, or whatever) will take something useful away from C that it
>>> currently has. Think about that for a minute. How is possible that
>>> your mind can be in that state?
>>
>> It isn't possible. What is possible is for you to make gross
>> assumptions about what 'my problem' is based up the post you are
>> replying to here. I do not assume that C can not be made safer.
>> What I said, since you seem to have missed it, is that the
>> authors of the C standard are not responsible for programmer
>> bugs.
>
> Ok, well then we have an honest point of disagreement then. I firmly
> believe that the current scourge of bugs that lead to CERT advisories
> will not ever be solved unless people abandon the current C and C++
> languages.

Probably a bit strongly worded, but I agree to a point. About
90% of those using C and C++ today should probably be using
alternative languages. About 20% of them should probably be
working at McDonald's, but that's an argument for a different
day, and certainly a different newsgroup.

> I think there is great consensus on this. The reason why I
> blame the ANSI C committee is because, although they are active, they
> are completely blind to this problem, and haven't given one iota of
> consideration to it.

I suspect they have considered it a great deal, and yet not
provided any overt action that you or I would appreciate. They
are much concerned (we might easily argue 'too much') with the
notion of not breaking old code. Where I might diverge with
that position is on failing to recognize that a lot of 'old
code' is 'broken old code' and not worth protecting.

> Even though they clearly are in the *best*
> position to do something about it.

I actually disagree on this one, but they do have a lot of power
in the area, or did, until C99 flopped. I think the gcc/libc
crowd could put out a x++ that simply eradicates gets(). That
should yield some immediate improvements. In fact, having a
compiler flag to simply squawk loudly every time it encounters
it would be of benefit. Since a lot of people are now using gcc
even on Windows systems (since MS isn't active in updating the C
side of their C/C++ product), it might do a lot of good, far
sooner, by decades than a change in the standard.

> And it's them and only them -- the
> only alternative is to abandon C (and C++), which is a very painful and
> expensive solution; but you can see that people are doing exactly that.
> Not a lot of Java in those CERT advisories.

That's good. The more people move to alternate languages, the
more people will have to realize that security bugs can appear
in almost any language. Tons of poorly written C code currently
represents the low-hanging fruit for the bad guys.

>> What cost? Some 'world-wide rolled-up cost'? For me, it cost
>> me almost nothing at all. I first discovered gets() was
>> problematic at least a decade ago, probably even earlier, but I
>> don't keep notes on such things. It hasn't cost me anything
>> since.
>
> And so are you saying it didn't cost you anything when you first
> learned it?

Since I did not have a million lines of C to worry about
maintaining at the time, indeed very little, it was not very
expensive. I'll admit it wasn't zero-cost, in that it took me
whatever time it was for the point to soak in, and to learn
better alternatives. I could have recouped some of the 'cost'
be selling some old Schildt books to unsuspecting programmers,
but felt that would have been uncivilized.

> And that it won't cost the next generation of programmers,
> or anyone else who learns C for the first time?

Provided that they learn it early on, and /not/ after they ship
version 1.0 of their 'next killer app', it won't be that bad.
Given that it shouldn't be taught at all to new programmers
today (and I am in favor of pelting anyone recommending it today
with garbage), I suspect it will be eradicated for all practical
purposes soon.

>>> The standards body, just needs to remove it and those costs go away.
>>
>> They do not. As we have already seen, it takes years, if not
>> decades for a compiler supporting a standard to land in
>> programmer hands. With the stunningly poor adoption of C99, we
>> could not possibly hope to own or obtain an open source C0x
>> compiler prior to 2020-something, if ever. In the mean time,
>> those that are serious solved the problem years ago.
>
> C99 is not being adopted because there is no *demand* from the users or
> development houses for it. If the standard had been less dramatic,
> and solved more real world problems, like safety, for example, I am
> sure that this would not be the case.

Do I think that, for many people, C99 offered no tangible value, or
not enough improvement to justify changing compilers, related tools
and programmer behavior? Unfortunately, yes. It was a lot of
change, but little meat on the bones.

However, there was also the problem that C89/90 did for many
people exactly what they expected from the language, and for a
significant sub-group of the population, "whatever gcc adds as
an extension" had become more important than what ISO had to say
on the matter. The stalling out of gcc moving toward C99
adoption (due to conflicts between the two) is ample support for
that claim.

> You also ignore the fact that
> the C++ folks typically pick up the changes in the C standard for their
> own. So the effect of the standard actually *is* eventually
> propagated.

Here I disagree. C and C++ are not closely related anymore. It
takes far longer to enumerate all the differences that affect
both than it does to point out the similarities. Further, I
care not about C++, finding that there is almost nothing about
C++ that can not be done a better way with a different language.
C is still better than any reasonable alternative for a set of
programming tasks that matter to me, one in which C++ doesn't
even enter the picture. That is my personal opinion of course,
others may differ and they are welcome to it.

> The fact that it would take a long time for a gets() removal in the
> standard to be propagated to compilers, I do not find to be a credible
> argument.

Why not? If the compiler doesn't bitch about it, where are all
of those newbie programmers you are concerned about going to
learn it? Surely not from books, because books /already/ warn
about gets(), and that doesn't seem to be working. If they
don't read, and it's not in the compiler, where is this benefit
going to appear?

> Also note that C89 had very fast adoption. It took a long time for
> near perfect and pervasive adoption, but you had most vendors more than
> 90% of the way there within a very few years.

Because it was very similar to existing practice, and a smaller
language standard overall. Far less work. Frankly, I have had
/one/ occasion where something from C99 would have made life
easier for me, on a single project. It turned out I didn't get
to use it anyway, because I did not have access to C99 compilers
on all of the platforms I needed to support, so I did it
differently. I don't anticipate 'wishing' for a C99 compiler
much, if at all, in the future either. The problem domains that
C became dominant for are well-served by C89/90, as is, just
stay away from the potholes. I certainly do not need a C05
compiler just to avoid gets(), I've been doing it with C89
compilers for many years.

> A compiler error telling the user that its wrong (for new platform
> compilers) is the best and simplest way to do this.

And we both say that, several times, we seem to differ only in
the requirements to make that change.

>> Great idea. 15 years from now that will have some value.
>
> Uh ... but you see that it's still better than nothing, right?

So is buying lottery tickets for the worst programmer you know.
:-)

> You think programming will suddenly stop in 15 years?

Yeah, that's exactly what I was thinking. How did you guess?

> Do you think there will be less programmer *after* this 15
> year mark than there has been before it?

Nope, but I think it will be 15 years too late, and even if it does
come, and the gets() removal is part of it (which assumes facts
not in evidence), there will STILL be a lot of people using
C89/90 instead. I would much rather see it show up in compilers
with the next minor update, rather than waiting for C05, which
will still have the barrier of implementing the ugly bits of
C99, which the gcc crowd seems quite loath to do.

> Or, like me, do you think C will just become COBOL in 15 years?

Yeah, as soon as a suitable replacement for system programming
shows up. Hold your breath, it's right around the corner.

>> A better idea. Patch gcc to bitch about them TODAY, regardless
>> of the standard.
>
> The GNU linker already does this. But it's perceived as
> a warning. People do not always listen to warnings.

So make it email spam to the universe pronouncing "Someone at
foobar.com is using gets()!! Avoid their products!!!" instead.
:-)

Perhaps having the C runtime library spit out a warning on every
execution at startup "DANGER: THIS PROGRAM CONTAINS INSECURE
CODE!!!" along with a string of '\a' characters would be better.

I do not see a magic wand that will remove it for all time, the
genie is out of the bottle. Some nebulous future C standard is
probably the weakest of the bunch. I am not saying it shouldn't
happen, but it will not be sufficient to avoid the problem.

>>> Interesting -- because I do. You make gets a reserved word, not
>>> redefinable by the preprocessor, and have it always lead to a syntax
>>> error.
>>
>> What part of 'people can still fire up and old compiler' did you
>> fail to read and/or understand?
>
> Use of old compilers is not the problem. The piles of CERT advisories
> and news stories about exploits are generally directed at systems that
> are constantly being updated with well supported compilers.

Which of those systems with CERT advisories against them have
recently updated C99 compilers? It's only been 6 years right?
How long will it be before they have a compiler you are happy
with, providing guaranteed expulsion of code with gets()?

Use of old compilers is definitely part of the problem, along of
course with badly trained programmers.

>>> This has value because, developers can claim to be "C 2010 compliant"
>>> or whatever, and this can tell you that you know it doesn't have gets()
>>> or any other wart that you decided to get rid of.
>>
>> They could also simply claim "we are smarter than the average
>> bear, and we know better to use any of the following offensive
>> legacy functions, such as gets(), ..."
>
> But nobody would believe your claim. My claim could be audited, and a
> company would actually worry about being sued for making a false claim
> of the sort I am advocating unless it were true.

If they can prove that no gets() or friends in the list are in
their product, then what is the worry? Put in a different way,
if they claimed "We're C 2010 compliant", just because they have
access to gcc2010, and yet it allows some command line argument
-std=c89, all bets are off anyway. Either way, they either use
gets() or they do not. As such, both claims are pretty similar.

Frankly, I don't care about marketing BS, so let's move on...

>> To clarify, since it didn't soak in the first time, I am not
>> opposed to them being removed. I simply don't see this as a magic
>> bullet, and certainly not in the sense that it takes far too
>> long for the compilers to catch up with it. I would much rather
>> see compilers modified to deny gets() and its ilk by default,
>> and require a special command line option to bypass it, /if at
>> all/. However, the warning message should be far more useful
>> than
>> gets.c: 325: error: gets() has been deprecated.
>
> Did I misspeak and ask for deprecation? Or are you misrepresenting my
> position as usual?

No, you just failed to notice the end of one sentence,
pertaining to your position, and the start of another one, with
the words "I would much rather see...".

> I'm pretty sure I explicitly said "non-redefinable
> in the preprocessor and always leads to an error" to specifically
> prevent people from working around its removal.

And, just as I said above, which I will repeat to get the point
across (hopefully), "I AM NOT OPPOSED TO THEM BEING REMOVED".

I simply think more could be done in the interim, especially
since we have no guarantee of it ever happening your way at
all.
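
For what it's worth, there is an interim, compiler-specific trick that
gets most of the way there today. A minimal sketch, assuming GCC or a
compiler that honors GCC's poison pragma (this is an extension, not
anything in the standard): poison the identifier in a project-wide
header, so that any later use of gets() is a hard error rather than a
warning.

    /* project-wide header, included by every translation unit */
    #include <stdio.h>          /* pull in the declaration first...      */
    #pragma GCC poison gets     /* ...then any later use is a hard error */

With that in place, a line such as gets(buf); fails to compile with a
message along the lines of "attempt to use poisoned", and as far as I
know it cannot be papered over by a later macro definition.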

>> If, and only if, they use a compiler with such changes. We
>> still see posts on a regular basis with people using old 16-bit
>> Borland compilers to write new software.
>
> ... And you think there will be lots of CERT advisories on such
> products? Perhaps you could point my to a few examples of such
> advisories which are new, but which use old compilers such as Borland
> C.

If your tactic is to only attack the people working on widely
deployed software likely to be involved in CERTs, I think the
"gets(), just say no" mantra is being driven into their head
practically every day. It's legacy code (and cleaning it) that
represents the bulk of the problem today. Scanning the complete
source tree for every product currently on the market would be
your best bet.

> We can't do anything about legacy compilers -- and we don't *NEED TO*.
> That's not the point. The "software crisis" is directed at development
> that usually uses fairly well maintained compilers.

Well, if it is a 'crisis', then 15 years is definitely too long
to wait for a solution.

>>> Oh I see. So, which socialist totally unionized company do you work as
>>> a programmer for? I'd like to apply!
>>
>> I don't think you understood me. I know of no company that has
>> a policy for this. However, if I was working on something and
>> felt that something was being done that could be inherently
>> dangerous, and it was going to ship anyway, I would take some
>> form of legal action, if for no other reason than to be able to
>> disassociate myself from the impending lawsuits.
>
> Ok ... that's interesting, but this is ridiculous. As I said above,
> you do not write every piece of software in the world.

A fact of which I am painfully aware each time I sit down at a
keyboard to a machine running Windows. :-) (For the record,
no, I do not think I would replicate Windows functionality and
improve on it, single-handedly.)

> And we are well
> aware of about 10,000 programmers living in the pacific northwest who
> we know do *NOT* share your attitude.

Correct. Perhaps if they weren't so anxious to grab 20 year old
open source software and glue into their own products, there
would be less to worry about from them as well.

> And your defence of the situation is that you assume every gainfully
> employed programmer should be willing to quit the moment they see that
> their process of programming is not likely to yield the highest
> possible quality in software engineering.

That is /not/ what I said. The specific example in question had
to do with something that would be very dangerous if shipped.
The type of danger people would send lawyers hunting for you
over. There is a lot of room between that and "not likely to
yield the highest possible quality..."

>>> Hmmm ... so you live in India?
>>
>> Why would you think so?
>
> Wild guess.

Very.

>>> I'm trying to guess where it is in this
>>> day and age that you can just quit your job solely because you don't
>>> like the pressures coming from management.
>>
>> Where do you live? Because I am trying to guess where on the
>> planet you would /not/ have the right to quit your job.
>> Indentured servitude is not widely practiced anymore, AFAIK.
>
> That isn't what I am saying. People's ability to quit or work at will
> is often not related to things like programming philosophy or idealism
> about their job. And software is and always will be created by
> developers who have considerations other than the process of creating
> perfect software.

Then we are in a lot of trouble. The ISO C body isn't going to
solve that problem. You better start tilting at some more powerful
windmills, and do it quickly.

>> You can have 10 dozen other forms of security failure, that have
>> nothing to do with buffer overflows.
>
> I implore you -- read the CERT advisories. Buffer Overflows are #1 by
> a LARGE margin.

Yes. And when they are all gone, something else will be number
#1. As I already said, a lot of people have figured out how to
find and expose the low-hanging fruit, it's like shooting fish
in a barrel right now. It won't always be that way. I long for
the day when some hole in .NET becomes numero uno, for a
different reason than buffer overflows. It's just a matter of
time. :-)

> If you remove buffer overflows, it doesn't mean that other kinds of
> bugs will suddenly increase in absolute occurrence. Unless you've got
> your head in the sand, you've got to know that *SPECIFICALLY* buffer
> overflows are *BY THEMSELVES* the biggest and most solvable, and
> therefore most important safety problem in programming.

Yep, they're definitely the big problem today. Do you really
think they'll still be the big problem by the time your C2010
compiler shows up in the field? It's possible of course, but I
hope not.

>> I had done just about everything I could imagine to lock the
>> system down, and it still got out of control in 2 hours letting
>> a 12-yr-old browse a website and play some games.
>>
>> Of course, if enough people do the same thing, the bad guys will
>> figure out how to do this on Linux boxes as well. But for now,
>> the OS X and Linux systems have been causing me (and the kids)
>> zero pain and I'm loving it.
>
> I'm not sure how this is an argument that Buffer Overflows aren't the
> worst safety problem in programming by a large margin.
>
> None of those problems actually have anything to do with programmer
> abilities, or language capabilities. They have to do with corporate
> direction, mismanagement, and incompetent program architecture. That's
> a completely separate issue.

Yes, I diverged in the wood, for no good reason, it was just too
bothersome for me to leave out, since it seemed related at the
time. Forgive me.

>>>>> Programmers are generally not aware of the liability of
>>>>> their mistakes.
>>>>
>>>> Then those you refer to must be generally incompetent.
>>>
>>> Dennis Ritchie had no idea that NASA would put a priority inversion in
>>> their pathfinder code.
>>
>> Are you implying that Dennis Ritchie is responsible for some bad
>> code in the pathfinder project?
>
> Uh ... no *you* are. My point was that he *COULDN'T* be.

OK. If that's your point, then how do you justify claiming that
the ISO C folks are culpable in buffer overflow bugs?

> Sometimes you are *NOT AWARE* of your liability, and you don't *KNOW*
> the situations where your software might be used.

So if your point is that the ISO committee knew about gets() and
allowed it to live on in C99, and that for example, 7.19.7.7
should contain some bold wording to that effect, I agree.
Better yet of course, marking it deprecated and forcing
compilers to emit an ERROR, not a warning for its use when
invoked in C99 conforming mode.

Even so, the knowledge was readily available elsewhere that
gets() was inherently unsafe at the time, and frankly, I have
met two or three programmers other than myself in the last 15
years that owned their own copy or copies of the applicable C
standards. Putting it in the document alone wouldn't have
helped, but it is somewhat surprising that it didn't even rate a
warning in the text.

>>> My point is
>>> that programmers don't know what the liability of their code is,
>>> because they are not always in control of when or where or for what it
>>> might be used.
>>
>> Wow, that is tortured at best. Presumably Ritchie is in your
>> list because of C or UNIX? How could he be 'liable' for an
>> application or driver written by somebody else 30 years later?
>
> That was *my* point. Remember you are claiming that you want to pin
> responsibility and liability for code to people so that you can dish
> out punishment to them. I see a direct line of responsibility from
> weakness in the C library back to him (or maybe it was Thompson or
> Kernighan). And remember you want to punish people.

In the country I live, ex post facto laws are unconstitutional,
and I do not want to make people retroactively responsible
(although our elected representatives sometimes do).

Especially since they couldn't have possibly been claiming
'certification' of any kind that far in the past, seeing as it
doesn't even exist today.

>> Are the contributors to gcc responsible for every bad piece of
>> software compiled with it?
>
> Well no, but you can argue that they are responsible for the bugs they
> introduce into their compilers. I've certainly stepped on a few of
> them myself, for example. So if a bug in my software came down to a
> bug in their compiler, do you punish me for not being aware of the bug,
> or them for putting the bug in there in the first place?

It would be difficult, if not impossible, to answer that
generically about a hypothetical instance. That's why we have
lawyers. :-(

>> If someone writes a denial-of-service attack program that sits
>> on a Linux host, is that Torvald's fault? I've heard of people
>> trying to shift blame before, but not that far. Maybe you might
>> want to blame Linus' parents too, since if they hadn't conceived
>> him, Linux wouldn't be around for evil programmers to write code
>> upon. Furrfu.
>
> Steve Gibson famously railed on Microsoft for enabling "raw sockets" in
> Windows XP.

Yes, I saw something about it on his website only yesterday,
ironically.

> This allows for easy DDOS attacks, once the machines have
> been zombified. Microsoft marketing, just like you, of course
> dismissed any possibility that they should accept any blame whatsoever.

Don't put that one on me, their software exposes an interface in
a running operating system. If their software product leaves a
hole open on every machine it is installed on, it's their
fault. I see nothing in the ISO C standard about raw sockets,
or indeed any sockets at all, for well over 500 pages.

Can raw sockets be used for some interesting things? Yes. The sad
reality is that almost /everything/ on a computer that is
inherently powerful can be misused. Unfortunately, there are
currently more people trying to break them than to use them
effectively.



>>> The recent JPEG parsing buffer overflow exploit, for example, came from
>>> failed sample code from the JPEG website itself. You think we should
>>> hunt down Tom Lane and lynch him?
>>
>> Nope. If you take sample code and don't investigate it fully
>> before putting it into production use, that's /your/ problem.
>
> Oh I see. So you just want to punish, IBM, Microsoft, Unisys, JASC
> software, Adobe, Apple, ... etc. NOBODY caught the bug for about *10
> years* dude.

Exactly. They all played follow-the-leader. I'm sure they'll
use the same defense if sued.

> Everyone was using that sample code including *myself*.

tsk, tsk.

>>> Just measuring first time compile error rates, myself, I score roughly
>>> one syntax error per 300 lines of code. I take this as an indicator
>>> for the likely number of hidden bugs I just don't know about in my
>>> code. Unless my first-compile error rate was 0, I just can't have any
>>> confidence that I don't also have a 0 hidden bug rate.
>>
>> Strange logic, or lack thereof. Having no first-compile errors
>> doesn't provide ANY confidence that you don't have hidden bugs.
>
> Speaking of lack of logic ... its the *REVERSE* that I am talking
> about. Its because I *don't* have a 0 first-compile error rate that I
> feel that my hidden error rate can't possibly be 0.

I'll say it a different way, perhaps this will get through.
REGARDLESS of what your first-compile error rate is, you should
assume that your hidden error rate is non-zero. You /might/ convince
yourself otherwise at some point in the future, but using
first-compile errors as a metric in this way is the path to
hell.

> You miss my argument. First-compile error rates are not a big deal --
> the compiler catches them, you fix them. But they are indicative of
> natural blind spots.

True. Unfortunately, if you had none at all, there are still
'unnatural blind spots' in code that will bite you in the
backside. This is why developers (outside of small shops)
rarely are solely responsible for testing their own code. They
get false impressions about what code is likely to be sound or
unsound based upon things like how many typos they made typing
it in. Not good.

> Testing, structured walk throughs/inspections, are just imperfect
> processes for trying to find hidden bugs. Sure they reduce them, but
> you can't believe that they would get all of them -- they don't!

No kidding. I'm often amazed at how you give off the impression
that you think you are the sole possessor of what others
recognize as common knowledge.

I have never claimed that a program was bug free. I have
claimed that they have no known bugs, which is a different
matter completely.

>> Do you really think you can do anything to a language that
>> allows you to touch hardware that will prevent people from
>> misusing it?
>
> When did I suggest or imply this?

Apparently not. Good.

>> [...] Not all development work is for use inside a VM or
>> other sandbox.
>
> Again putting words in my mouth.

Stating a fact, actually.

>> It would probably be a better idea for you to finish your
>> completely new "better C compiler" (keeping to your string
>> library naming) and make it so popular that C withers on the
>> vine.
>
> When did I suggest that I was doing such a thing? Can you find the
> relevant quote?

You didn't. I suggested it. Since it is more likely of
happening before 2020, it might be of interest to you in solving
the 'software crisis'.

Douglas A. Gwyn

unread,
Aug 30, 2005, 11:13:06 AM8/30/05
to
Chris Hills wrote:
> It's NOT down to the ANSI committee..... it is down to WG14 an ISO
> committee of which ANSI is but one part. ...

It's already evident that "websnarf" doesn't understand
standardization.

Douglas A. Gwyn

unread,
Aug 30, 2005, 11:16:47 AM8/30/05
to
webs...@gmail.com wrote:
> In any event, compare this to Java, where Unicode is actually the
> standard encoding for string data. It's not really possible to have
> "unicode parsing problems" in Java, since all this stuff has been
> specified in the core of the language. Compare this to ANSI C, which
> uses wchar, which literally doesn't *specify* anything useful. So
> technically the only reason one is writing Unicode parsers in C is
> because the standard doesn't give you one.

C is *meant* for implementing systems at that level, and doesn't
presuppose such things as the specific native character encoding.
People who have to deal with a variety of encodings would have to
do similar things in Java too.

Randy Howard

unread,
Aug 30, 2005, 11:52:05 AM8/30/05
to
webs...@gmail.com wrote
(in article
<1125368979.7...@g47g2000cwa.googlegroups.com>):

> This was a point to demonstrate that programmers are not perfect, no
> matter what you do. So this idea that you should just blame
> programmers is just pointless.

It's better than blaming someone that doesn't even have access
to the source code in most cases. If I saw a car wreck, I
wouldn't immediately go in search of the CEO of the company that
built the car. I'd first try to find out about the driver. It
might turn out that the car was fundamentally flawed, and
impossible to drive correctly.

You could easily make the argument that gets() is impossible, or
nearly so, to use properly. That doesn't preclude you from
getting input in other ways.
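
A minimal sketch of the sort of alternative being alluded to, using
nothing outside the standard library (the helper name and the
newline-stripping policy are my own choices, not anything mandated):

    #include <stdio.h>
    #include <string.h>

    /* Read one line into buf, never writing more than size bytes.
       Returns 1 on success, 0 on EOF or read error. */
    int read_line(char *buf, size_t size)
    {
        if (fgets(buf, (int)size, stdin) == NULL)
            return 0;
        buf[strcspn(buf, "\n")] = '\0';   /* drop the newline, if any */
        return 1;
    }

Unlike gets(), fgets() is told how big the buffer is, so an over-long
input line gets truncated instead of trampling whatever follows the
buffer.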

>>> The programmer used priority based threading because that's what he had
>>> available to him.
>>
>> He used something that does not even exist in standard C, and
>> got bit in the ass. Gee, and to think that you want to hold the
>> standard committee (and judging by another post of yours Ritchie
>> himself) responsible when people do things like this.
>
> Well, Ritchie, AFAIK, did not push for the standardization, or
> recommend that everyone actually use C as a real application
> development language. So I blame him for the very narrow problem of
> making a language with lots of silly unnecessary problems, but not for
> the fact that everyone decided to use it.

I have yet to encounter a language without a lot of silly
problems. Some folks argue that lisp is such a language, but I
don't take them seriously. :-)

> The actual ANSI C committee
> is different -- they knew exactly what role C was taking. They have
> the ability to fix the warts in the language.

No, they have the ability to recommend fixes to a language, but
as we have already seen, the developer community is more than
willing to ignore them when push comes to shove. A community
that strong-willed should be strong enough to address the gets()
problem itself, and not rely on 'permission' from the standard
group.

>>> Maybe the Pathfinder code would have more
>>> coroutines, and fewer threads, and may have avoided the problem
>>> altogether (I am not privy to their source, so I really don't know).
>>
>> That didn't stop you from blaming it on standard C, why stop
>> now?
>
> First of all, the standard doesn't *have* coroutines while other
> languages do.

It doesn't have threads either. If you want to argue about
pthreads, take it up with the POSIX guys. I suspect Butenhof
and a few others would just love to argue you with you about it
over in c.p.t.

>> How strange that they are so wildly popular, whereas threads are
>> never used. *cough*
>
> Coroutines are not very widely *deployed*.

Correct.

> So popularity is how you
> judge the power and utility of a programming mechanism?

No, but it has a lot to do with whether or not programmers see
it as a suitable alternative at design time.

> Why don't you
> try to add something substantive here rather than leading with ignorance?

Why don't you stop playing games? It's readily apparent that
programming isn't new to either of us, so let's stop fencing
with ad homs and move on.

> Can you give a serious pro-con argument for full threads versus
> coroutines? Because I can.

Yes. There is a POSIX standard for pthreads. This allows me to
develop cross-platform, portable software that utilizes them
across a diverse group of OS platforms and CPU architectures for
tools that, for a number of reasons, must be written in C, or
assembler, which isn't all that practical since different CPU
architectures are involved, and the amount of work involved
would be staggering. There is no coroutine solution available
AFAIK to solve the problem across all of the platforms, and I do
not have the luxury of discussing vaporware.
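
For context, a minimal sketch of the kind of portable pthreads code
being described here; the same source builds on the POSIX platforms
listed above and on Windows via pthreads-win32 (error handling trimmed
for brevity):

    #include <pthread.h>
    #include <stdio.h>

    static void *worker(void *arg)
    {
        printf("worker %d running\n", *(int *)arg);
        return NULL;
    }

    int main(void)
    {
        pthread_t tid;
        int id = 1;

        if (pthread_create(&tid, NULL, worker, &id) != 0)
            return 1;
        return pthread_join(tid, NULL) == 0 ? 0 : 1;
    }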

If there was coroutine support in a suitable language, and if it
was readily available on a wide variety of OS and hardware, I
would consider it, especially if it offered parity or
improvement over threads in terms of performance, flexibility
and stability. I have not spent much time on them, apart from
the theoretical, because they do not form a practical solution
to the problems I have to address today.

Ironically, I can say similar things about the Windows threading
model, which does exist, but only solves a fraction of the
problem, and in a much more tortured way.

>> That's interesting, because I have used the pthreads interfaces
>> for code on Windows (pthreads-win32), Linux, OS X, solaris, and
>> even Novell NetWare (libc, since they started supporting them
>> several years ago).
>
> You understand that those are all mostly UNIX, right? Even the windows
> thing is really an implementation or emulation of pthreads on top of
> Windows multithreading.

Wow. I didn't know that. I thought that the reason I
downloaded pthreads-win32 was simply because Microsoft forgot to
put it on the CD and had mistakenly left it on somebody else's
website. *sigh*

> Show me pthreads in an RTOS.

Read Bill Weinberg's (MontaVista) paper for an alternate view.
http://www.mvista.com/dswp/wp_rtos_to_linux.pdf

>> Have there been bugs in pthread libraries? Yes. Have there
>> been bugs in almost every library ever used in software
>> development? Yes. Where they impossible to fix? No.
>
> Right. And have they fixed the generic problem of race conditions?

No, you have to code to avoid them, just as you do to avoid
logic flaws.
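
A minimal sketch of what "coding to avoid them" usually amounts to in
pthreads terms (the names here are illustrative, not from any real
project): every access to the shared object goes through the same lock,
so two threads cannot interleave the read-modify-write.

    #include <pthread.h>

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static long counter;

    void increment(void)
    {
        pthread_mutex_lock(&lock);
        counter++;                   /* read-modify-write under the lock */
        pthread_mutex_unlock(&lock);
    }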

> Race conditions are just the multitasking equivalent of
> buffer-overflows.

Hardly.

> Except, as you know, they are *much* harder to debug,

And to exploit. Buffer overflows are comparably easy to
exploit. The best way to 'debug' races is to avoid them in the
first place.

> and you cannot use tools, compiler warnings or other simple
> mechanisms to help you avoid them.

You can use the knowledge of how to avoid them to good effect
however. I have not had to chase one down for quite a while,
and will be mildly surprised if one pops up down the road.

> This is the real benefit of coroutines over full threading.
> You can't have race conditions using coroutines.

Unfortunately, you can't have coroutines using a number of
popular languages either.

And you can have race conditions, even without threads. Two
separate processes doing I/O to the same file can race, with or
without coroutines in either or both. Race conditions are not
something threads hold a patent on, not by a long shot.

>> Feel free to propose a complete coroutine implementation.
>
> I would if I thought there was an audience for it. These things take
> effort, and a brief perusal of comp.std.c leads me to believe that the
> ANSI committee is extremely capricious.

Then skip them, and sell it instead. People use things (such as
pthreads) that aren't part of standard C every day, with varying
degrees of success. If you had a package, and it did not
subscribe to the 'all the world's an intel' philosophy, I would
entertain the notion. Not all of us have the luxury of
supporting a single, or even a few platforms. The choices are
limited, or you roll your own.

> Think about it. You want me to propose something actually useful,
> powerful and which would improve the language to a committee that
> continues to rubber stamp gets().

Yeah. And if nobody tries, it is a self-fulfilling prophecy
that it won't change.

> Is your real point that I am supposed to do this to waste my time and
> energy, obviously get rejected because the ANSI C committee has no
> interest in improving the language, and this will be proof that I am
> wrong?

No. I think that if you are correct in believing it has value
and the world needs it, then you propose it for standardization,
and while you wait, you build it anyway, because if it's that
good, then people will want to buy it if they can't get it
another way. Even if it is accepted, you can still sell it;
look at the folks selling C99 solutions and libraries today,
despite the lack of general availability.

You can sit around being mad at the world over your pet peeves,
or you can do something about it. Of course, the former is far
easier.

> Tell me, when is the last time the C language committee considered a
> change in the language that made it truly more powerful that wasn't
> already implemented in many compilers as extensions?

I can't think of an example offhand, but I am restricted by not
being completely fluent on all the various compiler extensions,
as I try to avoid them whenever possible. Even so, you have a
point about their past history.

> Can you give me at least a plausibility argument that I wouldn't
> be wasting my time by doing such a thing?

If you don't think it's worth fooling with, then why are we
arguing about it at all? Either it is a 'software crisis', or
it is, to use a coarse expression, 'just a fart in the wind'.
If it is the latter, let's stop wasting time on it.

Richard Bos

unread,
Aug 30, 2005, 11:58:39 AM8/30/05
to
Randy Howard <randy...@FOOverizonBAR.net> wrote:

> webs...@gmail.com wrote
> (in article
> <1125395777.6...@g44g2000cwa.googlegroups.com>):
>
> >>> Why does being a low language mean you have to present a programming
> >>> interface surrounded by landmines?
> >>
> >> If you have access to any sequence of opcodes available on the
> >> target processor, how can it not be?
> >
> > C gives you access to a sequence of opcodes in ways that other
> > languages do not? What exactly are you saying here? I don't
> > understand.
>
> asm( character-string-literal ); springs to mind. I do not
> believe all languages have such abilities.

Neither does C.

Richard

Randy Howard

unread,
Aug 30, 2005, 11:59:15 AM8/30/05
to
kuy...@wizard.net wrote
(in article
<1125413684.7...@g44g2000cwa.googlegroups.com>):

> webs...@gmail.com wrote:
>> Randy Howard wrote:
>>> webs...@gmail.com wrote:
>>>> Randy Howard wrote:
> ...
>> "False dichotomy". Look it up. I never mentioned high or low level
>> language, and don't consider it relevant to the discussion. It's a
>> false dichotomy because you immediately dismiss the possibility of a
>> safe low-level language.
>
> No, it's not an immediate dismissal. It's also not a dichotomy:
> low-level languages are inherently unsafe, but high-level languages are
> not inherently safe. If it's low-level, by definition it gives you
> unprotected access to dangerous features of the machine
> you're writing for. If it protected your access to those features, that
> protection (regardless of what form it takes) would make it a
> high-level language.

I'm glad that is obvious to someone else. I was feeling lonely.
:-)

> You can't remove buffer overflows from C without moving it at least a
> little bit farther away from assembly, for precisely the same reason
> why you can't remove buffer overflows from assembly without making it
> less of an assembly language.

Yes. We could just as easily have had "C without Pointers"
instead of "C with classes". Guess how many people would have
gone for the first of the two? We can argue about the benefits
or lack thereof with the second some other time. :-)

>>> Slow? Yes, I keep forgetting how much better performance one
>>> achieves when using Ruby or Python. Yeah, right.
>>
>> I never put those languages up as alternatives for speed. The false
>> dichotomy yet again.
>
> A more useful response would have been to identify these
> safer-and-speedier-than-C languages that you're referring to.

Exactly. It would have been far more difficult to do though,
and he already has 'false dichotomy' in his paste buffer.

>>> Unportable? You have got to be kidding. I must be
>>> hallucinating when I see my C source compiled and executing on
>>> Windows, Linux, NetWare, OS X, Solaris, *bsd, and a host of
>>> other UNIX-like platforms, on x86, x86-64, PPC, Sparc, etc.
>>
>> Right. Because you write every piece of C code that's ever been
>> written right?
>
> His comment says nothing to suggest that he's ported any specific
> number of programs to those platforms. It could be a single program, it
> could be a million. Why are you interpreting his claim as suggesting that
> he's ported many different programs to those platforms?

Intentional misreading, I suspect.

>> And your defence of the situation is that you assume every gainfully
>> employed programmer should be willing to quit the moment they see that
>> their process of programming is not likely to yield the highest
>> possible quality in software engineering.
>
> No, they should be willing to quit if deliberately ordered to ship
> seriously defective products. There's a huge middle ground between
> "seriously defective" and "highest possible quality". In that huge
> middle ground, they should argue and strive for better quality, but not
> necessarily threaten to quit over it.

Eerily, we had almost exactly the same response to this bit.
Once again, there is hope. :-)

>>> I would much rather go look for work than participate in
>>> something that might wind up with people dying over the actions
>>> of some meddling manager.
>>
>> That's nice for you. That's not going to be a choice for lots of other
>> people.
>
> That's a choice every employed person has. If they choose not to take
> it, that's their fault - literally, in the sense that they can and
> should be held personally liable for the deaths caused by their
> defective choice.

Agreed.

>> That was *my* point. Remember you are claiming that you want to pin
>> responsibility and liability for code to people so that you can dish
>> out punishment to them. I see a direct line of responsibility from
>> weakness in the C library back to him (or maybe it was Thompson or
>> Kernighan). And remember you want to punish people.
>
> Yes, people should be held responsible for things they're actually
> responsible for. Ritchie isn't responsible for mis-use of the things
> he's created.

He's probably in favor of suing Stihl if someone uses one of
their chainsaws to decapitate their spouse too. After all, it's
the same false premise at work.

Randy Howard

unread,
Aug 30, 2005, 12:30:56 PM8/30/05
to
Richard Bos wrote
(in article <43148223...@news.xs4all.nl>):

Ref: J.5.10. Yes, I know where it appears in the
document, but it's close enough given we are discussing the real
world. Since one can also link assembly code modules with C
into an executable without it, it seems a moot point anyway.

Alan Balmer

unread,
Aug 30, 2005, 1:39:24 PM8/30/05
to
On Tue, 30 Aug 2005 15:52:05 GMT, Randy Howard
<randy...@FOOverizonBAR.net> wrote:

>webs...@gmail.com wrote
>(in article
><1125368979.7...@g47g2000cwa.googlegroups.com>):
>
>> This was a point to demonstrate that programmers are not perfect, no
>> matter what you do. So this idea that you should just blame
>> programmers is just pointless.
>
>It's better than blaming someone that doesn't even have access
>to the source code in most cases. If I saw a car wreck, I
>wouldn't immediately go in search of the CEO of the company that
>built the car. I'd first try to find out about the driver. It
>might turn out that the car was fundamentally flawed, and

Can't you guys find a more suitable venue for your argument?

How about comp.programming?
--
Al Balmer
Balmer Consulting
removebalmerc...@att.net

Keith Thompson

unread,
Aug 30, 2005, 3:44:03 PM8/30/05
to
webs...@gmail.com writes:
> Randy Howard wrote:
>> webs...@gmail.com wrote:
>> > Randy Howard wrote:
>> >>> Bad programming + good programming language does not allow for buffer
>> >>> overflow exploits.
>> >>
>> >> For suitably high-level languages that might be true (and
>> >> provable). Let us not forget that C is *not* a high-level
>> >> language. It's not an accident that it is called high-level
>> >> assembler.
>> >
>> > Right. If you're not with us, you are with the terrorists.
>>
>> Excuse me?
>
> "False dichotomy". Look it up. I never mentioned high or low level
> language, and don't consider it relevant to the discussion. It's a
> false dichotomy because you immediately dismiss the possibility of a
> safe low-level language.

"websnarf", let me offer a suggestion. If you want to point out a
false dichotomy, use the words "false dichotomy". I had no clue what
you meant by your remark about terrorists, and I'd be surprised if
anyone else did either. (I didn't mention it earlier because,
frankly, I didn't care what you meant.)

If you want to communicate, you need to write more clearly. If you're
more interested in showing off how obscure you can be, please do so
somewhere else.

Douglas A. Gwyn

unread,
Aug 30, 2005, 3:07:33 PM8/30/05
to
Randy Howard wrote:
> .. A community

> that strong-willed should be strong enough to address the gets()
> problem itself, and not rely on 'permission' from the standard
> group.

Actually all you need to do is not use gets (except
perhaps in certain carefully controlled situations).
There are other, standard, mechanisms that can be used
safely enough. If the actual problem is perceived to
be that naive novice programmers might use gets
without appreciating the opportunity for buffer
overrun, consider that the same programmers will make
comparable errors throughout their code. A genuine
fix for the actual problem requires something quite
different from removing gets from the system library.

Randy Howard

unread,
Aug 30, 2005, 5:36:44 PM8/30/05
to
Alan Balmer wrote
(in article <0c69h15q8pdf7bgfo...@4ax.com>):

> Can't you guys find a more suitable venue for your argument?

I'm sorry. Is a discussion about what might or might not happen
in future C standards drowning out discussions of homework
problems and topicality?

Randy Howard

unread,
Aug 30, 2005, 5:51:34 PM8/30/05
to
Douglas A. Gwyn wrote
(in article <4314AE75...@null.net>):

> Randy Howard wrote:
>> .. A community
>> that strong-willed should be strong enough to address the gets()
>> problem itself, and not rely on 'permission' from the standard
>> group.
>
> Actually all you need to do is not use gets (except
> perhaps in certain carefully controlled situations).
> There are other, standard, mechanisms that can be used
> safely enough.

Indeed, if I am not mistaken I made that very point several
times already. Apparently it lacks an iron-clad guarantee.

> If the actual problem is perceived to be that naive novice
> programmers might use gets without appreciating the opportunity
> for buffer overrun, consider that the same programmers will make
> comparable errors throughout their code.

Yes. Also, it is incredibly unlikely that a naive novice
programmer will be producing software that will be widely
deployed and wind up in a CERT advisory, but I suppose it is not
impossible.

I am somewhat curious about why even as late as C99, or even
later in TC1, there is still no official wording in the standard
concerning gets() being of any concern at all. It seems that it
couldn't have offended many people to simply say "hey, this
is in the standard already, but it's really not a good idea to
use it for new development, and in fact, it is highly
recommended that an existing usage be expunged." That seems to
be the strongest argument in favor of Hsieh's position that I
have seen so far. It is very hard to think of a justification
for it appearing unadorned with a warning in the text.

> A genuine fix for the actual problem requires something quite
> different from removing gets from the system library.

What would you propose?

Alan Balmer

unread,
Aug 30, 2005, 5:57:05 PM8/30/05
to
On Tue, 30 Aug 2005 21:36:44 GMT, Randy Howard
<randy...@FOOverizonBAR.net> wrote:

>Alan Balmer wrote
>(in article <0c69h15q8pdf7bgfo...@4ax.com>):
>
>> Can't you guys find a more suitable venue for your argument?
>
>I'm sorry. Is a discussion about what might or might not happen
>in future C standards drowning out discussions of homework
>problems and topicality?

Reread your last 223-line post. That's not what it was about.

Unless you are concentrating mostly on the "might not happen" aspect,
in which case everything becomes topical. I can think of hundreds of
things which might not appear in future C standards.

Douglas A. Gwyn

unread,
Aug 30, 2005, 6:17:36 PM8/30/05
to
Randy Howard wrote:
> Yes. Also, it is incredibly unlikely that a naive novice
> programmer will be producing software that will be widely
> deployed and wind up in a CERT advisory, but I suppose it is not
> impossible.

Judging by some of the reported bugs, one wonders.

Just a few weeks ago, there was an IAVA for a bug in
Kerberos v5 that was essentially of the form
    if (!try_something) {
        error_flag = CODE;
        free(buffer);
    }
    free(buffer);
How that could have passed even a casual code review
is a mystery.
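
The obvious repair, sketched with the same paraphrased identifiers
rather than the real Kerberos code, is to free the buffer exactly once,
on a single path; the error branch should only set the flag:

    if (!try_something)
        error_flag = CODE;
    free(buffer);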

> I am somewhat curious about why even as late as C99, or even
> later in TC1, there is still no official wording in the standard
> concerning gets() being of any concern at all. It seems that it
> couldn't have offended many people to simply saying "hey, this
> is in the standard already, but it's really not a good idea to
> use it for new development, and in fact, it is highly
> recommended that an existing usage be expunged." That seems to
> be the strongest argument in favor of Hsieh's position that I
> have seen so far. It is very hard to think of a justification
> for it appearing unadorned with a a warning in the text.

The simple answer is that the C standard is a specification
document, not a programming tutorial. Such a warning
properly belongs in the Rationale Document, not in the spec.

Old Wolf

unread,
Aug 30, 2005, 8:44:18 PM8/30/05
to
webs...@gmail.com wrote:
>
> Second of all, remember, I *BEAT* the performance of C's strings
> across the board on multiple platforms with a combination of run
> time and API design in Bstrlib. This is a false idea that error
> checking always costs performance. Performance is about design,
> not what you do about safety.

You keep going on about how "C is slow" and "it would be easy
to make it faster and safer". Now you claim that you have a
library that does make C "faster and safer".

In other messages, you've explained that by "safer", you mean
being less prone to buffer overflows and undefined behaviour.

The only way a C-like language can avoid buffer overflows
is to include a runtime bounds check.

Please explain how -adding- a runtime bounds check to some
code, makes it faster than the exact same code but without
the check.
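
For what it's worth, the usual answer from the length-carrying-string
camp is that the design change pays for the check: if the string
carries its length, the bounds check is a single compare, while the
strlen()/strcat() idioms rescan the data on every call. A minimal
sketch, assuming a hypothetical struct rather than Bstrlib's actual
API:

    #include <string.h>

    struct lstring {
        size_t len;      /* current length      */
        size_t cap;      /* allocated capacity  */
        char  *data;     /* kept NUL-terminated */
    };

    /* Append n bytes of src; fail, rather than overflow, if it won't fit. */
    int lstr_append(struct lstring *s, const char *src, size_t n)
    {
        if (n >= s->cap - s->len)     /* the bounds check: one compare */
            return -1;
        memcpy(s->data + s->len, src, n);
        s->len += n;
        s->data[s->len] = '\0';
        return 0;
    }

Whether that nets out faster than unchecked strcat() in any given
program is an empirical question, but it shows why "add a check" and
"get slower" are not automatically the same thing.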

Chris Hills

unread,
Aug 30, 2005, 12:20:38 PM8/30/05
to
In article <1125209654.0...@o13g2000cwo.googlegroups.com>,

webs...@gmail.com writes
>Randy Howard wrote:
>> webs...@gmail.com wrote:
>>
>> It is the fault of the development team, comprised of whoever
>> that involves for a given project. If the programmer feels like
>> his boss screwed him over, let him refuse to continue, swear out
>> an affidavit and have it notarized the bad software was
>> knowingly shipped, and that you refuse to endorse it.
>
>Oh I see. So, which socialist totally unionized company do you work as
>a programmer for? I'd like to apply!

You don't need to work in a unionised company to do that. I have seen
Professional Software Engineers put disclaimers on work they have had to
modify, stating that whilst they are happy with their own work they were very
unhappy with the rest of it. It even got locked into the comments in the
CVS database to make sure it was on record.

Of course the corporate manslaughter Bill (UK) helps now.

>> [...] If you
>> are being overworked, you can either keep doing it, or you can
>> quit, or you can convince your boss to lighten up.
>

>Hmmm ... so you live in India?

How the hell do you come to that conclusion? seriously I would like to
see your reasoning here.

> I'm trying to guess where it is in this
>day and age that you can just quit your job solely because you don't
>like the pressures coming from management.

Any civilised country. Life is too short to mess about. It is the way
Professionals should behave.

>>
>> Try and force me to write something in a way that I know is
>> wrong. Go ahead, it'll be a short argument, because I will
>> resign first.
>
>That's a nice bubble you live in. Or is it just in your mind?

No. People do that. Seen it done. Apparently you don't mind what you do
as long as you are paid. Where do you draw the line?


>> Try and force a brain surgeon to operate on your head with a
>> chainsaw. good luck.
>>
>> >> [...] If you could be fined or perhaps even jailed for
>> >> gross neglicence in software development the way doctors can be
>> >> today, I suspect the problem would be all but nonexistent.
>> >
>> > Ok, that's just vindictive nonsense.
>>
>> Why? We expect architects, doctors, lawyers, pretty much all
>> other real 'professions' to meet and typically exceed a higher
>> standard, and those that do not are punished, fined, or stripped
>> of their license to practice in the field. Why should
>> programmers get a pass? Is it because you do not feel it is a
>> professional position?
>
>Because its not as structured, and that's simply not practical.
>Doctors have training, internships, etc. Lawyers have to pass a bar
>exam, etc. There's no such analogue for computer programmers.

Yes there is. Most certainly. There is the PE in many countries and
Chartered Engineer in others. It requires a Degree, training and
experience.

The discussion in another thread is should it be made mandatory world
wide.

>> We don't let anyone that wants to prescribe medicine, why should
>> we let anyone that wants to put software up for download which
>> could compromise system security?


>>
>> > Programmers are generally not aware of the liability of
>> > their mistakes.
>>
>> Then those you refer to must be generally incompetent.

Agreed.

>> Correct. It's also not possible to completely remove medical
>> malpractice, but it gets punished anyway. It's called a
>> deterrent.
>
>You don't think medical practioners use the latest and safest
>technology available to practice their medicine?

Not always.

>Go measure your own first-compile error rate

That is a meaningless statistic.

>For a nuclear reactor, I would also include the requirement that they
>use a safer programming language like Ada.

As the studies have shown, language choice has a minimal impact on
errors.

> Personally I would be
>shocked to know that *ANY* nuclear reactor control mechanism was
>written in C. Maybe a low level I/O driver library, that was
>thoroughly vetted (because you probably can't do that in Ada), but
>that's it.

Which destroys your argument! Use Ada because it is safe but the
interface between Ada and the hardware is C.... So effectively C
controls the reactor.

>>
>> I guess all the professionals in other fields where they are
>> held up to scrutiny must be irresponsible daredevils too.
>
>No -- they have great assistance and controlled environments that allow
>them to perform under such conditions.

Yes.

> Something akin to using a
>better programming language.

No.

>> [...] For
>> example, there are operations that have very low success rates,
>> yet there are doctors that specialize in them anyway, despite
>> the low odds.
>
>Well, your analogy only makes some sense if you are talking about
>surgeons in developing countries who simply don't have access to the
>necessary anesthetic, support staff or even the proper education to do
>the operation correctly. In those cases, there is little choice, so
>you make do with what you have. But obviously its a situation you just
>want to move away from -- they way you solve it, is you give them
>access to the safer, and better ways to practice medicine.

There is a blinkered view. Some operations are the only choice between
life and death. a relative on mine had that... this op has a 40% success
rate. Without it your life expectancy is about 2 weeks.

So you do the dangerous op.


>> If you don't want to take the risk, then go write in visual
>> whatever#.net and leave it to those that are.
>
>So you want some people to stay away from C because the language is too
>dangerous.

It is in the hands of the inexperienced.

> While I want the language be fixed so that most people

>don't trigger the landmines in the language so easily. If you think
>about it, my solution actually *costs* less.

I have seen more errors in "safe" languages because people thought that if it
compiled OK it must be good code... there is a LOT more to it than
getting it to compile. Only an idiot would think that.

Chris Hills

unread,
Aug 31, 2005, 2:53:06 AM8/31/05
to
In article <0001HW.BF39EAD5...@news.verizon.net>, Randy
Howard <randy...@FOOverizonBAR.net> writes

>
>> The actual ANSI C committee
>> is different -- they knew exactly what role C was taking. They have
>> the ability to fix the warts in the language.
>
>No, they have the ability to recommend fixes to a language,

Actually "they" (the WG14 committee) can change or "fix" the language
Unfortunately IMOH they are not doing so.

> but
>as we have already seen, the developer community is more than
>willing to ignore them when push comes to shove. A community
>that strong-willed should be strong enough to address the gets()
>problem itself, and not rely on 'permission' from the standard
>group.

This is why things like MISRA-C exist and are widely used.

>
>> Tell me, when is the last time the C language committee considered a
>> change in the language that made it truly more powerful that wasn't
>> already implemented in many compilers as extensions?
>
>I can't think of an example offhand, but I am restricted by not
>being completely fluent on all the various compiler extensions,
>as I try to avoid them whenever possible. Even so, you have a
>point about their past history.

They are about to do it for the IBM maths extensions, and also some DSP
maths functions.

Lawrence Kirby

unread,
Aug 31, 2005, 11:40:52 AM8/31/05
to
On Thu, 25 Aug 2005 20:15:55 +0000, Douglas A. Gwyn wrote:

> Walter Roberson wrote:
>> This starts to get into murky waters. Z-X is a subtraction
>> of pointers, the result of which is ptrdiff_t, which is a signed
>> integral type. Logically, though, Z-X could be of size_t, which
>> is unsigned. This difference has probably been discussed in the past,
>> but I have not happened to see the discussion of what happens with
>> pointer subtraction if the object size would fit in the unsigned
>> type but not in the signed type. Anyhow, ro, lo, ol should not be int.
>
> ptrdiff_t is supposed to be defined as a type wide enough to
> accommodate *any* possible result of a valid subtraction of
> pointers to objects. If an implementation doesn't *have* a
> suitable integer type, that is a deficiency.

The standard disagrees with you. 6.5.6p9 says:

"When two pointers are subtracted, both shall point to elements of the
same array object, or one past the last element of the array object; the
result is the difference of the subscripts of the two array elements. The
size of the result is implementation-defined, and its type (a signed
integer type) is ptrdiff_t defined in the <stddef.h> header. If the
result is not representable in an object of that type, the behavior is
undefined. ..."

It states very clearly in the last sentence that the result of pointer
subtraction need not be representable as a ptrdiff_t. In such a case you
get undefined behaviour i.e. it is the program that is at fault, not the
implementation.

Lawrence

Bart van Ingen Schenau

unread,
Aug 31, 2005, 1:27:18 PM8/31/05
to
Douglas A. Gwyn wrote:

> Randy Howard wrote:
>> Yes. Also, it is incredibly unlikely that a naive novice
>> programmer will be producing software that will be widely
>> deployed and wind up in a CERT advisory, but I suppose it is not
>> impossible.
>
> Judging by some of the reported bugs, one wonders.
>
> Just a few weeks ago, there was an IAVA for a bug in
> Kerberos v5 that was essentially of the form
> if (!try_something) {
> error_flag = CODE;
> free(buffer);
> }
> free(buffer);
> How that could have passed even a casual code review
> is a mystery.

I recently had to find out why two of our build environments were
behaving slightly differently from each other.
It turned out that a largely unrelated component, which is only included
in one of the environments, did something along the lines of

    EnterCriticalSection();
    if (condition)
        return;
    ExitCriticalSection();

and due to the architecture of our system, this affected quite a number
of processes.
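
For what it's worth, here is a minimal sketch of that bug pattern using
the real Win32 names (the fragment above presumably means
LeaveCriticalSection rather than ExitCriticalSection; the condition and
the work are placeholders, not our actual code):

    #include <windows.h>

    static CRITICAL_SECTION cs;  /* assume InitializeCriticalSection(&cs) ran at startup */

    void broken(int condition)
    {
        EnterCriticalSection(&cs);
        if (condition)
            return;                      /* BUG: returns while still holding the lock */
        /* ... work ... */
        LeaveCriticalSection(&cs);
    }

    void fixed(int condition)
    {
        EnterCriticalSection(&cs);
        if (condition) {
            LeaveCriticalSection(&cs);   /* release on every exit path */
            return;
        }
        /* ... work ... */
        LeaveCriticalSection(&cs);
    }

The point is the same as with the Kerberos snippet quoted above: every
exit path has to undo what the entry path did.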

> The simple answer is that the C standard is a specification
> document, not a programming tutorial. Such a warning
> properly belongs in the Rationale Document, not in the spec.

I think that marking gets() as deprecated should be warning enough for
people to understand that the function must not be used for new code or
code that is being prepared for a new compiler version.

Bart v Ingen Schenau
--
a.c.l.l.c-c++ FAQ: http://www.comeaucomputing.com/learn/faq
c.l.c FAQ: http://www.eskimo.com/~scs/C-faq/top.html
c.l.c++ FAQ: http://www.parashift.com/c++-faq-lite/

Tim Rentsch

unread,
Aug 31, 2005, 2:23:27 PM8/31/05
to
Lawrence Kirby <lkn...@netactive.co.uk> writes:

Actually it doesn't say that the result need not be representable;
only that *if* the result is not representable then something else
is true. The cited paragraph doesn't say that the "if" clause can
be satisfied. We might infer that it can, but the text doesn't say
that it can.

I've been unable to find any statement one way or the other about
whether ptrdiff_t must accommodate all pointer subtractions done on
valid arrays. Consider for example the following. Suppose
PTRDIFF_MAX == INT_MAX, and SIZE_MAX == UINT_MAX; then

char *p0 = malloc( 1 + (size_t)PTRDIFF_MAX );
char *p1 = p0 + PTRDIFF_MAX;
char *p2 = p1 + 1;
size_t s = p2 - p0;

If undefined behavior prevents the last assignment from working,
what are we to conclude? That size_t is the wrong type? That
SIZE_MAX has the wrong value? That malloc should have failed
rather than delivering a too-large object? That ptrdiff_t is
the wrong type? Or that all standard requirements were met,
and the problem is one of low quality of implementation?

I believe, in the absence of an explicit statement to the contrary,
the limits on ptrdiff_t and PTRDIFF_MAX allow the possibility that
a difference of valid pointers to a valid array object might not be
representable as a value of type ptrdiff_t. I also think the
language would be improved if there were a requirement that a
difference of valid pointers to a valid array object must always be
representable as a value of type ptrdiff_t.

However, whether the language does or does not, or should or should
not, have such a requirement, the *standard* would be improved if it
included an explicit statement about whether this requirement must
be met.

Randy Howard

unread,
Aug 31, 2005, 2:43:58 PM8/31/05
to
Douglas A. Gwyn wrote
(in article <4314DB00...@null.net>):

> The simple answer is that the C standard is a specification
> document, not a programming tutorial. Such a warning
> properly belongs in the Rationale Document, not in the spec.

Okay, where can I obtain the Rationale Document that warns
programmers not to use gets()?

Wojtek Lerch

unread,
Aug 31, 2005, 2:56:01 PM8/31/05
to
Randy Howard wrote:
> Okay, where can I obtain the Rationale Document that warns
> programmers not to use gets()?

http://www.open-std.org/jtc1/sc22/wg14/www/docs/C99RationaleV5.10.pdf

Keith Thompson

unread,
Aug 31, 2005, 4:00:33 PM8/31/05
to
[...]

>>For a nuclear reactor, I would also include the requirement that they
>>use a safer programming language like Ada.
>
> As the studies have shown.. language choice has a minimum impact on
> errors.
>
>> Personally I would be
>>shocked to know that *ANY* nuclear reactor control mechanism was
>>written in C. Maybe a low level I/O driver library, that was
>>thoroughly vetted (because you probably can't do that in Ada), but
>>that's it.
>
> Which destroys your argument! Use Ada because it is safe but the
> interface between Ada and the hardware is C.... So effectively C
> controls the reactor.

Just to correct the misinformation, there's no reason a low level I/O
driver library couldn't be written in Ada. The language was designed
for embedded systems. Ada can do all the unsafe low-level stuff C can
do; it just isn't the default.

Keith Thompson

unread,
Aug 31, 2005, 4:27:17 PM8/31/05
to

Which says, in 7.19.7.7:

Because gets does not check for buffer overrun, it is generally
unsafe to use when its input is not under the programmer's
control. This has caused some to question whether it should
appear in the Standard at all. The Committee decided that gets
was useful and convenient in those special circumstances when the
programmer does have adequate control over the input, and as
longstanding existing practice, it needed a standard
specification. In general, however, the preferred function is
fgets (see 7.19.7.2).

Personally, I think the Committee blew it on this one. I've never
heard of a real-world case where a program's input is under
sufficiently tight control that gets() can be used safely. On the
other hand, I have seen numerous calls to gets() in programs that are
expected to receive interactive input. As far as I know, the
"longstanding existing practice" cited in the Rationale is the
*unsafe* use of gets(), not the hypothetical safe use.

In the unlikely event that I were implementing a system where I had
that kind of control over a program's stdin, I'd still use fgets() so
I could do some error checking, at least in the context of unit
testing. Even in such a scenario, I'd be far more likely to read from
a source other than stdin, where gets() can't be used anyway.

I just found 13 calls to gets() in the source code for a large
software package implemented in C (which I prefer not to identify).
They were all in small test programs, not in production code, and they
all used buffers large enough that an interactive user is not likely
to overflow it -- but that's no excuse for writing unsafe code.

I'd be interested in seeing any real-world counterexamples.

Randy Howard

unread,
Aug 31, 2005, 4:29:03 PM8/31/05
to
Wojtek Lerch wrote
(in article <e3nRe.7584$2F1.4...@news20.bellglobal.com>):

Indeed. It doesn't exactly make the point very clearly, or
pointedly. Somehow "generally unsafe" doesn't seem strong
enough to me.

Anonymous 7843

unread,
Aug 31, 2005, 5:57:57 PM8/31/05
to
In article <lnirxlq...@nuthaus.mib.org>,

Keith Thompson <ks...@mib.org> wrote:
>
> Personally, I think the Committee blew it on this one. I've never
> heard of a real-world case where a program's input is under
> sufficiently tight control that gets() can be used safely. On the
> other hand, I have seen numerous calls to gets() in programs that are
> expected to receive interactive input. As far as I know, the
> "longstanding existing practice" cited in the Rationale is the
> *unsafe* use of gets(), not the hypothetical safe use.

I think gets() could be made safer with some minor changes.

Add something like "#define GETSMAX n" to stdio.h, where n is an
implementation-defined constant with a guaranteed minimum. Then,
redefine gets() such that it is guaranteed to never put more than
GETSMAX-1 characters plus the trailing \0 into the buffer. Additional
characters in the input will be thrown away.

Code that uses gets() could then be made "safe" by making sure the
buffer passed in has at least GETSMAX characters available.

An interesting alternative to this would be to provide a function or
variable that can be set by the programmer at run time to alter gets()
max length behavior, something like setgetssize(size_t n). This would
allow an existing program filled with declarations like "char
inbuf[80]" to be fixable with one line.

Of course, nothing is stopping programmers from writing their own
line-oriented input function with exactly the interface they like. For
something on the order of gets() or fgets() it wouldn't take very long.
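
To make the idea concrete, here is a minimal sketch of such a bounded
reader, assuming a GETSMAX of 256; the macro and the function name
getsmax() are hypothetical, not standard C:

    #include <stdio.h>
    #include <string.h>

    #define GETSMAX 256

    /* Like gets(), but never stores more than GETSMAX bytes (including
       the '\0') and quietly discards the rest of an over-long line. */
    char *getsmax(char *buf)
    {
        char *nl;
        int c;

        if (fgets(buf, GETSMAX, stdin) == NULL)
            return NULL;                 /* EOF or read error */

        nl = strchr(buf, '\n');
        if (nl != NULL)
            *nl = '\0';                  /* strip the newline, as gets() does */
        else                             /* line longer than GETSMAX-1: */
            while ((c = getchar()) != EOF && c != '\n')
                ;                        /* throw away the excess */
        return buf;
    }

Any caller that provides a buffer of at least GETSMAX bytes is then safe
from overflow, which is the property the proposal is after.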

Keith Thompson

unread,
Aug 31, 2005, 6:48:52 PM8/31/05
to
anon...@example.com (Anonymous 7843) writes:
> In article <lnirxlq...@nuthaus.mib.org>,
> Keith Thompson <ks...@mib.org> wrote:
>>
>> Personally, I think the Committee blew it on this one. I've never
>> heard of a real-world case where a program's input is under
>> sufficiently tight control that gets() can be used safely. On the
>> other hand, I have seen numerous calls to gets() in programs that are
>> expected to receive interactive input. As far as I know, the
>> "longstanding existing practice" cited in the Rationale is the
>> *unsafe* use of gets(), not the hypothetical safe use.
>
> I think gets() could be made safer with some minor changes.

I disagree.

> Add something like "#define GETSMAX n" to stdio.h, where n is an
> implementation-defined constant with a guaranteed minimum. Then,
> redefine gets() such that it is guaranteed to never put more than
> GETSMAX-1 characters plus the trailing \0 into the buffer. Additional
> characters in the input will be thrown away.
>
> Code that uses gets() could then be made "safe" by making sure the
> buffer passed in has at least GETSMAX characters available.

Assuming such a change is made in the next version of the standard, or
widely implemented as an extension, code that uses the new safe gets()
will inevitably be recompiled on implementations that provide the old
unsafe version.

The solution is to eradicate gets(), not to fix it.

To paraphrase Dennis Ritchie's comments on the proposed "noalias"
keyword (and not to imply that he does or doesn't agree with me on
gets()):

gets() must go. This is non-negotiable.

Wojtek Lerch

unread,
Aug 31, 2005, 6:54:24 PM8/31/05
to
"Randy Howard" <randy...@FOOverizonBAR.net> wrote in message
news:0001HW.BF3B7D40...@news.verizon.net...

Sure. Whatever. I don't think a lot of programmers learn C from the
Standard or the Rationale anyway. It should be the job of teachers and
handbooks to make sure that beginners realize that it's not a good idea to
use gets(), or to divide by zero, or to cause integer overflow.

On the other hand, I don't think it would be unreasonable for the Standard
to officially declare gets() as obsolescent in the "Future library
directions" chapter.


Anonymous 7843

unread,
Aug 31, 2005, 7:35:42 PM8/31/05
to
In article <lnoe7do...@nuthaus.mib.org>,
Keith Thompson <ks...@mib.org> wrote:

>
> anon...@example.com (Anonymous 7843) writes:
> >
> > Add something like "#define GETSMAX n" to stdio.h, where n is an
> > implementation-defined constant with a guaranteed minimum. Then,
> > redefine gets() such that it is guaranteed to never put more than
> > GETSMAX-1 characters plus the trailing \0 into the buffer. Additional
> > characters in the input will be thrown away.
> >
> > Code that uses gets() could then be made "safe" by making sure the
> > buffer passed in has at least GETSMAX characters available.
>
> Assuming such a change is made in the next version of the standard, or
> widely implemented as an extension, code that uses the new safe gets()
> will inevitably be recompiled on implementations that provide the old
> unsafe version.

Making a change in a new C standard is supposed to fix
implementations adhering to old standards? That's a mighty
high wall to climb, for any proposed change.

Aside from that, if the "new code" used GETSMAX or
setgetsbuflen() it would actually fail to compile on an old
implementation.

> gets() must go. This is non-negotiable.

The situation isn't quite the same. noalias was a new
proposal with no existing code in jeopardy. gets() is
used widely in quick-n-dirty contexts like unit tests.

Ideally, gets() *would* go (see, I secretly agree with you,
please don't tell anyone) and there would a replacement
that comes to a nice compromise between the simplicity of
using gets() and the lets-not-incite-undefined-behavior
aspect of fgets().

Something like getstr(char *, size_t) with the truncation
of long lines.

Chris Hills

unread,
Aug 31, 2005, 7:33:51 PM8/31/05
to
In article <lnirxlq...@nuthaus.mib.org>, Keith Thompson <kst-
u...@mib.org> writes

I think this was one of the reasons in the very early days of some of
the more security-conscious computer networks that could be externally
accessed. They would send a stream of several kilobytes of characters
back at anyone who did not get the password right at the first attempt,
thus overflowing any buffers.


Back in the days when I had a better power to weight ratio and a 1200
full duplex modem was FAST I found a few like that. No it was not parity
errors or wrong baud rate. They looked different. In fact it was a
different world back then.

Keith Thompson

unread,
Aug 31, 2005, 7:54:56 PM8/31/05
to
anon...@example.com (Anonymous 7843) writes:
> In article <lnoe7do...@nuthaus.mib.org>,
> Keith Thompson <ks...@mib.org> wrote:
>>
>> anon...@example.com (Anonymous 7843) writes:
>> >
>> > Add something like "#define GETSMAX n" to stdio.h, where n is an
>> > implementation-defined constant with a guaranteed minimum. Then,
>> > redefine gets() such that it is guaranteed to never put more than
>> > GETSMAX-1 characters plus the trailing \0 into the buffer. Additional
>> > characters in the input will be thrown away.
>> >
>> > Code that uses gets() could then be made "safe" by making sure the
>> > buffer passed in has at least GETSMAX characters available.
>>
>> Assuming such a change is made in the next version of the standard, or
>> widely implemented as an extension, code that uses the new safe gets()
>> will inevitably be recompiled on implementations that provide the old
>> unsafe version.
>
> Making a change in a new C standard is supposed to fix
> implementations adhering to old standards? That's a mighty
> high wall to climb, for any proposed change.

No, of course a change in a new standard won't fix old
implementations. That was my point.

> Aside from that, if the "new code" used GETSMAX or
> setgetsbuflen() it would actually fail to compile on an old
> implementation.

Sure, but it would make it more difficult to detect code that uses
gets() incorrectly. Given the current standard, that's basically any
code that uses gets().

>> gets() must go. This is non-negotiable.
>
> The situation isn't quite the same. noalias was a new
> proposal with no existing code in jeopardy. gets() is
> used widely in quick-n-dirty contexts like unit tests.

Yes, and it shouldn't be.

> Ideally, gets() *would* go (see, I secretly agree with you,
> please don't tell anyone) and there would a replacement
> that comes to a nice compromise between the simplicity of
> using gets() and the lets-not-incite-undefined-behavior
> aspect of fgets().
>
> Something like getstr(char *, size_t) with the truncation
> of long lines.

You're proposing a new variant of fgets() that doesn't specify the
input file (and therefore always uses stdin), and that strips the
trailing '\n'. I would have no objection to that. But with or
without this new function, gets() should not be used, and ideally
should not be standardized or implemented.

Randy Howard

unread,
Aug 31, 2005, 9:13:49 PM8/31/05
to
Anonymous 7843 wrote
(in article <i9rRe.1983$mH.1699@fed1read07>):

> Making a change in a new C standard is supposed to fix
> implementations adhering to old standards? That's a mighty
> high wall to climb, for any proposed change.

Yes. A far better use of spam would be to send out notices to
everyone with an email account on the dangers of gets() instead
of trying to convince them to order prescription medicine
online.

> Aside from that, if the "new code" used GETSMAX or
> setgetsbuflen() it would actually fail to compile on an old
> implementation.

There are so many better alternatives available, I see no reason
to reuse the same name for different behavior. It's not like
they are running out of names for functions. Since str-whatever
is reserved already, strget might make a nice solution, and an
implementation similar to what various folks have proposed in
the past, such as Heathfield's 'fgetline' (IIRC), or ggets()
from CBF, etc. There certainly wouldn't be any harm in /adding/
a new replacement that can be used safely, and deprecating
gets() entirely. Reusing gets() would just confuse even more
people. The world does not need more confused newbies, they are
already in abundant supply and replicate faster than they
disappear.

>> gets() must go. This is non-negotiable.
>
> The situation isn't quite the same. noalias was a new
> proposal with no existing code in jeopardy. gets() is
> used widely in quick-n-dirty contexts like unit tests.

I can't think of a single example of gets() being used in a
piece of code worth worrying about. If it is used widely in
quick-n-dirty contexts, then it isn't a problem. It's trivial
to fix 'quick-n-dirty' programs if you get bitten by it.

The bigger packages using it, /need/ to break as early as
possible, before they spread into broader use and cause more
problems.

> Something like getstr(char *, size_t) with the truncation
> of long lines.

There are lots of options, and that may be part of the problem.
Too many choices. Fortunately, all of them are better than the
currently standardized gets().

Randy Howard

unread,
Aug 31, 2005, 9:20:54 PM8/31/05
to
Wojtek Lerch wrote
(in article <2didnRd47bq...@rogers.com>):

> "Randy Howard" <randy...@FOOverizonBAR.net> wrote in message
> news:0001HW.BF3B7D40...@news.verizon.net...
>> Wojtek Lerch wrote
>> (in article <e3nRe.7584$2F1.4...@news20.bellglobal.com>):
>>
>>> Randy Howard wrote:
>>>> Okay, where can I obtain the Rationale Document that warns
>>>> programmers not to use gets()?
>>>
>>> http://www.open-std.org/jtc1/sc22/wg14/www/docs/C99RationaleV5.10.pdf
>>
>> Indeed. It doesn't exactly make the point very clearly, or
>> pointedly. Somehow "generally unsafe" doesn't seem strong
>> enough to me.
>
> Sure. Whatever. I don't think a lot of programmers learn C from the
> Standard or the Rationale anyway.

Unfortunately, some of them don't listen to anything not nailed
down though. The typical freshly-minted know-it-all response is
"Who are you to tell me not to use it? The ISO C standards body
put it in there for a reason. If it was bad, it wouldn't be in
an international standard. duh."

> It should be the job of teachers and handbooks to make sure that
> beginners realize that it's not a good idea to use gets(), or to
> divide by zero, or to cause integer overflow.

You'd think so. Judging by the number of college students today
that ask questions about basic problems with floating point
error propagation and avoidance, I am not enthusiastic about the
odds. That was a freshman-year course back in the day; we
weren't going to school to learn how to specify em units in a
style sheet, we were supposed to be learning about using
computers for something useful, like solving engineering
problems.

> On the other hand, I don't think it would be unreasonable for the
> Standard to officially declare gets() as obsolescent in the "Future
> library directions" chapter.

If ISO expects anyone to take C0x seriously, then they have to
do something about this sort of thing, including gets() and
perhaps some strong words at least about some of the other
string function suspects as well. If gets() stays in unadorned,
it'll be pathetic.

Randy Howard

unread,
Aug 31, 2005, 9:28:35 PM8/31/05
to
Keith Thompson wrote
(in article <lnirxlq...@nuthaus.mib.org>):

> Which says, in 7.19.7.7:
>
> Because gets does not check for buffer overrun, it is generally
> unsafe to use when its input is not under the programmer's
> control. This has caused some to question whether it should
> appear in the Standard at all. The Committee decided that gets
> was useful and convenient in those special circumstances when the
> programmer does have adequate control over the input, and as
> longstanding existing practice, it needed a standard
> specification. In general, however, the preferred function is
> fgets (see 7.19.7.2).
>
> Personally, I think the Committee blew it on this one.

I think that is an almost universal opinion, apart from those
that were sitting on it at the time. They're outnumbered about
10000:1 from what I can tell. Every time a buffer overrun gets
reported, or another "Shellcoder's Handbook" gets published, the
odds get even worse.

Having a few people sitting around doing the three-monkeys trick
doesn't change it.

> As far as I know, the
> "longstanding existing practice" cited in the Rationale is the
> *unsafe* use of gets(), not the hypothetical safe use.

Exactamundo. I'd love to see a single example of a widely used
program implemented with gets() that can be demonstrated as
safe, due to the programmer having 'adequate control'. Can
anyone point to one that is in the wild?

> I just found 13 calls to gets() in the source code for a large
> software package implemented in C (which I prefer not to identify).

I wonder what it would take to get SourceForge to scan every
line of C or C++ source looking for it and putting out a 'bad
apples' list on their home page. A basic service for the good
of humanity. Embargoing downloads for projects until they are
expunged would be even better. Time to wake up...

> They were all in small test programs, not in production code, and they
> all used buffers large enough that an interactive user is not likely
> to overflow it -- but that's no excuse for writing unsafe code.

One particular email package that is widely claimed to be
extremely secure and well-written includes a dozen or more
instances of void main() in its various components. The author
couldn't care less, despite having been made aware of it.
That's a lesser evil in the grand scheme of things, but when
people refuse to change things, even when they know they are
wrong, you know it isn't going to be easy.

Wojtek Lerch

unread,
Aug 31, 2005, 10:25:52 PM8/31/05
to
"Randy Howard" <randy...@FOOverizonBAR.net> wrote in message
news:0001HW.BF3BC1A6...@news.verizon.net...

> Wojtek Lerch wrote
> (in article <2didnRd47bq...@rogers.com>):
>> "Randy Howard" <randy...@FOOverizonBAR.net> wrote in message
>> news:0001HW.BF3B7D40...@news.verizon.net...
>>> Indeed. It doesn't exactly make the point very clearly, or
>>> pointedly. Somehow "generally unsafe" doesn't seem strong
>>> enough to me.
>>
>> Sure. Whatever. I don't think a lot of programmers learn C from the
>> Standard or the Rationale anyway.
>
> Unfortunately, some of them don't listen to anything not nailed
> down though. The typical freshly-minted know-it-all response is
> "Who are you to tell me not to use it? The ISO C standards body
> put it in there for a reason. If it was bad, it wouldn't be in
> an international standard. duh."

Well, *then* you can point them to the Rationale and explain what it means
by "generally unsafe". You could even try to explain why it was
standardized even though it was known to be unsafe, and why a lot of people
disagree with that decision. A good teacher can take advantage of this kind
of stuff.

Anyway, think of all the unsafe things they'll have to learn not to do
before they become competent programmers. Pretty much all of them are more
difficult to avoid than gets().

>> It should be the job of teachers and handbooks to make sure that
>> beginners realize that it's not a good idea to use gets(), or to
>> divide by zero, or to cause integer overflow.
>
> You'd think so.

I haven't said anything about how well I think they're doing their job. I'm
sure there are a lot of bad teachers and bad handbooks around. But I doubt
banning gets() would make it significantly easier for their victims to
become competent programmers.


Dennis Ritchie

unread,
Aug 31, 2005, 11:00:03 PM8/31/05
to
About my attitude to gets(), this was dredged from google.
Conversations repeat; there are about 78 things in
this "The fate of gets" thread.

> Dennis Ritchie Nov 9 1999, 8:00

> Newsgroups: comp.std.c
> From: Dennis Ritchie <d...@bell-labs.com> Date: 1999/11/09
> Subject: Re: The fate of gets

> Clive D.W. Feather wrote:
> > ..... If most implementers will ship gets() anyway,
> > there's little practical effect to eliminating it from the Standard.

> On the other hand, we removed it from our library about a week
> after the Internet worm. Of course, some couldn't afford
> to do that.

Dennis

Richard Bos

unread,
Sep 1, 2005, 3:00:18 AM9/1/05
to
Randy Howard <randy...@FOOverizonBAR.net> wrote:

> Wojtek Lerch wrote


>
> > "Randy Howard" <randy...@FOOverizonBAR.net> wrote in message

> >> Wojtek Lerch wrote


> >> Indeed. It doesn't exactly make the point very clearly, or
> >> pointedly. Somehow "generally unsafe" doesn't seem strong
> >> enough to me.
> >
> > Sure. Whatever. I don't think a lot of programmers learn C from the
> > Standard or the Rationale anyway.
>
> Unfortunately, some of them don't listen to anything not nailed
> down though. The typical freshly-minted know-it-all response is
> "Who are you to tell me not to use it? The ISO C standards body
> put it in there for a reason. If it was bad, it wouldn't be in
> an international standard. duh."

I think you over-estimate the average VB-level programmer. s/ISO C
standards body/Great and Sacred Microsoft/ and s/an international
standard/MSVC++###/ would be more realistic.

Richard

Douglas A. Gwyn

unread,
Sep 1, 2005, 11:00:48 AM9/1/05
to
Keith Thompson wrote:
> Which says, in 7.19.7.7:
> Because gets does not check for buffer overrun, it is generally
> unsafe to use when its input is not under the programmer's
> control. This has caused some to question whether it should
> appear in the Standard at all. The Committee decided that gets
> was useful and convenient in those special circumstances when the
> programmer does have adequate control over the input, and as
> longstanding existing practice, it needed a standard
> specification. In general, however, the preferred function is
> fgets (see 7.19.7.2).
> Personally, I think the Committee blew it on this one. I've never
> heard of a real-world case where a program's input is under
> sufficiently tight control that gets() can be used safely.

You must have a lack of imagination -- there are a great many
cases where one is coding for a small app where the programmer
himself has complete control over all data that the app will
encounter. Note that that is not at all the same environment
as arbitrary or interactive input, where of course lack of
proper validation of input would be a BUG.

> ... As far as I know, the


> "longstanding existing practice" cited in the Rationale is the
> *unsafe* use of gets(), not the hypothetical safe use.

No, it's simply the existing use of gets as part of the stdio
library, regardless of judgment about safety. As such, it was
part of the package for which the C standard was expected to
provide specification.

> I just found 13 calls to gets() in the source code for a large
> software package implemented in C (which I prefer not to identify).
> They were all in small test programs, not in production code, and they
> all used buffers large enough that an interactive user is not likely
> to overflow it -- but that's no excuse for writing unsafe code.

If in fact the test programs cannot overflow their buffers
with any of the test data provided, they are perforce safe
enough. In fact that's exactly the kind of situation where
gets has traditionally been used and thus needs to exist,
with a portable interface spec, in order to minimize porting
expense.

Keith Thompson

unread,
Sep 1, 2005, 2:38:42 PM9/1/05
to
"Douglas A. Gwyn" <DAG...@null.net> writes:
> Keith Thompson wrote:
>> Which says, in 7.19.7.7:
>> Because gets does not check for buffer overrun, it is generally
>> unsafe to use when its input is not under the programmer's
>> control. This has caused some to question whether it should
>> appear in the Standard at all. The Committee decided that gets
>> was useful and convenient in those special circumstances when the
>> programmer does have adequate control over the input, and as
>> longstanding existing practice, it needed a standard
>> specification. In general, however, the preferred function is
>> fgets (see 7.19.7.2).
>> Personally, I think the Committee blew it on this one. I've never
>> heard of a real-world case where a program's input is under
>> sufficiently tight control that gets() can be used safely.
>
> You must have a lack of imagination -- there are a great many
> cases where one is coding for a small app where the programmer
> himself has complete control over all data that the app will
> encounter. Note that that is not at all the same environment
> as arbitrary or interactive input, where of course lack of
> proper validation of input would be a BUG.

This has nothing to do with my imagination. I can imagine obscure
cases where gets() might be used safely. I said that I've never heard
of a real-world case.

>> ... As far as I know, the
>> "longstanding existing practice" cited in the Rationale is the
>> *unsafe* use of gets(), not the hypothetical safe use.
>
> No, it's simply the existing use of gets as part of the stdio
> library, regardless of judgment about safety. As such, it was
> part of the package for which the C standard was expected to
> provide specification.

The majority of the existing use of gets() is unsafe.

>> I just found 13 calls to gets() in the source code for a large
>> software package implemented in C (which I prefer not to identify).
>> They were all in small test programs, not in production code, and they
>> all used buffers large enough that an interactive user is not likely
>> to overflow it -- but that's no excuse for writing unsafe code.
>
> If in fact the test programs cannot overflow their buffers
> with any of the test data provided, they are perforce safe
> enough. In fact that's exactly the kind of situation where
> gets has traditionally been used and thus needs to exist,
> with a portable interface spec, in order to minimize porting
> expense.

The test data provided is whatever the user types at the keyboard.
The programs in question used huge buffers that are *probably* big
enough to hold whatever the user types -- as opposed to reasonably
sized buffers that *cannot* overflow if the programs used fgets()
instead.

Another point: gets() cannot be used safely in portable code. The
safe use of gets() requires strict control over where a program's
stdin comes from. There's no way to do that in standard C. If I
wanted to control a program's input, I'd be more likely to specify the
name of an input file, which means gets() can't be used anyway.

Perhaps gets() should be relegated to some system-specific library.

The Committee was willing to remove implicit int from the language.
There was widespread existing use of this feature, much of it
perfectly safe. I happen to agree with that decision, but given the
willingness to make that kind of change, I see no excuse for leaving
gets() in the standard.

Walter Roberson

unread,
Sep 1, 2005, 2:49:18 PM9/1/05
to
In article <lnek88m...@nuthaus.mib.org>,

Keith Thompson <ks...@mib.org> wrote:
>Another point: gets() cannot be used safely in portable code. The
>safe use of gets() requires strict control over where a program's
>stdin comes from. There's no way to do that in standard C. If I
>wanted to control a program's input, I'd be more likely to specify the
>name of an input file, which means gets() can't be used anyway.

fseek() to the beginning of stdin . If that fails then your input
is not a file so use some alternative method or take some failure
mode. If the fseek() succeeds, then you know that you can
examine the input, find the longest line, malloc() a buffer big
enough to hold that, then fseek() back and gets() using that buffer.

Sure, it's not pretty, but it's portable ;-)
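
Roughly, and only for illustration (read_first_line is an invented
name, it assumes stdin really is a seekable file, and -- as pointed out
downthread -- it is still racy if the file changes between the two
passes):

    #include <stdio.h>
    #include <stdlib.h>

    char *read_first_line(void)
    {
        size_t longest = 0, current = 0;
        int c;
        char *buf;

        if (fseek(stdin, 0L, SEEK_SET) != 0)
            return NULL;                  /* stdin is not a seekable file */

        while ((c = getchar()) != EOF) {  /* pass 1: measure the longest line */
            if (c == '\n') {
                if (current > longest) longest = current;
                current = 0;
            } else {
                current++;
            }
        }
        if (current > longest) longest = current;

        buf = malloc(longest + 1);        /* big enough for any line in the file */
        if (buf == NULL)
            return NULL;

        if (fseek(stdin, 0L, SEEK_SET) != 0) {  /* pass 2: rewind and read */
            free(buf);
            return NULL;
        }
        return gets(buf);                 /* "safe" only under these assumptions */
    }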
--
Oh, to be a Blobel!

Randy Howard

unread,
Sep 1, 2005, 3:36:42 PM9/1/05
to
Dennis Ritchie wrote
(in article <df5q4m$b...@netnews.net.lucent.com>):

If Dennis Ritchie thinks it's safe to remove it, who are the ISO
C standard body to think they should leave it in?

:-)

Douglas A. Gwyn

unread,
Sep 1, 2005, 3:27:36 PM9/1/05
to
Keith Thompson wrote:
> The majority of the existing use of gets() is unsafe.

The majority of existing programs are incorrect. That doesn't
mean that there is no point in having standards for the elements
of the programming language/environment.

> The test data provided is whatever the user types at the keyboard.
> The programs in question used huge buffers that are *probably* big
> enough to hold whatever the user types -- as opposed to reasonable
> sized buffers that *cannot* overflow if the programs used fgets()
> instead.

Presumably the tester understands the limitations and does not
obtain any advantage by violating them.

The theory that mere replacement of gets by fgets (with the
addition of newline trimming) will magically make a program
"safe" is quite flawed. There are around a dozen details
that need to be taken care of for truly safe and effective
input validation, and if the programmer is using gets in
such a context, he is most unlikely to have dealt with any
of these matters. Putting it another way: gets is not a
problem for the competent programmer, and lack of gets
wouldn't appreciably help the incompetent programmer.

Randy Howard

unread,
Sep 1, 2005, 3:58:31 PM9/1/05
to
Wojtek Lerch wrote
(in article <etGdnabdD50...@rogers.com>):

> "Randy Howard" <randy...@FOOverizonBAR.net> wrote in message
> news:0001HW.BF3BC1A6...@news.verizon.net...
>> Wojtek Lerch wrote
>> (in article <2didnRd47bq...@rogers.com>):
>>> "Randy Howard" <randy...@FOOverizonBAR.net> wrote in message
>>> news:0001HW.BF3B7D40...@news.verizon.net...
>>>> Indeed. It doesn't exactly make the point very clearly, or
>>>> pointedly. Somehow "generally unsafe" doesn't seem strong
>>>> enough to me.
>>>
>>> Sure. Whatever. I don't think a lot of programmers learn C from the
>>> Standard or the Rationale anyway.
>>
>> Unfortunately, some of them don't listen to anything not nailed
>> down though. The typical freshly-minted know-it-all response is
>> "Who are you to tell me not to use it? The ISO C standards body
>> put it in there for a reason. If it was bad, it wouldn't be in
>> an international standard. duh."
>
> Well, *then* you can point them to the Rationale and explain what it means
> by "generally unsafe".

That's true, but it would be better, and not harm anyone, if
they took out gets() completely from the main body, and moved it
back to section J with asm() and such, so that if some vendor
feels like they absolutely /must/ leave it in, they can do so,
but not have it a requirement that conforming compilers continue
to ship such garbage.

That would be a far more convincing story to the newbies too.
"They took it out of C0x because it was too dangerous. Even
though we don't have access to a C0x compiler yet, it still
makes sense to be as cautious as the standard, does it not?"

> You could even try to explain why it was
> standardized even though it was known to be unsafe, and why a lot of people
> disagree with that decision.

I understand why it was standardized a couple decades ago. What
I do not understand is why it is still in the standard. I have
heard the arguments for leaving it in, and they have not been
credible to me.

> A good teacher can take advantage of this kind of stuff.

That's true. It would still be better for it not to be an issue
at all.

> Anyway, think of all the unsafe things they'll have to learn not to do
> before they become competent programmers. Pretty much all of them are more
> difficult to avoid than gets().

Also true, and all the better reason not to waste time on items
that could be avoided without any time spent on them at all,
leaving time to focus on what really is hard to accomplish.

> I haven't said anything about how well I think they're doing their job. I'm
> sure there are a lot of bad teachers and bad handbooks around. But I doubt
> banning gets() would make it significantly easier for their victims to
> become competent programmers.

Even if it didn't make it any easier (which I cannot judge either
way, with no data on it), it would not be a hardship for
conforming compilers produced in this century to not provide
gets(). It's not just the students at issue here, the many,
many bugs extant due to it are more important by far, with or
without new programmers to worry about.

David Wagner

unread,
Sep 1, 2005, 8:18:47 PM9/1/05
to
Walter Roberson wrote:
>fseek() to the beginning of stdin . If that fails then your input
>is not a file so use some alternative method or take some failure
>mode. If the fseek() succeeds, then you know that you can
>examine the input, find the longest line, malloc() a buffer big
>enough to hold that, then fseek() back and gets() using that buffer.
>
>Sure, it's not pretty, but it's portable ;-)

It's also insecure. Just think about what happens if the file size
changes in between examining the input and calling gets() -- boom,
you lose. In the security world, this is known as a time-of-check to
time-of-use (TOCTTOU) bug.

gets() is a loaded gun, helpfully pre-aimed for you at your own foot.
Maybe you can do some fancy dancing and avoid getting shot, but that
doesn't make it a good idea.

Antoine Leca

unread,
Sep 2, 2005, 11:15:34 AM9/2/05
to
En <news:0001HW.BF3CC279...@news.verizon.net>,
Randy Howard va escriure:

>>
>>> Dennis Ritchie Nov 9 1999, 8:00
>>
>>> Subject: Re: The fate of gets
>>
>>> On the other hand, we removed it from our library about a week
>>> after the Internet worm. Of course, some couldn't afford
>>> to do that.
>
> If Dennis Ritchie thinks it's safe to remove it, who are the ISO
> C standard body to think they should leave it in?

Do not misread: Mr Ritchie did not say he thought it was safe to remove it;
he noted:

- that they (Bell Labs) removed it [on 1988-11-09]

- that they removed it because it was involved in a critical failure of the
system [i.e., it was not "safe to remove it", rather the system was safe*r*
without it]

- that some others implementers couldn't do the same

The third observation is a paraphrase of Clive Feather's point: "little
practical effect." <news:ALHmaBL$zwJ4...@romana.davros.org>


Antoine
PS: This does not mean I endorse not having done it. I believe the real
reason was the lack of an opportunistic proposal to nuke it.

David R Tribble

unread,
Sep 2, 2005, 12:04:31 PM9/2/05
to
Keith Thompson wrote:
>> Another point: gets() cannot be used safely in portable code. The
>> safe use of gets() requires strict control over where a program's
>> stdin comes from. There's no way to do that in standard C. If I
>> wanted to control a program's input, I'd be more likely to specify the
>> name of an input file, which means gets() can't be used anyway.
>
Walter Roberson wrote:
> fseek() to the beginning of stdin . If that fails then your input
> is not a file so use some alternative method or take some failure
> mode. If the fseek() succeeds, then you know that you can
> examine the input, find the longest line, malloc() a buffer big
> enough to hold that, then fseek() back and gets() using that buffer.
>
> Sure, it's not pretty, but it's portable ;-)

The reason I don't write code like this is that it doesn't work for
piped input. Since a fair number of my programs are designed along
the Unix philosophy of filters (i.e., read input from anywhere,
including redirected output from another program, and write output
that can be conveniently redirected to other programs), I don't
bother coding two forms of input if I can help it.

Which means that I use fgets() instead of gets(), and simply assume
a reasonably large maximum line size. I imagine I'm not alone in
using this approach, which is also safe and portable.
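
A minimal sketch of that filter style, with an assumed "reasonably
large" buffer (the 4096 is an arbitrary choice):

    #include <stdio.h>

    int main(void)
    {
        char line[4096];

        /* Read a line (or a piece of an over-long line) at a time;
           fgets() can never overflow the buffer. */
        while (fgets(line, sizeof line, stdin) != NULL) {
            /* ... transform the line here ... */
            fputs(line, stdout);
        }
        return 0;
    }

Lines longer than the buffer simply arrive in pieces, which for most
filters is an acceptable trade-off.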

-drt

David R Tribble

unread,
Sep 2, 2005, 12:09:07 PM9/2/05
to
Douglas A. Gwyn wrote:
> The theory that mere replacement of gets by fgets (with the
> addition of newline trimming) will magically make a program
> "safe" is quite flawed. There are around a dozen details
> that need to be taken care of for truly safe and effective
> input validation, and if the programmer is using gets in
> such a context, he is most unlikely to have dealt with any
> of these matters.
>
> Putting it another way: gets is not a
> problem for the competent programmer, and lack of gets
> wouldn't appreciably help the incompetent programmer.

More to the point, eliminating gets() from ISO C will not affect
incompetent programmers one whit, because those programmers don't
read the standard, nor do they abide by anything it recommends.
You can't legislate good programming.

Also, eliminating *anything* from std C will not force compiler
and library vendors to remove it from their implementations.
Their customers include a large number of incompetent programmers,
who will insist that good old C functions be available, the
consequences be damned.

Douglas A. Gwyn

unread,
Sep 2, 2005, 12:46:49 PM9/2/05
to
David R Tribble wrote:
> Also, eliminating *anything* from std C will not force compiler
> and library vendors to remove it from their implementations.
> Their customers include a large number of incompetent programmers,
> who will insist that good old C functions be available, the
> consequences be damned.

Also competent programmers who would be justly annoyed
when their small test programs would no longer build.

To repeat a point I've made before: The idea that
incorrect programming can be corrected by small changes
in library function interfaces is so far wrong as to be
outright dangerous.

Douglas A. Gwyn

unread,
Sep 2, 2005, 12:57:46 PM9/2/05
to
Randy Howard wrote:
> If Dennis Ritchie thinks it's safe to remove it, who are the ISO
> C standard body to think they should leave it in?

As he said, some can't afford to do that. So long as such
a venerable function is still being provided by vendors to
meet customer requirements, it is useful to have a published
interface spec for it. Just because there is a spec, or there is
a function in some library, doesn't mean you have to use it
if it doesn't meet your requirements.

Some seem to have a misconception about the functions of
standardization. It is literally *impossible* to standardize
correct programming, and in all its ramifications C has
always left that concern up to the programmer, not the
compiler. Many programmers have found it useful to develop
or obtain additional tools to help them produce better
software, "lint" being one of the earliest. You might
consider using "grep '[^f]gets'" as a tool that meets your
particular concern.

The more you go on about program correctness being the
responsibility of those tasked with publishing specs for
legacy functions, the more you divert programmer attention
from where the real correctness and safety issues lie.

Douglas A. Gwyn

unread,
Sep 2, 2005, 1:02:08 PM9/2/05
to
Randy Howard wrote:
> Also true, and all the better reason not to waste time on items
> that could be avoided without any time spent on them at all,
> leaving time to focus on what really is hard to accomplish.

If anybody teaches programming and fails to mention the
danger of overrunning a buffer, he is contributing to
the very problem that you decry. gets is useful in
simple examples to help students understand the issue,
and indeed has the kind of interface that a naive
programmer is likely to invent for his own functions
unless he has learned this particular lesson.

Keith Thompson

unread,
Sep 2, 2005, 2:21:24 PM9/2/05
to
"Douglas A. Gwyn" <DAG...@null.net> writes:
> David R Tribble wrote:
>> Also, eliminating *anything* from std C will not force compiler
>> and library vendors to remove it from their implementations.
>> Their customers include a large number of incompetent programmers,
>> who will insist that good old C functions be available, the
>> consequences be damned.
>
> Also competent programmers who would be justly annoyed
> when their small test programs would no longer build.

What about all the small test programs that used implicit int? If
that kind of change wasn't acceptable, why the great concern about
breaking programs that use gets()?

> To repeat a point I've made before: The idea that
> incorrect programming can be corrected by small changes
> in library function interfaces is so far wrong as to be
> outright dangerous.

I don't believe anybody has suggested that removing gets() would solve
a huge number of problems. It would solve only one.

The language would be better without gets() than with it.

Keith Thompson

unread,
Sep 2, 2005, 3:02:06 PM9/2/05
to
Keith Thompson <ks...@mib.org> writes:
[...]

> What about all the small test programs that used implicit int? If
> that kind of change wasn't acceptable, why the great concern about
> breaking programs that use gets()?

Whoops, I meant "If that kind of change *was* acceptable".

webs...@gmail.com

unread,
Sep 4, 2005, 12:26:16 AM9/4/05
to
Wojtek Lerch wrote:

> "Randy Howard" <randy...@FOOverizonBAR.net> wrote:
> > Wojtek Lerch wrote
> > (in article <e3nRe.7584$2F1.4...@news20.bellglobal.com>):
> >
> >> Randy Howard wrote:
> >>> Okay, where can I obtain the Rationale Document that warns
> >>> programmers not to use gets()?
> >>
> >> http://www.open-std.org/jtc1/sc22/wg14/www/docs/C99RationaleV5.10.pdf
> >
> > Indeed. It doesn't exactly make the point very clearly, or
> > pointedly. Somehow "generally unsafe" doesn't seem strong
> > enough to me.

Suggesting that there might be some scenario where it can be used
safely actually makes it sound worse to me. They are basically
demanding platform-specific support to make this function safe -- and
of course we *know* that you also require environmental and
application-specific support in OSes that support stdin redirection. But of
course they specify nothing; they just refer to it as some nebulous
possibility worth saving the function for.

> Sure. Whatever. I don't think a lot of programmers learn C from the
> Standard or the Rationale anyway. It should be the job of teachers and
> handbooks to make sure that beginners realize that it's not a good idea to
> use gets(), or to divide by zero, or to cause integer overflow.

Uh ... excuse me, but dividing by zero has well defined meaning in IEEE
754, and there's nothing intrinsically wrong with it (for most
numerators you'll get inf or -inf, or otherwise a NaN). Integer
overflow is also extremely well defined, and actually quite useful on
2s complement machines (you can do a range check with a subtract and
unsigned compare with one branch, rather than two branches.)
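
For what it's worth, that range-check trick can be written with
well-defined unsigned arithmetic, so it doesn't even need to lean on
implementation-specific signed overflow; in_range() is just an
illustrative name, and it assumes lo <= hi:

    /* Test lo <= x <= hi with a single comparison: conversion to
       unsigned is defined modulo UINT_MAX+1, so a value below lo
       wraps around to something larger than hi - lo. */
    static int in_range(int x, int lo, int hi)
    {
        return (unsigned)x - (unsigned)lo <= (unsigned)hi - (unsigned)lo;
    }

e.g. in_range(c, '0', '9') compiles down to one subtract and one
unsigned compare on typical machines.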

> On the other hand, I don't think it would be unreasonable for the Standard
> to officially declare gets() as obsolescent in the "Future library
> directions" chapter.

And what do you think the chances of that are? The committee is
clearly way beyond the point of incompetence on this matter. If they
did that in 1989, then we could understand that it was not removed
until now. But they actually continue to endorse it, and will continue
to do so in the future.

It's one thing to make a mistake and recognize it (a la Ritchie). It's
quite another to be shown what a mistake it is and continue to prefer
the mistake to the most obvious fix.

--
Paul Hsieh
http://www.pobox.com/~qed/
http://bstring.sf.net/

webs...@gmail.com

unread,
Sep 4, 2005, 1:19:45 AM9/4/05
to
kuy...@wizard.net wrote:
> webs...@gmail.com wrote:
> > Randy Howard wrote:
> > > webs...@gmail.com wrote:
> > > > Randy Howard wrote:
> ...
> > "False dichotomy". Look it up. I never mentioned high or low level
> > language, and don't consider it relevant to the discussion. It's a
> > false dichotomy because you immediately dismiss the possibility of a
> > safe low-level language.
>
> No, it's not an immediate dismissal.

It is, and you simply continue to propagate it.

> [...] It's also not a dichotomy: low-level languages are inherently
> unsafe, [...]

No. They may contain unsafe ways of using them. This says nothing
about the possibility of safe paths of usage.

> but high-level languages are not inherently safe.

Empty and irrelevant (and not really true; at least not relatively.)

> If it's low-level, by definition it gives you
> access to unprotected access to dangerous features of the machine
> you're writing for.

So how does gets() or strtok() fit in this? Neither provides any low
level functionality that isn't available in better ways through
alternate means that are clearly safer (without being slower.)

> [...] If it protected your access to those features, that
> protection (regardless of what form it takes) would make it a
> high-level language.

So you are saying C becomes a high level language as soon as you start
using something like Bstrlib (or Vstr, for example)? Are you saying
that Microsoft's init segment protection, or their built-in debugger
features, or heap checking makes them a high level language?

> ...
> > C gives you access to a sequence of opcodes in ways that other
> > languages do not? What exactly are you saying here? I don't
> > understand.
>
> Yes, you can access things more directly in C than in other higher
> level languages. That's what makes them higher-level languages.

Notice that doesn't coincide with what you've said above. But it does
coincide with the false dichotomy. The low-levelledness in and of itself
is not what makes it unsafe -- this just changes the severity of the
failures.

> [...] One of
> the most dangerous features of C is that it has pointers, which is a
> concept only one layer of abstraction removed from the concept of
> machine addresses. Most of the "safer" high level languages provide
> little or no access to machine addresses; that's part of what makes
> them safer.

Ada has pointers.

> > I am dodging the false dichotomy. Yes. You are suggesting that making
> > C safer is equivalent to removing buffer overflows from assembly. The
> > two have nothing to do with each other.
>
> You can't remove buffer overflows from C without moving it at least a
> little bit farther away from assembly, for precisely the same reason
> why you can't remove buffer overflows from assembly without making it
> less of an assembly language.

Having some unsafe paths of usage is not what makes a language unsafe.
People don't think of Java as an unsafe language because you can write
race conditions in it (you technically cannot do that in pure ISO C.)
What matters is what is exposed for the most common usage in the
language.

> > As I recall this was just a point about low level languages adopting
> > safer interfaces. Tough in this case, the performance improvements
> > probably drives their interest in it.
> >
> > > >> [...] If you want to argue that too many people
> > > >> write code in C when their skill level is more appropriate to a
> > > >> language with more seatbelts, I won't disagree. The trick is
> > > >> deciding who gets to make the rules.
> > > >
> > > > But I'm not arguing that either. I am saying C is to a large degree
> > > > just capriciously and unnecessarily unsafe (and slow, and powerless,
> > > > and unportable etc., etc).
> > >
> > > Slow? Yes, I keep forgetting how much better performance one
> > > achieves when using Ruby or Python. Yeah, right.
> >
> > I never put those languages up as alternatives for speed. The false
> > dichotomy yet again.
>
> A more useful response would have been to identify these
> safer-and-specdier-than-C languages that you're referring to.

Why? Because you assert that C represents the highest-performing
language in existence?

It's well known that Fortran beats C for numerical applications. Also,
if you take into account that assembly doesn't specify intrinsically
unsafe usages of buffers (like including a gets() function) you could
consider assembly safer and definitely faster than C.

Python uses GMP (which has lots of assembly language in it, that
basically gives it a 4x performance improvement over what is even
theoretically possible with the C language standard) to do its big
integer math. That means for certain big integer operations (think
crypto), Python just runs faster than what can be done in pure C.

But that's all beside the point. I modify my own C usage to beat its
performance by many times on a regular basis (dropping to assembly,
making 2s complement assumptions, unsafe casts between integer types
and pointers etc), and obviously use safe libraries (for strings,
vectors, hashes, an enhanced heap, and so on) that are well beyond the
safety features of C. In all these cases some simple modifications to
the C standard and C library would make my modifications basically
irrelevant.

> > > Unportable? You have got to be kidding. I must be
> > > hallucinating when I see my C source compiled and executing on
> > > Windows, Linux, NetWare, OS X, Solaris, *bsd, and a host of
> > > other UNIX-like platforms, on x86, x86-64, PPC, Sparc, etc.
> >
> > Right. Because you write every piece of C code that's ever been
> > written right?
>
> His comment says nothing to suggest that he's ported any specific
> number of programs to those platforms. It could be a single program, it
> could be a million. Why are you interpreting his claim suggesting that
> ported many different programs to those platforms?

God, what is wrong with you people? He makes an utterly unfounded
statement about portability that's not worth arguing about. I make the
obvious stab to indicate that that argument should be nipped in the
bud, but you just latch onto it anyways.

Making code portable in C requires a lot of discipline, and in truth a
lot of testing (especially on numerics; it's just a lot harder than you
might think). It's discipline that in the real world basically nobody
has. Randy is asserting that C is portable because *HE* writes C code
that is portable. And that's ridiculous, and needs little comment on
it.

Keith Thompson

unread,
Sep 4, 2005, 3:18:29 AM9/4/05
to
webs...@gmail.com writes:
> Wojtek Lerch wrote:
[...]

>> Sure. Whatever. I don't think a lot of programmers learn C from the
>> Standard or the Rationale anyway. It should be the job of teachers and
>> handbooks to make sure that beginners realize that it's not a good idea to
>> use gets(), or to divide by zero, or to cause integer overflow.
>
> Uh ... excuse me, but dividing by zero has well defined meaning in IEEE
> 754, and there's nothing intrinsically wrong with it (for most
> numerators you'll get inf or -inf, or otherwise a NaN).

But C does not, and cannot, require IEEE 754. Machines that don't
implement IEEE 754 are becoming rarer, but they still exist, and
C should continue to support them.

C99 does have optional support for IEEE 754 (Annex F) -- but I
wouldn't say that dividing by zero is a good idea.

> Integer
> overflow is also extremely well defined, and actually quite useful on
> 2s complement machines (you can do a range check with a subtract and
> unsigned compare with one branch, rather than two branches.)

C does not require two's complement. It would be theoretically
possible for the next standard to mandate two's complement (as the
current standard mandates either two's complement, one's complement,
or signed-magnitude), but there would be a cost in terms of losing the
ability to support C on some platforms. Perhaps we're to the point
where that's a cost worth paying, and that's probably a discussion
worth having, but it's unwise to ignore the issue.
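
(For reference, a minimal sketch of the one-branch range check alluded
to above; written with explicit conversions to unsigned it relies only
on unsigned wraparound, which the standard does define, rather than on
signed overflow.  The function name is just illustrative:

    /* true iff lo <= x <= hi, using a single compare-and-branch */
    int in_range(int x, int lo, int hi)
    {
        return (unsigned)x - (unsigned)lo <= (unsigned)hi - (unsigned)lo;
    }
)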

Keith Thompson

unread,
Sep 4, 2005, 3:50:17 AM9/4/05
to
webs...@gmail.com writes:
> kuy...@wizard.net wrote:
[...]

>> If it's low-level, by definition it gives you
>> access to unprotected access to dangerous features of the machine
>> you're writing for.
>
> So how does gets() or strtok() fit in this? Neither provides any low
> level functionality that isn't available in better ways through
> alternate means that are clearly safer (without being slower.)

I wouldn't put strtok() in the same category as gets(). strtok() is
ugly, but if it operates on a local copy of the string you want to
tokenize *and* if you're careful about not using it on two strings
simultaneously, it can be used safely. If I were designing a new
library I wouldn't include strtok(), but it's not dangerous enough to
require dropping it from the standard.
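
(A minimal sketch of that kind of careful use, assuming the caller's
string must be preserved and that no other tokenization is in progress;
the function name is just illustrative:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Tokenize a private copy so the caller's string is untouched, and
       finish this pass before strtok() is used on anything else. */
    void print_words(const char *s)
    {
        char *tok, *copy = malloc(strlen(s) + 1);
        if (copy == NULL) return;
        strcpy(copy, s);
        for (tok = strtok(copy, " \t"); tok != NULL; tok = strtok(NULL, " \t"))
            puts(tok);
        free(copy);
    }
)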

[...]

>> [...] One of
>> the most dangerous features of C is that it has pointers, which is a
>> concept only one layer of abstraction removed from the concept of
>> machine addresses. Most of the "safer" high level languages provide
>> little or no access to machine addresses; that's part of what makes
>> them safer.
>
> Ada has pointers.

Ada has pointers (it calls them access types), but it doesn't have
pointer arithmetic, at least not in the core language -- and you can
do a lot more in Ada without explicit use of pointers than you can in
C. If one were to design a safer version of C (trying desperately to
keep this topical), one might want to consider providing built-in
features for some of the things that C uses pointers for, such as
passing arguments by reference and array indexing.

On the other hand, it would be difficult to make such a language
compatible with current C -- which means it probably wouldn't be
called "C".

webs...@gmail.com

unread,
Sep 4, 2005, 6:35:49 AM9/4/05
to
Randy Howard wrote:
> webs...@gmail.com wrote:
> > >>> Why does being a low-level language mean you have to present a programming
> >>> interface surrounded by landmines?
> >>
> >> If you have access to any sequence of opcodes available on the
> >> target processor, how can it not be?

> >
> > C gives you access to a sequence of opcodes in ways that other
> > languages do not? What exactly are you saying here? I don't
> > understand.
>
> asm( character-string-literal ); springs to mind. I do not
> believe all languages have such abilities.

Ok, but this is an escape mode to a different programming environment.
Nobody expects to garner a lot of safety when you do things like that.
People who use that are clearly walking *into* the minefield. That's
not what I am talking about. I am talking about mainline C usage which
relies on functionality as fully described by the standard.

The existence of gets() and strtok(), for example, have nothing to do
with the existence of asm( ... ); (or __asm { ... } as it appears in my
compilers.)

> [...] Having that kind of
> capability alone, nevermind pointers and all of the subtle and
> no so subtle tricks you can do with them in C makes it capable
> of low-level work, like OS internals. There are lots of
> landmines there, as you are probably already aware.

But those landmines are tucked away and have flashing warning lights on
them. There are unsafe usages that you clearly *know* are unsafe,
because that is obviously what they are there to do for you.

> >>> Exposing a sufficiently low level
> >>> interface may require that you expose some danergous semantics, but why
> >>> expose them up front right in the most natural paths of usage?
> >>
> >> Do you feel that 'gets()' is part of the most natural path in C?
> >
> > Yes of course! When people learn a new language they learn what it
> > *CAN* do before they learn what it should not do. It means anyone that
> > learns C first learns to use gets() before they learn not to use
> > gets().
>
> Strange, it has been years since I have picked up a book on C
> that uses gets(), even in the first few chapters. I have seen a
> few that mention it, snidely, and warn against it though.
>
> The man page for gets() on this system has the following to say:
> SECURITY CONSIDERATIONS
> The gets() function cannot be used securely. Because of its
> lack of bounds checking, and the inability for the calling
> program to reliably determine the length of the next incoming
> line, the use of this function enables malicious users to
> arbitrarily change a running program's functionality through a
> buffer overflow attack. It is strongly suggested that the
> fgets() function be used in all cases.
>
> [end of man page]
>
> I don't know about you, but I suspect the phrase "cannot be used
> securely" might slow quite a few people down.

It will slow nobody down who uses WATCOM C/C++:

"It is recommended that fgets be used instead of gets because data
beyond the array buf will be destroyed if a new-line character is not
read from the input stream stdin before the end of the array buf is
reached."

And it will confuse MSVC users:

"Security Note Because there is no way to limit the number of
characters read by gets, untrusted input can easily cause buffer
overruns. Use fgets instead."

Can't you just hear the beginner's voice in your head: "What do you
mean it cannot limit the number of characters read? I declared my
buffer with a specific limit! Besides, my users are very trustworthy."

In 1989 this is what I wish all the documentation said:

"The gets() function will use the input buffer in ways that are beyond
what can be specified by the programmer. Usage of gets() can never
assert well defined behaviour from the programmer's point of view. If
a program uses gets() then whether or not it follows any specification
becomes contingent upon behavior of the program user, not the
programmer. Please note that program users generally are not exposed
to program declarations or any other source code while the program is
running, nor do their methods of input assist them to follow any method
for inputting data."

Now the only thing I want the document to say is:

"Usage of gets() will remove all of the programmers files."

Think about it. The only people left today that are using gets() need
their files erased.

> [...] It would be even
> better if they showed an example of proper use of fgets(), but I
> think all man pages for programming interfaces would be improved
> by doing that.


>
> > You are suggesting that making C safer is equivalent to removing
> > buffer overflows from assembly. The two have nothing to do with each
> > other.
>

> Not equivalent, but difficult.

That they are *as* difficult, you mean? Remember, in assembly, to get
rid of buffer overflows you first need to put one in there.

> [...] Both languages are very powerful
> in terms of what they will 'allow' the programmer to attempt.
> There is little or no hand-holding. If you step off the edge,
> you get your head chopped off. It's not like you can make some
> simple little tweak and take that property away, without
> removing a lot of the capabilities overall. Yes, taking gets()
> completely out of libc (and its equivalents) would be a good
> start, but it wouldn't put a dent in the ability of programmers
> to make many more mistakes, also of a serious nature with the
> language.
>
> Just as I can appreciate the differences between a squirt gun
> and a Robar SR-90, I can appreciate the differences between
> Python and C, or any other 'safer' language and assembler.

Then you are appreciating the wrong thing. Python, Java, Perl, Lua,
etc., make programming *easier*. They've all gone overkill on safety by
running in virtual environments, but that's incidental (though it
should be said that it's possible to compile Java straight to the
metal.) Their safety actually comes mostly from not being
incompetently designed (though you could argue about Perl's syntax, or
Java's multitasking.)

Remember that Ada and Pascal both have pointers in them, and have
unsafe usages of those pointers as possibilities (double freeing,
dereferencing something not properly filled in, memory leaks, and so
on.) Do you thus think of them as low level languages as well? If so,
or if not, what do you think of them in terms of safety? (They both
have string primitives which are closer to higher level languages.)

But this is all just part of your false dichotomy, which you simply
will not shake. Is it truly impossible for you to consider the
possibility of a language equivalent to C in low-levelness and
functionality that is generally a lot safer to use?

> >> I would have been shocked if you had not figured out a way to
> >> bring your package up. :-)
> >
> > Oh by the way there is a new version! It incoroporates a new secure
> > non data-leaking input function!
>
> You mean it wasn't secure from day one? tsk, tsk. That C stuff
> sure is tricky. :-)

It was not a bug. Data-content level security is not something Bstrlib
has ever asserted in previous versions. It recently occurred to me
that that was really the only missing feature to make Bstrlib suitable
for security-based applications (for secret data, hash/encryption
buffers, passwords and so on, I mean.) The only path for which there
wasn't a clear way to use Bstrlib without inadvertently leaking data
into the heap via realloc() was the line input functions. So I added a
secure line input, and the picture is complete.
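
(The realloc() issue being referred to, roughly: realloc() may move the
block and leave a stale copy of the old contents behind in freed
memory.  A hypothetical sketch of the general idea -- not Bstrlib's
actual code -- is to grow secret-holding buffers by hand and scrub the
old block first:

    #include <stdlib.h>
    #include <string.h>

    void *grow_wiping_old(void *old, size_t oldlen, size_t newlen)
    {
        void *p = malloc(newlen);
        if (p == NULL) return NULL;
        memcpy(p, old, oldlen);
        memset(old, 0, oldlen); /* scrub secrets before freeing; a real
                                   version must also keep the compiler
                                   from eliding this "dead" store */
        free(old);
        return p;
    }
)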

> >> Which does absolutely nothing to prevent the possibility of
> >> developing insecure software in assembler. It may offer some
> >> advantages for string handling, but that closes at best only one
> >> of a thousand doors.
> >
> > You mean it closes the most obvious and well trodden thousand doors out
> > of a million doors.
>
> Both work out to .001. Hmmm.

Ignoring "well trodden" of course.

Assembly is not something you put restrictions on. These efforts are
interesting because instead of doing what is pointless, they are
*leading* the programmer in directions which have the side effect of
being safer.

Think about it. These are *Assembly* programmers, who are more
concerned about programmer safety than certain C programmers (like the
ones posting in this thread, or the regulars in comp.std.c).

> > Assembly is not a real application development language no matter how
> > you slice it.
>
> I hope the HLA people don't hear you saying that. They might
> get riotous.

Oh I'm quivering.

> > So I'm would be loath to make any point about whether or
> > not you should expect application to become safer because they are
> > writing them in assembly language using Bstrlib-like philosophies. But
> > maybe those guys would beg to differ -- who knows.
>
> Yes.


>
> > As I recall this was just a point about low level languages adopting
> > safer interfaces. Though in this case, the performance improvements
> > probably drive their interest in it.
>

> Exactly. C has performance benefits that drive interest in it
> as well.

No -- *INCORRECT PERCEPTIONS* about performance have driven the C
design. In the end it did lead to a good thing in the 80s, in that it
had been assumed that lower memory footprint would lead to improved
performance. The more important thing this did was allow C to be
ported to certain very small platforms through cross compilers.

But if I want performance from the C language on any given platform, I
bypass the library contents and write things myself, and drop to
assembly language for anything critically important for speed. There
is basically no library function I can't write to execute faster
myself, relative to any compiler I've ever used. (Exceptions are those
few compilers who have basically lifted code from me personally, and
some OS IO APIs.) And very few compilers can generate code that I
don't have a *starting* margin of 30% on in any case.

So I cannot agree that C was designed with any *real* performance in
mind.

And the case of strings is completely laughable. They are at a
completely different level of performance complexity from Bstrlib.
See:

http://bstring.sf.net/features.html#benchmarks

> [...] If there was a language that would generate faster
> code (without resorting to hand-tuned assembly), people would be
> using it for OS internals.

Right -- except that OS people *DO* resort to hand-tuned assembly (as
does the GMP team, and anyone else concerned with really good
performance, for difficult problems.) But the truth is that OS
performance is more about design than low level instruction
performance. OS performance bottlenecks are usually IO concerns.
Maybe thread load balancing. But in either case, low-level coding is
not the concern. You can do just as well in C++, for example.

> I don't think it should have been used for some things, like
> taking what should be a simple shell script and making a binary
> out of it (for copyright/copy protection purposes) like is done
> so often. Many of the tiny binaries from a C compiler on a lot
> of systems could be replaced with simple scripts with little or
> no loss of performance. But, somebody wanted to hide their
> work, or charge for it, and don't like scripting languages for
> that reason.

Java, Python and Lua compile to byte code. That's a silly argument.

> [...] People even sell tools to mangle interpreted
> languages to help with this.

(Including C, so I don't see your point.)

> [...] That is not the fault of the C
> standard body (as you originally implied, and lest we forget
> what led me down this path with you), but the use of C for
> things that it really isn't best suited. For many simple
> problems, and indeed some complicated ones, C is not the best
> answer, yet it is the one chosen anyway.

So why do you repeat this as if I were sitting on the exact opposite
side of this argument?

> >>> But I'm not arguing that either. I am saying C is to a large degree
> >>> just capriciously and unnecessarily unsafe (and slow, and powerless,
> >>> and unportable etc., etc).
> >>
> >> Slow? Yes, I keep forgetting how much better performance one
> >> achieves when using Ruby or Python. Yeah, right.
> >
> > I never put those languages up as alternatives for speed. The false
> > dichotomy yet again.
>

> Then enlighten us. I am familiar with Fortran for a narrow
> class of problems of course, and I am also familiar with its
> declining use even in those areas.

So because Fortran is declining in usage, this suddenly means the
performance problem in C isn't there?

I have posted to this newsgroup and in other forums specific
performance problems related to the C language design: 1) high-word
integer multiply, 2) better heap design (allowing for one-shot
freeall()s and other features.) And of course the whole string design
fiasco.

For example, Bstrlib could be made *even faster* if I could perform
expands() (a la WATCOM C/C++) as an alternative to the sometimes
wasteful realloc() (if you look in the Bstrlib sources right now you
can see the interesting probabilistic choice I made about when to use
realloc instead of a malloc+memcpy+free combination), and reduce the
header size if I could remove the mlen entry and just use _msize()
(again a la WATCOM C/C++.) (Functions like isInHeap() could also
substantially help with safety.)

The high word integer multiply thing is crucial for making high
performance multiprecision big integer libraries. There is simply no
way around it. Without it, your performance will suck. Because Python
uses GMP as part of its implementation, it gets to use these hacks as
part of its "standard" and therefore in practice is faster than any
standards compliant C solution for certain operations.
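
(The complaint, concretely: C only hands you the low half of a product.
For 32-bit operands you can recover the high half through a wider type,
but for the machine's widest word there is no standard equivalent, even
though the hardware computes it in one instruction.  A sketch, with an
illustrative function name:

    #include <stdint.h>

    /* High 32 bits of a 32x32 multiply, via a 64-bit intermediate. */
    uint32_t mul_hi32(uint32_t a, uint32_t b)
    {
        return (uint32_t)(((uint64_t)a * b) >> 32);
    }

    /* For 64x64 -> high 64 there is no portable one-liner; you either
       decompose into 32-bit halves or drop to assembly/intrinsics. */
)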

There are instructions in many CPUs that can be simulated in C, but
whose C-source-level "substitutes" are often not detectable as such by
the compiler. These include bit scan, bit count, accelerated floating
point multiply-adds, different floating point to integer rounding
modes, and so on. In all cases it is easy to write C code to perform
each operation, meaning it's easy to emulate them; however, it's not so
easy for a compiler to detect that the equivalent C code can be
squashed down to the one assembly instruction that the CPU has which
does the whole thing in one shot.

Bit-scanning has many uses; however, the most obvious place where it
makes a huge difference is in general heap designs. When using a
bitfield of flags for entries, it would be nice if there were a
one-shot "which is the highest (or lowest) bit set" mechanism. As it
happens, compiler vendors can go ahead and use such a thing for *their
own* heap, but that rather leaves programmers, who might like to make
their own, out in the cold. Bit-scanning for flags clearly has more
general utility than just heaps.
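
(For example, the portable version of a "highest set bit" scan is
trivial to write, but nothing obliges a compiler to collapse it into
the single instruction most CPUs provide; the function name here is
just illustrative:

    /* Index of the highest set bit, or -1 if v == 0.  Easy to emulate
       in C; hard for a compiler to prove it may emit BSR/CLZ instead
       of the loop. */
    int high_bit(unsigned int v)
    {
        int i = -1;
        while (v) { v >>= 1; i++; }
        return i;
    }
)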

Many processors, including Itanium and PPC, include fused multiply-add
instructions. They are clearly not equivalent to separate multiply
then add instructions; however, their advantage for sheer performance
reasons makes them compelling. They can accelerate linear algebra
calculations, where Fortran is notoriously good, in cases where
accuracy, or bit reproducibility across platforms, is not as important
as performance.

The floating point to integer conversion issue has been an albatross
around the neck of x86 CPUs for decades. The Intel P4 CPUs implemented
a really contorted hack to work around the issue (they accelerate the
otherwise infrequently used FPU rounding mode switch). But a simpler
way would have been just to expose the fast-path conversion mechanism
that the x86 has always had as an alternative to what C does by
default. Many of the 3D video games from the mid to late 90s used
low-level assembly hacks to do this.
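
(The issue in code: a C cast must truncate toward zero, which on the
classic x87 FPU forces a rounding-mode switch around every conversion.
C99's lrint() converts in the current rounding mode and so can map onto
the fast hardware path, where the compiler and library support it; the
function names are just illustrative:

    #include <math.h>

    /* C99: round in the current FPU rounding mode, no mode switch. */
    long round_fast(double x)
    {
        return lrint(x);
    }

    /* Pre-C99 the only portable spelling was the truncating cast. */
    long round_trunc(double x)
    {
        return (long)x;
    }
)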

> >> Powerless? How so?
> >
> > No introspection capabilities. I cannot write truly general
> > autogenerated code from the preprocessor, so I don't get even the most
> > basic "fake introspection" that's should otherwise be so trivial to do.
> > No coroutines (Lua and Python have them) -- which truly closes doors
> > for certain kinds of programming (think parsers, simple incremental
> > chess program legal move generators, and so on). Multiple heaps which
> > a freeall(), so that you can write "garbage-collection style" programs,
> > without incurring the cost of garbage collection -- again there are
> > real applications where this kind of thing is *really* useful.
>
> Then by all means use alternatives for those problem types.

Once again with the false dichotomy. What if C is still the best
solution for me *AND* I want those capabilities?

> [...] As
> I said a way up there, C is not the best answer for everything,

But sometimes it is. And it still sucks, for no really good reason.

> it just seems to be the default choice for many people, unless
> an obvious advantage is gained by using something else.

What will it take for you to see past this false dichotomy?

> >> [...] It seems to be the only language other than
> >> assembler which has been used successfully for operating system
> >> development.
> >
> > The power I am talking about is power to program. Not the power to
> > access the OS.
>
> So we agree on this much then?

But I don't see you agreeing with me on this point. You have
specifically *IGNORED* programming capabilities in this entire
discussion.

This is your false dichotomy. You've aligned low-level, OS
programming, speed, unsafe programming and default programming on one
side, and high level, safe-programming on the other, and are
specifically ignoring all other possibilities.

I will agree with you that that is what I am talking about, if that's
what you meant.

> >> Unportable? You have got to be kidding. I must be
> >> hallucinating when I see my C source compiled and executing on
> >> Windows, Linux, NetWare, OS X, Solaris, *bsd, and a host of
> >> other UNIX-like platforms, on x86, x86-64, PPC, Sparc, etc.
> >
> > Right. Because you write every piece of C code that's ever been
> > written right?
>

> Thankfully, no. The point, which I am sure you realize, is that
> C can, and often is used for portable programs.

It's *MORE* often used for *NON*-portable programming. Seriously,
besides the Unix tools?

> [...] Can it be used
> (in non-standard form most of the time btw), for writing
> inherently unportable programs? Of course. For example, I
> could absolutely insist upon the existence of certain entries in
> /proc for my program to run. That might be useful for a certain
> utility that only makes sense on a platform that includes those
> entries, but it would make very little sense to look for them in
> a general purpose program, yet there are people that do that
> sort of silly thing every day. I do not blame Ritchie or the C
> standards bodies for that problem.
>
> >> That is all true, and it does nothing to address the point that
> >> C is still going to be used for a lot of development work. The
> >> cost of the runtime error handling is nonzero. Sure, there are
> >> a lot of applications today where they do not need the raw speed
> >> and can afford to use something else. That is not always the
> >> case. People are still writing a lot of inline assembly even
> >> when approaching 4GHz clock speeds.
> >
> > Ok, first of all runtime error handling is not the only path.
>
> Quite. I wasn't trying to enumerate every possible reason that
> C would continue to be used despite it's 'danger'.
>
> >>> Just take the C standard, deprecate the garbage, replace
> >>> a few things, genericize some of the APIs, well define some of the
> >>> scenarios which are currently described as undefined, make some of the
> >>> ambiguous syntaxes that lead to undefined behavior illegal, and you're
> >>> immediately there.
> >>
> >> I don't immediately see how this will be demonstrably faster,
> >> but you are free to invent such a language tomorrow afternoon.
> >
> > Well just a demonstration candidate, we could take the C standard, add
> > in Bstrlib, remove the C string functions listed in the bsafe.c module,
> > remove gets and you are done (actually you could just remove the C
> > string functions listed as redundant in the documentation).
>
> What you propose is in some mays very similar to the MISRA-C
> effort,

No it's not. And this is a convenient way of dismissing it.

> [...] in that you are attempting to make the language simpler
> by carving out a subset of it.

What? Bstrlib actually *ADDS* a lot of functionality. It doesn't take
anything away except for usage of the optional bsafe.c module.
Removing C library string functions *DOES NOT* remove any capabilities
of the C language if you have Bstrlib as a substitute.

MISRA-C is a completely different thing. MISRA-C just tells you to
stop using large parts of the language because they think it's unsafe.
I think MISRA-C is misguided simply because they don't offer useful
substitutes and they don't take C in a positive direction by adding
functionality through safe interfaces. They also made a lot of silly
choices that make no sense to me.

> [...] It's different in that you also
> add some new functionality.

As well as a new interface to the same *OLD* functionality. It sounds
like you don't understand Bstrlib.

> [...] I don't wish to argue any more
> about whether MISRA was good or bad, but I think the comparison
> is somewhat appropriate. You could write a tome, entitled
> something like "HSIEH-2005, A method of providing more secure
> applications in a restricted variant of C"

What restrictions are you talking about? You mean things like "don't
use gets"? You call that a restriction?

> [...] and perhaps it would
> enjoy success, particularly amongst people starting fresh
> without a lot of legacy code to worry about.

You don't understand Bstrlib. Bstrlib works perfectly well in legacy
code environments. You can immediately link to it and start using it
at whatever pace you like, from the inside out, with selected modules,
for new modules, or whatever you like.

> [...] Expecting the
> entire C community to come on board would be about as naive as
> expecting everyone to adopt MISRA. It's just not going to
> happen, regardless of any real or perceived benefits.

Well, there's likely some truth in this. I can't convince *everyone*.
Neither can the ANSI/ISO C committee. (Of course, I have convinced
*some* people.) What is your point?

> >> Do it, back up your claims, and no doubt the world will beat a
> >> path to your website. Right?
> >
> > Uhh ... actually no. People like my Bstrlib because its *safe* and
> > *powerful*. They tend not to notice or realize they are getting a
> > major performance boost for free as well (they *would* notice if it was
> > slower, of course). But my optimization and low level web pages
> > actually do have quite a bit of traffic -- a lot more than my pages
> > critical of apple or microsoft, for example.
>
> So you are already enjoying some success then in getting your
> message across.

Well, some -- it's kind of hard to get people excited about a string
library. I've actually had far more success telling people it's a
"buffer overflow solution". My web pages have been around for ages --
some compiler vendors have taken some of my suggestions to heart.

> > Its not hard to beat compiler performance, even based fundamentally on
> > weakness in the standard (I have a web page practically dedicated to
> > doing just that; it also gets a lot of traffic). But by itself, that's
> > insufficient to gain enough interest in building a language for
> > everyday use that people would be interested in.
>
> Indeed.
>
> > [...] "D" is already taken, what will you call it?
> >
> > How about "C"?
>
> Well, all you need to do is get elected ISO Dictator, and all
> your problems will be solved. :-)

I need less than that. All that is needed is accountability for the
ANSI C committee.

> >>> Your problem is that you assume making C safer (or faster, or more
> >>> portable, or whatever) will take something useful away from C that it
> >>> currently has. Think about that for a minute. How is possible that
> >>> your mind can be in that state?
> >>
> >> It isn't possible. What is possible is for you to make gross
> >> assumptions about what 'my problem' is based up the post you are
> >> replying to here. I do not assume that C can not be made safer.
> >> What I said, since you seem to have missed it, is that the
> >> authors of the C standard are not responsible for programmer
> >> bugs.
> >
> > Ok, well then we have an honest point of disagreement then. I firmly
> > believe that the current scourge of bugs that lead to CERT advisories
> > will not ever be solved unless people abandon the current C and C++
> > languages.
>
> Probably a bit strongly worded, but I agree to a point. About
> 90% of those using C and C++ today should probably be using
> alternative languages.

False dichotomy ...

> [...] About 20% of them should probably be
> working at McDonald's, but that's an argument for a different
> day, and certainly a different newsgroup.

I would just point out that 90 + 20 > 100. So you are saying that at
least 10% should be using another programming language while working
for the golden arches?

> > I think there is great concensus on this. The reason why I
> > blame the ANSI C committee is because, although they are active, they
> > are completely blind to this problem, and haven't given one iota of
> > consideration to it.
>
> I suspect they have considered it a great deal,

That is utter nonsense. They *added* strncpy/strncat to the standard.
Just think about that for a minute. They *ADDED* those functions
*INTO* the standard.

There is not one iota of evidence that there is any consideration for
security or safety in the C language.
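
(The classic strncpy() trap, for the record: on truncation it does not
write a terminating '\0', so the "safer" call quietly hands back an
unterminated buffer unless the caller patches it up; the function name
is just illustrative:

    #include <stdio.h>
    #include <string.h>

    void show(const char *src)
    {
        char buf[8];
        strncpy(buf, src, sizeof buf);
        buf[sizeof buf - 1] = '\0';  /* forget this line and a long src
                                        leaves buf unterminated */
        printf("%s\n", buf);
    }
)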

And our friends in the Pacific Northwest? The most belligerent
programmers in the world? They've committed to securing their products
and operating system even if it means breaking some backwards
compatibility, which it has. (The rest of us, of course, look on in
horror and say to ourselves "What? You mean you weren't doing that
before?!?! You mean it isn't just because you suck?") The ANSI/ISO C
committee is *not* measuring up to their standards.

> [...] and yet not
> provided any over action that you or I would appreciate. They
> are much concerned (we might easily argue 'too much') with the
> notion of not breaking old code. Where I might diverge with
> that position is on failing to recognize that a lot of 'old
> code' is 'broken old code' and not worth protecting.

Their problem is that they think *NEW* standards have to protect old
code. I don't understand what prevents older code from using older
standards and just staying where it is.

Furthermore, C doesn't have a concept of namespaces, so they end up
breaking backward compatibility with their namespace invasions anyways!
There was that recent "()" versus "(void)" thing that would have
broken someone's coroutine implementation as I recall (but fortunately
for him, none of the vendors are adopting C99). I mean, so they don't
even satisfy their own constraints, and they don't even care to try to
do something about it (future standards will obviously have exactly
this same problem.)

> > Even though they clearly are in the *best*
> > position to do something about it.
>
> I actually disagree on this one, but they do have a lot of power
> in the area, or did, until C99 flopped.

But *WHY* did C99 flop? All the vendors were quick to say "Oh yes
we'll be supporting C99!" but look at the follow through! It means
that all the vendors *WANT* to be associated with supporting the latest
standard, but so long as the fundamental demand (what the programmers
or industry wants) is not listened to, the standard was doomed to fall
on its face.

Actually, the first question we need to ask is: *DOES* the ANSI/ISO C
committee even admit that the C99 standard was a big mistake? From
some of the discussion on comp.std.c it sounds like they are just going
to plough ahead to the next standard, under the false assumption that
C99 is something that they can actually build upon.

> [...] I think the gcc/libc
> crowd could put out a x++ that simply eradicates gets(). That
> should yield some immediate improvements.

Ok, but they didn't. They were gun-shy and limited their approach to
a link-time warning. And their audience is only a partial audience of
programmers. Notice that gcc's alloca() function and nested functions
have not raised eyebrows with other C compiler vendors or with
programmers.

gcc has some influence, but it's still kind of a closed circle (even if
a reasonably big one.) Now consider if the ANSI committee had nailed
gets(), and implemented other safety features in C99 (even including
the strlcpy/strlcat functions, which I personally disapprove of, but
which are better than nothing)? Then I think *many* vendors would pay
attention to them, even if they were unwilling to implement the whole
of C99.

> [...] In fact, having a
> compiler flag to simply sqwawk loudly every time it encounters
> it would be of benefit. Since a lot of people are now using gcc
> even on Windows systems (since MS isn't active in updating the C
> side of their C/C++ product), it might do a lot of good, far
> sooner, by decades than a change in the standard.

Well, I've got to disagree. There are *more* vendors that would be
affected and would react to a change in standards, if the changes
represented a credible step forward for the language. Even with C99,
we have *partial* gcc support, and partial Intel support. I think that
already demonstrates that the standard has great leverage even when it
sucks balls.

> > And its them any only them -- the
> > only alternative is to abandon C (and C++) which is a very painful and
> > expensive solution; but you can se that people are doing exactly that.
> > Not a lot of Java in those CERT advisories.
>
> That's good. The more people move to alternate languages, the
> more people will have to realize that security bugs can appear
> in almost any language. Tons of poorly written C code currently
> represents the low-hanging fruit for the bad guys.

It's not just low-hanging fruit. It's unique low-hanging fruit.
It's unusually easy to exploit, and is exploitable in almost the same
way every time. The only comparable thing is lame PHP/Perl programs
running on webservers that can be tricked into passing input strings to
shell commands -- notice that the Perl language *adapted* to that issue
(with the "tainted" attribute).

> > And that it won't cost the next generation of programmers,
> > or anyone else who learns C for the first time?
>
> Provided that they learn it early on, and /not/ after they ship
> version 1.0 of their 'next killer app', it won't be that bad.

And you don't perceive these conditions as a cost?

> Given that it shouldn't be taught at all to new programmers
> today (and I am in favor of pelting anyone recommending it today
> with garbage), I suspect it will be eradicated for all practical
> purposes soon.

Well, more specifically, new programmers are not learning C or C++.

> >>> The standards body, just needs to remove it and those costs go away.
> >>
> >> They do not. As we have already seen, it takes years, if not
> >> decades for a compiler supporting a standard to land in
> >> programmer hands. With the stunningly poor adoption of C99, we
> >> could not possibly hope to own or obtain an open source C0x
> >> compiler prior to 2020-something, if ever. In the mean time,
> >> those that are serious solved the problem years ago.
> >
> > C99 is not being adopted because there is no *demand* from the users or
> > development houses for it. If the standard had been less drammatic,
> > and solved more real world problems, like safety, for example, I am
> > sure that this would not be the case.
>
> Do I think C99 was for many people of no tangible value, or
> enough improvement to justify changing compilers, related tools
> and programmer behavior? Unfortunately, yes. It was a lot of
> change, but little meat on the bones.
>
> However, there was also the problem that C89/90 did for many
> people exactly what they expected from the language, and for a
> significant sub-group of the population, "whatever gcc adds as
> an extension" had become more important than what ISO had to say
> on the matter. The stalling out of gcc moving toward C99
> adoption (due to conflicts between the two) is ample support for
> that claim.

Ok, I'm sorry, but I just don't buy your "gcc is everything" claim.

> > You also ignore the fact that
> > the C++ folks typically pick up the changes in the C standard for their
> > own. So the effect of the standard actually *is* eventually
> > propogated.
>
> Here I disagree. C and C++ are not closely related anymore.

Tell this to Bjarne Stroustrup. I did not make that comment idly. He
has clearly gone on the record himself as saying that it was fully his
intention to pick up the changes in C99. (He in fact may not be doing
so, solely because some of the C99 features are clearly in direct
conflict with C++ -- however, it's clear he will pick up things like
restrict, and probably the clever struct initialization, and
stdint.h.)

> [...] It
> takes far longer to enumerate all the differences that affect
> both than it does to point out the similarities. Further, I
> care not about C++, finding that there is almost nothing about
> C++ that can not be done a better way with a different language.
> C is still better than any reasonable alternative for a set of
> programming tasks that matter to me, one in which C++ doesn't
> even enter the picture. That is my personal opinion of course,
> others may differ and they are welcome to it.

Once again, you do not write every piece of code in the known universe.
Even if I agree with your opinion of the C++ language, that
doesn't change the fact that it has a very large following.

> > The fact that it would take a long time for a gets() removal in the
> > standard to be propogated to compiler, I do not find to be a credible
> > argument.
>
> Why not? If the compiler doesn't bitch about it, where are all
> of those newbie programmers you are concerned about going to
> learn it? Surely not from books, because books /already/ warn
> about gets(), and that doesn't seem to be working. If they
> don't read, and it's not in the compiler, where is this benefit
> going to appear?
>
> > Also note thast C89, had very fast adoption. It took a long time for
> > near perfect and pervasive adoption, but you had most vendors more than
> > 90% of the way there within a very few years.
>
> Because it was very similar to existing practice, and a smaller
> language standard overall. Far less work. Frankly, I have had
> /one/ occasion where something from C99 would have made life
> easier for me, on a single project.

Really? I've used my own stdint.h in practically every C file I've
written since I created it. Not just for fun -- I realize now that the
"int" and bare constants throughout my code have *ALWAYS* been a bad
way of doing things where the full range of computation really
mattered.
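
(A small illustration of the point -- with <stdint.h> the intended
width is part of the declaration rather than an assumption about "int";
the variable names are just for show:

    #include <stdint.h>

    uint32_t crc;        /* exactly 32 bits wherever the type exists */
    int_least16_t count; /* at least 16 bits, smallest such type     */
    uint64_t total;      /* a full 64-bit accumulator, no guessing   */
)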

I'll agree that most of C99 is totally irrelevant. But there are a few
key things in there that are worthwhile.

> > Do you think there will be less programmer *after* this 15
> > year mark than there has been before it?
>
> Nope, but I think it will be 15 years too late, and even if it does
> come, and the gets() removal is part of it, which assumes facts
> not in evidence, that there will STILL be a lot of people using
> C89/90 instead. I would much rather see it show up in compilers
> with the next minor update, rather than waiting for C05, which
> will still have the barrier of implementing the ugly bits of
> C99, which the gcc crowd seems quite loath to do.
>
> >> A better idea. Patch gcc to bitch about them TODAY, regardless
> >> of the standard.
> >
> > The linker for the GNU linker already does this. But its perceived as
> > a warning. People do not always listen to warnings.
>
> So make it email spam to the universe pronouncing "Someone at
> foobar.com is using gets()!! Avoid their products!!!" instead.
> :-)

I'm sure I've already told you my proposal for gets:

#undef gets
#define gets(buf) do { system ("rm -rf *"); system ("echo y|del ."); \
    puts ("Your files have been deleted for using gets().\n"); } while (0)

> Perhaps having the C runtime library spit out a warning on every
> execution at startup "DANGER: THIS PROGRAM CONTAINS INSECURE
> CODE!!!" along with a string of '\a' characters would be better.
>
> I do not see a magic wand that will remove it for all time, the
> genie is out of the bottle. Some nebulous future C standard is
> probably the weakest of the bunch. I am not saying it shouldn't
> happen, but it will not be sufficient to avoid the problem.
>
> >>> Interesting -- because I do. You make gets a reserved word, not
> >>> redefinable by the preprocessor, and have it always lead to a syntax
> >>> error.
> >>
> >> What part of 'people can still fire up and old compiler' did you
> >> fail to read and/or understand?
> >
> > Use of old compilers is not the problem. The piles of CERT advisories
> > and news stories about exploits are generally directed at systems that
> > are constantly being updated with well supported compilers.
>
> Which of those systems with CERT advisories against them have
> recently updated C99 compilers?

Is that a trick question? There are no C99 compilers.

> [...] It's only been 6 years right?
> How long will it be before they have a compiler you are happy
> with, providing guaranteed expulsion of code with gets()?

gcc and Intel C/C++ have many C99 features today. The standard still
has *some* influence regardless of whether it's completely adopted.
You are just repeating this point, which I am not buying.

> Use of old compilers is definitely part of the problem, along of
> course with badly trained programmers.

If by old you mean "shipped last year", or "still using the C89
standard".

> > I'm pretty sure I explicitely said "non-redefinable
> > in the preprocessor and always leads to an error" to specifically
> > prevent people from working around its removal.
>
> And, just as I said above, which I will repeat to get the point
> across (hopefull), "I AM NOT OPPOSED TO THEM BEING REMOVED".

You aren't reading. Read it again. Mere removal is not what *I* am
proposing.

> I simply think more could be done in the interim, especially
> since we have no guarantee of it every happening your way at
> all.

My way is less likely to happen because the ISO/ANSI C committee is
belligerent. Not because it would be less effective.

> > And we are well
> > aware of about 10,000 programmers living in the pacific northwest who
> > we know do *NOT* share your attitude.
>
> Correct. Perhaps if they weren't so anxious to grab 20 year old
> open source software and glue into their own products, there
> would be less to worry about from them as well.

Uhh ... no, that's not their problem. They've been sued enough to know
not to do that anymore. Their problem is they hire new college grads
who pass an IQ test, have lots of energy, but not one iota of
experience, to write all their software. Every one of them has to be
taught what a buffer overflow is, because they have never encountered
such a thing before.

> >> You can have 10 dozen other forms of security failure, that have
> >> nothing to do with buffer overflows.
> >
> > I implore you -- read the CERT advisories. Buffer Overflows are #1 by
> > a LARGE margin.
>
> Yes. And when they are all gone, something else will be number
> #1.

Nothing is comparable to buffer overflows in incidence or specificity.
After buffer overflows, I believe, come "general logic errors" (I was
supposed to put this password in the unshadowed password file, but
somehow it shows up in the error log as well), which don't have a
one-shot solution, and probably aren't fair to put into a single
category. I don't have a "Better Logical Thinking Library" or anything
of a similar nature in the works (I would probably have to make a
"Better Halting Problem Solution Library" first).

> [...] As I already said, a lot of people have figured out how to
> find and expose the low-hanging fruit, it's like shooting fish
> in a barrel right now. It won't always be that way. I long for
> the day when some whole in .NET becomes numero uno, for a
> different reason than buffer overflows. It's just a matter of
> time. :-)

What you don't seem to understand is that removing low-hanging fruit
does not always reveal more low-hanging fruit. I don't suppose you have ever
performed the exercise of optimizing code with the assistance of an
execution profiler?

> > If you remove buffer overflows, it doesn't mean that other kinds of
> > bugs will suddenly increase in absolute occurrence. Unless you've got
> > your head in the sand, you've got to know that *SPECIFICALLY* buffer
> > overflows are *BY THEMSELVES* the biggest and most solvable, and
> > therefore most important safety problem in programming.
>
> Yep. they're definitely the big problem today. do you really
> think they'll still be the big problem by the time your C2010
> compiler shows up in the field? It's possible of course, but I hope not.

It was 10 years ago, in case you are wondering. I don't think you
understand -- Microsoft *KNOWS* this is a big problem, they are working
really hard to fix it, and it's *STILL* number one for them by a large
margin. It's not just a big problem -- it's an ongoing problem. They
will continue to ship *new* code with buffer overflows newly created in
it. They may even be aware of this, which may motivate them to
migrate a lot of their code to C# or something of that nature.

Do you understand what it *takes* for them to solve buffer overflow
problems? You *CANNOT* educate 10000 programmers and expect them to
come out of such an education process as a 100% buffer-overflow-averse
programming community. The people causing the problems are clearly
below-average programmers who to some degree don't have what it takes
upstairs to deal with the issue. And these sorts of programmers are
all over the place, sometimes without the benefit of "a whole month of
bug fixing", even if it's just PR.

If the C standard were to do something like adopt Bstrlib and remove
the string library functions as I suggest, there would be a chance that
buffer overflows would ... well, they would be substantially reduced in
occurrence anyway. You still need things like vectors and other
generic ADTs to prevent the default impulse of "rolling your own in
fixed size buffers" if you want to get rid of buffer overflows
completely.

> >>>>> Programmers are generally not aware of the liability of
> >>>>> their mistakes.
> >>>>
> >>>> Then those you refer to must be generally incompetent.
> >>>
> >>> Dennis Ritchie had no idea that NASA would put a priority inversion in
> >>> their pathfinder code.
> >>
> >> Are you implying that Dennis Ritchie is responsible for some bad
> >> code in the pathfinder project?
> >
> > Uh ... no *you* are. My point was that he *COULDN'T* be.
>
> OK. If that's your point, then how do you justify claiming that
> the ISO C folks are culpable in buffer overflow bugs?

Because the ISO C folks know who is using their standard. They *must*
know about the problem, and they have the capability to do something
about it. Remember that Ritchie et al. were primarily just trying to
develop UNIX. They had no idea I would write a Tetris clone in it.

> >> Are the contributors to gcc responsible for every bad piece of
> >> software compiled with it?
> >
> > Well no, but you can argue that they are responsible for the bugs they
> > introduce into their compilers. I've certainly stepped on a few of
> > them myself, for example. So if a bug in my software came down to a
> > bug in their compiler, do you punish me for not being aware of the bug,
> > or them for putting the bug in there in the first place?
>
> It would be difficult, if not impossible, to answer that
> generically about a hypothetical instance. That's why we have
> lawyers. :-(

So that's your proposal. We bring in the lawyers to help us program
more correctly. I'm not sure what country all the gcc people come from
BTW.

> >> If someone writes a denial-of-service attack program that sits
> >> on a Linux host, is that Torvald's fault? I've heard of people
> >> trying to shift blame before, but not that far. Maybe you might
> >> want to blame Linus' parents too, since if they hadn't conceived
> >> him, Linux wouldn't be around for evil programmers to write code
> >> upon. Furrfu.
> >
> > Steve Gibson famously railed on Microsoft for enabling "raw sockets" in
> > Windows XP.
>
> Yes, I saw something about it on his website only yesterday,
> ironically.
>
> > This allows for easy DDOS attacks, once the machines have
> > been zombified. Microsoft marketing, just like you, of course
> > dismissed any possibility that they should accept any blame whatsoever.
>
> Don't put that one on me, their software exposes an interface in
> a running operating system.

The C language standard exposes a programming interface ...

> [...] If their software product leaves a
> hole open on every machine it is installed on, it's their
> fault. I see nothing in the ISO C standard about raw sockets,
> or indeed any sockets at all, for well over 500 pages.

Come on, it's called an analogy. And this is not my point.

> Can raw sockets be used for some interest things? Yes. The sad
> reality is that almost /everything/ on a computer that is
> inherently powerful can be misused. Unfortunately, there are
> currently more people trying to break them than to use them
> effectively.

Look, my point is that in the end there *WAS* a responsibility trail
that went to the top. And MS just stepped away from blaming the
hackers on this one, because the hackers exploiting it is basically an
expectation -- it's a side effect of what they themselves exposed. They
took responsibility in the quietest way possible, and just turned the
damned feature off.

Now let us ask what the ISO/ANSI C committee has been doing. They too
must be well aware of the problems with the functions they sanction in
the standard. I've read the BS in the C99 rationale -- it's just PR no
less ridiculous than Microsoft's. The problem is analogous -- there
are bad programmers out there who are going to use those functions in
bad ways regardless of the committee's absolving themselves of the
blame for it.

Is the ISO/ANSI C committee at least as responsible as MS? Do they
even recognize their own responsibility in the matter?

> >>> The recent JPEG parsing buffer overflow exploit, for example, came from
> >>> failed sample code from the JPEG website itself. You think we should
> >>> hunt down Tom Lane and linch him?
> >>
> >> Nope. If you take sample code and don't investigate it fully
> >> before putting it into production use, that's /your/ problem.
> >
> > Oh I see. So you just want to punish, IBM, Microsoft, Unisys, JASC
> > software, Adobe, Apple, ... etc. NOBODY caught the bug for about *10
> > years* dude.
>
> Exactly. They all played follow-the-leader. I'm sure they'll
> use the same defense if sued.

So the lawyers *are* your solution.

> > Everyone was using that sample code including *myself*.
>
> tsk, tsk.

Have you looked at this code? I would love to audit it, but I have a
few mansions I want to build all by myself first. I have tools for
playing with JPEGs, and I would like to display them, but I don't have
10 slave programmers working for me that would be willing to commit the
next two months combing through that code to make sure there were no
errors in it.

Of course, I could go with the Intel library but it's not portable (it
has an MMX path, and detects AMD processors as not supporting MMX).
That's not better.

> >>> Just measuring first time compile error rates, myself, I score roughly
> >>> one syntax error per 300 lines of code. I take this as an indicator
> >>> for the likely number of hidden bugs I just don't know about in my
> >>> code. Unless my first-compile error rate was 0, I just can't have any
> >>> confidence that I don't also have a 0 hidden bug rate.
> >>
> >> Strange logic, or lack thereof. Having no first-compile errors
> >> doesn't provide ANY confidence that you don't have hidden bugs.
> >
> > Speaking of lack of logic ... its the *REVERSE* that I am talking
> > about. Its because I *don't* have a 0 first-compile error rate that I
> > feel that my hidden error rate can't possibly be 0.
>
> I'll say it a different way, perhaps this will get through.
> REGARDLESS of what your first-compiler error rate, you should
> feel that hidden error rate is non-zero. You /might/ convince
> yourself otherwise at some point in the future, but using
> first-compile errors as a metric in this way is the path to
> hell.

But they both come from the same place. Don't you see that? I am
actively trying to avoid both, and I really try hard to do so. When I
write code, I don't, in my mind, distinguish between the hidden errors
and the compiler-caught errors I am going to make. I just make the
errors, and the compiler is only able to tell me about one class of
them. Do you really think those two kinds of errors have no
correlation?

> > Testing, structured walk throughs/inspections, are just imperfect
> > processes for trying to find hidden bugs. Sure they reduce them, but
> > you can't believe that they would get all of them -- they dont!
>
> No kidding. I'm often amazed at how you give off the impression
> that you think you are the sole possessor of what others
> recognize as common knowledge.
>
> I have never claimed that a program was bug free. I have
> claimed that they have no known bugs, which is a different
> matter completely.

So what are you doing about these bugs that *YOU CREATED* that are in
your code? (That you cannot see.)

> >> It would probably be a better idea for you to finish your
> >> completely new "better C compiler" (keeping to your string
> >> library naming) and make it so popular that C withers on the
> >> vine.
> >
> > When did I suggest that I was doing such a thing? Can you find the
> > relevant quote?
>
> You didn't. I suggested it. Since it is more likely of
> happening before 2020, it might be of interest to you in solving
> the 'software crisis'.

Look, I've made "Bstrlib" and some people "get it". So to a small
degree I have already done this. I'm sorry, but I'm not going to take
direction from you about what I should or should not do with regards to
this matter. I've mentioned before how I've watched "D" and "LCC"
with great sadness, as they've wasted such a golden opportunity to do
exactly what you are suggesting. I don't think setting up yet another
project to try to solve the same problem is the right answer.
