All is not lost however. In the book
"Value Range Analysis of C programs" Axel Simon tries to establish a
theoretical framework for analyzing C programs. In contrast to other
books where the actual technical difficulties are "abstracted away",
this book tries to analyze real C programs taking into account
pointers, stack frames, etc.
It has just arrived today; I had been waiting for it for several weeks.
http://www.di.ens.fr/~simona/book.html
--
jacob navia
jacob at jacob point remcomp point fr
logiciels/informatique
http://www.cs.virginia.edu/~lcc-win32
>All is not lost however. In the book
>"Value Range Analysis of C programs" Axel Simon tries to establish a
>theoretical framework for analyzing C programs.
If buffer overflows are indeed, as you claim, "a fact of C", then
no amount of theoretical analysis will allow you to eliminate them
in any actual C program.
On the other hand:
int main(void) { return 0; }
is a C program that has no possibility of buffer overflow. The
existence of a single counter-example is enough to disprove your
claim that buffer overflows are "a fact of C".
There may be constructs in C that are -prone- to buffer overflows
when used by typical programmers, but that doesn't establish
that buffer overflows are "a fact of C". Therefore, any analysis
such as the book you refer to is not a book about C, but rather
a book about algorithms and about C implementations. That
makes it relevant for comp.programming and to newsgroups dealing
with the specifics of implementations similar to the ones s/he
discusses, but does not make the book relevant to comp.lang.c .
--
"There are some ideas so wrong that only a very intelligent person
could believe in them." -- George Orwell
Startup or shutdown code may call library functions. These library
functions may overflow a buffer. To say that there is no possibility of
buffer overflow is an error.
> The
> existence of a single counter-example is enough to disprove your
> claim that buffer overflows are "a fact of C".
I think we have to admit that buffer overflows are a *problem* of C. I
guess that a C dialect could be produced where every single library function
was formally proven. In such a system, buffer overflows would only occur
when the compiler end-user created one.
> There may be constructs in C that are -prone- to buffer overflows
> when used by typical programmers, but that doesn't establish
> that buffer overflows are "a fact of C". Therefore, any analysis
> such as the book you refer to is not a book about C, but rather
> a book about algorithms and about C implementations. That
> makes it relevant for comp.programming and to newsgroups dealing
> with the specifics of implementations similar to the ones s/he
> discusses, but does not make the book relevant to comp.lang.c .
Having tools or ideas on how to analyze C programs for problems like buffer
overflow is clearly a good thing. I disagree that the book is not relevant
to comp.lang.c because buffer exploits are the single largest problem with
the language (by far), with manual memory management being the second most
significant issue [clearly, my opinion only]. I suggest that
news:comp.std.c is more on target than news:comp.lang.c but I definitely
think it is worth discussing here as well.
> --
> "There are some ideas so wrong that only a very intelligent person
> could believe in them." -- George Orwell
Well, happily for C, that is not true. What the author proposes is to
apply mathematical reasoning to the sets of values a variable in C can
have, and then, to reason mathematically about those sets. It uses
geometrical concepts like "lattices", and other mathematical
"software" to determine possible values. It is too early for me to
tell you exactly how he does it, I received the book today.
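As a rough illustration of reasoning about sets of values (a minimal sketch assuming a plain interval domain; it is not taken from the book, and the names interval, interval_join and index_is_safe are invented for this example), each variable is tracked as a range, ranges are merged where control-flow paths meet, and an array access is accepted only if the whole range is a valid index:

```c
#include <assert.h>
#include <stdbool.h>

/* An abstract value: every concrete value of the variable lies in [lo, hi]. */
typedef struct {
    long lo;
    long hi;
} interval;

/* Least upper bound of two intervals: the smallest interval that contains
   both. This is the "join" applied where two control-flow paths merge. */
interval interval_join(interval a, interval b)
{
    interval r;
    r.lo = a.lo < b.lo ? a.lo : b.lo;
    r.hi = a.hi > b.hi ? a.hi : b.hi;
    return r;
}

/* True if every value in idx is a valid index into an array of n elements. */
bool index_is_safe(interval idx, long n)
{
    assert(n > 0);
    return idx.lo >= 0 && idx.hi < n;
}
```

For instance, joining [0, 3] and [5, 9] gives [0, 9], which is a safe index range for an array of 10 elements but not for one of 9.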
> On the other hand:
>
> int main(void) { return 0; }
>
> is a C program that has no possibility of buffer overflow. The
> existance of a single counter-example is enough to disprove your
> claim that buffer overflows are "a fact of C".
>
What is the point of that triviality?
Always the same. You take some sentence in my message, take it out
of its context, then find a single "counterexample", and then
that was it.
>
> There may be constructs in C that are -prone- to buffer overflows
> when used by typical programmers, but that doesn't establish
> that buffer overflows are "a fact of C".
There may be constructs yes. There "may" be constructs that are prone
to buffer overflows when used by typical programmers. Blame the
programmers that are typical, not the geniuses that live in
COMP.LANG.C
where they are a dime a dozen.
> Therefore, any analysis
> such as the book you refer to is not a book about C, but rather
> a book about algorithms and about C implementations.
Yes, algorithms surely. C implementations surely. It is not
an abstract book. Obviously for some people in this group
C stops at anything concrete that goes beyond the usual
void main(void) is not correct
i= i++ + ++i; is not correct.
and discussing forever homework issues.
> That
> makes it relevant for comp.programming and to newsgroups dealing
> with the specifics of implementations similar to the ones s/he
> discusses, but does not make the book relevant to comp.lang.c .
>
OFF TOPIC. !!!
A book about analysis of C programs is OBVIOUSLY
OFF TOPIC in comp.lang.c
The only on topic stuff is a sterile repeating of such important
stuff like
void main(void) is not correct
i= i++ + ++i; is not correct.
and discussing forever homework issues.
These people are just against ANY discussion.
By that logic (if "logic" is the right word), you have
also shown that the preprocessor is not a fact of C, nor the
modulo operator, nor the `for' statement, nor the `static'
keyword, nor the entire Standard library. Brevity, in this
case, is the soul of witlessness.
> In article <g65acc$427$1...@aioe.org>, jacob navia <ja...@nospam.org>
> wrote:
>>Buffer overflows are a fact of life, and, more specifically, a fact of
>>C.
>
>>All is not lost however. In the book
>
>>"Value Range Analysis of C programs" Axel Simon tries to establish a
>>theoretical framework for analyzing C programs.
>
> If buffer overflows are indeed, as you claim, "a fact of C", then
> no amount of theoretical analysis will allow you to eliminate them
> in any actual C program.
>
> On the other hand:
>
> int main(void) { return 0; }
>
> is a C program that has no possibility of buffer overflow.
Your point might have been more pointy if you'd written a program that
actually used a buffer.
<snip>
--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
But are those startup and shutdown functions part of the C language?
Ah but his point was exactly that.
I don't think it is possible to say. They are part of the C compiler.
By what magic do the two of you know that no buffer is involved?
Here is the disassembly for the provided program on a C compiler:
--- c:\tmp\minimal\minimal.c ---------------------------------------------------
int main(void) { return 0; }
010B1370 push ebp
010B1371 mov ebp,esp
010B1373 sub esp,0C0h
010B1379 push ebx
010B137A push esi
010B137B push edi
010B137C lea edi,[ebp-0C0h]
010B1382 mov ecx,30h
010B1387 mov eax,0CCCCCCCCh
010B138C rep stos dword ptr es:[edi]
010B138E xor eax,eax
010B1390 pop edi
010B1391 pop esi
010B1392 pop ebx
010B1393 mov esp,ebp
010B1395 pop ebp
010B1396 ret
Did you notice the rep stos? That is a string fill operation.
REP STOS m32 --> Fill (E)CX doublewords at ES:[(E)DI] with EAX
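In C terms, that instruction behaves roughly like the loop below (a sketch only; fill_dwords is a name made up here, not something the compiler emits). With ECX = 30h and EAX = 0CCCCCCCCh it writes 0x30 doublewords, which is exactly the 0C0h bytes reserved by the preceding "sub esp,0C0h":

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store `value` into `count` consecutive 32-bit words starting at `dst`.
   This mirrors what REP STOS does with EAX (value), ECX (count) and
   ES:[EDI] (destination). */
void fill_dwords(uint32_t *dst, size_t count, uint32_t value)
{
    assert(dst != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        dst[i] = value;
}
```

The write is bounded by the count, so it stays inside the frame only as long as the count and the reserved size agree, which they do in the listing above.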
Program startup is referenced 23 times in the C99 standard and program
termination is referenced 31 times. So by definition, startup and shutdown
are part of the C language specification. Whether startup and shutdown are
referenced as functions or not is unclear.
> "Mark McIntyre" <markmc...@TROUSERSspamcop.net> wrote in message
> news:0Rshk.30534$c47....@en-nntp-06.am2.easynews.com...
>> Richard Heathfield wrote:
>>> Walter Roberson said:
>>>
>>>> In article <g65acc$427$1...@aioe.org>, jacob navia <ja...@nospam.org>
>>>> wrote:
>>>>> Buffer overflows are a fact of life, and, more specifically, a fact
>>>>> of C.
>>>> If buffer overflows are indeed, as you claim, "a fact of C", then
>>>> no amount of theoretical analysis will allow you to eliminate them
>>>> in any actual C program.
>>>>
>>>> On the other hand:
>>>>
>>>> int main(void) { return 0; }
>>>>
>>>> is a C program that has no possibility of buffer overflow.
>>>
>>> Your point might have been more pointy if you'd written a program that
>>> actually used a buffer.
>>
>> Ah but his point was exactly that.
(Not much of a point, then.)
> By what magic do the two of you know that no buffer is involved?
I haven't claimed that no buffer is involved. I've claimed that the C
program as shown doesn't use a buffer. It doesn't. What the implementation
does in translation is of no concern as long as it faithfully interprets
the program's semantics. If it chooses to add a buffer, so be it - but the
C program does *not* use one.
The point discussed was in connection with vulnerabilities in connection
with the C programming language.
If a C program calls library functions, uses inline assembly or whatever and
performs buffer movements then that is salient to the discussion of
vulnerabilities and buffer overflow.
Whether or not it was the end-user of the compiler or the compiler vendor
who broke something is irrelevant to the discussion.
If a library call or other normal operation of any given C compiler causes a
buffer overrun, then that is a vulnerability.
In fact, I propose that is a more serious vulnerability than one found in
end-user code because it is potentially much harder to detect.
We can run programs that eat a pile of C code and perform static analysis
and detect buffer overruns. But it will be harder still to perform the same
analysis against the C run time libraries because (potentially) much of it
will not even be written in C and harder still to find problems in the goo
projected by the compiler that does not come from the run time libraries.
To say: "I don't see any buffer use." when examining a fragment of C code
does not mean that buffer overruns will not happen because of using a C
compiler. I think that this is a very important subject that directly
focuses on the chief criticism of the C language. If we can figure out
solutions, or at least partial solutions, then we end up with a better C
compiler and safer code in the long run. It is perhaps the biggest obstacle
in C's future.
> "Richard Heathfield" <r...@see.sig.invalid> wrote in message
> news:tfadnXXEvJdnFRvV...@bt.com...
<snip>
>> What the
>> implementation does in translation is of no concern as long as it
>> faithfully interprets the program's semantics. If it chooses to add a
>> buffer, so be it - but the [snipped] C program does *not* use one.
>
> The point discussed was in connection with vulnerabilities in connection
> with the C programming language.
> If a C program calls library functions, uses inline assembly or whatever
> and performs buffer movements then that is salient to the discussion of
> vulnerabilities and buffer overflow.
Perhaps it is, but that isn't the impression I get about what's eating the
"C is prone to buffer overflow" crowd. Rather, I think they're saying it's
too easy to screw up when writing C code itself - e.g. it's too easy to
write something like this:
char foo[32];
puts(prompt);
gets(foo);
or
strcpy(foo, toolongstring);
or
sprintf(foo, "Now is the time for %s to come to the aid of the party",
bar);
And it *is* easy to write that kind of stuff. (It's pretty easy to fly a
fighter jet into the ground, too, but I've always seen that as more of an
argument for training pilots than an argument for grounding them.)
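For comparison, the same three operations can be written with the destination size passed along (a minimal sketch; read_line, copy_bounded and format_call are hypothetical helper names, while fgets and snprintf are the standard functions, the latter since C99):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Read one line into buf (at most size - 1 characters), dropping the
   trailing newline if present. Returns false on EOF or error.
   Assumes size fits in an int, which fgets requires. */
bool read_line(char *buf, size_t size, FILE *in)
{
    assert(buf != NULL && size > 0);
    if (fgets(buf, (int)size, in) == NULL)
        return false;
    buf[strcspn(buf, "\n")] = '\0';
    return true;
}

/* Copy src into dst, truncating if needed; dst is always null-terminated.
   Returns true only if the whole string fitted. */
bool copy_bounded(char *dst, size_t dstsize, const char *src)
{
    assert(dst != NULL && dstsize > 0);
    int n = snprintf(dst, dstsize, "%s", src);
    return n >= 0 && (size_t)n < dstsize;
}

/* Bounded version of the sprintf call; returns true only if nothing
   was cut off. */
bool format_call(char *dst, size_t dstsize, const char *who)
{
    assert(dst != NULL && dstsize > 0);
    int n = snprintf(dst, dstsize,
                     "Now is the time for %s to come to the aid of the party",
                     who);
    return n >= 0 && (size_t)n < dstsize;
}
```

With "char foo[32];" the calls become read_line(foo, sizeof foo, stdin), copy_bounded(foo, sizeof foo, toolongstring) and format_call(foo, sizeof foo, bar). Truncation is reported to the caller instead of overwriting whatever follows foo.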
<snip>
> Buffer overflows are a fact of life, and, more specifically, a fact of
> C.
>
> All is not lost however. In the book
>
> "Value Range Analysis of C programs" Axel Simon tries to establish a
> theoretical framework for analyzing C programs. In contrast to other
> books where the actual technical difficulties are "abstracted away",
> this books tries to analyze real C programs taking into account
> pointers, stack frames, etc.
>
> It has just arrived today, I was waiting for it since several weeks.
>
> http://www.di.ens.fr/~simona/book.html
%- Anyone interested in source code analysis. The formal yet concise
%- definition of an analysis of a real-world programming language can
%- help to define a similar description for the purpose of slicing,
%- taint analysis, calculating metrics and many other application areas.
I'm somewhat of a non-believer here. There is no calculus to decide.
Isn't 'taint' the scientific name for "perineum?"
--
Let's not burn the universities yet. After all, the damage they do might be
worse.
H. L. Mencken
Well, far more important, at least as far as I'm concerned, is the fact
that this thread is also cross-posted to comp.std.c, where there is far
less controversy about what is topical and what isn't. A book about how
to use C, no matter how well written and interesting it might be, is not
on-topic for comp.std.c, at least not for that reason. This group is
very specifically for discussions about the C standard, not about how to
use the language defined by that standard. "How to use C" can become
relevant, when discussing whether or not a particular existing or
proposed feature of the standard is a good idea, but "How to use C" is
not topical without such a connection.
If the book also contains statements about the C standard, it would be
on-topic to discuss whether or not those statements are true. If it has
suggestions for future versions of the C standard, it would also be on
topic to discuss those suggestions. However, just because it talks about
C does not make it on-topic for comp.std.c.
Sloppy wording breeds meaningless assertions.
A "buffer overflow in C" is not even a proposition, let alone a fact. A
modulo operator is also not a "fact" of C or anything else.
Let's add precision with some propositions:
1. Buffer overflows occur when some programs written in C are run.
2. Buffer overflows occur when all programs written in C are run.
3. Buffer overflows can occur when running most programs written in C under
some conditions of invocation and inputs.
4. The integer modulo operator is supported by all C compilers.
We may argue the finer points of meaning and methods of evaluation, but
arguing "facts" without clarification seems silly.
--
Thad
Thad Smith said:
<snip>
>
> Sloppy wording breeds meaningless assertions.
>
> A "buffer overflow in C" is not even a proposition, let alone a fact. A
> modulo operator is also not a "fact" of C or anything else.
>
> Let's add precision with some propositions:
> 1. Buffer overflows occur when some programs written in C are run.
> 2. Buffer overflows occur when all programs written in C are run.
> 3. Buffer overflows can occur when running most programs written in C
> under some conditions of invocation and inputs.
And then let's add:
4. Buffer overflows can sometimes occur when running some carelessly
written programs under some conditions of invocation and inputs, where
these programs are written in any of a variety of languages (certainly
including C, but also including C++ for a start) that are sufficiently
powerful to be capable of being dangerously misused by amateurs.
If you don't want buffer overflows, hire some good programmers. One way you
can tell they're good is that they hold regular code reviews in which they
point out faults in each others' code.
> 4. The integer modulo operator is supported by all C compilers.
One would hope so, wouldn't one? :-)
By that reasoning, buffer overflows are a problem for every single
computer language in existence.
Even for the safest of safe languages, you can not possibly rule out
that a (flawed) translator introduces a buffer overflow in the
program.
So, why do people complain all the time about the possibility of
buffer overflows in C, but not in other languages?
Bart v Ingen Schenau
This:
"Value Range Analysis of C programs" Axel Simon tries to establish a
theoretical framework for analyzing C programs.
Sounds like it would be interesting to compiler vendors and language
designers to me.
Because nobody brought it up. Besides which, it would not be topical in
news:comp.lang.c
I don't particularly care if buffer overflows are a problem in {e.g.} Snobol
because I don't use it.
C is not more dangerous than (for instance) C++ in this regard. But it is
more dangerous than (for instance) Ada. I would further argue that an Ada
compiler compiled with C is probably not as safe as an Ada compiler compiled
with Ada and that a C compiler compiled with Ada would probably be safer
than a C compiler compiled with C.
Unbounded arrays and trusting in nul termination of strings are reasons for
my above opinion. I'm not even sure that I want to fix it, if it will make
things a lot slower. But I do want to think about it.
Well, mathematics doesn't need beliefs. You should read that book first.
> Isn't 'taint' the scientific name for "perineum?"
When a variable has some property, for instance of being allocated with
malloc, this property goes on to other variables when they are assigned
the allocated variable, i.e. they are "tainted" by the right hand side.
The analysis and the following of those properties is "tainted"
analysis.
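A toy model of that propagation might look as follows (a sketch under simplifying assumptions: it tracks a single yes/no property at run time, whereas a static analysis computes it without running the program; tval and the tv_ functions are invented names):

```c
#include <assert.h>
#include <stdbool.h>

/* A value together with the property being tracked. */
typedef struct {
    int  value;
    bool tainted;
} tval;

/* A value from an untrusted source starts out tainted. */
tval tv_from_input(int v)
{
    tval t = { v, true };
    return t;
}

/* A constant written in the program is not tainted. */
tval tv_const(int v)
{
    tval t = { v, false };
    return t;
}

/* The result of an operation is tainted if either operand is: the
   property flows from the right-hand side to the assigned variable. */
tval tv_add(tval a, tval b)
{
    tval t = { a.value + b.value, a.tainted || b.tainted };
    return t;
}
```

A checker built on this would flag, for example, any array index whose tainted flag is still set when it is used.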
>> The
>> existance of a single counter-example is enough to disprove your
>> claim that buffer overflows are "a fact of C".
>
> I think we have to admit that buffer overflows are a *problem* of C. I
> guess that a C dialect could be produced where every single library
> function was formally proven. In such a system, buffer overflows would
> only occur when the compiler end-user created one.
Are buffer overflows a problem of *C* or of implementations of *C*?
I see only one problem in the C standard that limits the ability
of a compiler (+ run time environment) to check overflows.
I think a C compiler with range checks would still be
faster than other languages "without overflows".
So the only problem I see is dynamic memory.
IMHO C lacks functions like:
void * type_alloc(size_t s, type_t type);
type is implicitly or explicitly converted to a
TYPE_UNK value (and the information is not used, as with malloc),
or to an implementation-defined value if an implementation
cares about runtime types.
I think with this additional information, an implementation
could be overflow safe and still conforming.
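One way such an interface could be realised on top of malloc is sketched below. Everything except the type_alloc prototype and the TYPE_UNK name is an assumption made for illustration: the enumerators, the hidden header, type_free, type_of and type_in_bounds are hypothetical, and max_align_t needs C11.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef enum { TYPE_UNK, TYPE_CHAR, TYPE_INT } type_t;

/* Hidden header stored in front of each block. The union with
   max_align_t keeps the payload after it suitably aligned. */
typedef union {
    struct { size_t size; type_t type; } info;
    max_align_t align;
} alloc_header;

void *type_alloc(size_t s, type_t type)
{
    if (s > SIZE_MAX - sizeof(alloc_header))
        return NULL;                        /* size computation would wrap */
    alloc_header *h = malloc(sizeof *h + s);
    if (h == NULL)
        return NULL;
    h->info.size = s;
    h->info.type = type;
    return h + 1;                           /* payload starts after the header */
}

void type_free(void *p)
{
    if (p != NULL)
        free((alloc_header *)p - 1);
}

/* Type recorded at allocation time. */
type_t type_of(const void *p)
{
    assert(p != NULL);
    return ((const alloc_header *)p - 1)->info.type;
}

/* True if the len bytes starting at byte offset lie inside the block. */
bool type_in_bounds(const void *p, size_t offset, size_t len)
{
    assert(p != NULL);
    size_t size = ((const alloc_header *)p - 1)->info.size;
    return offset <= size && len <= size - offset;
}
```

An implementation that does not care about runtime types could ignore the second argument entirely and forward to malloc, which is the TYPE_UNK case described above.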
Or do you see other problems in C?
ciao
cate
>
> So, why do people complain all the time about the possibility of
> buffer overflows in C, but not in other languages?
Because that kind of error mostly plagues C programs?
Regards
Friedrich
--
Please remove just-for-news- to reply via e-mail.
I don't follow you there.
How does the language used to implement a compiler affect the safety
of the code generated *by* that compiler?
I see absolutely no difficulty in writing a compiler in a 'safe'
language with full bounds checking that generates a buffer overflow in
every program compiled with it.
>
> Unbounded arrays and trusting in nul termination of strings are reasons for
> my above opinion. I'm not even sure that I want to fix it, if it will make
> things a lot slower. But I do want to think about it.
I agree that C lacks every kind of safety net.
But I don't blame the language if I have taken every possible
precaution in my source code and the compiler still manages to screw
up. Those things, I blame on the compiler regardless of which language
I am using.
Bart v Ingen Schenau
jacob navia said:
> Ron Ford wrote:
<snip>
>>
>> I'm somewhat of a non-believer here. There is no calculus to decide.
>>
>
> Well, mathematics doesn't need beliefs.
It needs a few. They are called axioms.
Giacomo Catenazzi said:
<snip>
> Are buffer overflows problem of *C* or of implementation of *C*?
No, they're a problem of C programmers.
> I see only one problem on C standard, that limits the capabilities
> of a compiler (+ run time environment) to check overflows.
Mandating it would be such a bad idea. Implementations are already free to
do bounds checking if they wish. Let the market decide. Programmers will
generally make the smart move if they're given the time to think about it.
Trust them to decide for themselves.
<snip>
Because functions like gets(), asctime() and other standard
functions (still in the C99 standard, even if gets() got deprecated)
make buffer overflows almost MANDATORY.
Zero-terminated strings, where there is no bounds checking, make it
almost impossible to avoid errors, since they require the programmer
never to forget the lengths of buffers!
I have proposed a string library for C to make those errors more
difficult. The reception was as expected... :-(
While many on-topic discussions in comp.std.c are "interesting to
compiler vendors and language designers", the simple fact that an issue
is "interesting to compiler vendors and language designers" does not
make it on-topic for comp.std.c, not even if it's specifically about C.
It has to be about the C standard to be on-topic.
Why?
IIRC it is one of the discussion topics for the future C1x, and I'm
more interested in the std part.
> Giacomo Catenazzi said:
>
>> Are buffer overflows problem of *C* or of implementation of *C*?
>
> No, they're a problem of C programmers.
No. Languages are defined for programmers.
Common problems of programmers are by definition also
problems of the language.
>> I see only one problem on C standard, that limits the capabilities
>> of a compiler (+ run time environment) to check overflows.
>
> Mandating it would be such a bad idea. Implementations are already free to
> do bounds checking if they wish. Let the market decide. Programmers will
> generally make the smart move if they're given the time to think about it.
> Trust them to decide for themselves.
I totally agree that it should not be mandated.
But current C doesn't permit (IMHO because of one single point) a full
implementation of bounds checking.
And IMHO it is simple, and I don't see a disadvantage for
programmers (they would include the type in memory allocation calls)
or for implementations: they could ignore the extra fields.
But because it is a topic for the next C1x (IIRC), I'm curious about the
direction C will take to support (at the programmer's wish)
better security.
ciao
cate
Exactly. If a tool always fails at the same place, it has a
problem; you can't just blame the user.
>
>>> I see only one problem on C standard, that limits the capabilities
>>> of a compiler (+ run time environment) to check overflows.
>>
>> Mandating it would be such a bad idea. Implementations are already
>> free to do bounds checking if they wish. Let the market decide.
>> Programmers will generally make the smart move if they're given the
>> time to think about it. Trust them to decide for themselves.
>
> I totally agree that should not be mandated.
> But actual C doesn't permit (for IMHO one single point) full
> implementation of bounds check.
>
> And IMHO it is simple and I doesn't see disadvantage for
> programmers (they should include type on memory allocation functions)
> or by implementation: they could ignore the extra fields.
>
> But because it is a topic of next C1x (IIRC), I'm courios of the
> direction what would take C, to support (at programmer wishes)
> better security.
>
It would be a BETTER direction. We have to consider that many things
that were not possible a few years ago are now easily possible with
the huge advances made in hardware.
This sounds very much like the sort of analyses done by tools such as
Polyspace, Absint & Astree.
It is unclear why the author would _try_ to establish such a framework - it
seems to already be done and there are practical tools available to apply
the techniques. These tools are already in use by organizations that place
a premium on code correctness.
Perhaps the book is a text that covers the subject of Abstract
Interpretation and the author is providing the technical details behind the
technique. This would still make it an interesting book, though I would
characterize it as a technical reference book about an established
technique, rather than presenting something new (which is the impression I
take from the "tries to establish" tag).
--
Stuart
I don't think there was a hardware issue. Bounds checking is
a simple task, and it should have a small impact on
memory and performance. (But still, compilers should make
it optional.)
But one problem is in people's heads.
C was always a low-level language, and it still is
in people's heads. We read/write C, but we are thinking
about low-level machine code.
People tend to prefer optimization to code readability,
although the compiler will do the same (and additional)
optimizations.
BTW how many people know that external linkage is usually
done at run time in a very complex manner?
When I read a C program, I see a function call as a simple
assembler subroutine call (+ argument passing etc),
and reading a variable as an access to a specific location;
but reality, with external linkage, is very different.
So I think (optional) support for bounds checking
is a good thing, but I doubt that people, me included,
will use such a feature. I think (wrongly) that
I can write good bug-free C (and other languages).
ciao
cate
> > So, why do people complain all the time about the possibility of
> > buffer overflows in C, but not in other languages?
> >
>
> Because functions like gets() asctime() and other standard
> functions (still in C99 standard even if gets() got deprecated)
> make buffer overflows almost MANDATORY.
rubbish. Yes gets() is a problem. No one in their right mind
uses gets(). But asctime() *can* be used safely. Just make sure
the buffer is big enough. Now can you name any other standard
function that *cannot* be used safely?
> Zero terminated strings, where there are no bounds checking make it
> almost impossible to avoid errors
nonsense. Plenty of C gets written using zero terminated
strings which is perfectly fine.
> since it requires from the programmer
> never to forget the lengths of buffers!
you can get the program to do that for you
> I have proposed a string library for C to make those errors more
> difficult. The reception was as expected... :-(
I don't think anyone objects to string libraries (at least
a couple of regulars have their own) the argument was against
incorporating a string library into the standard. Actually
I'd be interested. I've used C++ strings and it does make
some things easier.
--
Nick Keighley
Programming should never be boring, because anything
mundane and repetitive should be done by the computer.
~Alan Turing
Extremely short sighted. Even if limited to readers on c.s.c
(which it was not, since it was cross-posted) it is ridiculous to
suggest that readers should maintain total ignorance of anything
written outside of the actual standard. I have no knowledge of the
book in question, but such information is essential to anyone
daring to modify the actual C standard.
--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.
Because the basic design of C is such as to make automatic machine
checking of possible overflows impracticable to implement. That
was a choice during development.
No you can't make sure the buffer is big enough. The standard mandates
a 26 byte buffer.
> Now can you name any other standard
> function that *cannot* be used safely?
>
strncpy
>
>> Zero terminated strings, where there are no bounds checking make it
>> almost impossible to avoid errors
>
> nonsense. Plenty of C gets written using zero terminated
> strings which is perfectly fine.
>
Sure sure.
>> since it requires from the programmer
>> never to forget the lengths of buffers!
>
> you can get the program to do that for you
>
No
>> I have proposed a string library for C to make those errors more
>> difficult. The reception was as expected... :-(
>
> I don't think anyone objects to string libraries (at least
> a couple of regulars have their own) the argument was aginst
> incorporating a string library into the standard. Actually
> I'd be interested. I've used C++ strings and it does make
> some things easier.
>
The library uses operator overloading with counted strings.
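The counted-string idea itself can be shown in plain C without operator overloading (a minimal sketch, not the library mentioned above; cstr, cstr_init and cstr_append are made-up names). The length and capacity travel with the data, so the append routine can refuse a write that would not fit:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    size_t len;   /* characters in use, not counting the terminator */
    size_t cap;   /* bytes available in data, including the terminator */
    char  *data;  /* caller-provided storage */
} cstr;

/* Bind a counted string to existing storage and make it empty. */
void cstr_init(cstr *s, char *storage, size_t cap)
{
    assert(s != NULL && storage != NULL && cap > 0);
    s->len = 0;
    s->cap = cap;
    s->data = storage;
    s->data[0] = '\0';
}

/* Append text if it fits; otherwise leave s unchanged and return false. */
bool cstr_append(cstr *s, const char *text)
{
    size_t n = strlen(text);
    if (n >= s->cap - s->len)
        return false;                       /* no room for text + terminator */
    memcpy(s->data + s->len, text, n + 1);  /* copies the terminator too */
    s->len += n;
    return true;
}
```

The data stays null-terminated, so it can still be handed to functions that expect an ordinary C string.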
That was rather my point; it should not have been cross-posted to
comp.std.c, it should only have been posted to comp.lang.c.
> ... it is ridiculous to
> suggest that readers should maintain total ignorance of anything
> written outside of the actual standard. I have no knowledge of the
> book in question, but such information is essential to anyone
> daring to modify the actual C standard.
I'm not suggesting ignorance, I'm suggesting that there are appropriate
places to discuss various topics. The general topic of buffer overruns
is an important one that many people should be interested in, and should
discuss - but only in the appropriate forum - comp.std.c is not such a
forum.
Discussions of whether and how the standard should be modified to make
it easier to avoid buffer overruns would be entirely on topic in
comp.std.c (and arguably (everything on c.l.c is arguable!) off-topic
for comp.lang.c). I've seen no such discussion so far.
Such libraries permit new types of attack.
Current strings force reading the memory from the
beginning, so a read terminates early
(at a "random" 0 character, or with a segmentation fault).
With new libraries, code could use the
new fields to jump straight to the end of
the string, which might lie in kernel or other library space.
OTOH new strings will reduce programmer errors, and
thus the starting points for attacks.
Anyway, alternate string libraries already exist,
but it seems that they are not widely used, so
I don't think they are ready to be standardized.
BTW, an alternate string library should really
have a good design, allowing simple plug-in
of i18n, thus reducing transition costs.
FYI I found n1173, with some rationale on these goals:
1.1.6 Preserve the null terminated string datatype
1.1.7 Do not require size arguments for unmodified strings
1.1.9 Library based solution
ciao
cate
That won't do the job. The right way to use asctime() safely is to make
sure its argument points at a structure whose contents won't cause a
buffer overrun.
> No you can't make sure the buffer is big enough.
You're right about that, but not about this:
> ... The standard mandates
> a 26 byte buffer.
The standard mandates that the actual implementation be equivalent to the
specified implementation. Since the consequences of writing past the end
of a buffer are that the behavior is undefined, there is no behavior
that is prohibited to asctime() when called with a pointer to a struct
tm whose contents would cause that buffer to overflow. Which means that
an implementation which used a longer buffer would be just as conforming
as one which did not.
>> Now can you name any other standard
>> function that *cannot* be used safely?
>>
>
> strncpy
A single counter-example is sufficient to disprove a general assertion
like "strncpy() cannot be used safely". If your assertion were true,
there would have to be some safety issue with the following code.
Would you care to explain what you think that issue is?
#include <string.h>
#include <stdio.h>
int main(void)
{
    char input[] = "This is the input string";
    char output[10];
    size_t size = (sizeof output - 1 < sizeof input ?
                   sizeof output - 1 : sizeof input);
    strncpy(output, input, size);
    output[size] = '\0';
    printf("%s\n", output);
    return 0;
}
The '-1' and the explicitly forced null termination are needed only if
downstream processing requires that output[] be a null-terminated
string. That is true in the code above, but has not been true in any of
the cases where I've had a reason to use strncpy() recently, which
makes safe usage of strncpy() a lot simpler.
If "input" and "output" were pointers rather than arrays, determining the
correct value for the third argument of strncpy() would be more
complicated, but is still feasible.
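For the pointer case, one workable approach is to pass the destination's capacity alongside the pointer. What follows is only a sketch: the name copy_truncated and its parameters are made up for illustration and are not part of any standard library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy at most dst_size - 1 characters of src into
   dst and always null-terminate. Returns 1 on success, 0 if the
   arguments are unusable (null pointer or zero-sized destination). */
int copy_truncated(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0)
        return 0;
    /* strncpy() pads with '\0' when src is shorter than the count and
       leaves dst unterminated when src is longer, so terminate by hand. */
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
    return 1;
}
```

The caller is still responsible for passing the true size of the destination; nothing in C lets the function discover it from the pointer alone.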
Yes, asctime() provides its own static buffer. But you can use
asctime() safely by ensuring that the arguments (or rather, the
members of the struct tm object pointed to by the single argument) are
within safe bounds.
asctime() is a poorly designed and specified function. It can't be
changed *too* much without breaking existing code. It *could* be
tweaked to require implementation-defined but safe behavior for
out-of-bounds arguments, and I would support such a change. But I
suspect that most existing code that uses asctime() uses it with an
argument corresponding to the current time, which will be safe until
the end of the year 9999.
It absolutely is not as dangerous as gets(). gets() cannot be used
safely (unless you have absolute control over what will appear on
stdin, which is not normally possible). asctime() *can* be used
safely with some care, and it usually is. For example, the following
program is safe if it's executed before the year 10,000 (assuming the
time() function doesn't misbehave):
#include <stdio.h>
#include <time.h>
int main(void)
{
    time_t now = time(NULL);
    fputs(asctime(localtime(&now)), stdout);
    return 0;
}
>> Now can you name any other standard
>> function that *cannot* be used safely?
>
> strncpy
You are mistaken. strncpy() is poorly named, and is difficult to use
safely if you don't understand what it actually does, but it most
certainly can be used safely, if you happen to need the rather odd
data structure that it supports.
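To make the "rather odd data structure" concrete: strncpy() fills a fixed-width field that is padded with null characters and is not necessarily null-terminated. A minimal sketch, assuming a made-up record type with an 8-byte name field (struct record, NAME_WIDTH and both function names are hypothetical):

```c
#include <stdio.h>
#include <string.h>

#define NAME_WIDTH 8

/* Hypothetical record with a fixed-width, null-padded name field.
   The field is NOT a C string when the name fills all 8 bytes. */
struct record {
    char name[NAME_WIDTH];
};

void record_set_name(struct record *r, const char *name)
{
    /* Copies at most NAME_WIDTH bytes and zero-fills any remainder. */
    strncpy(r->name, name, sizeof r->name);
}

/* Format the field into a caller-supplied buffer, which should hold at
   least NAME_WIDTH + 1 bytes. The precision limits how many bytes of
   the field are read, so a missing terminator is harmless. */
void record_get_name(const struct record *r, char *out, size_t out_size)
{
    snprintf(out, out_size, "%.*s", NAME_WIDTH, r->name);
}
```

Used this way, no explicit termination step is needed, because nothing ever treats the field itself as a string.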
Yes, I can write unsafe code using strncpy, but that wasn't the
question. I can write unsafe code without using any library functions
at all.
[...]
>>> I have proposed a string library for C to make those errors more
>>> difficult. The reception was as expected... :-(
>> I don't think anyone objects to string libraries (at least
>> a couple of regulars have their own) the argument was against
>> incorporating a string library into the standard. Actually
>> I'd be interested. I've used C++ strings and it does make
>> some things easier.
>
> The library uses operator overloading with counted strings.
I think that explains the lack of enthusiasm. A proposal for a
library that can be used with C compilers other than yours might have
gotten a better reception.
--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Anyone daring to modify the actual C standard should certainly have
sources of information beyond comp.std.c.
The C standard shows a piece of code that will overflow its static
buffer if used with a year value greater than 8900 (if I remember
correctly)
Similarly, if the month value is greater than 12 it will
show UB.
Obviously, showing such a piece of code is a reminder to the rest
of the world how much the standard cares about buffer overflows.
The discussion in this group confirms this. Look at Mr Thomson:
> asctime() is a poorly designed and specified function. It can't be
> changed *too* much without breaking existing code.
Absolutely not. I derived the formula for the EXACT size of the buffer
in this discussion group. It is relatively simple. The only thing that
needs to be changed is the "26" in the size of the buffer.
I mailed a correction of asctime to Mr Plauger, probably a member of the
committee. Never an answer.
> It *could* be
> tweaked to require implementation-defined but safe behavior for
> out-of-bounds arguments, and I would support such a change.
Then why don't you support it now and act to get rid of the
buffer-overflowing code written in the C standard document?
> But I
> suspect that most existing code that uses asctime() uses it with an
> argument corresponding to the current time, which will be safe until
> the end of the year 9999.
>
No, the end is 8100 since you add 1900
> It absolutely is not as dangerous as gets(). gets() cannot be used
> safely (unless you have absolute control over what will appear on
> stdin, which is not normally possible). asctime() *can* be used
> safely with some care, and it usually is. For example, the following
> program is safe if it's executed before the year 10,000 (assuming the
> time() function doesn't misbehave):
>
> #include <stdio.h>
> #include <time.h>
> int main(void)
> {
> time_t now = time(NULL);
> fputs(asctime(localtime(&now)), stdout);
> return 0;
> }
>
>>> Now can you name any other standard
>>> function that *cannot* be used safely?
>> strncpy
>
> You are mistaken. strncpy() is poorly named, and is difficult to use
> safely if you don't understand what it actually does, but it most
> certainly can be used safely, if you happen to need the rather odd
> data structure that it supports.
>
After all those "poorly named", "difficult to use", "odd data structure"
couldn't we get RID OF THAT PIECE OF ... ???
> Yes, I can write unsafe code using strncpy, but that wasn't the
> question. I can write unsafe code without using any library functions
> at all.
>
You misunderstand the whole point. Apparently you do not understand what
ERROR PRONE
means?
Would you drive a car that kills you at the slightest mistake
for years and years?
You CAN drive a car like that if you never make any mistakes
obviously. And after several thousand people have died you
CAN say:
They are just bad drivers. They made mistakes.
> [...]
>
>>>> I have proposed a string library for C to make those errors more
>>>> difficult. The reception was as expected... :-(
>>> I don't think anyone objects to string libraries (at least
>>> a couple of regulars have their own) the argument was against
>>> incorporating a string library into the standard. Actually
>>> I'd be interested. I've used C++ strings and it does make
>>> some things easier.
>> The library uses operator overloading with counted strings.
>
> I think that explains the lack of enthusiasm. A proposal for a
> library that can be used with C compilers other than yours might have
> gotten a better reception.
>
You just do not want to understand. I presented it as an example
of the direction that C could take, that is why I presented it in
this group.
Easier said than done, unless you know for sure that all the values in
the struct tm are within their normal ranges and that the value of
tm_year is between -2899 and 8099.
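A caller who wants that assurance can check the members before calling asctime(). The sketch below uses the normal ranges from 7.23.1p4 plus the tm_year limits given above; the function name tm_fits_asctime is hypothetical.

```c
#include <time.h>

/* Hypothetical check: returns 1 if every member that asctime() reads is
   within its normal range and tm_year yields a year of at most four
   characters (-999 to 9999), 0 otherwise. */
int tm_fits_asctime(const struct tm *t)
{
    return t != NULL
        && t->tm_wday >= 0 && t->tm_wday <= 6
        && t->tm_mon  >= 0 && t->tm_mon  <= 11
        && t->tm_mday >= 1 && t->tm_mday <= 31
        && t->tm_hour >= 0 && t->tm_hour <= 23
        && t->tm_min  >= 0 && t->tm_min  <= 59
        && t->tm_sec  >= 0 && t->tm_sec  <= 60
        && t->tm_year >= -2899 && t->tm_year <= 8099;
}
```

A typical use would be to call asctime() only when this returns 1 and to fall back to some fixed error string otherwise.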
--
Larry Jones
I kind of resent the manufacturer's implicit assumption
that this would amuse me. -- Calvin
And you consider this makes you exceptional? :-)
You don't have control over the buffer. Best to receive it in a:
const char *buff;
...
buff = asctime(...);
From the std:
7.23.3.1 The asctime function
Synopsis
[#1]
#include <time.h>
char *asctime(const struct tm *timeptr);
Description
[#2] The asctime function converts the broken-down time in
the structure pointed to by timeptr into a string in the
form
Sun Sep 16 01:03:52 1973\n\0
using the equivalent of the following algorithm.
char *asctime(const struct tm *timeptr) {
static const char wday_name[7][3] = {
"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"
};
static const char mon_name[12][3] = {
"Jan", "Feb", "Mar", "Apr", "May", "Jun",
"Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
};
static char result[26];
^^^^^^
Note this word.
Especially because those limits ARE NOT EVEN MENTIONED in the
standards document. They can be inferred by reading the code
and seeing where it would overflow!
The standard doesn't care about anything. The C committee should care
about buffer overruns, and does. Doug Gwyn is a committee member, and
one of the people posting to comp.std.c who argues most strenuously
for keeping the standard as it is currently written; yet even he
expressed sympathy (2006-07-05) for the concept of modifying the
standard in the manner Keith Thompson suggested, to deal with this
issue.
...
> > It *could* be
> > tweaked to require implementation-defined but safe behavior for
> > out-of-bounds arguments, and I would support such a change.
>
> Then why don't you support it now and act to get rid of the
> buffer-overflowing code written in the C standard document?
His comment expresses support, right now - why do you think otherwise?
What action are you suggesting that he should do and has not done to
get rid of it? Do you misunderstand the standardization process so
badly that you think that if a given change is desired right now, an
appropriately updated version of the standard will immediately follow?
The process takes a little longer than that.
> > But I
> > suspect that most existing code that uses asctime() uses it with an
> > argument corresponding to the current time, which will be safe until
> > the end of the year 9999.
> >
>
> No, the end is 8100 since you add 1900
The maximum year is 9999, as Keith said. You're referring to the
maximum safe value of tm_year, which is 8099. 8100 is the first value
after that which is unsafe, which is a different matter.
> > Yes, I can write unsafe code using strncpy, but that wasn't the
> > question. I can write unsafe code without using any library functions
> > at all.
> >
>
> You misunderstand the whole point. Apparently you do not understand what
>
> ERROR PRONE
>
> means?
How did you reach that conclusion? The question wasn't about functions
that are error prone. The question was "Now can you name any other
standard function that *cannot* be used safely?". Those are two very
different things.
Please have the courtesy to spell my name right. Copy-and-paste it if
you have to.
>> asctime() is a poorly designed and specified function. It can't be
>> changed *too* much without breaking existing code.
>
> Absolutely not. I derived the formula for the EXACT size of the buffer
> in this discussion group. It is relatively simple. The only thing that
> needs to be changed is the "26" in the size of the buffer.
>
> I mailed a correction of asctime to Mr Plauger, probably a member of the
> committee. Never an answer.
Sorry to hear that. Without seeing your e-mail, I won't speculate on
why he didn't respond, but I'm sure he gets a lot of e-mail.
>> It *could* be
>> tweaked to require implementation-defined but safe behavior for
>> out-of-bounds arguments, and I would support such a change.
>
> Then why don't you support it now and act to get rid of the
> buffer-overflowing code written in the C standard document?
I just stated my support in a posting to this newsgroup. I'm not a
member of the committee. What else do you expect me to do?
Implementers can already implement asctime() in a way that avoids
buffer overflows; I presume you've done so for lcc-win.
BTW, given the current specification and the existence of code that
depends on it, I'd recommend truncating fields rather than making the
buffer bigger. Code that uses asctime() might reasonably do something
like this:
char buffer[26];
char *result = asctime(/* ... */);
strcpy(buffer, result);
Shifting the buffer overflow from asctime() to the code that uses it
isn't particularly helpful.
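As a rough illustration of keeping the 26-byte buffer, here is a sketch of an asctime()-like function that cannot write past it. It is not how any particular implementation does it: the name bounded_asctime is hypothetical, it simply cuts off the tail of an over-long result (so the trailing newline can be lost) rather than truncating individual fields, and it substitutes "???" for out-of-range day or month indices.

```c
#include <stdio.h>
#include <time.h>

/* Sketch of an asctime()-like function whose output always fits in a
   26-byte static buffer. The name bounded_asctime is hypothetical. */
char *bounded_asctime(const struct tm *timeptr)
{
    static const char wday_name[7][4] = {
        "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"
    };
    static const char mon_name[12][4] = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };
    static char result[26];

    /* Guard the array indices instead of trusting the caller. */
    const char *wd = (timeptr->tm_wday >= 0 && timeptr->tm_wday < 7)
                         ? wday_name[timeptr->tm_wday] : "???";
    const char *mn = (timeptr->tm_mon >= 0 && timeptr->tm_mon < 12)
                         ? mon_name[timeptr->tm_mon] : "???";

    /* snprintf() writes at most sizeof result bytes, including the
       terminating null, so an over-long result is cut off. The year is
       computed in long long to avoid overflowing int. */
    snprintf(result, sizeof result, "%.3s %.3s%3d %.2d:%.2d:%.2d %lld\n",
             wd, mn, timeptr->tm_mday, timeptr->tm_hour,
             timeptr->tm_min, timeptr->tm_sec,
             1900LL + timeptr->tm_year);
    return result;
}
```

For in-range arguments the output matches the standard's algorithm, so code that copies the result into its own 26-byte buffer keeps working.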
>> But I
>> suspect that most existing code that uses asctime() uses it with an
>> argument corresponding to the current time, which will be safe until
>> the end of the year 9999.
>>
>
> No, the end is 8100 since you add 1900
Calling asctime() with timeptr->tm_year == 8100 will cause a buffer
overflow. That corresponds to the year 10000.
As I said, calling asctime() with an argument corresponding to the
current time will be safe until the end of the year 9999. I did *not*
refer to a tm_year value 9999.
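The arithmetic can be checked mechanically. The sketch below asks snprintf() how many characters the standard's format string would produce for a given struct tm; the result fits the 26-byte buffer only if that count is at most 25. The function name asctime_length is hypothetical, and fixed three-letter names stand in for the day and month lookups.

```c
#include <stdio.h>
#include <time.h>

/* Number of characters (excluding the terminating null) that the format
   string from the standard's asctime() algorithm would produce for *t.
   Assumes tm_wday and tm_mon are in range, so each name is exactly three
   characters, and that 1900 + tm_year does not overflow int. */
int asctime_length(const struct tm *t)
{
    /* With a null buffer and size 0, snprintf() only counts. */
    return snprintf(NULL, 0, "%.3s %.3s%3d %.2d:%.2d:%.2d %d\n",
                    "Sun", "Jan", t->tm_mday, t->tm_hour,
                    t->tm_min, t->tm_sec, 1900 + t->tm_year);
}
```

With the other members in their normal ranges, tm_year == 8099 gives 25 and tm_year == 8100 gives 26, which is one byte too many once the terminating null is added.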
But that's a minor point. The point is that asctime(), unlike gets(),
can be used safely. Please don't react to that simple statement as if
I were defending the design of asctime() or opposing any changes in
its specification.
>> It absolutely is not as dangerous as gets(). gets() cannot be used
>> safely (unless you have absolute control over what will appear on
>> stdin, which is not normally possible). asctime() *can* be used
>> safely with some care, and it usually is. For example, the following
>> program is safe if it's executed before the year 10,000 (assuming the
>> time() function doesn't misbehave):
>> #include <stdio.h>
>> #include <time.h>
>> int main(void)
>> {
>> time_t now = time(NULL);
>> fputs(asctime(localtime(&now)), stdout);
>> return 0;
>> }
>>
>>>> Now can you name any other standard
>>>> function that *cannot* be used safely?
>>> strncpy
>> You are mistaken. strncpy() is poorly named, and is difficult to use
>> safely if you don't understand what it actually does, but it most
>> certainly can be used safely, if you happen to need the rather odd
>> data structure that it supports.
>>
>
> After all those "poorly named", "difficult to use", "odd data structure"
> couldn't we get RID OF THAT PIECE OF ... ???
It would break existing code. Some existing code that uses strncpy()
undoubtedly uses it incorrectly. But some of it uses it correctly and
safely. If it's not useful to you, don't use it.
You claimed that strncpy "*cannot* be used safely". It has been
proven by example, that it can. Why are you unwilling to admit your
mistake?
>> Yes, I can write unsafe code using strncpy, but that wasn't the
>> question. I can write unsafe code without using any library functions
>> at all.
>>
>
> You misunderstand the whole point. Apparently you do not understand what
>
> ERROR PRONE
>
> means?
<sarcasm>Gosh, maybe you could explain it to me. Please use small
words.</sarcasm>
> Would you drive a car that kills you at the slightest mistake
> for years and years?
>
> You CAN drive a car like that if you never make any mistakes
> obviously. And after several thousand people have died you
> CAN say:
>
> They are just bad drivers. They made mistakes.
I *do* drive a car that can kill me at the slightest mistake. If I
turn the steering wheel in the wrong direction or hit the accelerator
at the wrong time, it can veer into oncoming traffic. Even with
airbags and seatbelts, I might not survive a head-on collision at a
relative velocity over 100 mph. Or I could drive off a cliff, or into
a wall. The car isn't smart enough to stop me from doing something
stupid. And yet I and millions of other people continue to drive, and
thousands are killed every year.
Sorry, what was the point of this metaphor?
>> [...]
>>
>>>>> I have proposed a string library for C to make those errors more
>>>>> difficult. The reception was as expected... :-(
>>>> I don't think anyone objects to string libraries (at least
>>>> a couple of regulars have their own) the argument was against
>>>> incorporating a string library into the standard. Actually
>>>> I'd be interested. I've used C++ strings and it does make
>>>> some things easier.
>>> The library uses operator overloading with counted strings.
>> I think that explains the lack of enthusiasm. A proposal for a
>> library that can be used with C compilers other than yours might have
>> gotten a better reception.
>>
>
> You just do not want to understand. I presented it as an example
> of the direction that C could take, that is why I presented it in
> this group.
You presented a string library that depends on adding a major new
feature to the language, one that's highly controversial. You're
surprised that it wasn't greeted with enthusiasm. I'm not. Which one
of us just does not want to understand?
No, that's not what makes me exceptional. 8-)}
Not really since there is nothing almost mandatory about using the
functions in question.
> Zero terminated strings, where there are no bounds checking make it
> almost impossible to avoid errors since it requires from the programmer
> never to forget the lengths of buffers!
Not really. You can pass the length around or use one of the C libraries
written in standard C (which excludes yours) and include it as part of
your C code-base if you want.
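As a rough idea of what "passing the length around" can look like in standard C, here is a minimal sketch of a counted buffer. The type struct strbuf and the function strbuf_append are invented for this example and do not come from any existing library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical counted-string type: the capacity travels with the data. */
struct strbuf {
    char  *data;   /* storage supplied by the caller */
    size_t cap;    /* total bytes available in data */
    size_t len;    /* bytes in use, excluding the terminating null */
};

/* Append src only if it fits entirely, including the terminating null.
   Returns 1 on success, 0 if there is not enough room; the buffer is
   left unchanged in that case. */
int strbuf_append(struct strbuf *b, const char *src)
{
    size_t n = strlen(src);
    if (b->len >= b->cap || n > b->cap - b->len - 1)
        return 0;
    memcpy(b->data + b->len, src, n + 1);  /* n + 1 copies the null too */
    b->len += n;
    return 1;
}
```

The data remains an ordinary null-terminated string, so it can still be handed to the standard library functions.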
> I have proposed a string library for C to make those errors more
> difficult. The reception was as expected... :-(
The reasons it was badly received had nothing to do with safety.
I started off agreeing that the discussion of the book you posted about
would be topical in comp.lang.c but yet again your misrepresentation of
other peoples positions and taking any disagreement as a personal attack
lost you all sympathy.
With regards to the book, in general proving that a program does not
contain a buffer overflow is equivalent to the halting problem. E.g.,
prove there is no buffer overflow in the following snippet *without*
solving the halting problem:
int fred[5];
fred[may_not_halt()] = 0;
You cannot prove may_not_halt() never returns something greater than 4
(in general) without solving the halting problem.
The book could well still provide useful techniques.
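Where a static proof is out of reach, the usual fallback is a run-time check on the index. A minimal sketch, with a made-up helper name (checked_store):

```c
#include <stddef.h>

/* Hypothetical helper: store value at arr[idx] only if idx lies within
   [0, n). Returns 1 if the store happened, 0 otherwise. */
int checked_store(int *arr, size_t n, long idx, int value)
{
    if (idx < 0 || (unsigned long)idx >= n)
        return 0;
    arr[idx] = value;
    return 1;
}
```

With this, checked_store(fred, 5, may_not_halt(), 0) cannot overflow fred whatever the function returns, although of course it does nothing about whether the function halts.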
--
Flash Gordon
OK.
Where is the newsgroup where discussions about repairing the most significant
defect in the C language are topical?
The same answer always:
"There are no problems with C. Only with lazy programmers
that do not know how to do their job".
comp.lang.c has nothing to do with the real world.
Since the C language is defined by the standard, any repair
necessarily takes the form of a change to some future version of the
standard. As I said earlier, referring to comp.std.c:
> If the book ... has suggestions for future versions of the C standard, it would also be on
> topic to discuss those suggestions.
So far, I've seen nothing to suggest that this book contains any
proposed changes to the C standard. It might, and if it does, I'd be
happy to see them discussed on comp.std.c. However, no aspect of that
book which is not so connected to the C standard would be on-topic.
For example, things that would not be on-topic in c.s.c include:
* Suggestions about how to write your code to avoid buffer overruns
* Design details of utilities for detecting code which might produce
buffer overruns,
* Discussions about compiler features that would help avoid buffer
overruns.
Things that would be on-topic in c.s.c:
* Discussions about whether a given compiler feature to avoid buffer
overruns is allowed for a standard-conforming implementation of C.
* Discussions about whether a compiler extension to avoid buffer
overruns should be standardized.
comp.std.c.
Is that what we're discussing? Are there proposals for changes to the
next version of the standard?
You misunderstand my comment as suggesting I was siding one way or the
other.
> Here is the disassembly for the provided program on a C compiler:
Fascinating but I suspect irrelevant. What the compiler or runtime does
prior to starting the main routine is entirely a QOI issue. I should
imagine all sorts of arrays get created, name of programme, copy of
cmdline arguments, copy of environment, stuff associated with stdin etc.
So what? Is any of this compulsorily written in C? It could be in
assembler, pascal or web (the knuth one) for all it matters.
--
Mark McIntyre
CLC FAQ <http://c-faq.com/>
CLC readme: <http://www.ungerhu.com/jxh/clc.welcome.txt>
In this instance, Jacob is right. His originating post to the
thread was met just nineteen minutes later by a response that had
all the stigmata of a personal attack, or at the very least a taunt
and provocation. Things then went predictably downhill.
Just to remind you: library functions need not be written in C.
So what you're saying is - it's the fault of the C programming language
if someone writes a library function in another language, and that
library function contains a buffer overflow.
Hm?
> To say: "I don't see any buffer use." when examining a fragment of C
> code does not mean that buffer overruns will not happen because of using
> a C compiler.
This is true, but axiomatic and useless. Exactly the same can be said
about any feature or bug that you cannot see in the code sample on display.
> I think that this is a very important subject that
> directly focuses on the chief criticism of the C language.
It is an important subject, but one won't help discussion of it by
accidentally blaming the C language for stuff which is in fact nothing
to do with it.
This is where your argument falls down. Remember that library functions
need not be written in the same language as the compiler is compiling.
Seems to me you're saying that
- the C runtime environment's startup code may call library functions
- those library functions might contain a buffer overflow
- therefore C is unsafe.
Is it not the case that
- the Ada runtime environment's startup code may call library functions
- those library functions might contain a buffer overflow
- therefore Ada is exactly as unsafe as C.
Unless you're asserting that no Ada compiler has libraries and the
startup code never calls any functions?
Right. I said that.
> So what you're saying is - it's the fault of the C programming language if
> someone writes a library function in another language, and that library
> function contains a buffer overflow.
>
> Hm?
>
>> To say: "I don't see any buffer use." when examining a fragment of C code
>> does not mean that buffer overruns will not happen because of using a C
>> compiler.
>
> This is true, but axiomatic and useless. Exactly the same can be said
> about any feature or bug that you cannot see in the code sample on
> display.
My point was that elimination of buffer overruns will not eliminate buffer
overruns from C.
In order to do that, we would have to eliminate buffer overruns from the C
library and C compilers as well.
>> I think that this is a very important subject that directly focuses on
>> the chief criticism of the C language.
>
> It is an important subject, but one won't help discussion of it by
> accidentally blaming the C language for stuff which is in fact nothing to
> do with it.
I am surprised that you don't see any connection between buffer overruns and
the C language.
It's a fundamental defect inherent in its design.
Correct. Also, the compiler may inject unsafe code snippets that are
nowhere contained in the C library.
> Is it not the case that
> - the Ada runtime environment's startup code may call library functions
> - those library functions might contain a buffer overflow
> - therefore Ada is exactly as unsafe as C.
If the Ada libraries and compiler are written in C then Ada is exactly as
unsafe as C. If the Ada compiler and runtime libraries are written in Ada,
then that instance of Ada is safer than a compiler manufactured using C.
> Unless you're asserting that no Ada compiler has libraries and the startup
> code never calls any functions?
No, but I am asserting that the design of Ada is safer than the design of C.
e.g. from: http://en.wikipedia.org/wiki/Ada_programming_language
"Ada supports run-time checks in order to protect against access to
unallocated memory, buffer overflow errors, off by one errors, array access
errors, and other avoidable bugs."
I am not saying (for instance) that this design decision was necessarily
better than the design decisions that went into making the C programming
language. What I am saying is that there are problems generated by the
design decisions that went into making the C language. Let's think about
these problems and see if we can find effective solutions or partial
solutions.
Dann Corbit said:
<snip>
> My point was that elimination of buffer overruns will not eliminate
> buffer overruns from C.
> In order to do that, we would have to eliminate buffer overruns from the
> C library and C compilers as well.
You seem to be saying that you can't eliminate buffer overruns without
eliminating buffer overruns, which is trivially true, but unhelpful. Yes,
it's true that an implementation might inject buffer overruns into the
code. Any implementation of any language, itself written in any language,
might do this. So by your reasoning all program code is suspect,
regardless of the language in which it is written. So why blame C?
<snip>
> I am surprised that you don't see any connection between buffer overruns
> and the C language.
I am surprised that you don't see any connection between buffer overruns
and programming languages in general.
If you *can't* overrun a buffer in a given language, then that language
isn't as powerful as it could be and, some would say, as it ought to be.
Safety restrictions, no matter how praiseworthy they may be on their own
merits, nevertheless represent a diminution of freedom and power.
> It's a fundamental defect inherent in its design.
No, it's a fundamental risk inherent in the provision of power - i.e. that
the power might be misused, either through accident or design.
If you want Ada, you know where to find it.
--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
"Usenet is a strange place" - dmr 29 July 1999
Dann Corbit said:
> "Mark McIntyre" <markmc...@TROUSERSspamcop.net> wrote in message
> news:7UMhk.19368$yq3....@en-nntp-07.am2.easynews.com...
<snip>
>>
>> Seems to me you're saying that
>> - the C runtime environment's startup code may call library functions
>> - those library functions might contain a buffer overflow
>> - therefore C is unsafe.
>
> Correct. Also, the compiler may inject unsafe code snippets that are
> nowhere contained in the C library.
Any translator for any language might do this. Therefore, all languages are
unsafe.
>> Is it not the case that
>> - the Ada runtime environment's startup code may call library functions
>> - those library functions might contain a buffer overflow
>> - therefore Ada is exactly as unsafe as C.
>
> If the Ada libraries and compiler are written in C then Ada is exactly as
> unsafe as C. If the Ada compiler and runtime libraries are written in
> Ada, then that instance of Ada is safer than a compiler manufactured
> using C.
No, because the Ada compiler might inject buffer overruns into the
translated binary - possibly because the compiler used to compile it
injected code to make it do so. Reflections on trusting trust...
>> Unless you're asserting that no Ada compiler has libraries and the
>> startup code never calls any functions?
>
> No, but I am asserting that the design of Ada is safer than the design of
> C. e.g. from: http://en.wikipedia.org/wiki/Ada_programming_language
> "Ada supports run-time checks in order to protect against access to
> unallocated memory, buffer overflow errors, off by one errors, array
> access errors, and other avoidable bugs."
None of those things mean spit if the implementation is compromised in the
manner you've been suggesting.
> I am not saying (for instance) that this design decision was necessarily
> better than the design decisions that went into making the C programming
> language. What I am saying is that there are problems generated by the
> design decisions that went into making the C language. Let's think about
> these problems and see if we can find effective solutions or partial
> solutions.
We could start by improving people's understanding of C - by finding a
better way to teach people about strings, buffers, memory in general. I
have yet to discover any mainstream teaching material that does this
particularly well.
The normal ranges of the struct tm members are mentioned quite
prominently in 7.23.1p4 where they're specified. I would have thought
it fairly obvious that passing out of range values to any function that
doesn't explicitly document that it allows them is a bad idea and likely
to result in undefined behavior. The only one that needs to be inferred
is the range for tm_year.
--
Larry Jones
Wow, how existential can you get? -- Hobbes
[followups overridden for now, since what I have to say is relevant to
comp.std.c]
> Dann Corbit said:
[...]
>> I am surprised that you don't see any connection between buffer overruns
>> and the C language.
>
> I am surprised that you don't see any connection between buffer overruns
> and programming languages in general.
>
> If you *can't* overrun a buffer in a given language, then that language
> isn't as powerful as it could be and, some would say, as it ought to be.
> Safety restrictions, no matter how praiseworthy they may be on their own
> merits, nevertheless represent a diminution of freedom and power.
>
>> It's a fundamental defect inherent in its design.
>
> No, it's a fundamental risk inherent in the provision of power - i.e. that
> the power might be misused, either through accident or design.
>
> If you want Ada, you know where to find it.
Ada is as powerful as C. It doesn't forbid unsafe actions, it merely
requires you to specify them explicitly in most cases. For example,
to interpret an integer as a pointer (something C allows with a simple
cast), you have to instantiate Unchecked_Conversion and then call the
instance. (Strictly speaking, the C cast performs a type conversion,
not a reinterpretation, but it's implemented as a reinterpretation in
every implementation I've seen.)
It would be nice if C could achieve similar safety in normal use,
while still allowing unsafe constructs that are sometimes necessary in
practice, without changing the language so radically that most
existing code would be broken. I'm skeptical that this is possible,
but I'd be interested in seeing any concrete proposals.
If you want C with Ada-like safety (say, because you really like curly
braces), I think your best bet is to invent a new language.
Yes, but calling asctime() with tm_hour==99 and all other members in
their normal ranges *doesn't* invoke undefined behavior, because the
implementation is required to use an algorithm equivalent to the
sample code presented in the standard.
I understand why the definition of asctime() was presented using
sample code, unlike every other function in the standard; it's much
easier to describe it that way than it would be in English. But a
consequence of that decision is that the behavior in the edge cases
can be a bit odd.
(Yes, sample implementations are provided for rand() and srand(), but
an actual implementation isn't required to be equivalent to the
samples.)
In my opinion:
(1) The standard committee should re-open DR 217
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_217.htm> and
fix the definition of asctime() so that (a) it remains compatible
with the current definition, and (b) undefined behavior does not
occur for any argument that's a valid pointer to a struct tm;
(2) Implementations should implement asctime() in a way that doesn't
cause an internal buffer overflow, even though the standard
doesn't currently require this (I think this is most safely done
*without* making the buffer bigger); and
(3) Users should either use asctime() very carefully, or should avoid
it in favor of the more versatile strftime().
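For point (3), a sketch of producing the asctime() layout with strftime(), where the caller supplies the buffer and its size. The function name format_like_asctime is hypothetical, and the day and month names match asctime() only in the "C" locale.

```c
#include <stddef.h>
#include <time.h>

/* Format *t in the asctime() layout using strftime(). Returns the number
   of characters written (excluding the terminating null), or 0 if the
   result did not fit in size bytes. */
size_t format_like_asctime(char *buf, size_t size, const struct tm *t)
{
    /* %e (C99) is the day of the month, padded with a leading space. */
    return strftime(buf, size, "%a %b %e %H:%M:%S %Y\n", t);
}
```

Unlike asctime(), the write is bounded by the size argument, so a five-digit year produces a 0 return with a 26-byte buffer instead of an overrun.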
I think that this should be modified to "a valid pointer to a struct
tm, none of whose members contain a trap representation". For most
implementations, that won't make any difference. However, for
implementations where 'int' has trap representations, ensuring defined
behavior when that condition is not met would be very burdensome.
Agreed, good catch.
And saying that "undefined behavior does not occur", as I suggested
above, is hardly sufficient.
Given that the argument is a valid pointer to [see above], I suggest
either:
(a) For any member values for which the behavior of the currently
specified algorithm is undefined, the buffer after asctime()
returns must contain a valid (null-terminated) string; or
(b) If all the member values are within their normal ranges, the
behavior is as currently specified. Otherwise, the buffer after
asctime() returns must contain a valid (null-terminated) string.
The difference is that (b) could change the behavior of code that
depends on the current definition *and* that calls asctime() with
member values outside their normal ranges. The question is whether
this is a problem. I think approach (b) is cleaner, but it could
break some code (that arguably deserves to be broken anyway).
Rather than just "valid", we might want to insist that strlen()
applied to the result returns 25, and that all characters in the
string are printable (see isprint()) except for a mandatory trailing
'\n'. Or it might not be worth being that specific.
And we need to define the "normal range" for tm_year, and specify that
we don't care about the value of tm_isdst.
I think it would also be worth mentioning in a footnote that asctime()
is a legacy function (and perhaps even deprecated), and strftime() is
recommended as a more flexible alternative.
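To make approach (b) concrete, here is a rough sketch of how an implementation might do it. The names checked_asctime and tm_in_normal_range are invented, the tm_year bounds are my own assumption (chosen so the year always prints as exactly four digits), and the placeholder string is just one possible choice of a printable 25-character result:

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Nonzero if every member the algorithm reads is in its normal range.
   The tm_year limits are an assumption, not taken from the standard:
   they keep 1900 + tm_year within 1000..9999. */
static int tm_in_normal_range(const struct tm *t)
{
    return t->tm_sec  >= 0 && t->tm_sec  <= 60
        && t->tm_min  >= 0 && t->tm_min  <= 59
        && t->tm_hour >= 0 && t->tm_hour <= 23
        && t->tm_mday >= 1 && t->tm_mday <= 31
        && t->tm_mon  >= 0 && t->tm_mon  <= 11
        && t->tm_wday >= 0 && t->tm_wday <= 6
        && t->tm_year >= -900 && t->tm_year <= 8099;
}

char *checked_asctime(const struct tm *timeptr)
{
    static const char wday_name[7][4] = {
        "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"
    };
    static const char mon_name[12][4] = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
    };
    static char result[26];

    assert(timeptr != NULL);
    if (!tm_in_normal_range(timeptr)) {
        /* Out-of-range input: still a valid, printable, 25-character
           string ending in '\n', but with no real date in it. */
        snprintf(result, sizeof result, "--- --- -- --:--:-- ----\n");
        return result;
    }
    /* In-range input: same format as the standard's algorithm;
       snprintf adds a second line of defence against overflow. */
    snprintf(result, sizeof result, "%.3s %.3s%3d %.2d:%.2d:%.2d %d\n",
             wday_name[timeptr->tm_wday], mon_name[timeptr->tm_mon],
             timeptr->tm_mday, timeptr->tm_hour,
             timeptr->tm_min, timeptr->tm_sec,
             1900 + timeptr->tm_year);
    return result;
}
```

Note that the buffer is still 26 bytes; the safety comes from checking the input, not from making the buffer bigger.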
That would be comp.std.c. Not c.l.c.
Nothing needs derivation. The standard adequately specifies
everything. First, see the definition of "struct tm", which
follows:
[#3] The types declared are size_t (described in 7.17);
clock_t
and
time_t
which are arithmetic types capable of representing times;
and
struct tm
which holds the components of a calendar time, called the
broken-down time.
[#4] The tm structure shall contain at least the following
members, in any order. The semantics of the members and
their normal ranges are expressed in the comments.251)
int tm_sec; // seconds after the minute -- [0, 60]
int tm_min; // minutes after the hour -- [0, 59]
int tm_hour; // hours since midnight -- [0, 23]
int tm_mday; // day of the month -- [1, 31]
int tm_mon; // months since January -- [0, 11]
int tm_year; // years since 1900
int tm_wday; // days since Sunday -- [0, 6]
int tm_yday; // days since January 1 -- [0, 365]
int tm_isdst; // Daylight Saving Time flag
The value of tm_isdst is positive if Daylight Saving Time is
in effect, zero if Daylight Saving Time is not in effect,
and negative if the information is not available.
____________________
251The range [0, 60] for tm_sec allows for a positive leap
second.
Note the allowable range for values. Then see the definition for
asctime:
7.23.3.1 The asctime function
Synopsis
[#1]
#include <time.h>
char *asctime(const struct tm *timeptr);
Description
[#2] The asctime function converts the broken-down time in
the structure pointed to by timeptr into a string in the
form
Sun Sep 16 01:03:52 1973\n\0
using the equivalent of the following algorithm.
char *asctime(const struct tm *timeptr) {
static const char wday_name[7][3] = {
"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"
};
static const char mon_name[12][3] = {
"Jan", "Feb", "Mar", "Apr", "May", "Jun",
"Jul", "Aug", "Sep", "Oct", "Nov", "Dec"
};
static char result[26];
sprintf(result, "%.3s %.3s%3d %.2d:%.2d:%.2d %d\n",
wday_name[timeptr->tm_wday],
mon_name[timeptr->tm_mon],
timeptr->tm_mday, timeptr->tm_hour,
timeptr->tm_min, timeptr->tm_sec,
1900 + timeptr->tm_year);
return result;
}
Returns
[#3] The asctime function returns a pointer to the string.
All of which guarantees no overflows for any legitimate value,
provided only that the year does not exceed 9999, or become
negative. Note that there is no buffer for the user to create.
This is intimately connected with the use of the word 'static'.
There is no point to raving about non-existent failings in the C
standard.
Pure unmitigated nonsense. The limits are clearly set out in the
description of "struct tm". You would do well to take some time
off and spend it reading the existing standard. As a poor
substitute you could follow my recent postings in c.l.c.
Or an old one. Try ISO10206.
Please explain how
"int tm_year; // years since 1900"
specifies that tm_year must be between -2899 and 8099. If you can't, but
you do agree that such a limit exists when calling asctime, then obviously
the limits are not clearly set out in the description of "struct tm".
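For anyone who wants to check where -2899 and 8099 come from: with all other members in their normal ranges, everything before the year takes 20 characters, and the newline and terminating null take two more, so the year may print as at most 4 characters in a 26-byte buffer. A small sketch that measures this with snprintf (the helper name asctime_bytes_needed is made up, and the fixed sample values are arbitrary in-range ones):

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Bytes (including the terminating null) that the standard's sprintf
   call would write for the given tm_year, with every other member
   fixed at an arbitrary in-range value.  Hypothetical helper. */
int asctime_bytes_needed(int tm_year)
{
    assert(tm_year <= INT_MAX - 1900);   /* avoid signed overflow below */
    /* snprintf with a null buffer and size 0 only computes the length. */
    return snprintf(NULL, 0, "%.3s %.3s%3d %.2d:%.2d:%.2d %d\n",
                    "Sun", "Sep", 16, 1, 3, 52, 1900 + tm_year) + 1;
}
```

Calling it with 8099 or -2899 gives 26, which just fits; 8100 or -2900 gives 27, one byte past the end of result[26].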
Not directly, I'm afraid.
> First, see the definition of "struct tm", which
> follows:
>
> [#3] The types declared are
[...]
> and
> struct tm
>
> which holds the components of a calendar time, called the
> broken-down time.
>
> [#4] The tm structure shall contain at least the following
> members, in any order. The semantics of the members and
> their normal ranges are expressed in the comments.251)
>
> int tm_sec; // seconds after the minute -- [0, 60]
> int tm_min; // minutes after the hour -- [0, 59]
> int tm_hour; // hours since midnight -- [0, 23]
> int tm_mday; // day of the month -- [1, 31]
> int tm_mon; // months since January -- [0, 11]
> int tm_year; // years since 1900
> int tm_wday; // days since Sunday -- [0, 6]
> int tm_yday; // days since January 1 -- [0, 365]
> int tm_isdst; // Daylight Saving Time flag
[...]
> Note the allowable range for values.
What "allowable range"? It describes the "normal ranges", not the
same thing at all.
> Then see the definition for
> asctime:
>
> 7.23.3.1 The asctime function
>
> Synopsis
> [#1]
> #include <time.h>
> char *asctime(const struct tm *timeptr);
>
> Description
>
> [#2] The asctime function converts the broken-down time in
> the structure pointed to by timeptr into a string in the
> form
> Sun Sep 16 01:03:52 1973\n\0
>
> using the equivalent of the following algorithm.
>
> char *asctime(const struct tm *timeptr) {
[snip]
> }
>
> Returns
>
> [#3] The asctime function returns a pointer to the string.
>
> All of which guarantees no overflows for any legitimate value,
> provided only that the year does not exceed 9999, or become
> negative. Note that there is no buffer for the user to create.
> This is intimately connected with the use of the word 'static'.
[...]
That's correct as far as it goes (though as Harald pointed out, the
value of 9999 can only be derived by studying the code of the sample
implementation). But the "normal ranges" for the members of struct tm
are a *subset* of the values that yield well-defined behavior for
asctime. For example, this rather bizarre program:
#include <time.h>
#include <stdio.h>
int main(void)
{
struct tm foo;
foo.tm_year = -1891;
foo.tm_mon = 11;
foo.tm_mday = 999;
foo.tm_hour = 999;
foo.tm_min = 999;
foo.tm_sec = 99;
foo.tm_wday = 6;
fputs(asctime(&foo), stdout);
return 0;
}
*must* print
Sat Dec999 999:999:99 9
on any conforming implementation, because of the "equivalent of the
following algorithm" clause. Implementations are not allowed the
freedom to exhibit undefined behavior whenever any of the members are
outside their normal ranges, because the presented algorithm doesn't
do so.
Note also that the struct tm object pointed to by the argument to
mktime is explicitly allowed to have members whose values are outside
their normal ranges. The same permission applies to asctime() simply
because no restriction is given, other than the implicit restriction
to values that don't cause undefined behavior.
ISO 10206 is Extended Pascal. First, it would have been polite to
mention that fact; second, it hardly qualifies as "C with Ada-like
safety".
> >> Zero terminated strings, where there are no bounds checking make it
> >> almost impossible to avoid errors
>
> > nonsense. Plenty of C gets written using zero terminated
> > strings which is perfectly fine.
>
> >> since it requires from the programmer
> >> never to forget the lengths of buffers!
>
> > you can get the program to do that for you
>
> >> I have proposed a string library for C to make those errors more
> >> difficult. The reception was as expected... :-(
>
> > I don't think anyone objects to string libraries (at least
> > a couple of regulars have their own) the argument was against
> > incorporating a string library into the standard. Actually
> > I'd be interested. I've used C++ strings and it does make
> > some things easier.
>
> Such libraries permits new types of attack.
I'm assuming the string libraries are based on something like
struct __string
{
size_t __size;
char *__buffer;
};
I don't agree there are new forms of attack. There are old forms of
attack such as corrupting the internal data structures with pointers.
These are probably forms of Undefined Behaviour anyway. They can't
be fixed without severe limits on pointer arithmetic.
I submit that a secure string library *could* be written.
It might not be easy, but it could be done. This is Design by
Contract.
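A rough sketch of the kind of thing I mean, under my own assumptions: the names cstr, cstr_init and cstr_append are invented (names starting with a double underscore are reserved for the implementation, so I avoid them here), and I add a capacity field alongside the size so that appends can be checked:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct cstr {
    size_t len;   /* bytes in use, not counting the terminator */
    size_t cap;   /* total bytes available in buf */
    char  *buf;   /* kept null-terminated; invariant: len < cap */
};

/* Set up an empty string over caller-provided storage of cap bytes. */
void cstr_init(struct cstr *s, char *storage, size_t cap)
{
    assert(s != NULL && storage != NULL && cap > 0);
    s->len = 0;
    s->cap = cap;
    s->buf = storage;
    s->buf[0] = '\0';
}

/* Append n bytes from src.  Returns 0 on success, or -1 (leaving the
   string unchanged) if the bytes plus terminator would not fit. */
int cstr_append(struct cstr *s, const char *src, size_t n)
{
    assert(s != NULL && src != NULL);
    if (n >= s->cap - s->len)   /* len < cap, so this cannot wrap */
        return -1;
    memcpy(s->buf + s->len, src, n);
    s->len += n;
    s->buf[s->len] = '\0';
    return 0;
}
```

The contract is the invariant len < cap: every operation checks it before writing, so the library itself never writes outside buf. It cannot, of course, stop other code from scribbling over the struct through a stray pointer.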
> Actual strings forces reading the memory from
> beginning, so strings should terminate earlier
> (by a "random" 0 character, or a segmentation fault).
I don't understand the above.
> On new libraries, the libraries could take advantage
> of new fields, and thus it could access the end of
> the string, so in kernel or other library space.
again. I don't understand
> OTOH new strings will reduce programmer error, and
> thus start of attacks.
yes
> Anyway alternate strings libraries already exists,
> but it seems that they are not widely used, so
> I don't think they are ready to be standardized.
maybe
> BTW, an alternate string library should really
> have a good design, allowing simple plug-in
> of i18n, so reducing transition costs.
maybe
> FYI I found n1173, with some rationale on these goals:
> 1.1.6 Preserve the null terminated string datatype
> 1.1.7 Do not require size arguments for unmodified strings
> 1.1.9 Library based solution
interesting
--
Nick Keighley
"I wish to God these calculations had been accomplished by steam."
--C. Babbage
> >> I see only one problem on C standard, that limits the capabilities
> >> of a compiler (+ run time environment) to check overflows.
what is that case?
> > Mandating it would be such a bad idea. Implementations are already free to
> > do bounds checking if they wish. Let the market decide. Programmers will
> > generally make the smart move if they're given the time to think about it.
> > Trust them to decide for themselves.
>
> I totally agree that should not be mandated.
> But actual C doesn't permit (for IMHO one single point) full
> implementation of bounds check.
why not? There are (or could be) "fat pointer" implementations.
(they store bounds info as well as a raw address). Why wouldn't
they work?
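Spelled out by hand in ordinary C, a fat pointer amounts to something like the sketch below. A checking implementation would do this behind the scenes for every pointer; the names fat_int_ptr and fat_at are made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* A pointer that carries its bounds with it. */
struct fat_int_ptr {
    int   *base;    /* start of the underlying array */
    size_t count;   /* number of elements it may address */
};

/* Checked indexing: returns a pointer to element i, or NULL if i is
   out of bounds.  A real implementation would more likely trap. */
int *fat_at(struct fat_int_ptr p, size_t i)
{
    assert(p.base != NULL);
    return i < p.count ? &p.base[i] : NULL;
}
```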
> And IMHO it is simple and I doesn't see disadvantage for
> programmers (they should include type on memory allocation functions)
> or by implementation: they could ignore the extra fields.
>
> But because it is a topic of next C1x (IIRC), I'm curious of the
> direction what would take C, to support (at programmer wishes)
> better security.
--
Nick Keighley
It isn't a software system it's a 1000 patches flying in
loose formation.
Allocated memory (see below)
>>> Mandating it would be such a bad idea. Implementations are already free to
>>> do bounds checking if they wish. Let the market decide. Programmers will
>>> generally make the smart move if they're given the time to think about it.
>>> Trust them to decide for themselves.
>> I totally agree that should not be mandated.
>> But actual C doesn't permit (for IMHO one single point) full
>> implementation of bounds check.
>
> why not? There are (or could be) "fat pointer" implementations.
> (they store bounds info as well as a raw address). Why wouldn't
> they work?
You can only check the outer boundary of allocated memory,
but not inside the block. I don't see how to check
boundaries within a structure or an array if allocated in the heap.
For my analysis, I start from a different point: a perfect debugger
(with full compiler support).
Now it can have information about code, static storage and automatic
storage, but for allocated storage there is only information about the
block size, not about the type (except heuristically, if the programmer
does the right thing: a "malloc + sizeof").
Unfortunately I see on standard:
- only one mention of "allocated memory" on memory type (storage
duration, 6.2.4),
- a foot note 75: "Allocated objects have no declared type."
IMHO, if there is a way to define the "effective type" of
allocated memory, a lot more checks could be done run-time.
I think that making new allocation functions, which
explicitly allow compiler to save the type could solve
the problem. But programs should not rely on this
(i.e. wrapper to actual infrastructure could be used).
One problem I see: it need new functions, but the
feature is used practically only on debugging and
on secure compilation of program (and to better
document code)
ciao
cate
I agree, a secure string library could be written.
I pointed out possible problems with size-based strings.
That was the part you did not understand (sorry for my
bad English). So I'll try again.
With C strings, library functions must read memory from the start to the end.
On Unix, the stack, code, heap, libraries and kernel are in different
memory regions.
So a bad C string will segfault before crossing into another region,
which limits some attacks (or, more probably, the read stops at a zero
byte found before the region changes).
With size-based strings, an implementation COULD skip
checking memory ranges and thus make it easy to access
other memory regions.
Note: with C strings the same attack could be done by changing
the string pointer, so the problem is mainly in starting
an attack using only the string data
(removing the terminating zero or changing the size).
Combining the two methods (reading all the memory, or
checking regions/allocation ranges, and also checking the
size) would improve security, but the few libraries
I've seen also use the size for efficient string manipulation,
without further checking.
ciao
cate
Did you search for the phrase "effective type"? If so, you would have
found 6.5p6, which specifies how dynamically allocated memory can
acquire an effective type:
| If a value is stored into an object having no declared type through an
| lvalue having a type that is not a character type, then the type of
| the lvalue becomes the effective type of the object for that access
| and for subsequent accesses that do not modify the stored value. If a
| value is copied into an object having no declared type using memcpy or
| memmove, or is copied as an array of character type, then the
| effective type of the modified object for that access and for
| subsequent accesses that do not modify the value is the effective type
| of the object from which the value is copied, if it has one. For all
| other accesses to an object having no declared type, the effective
| type of the object is simply the type of the lvalue used for the
| access.
It seems to me that this provides all that is needed to allow bounds
checking within an allocation. For instance, if I allocate memory as
follows:
int (*array)[COLS] = malloc(ROWS*sizeof *array);
Then for all subsequent uses of array[i][j], the compiler can certainly
impose run-time bounds checking on both i and j.
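Written out by hand, the checks such an implementation could insert might look roughly like this. The values of ROWS and COLS and the names row_t, grid_alloc and grid_at are all made up for the example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { ROWS = 3, COLS = 4 };   /* arbitrary sizes for illustration */

typedef int row_t[COLS];

/* Same allocation as above, zero-filled: ROWS rows of COLS ints.
   The caller must check the result for NULL. */
row_t *grid_alloc(void)
{
    return calloc(ROWS, sizeof(row_t));
}

/* What a checked array[i][j] boils down to: both indices are tested
   against the bounds implied by the pointer's type and the
   allocation size.  Returns NULL instead of trapping. */
int *grid_at(row_t *array, size_t i, size_t j)
{
    assert(array != NULL);
    if (i >= ROWS || j >= COLS)
        return NULL;
    return &array[i][j];
}
```

The bound on j comes from the type alone; the bound on i needs the size of the allocation, which a checking implementation can record when malloc is called.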
Well, to me that is just normal C behaviour, because the language
doesn't have the ability to define sub-ranges (a la Pascal) for
acceptance. Of course the behaviour you quoted, of compensating
for out-of-range values, is rather hard.
I would argue exactly the opposite -- that the explicit permission in mktime
implies that such permission is *not* granted otherwise.
--
Larry Jones
These findings suggest a logical course of action. -- Calvin
I disagree, but I might be willing to be convinced.
An asctime() algorithm equivalent to the one in the standard (which
is what the standard requires) exhibits well-defined behavior with,
for example, timeptr->tm_mday == 32. I see no permission for the
implementation to do anything other than what that code specifies.
The only argument I can see that the behavior might be undefined
is 7.1.4p1:
If an argument to a function has an invalid value (such as a
value outside the domain of the function, or [snip]) or [snip],
the behavior is undefined.
(The snipped text doesn't apply in this case.)
But 7.23.1p4 defines the "normal ranges" for the members of struct
tm. You'd have to assume that anything outside the "normal range"
is an "invalid value" -- but normality and validity are not the
same thing.
Note that the standard does explicitly state that the characters
stored by strftime() are unspecified if any members have values outside
the normal range, and this is listed under "Unspecified behavior"
in J.1 (which is non-normative). This doesn't tell us anything
definitive, but it is suggestive.
I think the term "normal ranges" refers, not to the ranges of
values the members are *permitted* to have, but to the ranges of
values that will be set by localtime() and gmtime() -- though it
would have been nice if that had been stated explicitly.
None of the other uses of the word "normal" in the standard are
useful for resolving this.
Now I think that it would have made more sense for the standard
to say that the behavior of calling asctime() with values outside
the "normal ranges" is undefined (as it does for strftime()).
But since the authors of the standard chose to define the behavior of
asctime() by presenting an explicit implementation in C source code,
I think we're stuck with the behavior of that code (except that an
implementation can do what it likes in cases where the behavior of
the sample code is undefined).
I think the real problem is that the standard fails to define
what it means by "normal range".
But feel free to convince me that anything outside the "normal range"
is an "invalid value".
No, the explicit permission is necessary because without it, the behaviour
of mktime() would be undefined by omission when the values are outside of
their "normal" ranges. To avoid undefined behaviour when the values do not
represent a valid date and time, the text needs to explain how such values
are interpreted, and mentioning that they're allowed is a reasonable
introduction to such an explanation. Given that that's a sufficient
justification for the explicit permission being there, there's no reason to
assume that its purpose might *also* be to imply that values outside of
their "normal" ranges are forbidden everywhere else.
(Unfortunately, even though the explicit permission makes it clear that the
standard wants to avoid undefined behaviour, I don't feel that it's doing a
very good job of defining the behaviour. What exactly does it mean that the
components of the structure "are set to represent the specified calendar
time, but with their values forced to the ranges indicated above"? What
date is "specified" by setting tm_mon to 12 and tm_mday to 50 -- is it
supposed to be obvious that it's referring to 19 February of the following
year? Or are the values "forced" into their "normal" ranges *before* being
interpreted as specifying a calendar time, and therefore my example really
specifies the 31st of December?)
Anyway, no such issue exists for asctime(). Its behaviour is defined
without depending on whether the values in the structure represent a valid
calendar time or not.
> Bart van Ingen Schenau wrote:
>> So, why do people complain all the time about the possibility of
>> buffer overflows in C, but not in other languages?
>>
>> Bart v Ingen Schenau
>
> Because functions like gets() asctime() and other standard
> functions (still in C99 standard even if gets() got deprecated)
> make buffer overflows almost MANDATORY.
I don't think anyone these days ever uses gets().
> Zero terminated strings, where there are no bounds checking make it
> almost impossible to avoid errors since it requires from the
> programmer never to forget the lengths of buffers!
IME it's a lot more bookkeeping, but I wouldn't say that it is "almost
impossible". The size of each and every object must be known at some
point in a program, and that information need only be carefully and
logically preserved and used each time the object is accessed.
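As a small illustration of that bookkeeping (a sketch only; the function name copy_bounded is invented), the size simply travels with the buffer and is checked before anything is written:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the string src into dst, which has room for dstsize bytes.
   Returns 0 on success.  If src does not fit, dst is set to the
   empty string and -1 is returned; nothing is written past dst. */
int copy_bounded(char *dst, size_t dstsize, const char *src)
{
    size_t n;

    assert(dst != NULL && src != NULL && dstsize > 0);
    n = strlen(src);
    if (n >= dstsize) {          /* no room for n bytes plus '\0' */
        dst[0] = '\0';
        return -1;
    }
    memcpy(dst, src, n + 1);     /* includes the terminator */
    return 0;
}
```

A typical call passes sizeof on the destination array, e.g. copy_bounded(buf, sizeof buf, input), so the size is never retyped by hand.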
> I have proposed a string library for C to make those errors more
> difficult. The reception was as expected... :-(
When did you propose this? If you had made operator overloading an
integral part of this proposal, then not many would've been very
enthusiastic, since operator overloading is supported by almost none of
the C compilers out there.
OTOH a counted strings interface based on the existing infrastructure of
C like BStrlib would not be an unwelcome addition to C1x, IMHO.
> Keith Thompson wrote:
>> jacob navia <ja...@nospam.com> writes:
>>> Nick Keighley wrote:
>>>> On 23 Jul, 10:36, jacob navia <ja...@nospam.com> wrote:
>>>>> Bart van Ingen Schenau wrote:
>>>>>> So, why do people complain all the time about the possibility of
>>>>>> buffer overflows in C, but not in other languages?
>>>>>>
>>>>> Because functions like gets() asctime() and other standard
>>>>> functions (still in C99 standard even if gets() got deprecated)
>>>>> make buffer overflows almost MANDATORY.
>>>> rubbish. Yes gets() is a problem. No one in their right mind
>>>> uses gets(). But asctime() *can* be used safely. Just make sure
>>>> the buffer is big enough.
>>> No you can't make sure the buffer is big enough. The standard
>>> mandates a 26 byte buffer.
>>
>> Yes, asctime() provides its own static buffer. But you can use
>> asctime() safely by ensuring that the arguments (or rather, the
>> members of the struct tm object pointed to by the single argument)
>> are within safe bounds.
>>
>
> The C standard shows a piece of code that will overflow its static
> buffer if used with a year value greater than 8900 (if I remember
> correctly)
>
> Similarly, if the month value is greater than 12 it will
> show UB.
>
> Obviously, showing such a piece of code is a reminder to the rest
> of the world how much the standard cares about buffer overflows.
>
> The discussion in this group confirms this. Look at Mr Thomson:
By your logic we could say that the Standard allows potential buffer
overruns with memcpy, memmove, strcpy, strcat, fread, fgets and so on.
I think you may be happier with a language other than C.
>> It *could* be
>> tweaked to require implementation-defined but safe behavior for
>> out-of-bounds arguments, and I would support such a change.
>
> Then why you don't support it now and act to get rid of a buffer
> overflowing code written in the C standard document?
Let me turn the tables and ask you, jacob: as an outspoken
implementor (and critic) of ISO C, why not arrange for your lcc-win to
emit a diagnostic for any use of asctime (and perhaps of other
functions that you deem unsafe)? As far as I can see, my recent copy of
your compiler emits no helpful warnings upon the usage of "dangerous"
functions like gets and asctime and others. Gcc for example emits a
diagnostic for gets and tmpnam among others. It is not as good as a
remedy at the source, but it's nonetheless helpful.
It would be good to see you judging yourself with the same high
standards with which you judge others.
<snip>
> Richard Heathfield <r...@see.sig.invalid> writes:
> > If you want Ada, you know where to find it.
>
> Ada is as powerful as C. It doesn't forbid unsafe actions, it merely
> requires you to specify them explicitly in most cases. For example,
> to interpret an integer as a pointer (something C allows with a simple
> cast), you have to instantiate Unchecked_Conversion and then call the
> instance. (Strictly speaking, the C cast performs a type conversion,
> not a reinterpretation, but it's implemented as a reinterpretation in
> every implementation I've seen.)
In other words, Ada requires a special construct to be applied to
interpret an integer as a pointer; C, by contrast, requires a special
construct to be applied to interpret an integer as a pointer. Apart from
the size of the construct, I see no difference.
Richard
> Keith Thompson wrote:
>
> > If you want C with Ada-like safety (say, because you really like
> > curly braces), I think your best bet is to invent a new language.
>
> Or an old one. Try ISO10206.
No; Keith was looking for Ada with a C-like syntax, while what you
suggest is Algol with a BASIC-like syntax.
Richard
The difference is that the "special construct" in C, a cast, doesn't
stand out. It's the same construct used for perfectly safe type
conversions, such as a conversion from int to long.
There *should* be few casts in well-written C code, but unfortunately
that's not the case in real-world code. So to some small extent,
perhaps the difference is more in how C is used than in how it's
designed -- though there are still plenty of ways to do dangerous
things in C without trying very hard.
I can see this turning into a language war, so perhaps we should drop
this sub-thread.
--
Ian Collins.
In my experience the whole "buffer overflows are a huge problem with C" is
heavily overrated.
1. I rarely come across bugs in the field that have to do with buffer
overflows. Yes, sometimes it happens during development that you access an
invalid pointer. But you do a test run, see that the system doesn't work
properly anymore, and fix it.
2. Having a "safe" language is not the real solution. Let's take Java: it
does bounds checking. What does it do when it sees a buffer overflow? It
throws an exception, and you can bet the program is not designed to recover
from this kind of error. Almost the same result as in C: the program doesn't
work correctly. The only difference is that you get a neat error message
saying which line in which file triggered the bug. But at the end of the day
the bug is still there.
But I agree, it would be great if some software could actually be created
that runs together with your unit tests and checks for buffer overflows,
provided it can be turned off after testing.
Another thing: I see here that people have the opinion that you should "just
always hire really good programmers". But how does a programmer get "really
good"? Right, by making mistakes and realizing the importance of those mistakes.
After a buffer overflow mistake that costs big money, you think twice before
ignoring them in the future, but when you're a student working on your little
student database program, you tend to care way less :)
> "jacob navia" <ja...@nospam.com> schreef in bericht
> news:g65acc$427$1...@aioe.org...
>> Buffer overflows are a fact of life, and, more specifically, a fact of
>> C.
>>
>> All is not lost however. In the book
>>
>> "Value Range Analysis of C programs" Axel Simon tries to establish a
>> theoretical framework for analyzing C programs. In contrast to other
>> books where the actual technical difficulties are "abstracted away",
>> this books tries to analyze real C programs taking into account
>> pointers, stack frames, etc.
>
> In my experience the whole "buffer overflows are a huge problem with C" is
> heavily overrated.
>
With due respect, it just means that your experience has been restricted
to those implementations of C that get you to think so.
> 1. I rarely come across bugs in the field that have to do with buffer
> overflows. Yes, sometimes it happens during development that you acces an
> invalid pointer. But you do a test run and see that the system doesnt work
> properly anymore and you fix it.
>
Consider yourself to be blessed and lucky. You have software that shows
some symptoms of "not working properly". There is software that gives no
such indication, and such programs are ticking time-bombs. Crackers who have
too much time on their hands craft exploits that can compromise the security
of a system that hosts such software.
Even should your software give some indication of "not working properly", ...
> 2. Having a "safe" language is not the real solution. Lets take java, it
> does bounds checking. What does it do when it sees a buffer overflow? It
> throws an exception and you can bet the program is not designed to recover
> from this kind of error. Almost the same result as in C, the program doesnt
> work correctly. The only difference is that you get a neat error message
> saying which line in which file displayed the bug. But at the end of the day
> the bug is still there.
>
... the cost of identifying the real location of the buffer overflow is
*FAR* more significant than it would be in a Java program that fails like
you have described above. "Almost the same result as in C" is just plain
wrong. Failing noisily is a good thing that other programming languages
achieve at a (purportedly) higher overhead, which C avoids by not
implementing such safety nets.
If a language is interpreted then it is easy to put in these kinds of
run-time checks.
For the interpreter itself, it will be easier to thoroughly test just this
one, smallish program, and run a million identical copies, than to try and
ensure that a million totally different programs, from a million
programmers, and of unlimited complexity, do not have buffer overflow types
of problems.
> So, why do people complain all the time about the possibility of
> buffer overflows in C, but not in other languages?
I /think/ it may be possible to interpret C, and with the same sorts of
checks, so making this an implementation choice. Unless the language
requires that it must be possible to write outside of a buffer or array,
then C would appear less safe than other languages.
--
Bartc
<snip>
> I /think/ it may be possible to interpret C, and with the same sorts
> of checks, so making this an implementation choice. Unless the
> language requires that it must be possible to write outside of a
> buffer or array, then C would appear less safe than other languages.
I think that /any/ language can be interpreted or compiled. And in
places the boundary between what one considers interpretation or
compilation is almost non-existent.
<http://www.softintegration.com/>
But C was, I think, not /designed/ to be interpreted. It was designed as
a system programming language, and interpreters are not very feasible
for very low-level code. Reduction of speed is only one issue.
--
Free games and programming goodies.
http://www.personal.leeds.ac.uk/~bgy1mm
> r...@hoekstra-uitgeverij.nl (Richard Bos) writes:
> > Keith Thompson <ks...@mib.org> wrote:
> >> Richard Heathfield <r...@see.sig.invalid> writes:
> >> > If you want Ada, you know where to find it.
> >>
> >> Ada is as powerful as C. It doesn't forbid unsafe actions, it merely
> >> requires you to specify them explicitly in most cases. For example,
> >> to interpret an integer as a pointer (something C allows with a simple
> >> cast), you have to instantiate Unchecked_Conversion and then call the
> >> instance. (Strictly speaking, the C cast performs a type conversion,
> >> not a reinterpretation, but it's implemented as a reinterpretation in
> >> every implementation I've seen.)
> >
> > In other words, Ada requires a special construct to be applied to
> > interpret an integer as a pointer; C, by contrast, requires a special
> > construct to be applied to interpret an integer as a pointer. Apart from
> > the size of the construct, I see no difference.
>
> The difference is that the "special construct" in C, a cast, doesn't
> stand out. It's the same construct used for perfectly safe type
> conversions, such as a conversion from int to long.
One shouldn't use a cast when converting from int to long in the first
place. We're talking about C, here, not about C++. You're complaining
about people who stick Unchecked_Conversion instances on integer-on-
integer conversions, not about casts on pointers. IOW, you're
complaining about bad programmers, not about C.
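For readers following the argument, a small sketch of the constructs under discussion. The function names are made up for illustration, and uintptr_t is an optional type from <stdint.h>, assumed here to be available.

```c
#include <assert.h>
#include <stdint.h>

/* int to long: value-preserving, and needs no cast at all. */
long widen(int i)
{
    return i;               /* implicit conversion */
}

/* The same conversion written with a redundant cast. */
long widen_with_cast(int i)
{
    return (long)i;
}

/* Integer to pointer: the same cast syntax, but the result is
   implementation-defined unless the integer came from a valid pointer. */
int *address_from_bits(uintptr_t bits)
{
    return (int *)(void *)bits;
}

/* Small usage example. */
void cast_demo(void)
{
    int object = 7;
    uintptr_t bits = (uintptr_t)(void *)&object;

    assert(widen(42) == widen_with_cast(42));
    assert(address_from_bits(bits) == &object);   /* round trip via void * */
    assert(*address_from_bits(bits) == 7);
}
```

Both casts look alike in the source; only the operand types tell a reader which one is harmless.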
Well, C has many, many more programmers than Ada, so yes, of course C
also has many more bad programmers than Ada. That does not mean that C
is inherently unsafe; what it means is that popular things get abused
more than unpopular things. Well, duh; more people drive cars badly than
motorcycles, too, but that doesn't make a car less safe than a
motorcycle.
Richard
No, this is wrong.
Interpreted languages usually also have a function/statement like:
interpret(string)
so potentially an interpreter would also have to be provided
with "compiled" programs.
Few "real" programs use such a function, because of security
problems.
OTOH, "compiled" languages have strict requirements,
which could cause trouble when programs are interpreted,
i.e. where the program could terminate because of error conditions.
E.g. in C, an interpreter could terminate because
of lack of memory when calling a function with a big
static array. C requires such initialization at the
start of the program, so errors are detected before
a possibly critical section of a C program.
These two cases are rare, and I think that if there were
demand for interpreted C, C standardization would clarify and
change some of the text.
ciao
cate
> santosh wrote:
>> Bartc wrote:
>>
>> <snip>
>>
>>> I /think/ it may be possible to interpret C, and with the same sorts
>>> of checks, so making this an implementation choice. Unless the
>>> language requires that it must be possible to write outside of a
>>> buffer or array, then C would appear less safe than other languages.
>>
>> I think that /any/ language can be interpreted or compiled. And in
>> places the boundary between what one considers interpretation or
>> compilation is almost non-existent.
>
> No, this is wrong.
Which exactly are you claiming is wrong? My first or second sentence?
They both seem reasonable to me.
[snip - as nothing you say further seems to explicitly contradict what
I wrote]
PS. I think any further discussion of compiled/interpreted languages
should be moved over to comp.programming, unless you want to
specifically discuss C's applicability to them.
Such a function is commonly called "eval". For languages that have
such a feature (C doesn't), you need to have a compiler, or a large
chunk of it, as part of the run-time system.
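As a rough illustration of why "eval" drags a language processor into the run-time system, here is a toy evaluator for strings such as "2 + 3 * 4". It is only a sketch: tiny_eval and its helpers are hypothetical names, it handles just non-negative integers, +, * and parentheses, and it does no error or overflow checking.

```c
#include <assert.h>
#include <ctype.h>

static const char *cursor;      /* current position in the input text */

static long parse_sum(void);

static void skip_spaces(void)
{
    while (isspace((unsigned char)*cursor))
        cursor++;
}

/* factor := digits | '(' sum ')' */
static long parse_factor(void)
{
    long value = 0;

    skip_spaces();
    if (*cursor == '(') {
        cursor++;
        value = parse_sum();
        skip_spaces();
        if (*cursor == ')')
            cursor++;
        return value;
    }
    while (isdigit((unsigned char)*cursor))
        value = value * 10 + (*cursor++ - '0');
    return value;
}

/* product := factor ('*' factor)* */
static long parse_product(void)
{
    long value = parse_factor();

    for (;;) {
        skip_spaces();
        if (*cursor != '*')
            return value;
        cursor++;
        value *= parse_factor();
    }
}

/* sum := product ('+' product)* */
static long parse_sum(void)
{
    long value = parse_product();

    for (;;) {
        skip_spaces();
        if (*cursor != '+')
            return value;
        cursor++;
        value += parse_product();
    }
}

/* Parse and evaluate the expression in text at run time. */
long tiny_eval(const char *text)
{
    cursor = text;
    return parse_sum();
}

/* Small usage example. */
void tiny_eval_demo(void)
{
    assert(tiny_eval("2 + 3 * 4") == 14);
    assert(tiny_eval("(2 + 3) * 4") == 20);
}
```

Even this toy needs a scanner and a parser in the running program; an eval for a full language needs correspondingly more.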
> OTOH, "compiled" languages have strict requirements,
> which could cause trouble when programs are interpreted,
> i.e. where the program could terminate because of error conditions.
>
> E.g. in C, an interpreter could terminate because
> of lack of memory when calling a function with a big
> static array. C requires such initialization at the
> start of the program, so errors are detected before
> a possibly critical section of a C program.
An interpreter obviously has to satisfy the language requirements.
A "pure" interpreter that tries to process and execute a C program
line by line wouldn't work, but that's no barrier to a C interpreter
that reads the entire program before it begins execution.
Note that Perl does something similar.
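The static array case mentioned above looks like this in C. This is a sketch with made-up names (record_call, BIG_COUNT) and an arbitrary size; the relevant rule is that objects with static storage duration are initialized before program startup, even when they are declared inside a function.

```c
#include <assert.h>
#include <stddef.h>

#define BIG_COUNT (1024u * 1024u)   /* arbitrary size, chosen for illustration */

/* Hypothetical function with a big static array. The array has static
   storage duration: it is zero-initialized before program startup and
   keeps its contents between calls. */
size_t record_call(void)
{
    static unsigned char big[BIG_COUNT];
    static size_t calls;

    big[calls % BIG_COUNT] = 1;     /* touch the array so it is really used */
    return ++calls;
}

/* Small usage example. */
void static_demo(void)
{
    assert(record_call() == 1);     /* first call starts from calls == 0 */
    assert(record_call() == 2);     /* state survived the first call */
}
```

An interpreter that scans the whole translation unit first can reserve storage for big before executing anything, which is the same guarantee a compiled program gives.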
[...]