--
/-- Joona Palaste (pal...@cc.helsinki.fi) ---------------------------\
| Kingpriest of "The Flying Lemon Tree" G++ FR FW+ M- #80 D+ ADA N+++ |
| http://www.helsinki.fi/~palaste W++ B OP+ |
\----------------------------------------- Finland rules! ------------/
"Ice cream sales somehow cause drownings: both happen in summer."
- Antti Voipio & Arto Wikla
> I know why gets() is bad. What I want to know is, why does it exist in
> the first place? gets() is a function that's:
> - ISO standard
> - impossible to guarantee will have defined behaviour even if used in
> the correct context
I would assume that it's still in there for backwards compatibility
( I can't think of any other good reason ).
For hysterical raisins.
--
Richard Heathfield
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton
Yet another side effect of Beaujolais Nouveau...
--
-hs- Tabs out, spaces in.
CLC-FAQ: http://www.eskimo.com/~scs/C-faq/top.html
ISO-C Library: http://www.dinkum.com/htm_cl
FAQ de FCLC : http://www.isty-info.uvsq.fr/~rumeau/fclc
> For hysterical raisins.
But... but...
Correct me if I'm wrong here, but wasn't gets() unsafe from day one? I
can't think of any C dialect, K&R, ANSI or otherwise, where gets() could
be used so that the user couldn't overflow the buffer. So, even when
gets() was first invented, it was unusable!
--
/-- Joona Palaste (pal...@cc.helsinki.fi) ---------------------------\
| Kingpriest of "The Flying Lemon Tree" G++ FR FW+ M- #80 D+ ADA N+++ |
| http://www.helsinki.fi/~palaste W++ B OP+ |
\----------------------------------------- Finland rules! ------------/
"Remember: There are only three kinds of people - those who can count and those
who can't."
- Vampyra
Sometimes people simply make mistakes.
I don't know the reason, but let's say there was a lot of gets() code around
before there was a standard.
In C, it is not only gets() where buffer overflow can happen, think about:
strcpy()
strcat()
scanf()/fscanf()
sprintf()
and another interesting function for system security is
system()
C programmers must be very careful when interacting with users; this is
particularly true in sensitive web applications.
--
Tor
<torust AT online DOT no>
What sets gets() apart is
not that it is POSSIBLE to use it dangerously,
but that it is NOT POSSIBLE to use it safely.
John
--
John Hascall (__) Shut up, be happy.
Software Engineer, ,------(oo) The conveniences you demanded
Acropolis Project Manager, / |Moo U|\/ are now mandatory.
ISU Academic IT * ||----|| -- Jello Biafra
> In C, it is not only gets() where buffer overflow can happen, think about:
> strcpy()
> strcat()
The onus is on the programmer to see that there is enough space
to copy into. strlen() and malloc() can help with strings from an
external source (like argv)
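A minimal sketch of that strlen()/malloc() approach. The helper name copy_string is made up for illustration and does not come from anyone's post:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of src, or NULL if
   malloc() fails. The caller owns the result and must free() it. */
char *copy_string(const char *src)
{
    size_t len = strlen(src);
    char *dst = malloc(len + 1);    /* +1 for the terminating '\0' */

    if (dst == NULL)
        return NULL;
    memcpy(dst, src, len + 1);      /* copies the terminator as well */
    return dst;
}
```

Because the destination is sized from the source, the copy cannot overflow, however long the argv string turns out to be.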
> sprintf()
Same applies here, but the task is not so easy, so over-estimation
rules when dealing with integers and float types. snprintf (new in
C1999) can help.
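A small sketch of the snprintf() route, assuming a C99 library. The function name format_id and the "id=" format are invented for the example:

```c
#include <stdio.h>

/* Hypothetical helper: formats "id=<n>" into buf without ever writing more
   than size bytes. Returns 1 if the whole text fit, 0 if it was truncated
   or an encoding error occurred. */
int format_id(char *buf, size_t size, long id)
{
    /* snprintf() returns the length the full text would have had. */
    int n = snprintf(buf, size, "id=%ld", id);

    return n >= 0 && (size_t)n < size;
}
```

The return value is what makes over-estimation unnecessary: the caller can detect truncation instead of guessing a worst-case width.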
> scanf()/fscanf()
These can be used safely, although I can see a case of banning
"%s" in scanf and friends. (As opposed to a limited "%20s")
However, gets() can't be used safely, unless you have some way
of knowing how big the next line will be.
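To illustrate the limited "%20s" form, here is a sketch using sscanf() so that it does not depend on stdin. The name read_word is hypothetical:

```c
#include <stdio.h>

/* Hypothetical helper: extracts the first whitespace-delimited word from
   input, storing at most 20 characters plus the terminating '\0'.
   Returns 1 if a word was read, 0 otherwise. */
int read_word(const char *input, char word[21])
{
    /* The field width 20 is what keeps the conversion inside word[21]. */
    return sscanf(input, "%20s", word) == 1;
}
```

The same width works in scanf() and fscanf(); the buffer just has to be one byte larger than the width.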
> system()
The onus is on the programmer to see that there are no side effects.
Someone writing CGI code who simply does...
systr=malloc(strlen(record)+30);
/* Check for NULL. */
sprintf(systr,"grep \'%s\' records.txt",record);
system(systr);
is asking for trouble, lest someone pass in
"blibble' /dev/null; something_nasty; grep 'hello"
> Tor Rustad <tor...@online.no.spam> wrote:
> >In C, it is not only gets() where buffer overflow can happen, think about:
> >strcpy()
> >strcat()
> >scanf()/fscanf()
> >sprintf()
> >and an other interesting function for system security is
> >system()
> >C programmers must be very careful when interacting with users, in
> >particular is this true in sensitive web applications.
>
> What sets gets() apart is
> not that it is POSSIBLE to use it dangerously,
> but that it is NOT POSSIBLE to use it safely.
>
> John
Sure it is:
{
char c[50];
puts("If you type more than 49 characters, I'll delete everything: ");
fflush(stdout);
gets(c);
}
I doubt that anyone will enter more than 50 characters. :)
--
Clark S. Cox, III
clar...@yahoo.com
http://www.whereismyhead.com/clark/
It is possible to use gets safely if the input to the program is itself always
generated by software which knows about the limitation and never violates it.
(But why would one make a program non-robust in this way given that it's so
easy to avoid?)
One situation in which you can use gets() safely is when you have
written a file yourself, say a temporary, and want to read it back
later. Then, if you have carefully limited the length of written
lines, you can use that same size in the read buffer. This only
leaves the possibility of i/o system glitches that mung the lines
during read or write.
In this case the internal \n stripping may be useful.
--
Chuck Falconer (cbfal...@my-deja.com)
http://www.qwikpages.com/backstreets/cbfalconer
(Remove "NOSPAM." from reply address. Above works unmodified)
> On 20 Nov 2000 01:52:25 GMT, John Hascall <jo...@iastate.edu> wrote:
> >Tor Rustad <tor...@online.no.spam> wrote:
> >>In C, it is not only gets() where buffer overflow can happen, think about:
> >>strcpy()
> >>strcat()
> >>scanf()/fscanf()
> >>sprintf()
> >>and an other interesting function for system security is
> >>system()
> >>C programmers must be very careful when interacting with users, in
> >>particular is this true in sensitive web applications.
> >
> > What sets gets() apart is
> > not that it is POSSIBLE to use it dangerously,
> > but that it is NOT POSSIBLE to use it safely.
>
> It is possible to use gets safely if the input to the program is itself always
> generated by software which knows about the limitation and never violates it.
> (But why would one make a program non-robust in this way given that it's so
> easy to avoid?)
...barring accidental or malicious modification by users, file system
problems, bugs in the generating program...
Why would it be bad form not to check for data/pointer, etc., errors
within the same program, but not in a multi-program configuration?
And, of course, gets() was not developed and is hardly ever used, for
situations like you describe.
Jack Klein
--
Home: http://jackklein.home.att.net
<snip>
> For gets(), though, even the correct context (ie. the pointer passed
> to gets() is valid) will not guarantee defined behaviour. For gets(),
> the question of whether the behaviour is defined or not always depends
> on the user as well as the programmer.
> So, why does this unsafe function exist? K&R must have had some reason
> to put it in, and ANSI must have had some reason to keep it in. I'm
> thinking simple oversight - even K&R can't be trusted to always know
> everything. Or is there a more elaborate reason?
Joona...
I suspect that gets() was written because the author wanted symmetry with
puts() -- and I'll hazard a guess that the implementation preceded
fputs(); and that at the time the author was less concerned with robustness
and foolish/hostile users than are we.
Best way to get an answer to this question is probably to pose it to the
original implementor. I'd bet that the answer will be something like "It
seemed like a good idea at the time."
If the original implementation had needed to be thread-safe, hostile user
safe, etc. then the project might have bogged down and we might not have
the language at all...
I still think it's beautiful (warts and all 8^)
--
Morris Dovey
West Des Moines, Iowa USA
mrd...@iedu.com
>Joona I Palaste <pal...@cc.helsinki.fi> wrote:
>
>> I know why gets() is bad. What I want to know is, why does it exist in
>> the first place? gets() is a function that's:
>> - ISO standard
>> - impossible to guarantee will have defined behaviour even if used in
>> the correct context
>
> I would assume that it's still in there for backwards compatibility
>( I can't think of any other good reason ).
^^^^^^^^^^^^^^^^^^^^^
This implies that you think that this *is* a GOOD reason! Hmmm...
Dan
--
Dan Pop
CERN, IT Division
Email: Dan...@cern.ch
Mail: CERN - IT, Bat. 31 1-014, CH-1211 Geneve 23, Switzerland
>I know why gets() is bad. What I want to know is, why does it exist in
>the first place?
Historical reasons. I don't think they justified its inclusion in the
C standard, but the ANSI committee didn't ask my advice.
>gets() is a function that's:
>- ISO standard
>- impossible to guarantee will have defined behaviour even if used in
>the correct context
Not entirely true. Consider an implementation that limits the length
of a line input from the keyboard (e.g. MSDOS) and which doesn't allow
the redirection of stdin. Use an array longer than the system-imposed
limit and you're 100% safe on that platform.
>"Joona I Palaste" <pal...@cc.helsinki.fi> wrote in message
>> Richard Heathfield <bin...@eton.powernet.co.uk> scribbled the following:
>> > Joona I Palaste wrote:
>> >>
>> >> I know why gets() is bad. What I want to know is, why does it exist in
>> >> the first place?
>>
>> > For hysterical raisins.
>>
>> But... but...
>> Correct me if I'm wrong here, but wasn't gets() unsafe from day one? I
>> can't think of any C dialect, K&R, ANSI or otherwise, where gets() could
>> be used so that the user couldn't overflow the buffer. So, even when
>> gets() was first invented, it was unusable!
>
>I don't now the reason, but let say there was a lot of gets() code around
>before there was a standard.
>
>In C, it is not only gets() where buffer overflow can happen, think about:
>
>strcpy()
>strcat()
>scanf()/fscanf()
>sprintf()
ALL these functions can be *safely* used in a C program. They can be also
misused to generate buffer overflows, but you can do that with a plain
loop, too.
>and an other interesting function for system security is
>
>system()
What's wrong with system()? It can't do more harm than the rest of the
program. I.e. on a Unix system, you can call system("rm -rf ~"), but you
can achieve the same results with a simple recursive function.
>Bill Godfrey wrote:
>>
>> However, gets() can't be used safely, unless you have some way
>> of knowing how big the next line will be.
>
>One situation in which you can use gets() safely is when you have
>written a file yourself, say a temporary, and want to read it back
>later. Then, if you have carefully limited the length of written
>lines, you can use that same size in the read buffer. This only
>leaves the possibility of i/o system glitches that mung the lines
>during read or write.
>
>In this case the internal \n stripping may be useful.
You can get the internal \n stripping with scanf in a 100% safe way!
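One way this could look, as a sketch only: a bounded %[ conversion reads the line without its newline, and a short loop throws away whatever is left. The function name read_line_scanf and the 50-byte buffer are assumptions made for the example:

```c
#include <stdio.h>

/* Hypothetical helper: reads one line from fp into buf, which must hold at
   least 50 bytes. The '\n' is not stored. Lines longer than 49 characters
   are truncated and the excess is discarded.
   Returns 1 if a line was read, 0 at end of input. */
int read_line_scanf(FILE *fp, char *buf)
{
    int c;

    buf[0] = '\0';                       /* an empty line stores nothing */
    if (fscanf(fp, "%49[^\n]", buf) == EOF)
        return 0;
    /* Consume the rest of the line, including the newline itself. */
    while ((c = getc(fp)) != '\n' && c != EOF)
        ;
    return 1;
}
```

Presetting buf[0] matters because the %[ conversion fails, and stores nothing, when the line is empty.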
>Sure it is:
>
>{
> char c[50];
>
> puts("If you type more than 49 characters, I'll delete everything: ");
> fflush(stdout);
> gets(c);
>
>}
>
>I doubt that anyone will enter more than 50 characters. :)
I will, when running this program on *your* computer :-)
> Dan
> --
> Dan Pop
> CERN, IT Division
> Email: Dan...@cern.ch
> Mail: CERN - IT, Bat. 31 1-014, CH-1211 Geneve 23, Switzerland
--
John Castle
Too err is human, to really louse things up takes a computer!
Email: J.E.C...@btinternet.com
> clar...@yahoo.com (Clark S. Cox, III) writes:
>
> >Joona I Palaste <pal...@cc.helsinki.fi> wrote:
> >
> >> I know why gets() is bad. What I want to know is, why does it exist in
> >> the first place? gets() is a function that's:
> >> - ISO standard
> >> - impossible to guarantee will have defined behaviour even if used in
> >> the correct context
> >
> > I would assume that it's still in there for backwards compatibility (
> >I can't think of any other good reason ).
> ^^^^^^^^^^^^^^^^^^^^^
> This implies that you think that this *is* a GOOD reason! Hmmm...
Not at all, I think that it's far from a good reason, but it's the
"least bad" reason that I could think of.
> clar...@yahoo.com (Clark S. Cox, III) writes:
>
> >Sure it is:
> >
> >{
> > char c[50];
> >
> > puts("If you type more than 49 characters, I'll delete everything: ");
> > fflush(stdout);
> > gets(c);
> >
> >}
> >
> >I doubt that anyone will enter more than 50 characters. :)
>
> I will, when running this program on *your* computer :-)
I'd never let anyone run this on *my* computer :)
Have I said otherwise?
> They can be also
> misused to generate buffer overflows, but you can do that with a plain
> loop, too.
Yes, but the point is that they not only *can* be misused, but frequently
*are* misused.
Buffer overflow is "the most common security bug" of the last decade. The
problem is not limited to gets() usage; in fact, I hardly ever see professionals
use gets().
Sloppy programming makes YABOB (Yet Another Buffer Overflow Bug), and (some)
C programmers have a record of being sloppy.
> >and an other interesting function for system security is
> >
> >system()
>
> What's wrong with system()? It can't do more harm than the rest of the
> program. I.e. on a Unix system, you can call system("rm -rf ~"), but you
> can achieve the same results with a simple recursive function.
Let's say the user is able to manipulate the input buffer to a system()
call...
Whatever harm the rest of the program can do does not remove the problem above.
strncpy() and strncat() are easier to control, but I don't like these
functions either, since I usually want the destination buffer to be null
terminated.
> > sprintf()
>
> Same applies here, but the task is not so easy, so over-estimation
> rules when dealing with integers and float types. snprintf (new in
> C1999) can help.
Yes, snprintf() addresses exactly this problem (though not every implementation
gets it right).
> > scanf()/fscanf()
>
> These can be used safely, although I can see a case of banning
> "%s" in scanf and friends. (As opposed to a limited "%20s")
Using %s or %[...] to scan user-supplied data is not the way to
write a robust (or secure) program.
> However, gets() can't be used safely, unless you have some way
> of knowing how big the next line will be.
Therefore I don't see the big problem with gets(), "everybody" knows the
problem, and "nobody" uses that function. OTOH, even an experienced C
programmer can get scanf() wrong.
> > system()
>
> The onus is on the programmer to see that there are no side effects.
Yes, but since most programmers are not security aware, it is best that
they avoid using it.
I'll run your program with:
C:\mydir>yourprog < filewithlonglines
and I think I can do it in. Would you care to reduce the
percentage?
Looks like redirection of stdin.
>"Dan Pop" <Dan...@cern.ch> wrote in message
>news:8vc31r$mpf$1...@sunnews.cern.ch...
>>
>> Not entirely true. Consider an implementation that limits the length
>> of a line input from the keyboard (e.g. MSDOS) and which doesn't allow
>> the redirection of stdin. Use an array longer than the system-imposed
>> limit and you're 100% safe on that platform.
>>
>Does MSDOS limit line length from the keyboard(not under NT 4.0!)? I don't
>believe DOS 6.2 does either so what version was that then Dan?
The DOS console of Win95 limits it at 254 characters. If you continue
to press keys, it beeps back at you. I can hardly believe that this
limitation was not inherited from the "real thing".
>"Bill Godfrey" <bill-g...@usa.net.invalid> wrote in message
>> "Tor Rustad" <tor...@online.no.spam> writes:
>>
>> > In C, it is not only gets() where buffer overflow can happen, think about:
>>
>> > strcpy()
>> > strcat()
>>
>> The onus is on the programmer to see that there is enough space
>> to copy into. strlen() and malloc() can help with strings from an
>> external source (like argv)
>
>strncpy() and strncat() are easier to control, but I don't like these
>functions either, since I usually want the destination buffer to be null
>terminated.
strncat() will *always* do that for you. It's only strncpy() that has
a problem, for historical reasons (the implementation of directories on
the early Unix systems).
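A sketch of how that difference can be put to use: start from an empty string and let strncat() do a bounded, always-terminated copy. The helper name copy_bounded is made up for this example:

```c
#include <string.h>

/* Hypothetical helper: copies src into dst, which has room for size bytes.
   The result is always '\0'-terminated when size > 0.
   Returns 1 if src fit completely, 0 if it was truncated. */
int copy_bounded(char *dst, size_t size, const char *src)
{
    if (size == 0)
        return 0;
    dst[0] = '\0';
    /* strncat() appends at most size - 1 characters, then adds the '\0'.
       strncpy(dst, src, size) would leave dst unterminated whenever
       strlen(src) >= size. */
    strncat(dst, src, size - 1);
    return strlen(src) < size;
}
```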
>> > scanf()/fscanf()
>>
>> These can be used safely, although I can see a case of banning
>> "%s" in scanf and friends. (As opposed to a limited "%20s")
>
>Using %s or %[...] for scanning some user supplied data, is not the way to
>program a robust (or secure) program.
But what's preventing you from supplying a maximum field length
specifier, as suggested by Bill, above?
>Therefore I don't see the big problem with gets(), "everybody" knows the
>problem, and "nobody" uses that function. OTOH, even an experienced C
>programmer can get scanf() wrong.
Then, your definition of "experienced C programmer" doesn't match mine.
>> > system()
>>
>> The onus is on the programmer to see that there are no side effects.
>
>Yes, but since most programmers are not security aware, it is best to let
>them avoid using it.
Programmers who are not security aware should not write security sensitive
applications. It's as simple as that.
>"Dan Pop" <Dan...@cern.ch> wrote in message
>> In <B__R5.9377$cr6.2...@news1.oke.nextra.no> "Tor Rustad" <tor...@online.no.spam> writes:
>>
>> >
>> >In C, it is not only gets() where buffer overflow can happen, think about:
>> >
>> >strcpy()
>> >strcat()
>> >scanf()/fscanf()
>> >sprintf()
>>
>> ALL these functions can be *safely* used in a C program.
>
>Have I said otherwise?
>
>> They can be also
>> misused to generate buffer overflows, but you can do that with a plain
>> loop, too.
>
>Yes, but the point is that they not only *can* be misused, but frequently
>*are* misused.
An incompetent C programmer can misuse *any* C feature.
>Buffer overflow is "the most common security bug" of the last decade. The
>problem is not limited to gets() usage, in fact I hardly see proffesionals
>use gets().
You don't need *any* standard library function to overflow a buffer if
you're incompetent enough.
>Sloppy programming makes YABOB (Yet Another Buffer Overflow Bug), and (some)
>C programmers has a record for being sloppy.
Incompetent C programmers.
>> >and an other interesting function for system security is
>> >
>> >system()
>>
>> What's wrong with system()? It can't do more harm than the rest of the
>> program. I.e. on a Unix system, you can call system("rm -rf ~"), but you
>> can achieve the same results with a simple recursive function.
>
>Let say the user is able to manipulate the input buffer to a system()
>call...
How? Commands normally run via system() don't do any stdin processing.
It is the programmer who has full control over the behaviour of commands
executed via system(). Again, assuming a competent programmer.
>what harm the rest of the program can do, does not remove the problem above.
You have yet to demonstrate the problem.
>Dan Pop wrote:
>>
>> In <8v95cm$m2h$1...@oravannahka.helsinki.fi> Joona I Palaste <pal...@cc.helsinki.fi> writes:
>>
>> >I know why gets() is bad. What I want to know is, why does it exist in
>> >the first place?
>>
>> Historical reasons. I don't think they justified its inclusion in the
>> C standard, but the ANSI committee didn't ask my advice.
>>
>> >gets() is a function that's:
>> >- ISO standard
>> >- impossible to guarantee will have defined behaviour even if used in
>> >the correct context
>>
>> Not entirely true. Consider an implementation that limits the length
>> of a line input from the keyboard (e.g. MSDOS) and which doesn't allow
>> the redirection of stdin. Use an array longer than the system-imposed
>> limit and you're 100% safe on that platform.
>
>I'll run your program with:
>
> C:\mydir>yourprog < filewithlonglines
>
>and I think I can do it in.
Nope. Did you bother to read what I wrote above? I have explicitly
prohibited stdin redirection.
>Would you care to reduce the percentage?
Nope. Here's the source code of myprog:
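#include <stdio.h>

char line[1000];    /* far longer than the console's line-length limit */

int main(void)
{
    /* Reattach stdin to the console, so that redirection has no effect. */
    if (freopen("CON", "r", stdin) == NULL) {
        perror("CON");
        return 1;
    }
    gets(line);
    puts(line);
    return 0;
}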
Try to crash it on MS-DOS and let me know how you did it.
> Bill Godfrey wrote:
>>
>> However, gets() can't be used safely, unless you have some way
>> of knowing how big the next line will be.
>
> One situation in which you can use gets() safely is when you have
> written a file yourself, say a temporary, and want to read it back
> later. Then, if you have carefully limited the length of written
> lines, you can use that same size in the read buffer. This only
> leaves the possibility of i/o system glitches that mung the lines
> during read or write.
That is, you depend on nobody else having written to the file in between
your write and your read. gets() still remains insecure.
>
> In this case the internal \n stripping may be useful.
>
I think this just saves you some typing compared to
fgets(). It really does not help in any way.
Z
--
LISP is worth learning for the profound enlightenment experience you
will have when you finally get it; that experience will make you a
better programmer for the rest of your days. Eric S. Raymond
Yes, and the safe operation of your toaster similarly depends on some idiot not
sticking a fork into it when it's plugged in. The only proper input to a
toaster is bread, and the event of pushing down the slider; if you feed it
bread, then it won't hurt you unless it is defective.
If some malicious user can modify a program's *internal* file, such as a
temporary file, in order to exploit that program, then the security flaw is in
the operating system, or at the very least in the code which set up that file
without the appropriate protection. Or perhaps the security hole is the result
of the user following faulty security procedures, such as running programs of
dubious origin.
Files that can come from the environment are a different matter; keeping
with the toaster analogy, they are like the bread rather than the fork.
If a program is not robust to malformed files, then trojan horses can
be smuggled in those files. A recent example of this was one version of the
popular WinAmp program. MP3 playlists could be constructed that would cause
WinAmp to execute arbitrary code. Since people share these files over a
network, it was a real security hole.
> In <8vcdmd$r3b$1...@neptunium.btinternet.com> "John Castle" <J.E.C...@btinternet.com> writes:
>
> >Does MSDOS limit line length from the keyboard(not under NT 4.0!)? I don't
> >believe DOS 6.2 does either so what version was that then Dan?
>
> The DOS console of Win95 limits it at 254 characters. If you continue
> to press keys, it beeps back at you. I can hardly believe that this
> limitation was not inherited from the "real thing".
It is, but under MS-DOS (well, under the default command.com anyway) the
limit is 127 characters. Fixed limits lose, believe you me.
Richard
> John Hascall <jo...@iastate.edu> wrote:
>
> > What sets gets() apart is
> > not that it is POSSIBLE to use it dangerously,
> > but that it is NOT POSSIBLE to use it safely.
>
> Sure it is:
>
> {
> char c[50];
>
> puts("If you type more than 49 characters, I'll delete everything: ");
> fflush(stdout);
> gets(c);
>
> }
>
> I doubt that anyone will enter more than 50 characters. :)
All I can say to that is: you really overestimate the average user's
counting ability.
Richard
Under NT, run COMMAND.COM, then see how many characters you can type.
--
Tristan Styles #1485
Failure is not an Option
It is Standard Operating Procedure
<red face>
Thanks for the correction, I tend to use strcat() myself.
</red face>
> >> > scanf()/fscanf()
> >>
> >> These can be used safely, although I can see a case of banning
> >> "%s" in scanf and friends. (As opposed to a limited "%20s")
> >
> >Using %s or %[...] for scanning some user supplied data, is not the way to
> >program a robust (or secure) program.
>
> But what's preventing you from supplying a maximum field length
> specifier, as suggested by Bill, above?
Nothing, but programmers still forget (or don't know).
> >Therefore I don't see the big problem with gets(), "everybody" knows the
> >problem, and "nobody" uses that function. OTOH, even an experienced C
> >programmer can get scanf() wrong.
>
> Then, your definition of "experienced C programmer" doesn't match mine.
I have no problem believing that your definition is different from mine.
> >> > system()
> >>
> >> The onus is on the programmer to see that there are no side effects.
> >
> >Yes, but since most programmers are not security aware, it is best to let
> >them avoid using it.
>
> Programmers who are not security aware should not write security sensitive
> applications. It's as simple as that.
I agree that they should not do so, but the problem is that they still do
it. It is not always easy to see the security implications of a program.
--
Tor <torust AT online DOT no>
I never make errors, my programs just have random behaviour sometimes.
So what?
String handling in C is more difficult to get right than in many other
languages. So what is the result of this? As far as I can see, C is losing
ground and I don't like it. Also, I think the C string handling is a more
severe problem, than missing a <stdgui.h> header/interface. I hate GUI
programming anyway, and have no problems leaving that to the OO camp.
What if ISO C had something like this:
string string_func_in_2010(void)
{
    string str1, str2, str3 = "!";

    str1 = "Hello ";                 /* malloc, strcpy */
    str2 = str1 + "world";           /* malloc, strcpy, strcat */
    str2 = str2 + str3;              /* realloc, strcat */
    if (str1 == str2 || str1 == "Hello world!")    /* strcmp */
        puts(str2);
    /* free(str1), free(str3) */
    return str2;                     /* need garbage collection? */
}
Here, I have assumed the 'string' type to be simply 'char *'; an optimizer can
replace the dynamic memory handling indicated above with some statically
allocated buffers.
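For comparison, this is roughly what the "+" in such a proposal would have to expand to in today's C. It is only a sketch, and concat_strings is a hypothetical name, not part of any proposal:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated string holding a followed
   by b, or NULL if malloc() fails. The caller must free() the result. */
char *concat_strings(const char *a, const char *b)
{
    size_t la = strlen(a);
    size_t lb = strlen(b);
    char *r = malloc(la + lb + 1);   /* room for both parts and the '\0' */

    if (r == NULL)
        return NULL;
    memcpy(r, a, la);
    memcpy(r + la, b, lb + 1);       /* copies b's terminator too */
    return r;
}
```

The explicit free() that every caller owes is exactly the bookkeeping a built-in type would have to hide.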
I am sure there are many problems with introducing something like the above
(you will all tell me), but just flaming me will not help in solving the real
problem...
Another approach is to have built-in bounds checking on arrays (e.g.
stackguard), but that might hurt performance too much and does not help
readability as a new 'string' type would.
> >Buffer overflow is "the most common security bug" of the last decade. The
> >problem is not limited to gets() usage, in fact I hardly see proffesionals
> >use gets().
>
> You don't need *any* standard library function to overflow a buffer if
> you're incompetent enough.
True, but we still can't close our eyes to the fact that C programmers
frequently misuse some standard library functions. To address this problem,
safer libraries have been written (e.g. libsafe), and there are also
automated tools for detecting such problems.
> >Sloppy programming makes YABOB (Yet Another Buffer Overflow Bug),
> > and (some) C programmers has a record for being sloppy.
>
> Incompetent C programmers.
Many are. Regulars of this ng do a great job of doing something about
this, but I will *never* assume that my co-worker is at Dan Pop level.
> >> >and an other interesting function for system security is
> >> >
> >> >system()
> >>
> >> What's wrong with system()? It can't do more harm than the rest of the
> >> program. I.e. on a Unix system, you can call system("rm -rf ~"), but you
> >> can achieve the same results with a simple recursive function.
> >
> >Let say the user is able to manipulate the input buffer to a system()
> >call...
>
> How? Commands normally run via system() don't do any stdin processing.
> It is the programmer who has full control over the behaviour of commands
> executed via system(). Again, assuming a competent programmer.
I can't make that assumption (competent programmer), and I have seen
system() used with an argument dependent on user input...
> >what harm the rest of the program can do, does not remove the problem above.
>
> You have yet to demonstrate the problem.
I don't have to demonstrate the problem, others have done that for years.
Just look at some CERT reports.
Nope - didn't read the redirection part :-). I suppose you would
count mounting a TSR to intercept the BDOS calls as redirection
also?
At any rate, the fact that you or I can think of ways to let the
rats get at it reinforces the "don't use gets" dictum. No need
for further bandwidth.
>Dan...@cern.ch (Dan Pop) wrote:
>
>> In <8vcdmd$r3b$1...@neptunium.btinternet.com> "John Castle" <J.E.C...@btinternet.com> writes:
>>
>> >Does MSDOS limit line length from the keyboard(not under NT 4.0!)? I don't
>> >believe DOS 6.2 does either so what version was that then Dan?
>>
>> The DOS console of Win95 limits it at 254 characters. If you continue
>> to press keys, it beeps back at you. I can hardly believe that this
>> limitation was not inherited from the "real thing".
>
>It is, but under MS-DOS (well, under the default command.com anyway) the
>limit is 127 characters.
This is the limit for COMMAND.COM's command line. I'm not sure if the
limit for a C program's stdin comes from MS-DOS or from the implementation.
The 254 char limit was measured with VC++ 4.2.
> On Tue, 21 Nov 2000 08:36:37 +0100, Z <Zoran....@daimlerchrysler.com> wrote:
> >Once upon a while "CBFalconer" <cbfal...@my-deja.com> wrote:
> >
> >> Bill Godfrey wrote:
> >>>
> >> One situation in which you can use gets() safely is when you have
> >> written a file yourself, say a temporary, and want to read it back
> >> later. Then, if you have carefully limited the length of written
> >> lines, you can use that same size in the read buffer. This only
> >> leaves the possibility of i/o system glitches that mung the lines
> >> during read or write.
> >
> >That is, you depend on that nobody else wrote the file inbetween
> >your read/write sequence. gets() still remains insecure.
>
> Yes, and the safe operation of your toaster similarly depends on some idiot not
> sticking a fork into it when it's plugged in. The only proper input to a
> toaster is bread, and the event of pushing down the slider; if you feed it
> bread, then it won't hurt you unless it is defective.
The difference is that there is no way to prevent this fork from being
stuck in, because you can't limit the entry to slices of bread only
except by Star Trek technology; but there _is_ a way to prevent gets()
from overrunning your buffer, to wit, by using the equally useful, and
safer, fgets() or scanf("%...s"). Thus, the toaster is as safe as it
will ever get, but gets() is not as safe as the readily available
alternative.
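A sketch of the fgets() alternative with the newline stripped, so that it behaves like gets() on well-formed input. The wrapper name read_line is an assumption for illustration:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical gets() replacement: reads at most size - 1 characters of one
   line from fp into buf and removes the trailing '\n' if one was read.
   Returns buf, or NULL at end of input or on a read error. */
char *read_line(char *buf, int size, FILE *fp)
{
    if (fgets(buf, size, fp) == NULL)
        return NULL;
    buf[strcspn(buf, "\n")] = '\0';   /* cut at the newline, if present */
    return buf;
}
```

Called as read_line(buf, (int)sizeof buf, stdin), it can never write past the buffer. If a line is too long, the remainder simply stays in the stream and is returned by the next call.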
Richard
>Nope - didn't read the redirection part :-). I suppose you would
>count mounting a TSR to intercept the BDOS calls as redirection
>also?
Once you've done that, your OS is no longer MS-DOS.
>"Dan Pop" <Dan...@cern.ch> wrote in message
>>
>> You have yet to demonstrate the problem.
>
>I don't have to demonstrate the problem, others have done that for years.
>Just look at some CERT reports.
You have to demonstrate that a *correctly* used system() is insecure.
:> You have yet to demonstrate the problem.
: I don't have to demonstrate the problem, others have done that for years.
: Just look at some CERT reports.
That any program with privilege > the running user would call system() with
tainted data is just plain dumb. That said, plenty of them did/do just
that. They are systematically being fixed. If I were teaching C, any
student who did not untaint data before calling system would not get a
passing score.
It is a matter of educating them to "Don't Do That" (tm).
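One possible shape for such an untainting step, sketched under the assumption that only plain alphanumeric search terms are legitimate. The function name is_safe_pattern and the whitelist policy are both invented for this example:

```c
#include <ctype.h>

/* Hypothetical check: accepts only non-empty strings made up entirely of
   letters and digits, so that no quote, semicolon, space or other shell
   metacharacter can reach the command line.
   Returns 1 if s is acceptable, 0 otherwise. */
int is_safe_pattern(const char *s)
{
    if (*s == '\0')
        return 0;
    for (; *s != '\0'; s++) {
        if (!isalnum((unsigned char)*s))
            return 0;
    }
    return 1;
}
```

A whitelist like this rejects the "blibble' /dev/null; something_nasty; grep 'hello" input from earlier in the thread. The program would build the command string and call system() only after the check succeeds.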
Just because CERT has historically had a lot of problems with system()
doesn't make system() bad, it makes programmers bad. If you take away
system, I'll roll my own:
(on UNIX)
fork(), exec()
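A rough sketch of what rolling your own might look like on a POSIX system. The function name run_command is hypothetical, and the code assumes execvp() and waitpid() are available:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical system() substitute: runs argv[0] with the given arguments
   and waits for it to finish. No shell is involved, so the arguments are
   never re-parsed. Returns the child's exit status, or -1 on failure. */
int run_command(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                    /* fork failed */
    if (pid == 0) {
        execvp(argv[0], argv);        /* only returns on error */
        _exit(127);                   /* command could not be executed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the arguments are passed as a vector instead of a command string, user data in argv[1] cannot start a second command. It can still be misused in other ways, so validation remains the programmer's job.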
--
Tim Hockin
tho...@isunix.it.ilstu.edu
This program has been brought to you by the language C and the number F.
ZZ
Then you simply re-format their hard-drive, after printing, "I told
you so" :)
I know that a properly written C program is not a security risk.
You didn't comment on my string idea, was it really that bad? Has there been
a proposal to ISO about including such a thing?
"Dan Pop" <Dan...@cern.ch> wrote in message
news:8vcum0$4v5$1...@sunnews.cern.ch...
> >Therefore I don't see the big problem with gets(), "everybody" knows the
> >problem, and "nobody" uses that function. OTOH, even an experienced C
> >programmer can get scanf() wrong.
>
> Then, your definition of "experienced C programmer" doesn't match mine.
So where does one find one of these "experienced C programmers" by your
definition - a definition which apparently does not allow for oversights,
mistakes, accidents, brain farts, effects of medications or indeed any
potential source or cause of performing less than absolutely perfectly at
all times?
I took an undergraduate course in physics once, where the students were told
that they would fail the final exam if they got a velocity greater than the
speed of light (I am still afraid of getting such big numbers).
Using gets() does qualify for a score of 0, but that might be too hard for
non-validated input to a system() call... in some cases it's not clear that
it will pose a security risk.
> It is a matter of educating them to "Don't Do That" (tm).
First, there is a need to educate some teachers.
Just tried it under Win2K, and it's less than two lines worth.
>You didn't comment on my string idea, was it really that bad?
There's nothing wrong with the <string.h> stuff.
>Has there been a proposal to ISO about including such a thing?
No idea.
>[snips]
>
>"Dan Pop" <Dan...@cern.ch> wrote in message
>news:8vcum0$4v5$1...@sunnews.cern.ch...
>
>> >Therefore I don't see the big problem with gets(), "everybody" knows the
>> >problem, and "nobody" uses that function. OTOH, even an experienced C
>> >programmer can get scanf() wrong.
>>
>> Then, your definition of "experienced C programmer" doesn't match mine.
>
>So where does one find one of these "experienced C programmers" by your
>definition - a definition which apparently does not allow for oversights,
>mistakes, accidents, brain farts, effects of medications or indeed any
>potential source or cause of performing less than absolutely perfectly at
>all times?
An experienced C programmer is supposed to be familiar with the potential
problems of each function from the standard library (that he is using)
and to take them into account when coding. Ditto about commonly used
platform specific extensions.
No perfection is required for that, only *experience* (that's why we call
him "experienced" C programmer in the first place).
Someone who has used, say, scanf for five years and is still not fully
acquainted with all the relevant issues is a poor programmer and will
remain a poor programmer up to the end of his career.
Someone who avoids scanf like the plague, however, is IMHO a wise man.
(That is not to say that those who don't are not.)
--
Richard Heathfield
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton
On my system, dosbox under W98, the command line limit is 511
chars. COMMAND.COM is nowhere to be seen. But shell command line
limits are NOT the stdin limits.
> Richard Bos <in...@hoekstra-uitgeverij.nl> wrote:
>
> > clar...@yahoo.com (Clark S. Cox, III) wrote:
> >
> > > I doubt that anyone will enter more than 50 characters. :)
> >
> > All I can say to that is: you really overestimate the average user's
> > counting ability.
>
> Then you simply re-format their hard-drive, after printing, "I told
> you so" :)
If only I could, Clark, if only I could! ;->
Richard
>Dan Pop wrote:
>>
><snip>
>>
>> Someone who has used, say, scanf for five years and is still not fully
>> acquainted with all the relevant issues is a poor programmer and will
>> remain a poor programmer up to the end of his career.
>
>Someone who avoids scanf like the plague, however, is IMHO a wise man.
Someone who uses scanf when it is the right tool for the job is an even
wiser man.
It's basically the choice between learning how (and when) to use your
tools and avoiding the tools you don't know how to use. IMHO, the first
alternative is the wisest.
Sometimes, the same job can be done with a sharp tool and with a blunt one.
The sharp one will do it much faster, but, if you don't know how to use it
properly, you may hurt yourself or botch the job. If you look into the
toolkit of a professional, you'll find plenty of sharp tools...
>On my system, dosbox under W98, the command line limit is 511
>chars. COMMAND.COM is nowhere to be seen. But shell command line
>limits are NOT the stdin limits.
You didn't tell us what (if any) is the stdin limit (when not redirected)
on that system.
Not if he has not yet learned to use scanf correctly. And not when it
isn't the right tool for the job. Whether scanf is the right tool for a
given job is a matter of opinion, and your opinion differs from mine.
> It's basically the choice between learning how (and when) to use your
> tools and avoiding the tools you don't know how to use. IMHO, the first
> alternative is the wisest.
Sure. There is no point in learning how to use unnecessary tools,
though. Whether scanf is necessary is a matter of opinion.
> Sometimes, the same job can be done with a sharp tool and with a blunt one.
> The sharp one will do it much faster, but, if you don't know how to use it
> properly, you may hurt yourself or botch the job. If you look into the
> toolkit of a professional, you'll find plenty of sharp tools...
Right. No professional carries every tool around in his toolkit,
however; just the ones he is likely to need. Whether scanf belongs in
the toolkit is a matter of opinion.
MS Documentation for 16bit 8.00c compiler limits argument list of
system() to 128 bytes. Total number of bytes that can be input on
the command line on my Win98 box also happens to be 128 (127 + CR)
Documentation for 32 bit compiler states that limit of argument
lists is system dependent - so it does in fact depend on the limitations
of the command interpreter (which is cmd.exe on NT based systems)
Interesting that win95 allows 254 bytes on the cmd line, but MS
reverted back to 128 with win98.
I wrote a test utility to count bytes in arguments sent via system()
and found w98 could not handle 128 bytes... I received an E2BIG error
when trying to send more than 122 bytes including null character!
FreeBSD on the other hand was able to handle > 20480 bytes to system(),
though the command line seems to be limited to 1024 bytes.
Later maybe I'll boot into NT and see what cmd.exe's limitations are.
/* a.c */
#include <stdio.h>
#include <stdlib.h>
int
main(int argc, char *argv[])
{
    int bytes = 0;
    (void)argc;             /* unused; argv is NULL-terminated */
    while (*++argv) {
        char *ptr = *argv;
        while (*ptr++)
            bytes++;
    }
    printf("Bytes from cmd line: %d\n", bytes);
    return EXIT_SUCCESS;
}
/* b.c */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int
main(void)
{
    char s[122];
    int ret;
    memset(s, 'a', sizeof s);
    s[1] = ' ';             /* command "a" followed by one long argument */
    s[sizeof s - 1] = 0;
    if ((ret = system(s)) != 0)
        perror("system");
    return EXIT_SUCCESS;
}
I don't know if it has one, and between weeding through drivel on
many newsgroups, things my wife can find for me to do NOW, etc. I
am not about to find out.
At any rate, when I designed the stdin for my embedded systems,
years ago, I had two possible hardware devices, from the viewpoint
of the application. One was pretty normal, it had a maximum line
length and offered input line editing - nothing was known to the
application before the user hit CR. A system routine announced
the presence or absence of input data, so the software could avoid
issuing a read that would involve a (possibly) long wait.
The other was char by char, immediate, with no editing. It was
normally interrupt driven into a (limited) buffer, again with a
system dataready routine (the same system call). If the buffer
overran this was deliberately set up to discard the oldest char, so
that an abort key would eventually be noticed. It was simple, and
needed no special knowledge of any abort keys, etc. If the
application kept up, there was no limit to the line length the
application could see.
The application had no knowledge of which of these, or possibly
redirection, was supplying its input. The system had no
multi-processing, threads, etc.
I could use the 2nd form to implement full screen editors and the
ilk.
All of these devices (note topicality :-) are perfectly legitimate
stdins for ISO C systems. All of them are implementable on any
opsys of which I am aware, although maybe not via pure C coding.
And they all serve to customize an application, written to
internationally agreed standards, to a particular situation.
Moral: - don't use gets() even for your dinky PIC system. Also,
don't assume certain characters, such as BS, CR, LF, CTL-C, DEL,
whatever, cannot appear in your input.
Just realized you guys were talking about stdin when not redirected.
Sorry, I can be slow sometimes... :)
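#include <stdio.h>
#include <stdlib.h>
int
main(void)
{
    char buf[1024];
    int bytes = 0;
    if (fgets(buf, sizeof buf, stdin) == NULL)
        return EXIT_FAILURE;    /* nothing read; buf is indeterminate */
    while (buf[bytes++])        /* count includes the terminating null */
        ;
    printf("%d bytes\n", bytes);
    return EXIT_SUCCESS;
}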
16 bit cl 8.00c compiler, ran on win98 & win95 = 129 bytes
32 bit cl 10.20.6166 compiler, ran on same machines = 256 bytes
What does this mean? stdin is limited by the implementation.
But at what point would the implementation be limited by the OS?
Hmm, isn't that what you guys were trying to determine to begin
with? I told you I can be slow sometimes :)
Regards,
Mark
> No professional carries every tool around in his toolkit, however;
> just the ones he is likely to need. Whether scanf belongs in the
> toolkit is a matter of opinion.
I don't know most parts of scanf either, but do you seriously claim
that you _never_ use it? Not even, say, sscanf to parse in some
integers?
--
Nils Goesche
"Don't ask for whom the <CTRL-G> tolls."
>Dan Pop wrote:
>>
>> In <3A1B7942...@eton.powernet.co.uk> Richard Heathfield <bin...@eton.powernet.co.uk> writes:
>>
>> >Dan Pop wrote:
>> >>
>> ><snip>
>> >>
>> >> Someone who has used, say, scanf for five years and is still not fully
>> >> acquainted with all the relevant issues is a poor programmer and will
>> >> remain a poor programmer up to the end of his career.
>> >
>> >Someone who avoids scanf like the plague, however, is IMHO a wise man.
>>
>> Someone who uses scanf when it is the right tool for the job is an even
>> wiser man.
>
>Not if he has not yet learned to use scanf correctly. And not when it
>isn't the right tool for the job. Whether scanf is the right tool for a
>given job is a matter of opinion, and your opinion differs from mine.
I don't think so. It is the right tool for the job if it gets it done
easier and in a cleaner/clearer way than other tools.
Otherwise, one could argue that more than half of the tools in the standard
library are redundant, because their functionality can be achieved by
other means. A few examples: most of <string.h> and <ctype.h>,
malloc and calloc.
>> Sometimes, the same job can be done with a sharp tool and with a blunt one.
>> The sharp one will do it much faster, but, if you don't know how to use it
>> properly, you may hurt yourself or botch the job. If you look into the
>> toolkit of a professional, you'll find plenty of sharp tools...
>
>Right. No professional carries every tool around in his toolkit,
>however; just the ones he is likely to need. Whether scanf belongs in
>the toolkit is a matter of opinion.
When you're programming in standard C, the contents of your toolbox were
already decided by the C standardisation committee. Whether you use it
or not, the tool is still there...
>> Dan Pop wrote:
>> > This is the limit for COMMAND.COM's command line. I'm not sure if the
>> > limit for a C program stdin comes from MS-DOS or from the implementation.
>> > The 254 char limit was measured with VC++ 4.2.
><snip>
>
>16 bit cl 8.00c compiler, ran on win98 & win95 = 129 bytes
>32 bit cl 10.20.6166 compiler, ran on same machines = 256 bytes
>What does this mean? stdin is limited by the implementation.
I did suspect that.
>But at what point would the implementation be limited by the OS?
The OS limit (on MS-DOS) is actually imposed by the BIOS and is 16.
As you can see, this limit doesn't affect the limit imposed by the
implementation. It only limits the amount of type-ahead you can do.
Another OS-imposed limit is the amount of memory available for stdin's
buffer: i.e. you can't have a 1MB stdin buffer on MS-DOS.
>Dan Pop wrote:
>>
>> In <3A1B37ED...@my-deja.com> CBFalconer <cbfal...@my-deja.com> writes:
>>
>> >On my system, dosbox under W98, the command line limit is 511
>> >chars. COMMAND.COM is nowhere to be seen. But shell command line
>> >limits are NOT the stdin limits.
>>
>> You didn't tell us what (if any) is the stdin limit (when not redirected)
>> on that system.
>
>I don't know if it has one, and between weeding through drivel on
>many newsgroups, things my wife can find for me to do NOW, etc. I
>am not about to find out.
It's extremely easy to find out: just call gets and keep a key pressed
until you stop getting it echoed back. Press Return and display the
length of the string you've managed to read from stdin.
I have never used scanf, or its ilk, by choice, in any program ever, as
far as I can recall. If I have used any of them, it would indeed be
sscanf, but I don't recall choosing it over what I consider to be better
alternatives.
I have of course *maintained* code containing sscanf and fscanf (but
never scanf itself!), and therefore had to learn enough about these
functions to be able to maintain programs using them. I'm surprised Dan
hasn't challenged me along those lines, since it's a much more promising
line of attack. (Come on Dan, do wake up!)
>I have of course *maintained* code containing sscanf and fscanf (but
>never scanf itself!), and therefore had to learn enough about these
>functions to be able to maintain programs using them. I'm surprised Dan
>hasn't challenged me along those lines, since it's a much more promising
>line of attack. (Come on Dan, do wake up!)
I'm not interested in attacking anyone. If you want a challenge, here it
is: write code to read a full line from stdin into a buffer, retain at
most the first 80 characters (discarding any excess characters), drop
the newline character from the buffer and don't leave anything from
that line in stdin's internal buffer.
My solution:
int rc;
char line[80 + 1];
rc = scanf("%80[^\n]%*[^\n]", line);
getchar(); /* remove the newline character from stdin */
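A sketch of the same format string wrapped in a function, so the corner cases are visible. The name read80 and the FILE * parameter are assumptions added here, not part of the posted solution. One detail worth knowing: on an empty line the %80[^\n] directive matches nothing, the call returns 0, and the buffer is left untouched, so the sketch clears it first.

```c
#include <assert.h>
#include <stdio.h>

/* read80: hypothetical wrapper around the scanf-style approach above,
 * using fscanf so the stream can be chosen. Stores at most 80
 * characters of the next line in line (which must hold 81 bytes) and
 * discards the rest of the line together with its newline.
 * Returns 1 if a line was read (possibly empty), 0 at end of file. */
int read80(char line[80 + 1], FILE *f)
{
    int rc, c;

    assert(line != NULL && f != NULL);
    line[0] = '\0';             /* %[ stores nothing on an empty line */
    rc = fscanf(f, "%80[^\n]%*[^\n]", line);
    c = getc(f);                /* remove the newline character */
    return !(rc == EOF && c == EOF);
}
```

A final line without a trailing newline is still reported as a line; only a call that finds nothing at all returns 0.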
And a very good solution it is too. Not the one I would have come up
with, obviously. Furthermore, your solution uses fewer lines of code
than mine would.
I have, I think, always maintained that scanf is IMHO for expert
programmers only. I see no reason, looking at this example, to change my
mind.
I have 25K lines of C code-plus-comments here with not a *scanf* in sight.
Since it's a compiler-and-interpreter, however, it's probably not a
good example.
--
Chris "electric hedgehog" Dollin
C FAQs at: http://www.faqs.org/faqs/by-newsgroup/comp/comp.lang.c.html
/*-----------------------------------------------------------------
* skip to eoln or eof. USUALLY return '\n'
*/
int flushln(FILE *f)
{
int ch;
ch = getc(f);
while (('\n' != ch) && (EOF != ch)) ch = getc(f);
return ch;
} /* flushln */
/* ------------------------------------------------
/* fill buffer, max size max, */
/* return count of chars installed */
int loadbuf(char * buf, int max, FILE *f)
{
int ch, posn;
ch = getc(f); posn = 0;
while ((max-- > 0) && (EOF != ch) && ('\n' != ch) {
buf[posn++] = ch;
ch = getc(f);
}
ungetc(ch, f);
return posn;
} /* loadbuf */
/* challenge operation */
void challenge(void);
{
#define SZ 80;
char buf[SZ];
/* leaves line, without any termination mark */
(void) loadbuf(buf, SZ, stdin);
(void) flushln(stdin);
} /* challenge */
/* untested code */
This line won't compile.
> /* leaves line, without any termination mark */
> (void) loadbuf(buf, SZ, stdin);
Neither will this one.
You are confusing buffering, type-ahead, line-editing, etc. You
can have any size of buffer anywhere, the buffer is just
implemented external to the OS, in the brain of the typist. Just
because the particular keyboard driver in use injects an
extraneous CR every Nth key doesn't affect the abstraction.
MSDOS in particular, can always install a new driver at boot up.
So work with the abstraction, not the particular current
implementation.
Bad name for this function; in C, flushing means pushing buffered output
to the environment.
>{
> int ch;
>
> ch = getc(f);
> while (('\n' != ch) && (EOF != ch)) ch = getc(f);
> return ch;
>} /* flushln */
That's gross, writing two calls to getc() like that. This is an opportunity to
apply the do/while loop:
    do {
        ch = getc(f);
    } while (ch != EOF && ch != '\n');
or the common C idiom:
    while ((ch = getc(f)) != EOF && ch != '\n')
        ;
>/* ------------------------------------------------
>/* fill buffer, max size max, */
>/* return count of chars installed */
>int loadbuf(char * buf, int max, FILE *f)
>{
> int ch, posn;
>
> ch = getc(f); posn = 0;
> while ((max-- > 0) && (EOF != ch) && ('\n' != ch) {
In the case when max is zero, why bother calling getc? ;)
> buf[posn++] = ch;
> ch = getc(f);
> }
> ungetc(ch, f);
What if the loop terminated due to ch == EOF? You end up doing ungetc(EOF, f);
which will have the effect of pushing the character (unsigned char) EOF
into the stream.
> return posn;
No null termination?
In article <slrn91qf5...@ashi.FootPrints.net>
Kaz Kylheku <k...@ashi.footprints.net> writes:
>That's gross, writing two calls to getc() like that. This is an
>opportunity to apply the do/while loop:
> do {
> ch = getc(f);
> } while (ch != EOF && ch != '\n');
>or the common C idiom:
> while ((ch = getc(f)) != EOF && ch != '\n')
> ;
As someone else pointed out to me so recently, either one could
perhaps stand to double-check feof(f) when ch==EOF, just in case
sizeof(int) is 1. Such implementations pose other problems though,
and I tend not to worry about them.
>>int loadbuf(char * buf, int max, FILE *f)
A "size_t max" might be better, but ...
>> while ((max-- > 0) && (EOF != ch) && ('\n' != ch) {
... using size_t with signed comparisons can get tricky. (This
would still work since "max--" will, if max is 0, produce the value
0 -- of whatever type -- while setting max to (size_t)-1, i.e.,
the maximum possible size_t value.)
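The wrap-around described here can be shown in a few lines. The helper name take_one is an assumption made for illustration; it isolates the "max-- > 0" test with an unsigned counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* take_one: hypothetical helper mirroring the "max-- > 0" test with
 * an unsigned counter. Returns 1 if a slot was available. Note that
 * *max is decremented even when the test fails, so a zero counter
 * wraps around to SIZE_MAX. */
int take_one(size_t *max)
{
    assert(max != NULL);
    return (*max)-- > 0;
}
```

The test itself gives the right answer when the counter is zero, but the counter is then huge, so testing it a second time would succeed. That is the "tricky" part: the pattern is safe only if the loop stops at the first failure.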
>In the case when max is zero, why bother calling getc? ;)
Just to be able to ungetc() it:
>> buf[posn++] = ch;
>> ch = getc(f);
>> }
>> ungetc(ch, f);
>What if the loop terminated due to ch == EOF? You end up doing ungetc(EOF, f);
>which will have the effect of pushing the character (unsigned char) EOF
>into the stream.
This is okay: an ungetc(EOF) call is required to leave the stream
unchanged. (See 4.9.7.11, p. 145, ANSI Classic.)
Note that this requirement conflicts with the idea of EOF being a
valid character value when sizeof(int)==1. This is one of those
"other problems" I mentioned above.
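A small sketch of the ungetc(EOF) behaviour on an ordinary implementation where int is wider than char. The helper name peek_char is an assumption for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* peek_char: hypothetical helper. Returns the next character of f
 * without consuming it, or EOF at end of file. Pushing back EOF is
 * harmless: ungetc(EOF, f) fails and leaves the stream unchanged. */
int peek_char(FILE *f)
{
    int ch;

    assert(f != NULL);
    ch = getc(f);
    ungetc(ch, f);      /* does nothing when ch == EOF */
    return ch;
}
```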
--
In-Real-Life: Chris Torek, Berkeley Software Design Inc
El Cerrito, CA, USA Domain: to...@bsdi.com +1 510 234 3167
http://claw.bsdi.com/torek/ (not always up) I report spam to abuse@.
>Dan Pop wrote:
>>
>> int rc;
>> char line[80 + 1];
>>
>> rc = scanf("%80[^\n]%*[^\n]", line);
>> getchar(); /* remove the newline character from stdin */
>
>I have, I think, always maintained that scanf is IMHO for expert
>programmers only. I see no reason, looking at this example, to change my
>mind.
The above code could have been written by an inexperienced programmer
who took the time to read pages 245-246 from K&R2.
>Dan Pop wrote:
>>
>> In <3A1C112E...@geocities.com> Mark <eng...@geocities.com> writes:
>>
>> >But at what point would the implementation be limited by the OS?
>>
>> The OS limit (on MS-DOS) is actually imposed by the BIOS and is 16.
>> As you can see, this limit doesn't affect the limit imposed by the
>> implementation. It only limits the amount of type-ahead you can do.
>>
>> Another OS-imposed limit is the amount of memory available for stdin's
>> buffer: i.e. you can't have a 1MB stdin buffer on MS-DOS.
>
>You are confusing buffering, type-ahead, line-editing, etc. You
I don't think I was confusing anything.
>can have any size of buffer anywhere, the buffer is just
>implemented external to the OS, in the brain of the typist. Just
>because the particular keyboard driver in use injects an
>extraneous CR every Nth key doesn't affect the abstraction.
??? The keyboard driver doesn't inject anything at all.
>MSDOS in particular, can always install a new driver at boot up.
I was talking about what MSDOS *does*, not about what it could do.
>So work with the abstraction, not the particular current
>implementation.
You completely missed the point. Which was that, for a *given*
implementation, even gets() may be safely used. And I also provided a
concrete example.
The gratuitous semi in the #define in my original aside, the
above are style comments IMO and don't affect the correctness. If
the file is at EOF it will remain there, so I don't THINK pushing
EOF back matters - the important thing is that EOF is handled
during the reads.
The lack of null termination is a direct consequence of the
original specification, which you snipped. I compensated by
having the loadbuf() routine return the count.
If you don't call getc when max is zero, ungetc will get called
with junk argument. This is an extreme case, but I think
correctly handled here. I think a -ve max value is also handled
correctly.
I did mark it UNTESTED. What do you want from the initial type
in? Incidentally, I disagree as to the badness of the flushln
name - people expect it to work for input and thus misuse the
standard flush.
What I would criticise is the handling if miscalled with, say, f
== stdout. Which doesn't affect the satisfaction of the original
challenge.
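For reference, a sketch of how the posted routines might look with the compile errors fixed and the review comments folded in. This is an editorial reconstruction under assumptions, not the author's code: flushln is renamed skipln, challenge takes its buffer and stream as parameters, and loadbuf only reads a character when there is room for it, so the only character it ever pushes back is the newline.

```c
#include <assert.h>
#include <stdio.h>

#define SZ 80               /* no trailing semicolon */

/* skipln: skip to end of line or end of file; returns the last value
 * read ('\n' or EOF). */
int skipln(FILE *f)
{
    int ch;

    while ((ch = getc(f)) != EOF && ch != '\n')
        ;
    return ch;
}

/* loadbuf: store up to max characters of the current line in buf,
 * with no terminating null; returns the count stored. A newline that
 * ends the line early is pushed back for skipln to consume. */
int loadbuf(char *buf, int max, FILE *f)
{
    int ch = EOF, posn = 0;

    assert(buf != NULL && f != NULL);
    while (posn < max && (ch = getc(f)) != EOF && ch != '\n')
        buf[posn++] = (char)ch;
    if (posn < max && ch == '\n')
        ungetc(ch, f);
    return posn;
}

/* challenge: read one line from f, keep at most SZ characters in buf
 * (which must hold SZ bytes), discard the rest of the line.
 * Returns the number of characters kept. */
int challenge(char buf[SZ], FILE *f)
{
    int n = loadbuf(buf, SZ, f);

    (void)skipln(f);
    return n;
}
```

As in the original, the buffer is not null-terminated; the returned count is the only record of its length.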
"Dan Pop" <Dan...@cern.ch> wrote in message
news:8vf6uc$941$1...@sunnews.cern.ch...
> In <TMBS5.114$iY5.5...@news1.van.metronet.ca> "Kelsey Bjarnason" <kel...@no.spam.telus.net> writes:
>
> >[snips]
> >
> >"Dan Pop" <Dan...@cern.ch> wrote in message
> >news:8vcum0$4v5$1...@sunnews.cern.ch...
> >
> >> >Therefore I don't see the big problem with gets(), "everybody" knows the
> >> >problem, and "nobody" uses that function. OTOH, even an experienced C
> >> >programmer can get scanf() wrong.
> >>
> >> Then, your definition of "experienced C programmer" doesn't match mine.
> >
> >So where does one find one of these "experienced C programmers" by your
> >definition - a definition which apparently does not allow for oversights,
> >mistakes, accidents, brain farts, effects of medications or indeed any
> >potential source or cause of performing less than absolutely perfectly at
> >all times?
>
> An experienced C programmer is supposed to be familiar with the potential
> problems of each function from the standard library (that he is using)
> and to take them into account when coding.
Take another look at what I asked - I asked where I could find a programmer
who *never* makes a mistake, *never* overlooks anything, *never* gets
distracted, *never* experiences any effects on concentration due to
medications he may be on, and so on - in short, the perfect programmer,
incapable of mistakes.
Find me one of those, and we can agree that an "experienced C programmer",
by the implied definition you offered, actually exists. Personally, I think
it's a purely mythical construct.
I suspect that ANSI kept it in because no one wrote a serious
proposal to obsolete it. Does anyone have access to the written record
of ANSI proceedings in order to search for discussion or proposals
involving gets() ?
Maybe it's time to get such a proposal in the record for the next round
of C changes?
There is a tolerance for changes to C that "break" existing code,
and I'd assume that there would be a phase-in period before gets() was
totally eliminated (is it possible to create gets() with portable code
using other C standard library functions?).
- Larry Weiss
> totally eliminated (is it possible to create gets() with portable code
> using other C standard library functions?).
Of course. It is just about trivial, in fact. Here's the core
code of interest from the GNU libc implementation:
    while ((c = getchar ()) != EOF)
      if (c == '\n')
        break;
      else
        *p++ = c;
    *p = '\0';
(The rest is just error handling.)
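To make the point concrete, here is a sketch of a complete gets() work-alike built only on getc(). The name gets_from and the FILE * parameter are assumptions for illustration; this is not the GNU libc source. It deliberately keeps the original's flaw: it has no idea how large the destination is.

```c
#include <assert.h>
#include <stdio.h>

/* gets_from: hypothetical gets() work-alike reading from f instead of
 * stdin. Like gets(), it does not know the size of s and will overrun
 * it on a long line; shown only to illustrate that the function needs
 * nothing beyond getc(). */
char *gets_from(char *s, FILE *f)
{
    char *p = s;
    int c;

    assert(s != NULL && f != NULL);
    while ((c = getc(f)) != EOF && c != '\n')
        *p++ = (char)c;
    if (c == EOF && (p == s || ferror(f)))
        return NULL;            /* EOF with nothing read, or read error */
    *p = '\0';
    return s;
}
```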
--
"To get the best out of this book, I strongly recommend that you read it."
--Richard Heathfield
With that established there is small cost to simply phasing out
gets(). The programs that still must depend on it would just
supply their own gets() implementation.
- Larry Weiss
That makes me wonder just what the hypothetical minimal set
of C standard library functions are? That would be the smallest set of
standard library functions that could be used to build all the
rest of the standard library?
That sounds like a great C course homework assignment!
There's probably more than one interesting answer and you'd have to
get acquainted with the whole set of standard functions to give
a correct answer. You'd learn a lot by trying to find such subsets.
- Larry Weiss
> That makes me wonder just what the hypothetical minimal set
> of C standard library functions are? That would be the smallest set of
> standard library functions that could be used to build all the
> rest of the standard library?
Hmm. Interesting question. Here's my first stab at it, going by
the header file summaries in Annex D of the C89 standard:
ctype.h: (An odd case; these can be implemented portably for the
"C" locale, I think, but not necessarily for other locales.)
locale.h:
    setlocale
    localeconv
math.h:
    floor
    ceil
    fmod
    (I believe that other functions can be implemented as expansions
    of series or in terms of these functions and elementary
    operations, but I'm definitely not sure, especially for frexp and
    ldexp.)
setjmp.h:
    setjmp
    longjmp
signal.h:
    signal
    raise
stdarg.h:
    va_start
    va_arg
    va_end
stdio.h:
    remove
    rename
    tmpfile
    fclose
    fflush
    fopen
    freopen
    setvbuf
    getc
    putc
    ungetc
    fgetpos
    fseek
    fsetpos
    clearerr
    feof
    ferror
stdlib.h:
    realloc
    abort
    atexit
    exit
    getenv
    system
    mblen
    mbtowc
    wctomb
    mbstowcs
    wcstombs
string.h:
    strerror
time.h:
    clock
    difftime
    mktime
    time
    ctime
    gmtime
    localtime
--
"Give me a couple of years and a large research grant,
and I'll give you a receipt." --Richard Heathfield
>Ben Pfaff wrote:
>> Larry Weiss <l...@airmail.net> writes:
>> > totally eliminated (is it possible to create gets() with portable code
>> > using other C standard library functions?).
>>
>> Of course. It is just about trivial...
>>
>
>That makes me wonder just what the hypothetical minimal set
>of C standard library functions are? That would be the smallest set of
>standard library functions that could be used to build all the
>rest of the standard library?
>
>That sounds like a great C course homework assignment!
I suspect there are few C courses offering complete coverage of the
standard C library.
Then, there is the difference between a working implementation and a
merely conforming implementation (e.g. the malloc that always returns
a null pointer). In many cases, this difference is so subtle that not
even the c.l.c regulars can agree if a certain trivial implementation is
conforming or not.
>Larry Weiss <l...@airmail.net> writes:
>
>> That makes me wonder just what the hypothetical minimal set
>> of C standard library functions are? That would be the smallest set of
>> standard library functions that could be used to build all the
>> rest of the standard library?
>
>locale.h:
> setlocale
> localeconv
Neither is needed. This stuff can be implemented in standard C.
>stdio.h:
> remove
> rename
> tmpfile
> fclose
> fflush
> fopen
> freopen
> setvbuf
> getc
> putc
> ungetc
> fgetpos
> fseek
> fsetpos
> clearerr
> feof
> ferror
Given only a few system-specific primitives (e.g. the equivalents of the
Unix open, close, read, write, lseek) all of stdio.h can be implemented
in standard C.
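As a rough illustration of that layering, here is a sketch of a getc()-like routine on top of the POSIX read() primitive. The struct mini_in, MINI_EOF, and mini_getc are hypothetical names invented for this example; a real stdio also has to handle output, seeking, error and EOF indicators, and much more.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

#define MINI_EOF (-1)

/* mini_in: hypothetical, stripped-down stand-in for an input FILE. */
struct mini_in {
    int fd;                     /* descriptor from open(), pipe(), etc. */
    unsigned char buf[64];
    size_t pos, len;            /* next unread byte, bytes in buf */
};

/* mini_getc: getc() work-alike. Refills the buffer with read() when
 * it runs dry; returns the next byte as an unsigned char value, or
 * MINI_EOF on end of file or error. */
int mini_getc(struct mini_in *in)
{
    assert(in != NULL);
    if (in->pos == in->len) {
        ssize_t n = read(in->fd, in->buf, sizeof in->buf);
        if (n <= 0)
            return MINI_EOF;
        in->pos = 0;
        in->len = (size_t)n;
    }
    return in->buf[in->pos++];
}
```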
>time.h:
> clock
> difftime
> mktime
> time
> ctime
> gmtime
> localtime
ctime can be implemented in standard C. The standard itself provides a
sample implementation.
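A one-line sketch of that: the standard describes ctime(timer) as equivalent to asctime(localtime(timer)), and the sample algorithm it gives is for asctime. The name my_ctime is an assumption for illustration.

```c
#include <assert.h>
#include <time.h>

/* my_ctime: hypothetical name. Builds ctime() from asctime() and
 * localtime(), as the standard's description allows. As with ctime(),
 * the result points to static storage that later calls may overwrite. */
char *my_ctime(const time_t *timer)
{
    assert(timer != NULL);
    return asctime(localtime(timer));
}
```

This still leaves localtime as a primitive, so it shortens the list by one entry rather than removing time.h from it.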
The problem is not that it won't crash on a severely restricted
OS/compiler/whatever combination. The problem is that sooner or later,
your program _is_ going to be used on some DOS clone that has less
strict buffer sizes than the original, whether you planned this when you
wrote the code or not. And it doesn't matter at all that you
specifically outlaw this in your documentation. Your user _is_ going to
blame you for not mentioning that "Nothing but MS-DOS" doesn't just mean
"No Losedows", but also "No Wibble-DOS".
IOW, gets() is acceptable for throw-away code - but only if you _do_
throw that code away minutes after writing it. If there's any chance
that the result is let out of your sight for even seconds, gets() won't
do - _all_ security demands will, sooner or later, get circumvented, and
gets() won't be able to deal with that. Alternatives will.
Richard
>Dan...@cern.ch (Dan Pop) wrote:
>>
>> Try to crash it on MS-DOS and let me know how you did it.
>
>The problem is not that it won't crash on a severely restricted
>OS/compiler/whatever combination.
My point was precisely that on a "severely restricted
OS/compiler/whatever combination" even gets may be safely used.
>The problem is that sooner or later,
>your program _is_ going to be used on some DOS clone that has less
>strict buffer sizes than the original, whether you planned this when you
>wrote the code or not. And it doesn't matter at all that you
>specifically outlaw this in your documentation. Your user _is_ going to
>blame you for not mentioning that "Nothing but MS-DOS" doesn't just mean
>"No Losedows", but also "No Wibble-DOS".
We have already established that the limit is imposed by the stdio
implementation. So, if I only distribute the MSDOS binaries, I'm safe :-)
Not with that interface. Do you really know how it will behave on every
possible iteration of the OS? E.g. DOS 6.20 versus DOS 6.21?
> >The problem is that sooner or later,
> >your program _is_ going to be used on some DOS clone that has less
> >strict buffer sizes than the original, whether you planned this when you
> >wrote the code or not. And it doesn't matter at all that you
> >specifically outlaw this in your documentation. Your user _is_ going to
> >blame you for not mentioning that "Nothing but MS-DOS" doesn't just mean
> >"No Losedows", but also "No Wibble-DOS".
>
> We have already established that the limit is imposed by the stdio
> implementation. So, if I only distribute the MSDOS binaries, I'm safe :-)
If you have the source code for the compiler and it does not use any OS
services (which seems very doubtful).
Using gets() in any format is idiotic. Look at the internet worm. Can you
guarantee that the behavior will be identical if (instead of stdin) the input
is redirected from a file?
Anyone who uses gets() for any purpose is simply not thinking clearly, or has
malice in mind.
--
C-FAQ: http://www.eskimo.com/~scs/C-faq/top.html
"The C-FAQ Book" ISBN 0-201-84519-9
C.A.P. FAQ: ftp://cap.connx.com/pub/Chess%20Analysis%20Project%20FAQ.htm
What happens when it is piped to a file via redirection?
Just jumped in.
> >Do you really know how it will behave on every
> >possible iteration of the OS? E.g. DOS 6.20 versus DOS 6.21?
>
> Yes, I do. I don't know how it will behave on every possible C
> implementation for MSDOS, but this is a moot point as long as I only make
> binary releases.
For NT in a console window?
For Win98 & Millennium console windows?
For DR-DOS?
For DOS 3.2?
> >> >The problem is that sooner or later,
> >> >your program _is_ going to be used on some DOS clone that has less
> >> >strict buffer sizes than the original, whether you planned this when you
> >> >wrote the code or not. And it doesn't matter at all that you
> >> >specifically outlaw this in your documentation. Your user _is_ going to
> >> >blame you for not mentioning that "Nothing but MS-DOS" doesn't just mean
> >> >"No Losedows", but also "No Wibble-DOS".
> >>
> >> We have already established that the limit is imposed by the stdio
> >> implementation. So, if I only distribute the MSDOS binaries, I'm safe :-)
> >
> >If you have the source code for the compiler and it does not use any OS
> >services (which seems very doubtful).
>
> What MSDOS service(s) do you expect a stdin implementation to use, when
> stdin is connected to the system console?
An int21 of some sort (or an int10, but I have not done that stuff in a long
time, so I would have to look at Brown's interrupt list, which I have
hopefully misplaced where I won't ever find it again).
> >Using gets() in any format is idiotic. Look at the internet worm. Can you
> >guarantee that the behavior will be identical if (instead of stdin) the input
> >is redirected from a file?
>
> Did you actually look at my sample program? Can you crash it, by
> switching MSDOS versions or by redirecting its standard input? If not,
> my assertion at the beginning of my previous post is true, whether you
> like it or not.
No. Where is it?
> >Anyone who uses gets() for any purpose is simply not thinking clearly, or has
> >malice in mind.
>
> Or has more clues than you :-)
This is quite likely.
>"Dan Pop" <Dan...@cern.ch> wrote in message
>news:8vuffd$aft$1...@sunnews.cern.ch...
>> In <3a228bfd...@news.worldonline.nl> in...@hoekstra-uitgeverij.nl
>(Richard Bos) writes:
>>
>> >Dan...@cern.ch (Dan Pop) wrote:
>> >>
>> >> Try to crash it on MS-DOS and let me know how you did it.
>> >
>> >The problem is not that it won't crash on a severely restricted
>> >OS/compiler/whatever combination.
>>
>> My point was precisely that on a "severely restricted
>> OS/compiler/whatever combination" even gets may be safely used.
>
>Not with that interface.
Did you read this thread from its very beginning or are you a
Johnny-come-lately?
>Do you really know how it will behave on every
>possible iteration of the OS? E.g. DOS 6.20 versus DOS 6.21?
Yes, I do. I don't know how it will behave on every possible C
implementation for MSDOS, but this is a moot point as long as I only make
binary releases.
>> >The problem is that sooner or later,
>> >your program _is_ going to be used on some DOS clone that has less
>> >strict buffer sizes than the original, whether you planned this when you
>> >wrote the code or not. And it doesn't matter at all that you
>> >specifically outlaw this in your documentation. Your user _is_ going to
>> >blame you for not mentioning that "Nothing but MS-DOS" doesn't just mean
>> >"No Losedows", but also "No Wibble-DOS".
>>
>> We have already established that the limit is imposed by the stdio
>> implementation. So, if I only distribute the MSDOS binaries, I'm safe :-)
>
>If you have the source code for the compiler and it does not use any OS
>services (which seems very doubtful).
What MSDOS service(s) do you expect a stdin implementation to use, when
stdin is connected to the system console?
>Using gets() in any format is idiotic. Look at the internet worm. Can you
>guarantee that the behavior will be identical if (instead of stdin) the input
>is redirected from a file?
Did you actually look at my sample program? Can you crash it, by
switching MSDOS versions or by redirecting its standard input? If not,
my assertion at the beginning of my previous post is true, whether you
like it or not.
>Anyone who uses gets() for any purpose is simply not thinking clearly, or has
>malice in mind.
Or has more clues than you :-)
Dan
Interrupt 21H (AH=0Ah), if we're talking gets. Note that MSDOS has no
way to represent buffers longer than 255 characters with this call, so
it's not possible to overrun a program calling this if its buffers
exceed this size, even if newer versions provide ways to use longer
buffers.
This does not make a difference. The call fills a non-C-style buffer whose
first byte holds the buffer size and whose second byte receives the count
of characters read, so it cannot specify a buffer longer than 255
characters. There are other ways gets could
be implemented, of course, but you can verify the output of your C
compiler if you're releasing only binaries as Dan Pop stipulated.
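For anyone who has not met that call, here is a rough portable model of the buffer it works on. This is only a sketch: it does not call DOS at all, and the names dos_line_buffer and sim_buffered_input are made up for illustration. It assumes the usual description of the call, where the caller sets the capacity byte and DOS fills in the count and the characters. The point is that the capacity lives in a single byte, so it cannot exceed 255.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of the buffer passed to INT 21h, AH=0Ah.
 *   capacity: set by the caller, maximum characters including the CR
 *   count:    set on return, characters read, not counting the CR
 *   data:     the characters themselves, followed by a CR
 * One byte of capacity means the buffer can never describe more
 * than 255 characters. */
struct dos_line_buffer {
    unsigned char capacity;
    unsigned char count;
    unsigned char data[255];
};

/* Portable simulation only; it reads from a FILE * instead of the
 * console and simply discards whatever does not fit. */
void sim_buffered_input(struct dos_line_buffer *b, FILE *in)
{
    int c;

    assert(b != NULL && in != NULL && b->capacity >= 1);
    b->count = 0;
    while ((c = getc(in)) != EOF && c != '\n') {
        /* Always keep one slot free for the terminating CR. */
        if (b->count < b->capacity - 1)
            b->data[b->count++] = (unsigned char)c;
    }
    b->data[b->count] = '\r';
}
```

With capacity set to 255, a 1000-character line leaves count at 254 and the CR in the last slot. Whether a given compiler's gets() actually goes through this call is a separate question that you would have to check in the generated binary.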
><nais...@enteract.com> wrote in message news:8vvcpn$ks7$1...@bob.news.rcn.net...
>What happens when it is piped to a file via redirection?
My "safe gets" program takes care of this possibility by not allowing
stdin redirection (stdin is reconnected to CON inside the program).
>"Dan Pop" <Dan...@cern.ch> wrote in message
>news:8vv3p1$jcd$1...@sunnews.cern.ch...
>> In <8%zU5.68$3B3.371@client> "Dann Corbit" <dco...@solutionsiq.com> writes:
>>
>> >"Dan Pop" <Dan...@cern.ch> wrote in message
>> >news:8vuffd$aft$1...@sunnews.cern.ch...
>> >> In <3a228bfd...@news.worldonline.nl> in...@hoekstra-uitgeverij.nl
>> >(Richard Bos) writes:
>> >>
>> >> >Dan...@cern.ch (Dan Pop) wrote:
>> >> >>
>> >> >> Try to crash it on MS-DOS and let me know how you did it.
>> >> >
>> >> >The problem is not that it won't crash on a severely restricted
>> >> >OS/compiler/whatever combination.
>> >>
>> >> My point was precisely that on a "severely restricted
>> >> OS/compiler/whatever combination" even gets may be safely used.
>> >
>> >Not with that interface.
>>
>> Did you read this thread from its very beginning or are you a
>> Johnny-come-lately?
>
>Just jumped in.
It's always a very bad idea to do that. I have already addressed your main
objections a few days ago and I hate repeating myself.
>> >Do you really know how it will behave on every
>> >possible iteration of the OS? E.g. DOS 6.20 versus DOS 6.21?
>>
>> Yes, I do. I don't know how it will behave on every possible C
>> implementation for MSDOS, but this is a moot point as long as I only make
>> binary releases.
>
>For NT in a console window?
>For Win98 & Millennium console windows?
>For DR-DOS?
>For DOS 3.2?
Yup, for ALL of them. I've tested my code under Windows 95, but I'm
reasonably confident that the limit is not imposed by the OS. People with
different compilers have reported different limits, under the same MSDOS
flavour.
>> >If you have the source code for the compiler and it does not use any OS
>> >services (which seems very doubtful).
>>
>> What MSDOS service(s) do you expect a stdin implementation to use, when
>> stdin is connected to the system console?
>
>An int21 of some sort (or an int10, but I have not done that stuff in a long
>time, so I would have to look at Brown's interrupt list, which I have
>hopefully misplaced where I won't ever find it again).
When you discover it, you may also discover its own built-in
limitation(s) :-)
>> >Using gets() in any format is idiotic. Look at the internet worm. Can you
>> >guarantee that the behavior will be identical if (instead of stdin) the input
>> >is redirected from a file?
>>
>> Did you actually look at my sample program? Can you crash it, by
>> switching MSDOS versions or by redirecting its standard input? If not,
>> my assertion at the beginning of my previous post is true, whether you
>> like it or not.
>
>No. Where is it?
Upthread.
Once, I saw the following code example in a book:
#include <stdio.h>
...
char buf[BUFSIZ];
gets(buf);
...
Thus, the author relies on gets() not reading more than BUFSIZ bytes.
I know, it's not in the standard, but maybe they just forgot to put it
in there -- they are just people after all? At least, that would
explain why the function had been created in the first place.
BTW, the book was "Object-Oriented Programming With ANSI-C".
> In <3a228bfd...@news.worldonline.nl> in...@hoekstra-uitgeverij.nl (Richard Bos) writes:
>
> >The problem is not that it won't crash on a severely restricted
> >OS/compiler/whatever combination.
>
> My point was precisely that on a "severely restricted
> OS/compiler/whatever combination" even gets may be safely used.
And mine was that such a combination exists, for any useful length of
time and reliability, only on your own desk. As soon as you hit the real
world, you can't depend on one of the restrictions being circumvented,
and you can't depend on not getting blamed, either.
Richard
Perhaps they did intend to put that in the standard; perhaps it
was just a silly mistake.
So what. The intention doesn't matter. The fact is that there
is no such restriction in the standard and implementors work from
what the standard says, not from what it might have meant. Given
what the standard says, there's no reason to believe an
implementation restricts the size of the line gets() can input.
--
Michael M Rubenstein
> >> So, why does this unsafe function exist? K&R must have had some reason
> >> to put it in, and ANSI must have had some reason to keep it in. I'm
> >> thinking simple oversight - even K&R can't be trusted to always know
> >> everything. Or is there a more elaborate reason?
>
> Perhaps they did intend to put that in the standard; perhaps it
> was just a silly mistake.
Instead of guessing at what the committee intended to do, why not
find out for sure? Look up gets() in the Rationale for C99, at
http://anubis.dkuug.dk/JTC1/SC22/WG14/www/docs/n897.pdf
(relevant clauses from the n897 Rationale)
7.19.7.2 The fgets function This function subsumes gets which has no
limit to prevent storage overwrite on arbitrary input (see §7.19.7.7).
7.19.7.7 The gets function
Because gets does not check for buffer overrun, it is generally unsafe
to use when its input is not under the programmer’s control. This has
caused some to question whether it should appear in the Standard at all.
The Committee decided that gets was useful and convenient in those special
circumstances when the programmer does have adequate control over the input,
and as longstanding existing practice, it needed a standard specification.
In general, however, the preferred function is fgets (see §7.19.7.2).
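For the common case where the input is not under the programmer's control, a minimal sketch of the fgets route the Rationale points to might look like this. The helper name read_line and its return convention are assumptions made for illustration, not anything from the Standard. It also assumes the input contains no embedded null characters, since it uses strlen to find the end of what was read.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper built on fgets.
 * Reads one line into buf (at most size - 1 characters) and strips
 * the newline.  Returns:
 *    1  a complete line was stored
 *    0  the line was too long; the excess was read and discarded
 *   -1  end of file or read error before any character was read */
int read_line(char *buf, size_t size, FILE *in)
{
    size_t len;
    int c;
    int truncated = 0;

    assert(buf != NULL && in != NULL && size >= 2 && size <= INT_MAX);
    if (fgets(buf, (int)size, in) == NULL)
        return -1;

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';
        return 1;
    }

    /* No newline stored: either the line did not fit, or the input
     * ended without one.  Consume the rest of the line so the next
     * call starts at a fresh line. */
    while ((c = getc(in)) != EOF && c != '\n')
        truncated = 1;
    return truncated ? 0 : 1;
}
```

A caller would typically write something like `char buf[128]; if (read_line(buf, sizeof buf, stdin) == 1) ...` and decide for itself what to do about truncated lines.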
[sounds like this might come up again in the next round...at least it answers
the question of whether it was discussed. I would request that if gets()
is a "keeper" then a footnote worded similarly to Rationale Clause 7.19.7.7
be added to the Standard Clause that describes gets().]
- Larry Weiss
>Once, I saw the following code example in a book:
>
> #include <stdio.h>
>
> ...
> char buf[BUFSIZ];
> gets(buf);
> ...
>
>Thus, the author relies on gets() not reading more than BUFSIZ bytes.
The author is clueless about the semantics of gets and BUFSIZ. There is
no requirement whatsoever that the size of stdin's buffer is BUFSIZ.
Even if a certain implementation chooses BUFSIZ, gets is not supposed
to stop after reading BUFSIZ characters. It is supposed to copy those
BUFSIZ characters to buf and then read another chunk of BUFSIZ characters
and append it to buf, until the newline character has been reached.
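To make that concrete without actually calling gets, here is a small sketch that only counts how many bytes gets would have to store for the next input line. The function name bytes_gets_would_store is invented for this post. It follows the documented stopping rule (newline or end of file) and nothing else, so BUFSIZ never enters into it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: consumes one line from `in` and returns the
 * number of bytes gets() would write for it, that is, every
 * character before the newline plus the terminating null.
 * Returns 0 if end of file is hit before any character is read,
 * the case where gets() stores nothing and returns NULL. */
size_t bytes_gets_would_store(FILE *in)
{
    size_t n = 0;
    int c;

    assert(in != NULL);
    while ((c = getc(in)) != EOF && c != '\n')
        n++;
    if (n == 0 && c == EOF)
        return 0;
    return n + 1;
}
```

Feed it a line of BUFSIZ + 100 characters and it reports BUFSIZ + 101 bytes, which is what would land in a char buf[BUFSIZ] if gets were used on the same input.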
>I know, it's not in the standard, but maybe they just forgot to put it
>in there -- they are just people after all? At least, that would
>explain why the function had been created in the first place.
No gets description that I'm aware of makes any connection between gets
and BUFSIZ. All of them say that the copying stops only when a
newline character is read or an End-of-File condition is encountered.
>BTW, the book was "Object-Oriented Programming With ANSI-C".
The author may have some other misconceptions about the C programming
language.
BTW, even if your author had been correct about the BUFSIZ limit, his
code is still broken: he should have allocated BUFSIZ + 1 characters for
buf!
I can, as long as I only make binary distributions. At least, until you
can *prove* the contrary :-)