fgets - design deficiency: no efficient way of finding last character read


John Reye

Apr 23, 2012, 9:33:45 AM
Hello,

The last character read from fgets(buf, sizeof(buf), inputstream) is:
'\n'
OR
any character x, when no '\n' was encountered in sizeof(buf)-1
consecutive chars, or when x is the last char of the inputstream

***How can one EFFICIENTLY determine if the last character is '\n'??
"Efficiently" means: don't use strlen!!!

I can only come up with the strlen method, which - to me - says that fgets
has a bad design.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char buf[6];
    FILE *fp = stdin;

    while (fgets(buf, sizeof(buf), fp)) {
        size_t len = strlen(buf);
        /* len can be 0 if the input contains an embedded '\0' */
        if (len > 0 && buf[len - 1] == '\n')
            printf("Got a line which ends with newline: %s", buf);
        else
            printf("no newline: %s", buf);
    }

    return EXIT_SUCCESS;
}



A well-designed fgets function should return the number of characters
read, should it not??

Please surprise me by showing that there is a way of efficiently
determining the number of characters read. ;)
I've thought of ftell, but I think that does not work with stdin.

Because right now, I think that fgets really seems useless.
Why is the standard C library so inefficient?
Do I really have to go about designing my own library? ;)

Thanks for tips and pointers

Regards,
J.

(PS: I've posted to comp.lang.c but don't see my post appearing, so
I'll try here instead)
--
comp.lang.c.moderated - moderation address: cl...@plethora.net -- you must
have an appropriate newsgroups line in your header for your mail to be seen,
or the newsgroup name in square brackets in the subject line. Sorry.

Barry Schwarz

Apr 30, 2012, 10:57:59 PM
On Mon, 23 Apr 2012 08:33:45 -0500 (CDT), John Reye
<jono...@googlemail.com> wrote:

snip

>(PS: I've posted to comp.lang.c but don't see my post appearing, so
>I'll try here instead)

Maybe it is just the moderation delay but this was already beaten to
death on comp.lang.c.

--
Remove del for email

Jasen Betts

Apr 30, 2012, 10:59:01 PM
On 2012-04-23, John Reye <jono...@googlemail.com> wrote:
> Hello,
>
> The last character read from fgets(buf, sizeof(buf), inputstream) is:
> '\n'
> OR
> any character x, when no '\n' was encountered in sizeof(buf)-1
> consecutive chars, or when x is the last char of the inputstream
>
> ***How can one EFFICIENTLY determine if the last character is '\n'??
> "Efficiently" means: don't use strlen!!!

Initialise the last character slot in the buffer (buf[sizeof(buf)-2],
the one just before the terminator) with \0 before calling fgets. If it
gets changed to anything other than \n, then an incomplete line was read.
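
One way to read this suggestion, as a minimal sketch: the sentinel goes in buf[size - 2], the last slot that can hold a character of input, because buf[size - 1] only ever receives the terminator. The helper name read_chunk() and the enum are made up for illustration, and the sketch assumes size >= 2 and input without embedded '\0' bytes.

```c
#include <stdio.h>

/* Hypothetical result codes, not part of any standard API. */
enum chunk_status {
    CHUNK_EOF,      /* nothing read: end of input or read error */
    CHUNK_WHOLE,    /* the line (or the unterminated tail of the input) fit */
    CHUNK_PARTIAL   /* buffer filled up before a newline was seen */
};

/* Hypothetical wrapper around fgets(); assumes size >= 2. */
static enum chunk_status read_chunk(char *buf, int size, FILE *fp)
{
    buf[size - 2] = '\0';            /* sentinel in the last character slot */
    if (fgets(buf, size, fp) == NULL)
        return CHUNK_EOF;
    /* fgets() overwrites the sentinel slot with input only when it fills
     * the whole buffer; otherwise the slot still holds '\0'. */
    if (buf[size - 2] != '\0' && buf[size - 2] != '\n')
        return CHUNK_PARTIAL;
    return CHUNK_WHOLE;
}
```

Note that CHUNK_WHOLE does not prove the chunk ends in '\n': a short final line with no newline also leaves the sentinel untouched, so distinguishing that case would still need feof() or a look at the data.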

OTOH strlen is much faster than most I/O so does it really matter?

> A well-designed fgets function should return the length of characters
> read, should it not??
>
> Please surprise me, that there is a way of efficiently determining the
> number of characters read. ;)
>
> I've thought of ftell, but I think that does not work with stdin.

Yeah, it works with files; stdin is usually not a file.
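
To illustrate the ftell idea, here is a minimal sketch with a made-up helper, fgets_count(). It assumes a seekable stream on which the difference between two ftell() values equals the number of bytes read. That holds for binary streams and on POSIX systems, but the C standard does not promise it for text streams in general, and on a pipe or terminal ftell() simply fails.

```c
#include <stdio.h>

/* Hypothetical helper: number of bytes consumed by one fgets() call,
 * or -1 if nothing was read or the stream position is unavailable
 * (for example when fp is a pipe or a terminal). */
static long fgets_count(char *buf, int size, FILE *fp)
{
    long before = ftell(fp);
    if (before < 0)
        return -1;                   /* stream is not seekable */
    if (fgets(buf, size, fp) == NULL)
        return -1;                   /* end of input or read error */
    long after = ftell(fp);
    if (after < 0)
        return -1;
    return after - before;
}
```

Two extra ftell() calls per line are unlikely to be cheaper than one strlen() on a short buffer, so this is mostly of interest as a demonstration.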

> Because right now, I think that fgets really seems useless.
> Why is the standard C library so inefficient?

What's the point of using fgets on stdin anyway?

> Do I really have to go about designing my own library? ;)
>
> Thanks for tipps and pointers

libreadline (GPL)
libedit (BSD)
getline (POSIX.1-2008)

--
⚂⚃ 100% natural


James Kuyper

Apr 30, 2012, 10:57:14 PM
On 04/23/2012 09:33 AM, John Reye wrote:
> Hello,
>
> The last character read from fgets(buf, sizeof(buf), inputstream) is:
> '\n'
> OR
> any character x, when no '\n' was encountered in sizeof(buf)-1
> consecutive chars, or when x is the last char of the inputstream
>
> ***How can one EFFICIENTLY determine if the last character is '\n'??
> "Efficiently" means: don't use strlen!!!
...
> (PS: I've posted to comp.lang.c but don't see my post appearing, so
> I'll try here instead)

Your message to comp.lang.c, sent on Wed, 11 Apr 2012 10:23:14 -0700,
generated a discussion containing 26 other messages, the first of which
was dated Wed, 11 Apr 2012 10:45:12 -0700 (PDT). Four of those messages
claimed to have been from you.

I'll assume that your message to this newsgroup got delayed by the
moderation process, and was actually sent much earlier than the time
indicated by the message headers, which was 2012-04-23. Still, you
should have seen the first response within a half-hour of posting your
message; when using usenet it doesn't make sense to complain about a
lack of responses until at least 24 hours have passed. Either you have a
problem with your news server or you sent that message out far too early.
--
James Kuyper

Dag-Erling Smørgrav

Apr 30, 2012, 10:57:29 PM
John Reye <jono...@googlemail.com> writes:
> The last character read from fgets(buf, sizeof(buf), inputstream) is:
> '\n'
> OR
> any character x, when no '\n' was encountered in sizeof(buf)-1
> consecutive chars, or when x is the last char of the inputstream
>
> ***How can one EFFICIENTLY determine if the last character is '\n'??
> "Efficiently" means: don't use strlen!!!

You can't. This is one of several reasons not to use fgets().

If you're on a POSIXish platform (which includes most Unix derivatives),
you can use getline() instead:

ssize_t getline(char **buf, size_t *size, FILE *f);

where

- buf is a pointer to a char * which is either NULL or points to a
malloc()ed buffer; getline() will malloc() or realloc() as needed.

- size is a pointer to a size_t which contains the size of the buffer,
i.e. the size argument from the last malloc() or realloc() call.

- f is the stream to read from. I wish they had placed it first in the
argument list instead of last.

- the return value is the length of the string (i.e. what strlen()
would return), or -1 if an error occurred or EOF was reached before
any data was read.
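
A minimal sketch of how the description above might be used, assuming a POSIX.1-2008 system. The helper name count_lines() and its out-parameter are invented for illustration; only getline() itself comes from POSIX.

```c
#define _POSIX_C_SOURCE 200809L   /* ask for the getline() declaration */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: returns the number of lines in fp and stores in
 * *unterminated how many of them do not end in '\n'. */
static size_t count_lines(FILE *fp, size_t *unterminated)
{
    char *line = NULL;   /* getline() allocates and grows this buffer */
    size_t cap = 0;      /* current size of that buffer */
    ssize_t len;         /* length of the line just read */
    size_t lines = 0;

    *unterminated = 0;
    while ((len = getline(&line, &cap, fp)) != -1) {
        lines++;
        /* The length comes back from getline(), so no strlen() is needed. */
        if (len > 0 && line[len - 1] != '\n')
            (*unterminated)++;
    }
    free(line);          /* the caller owns the buffer */
    return lines;
}
```

Because getline() grows the buffer as needed, a line is never split across calls, so only the last line of the input can lack a newline.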

See also

http://en.wikibooks.org/wiki/C_Programming/C_Reference/stdio.h/gets

DES
--
Dag-Erling Smørgrav - d...@des.no

Thomas Richter

Apr 30, 2012, 10:58:14 PM
On 23.04.2012 15:33, John Reye wrote:
> Hello,
>
> The last character read from fgets(buf, sizeof(buf), inputstream) is:
> '\n'
> OR
> any character x, when no '\n' was encountered in sizeof(buf)-1
> consecutive chars, or when x is the last char of the inputstream
>
> ***How can one EFFICIENTLY determine if the last character is '\n'??
> "Efficiently" means: don't use strlen!!!

What makes you so sure that this is inefficient? I would believe that in
almost all cases the IO operation is the dominant part of the code, and
the strlen is almost surely irrelevant. If this is not the case for you,
could you provide a complete working example you benchmarked where
strlen() turned out to be the bottleneck?

> I only come up with the strlen method, which - to me - says that fgets
> has a bad design.

fgets() has a simple design that works in simple cases. For example, it
cannot extend the buffer it reads the string into, which for really
robust programs is a more severe problem than the inconvenience of not
getting an indicator of whether the line was truncated. If you need
something more complex, it's simple to write a replacement.
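
As a rough illustration of such a replacement, here is a sketch of an fgets-like function that returns the number of characters stored. The name fgets_len() is invented, and the sketch assumes size >= 2 so that a return value of 0 unambiguously means "nothing read".

```c
#include <stdio.h>

/* Hypothetical fgets() variant: reads at most size - 1 characters, stops
 * after a newline, and returns the number of characters stored (not
 * counting the terminating '\0'). Returns 0 at end of input or on error. */
static size_t fgets_len(char *buf, size_t size, FILE *fp)
{
    size_t n = 0;
    int c;

    if (size == 0)
        return 0;
    while (n + 1 < size && (c = getc(fp)) != EOF) {
        buf[n++] = (char)c;
        if (c == '\n')
            break;               /* keep the newline, like fgets() */
    }
    buf[n] = '\0';
    return n;
}
```

The caller can then test n > 0 && buf[n - 1] == '\n' without calling strlen(). Unlike fgets(), this version writes an empty string into buf at end of input instead of leaving the buffer untouched.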

> int main(int argc, char *argv[])
> {
> char buf[6];
> FILE *fp = stdin;
> while (fgets(buf, sizeof(buf), fp)) {
> printf((buf[strlen(buf)-1] == '\n') ? "Got a line which ends with
> newline: %s" : "no newline: %s", buf);
> }


This program does not look as if it would profit from a more
streamlined library call. I/O time will likely dominate here,
especially for a buffer that small.

> A well-designed fgets function should return the length of characters
> read, should it not??

No, a well-designed fgets would possibly allocate the buffer itself. But
whether this fits your needs - or not - is application dependent. Maybe
you don't have the luxury of dynamic memory in some cases? As already
said, it is a simple function.
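
For comparison, a sketch of what a self-allocating line reader could look like in portable C, assuming dynamic memory is available. The name read_line_alloc(), the starting capacity of 64, and the doubling strategy are all arbitrary choices made for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads one line of any length into a malloc()ed
 * buffer and stores its length in *len. Returns NULL if no characters
 * could be read or if allocation fails. The caller frees the result. */
static char *read_line_alloc(FILE *fp, size_t *len)
{
    size_t cap = 0, n = 0;
    char *buf = NULL;
    int c;

    while ((c = getc(fp)) != EOF) {
        if (n + 2 > cap) {                   /* room for c and the '\0' */
            size_t new_cap = cap ? cap * 2 : 64;
            char *tmp = realloc(buf, new_cap);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap = new_cap;
        }
        buf[n++] = (char)c;
        if (c == '\n')
            break;                           /* keep the newline */
    }
    if (buf == NULL)
        return NULL;                         /* nothing was read */
    buf[n] = '\0';
    *len = n;
    return buf;
}
```

This allocates a fresh buffer for every line, which is simple but wasteful; reusing one buffer across calls, as POSIX getline() does, would avoid that cost.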

> Please surprise me, that there is a way of efficiently determining the
> number of characters read. ;)
> I've thought of ftell, but I think that does not work with stdin.

This is surely not *more* efficient but *less* efficient as it requires
(at least sometimes) the interaction with the operating system,
especially with the I/O system. This is what tends to be slow, not
iterating over six characters. Estimating the overall number of
instructions an average CPU has to execute to compute the strlen of a
six character string, and to execute ftell, I would bet that the latter
is far less efficient than the former.

> Because right now, I think that fgets really seems useless.

No. The usefulness of a function depends on its application. That
fgets() is in the standard library shows at least that it was useful at
some point.

> Why is the standard C library so inefficient?

Is it?

> Do I really have to go about designing my own library? ;)

If you have special needs, you need to write special code. But actually,
I doubt that *this* specific problem here makes any difference at all.
Except if strings are very very long, and if so, then processing the
input as strings is probably not the right approach in the first place.

So long,
Thomas

John Reye

Apr 30, 2012, 10:59:16 PM
Please ignore this thread, since the same message and replies can be
found on comp.lang.c

Please see here:
http://groups.google.com/group/comp.lang.c/browse_thread/thread/5c9040a3e519535c
https://groups.google.com/forum/?fromgroups#!topic/comp.lang.c/XJBAo-UZU1w