find the corner case that will break this code.

0 views
Skip to first unread message

Mickey

unread,
Apr 15, 2010, 4:20:06 AM4/15/10
to nextgen_engg
There is one corner case where this function may even eventually cause
a crash. What is that corner case?

#include <stdio.h> /* For size_t */

/*
Reads at most size characters into the array.
Stops if a newline or EOF is encountered.
The newline is not included in the array.
The array is always terminated by '\0'.
Returns length of the string.
Please note that memory allocation is callers responsibility.
pptr should have space for size+1 characters.
*/

size_t input(char *pptr, size_t size)
{
size_t len = 0;
while(size > len)
{ int ch;
ch = getchar();
if(ch == '\n' || ch == EOF) break;
*(pptr++) = (char)ch;
++len;
}
*pptr = '\0';
return len;
}


Regards,
Jyoti

Divya Prakash

unread,
Apr 15, 2010, 5:04:46 AM4/15/10
to Mickey, nextgen_engg
if if(ch == '\n' || ch == EOF) break;
This can make the code crash... 


--
To unsubscribe from this group, send email to
nextgen_engg...@googlegroups.com
For more options, visit this group at
http://groups.google.co.in/group/nextgen_engg?hl=en

To unsubscribe, reply using "remove me" as the subject.



--
Yours sincerely
Divya Prakash :)



Mickey

unread,
Apr 15, 2010, 5:31:30 AM4/15/10
to nextgen_engg
It would be more helpful to state the corner case you were thinking
about instead of pointing a line in the code which is the culprit. A
sample code that would crash it will be even more welcome. Because
given the way the function is expected to work (in comments before the
function) and noting that it uses getchar() from stdio.h I am not
clear why you chose if(ch == '\n' || ch == EOF) break; as the culprit?

Regards,
Jyoti

Nilu

unread,
Apr 15, 2010, 11:30:35 AM4/15/10
to nextgen_engg
Suppose the number of characters read from keyboard exceeds the size
of pptr, then the code will break. Because in such a case, the loop
condition fails and the len always will remain as the size of pptr.

Here is a test code with some modifications to the function just to
make clear as to what is happeninng :
http://codepad.org/J267sKdP
1. Run the code on your local machine.
2. You will have to read characters more than the size of pptr to
break the function.

-
Nilesh

Nilu

unread,
Apr 15, 2010, 11:37:36 AM4/15/10
to nextgen_engg
Oh ok. Did I confuse something here? Is the function expected to
return the size of pptr or the number of characters read?

-
Nilesh

Mickey

unread,
Apr 15, 2010, 7:48:54 PM4/15/10
to nextgen_engg
It looks fine to me....

c:\c>test.exe

Sample : some
Size : 4
Read string from keyboard : qwerty12345

Current len : 0, Before : s, After : q,
Current len : 1, Before : o, After : w,
Current len : 2, Before : m, After : e,
Current len : 3, Before : e, After : r,
Final Length : 4
c:\c>

/*
Returns length of the string.

Please note that memory allocation is callers responsibility.
pptr should have space for size+1 characters.
*/

What input did you give to break the code... in the light of the above
note about the function? I think you are talking about giving a
misleading size argument to the function? That is caller's
responsibility. This function will "Reads at most size characters into
the array." thereby ensuring if the user passed sensible value for
size it will not overflow the allocated space... which... you can't
specify to scanf("%s", str)... this was the reason this function was
written in the first place.

Regards,
Jyoti

Mickey

unread,
Apr 17, 2010, 1:52:44 AM4/17/10
to nextgen_engg
Hints:

Number 1
--------
This function eliminates the woes in taking input using the following
functions:
char str[0xff];
scanf("%s" str);
Or
cin >> str;
Or
gets(str);
all three functions above suffer from the drawback of a possible
buffer overflow.


Number 2
--------
So what do we generally use in robust programs? We use fgets (which
takes the space available in the buffer and avoids this overflow
issue). This function here tries to mimic the same one except that it
additionally ensures that the trailing newline is stripped (call it a
feature or a drawback... but this is what it does) and guarantees that
a terminating null is written.


Number 3
--------
I said that there is a corner case where this function will not work
as expected. In fact it is a design fault with the function. Having
said that, it is quite nicely written and hasn't crashed for me ever.
It does work. So, do not look for programmatic faults instead imagine
boundary cases when this function might fail. Note that 'might'.


Thank you,
Jyoti

Mickey

unread,
May 5, 2010, 10:12:13 AM5/5/10
to nextgen_engg
Nobody got the lead here? I had this code in use for one year... and
got the corner case during review... I never actually hit the corner
case.

Regards,
Jyoti

SUDHEER DURUSOJU

unread,
May 9, 2010, 12:53:07 PM5/9/10
to nextge...@googlegroups.com
Micky,

Note 1 is about having new line character read into array which is may cause buffer overflow. If we define array for 5 characters and typed 5 characters followed by enter key, we are actually taking 7 charaters in memory. One extra for newline another for null character.

The function is limiting the no.of characters it reads. I was thinking the function will fail if it tries to read more than 65538 characters.

When it was called using input (str,65538) the variable len which is defined as size_t ( I think size_t is internally represented as unsigned int whose bounds are 0 to 65537 ) will cause  the condition while(size > len) to fail indefinitely making this loop as indefinite. To make it fail you need to input more than 65538 characters with out newline or EOF characters.

I was thinking pointer variables are also internally represented as unsigned int. In that case we need to consider bounds of variable ptr. If anyone have more info please comment about this.

This is only an idea... I was not able to find any other corner cases when this function fail...

-Sudheer

Mickey

unread,
May 9, 2010, 11:22:14 PM5/9/10
to nextgen_engg
You caught my mistake! Here is the corrected code(The while condition
has been changed):

/*
Reads at most size -1 characters into the array.
Stops if a newline or EOF is encountered.
The newline is not included in the array.
The array is always terminated by '\0'.
Returns length of the string.
*/

size_t input(char *pptr, size_t size)
{
size_t len = 0;
while((size - len) > 1) /* On break at most len == (size - 1). */
/* 0U - 1 = 65535, 65535U + 1 = 0 */
{
int ch;
ch = getchar();
if(ch == '\n' || ch == EOF) break;
*(pptr++) = (char)ch;
++len;
}
*pptr = '\0';
return len;
}

/*
fgets reads at most the next n - 1 characters into the array s,
stopping if a newline is encountered; the newline is included in
the array, which is terminated by '\0'. fgets returns s, or NULL
if end of file occurs. If an EOF is encountered right at the start
the contents of s are undisturbed.

getchar returns the next character of stdin as an unsigned
(converted
to an int), or EOF if end of file or error occurs.
*/


On May 9, 9:53 pm, SUDHEER DURUSOJU <sdurus...@gmail.com> wrote:
> Micky,
>
> Note 1 is about having new line character read into array which is may cause
> buffer overflow. If we define array for 5 characters and typed 5 characters
> followed by enter key, we are actually taking 7 charaters in memory. One
> extra for newline another for null character.
>
> The function is limiting the no.of characters it reads. I was thinking the
> function will fail if it tries to read more than 65538 characters.
>
> When it was called using input (str,65538) the variable len which is defined
> as size_t ( I think size_t is internally represented as unsigned int whose
> bounds are 0 to 65537 ) will cause  the condition while(size > len) to fail
> indefinitely making this loop as indefinite. To make it fail you need to
> input more than 65538 characters with out newline or EOF characters.
>
> I was thinking pointer variables are also internally represented as unsigned
> int. In that case we need to consider bounds of variable ptr. If anyone have
> more info please comment about this.
>
> This is only an idea... I was not able to find any other corner cases when
> this function fail...
>
> -Sudheer
>

Mickey

unread,
May 10, 2010, 2:22:49 AM5/10/10
to nextgen_engg
It seems I had pasted the correct code initially too. The faulty code
had this:
while(size >= len)
instead of this
while(size > len)

And the code I have pasted immediately before this is the same thing
in a more elegant way... the changed condition of size -1 to mimic the
convention of common functions like fgets. Both are correct. There
were no bugs in the code.

size and len are both of same data type. It doesn't matter what
(integral) data type that is... size > len will not go into infinite
loop in the above loop but size >= len (what I intended to paste
originally) will when size is passed as the max value of its data type
(65535 in case of unsigned 16 bit integer). As for pointers... if the
caller was able to have a buffer with such length then we should be
able to traverse it.

Thanks to Naren for pointing this out.

Regards,
Jyoti


Humm... Now this is interesting, four people (including me) found bugs
where it did not exist.
> > This is only an idea... I was not able to find any othercornercases when
> > this function fail...
>
> > -Sudheer
>
> > On Wed, May 5, 2010 at 10:12 AM, Mickey <jyoti.mic...@gmail.com> wrote:
> > > Nobody got the lead here? I had this code in use for one year... and
> > > got thecornercase during review... I never actually hit thecorner
> > > > I said that there is acornercase where this function will not work
> > > > > > > > It would be more helpful to state thecornercase you were
> > > thinking
> > > > > > > > about instead of pointing a line in the code which is the
> > > culprit. A
> > > > > > > > sample code that would crash it will be even more welcome.
> > > Because
> > > > > > > > given the way the function is expected to work (in comments
> > > before the
> > > > > > > > function) and noting that it uses getchar() from stdio.h I am not
> > > > > > > > clear why you chose if(ch == '\n' || ch == EOF) break; as the
> > > culprit?
>
> > > > > > > > Regards,
> > > > > > > > Jyoti
>
> > > > > > > > On Apr 15, 2:04 pm, Divya Prakash <divyaprakash...@gmail.com>
> > > wrote:
>
> > > > > > > > > if if(ch == '\n' || ch == EOF) break;
> > > > > > > > > This can make the code crash...
>
> > > > > > > > > On Thu, Apr 15, 2010 at 1:50 PM, Mickey <
> > > jyoti.mic...@gmail.com> wrote:
> > > > > > > > > > There is onecornercase where this function may even
Reply all
Reply to author
Forward
0 new messages