Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Binary Search in C

9 views
Skip to first unread message

arnuld

unread,
Dec 27, 2010, 7:07:49 AM12/27/10
to
WANTED: To write a Binary Search algorithm (using C) to find a character
in string
WHAT I GOT: It works
PROBLEM: In some cases I can't comprehend the last comparison in output.
Check the last comparison (just after comparison with 'Z'):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>


char binarySearch(const char, const char*, const size_t);

int main(int argc, char* argv[])
{
const char* arrc = "ABCDEFGHIJKLMNOPSTUVZ";
char c = '\0';
char f;

if(2 != argc)
{
fprintf(stderr,"USAGE: ./exec [character]\n");
exit(EXIT_FAILURE);
}
else
{
c = *(argv[1]);
}


printf("Finding '%c' inside '%s'\n\n", c, arrc);
f = binarySearch(c, arrc, strlen(arrc));
if(f)
{
printf("Element Found: %c\n", f);
}
else
{
printf("Nothing Found :(\n");
}

return 0;
}

char binarySearch(const char ele, const char* str, const size_t sz)
{
char retchar = '\0';
size_t begin = 0;
size_t end = sz;
size_t mid;

while(begin <= end)
{
char fnd;
mid = (begin + end) / 2;
fnd = *(str + mid);
/* printf("*str = %c, *(str + %u) = %c\n", *str, mid, *(str +
mid)); */
printf("fnd = %c, ele = %c\n", fnd, ele);
printf("begin = %u, end = %u, mid = %u\n", begin, end, mid);

if(ele > fnd)
{
printf("'%c' > '%c'\n", ele, fnd);
begin = mid + 1;
}
else if(ele < fnd)
{
printf("'%c' < '%c'\n", ele, fnd);
end = mid - 1;
}
else
{
printf("'%c' == '%c'\n", ele, fnd);
retchar = fnd;
break;
}
printf("-----------------------------------------\n\n");
}

return retchar;
}
===================== OUTPUT =======================
[arnuld@dune programs]$ gcc -ansi -pedantic -Wall -Wextra binary-search.c
[arnuld@dune programs]$ ./a.out
USAGE: ./exec [character]
[arnuld@dune programs]$ ./a.out e
Finding 'e' inside 'ABCDEFGHIJKLMNOPSTUVZ'

fnd = K, ele = e
begin = 0, end = 21, mid = 10
'e' > 'K'
-----------------------------------------

fnd = S, ele = e
begin = 11, end = 21, mid = 16
'e' > 'S'
-----------------------------------------

fnd = V, ele = e
begin = 17, end = 21, mid = 19
'e' > 'V'
-----------------------------------------

fnd = Z, ele = e
begin = 20, end = 21, mid = 20
'e' > 'Z'
-----------------------------------------

fnd = , ele = e
begin = 21, end = 21, mid = 21
'e' > ''
-----------------------------------------

Nothing Found :(
[arnuld@dune programs]$

--
http://www.lispmachine.wordpress.com

Tobias Blass

unread,
Dec 27, 2010, 7:40:26 AM12/27/10
to

In the last case (mid=21) you compare *(str+21) (== fnd) with your
element. Since strlen(str)==21, you get the terminating nullbyte.
perhaps you should adjust end to be sz-1.

Ike Naar

unread,
Dec 27, 2010, 7:56:03 AM12/27/10
to
On 2010-12-27, arnuld <sun...@invalid.address> wrote:
> WANTED: To write a Binary Search algorithm (using C) to find a character
> in string
> WHAT I GOT: It works

Doubtful.
Did you try to run it for e.g. the characters '@', '?' or '3' ?

> PROBLEM: In some cases I can't comprehend the last comparison in output.
> Check the last comparison (just after comparison with 'Z'):
>

> [big snip]


>
> fnd = Z, ele = e
> begin = 20, end = 21, mid = 20
> 'e' > 'Z'
> -----------------------------------------
>
> fnd = , ele = e
> begin = 21, end = 21, mid = 21
> 'e' > ''

You mean the character on the right-hand-side of the ``>'' sign?
That's the null character at position 21 in the ``arrc'' array.

> -----------------------------------------
> Nothing Found :(

Not a surprise. The character that you're looking for ('e') does not
appear in ``arrc''.
Probably it's larger than the largest character in ``arrc''.

Cristian-Matei Toader

unread,
Dec 27, 2010, 9:20:07 AM12/27/10
to
The program seems to work just fine. However in your example you are
comparing lowercase 'e' with uppercase letters, which seems
misleading. The lowercase letters have a higher ASCII value than the
uppercase ones, the binary search is always looking in the subsection
with higher values.

Tobias Blass

unread,
Dec 27, 2010, 10:15:42 AM12/27/10
to

I think he wanted to do so. He searched for a nonexisting letter to
present the 'strange' empty char (the nullbyte at str+21)

Barry Schwarz

unread,
Dec 27, 2010, 1:26:53 PM12/27/10
to
On 27 Dec 2010 12:07:49 GMT, arnuld <sun...@invalid.address> wrote:

>WANTED: To write a Binary Search algorithm (using C) to find a character
>in string
>WHAT I GOT: It works
>PROBLEM: In some cases I can't comprehend the last comparison in output.
>Check the last comparison (just after comparison with 'Z'):
>
>
>
>#include <stdio.h>
>#include <stdlib.h>
>#include <string.h>
>
>
>char binarySearch(const char, const char*, const size_t);
>
>int main(int argc, char* argv[])
>{
> const char* arrc = "ABCDEFGHIJKLMNOPSTUVZ";
> char c = '\0';
> char f;
>
> if(2 != argc)
> {
> fprintf(stderr,"USAGE: ./exec [character]\n");

Brackets are usually used to indicate optional parameters.

> exit(EXIT_FAILURE);
> }
> else

When the if portion of a simple if-else construct ends with a
statement that jumps out of the normal sequence (e.g., return, break,
call to a function that does not return such as exit and abort), the
else is superfluous. Removing the else and the two braces would
result in the exact same execution sequence of statements.

> {
> c = *(argv[1]);
> }
>
>
> printf("Finding '%c' inside '%s'\n\n", c, arrc);
> f = binarySearch(c, arrc, strlen(arrc));
> if(f)
> {
> printf("Element Found: %c\n", f);
> }
> else
> {
> printf("Nothing Found :(\n");
> }
>
> return 0;
>}
>
>
>
>char binarySearch(const char ele, const char* str, const size_t sz)
>{
> char retchar = '\0';
> size_t begin = 0;
> size_t end = sz;
> size_t mid;
>
> while(begin <= end)
> {
> char fnd;
> mid = (begin + end) / 2;
> fnd = *(str + mid);
> /* printf("*str = %c, *(str + %u) = %c\n", *str, mid, *(str +
>mid)); */
> printf("fnd = %c, ele = %c\n", fnd, ele);
> printf("begin = %u, end = %u, mid = %u\n", begin, end, mid);

While size_t is definitely an unsigned integer, it need not be an
unsigned int. You should cast the expressions if you don't want to
use on the new C99 formats.

--
Remove del for email

christian.bau

unread,
Dec 27, 2010, 2:00:27 PM12/27/10
to
You have two choices: You can blindly hope to get your code right, or
you can use some very simple principles to make sure that your code is
correct and that you know that it is correct. It is quite simple:

In your loop, write a comment about the variables "begin" and "end"
which is true before each iteration of the loop, and which you can
prove to be true after each iteration of the loop, and which leads to
the correct result. I would suggest the comment:

// The element that I am looking for is either at an index
// i with begin <= i < end, or it is not in the array at all.

This condition is obviously true before the loop starts. It also shows
immediately that there is something dodgy with your loop condition,
because when begin == end you iterate again, but the element you look
for cannot be in the array.

arnuld

unread,
Dec 27, 2010, 11:57:21 PM12/27/10
to
> On Mon, 27 Dec 2010 10:26:53 -0800, Barry Schwarz wrote:

>> On 27 Dec 2010 12:07:49 GMT, arnuld <sun...@invalid.address> wrote:
>> if(2 != argc)
>> {
>> fprintf(stderr,"USAGE: ./exec [character]\n");

> Brackets are usually used to indicate optional parameters.

Okay, understood.

>> exit(EXIT_FAILURE);
>> }
>> else

> When the if portion of a simple if-else construct ends with a statement
> that jumps out of the normal sequence (e.g., return, break, call to a
> function that does not return such as exit and abort), the else is
> superfluous. Removing the else and the two braces would result in the
> exact same execution sequence of statements.

Understood this as well.


> While size_t is definitely an unsigned integer, it need not be an
> unsigned int. You should cast the expressions if you don't want to use
> on the new C99 formats.

casting a bigger type to smaller type, e.g. from long to int or even
unsigned int to int, may cause strange results if value can't fit in the
type being assigned too. Therefore I have to check for range errors,
something like:

if((ERANGE == errno && (INT_MAX == num || INT_MIN == num)) ||
(0 != errno && 0 == num))

Then 50% of the code will be there for error checking (mostly 60% of my
code is error-checking one). Is this the recommended way ?

--
http://www.lispmachine.wordpress.com

Barry Schwarz

unread,
Dec 28, 2010, 1:48:10 AM12/28/10
to
On 28 Dec 2010 04:57:21 GMT, arnuld <sun...@invalid.address> wrote:

>> On Mon, 27 Dec 2010 10:26:53 -0800, Barry Schwarz wrote:
>

snip

>> While size_t is definitely an unsigned integer, it need not be an
>> unsigned int. You should cast the expressions if you don't want to use
>> on the new C99 formats.
>
>casting a bigger type to smaller type, e.g. from long to int or even
>unsigned int to int, may cause strange results if value can't fit in the
>type being assigned too. Therefore I have to check for range errors,
>something like:
>
> if((ERANGE == errno && (INT_MAX == num || INT_MIN == num)) ||
> (0 != errno && 0 == num))
>
>
>Then 50% of the code will be there for error checking (mostly 60% of my
>code is error-checking one). Is this the recommended way ?

Only if you want to play dumb.

What part of your code do you think will set errno? Why are you
testing unsigned values against INT_MIN?

All your values are less than 100 so casting them to an int would not
overflow. In any event, you were using %u so you should cast them to
unsigned int which cannot overflow (though it can produce unexpected
results).

If you really have to worry about an array in excess of 65536 bytes,
then cast the value to unsigned long and use %lu as your format. This
will support arrays up to 4GB. Just don't post your initialization
definition here when you have one that big.

0 new messages