Let's develop the matching algorithm

1 view
Skip to first unread message

Mohammed Safwat

unread,
Sep 17, 2008, 7:24:04 AM9/17/08
to CAT Identity System Development
Asalam Alikum ,

I just want to share my thought until now.

I have more than one point to talk about :
1- The matching algorithm depends on the following : "the pattern we
want to match " , "the text that's being matched " and "spaces " we
add or remove or control to move the cursor of matching .

2- As we need an effective algorithm , we also need a fast search ,
and the manual things to be done must be the easiest thing at last .
Not to make an algorithm that we find our selves at last handling all
the work manually.

3- No matter if we share wrong algorithms , as wrong algorithms may be
useful in some way when we can modify them .

4-Until now , this is and idea to begin with , and I need your
opinions :

Consider the pattern we want to match is denoted by A=A[0 .. m-1]; its
length is equal to m. and the whole text is denoted by B=B[0 .. n-1];
its length is equal to n.

We want to check between these borders : (0) => (n-m) i.e , from the
beginning of the text to the end of the whole length of the text.

After each attempt , let's assume that we will increase the space by
one step.

Our beginning search is when cursor is at the beginning of both the
pattern and the text .
let's define a variable to indicate the cursor increment for the
pattern =i
let's define a variable to indicate the cursor increment for the
text=J

thus i=0 and J=0

Let's begin search :
we will work on the text ,and on its cursor (J).
move (J) one space , and begin testing :
if the value where the text character (J) stands == the value where
the pattern cursor (i) stands , so it's great ! and we want to do
this for the remaining characters of the pattern , and thus , we have
to increase the pattern cursor (i) by one.

This process will continue as a loop , but we want to stop it at the
end of the pattern .
So let's test the cursor position.
If the cursor position has exceeded the length of the pattern , and
all characters of the pattern are matched with the text ,so we have to
skip , and we should output the result we have found." till (J)".

I wrote some implementation in C++ :

void BF(char *A, int m, char *B, int n)
{
int i, J;

/* Searching */
for (j = 0; J <= n - m; ++J)
{
for (i = 0; i < m && A[i] == B[i + J]; ++i);
if (i >= m)
PUTCHAR(J);
}
}

I know there are shortage in this algorithm , but it's just a
beginning.
Waiting for modifications .

Thanks in advance.

Snap

unread,
Sep 17, 2008, 6:35:26 PM9/17/08
to CAT Identity System Development
asslamo 3alekom
nice ya safwat I was studying this problem, as it had more priority
for me than the database one :D,
so I think we can implement some more complicated algs than this Naive
alg, I read about Knuth-Morris-Pratt and Boyer-Moore algorithms. and
they have better efficiency.
Reply all
Reply to author
Forward
0 new messages