Matching algorithms from another point of view


Mariam

Sep 19, 2008, 8:17:08 AM
to CAT Identity System Development
What will we use for user matching? (I am just thinking out loud.)

When I thought about it, I found that the most unique thing among users
is the e-mail,

so let's start by matching on e-mails.
The problem is:
most of us create lots of usernames until one works, specifically
because of activation problems. So a user actually has one username he
can use, but his e-mail field was filled in randomly.

So if I use just the e-mail to match, I may match inactive usernames
with active ones.


OK, let's start by matching on usernames.
If I use just the username, the problems are:
- distinct users with the same username.
- inactive usernames.

The first problem could be solved by using a combination of username
and {e-mail or birthday}.
(I think the birthday has a very tiny probability of being repeated for
the same username across distinct users. And we can't count only on the
e-mail, because the same user may very rarely reuse it.)

By using the birthday as a second matching parameter, another problem
may appear: if two users didn't set their birthday, it is null for both,
so it would match for different users. We should consider this point.
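
A minimal sketch of the username plus {e-mail or birthday} idea, in Python. The `Account` record and its field names are assumptions for illustration, not the real schema. A missing value is treated as "unknown", so two null birthdays never count as a match.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Account:
    # Hypothetical record; the field names are assumptions, not the real schema.
    username: str
    email: Optional[str] = None
    birthday: Optional[date] = None


def same_value(a, b) -> bool:
    """True only if both values are present and equal (None never matches)."""
    return a is not None and b is not None and a == b


def is_same_user(a: Account, b: Account) -> bool:
    """Username must match, plus at least one of e-mail or birthday."""
    if a.username != b.username:
        return False
    return same_value(a.email, b.email) or same_value(a.birthday, b.birthday)
```

With this rule, two accounts that share a username but both have an empty birthday and no e-mail are not merged. It does nothing yet about inactive usernames.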

Yet I really don't know how to get around these inactive usernames,
so please share your ideas with me.

Thanks in advance
Regards

Mohammed Safwat

Sep 19, 2008, 7:08:44 PM
to CAT Identity System Development
Asalam Alikum,

Firstly, thanks Eng. Mariam for sharing.

But when I read this post I became somewhat confused about what you
actually mean.

We have many more possibilities:

For example, I can make a username on hackers with my first e-mail,
and on depiak I can make the same username with my second e-mail.

OR

I can make a username (Mohammed) with a double "m" on hackers, and on
depiak I can make (Mohamed) with only one "m", but both usernames
share the same e-mail.

OR

I can make (Mohammed) with my first e-mail and (Mohamed) with my
second e-mail, and vice versa.

I agree with you that we should use a key to flag this situation in
every account (not like the keys I talked about for securing the data
session, remember? :) )

But we can only depend on this idea if the birthday field in the
registration form originally asked every person to enter his birth date
as a "* required" field.

This is my point of view, and thanks very much.
Sincerely,

Ahmed Soliman

Sep 21, 2008, 5:18:58 AM
to cat-ident...@googlegroups.com
Look,

We can use a "score": each matching step, if successful, adds points to the match level.

Example:

User: h4ck3r
eMail: h4c...@cat-hackers.com


User: LiNuXaWy
eMail: h4c...@cat-hackers.com

eMails Match +100 points
Username NO match +0 points

The result is 100 points.

We can set the limit (the *certainty level*) to 85,
so this is the same user...

Another example:

Signature: Ahmed Soliman Farghal
eMail: h4c...@cat-hackers.com

Signature: Ahmed Soliman
eMail: ah...@farghal.com

eMails NO match +0 points
Signature 66% match, so we add 66% of the total score for a "Name in Signature" match (suppose 80): +52.8 points

The total is 52.8, so we are about half sure that he is the same person, but we cannot guarantee it. So NO.
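
One way to arrive at a figure like 66% is to count shared words: "Ahmed Soliman" and "Ahmed Soliman Farghal" share 2 of 3 distinct words. The post does not say how the percentage is computed, so the word-overlap measure and the name `signature_overlap` below are assumptions; only the weight of 80 comes from the example.

```python
def signature_overlap(sig_a: str, sig_b: str) -> float:
    """Fraction of distinct words shared by two signatures (0.0 to 1.0)."""
    words_a = set(sig_a.lower().split())
    words_b = set(sig_b.lower().split())
    if not words_a or not words_b:
        return 0.0  # an empty signature gives no evidence either way
    return len(words_a & words_b) / len(words_a | words_b)


SIGNATURE_WEIGHT = 80  # "suppose 80" in the example above

fraction = signature_overlap("Ahmed Soliman Farghal", "Ahmed Soliman")
points = fraction * SIGNATURE_WEIGHT
print(f"{fraction:.0%} match, +{points:.1f} points")
```

Without rounding, 2/3 of 80 comes to about 53.3 points rather than 52.8; either way it stays below the limit of 85.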

If I modify the last example a little bit:

Username: h4ck3r
Signature: Ahmed Soliman Farghal
eMail: h4c...@cat-hackers.com

Username: h4ck3r
Signature: Ahmed Soliman
eMail: ah...@farghal.com

Then we add 50 points for the username match, so now we have more than 100 points, and he is definitely the man...
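
A rough sketch of the whole scoring idea, in Python. The point values (100, 50, 80) and the limit of 85 are the ones from the examples; the `Profile` record, the function names, and the word-overlap measure for signatures are assumptions made for illustration.

```python
from dataclasses import dataclass

EMAIL_POINTS = 100      # full e-mail match
USERNAME_POINTS = 50    # full username match
SIGNATURE_POINTS = 80   # scaled by how much of the signature matches
THRESHOLD = 85          # the "certainty level"


@dataclass
class Profile:
    # Hypothetical record; the field names are assumptions.
    username: str = ""
    email: str = ""
    signature: str = ""


def word_overlap(a: str, b: str) -> float:
    """Shared distinct words divided by all distinct words (0.0 to 1.0)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa and wb else 0.0


def match_score(p: Profile, q: Profile) -> float:
    """Add up the points for every field that matches."""
    score = 0.0
    if p.email and p.email == q.email:
        score += EMAIL_POINTS
    if p.username and p.username == q.username:
        score += USERNAME_POINTS
    score += word_overlap(p.signature, q.signature) * SIGNATURE_POINTS
    return score


def is_same_person(p: Profile, q: Profile) -> bool:
    return match_score(p, q) >= THRESHOLD
```

Empty fields add no points, so two accounts with blank e-mails are not counted as an e-mail match. The weights and the limit would need tuning against real accounts.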

Got the idea?

Design on that basis...
--
Ahmed Soliman
R&D Engineer.
Red Hat Certified Engineer.

Linux-Plus Information Systems L.L.C,
Saray El-Maadi, Tower (A), 1st Floor
35A Cornich El-Nil, Maadi, Cairo, Egypt.
Tel/Fax: +202 2527 66 16 Ext: 802

http://www.linux-plus.com