Question regarding PageRank.

RP

Nov 29, 2012, 8:39:11 PM
to csc-32...@googlegroups.com
Using the crawler (doc IDs), it's easy to keep a count of the outgoing links from a URL. But how is it possible to determine the number of links pointing to a given URL (incoming)?

Wesley May

Nov 30, 2012, 1:09:06 AM
to csc-32...@googlegroups.com
You can simply have a dictionary that maps URLs to the count of edges coming in, so something like count["google.com"] = 5. You'll have to go through your whole list of URLs to build this mapping. It doesn't suffice to just look at one website, because a page only shows its own outgoing links; the links coming in to it appear on other pages.
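
A minimal sketch of that dictionary, in case it helps. It assumes the crawler has already produced a mapping from each crawled URL to the list of URLs found on that page; the name `outgoing` and the function `count_incoming` are made up for illustration. Counting each linking page only once per target and skipping self-links are assumptions of this sketch, not requirements.

```python
from collections import defaultdict


def count_incoming(outgoing):
    """Return {url: number of distinct pages linking to it}.

    `outgoing` (hypothetical name) maps each crawled URL to the list
    of URLs found on that page.
    """
    count = defaultdict(int)
    for source, targets in outgoing.items():
        # set() so a page that links to the same target twice counts once.
        for target in set(targets):
            if target != source:  # assumption: ignore self-links
                count[target] += 1
    return dict(count)


# Toy data, not real crawl output.
outgoing = {
    "a.com": ["google.com", "b.com"],
    "b.com": ["google.com", "google.com"],
    "google.com": ["a.com"],
}
count = count_incoming(outgoing)
print(count["google.com"])
```

Note that every page's outgoing list has to be visited before any incoming count is final.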

RP

Nov 30, 2012, 2:30:02 AM
to csc-32...@googlegroups.com
So we take the list of all URLs (depth=1) that the crawler found for a single URL and check whether any of them match another URL in url.txt, comparing doc IDs.

Wesley May

Nov 30, 2012, 4:42:04 PM
to csc-32...@googlegroups.com
Yes, that sounds good.
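
Roughly, that comparison could look like the sketch below. It assumes a `doc_ids` dictionary mapping every URL in url.txt to its doc ID and a `links` dictionary mapping each URL to the URLs found on it at depth 1; both names, and the function `incoming_by_doc_id`, are hypothetical. Links to URLs that are not in url.txt are simply ignored here.

```python
def incoming_by_doc_id(doc_ids, links):
    """Return {doc_id: set of doc IDs of pages that link to it}.

    doc_ids: URL -> doc ID, for every URL in url.txt (assumed layout).
    links:   URL -> list of URLs found on that page at depth 1.
    """
    incoming = {doc_id: set() for doc_id in doc_ids.values()}
    for url, found in links.items():
        src = doc_ids.get(url)
        if src is None:
            continue  # page is not in url.txt, so it has no doc ID
        for target in found:
            dst = doc_ids.get(target)
            # Keep the link only if the target is also a known document.
            if dst is not None and dst != src:
                incoming[dst].add(src)
    return incoming


# Toy data, not real crawl output.
doc_ids = {"a.com": 0, "b.com": 1, "c.com": 2}
links = {
    "a.com": ["b.com", "x.org"],  # x.org is not in url.txt
    "b.com": ["c.com"],
    "c.com": ["b.com", "a.com"],
}
incoming = incoming_by_doc_id(doc_ids, links)
print({doc_id: len(sources) for doc_id, sources in incoming.items()})
```

The size of each set is the incoming count, and keeping the sets themselves also records which documents the links come from.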