pagerank


Elliot McNaught

Apr 21, 2012, 2:55:45 PM
to common...@googlegroups.com
Hi Ahad,

Does pagerank exist in the metadata?

Thanks, Elliot

Ahad Rana

Apr 21, 2012, 3:01:07 PM
to common...@googlegroups.com
Hi Elliot,

It will eventually. We have not re-enabled it yet in our new EC2 workflow, but it is definitely something we are going to be working on in the near future. 

Ahad.



--
You received this message because you are subscribed to the Google Groups "Common Crawl" group.
To view this discussion on the web visit https://groups.google.com/d/msg/common-crawl/-/Y98NtUq1kSsJ.
To post to this group, send email to common...@googlegroups.com.
To unsubscribe from this group, send email to common-crawl...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/common-crawl?hl=en.

Matthew Berk

May 21, 2012, 10:44:35 AM
to common...@googlegroups.com
Any update?




Ahad Rana

May 21, 2012, 1:01:09 PM
to common...@googlegroups.com
Hi Matthew, 

Unfortunately, it is not queued up yet. My first priority was to get the crawl restarted (http://api.commoncrawl.org/crawlstats.html). Next up is to regenerate the ARCs and metadata. Ideally, the crawl would have been driven off of the page-rank, but getting the crawl / corpus numbers up through other means will help light up more parts of the graph, which will eventually contribute to better page-rank numbers.

Ahad. 
