to Common Crawl
Hi Sebastian,
Quick question on the web graph: since each graph is built from only a few months of crawl data, and each crawl covers a different set of source pages, would it be correct to assume that the most comprehensive graph would come from the union of multiple host/domain-level graph dumps?
Are you aware of any research on how ranking, the links present, and coverage differ across the individual graphs?
Thanks!
Sebastian Nagel
Dec 15, 2022, 7:32:24 AM
to common...@googlegroups.com
Hi Phil,
Yes, if you combine multiple graphs (or build a graph from more
"monthly" crawls) the graph is expected to include more nodes and
also more edges between nodes.
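
As a rough illustration of such a union, here is a minimal Python sketch. It assumes each graph dump comes as a vertices file (tab-separated node ID and host name) and an edges file (tab-separated source and target IDs), and that node IDs are not comparable between releases, so the merge is keyed on host names. The function names `load_graph` and `union_graphs` and the sample hosts are hypothetical, and in-memory sets are only practical for small samples, not for full-size graphs.

```python
def load_graph(vertex_lines, edge_lines):
    """Parse one graph dump into (nodes, edges) keyed by host name.

    Assumed format (verify against the actual dump):
      vertex line: "<id>\t<host name>"
      edge line:   "<from_id>\t<to_id>"
    """
    id_to_name = {}
    for line in vertex_lines:
        node_id, name = line.rstrip("\n").split("\t")[:2]
        id_to_name[node_id] = name

    edges = set()
    for line in edge_lines:
        src, dst = line.rstrip("\n").split("\t")[:2]
        # Translate per-release IDs into names so graphs can be merged.
        edges.add((id_to_name[src], id_to_name[dst]))

    return set(id_to_name.values()), edges


def union_graphs(graphs):
    """Union the node sets and edge sets of several (nodes, edges) pairs."""
    all_nodes, all_edges = set(), set()
    for nodes, edges in graphs:
        all_nodes |= nodes
        all_edges |= edges
    return all_nodes, all_edges


# Two tiny made-up graphs; the same host has a different ID in each.
graph_a = load_graph(["0\tcom.example", "1\torg.example"], ["0\t1"])
graph_b = load_graph(["0\torg.example", "1\tnet.example"], ["0\t1"])

nodes, edges = union_graphs([graph_a, graph_b])
print(len(nodes), "nodes,", len(edges), "edges")
```

In this toy case each input has two nodes and one edge, while the union has three nodes and two edges, which is the effect described above: hosts and links seen in only one crawl period are all retained.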
Due to a bug, a single-month graph was once released: