http-equiv = refresh


Christian Lund

Nov 13, 2016, 5:59:42 AM
to Common Crawl
Hi,

Does the Common Crawl crawler treat the following as a redirect, and if so, does it follow the resulting redirect chain?

<meta http-equiv="refresh" content="0;url=index.php">


Sebastian Nagel

Nov 14, 2016, 10:03:33 AM
to common...@googlegroups.com
Hi Christian,

The crawler follows only HTTP redirects, up to 3 hops.
Because the content is not parsed during crawling, the crawler
cannot follow meta refresh redirects. Of course, there is some
chance that the target URL has found its way into the crawl by
another route, but it may then be contained in a different segment.
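
Since meta refresh targets are only visible after parsing the HTML, anyone who needs them has to extract them from the captured pages afterwards. Below is a minimal sketch of how that could look in Python, using only the standard library. The names `MetaRefreshFinder` and `meta_refresh_target`, and the example URL, are made up for illustration; this is not part of the crawler or of any Common Crawl tooling, and the regular expression covers only the common `delay;url=target` forms.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin

# Matches content values such as "0;url=index.php" or "5; URL='next.html'".
# A bare delay like "5" (reload of the same page) deliberately does not match.
_REFRESH = re.compile(
    r"^\s*\d+(?:\.\d*)?\s*[;,]\s*(?:url\s*=\s*)?(.+?)\s*$",
    re.IGNORECASE,
)


class MetaRefreshFinder(HTMLParser):
    """Hypothetical helper: records the target of the first meta refresh tag."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta" or self.target is not None:
            return
        # Attribute values can be None for valueless attributes.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("http-equiv", "").lower() != "refresh":
            return
        match = _REFRESH.match(attr.get("content", ""))
        if match:
            self.target = match.group(1).strip("'\"")


def meta_refresh_target(html, base_url):
    """Return the absolute meta refresh target, or None if there is none."""
    finder = MetaRefreshFinder()
    finder.feed(html)
    if finder.target is None:
        return None
    # Resolve relative targets against the URL of the page itself.
    return urljoin(base_url, finder.target)


page = '<meta http-equiv="refresh" content="0;url=index.php">'
print(meta_refresh_target(page, "http://example.com/dir/page.html"))
# → http://example.com/dir/index.php
```

The resolved URL can then be looked up separately to check whether the target was captured, keeping in mind that it may sit in a different segment.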

Best,
Sebastian
