CCBot and Cloudflare


Sophi Müller

Nov 28, 2023, 3:35:32 PM
to Common Crawl
As CCBot isn't on Cloudflare's Verified Bots list (probably because CCBot doesn't use a static IP range), I'm wondering whether CCBot gets regularly blocked by Cloudflare. If so, this would interfere with the quality of Common Crawl's datasets. I couldn't find any info on this matter. Thanks!

Greg Lindahl

Nov 28, 2023, 4:59:37 PM
to common...@googlegroups.com
Sophi,

Cloudflare doesn't seem to respond to requests from anyone to become a
Verified Bot. We could do the IP address thing, but it's a little
annoying to implement.

We don't have any measure of Cloudflare blocking. It would be great to
measure that and the other common systems that might be blocking us:
Barracuda, Akamai, AWS WAF, etc.
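
As a rough illustration of what such a measurement might look like, here is a minimal sketch that tallies fetch results by whether the response appears to come through Cloudflare and whether its status code looks like a block. Everything here is an assumption for illustration, not how CCBot works: the `(status, headers)` record shape and the function names are hypothetical, and treating 403/429/503 as "suspect" is only a heuristic, since origin servers return those codes too. The header check relies on Cloudflare-proxied responses normally carrying `Server: cloudflare` and a `CF-RAY` header.

```python
from collections import Counter

# Heuristic assumption: on a Cloudflare-fronted site these statuses often
# (not always) indicate a block, a rate limit, or a challenge page.
SUSPECT_STATUSES = {403, 429, 503}


def is_cloudflare(headers):
    """Guess from response headers whether Cloudflare served the response."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered.get("server", "").lower() == "cloudflare" or "cf-ray" in lowered


def summarize_blocking(records):
    """Count Cloudflare-served responses and how many of them look blocked.

    `records` is a hypothetical iterable of (status, headers) pairs,
    for example pulled from a crawler's fetch log.
    """
    counts = Counter()
    for status, headers in records:
        if not is_cloudflare(headers):
            counts["other"] += 1
            continue
        counts["cloudflare"] += 1
        if status in SUSPECT_STATUSES:
            counts["cloudflare_suspect"] += 1
    return counts


def suspect_rate(counts):
    """Fraction of Cloudflare-served responses with a suspect status."""
    total = counts["cloudflare"]
    return counts["cloudflare_suspect"] / total if total else 0.0
```

The same tally could be extended to the other systems mentioned above by adding per-vendor header checks, though the identifying headers for each would need to be confirmed separately.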

-- greg

