It’s unclear to me whether there is still a way to download from Common Crawl at high throughput (i.e. well above 1 MB/s, ideally many GB/s), as discussed below, from outside of AWS and without incurring additional costs on either end.
Can someone please help clarify?
Apologies if this was already answered somewhere else and I missed the pointer.
From:
common...@googlegroups.com <common...@googlegroups.com> on behalf of kasper...@gmail.com <kasper...@gmail.com>
Date: Wednesday, September 21, 2022 at 12:12 AM
To: Common Crawl <common...@googlegroups.com>
Subject: [EXTERNAL] [cc] News Archive rate reduction
Hi everyone, I was recently downloading last month's news crawl via authenticated S3. About halfway through, I got a rate-reduction error: "botocore.exceptions.ClientError: An error occurred (SlowDown) when calling the GetObject operation (reached
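For what it's worth, S3's SlowDown response is a throttling signal, and the usual client-side remedy is to retry with exponential backoff and jitter rather than fail outright. Below is a minimal sketch of that pattern; the `fake_get_object` stub (and its use of `RuntimeError`) stands in for a real boto3 `get_object` call raising `botocore.exceptions.ClientError`, purely so the example is self-contained:

```python
import random
import time

def get_with_backoff(fetch, max_attempts=6, base_delay=0.01):
    """Call fetch(), retrying with exponential backoff plus jitter
    on throttling-style errors (stand-in for S3's SlowDown)."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except RuntimeError:  # with boto3 this would be ClientError code "SlowDown"
            if attempt == max_attempts - 1:
                raise
            # Double the delay each attempt, randomized to avoid retry storms.
            delay = base_delay * (2 ** attempt) * (0.5 + random.random() / 2)
            time.sleep(delay)

# Demo: a stub S3 call that is throttled twice, then succeeds.
calls = {"n": 0}
def fake_get_object():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("SlowDown")
    return b"warc-bytes"

result = get_with_backoff(fake_get_object)
print(result)  # b'warc-bytes', after two retried throttles
```

With real boto3, much of this can also be delegated to the client itself via `botocore.config.Config(retries={"mode": "adaptive"})`, though a lower concurrency or request rate may still be needed if throttling persists.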
--
You received this message because you are subscribed to the Google Groups "Common Crawl" group.
To unsubscribe from this group and stop receiving emails from it, send an email to
common-crawl...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/common-crawl/f3b531dd-1494-4eaa-9d62-5aa4de492314n%40googlegroups.com.