5x better! Many, many thanks

Tim Allison

Apr 7, 2025, 1:24:26 PM
to Common Crawl
Apologies for being late to the game, but I'm THRILLED that as of the March crawl you've bumped the max bytes to 5MiB from 1MiB.

Thank you!

Best,

         Tim

Greg Lindahl

Apr 15, 2025, 1:16:31 AM
to common...@googlegroups.com
On Mon, Apr 07, 2025 at 10:24:25AM -0700, Tim Allison wrote:
> Apologies for being late to the game, but I'm THRILLED that as of the March
> crawl you've bumped the max bytes to 5MiB from 1MiB.

Tim,

Glad to hear from you! You were part of the motivation for this
change -- you had to do a fair amount of extra crawling to create
your PDF dataset because of the large number of truncations.

-- greg

