> With each crawl costing money, the whole idea of crawling the Internet will have to change.
which led me to a thought: Since Googlebot is crawling zillions of web sites, a change from depth-first crawling to breadth-first crawling would make a huge difference here. Dunno if that's practical, but it would be a nice thing for the Google search guys to look into, to make GAE and Googlebot more compatible. Because right now, there's a lot of evidence that they are accidentally conspiring to be evil.
-Joshua
--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To view this discussion on the web visit https://groups.google.com/d/msg/google-appengine/-/92F2o_-16zMJ.
To post to this group, send email to google-a...@googlegroups.com.
To unsubscribe from this group, send email to google-appengi...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
Sure, but if they just went breadth-first (putting pages to crawl into the tail of a work queue that spans hundreds of sites), then there wouldn't be a spike at all.
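To make the difference concrete, here's a minimal sketch (not Googlebot's actual implementation; the seed list, toy link graph, and `fetch` callback are all hypothetical). Depth-first crawling follows links within one site before moving on, which bursts requests at a single host; breadth-first appends newly discovered URLs to the tail of one shared FIFO queue, so successive fetches rotate across hosts:

```python
from collections import deque

def crawl_breadth_first(seeds, fetch, max_pages=100):
    """Breadth-first crawl: new links go to the TAIL of a shared queue,
    so requests naturally interleave across hosts instead of spiking one."""
    queue = deque(seeds)            # one FIFO queue spanning all sites
    seen = set(seeds)
    order = []
    while queue and len(order) < max_pages:
        url = queue.popleft()       # oldest URL first -> hosts rotate
        order.append(url)
        for link in fetch(url):     # fetch() returns outgoing links
            if link not in seen:
                seen.add(link)
                queue.append(link)  # tail, not head: that's the whole trick
    return order

# Toy link graph standing in for three sites; real HTTP fetching omitted.
graph = {
    "a.com/1": ["a.com/2"], "b.com/1": ["b.com/2"], "c.com/1": ["c.com/2"],
    "a.com/2": [], "b.com/2": [], "c.com/2": [],
}
visits = crawl_breadth_first(["a.com/1", "b.com/1", "c.com/1"], graph.__getitem__)
# hosts alternate a, b, c, a, b, c — no per-site spike
```

With a depth-first stack (pop from the same end you push), the crawler would instead exhaust a.com before touching b.com, which is exactly the per-site burst being complained about.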
That's all good info, but it doesn't apply if you are on GAE. If you are on GAE you can't specify your crawl rate; you are assigned a special crawl rate.