On Thu, 18 Dec 2008 21:24:13 -0500, <dherr...@at.tentpost.dot.com> wrote:
>> Hmmm. Scrape Google? I see that their cache of wiki.alu.org has
>> already updated to show alu.org, but some of the other pages are still
>> cached. A query of "site:wiki.alu.org" returns 344 pages (347 if you
>> select the "omitted results"). I'll take a stab at it, but the history
>> and metadata cannot be retrieved this way...
> So I started in on it, but the Goog identified what I was doing and
> disabled my scraping. Got ~100 files (Bob_Bechtel to
> Switch_Date_2001) before they caught me...
> ""
> We're sorry...
> ... but your query looks similar to automated requests from a computer
> virus or spyware application. To protect our users, we can't process
> your request right now.
-) Go slower (which you mention)
-) Start-Stop a little (~ use a coin flip to continue)
-) Use a proxy, or better several proxies
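
A rough sketch of how those three ideas might fit together, in Python. Everything named here is hypothetical (the `plan_requests` function, the delay values, the proxy names); it only plans a schedule and does no fetching itself.

```python
import itertools
import random


def plan_requests(urls, proxies, base_delay=5.0, jitter=5.0, pause=60.0, rng=None):
    """Yield (url, proxy, wait_seconds) for a slow, irregular crawl.

    All defaults are illustrative guesses, not known-safe values.
    """
    if rng is None:
        rng = random.Random()
    # Rotate through the proxies in order; use None when no proxy is given.
    proxy_cycle = itertools.cycle(proxies) if proxies else itertools.repeat(None)
    for url in urls:
        # "Go slower": a base delay plus random jitter before every request.
        wait = base_delay + rng.uniform(0, jitter)
        # "Start-Stop": a coin flip decides whether to take a longer break.
        if rng.random() < 0.5:
            wait += pause
        yield url, next(proxy_cycle), wait
```

A caller would sleep for `wait` seconds and then fetch `url` through `proxy`, for instance with `urllib.request.ProxyHandler`.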
-- 
"Most programmers use this on-line documentation nearly all of the time,
and thereby avoid the need to handle bulky manuals and perform the
translation from barbarous tongues." CMU CL User Manual