We are developing a simple application that calls one of Google's services, Reverse Image Search (http://www.google.com/insidesearch/features/images/searchbyimage.html): we submit an image, either by URL or as an uploaded file, and get back the entity name for that image. Essentially, we fetch the results page (as HTML) that Google returns and scrape the results with a simple parser.
We hosted this on Google App Engine and found that after a while Google blocked our app (identified by its IP) and sent back a message saying the block is there to prevent bots from sending requests to its websites. Below is the message I found in the web server's logs:
This page appears when Google automatically detects requests coming from your computer network which appear to be in violation of the Terms of Service (http://www.google.com/policies/terms/). The block will expire shortly after those requests stop. In the meantime, solving the above CAPTCHA will let you continue to use our services.
This traffic may have been sent by malicious software, a browser plug-in, or a script that sends automated requests. If you share your network connection, ask your administrator for help — a different computer using the same IP address may be responsible. Learn more: http://support.google.com/websearch/answer/86640
Sometimes you may be asked to solve the CAPTCHA if you are using advanced terms that robots are known to use, or sending requests very quickly.
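For reference, here is a minimal sketch (not what our app currently does) of how a response could be checked for this block page, so that the app stops sending requests instead of trying to parse it. The message itself says the block expires shortly after the requests stop. The function name and the choice of marker phrases are assumptions for illustration; the phrases are copied from the message above, and there is no guarantee Google keeps this wording.

```python
# Marker phrases copied from the block message quoted above.
# Treating them as a reliable signal is an assumption.
BLOCK_MARKERS = (
    "automatically detects requests coming from your computer network",
    "solving the above captcha",
)


def looks_like_block_page(html):
    """Return True if the response body resembles the quoted block page."""
    # Collapse whitespace and lowercase so line breaks or casing in the
    # HTML do not hide a marker phrase.
    text = " ".join(html.split()).lower()
    return any(marker in text for marker in BLOCK_MARKERS)
```

If this returns True for a response, the scraper would skip parsing and pause further requests rather than retrying immediately.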
I wanted to check whether there is a way to solve this, or any workaround. Since Google doesn't expose a Reverse Image Search API, we do not see any other way to get the info we want, other than creating an HTTP request and scraping the response.
Any leads will be helpful.