Non Javascript Environment - Spiders causing queries to API


MarkOG

Jul 9, 2010, 6:17:31 AM
to Google AJAX APIs, j...@google.com
Hello All (especially Jeff),

I have many web pages that display data from the Google AJAX APIs.
These pages do not use Ajax. When a user visits a page, the server
gets some data from the Google API and builds the HTML page.

When Googlebot recently crawled my site, I saw lots of errors in my
custom error logs saying that the request to the Google API was blocked
for "Suspected Terms of Service Abuse".

At the same time that these errors were being produced I was able to
visit the pages without any problem.

I do include the userip parameter, so I am guessing that the only
requests being blocked are the ones that Googlebot triggers. I know it
was Googlebot because the userip was 66.249.71.67 and HTTP_USER_AGENT was
Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.htm.

My Question:
Is it against the Google Terms of Service to have a non-Ajax web page
where, when a user visits the page, the server gets some data from the
Google API and builds the HTML page?

The possible problem with this setup is that spiders will cause
requests to the Google API in order for the page to be built. If this
is against TOS then

Please can a Google Employee help me out with this.

Kind Regards,
Mark.



MarkOG

Jul 9, 2010, 8:54:39 AM
to Google AJAX APIs
This is quite an important question because it must affect many
publishers.

Jeremy Geerdes

Jul 9, 2010, 9:43:57 AM
to google-ajax...@googlegroups.com
My guess is that GoogleBot hits so many pages in such rapid succession that it triggers the API's throttling mechanisms. If this is the case, there's not going to be much that you can do. You might try adding an API key.
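
To make the API key suggestion concrete, here is a minimal sketch of building a server-side request URL that carries both the userip parameter mentioned earlier in the thread and an optional key. The endpoint URL, the `v`, `q`, and `key` parameter names, and the function name `build_request_url` are assumptions for illustration, not details confirmed in this thread; check the API documentation for the service you actually call.

```python
from urllib.parse import urlencode

# Assumed endpoint, used here only for illustration.
BASE_URL = "http://ajax.googleapis.com/ajax/services/search/web"


def build_request_url(query, user_ip, api_key=None):
    """Build a request URL that identifies the end user and, optionally, the site.

    `userip` is the parameter mentioned in the original post; `v`, `q`, and `key`
    are assumed parameter names.
    """
    params = {"v": "1.0", "q": query, "userip": user_ip}
    if api_key:
        # Only send the key when one is configured.
        params["key"] = api_key
    return BASE_URL + "?" + urlencode(params)
```

A key will not necessarily lift any throttling; at most it gives the API a way to associate the traffic with your site.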

Jeremy R. Geerdes
Effective website design & development
Des Moines, IA

For more information or a project quote:
http://jgeerdes.home.mchsi.com
jrge...@gmail.com

If you're in the Des Moines, IA, area, check out Debra Heights Wesleyan Church!


Adam Feldman

Jul 9, 2010, 1:20:21 PM
to Google AJAX APIs
In most cases, it is appropriate to use nofollow or other meta-tags to
prevent search engine crawlers from initiating API calls. As Jeremy
suggested, this is important because these requests are automated and
can therefore appear to be a large number of spam requests coming from
your site. Blocking these requests with tags (or in your server, when
you detect a search engine bot's header, for instance), can help
ensure that you are following the policy around automated queries and
permanently storing results. For the full Terms of Use, please see
here:
http://code.google.com/apis/ajaxsearch/terms.html
http://code.google.com/apis/ajaxlanguage/terms.html
(depending on which API you're using)
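
As a rough sketch of the server-side option (detecting a search engine bot's header and skipping the API call), something like the following could work. The crawler pattern is a small, hypothetical list and is not complete, and `build_page` and `fetch_api_data` are illustrative names, not part of any Google API.

```python
import re

# Hypothetical, incomplete list of crawler substrings; extend it for your own traffic.
BOT_PATTERN = re.compile(r"googlebot|bingbot|slurp|baiduspider|yandex", re.IGNORECASE)


def is_search_engine_bot(user_agent):
    """Return True when the User-Agent header looks like a known crawler."""
    return bool(user_agent) and BOT_PATTERN.search(user_agent) is not None


def build_page(user_agent, fetch_api_data):
    """Build the HTML page, calling the API only for non-crawler visitors.

    `fetch_api_data` is any zero-argument callable that performs the API request.
    """
    if is_search_engine_bot(user_agent):
        # Crawlers get the page's own content without triggering an API request.
        return "<html><body><p>Site content only.</p></body></html>"
    data = fetch_api_data()
    return "<html><body><p>Site content.</p><div>{}</div></body></html>".format(data)
```

User-Agent headers can be spoofed or missing, so this only reduces automated calls from well-behaved crawlers; it does not identify every bot.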

I hope this helps,

Adam


Vision Jinx

Jul 9, 2010, 3:05:50 PM
to Google AJAX APIs
Well, this would be a bit of a catch-22 then, Adam. The TOS say you
can't have the results as the primary or only content on the page, so to
abide by the terms you would normally have a full web page or site with
a search on it. Preventing indexing of your site or your content-rich
page is probably not what most people would want, and trying to create
rules and filters to stop the hundreds (or more) of search engines and
spiders from accessing your site if you're using a query URL (as in my
case) puts you between a rock and a hard place.

On my site, I can either use a search box or provide a link with a
query parameter. If a bot somehow picked that up and tried to follow
it, I would have the same issue, and since my site is fully Ajax, opting
to not have my site indexed at all is not a practical solution.
Fortunately, in my case I use JavaScript to grab the params, so it is
not run by a server-side query, but I can see some of this being an
issue for sure. Just my $0.02 on this.
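
One narrower option than keeping whole pages out of the index might be to mark only the query-parameter links as nofollow, so the content pages stay indexable. A minimal sketch follows; the helper name `search_link` and the `q` parameter are made up for illustration.

```python
from html import escape
from urllib.parse import urlencode


def search_link(base_path, query, label):
    """Render a link to a query URL, marked rel="nofollow" for crawlers."""
    href = base_path + "?" + urlencode({"q": query})
    return '<a href="{}" rel="nofollow">{}</a>'.format(
        escape(href, quote=True), escape(label)
    )
```

Crawlers are not obliged to honor nofollow, so this would reduce rather than eliminate bot-triggered queries.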


Vision Jinx

Jul 9, 2010, 3:17:03 PM
to Google AJAX APIs
I should also add, though, that the original post says: "When a user
visits the page the server gets some data from Google api and builds
the HTML page."

This does sound more like a query that is not being sent as the result
of an end user's action, though. I would need to know more about that,
I guess.