How Can I Get or Buy Search Results?


PHPBABY3

Nov 18, 2012, 9:11:56 PM
to Google AJAX APIs
I have an application in which I need the results of an arbitrary
Google (or other search engine) query, i.e. the first page with the
number of results and the first 10 results, plus a link to the next
page if there are more.

When I fetch it through a PHP function, at some point Google sends
only a message saying that it thinks I'm a robot (which I am).


Charlie

Jeremy Geerdes

Nov 18, 2012, 9:17:39 PM
to google-ajax...@googlegroups.com

You can use either the Web Search or Custom Search API to make programmatic queries of Google's search services, but neither TOS allows robots. Further, both APIs are designed to NOT return results which are identical to a standard Google search in a deliberate effort to discourage robots. Sorry.
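
For illustration, here is a minimal sketch of what one programmatic Custom Search query might look like, written in Python rather than PHP. It assumes the Custom Search JSON API endpoint `https://www.googleapis.com/customsearch/v1` with the `key`, `cx`, `q`, `num` and `start` parameters, and a JSON response carrying `searchInformation.totalResults`, `items[].link` and `queries.nextPage`. The helper names are hypothetical, the API key and engine ID are placeholders you would have to supply, and the results will not match a regular google.com search.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Custom Search JSON API.
API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(api_key, engine_id, query, start=1):
    """Build the request URL for one page (up to 10 results) of a query."""
    params = {"key": api_key, "cx": engine_id, "q": query, "num": 10, "start": start}
    return API_ENDPOINT + "?" + urlencode(params)


def summarize_response(payload):
    """Extract the estimated result count, the result URLs, and the next start index."""
    total = int(payload.get("searchInformation", {}).get("totalResults", 0))
    urls = [item["link"] for item in payload.get("items", [])]
    next_pages = payload.get("queries", {}).get("nextPage", [])
    # No nextPage entry means there are no further pages to request.
    next_start = next_pages[0]["startIndex"] if next_pages else None
    return {"total": total, "urls": urls, "next_start": next_start}


def search(api_key, engine_id, query, start=1):
    """Run a single query; needs network access and valid credentials."""
    with urlopen(build_search_url(api_key, engine_id, query, start)) as response:
        return summarize_response(json.load(response))
```

Calling `search(api_key, engine_id, term)` once for a term would return the estimated count, up to ten URLs, and the start index of the next page if there is one. Whether a given use is permitted still depends on the API's terms.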

Jg


Geoffrey Hoffman

Nov 18, 2012, 9:29:15 PM
to google-ajax...@googlegroups.com
Throttle your requests through multiple IPs.


PHPBABY3

Nov 18, 2012, 9:31:03 PM
to Google AJAX APIs


On Nov 18, 9:17 pm, Jeremy Geerdes <jrgeer...@gmail.com> wrote:
> You can use either the Web Search or Custom Search API to make programmatic
> queries of Google's search services, but neither TOS allows robots.

I want a PHP function that I can use to get the number of results and
the first 10 URLs for a given input into Google. I don't know what
you're saying exactly, but would it do what I am saying?

> but neither TOS allows robots.

What would a robot be?

> Further, both APIs are designed to NOT return results which are identical
> to a standard Google search in a deliberate effort to discourage robots.

That sounds like a syntax change and of little consequence to a
programmer like me.

Charlie

> Sorry.
>
> Jg

PHPBABY3

Nov 18, 2012, 9:33:54 PM
to Google AJAX APIs

On Nov 18, 9:29 pm, Geoffrey Hoffman <geoffrey.hoff...@gmail.com>
wrote:
> Throttle your requests through multiple IPs.
>

Is that all it is, the frequency of requests? Lately I have started
getting it even when I manually copy a complete set of results (all
pages). What would the threshold frequency be?

Charlie

Geoffrey Hoffman

Nov 18, 2012, 11:21:57 PM
to google-ajax...@googlegroups.com
You can also try things like randomizing your sleep time between requests and changing the user-agent string you send; basically, make your robot seem more like an office full of regular users. Be nice to their servers and you may be able to get search results undetected.

For the record, the TOS of the Google Web Search API, which is deprecated (but still worked last time I checked), state in part that you will not:
  • use any robot, spider, site search/retrieval application, or other device to retrieve or index any portion of Google Search Results or to collect information about users for any unauthorized purpose;

I have to believe that since they've deprecated the Web Search API, they want you to scrape their results now even less than they used to.

Jeremy Geerdes

Nov 19, 2012, 7:33:03 AM
to google-ajax...@googlegroups.com
A robot would be an application which retrieves search results or other data in an automated manner. That would apply to applications which make requests at a fixed or random interval. Such applications are prohibited by the Web Search TOS. Scraping, which I suspect a court of law would define in a similar manner, is prohibited in the Custom Search API's TOS.

Also, as I said before, neither API will return results which are completely consistent with the results you will find on google.com. This is not a mere syntax change. The results you receive from either of the Google APIs will be different than the results a regular end user would receive running a search on Google. So building an application based on the APIs will be of only marginal value for SEO purposes, which is what it sounds like you're trying to do.

And since the general TOS prohibit scraping of results, you're not allowed to pull a regular google.com search and parse out the results there, either.

In short, I'm not sure that you can do what you're proposing in a legal manner. And therefore, I would strongly discourage you from doing so. Certainly, I would discourage persons on this group from facilitating your efforts.

jg
Jeremy R. Geerdes
Generally Cool Guy
Des Moines, IA

If you're in the Des Moines, IA, area, check out Debra Heights Wesleyan Church!

PHPBABY3

Nov 19, 2012, 10:58:06 PM
to Google AJAX APIs

On Nov 19, 7:33 am, Jeremy Geerdes <jrgeer...@gmail.com> wrote:
> A robot would be an application which retrieves search results or other
> data in an automated manner.

The user enters a search term. I call Google to find out how many
pages contain it and use that information. Then I may want the first
10 results to show to the user. Is that considered a robot?

> That would apply to applications which make
> requests at a fixed or random interval. Such applications are prohibited by
> the Web Search TOS. Scraping - which I suspect a court of law would define
> in a similar manner - is prohibited in the Custom Search APIs TOS.
>
> Also, as I said before, neither API will return results which are
> completely consistent with the results you will find on google.com. This is
> not a mere syntax change. The results you receive from either of the Google
> APIs will be different than the results a regular end user would receive
> running a search on Google. So building an application based on the APIs
> will be of only marginal value for SEO purposes, which is what it sounds
> like you're trying to do.
>
> And since the general TOS prohibit scraping of results, you're not allowed
> to pull a regular google.com search and parse out the results there, either.
>
> In short, I'm not sure that you can do what you're proposing in a legal

I'm offering to pay for the access. Is there no search engine that
sells its results?

> manner. And therefore, I would strongly discourage you from doing so.
> Certainly, I would discourage persons on this group from facilitating your
> efforts.
>
> jg
>
> On Sun, Nov 18, 2012 at 10:21 PM, Geoffrey Hoffman <geoffrey.hoff...@gmail.com> wrote:
> > You can also try things like randomizing your sleep time in-between
> > requests and changing the user-agent string you send—basically make your
> > robot seem more like an office full of regular users. Be nice to their
> > servers and you may be able to get search results undetected.
>
> > For the record, the Google Web Search API<https://developers.google.com/web-search/terms>,
> > which is deprecated (but still worked last time I checked it), states, in
> > part, on the TOS that *you will not*:
>
> >    - *use any robot*, spider, site search/*retrieval application*, or
> >    other device *to retrieve or index any portion of Google Search
> >    Results* or to collect information about users for any unauthorized purpose;
>
> > I have to believe that since they've deprecated the web search API, they
> > want you to scrape their results now even less than they used to.
>

hussain free

Oct 15, 2013, 8:04:56 PM
to google-ajax...@googlegroups.com
Hey PHPBABY3,

I'm trying to do the same thing. It's a web filter: it gets all websites that contain a word like "chat". I have written some code to retrieve search results, but I had a problem with changing the IP address; I already use a proxy as a solution.

If you want to cooperate on getting this script working, it would benefit both of us.