
Anyone used Google's API?


Bill Sneddon

Dec 19, 2003, 11:16:15 AM
to
Has anyone used Google's API who would be willing to share
a simple example? I have been wanting to play around with
SOAP for a while and this looks like a good place to start.

I am going to mess with it in ASP when our IT guys set it up
for me.


I like Python but have not done anything with SOAP yet.

http://www.google.com/apis/

Skip Montanaro

Dec 19, 2003, 12:02:42 PM
to Bill Sneddon, pytho...@python.org

Bill> Has anyone used Google's API who would be willing to share a simple
Bill> example? I have been wanting to play around with SOAP for a while
Bill> and this looks like a good place to start.

Google for pygoogle.
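
A bare-bones call looks something like this (a sketch from memory, not
tested here; check the module's docstrings for the exact signature):

# a minimal pygoogle session -- you need your own (free) license key
# from http://www.google.com/apis/
import google

google.LICENSE_KEY = 'your-key-goes-here'

data = google.doGoogleSearch('python soap example')
print data.meta.estimatedTotalResultsCount   # rough total hit count
for result in data.results:                  # up to ten results per call
    print result.title                       # title (may contain markup)
    print result.URL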

Skip

Jarek Zgoda

Dec 19, 2003, 5:01:06 PM
to
Bill Sneddon <bsneddo...@yahoo.com> writes:

> Has anyone used Google's API who would be willing to share
> a simple example? I have been wanting to play around with
> SOAP for a while and this looks like a good place to start.
>
> I am going to mess with it in ASP when our IT guys set it up
> for me.

I think I saw something on this topic at F. Lundh's site:
http://effbot.org/zone/element-google.htm

Fredrik uses his own XML parsing library, but the idea is easy to
understand.
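
If you want to see the moving parts without any SOAP toolkit, the whole
exchange is just an XML envelope POSTed over HTTP. Roughly along these
lines (this is not Fredrik's code, just a sketch of the idea; verify the
envelope, the types and the endpoint against Google's GoogleSearch.wsdl
before trusting it):

# hand-rolled doGoogleSearch call -- a sketch only
import urllib2
from xml.dom import minidom
from xml.sax.saxutils import escape

ENVELOPE = """<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope
    xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <SOAP-ENV:Body>
    <ns1:doGoogleSearch xmlns:ns1="urn:GoogleSearch"
        SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
      <key xsi:type="xsd:string">%(key)s</key>
      <q xsi:type="xsd:string">%(q)s</q>
      <start xsi:type="xsd:int">0</start>
      <maxResults xsi:type="xsd:int">10</maxResults>
      <filter xsi:type="xsd:boolean">true</filter>
      <restrict xsi:type="xsd:string"></restrict>
      <safeSearch xsi:type="xsd:boolean">false</safeSearch>
      <lr xsi:type="xsd:string">lang_en</lr>
      <ie xsi:type="xsd:string">latin1</ie>
      <oe xsi:type="xsd:string">latin1</oe>
    </ns1:doGoogleSearch>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""

def do_google_search(key, query):
    # POST the envelope to the SOAP endpoint given in Google's docs
    body = ENVELOPE % {'key': escape(key), 'q': escape(query)}
    request = urllib2.Request('http://api.google.com/search/beta2', body,
                              {'Content-Type': 'text/xml; charset=utf-8',
                               'SOAPAction': 'urn:GoogleSearchAction'})
    response = urllib2.urlopen(request)
    doc = minidom.parseString(response.read())
    # each result item in the reply carries <URL>, <title> and <snippet>
    return [node.firstChild.data
            for node in doc.getElementsByTagName('URL')]

for url in do_google_search('your-key-goes-here', 'python soap'):
    print url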

--
Jarek Zgoda
Unregistered Linux User # -1
http://www.zgoda.biz/ JID:zg...@chrome.pl http://zgoda.jogger.pl/

Cameron Laird

Dec 20, 2003, 10:48:35 AM
to
In article <brv88k$kkq$1...@ngspool-d02.news.aol.com>,

Intel once published an introduction to SOAP using Google
at <URL:
http://cedar.intel.com/cgi-bin/ids.dll/content/content.jsp?cntKey=Generic+Editorial%3a%3aws_google&cntType=IDS_EDITORIAL&catCode=CJA >
but I haven't been able to get that site to return content lately.
--

Cameron Laird <cla...@phaseit.net>
Business: http://www.Phaseit.net

Bill Sneddon

Dec 22, 2003, 1:48:58 PM
to
Skip,

This works well. It is a cool package.

Like a lot of modules in Python, it reduces the problem to a trivial
level. It's a great way to be productive quickly, but it does not
convey much understanding of the problem.

Bill

Jarek Zgoda

Dec 22, 2003, 2:47:30 PM
to
Bill Sneddon <bsneddo...@yahoo.com> writes:

> Like a lot of modules in Python, it reduces the problem to a trivial
> level. It's a great way to be productive quickly, but it does not
> convey much understanding of the problem.

For understanding, go to the Google API documentation, then later check
http://effbot.org for the Python implementation.

Skip Montanaro

Dec 22, 2003, 3:07:44 PM
to Bill Sneddon, pytho...@python.org

Bill> This works well. It is a cool package.

Bill> Like a lot of modules in Python, it reduces the problem to a
Bill> trivial level. It's a great way to be productive quickly, but it
Bill> does not convey much understanding of the problem.

Sure it does. Study the source. ;-) There are probably many less productive
ways to spend your time than studying Mark Pilgrim's code as well.

Skip

Alan James Salmoni

Dec 22, 2003, 4:42:48 PM
to
Bill Sneddon <bsneddo...@yahoo.com> wrote in message news:<brv88k$kkq$1...@ngspool-d02.news.aol.com>...

Hi Bill,

As another poster suggested, I would recommend using PyGoogle. I
used it to retrieve materials for a number of my own experiments and
found that it was rather good - more effective than just spoofing a
query (which I used to do). Here's an example that I used to retrieve
a list of returns (document titles and "snippets") from Google.
By the way, a "concern" is just a search string for this study.

# program to retrieve loads of results from Google.
# (c) 2002 Alan James Salmoni, HCI Group, Cardiff University

# reminder: 10 concerns, 30 summaries each!
# concerns are strings in a file called "concerns.txt"
import google, os, os.path

google.LICENSE_KEY = 'heeheenottellingyou!'  # must get your own!
fin = open('concerns.txt', 'r')
fout = open('results.exp6.txt', 'w')
deadpage = '<html><body>No Page</body></html>'

# this snippet gets 100 results for each of 10 concerns
#titles = summaries = urls = []

# (indentation was mangled in the original post; the nesting below is a
# best guess at the original structure)
for i in range(11):
    fin.readline()
    for i in range(0, 9):
        concernstring = fin.readline()
        k = 0
        data = google.doGoogleSearch(concernstring, 0+k,
                                     10, 1, '', 1, 'lang_en', '', '')
        fout.write(concernstring)  # record the concern on the file
        fout.write(str(data.meta.estimatedTotalResultsCount)+'\n')  # record the number of results
        for k in range(0, 100, 10):
            data = google.doGoogleSearch(concernstring, 0+k,
                                         10, 1, '', 1, 'lang_en', '', '')
            # search for "concernstring", indices 0-100, filtering out
            # similar results, no restrictions, safesearch ON, in English
            # and no input or output encoding!
            for j in range(10):
                #print 'concern: '+str(i)+' block: '+str(k)+' result: '+str(j)
                if (j+k) < data.meta.estimatedTotalResultsCount:
                    if not os.path.isdir(str(i)):
                        os.mkdir(str(i))
                    # <bits snipped out - nothing to do with Google - honest!>
                    # (the snipped part presumably opens fout2 and sets aliveflag = True)
                    try:
                        fout.write(data.results[j].URL+'\n')
                    except:
                        fout.write('NONE\n')
                        aliveflag = False
                    if aliveflag:
                        page = google.doGetCachedPage(data.results[j].URL)
                        print str(i)+':'+str(j+k+1)+": retrieved page "+data.results[j].URL
                    else:
                        page = deadpage
                        print str(j+1)+': Page DEAD '+data.results[j].URL
                    fout2.write(page)
                    fout2.close()

fout.close()
fin.close()


This line (data = google.doGoogleSearch(concernstring, 0+k,
10, 1, '', 1, 'lang_en', '', '')) sends a search request (held in the
variable "concernstring") and asks for a list of results. The rank
starts from 0 + k and I've asked for 10 results. I'm not sure what the
next bit is, but after that I'm requesting English-language docs. I'm
not sure about the two other bits either, but the PyGoogle module has
the information you need!
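
For reference, here is that same call with one argument per line; the
parameter names are how I read pygoogle's docstring, so check your own
copy of google.py if in doubt:

data = google.doGoogleSearch(
    concernstring,  # q: the search string itself
    0 + k,          # start: index of the first result wanted
    10,             # maxResults: up to 10 per request (the API's ceiling)
    1,              # filter: collapse near-duplicate results
    '',             # restrict: country/topic restriction (none here)
    1,              # safeSearch: 1 = adult-content filtering on
    'lang_en',      # language restriction: English-language documents only
    '',             # input encoding (left at the default)
    '')             # output encoding (left at the default)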

There's lots of other stuff too, like google.doGetCachedPage(URL),
which gets the entire page from the Google cache.

The code is a bit hairy (yeah, quick'n'dirty scripting - dontcha love
it!), and I haven't looked at it for a while (over a year), but it
does make sense to me (another one up for Python!). Let me know if
you have any questions.

Alan James Salmoni
SalStat Statistics
http://salstat.sourceforge.net

Bill Sneddon

Dec 22, 2003, 9:37:17 PM
to
Thanks to all.

I was able to make PyGoogle work quite easily.

I did try looking at the source but was getting tangled up in the
inheritance. I guess building a general SOAP parser is non-trivial.

I was looking for more of a low-brow approach.

Some of the examples there were helpful.

I think I need to learn more about parsing XML and about
xml.sax.handler.ContentHandler.
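
From what I've gathered so far, the minimal handler pattern looks
something like the sketch below (based on the xml.sax docs, nothing
Google-specific; 'results.xml' is just a placeholder file name):

# minimal xml.sax content handler: print the text of every <URL> element
import xml.sax
from xml.sax.handler import ContentHandler

class URLPrinter(ContentHandler):
    def __init__(self):
        ContentHandler.__init__(self)
        self.in_url = False
        self.chunks = []

    def startElement(self, name, attrs):
        if name == 'URL':
            self.in_url = True
            self.chunks = []

    def characters(self, content):
        # characters() may arrive in several pieces per element
        if self.in_url:
            self.chunks.append(content)

    def endElement(self, name):
        if name == 'URL':
            print ''.join(self.chunks)
            self.in_url = False

xml.sax.parse('results.xml', URLPrinter())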

Thanks again for your input. I am learning, albeit slowly.

Bill
