I am going to mess with it in ASP when our IT guys set it up
for me.
I like Python but have not done anything with SOAP yet.
Google for pygoogle.
Skip
> Has anyone used Google's API who would be willing to share
> a simple example? I have been wanting to play around with
> SOAP for a while and this looks like a place to start.
>
> I am going to mess with it in ASP when our IT guys set it up
> for me.
I think I saw something on this topic at F. Lundh's site:
http://effbot.org/zone/element-google.htm
Frederik uses his own XML parsing library, but the idea is easy to
understand.
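
To give a rough idea of what goes over the wire, here is a minimal sketch that builds a SOAP 1.1 request envelope with the standard library's xml.etree.ElementTree. The helper name build_envelope is made up for illustration. The method name doGoogleSearch is the one PyGoogle exposes, but the "urn:GoogleSearch" namespace and the parameter element names are assumptions that should be checked against Google's WSDL. The sketch also leaves out the type attributes and encoding style that an RPC-style service may require.

```python
import xml.etree.ElementTree as ET

# SOAP 1.1 envelope namespace
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def build_envelope(method, namespace, params):
    """Return a bare-bones SOAP 1.1 request envelope as serialized XML.

    params is a list of (name, value) pairs; each becomes a child
    element of the method-call element, in the order given.
    """
    envelope = ET.Element("{%s}Envelope" % SOAP_ENV)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_ENV)
    call = ET.SubElement(body, "{%s}%s" % (namespace, method))
    for name, value in params:
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope)


request = build_envelope(
    "doGoogleSearch",
    "urn:GoogleSearch",  # assumed namespace, check the WSDL
    [
        ("key", "your-license-key"),  # placeholder, not a real key
        ("q", "python soap"),
        ("start", 0),
        ("maxResults", 10),
    ],
)
```

The resulting XML would be sent with an HTTP POST to the service endpoint. A library such as PyGoogle does this step, plus decoding the reply, for you.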
--
Jarek Zgoda
Unregistered Linux User # -1
http://www.zgoda.biz/ JID:zg...@chrome.pl http://zgoda.jogger.pl/
Intel once published an introduction to SOAP using Google
at <URL:
http://cedar.intel.com/cgi-bin/ids.dll/content/content.jsp?cntKey=Generic+Editorial%3a%3aws_google&cntType=IDS_EDITORIAL&catCode=CJA >
but I haven't been able to get that site to return content lately.
--
Cameron Laird <cla...@phaseit.net>
Business: http://www.Phaseit.net
This works well. It is a cool package.
Like a lot of modules in Python, it reduces the problem to a trivial
level. It is a great way to be productive quickly, but it does not
convey much understanding of the problem.
Bill
> Like a lot of modules in Python, it reduces the problem to a trivial
> level. It is a great way to be productive quickly, but it does not
> convey much understanding of the problem.
For understanding, go to the Google API documentation, then later check
http://effbot.org for a Python implementation.
Bill> Like a lot of modules in Python, it reduces the problem to a
Bill> trivial level. It is a great way to be productive quickly, but
Bill> it does not convey much understanding of the problem.
Sure it does. Study the source. ;-) There are probably many less productive
ways to spend your time than studying Mark Pilgrim's code as well.
Skip
Hi Bill,
Like another poster suggested, I would recommend using PyGoogle. I
used it to retrieve materials for a number of my own experiments and
found that it was rather good - more effective than just spoofing a
query (which I used to do). Here's an example that I used to retrieve
a list of results (document titles and "snippets") from Google
(btw, a "concern" is just a search string for this study):
# program to retrieve loads of results from Google.
# (c) 2002 Alan James Salmoni, HCI Group, Cardiff University
# reminder: 10 concerns, 30 summaries each!
# concerns are strings in a file called "concerns.txt"
import google, os, os.path

google.LICENSE_KEY = 'heeheenottellingyou!' # must get your own!
fin = open('concerns.txt','r')
fout = open('results.exp6.txt','w')
deadpage = '<html><body>No Page</body></html>'
# this snippet gets 100 results for each of 10 concerns
#titles = summaries = urls = []
for i in range(11):
    fin.readline() # skip the first 11 lines of the file
for i in range(0,9):
    concernstring = fin.readline()
    k = 0
    data = google.doGoogleSearch(concernstring, 0+k,
                                 10,1,'',1,'lang_en','','')
    fout.write(concernstring) # record the concern on the file
    fout.write(str(data.meta.estimatedTotalResultsCount)+'\n') # record # results
    for k in range(0,100,10):
        data = google.doGoogleSearch(concernstring, 0+k,
                                     10,1,'',1,'lang_en','','')
        # search for "concernstring" indices 0-100, filtering out
        # similar results, no restrictions,
        # safesearch ON, in english and no input or output encoding!
        for j in range(10):
            #print 'concern: '+str(i)+' block: '+str(k)+' result: '+str(j)
            if (j+k) < data.meta.estimatedTotalResultsCount:
                if not os.path.isdir(str(i)):
                    os.mkdir(str(i))
                # <bits snipped out - nothing to do with Google - honest!>
                # (aliveflag and fout2 are not defined anywhere in this
                # listing, so presumably the snipped part sets them up)
                try:
                    fout.write(data.results[j].URL+'\n')
                except:
                    fout.write('NONE\n')
                    aliveflag = False
                if aliveflag:
                    page = google.doGetCachedPage(data.results[j].URL)
                    print str(i)+':'+str(j+k+1)+": retrieved page "+data.results[j].URL
                else:
                    page = deadpage
                    print str(j+1)+': Page DEAD '+data.results[j].URL
                fout2.write(page)
                fout2.close()
fout.close()
fin.close()
This line:
data = google.doGoogleSearch(concernstring, 0+k, 10,1,'',1,'lang_en','','')
sends a search request (held in the variable "concernstring") and asks
for a list of results. The rank starts from 0 + k and I've asked for
10. Not sure what the next bit is, but after that, I'm requesting
English language docs. Not sure about the 2 other bits, but the
PyGoogle module has the information you need!
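
Going by the comments in the script itself, the positional arguments appear to be, in order: the query, the start index, the number of results, a "filter similar results" flag, a restriction, a safesearch flag, the language, and the input and output encodings. Here is a minimal sketch that gives those slots names. The helper build_search_args and its parameter names are hypothetical, not part of PyGoogle; it only builds the tuple of arguments and does not contact Google.

```python
def build_search_args(query, start=0, max_results=10, filter_similar=True,
                      restrict='', safe_search=True, language='lang_en',
                      input_encoding='', output_encoding=''):
    """Return the positional arguments in the order the script passes them.

    The slot meanings follow the comments in the script; the names
    used here are illustrative only.
    """
    return (query, start, max_results, int(filter_similar), restrict,
            int(safe_search), language, input_encoding, output_encoding)


# Offsets for ten pages of ten results, as in the script's inner loop.
pages = [build_search_args('python soap', start=k) for k in range(0, 100, 10)]
```

With PyGoogle installed and a license key set, each tuple could then be passed on as google.doGoogleSearch(*args).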
There's lots of other stuff, like google.doGetCachedPage(URL), which
gets the entire page from the Google cache.
The code is a bit hairy (yeah, quick'n'dirty scripting - dontcha love
it!), and I haven't looked at it for a while (over a year), but it
does make sense to me! (Another one up for Python!) Let me know if
you have any questions.
Alan James Salmoni
SalStat Statistics
http://salstat.sourceforge.net
I was able to make pyGoogle work quite easily.
I did try looking at the source but was getting tangled up in the
inheritance. I guess building a general SOAP parser is non-trivial.
I was looking for more of a low-brow approach.
F. Lundh's site: http://effbot.org/zone/element-google.htm
Some of the examples there were helpful.
I think I need to learn more about parsing XML and
xml.sax.handler.ContentHandler.
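
For reference, a minimal sketch of the ContentHandler pattern using only the standard library. The sample XML is invented for illustration and is not what Google actually returns; the handler name ItemHandler is likewise made up. The idea is to buffer character data and store it when each element closes.

```python
import xml.sax
from xml.sax.handler import ContentHandler

# Invented sample document, NOT a real Google response.
SAMPLE = b"""<results>
  <item><title>Python</title><URL>http://www.python.org/</URL></item>
  <item><title>effbot</title><URL>http://effbot.org/</URL></item>
</results>"""


class ItemHandler(ContentHandler):
    """Collect the child elements of each <item> into a dict."""

    def __init__(self):
        ContentHandler.__init__(self)
        self.items = []
        self._current = None  # dict for the <item> being read, if any
        self._text = []       # character data seen since the last tag

    def startElement(self, name, attrs):
        if name == "item":
            self._current = {}
        self._text = []

    def characters(self, content):
        # May be called several times per element, so buffer the pieces.
        self._text.append(content)

    def endElement(self, name):
        if name == "item":
            self.items.append(self._current)
            self._current = None
        elif self._current is not None:
            self._current[name] = "".join(self._text).strip()
        self._text = []


handler = ItemHandler()
xml.sax.parseString(SAMPLE, handler)
```

After parsing, handler.items holds one dict per item, keyed by element name.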
Thanks again for your input. I am learning, albeit slowly.
Bill