Aim- jayjay08balla
MSN- Jmai...@gmail.com
Yahoo- raeraefad72
Thx
Here are some samples with various libraries for XPath
http://www.oreillynet.com/pub/wlg/6225
Read XPath basics here
http://www.w3schools.com/xpath/default.asp
It is not practical, and perhaps not polite, to expect people to write
tutorials just for you and send them by email. There are plenty of tutorials
on the web about this. Just use Google.
http://docs.python.org/lib/dom-example.html
the handleSlide function almost does what you want, except that you should use
'parse' and not 'parseString'.
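To make the 'parse' versus 'parseString' distinction concrete, here is a minimal sketch using xml.dom.minidom. The sample document and the helper name get_text are invented for illustration and are not taken from the linked example. It uses print() as a function so that it runs on current Python.

```python
import io
from xml.dom import minidom

# Hypothetical sample document, invented for this illustration.
SAMPLE = "<note><to>Tove</to><body>Don't forget me this weekend!</body></note>"


def get_text(node):
    """Concatenate the text nodes directly under a DOM element."""
    return "".join(child.data for child in node.childNodes
                   if child.nodeType == child.TEXT_NODE)


# parseString() takes the XML text itself...
doc_from_string = minidom.parseString(SAMPLE)

# ...while parse() takes a filename or an open file-like object.
doc_from_file = minidom.parse(io.BytesIO(SAMPLE.encode("utf-8")))

body_a = get_text(doc_from_string.getElementsByTagName("body")[0])
body_b = get_text(doc_from_file.getElementsByTagName("body")[0])
print(body_a)
```

Both calls return the same kind of Document object; only the way the input is supplied differs.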
Sorry about any wrapping that mangles the code.
regards
Steve
#!/usr/bin/python
#
# getbooks.py: download book details from Amazon.com
#
# hwBuild: database-driven web content management system
# Copyright (C) 2005 Steve Holden - st...@holdenweb.com
#
# This program is free software; you can redistribute it
# and/or modify it under the terms of the GNU General
# Public License as published by the Free Software
# Foundation; either version 2 of the License, or (at
# your option) any later version.
#
# This program is distributed in the hope that it will be
# useful, but WITHOUT ANY WARRANTY; without even the implied
# warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
# PURPOSE. See the GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public
# License along with this program; if not, write to the
# Free Software Foundation, Inc., 59 Temple Place, Suite 330,
# Boston, MA 02111-1307 USA
#
import urllib
import urlparse
import os
import re
from xml.parsers import expat
from config import Config

picindir = os.path.join(Config['datadir'], "pybooks")
for f in os.listdir(picindir):
    os.unlink(os.path.join(picindir, f))
filpat = re.compile(r"\d+")

class myParser:
    def __init__(self):
        self.parser = expat.ParserCreate()
        self.parser.StartElementHandler = self.start_element
        self.parser.EndElementHandler = self.end_element
        self.parser.CharacterDataHandler = self.character_data
        self.processing = 0
        self.count = 0

    def parse(self, f):
        self.parser.ParseFile(f)
        return self.count

    def start_element(self, name, attrs):
        if name == "MediumImage":
            self.processing = 1
            self.imgname = ""
        if self.processing == 1 and name == "URL":
            self.processing = 2

    def end_element(self, name):
        if self.processing == 2 and name == "URL":
            self.processing = 1
            print "Getting:", self.imgname
            scheme, loc, path, params, query, fragment = \
                urlparse.urlparse(self.imgname)
            itemno = filpat.match(os.path.basename(path))
            fnam = itemno.group()
            u = urllib.urlopen(self.imgname)
            img = u.read()
            outfile = file(os.path.join(picindir, "%s.jpg" % fnam), "wb")
            outfile.write(img)
            outfile.close()
            self.count += 1
        if self.processing == 1 and name == "MediumImage":
            self.processing = 0

    def character_data(self, data):
        if self.processing == 2:
            self.imgname += data

def main(search=None):
    print "Search:", search
    count = 0
    for pageNum in range(1,5):
        f = urllib.urlopen(
            "http://webservices.amazon.com/onca/xml?Service=AWSECommerceService&AWSAccessKeyId=XXXXXXXXXXXXXXXXXXXX&t=steveholden-20&SearchIndex=Books&Operation=ItemSearch&Keywords=%s&ItemPage=%d&ResponseGroup=Images&type=lite&Version=2004-11-10&f=xml"
            % (urllib.quote(search or Config['book-search']), pageNum))
        fnam = os.path.join(picindir, "bookdata.txt")
        file(fnam, "w").write(f.read())
        f = file(fnam, "r")
        p = myParser()
        n = p.parse(f)
        if n == 0:
            break
        count += n
    return count

if __name__ == "__main__":
    import sys
    search = None
    if len(sys.argv) > 1:
        search = sys.argv[1]
    n = main(search)
    print "Pictures found:", n
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006 www.python.org/pycon/
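The core idea in the script above is a small state machine driven by expat callbacks: note when a MediumImage element opens, then collect the text of the URL element inside it. Here is a stripped-down sketch of just that part, run against an in-memory document instead of a live web service. The sample XML, the example.com URLs, and the class name ImageUrlCollector are invented for illustration; it is written to run on current Python.

```python
from xml.parsers import expat

# Hypothetical response fragment, invented for this illustration.
SAMPLE = b"""<Items>
  <Item>
    <SmallImage><URL>http://example.com/small/111.jpg</URL></SmallImage>
    <MediumImage><URL>http://example.com/medium/111.jpg</URL></MediumImage>
  </Item>
  <Item>
    <MediumImage><URL>http://example.com/medium/222.jpg</URL></MediumImage>
  </Item>
</Items>"""


class ImageUrlCollector(object):
    """Collect the text of every <URL> found inside a <MediumImage>."""

    def __init__(self):
        self.parser = expat.ParserCreate()
        self.parser.StartElementHandler = self.start_element
        self.parser.EndElementHandler = self.end_element
        self.parser.CharacterDataHandler = self.character_data
        self.in_medium_image = False
        self.in_url = False
        self.current = ""
        self.urls = []

    def start_element(self, name, attrs):
        if name == "MediumImage":
            self.in_medium_image = True
        elif name == "URL" and self.in_medium_image:
            self.in_url = True
            self.current = ""

    def end_element(self, name):
        if name == "URL" and self.in_url:
            self.in_url = False
            self.urls.append(self.current)
        elif name == "MediumImage":
            self.in_medium_image = False

    def character_data(self, data):
        # expat may deliver one text run in several chunks, so accumulate.
        if self.in_url:
            self.current += data

    def parse(self, data):
        self.parser.Parse(data, True)
        return self.urls


urls = ImageUrlCollector().parse(SAMPLE)
print(urls)
```

The SmallImage URL is skipped because the collector only listens for URL elements while it is inside a MediumImage.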
> OK, I have this XML doc. I don't know much about XML, but what I want
> to do is take certain parts of the XML doc
The simplest module I've found for that is xmltramp, from
http://www.aaronsw.com/2002/xmltramp/
for example:
#!/usr/bin/env python
import xmltramp
note = xmltramp.load('http://www.w3schools.com/xml/note.xml')
print note.body
Someone already mentioned
http://www.oreillynet.com/pub/wlg/6225
I do want to update that Amara API. As of recent releases it's as
simple as
import amara
doc = amara.parse("foo.opml")
for url in doc.xpath("//@xmlUrl"):
    print url.value
Besides the XPath option, Amara [1] provides Python API options for
unknown elements, such as
node.xml_child_elements
node.xml_attributes
This is all covered with plenty of examples in the manual [2]
[1] http://uche.ogbuji.net/tech/4suite/amara/
[2] http://uche.ogbuji.net/uche.ogbuji.net/tech/4suite/amara/manual-dev
--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://fourthought.com
http://copia.ogbuji.net http://4Suite.org
Articles: http://uche.ogbuji.net/tech/publications/
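For readers without Amara installed, roughly the same "collect every xmlUrl attribute" task can be sketched with the standard library's xml.etree.ElementTree instead. This is a different library, not the Amara API. ElementTree supports only a subset of XPath and cannot select attributes directly, so the sketch selects the elements that carry the attribute and then reads it. The OPML sample and its example.com URLs are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical OPML document, invented for this illustration.
OPML = """<opml version="1.1">
  <body>
    <outline text="News">
      <outline text="Feed A" xmlUrl="http://example.com/a.xml"/>
      <outline text="Feed B" xmlUrl="http://example.com/b.xml"/>
    </outline>
  </body>
</opml>"""

root = ET.fromstring(OPML)

# Select every <outline> at any depth that has an xmlUrl attribute,
# then read the attribute value from each matching element.
feed_urls = [el.get("xmlUrl") for el in root.findall(".//outline[@xmlUrl]")]

for url in feed_urls:
    print(url)
```

The outer "News" outline has no xmlUrl attribute, so it is not matched.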
and XMLTramp seemed the most simple to understand.
would the path be something like this?
import xmltramp
rssDigg = xmltramp.load("http://www.digg.com/rss/index.xml")
print note.rss.channel.item.title
I think that's what I'm most confused about now: how to specify
the path to the part that I want...
Suggestions?
I suggest you read at least the front page information for the tools
you are using. It's quite clear from the xmltramp Web site (
http://www.aaronsw.com/2002/xmltramp/ ) that you want something like
(untested: the least homework you can do is to refine the example
yourself):
print rssDigg[rss.channel][item][title]
BTW, in Amara, the API is pretty much exactly what you guessed:
>>> import amara
>>> rssDigg = amara.parse("http://www.digg.com/rss/index.xml")
>>> print rssDigg.rss.channel.item.title
Video: Conan O'Brien iPod Ad Parody
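The same rss / channel / item / title walk can also be sketched with the standard library's xml.etree.ElementTree, which is a different API from both xmltramp and Amara. The feed below is an invented stand-in, not the real Digg feed. Note that in ElementTree the parsed root already is the rss element, so the path starts at channel.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS document, invented for this illustration.
RSS = """<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>First story</title><link>http://example.com/1</link></item>
    <item><title>Second story</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""

root = ET.fromstring(RSS)  # root is the <rss> element itself

# find() returns the first match for the path; findall() returns all of them.
first_title = root.find("channel/item/title").text
all_titles = [t.text for t in root.findall("channel/item/title")]

print(first_title)
print(all_titles)
```

The path "channel/item/title" skips the channel's own title because that one is not inside an item.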
http://www.activestate.com/Products/ActivePython/
And Amara requires Python 2.4, so... that's where we have
a problem: I need Amara for ActivePython, and I would like to keep
working in ActivePython without downloading Python 2.4.
>>> import amara
Traceback (most recent call last):
File "<interactive input>", line 1, in ?
ImportError: No module named amara
Pretty straightforward....
As far as I can tell, it should work since they're both transparent, but, umm, well, it's
not.
But what would help is if you knew the install dir for
ActivePython, so maybe I can install Amara standalone into the
ActivePython installation dir. Maybe?
"As far as I can tell, it should work since they're both transparent, but, umm, well, it's
not."
Why do you think it is not transparent? Did you try installing it on
both?
I have ActivePython 2.4 here and it loads amara fine.
"
Traceback (most recent call last):
File "<interactive input>", line 1, in ?
ImportError: No module named amara
"
That means you did not manage to install it properly. Are you new to
installing Python modules from the command line? If you need more hand
holding, try the Python IRC channel on freenode. The responses will be
more in real time. You probably need that since you seem to have more
than one thing to learn about.
I meant that mine isn't; maybe yours is, but for some reason mine isn't.
And you said Amara works fine for you; OK, then could you tell me what
package to install?
I have installed Amara 1.1.6 for Python 2.4, and it works on Python 2.4
only.
Now, which package should I download for it to work at any Python
prompt:
Allinone
Standalone
Or something else
I've never used ActivePython. I don't know of any special gotchas for
it. But Amara works in Python 2.3 or 2.4. The only differences
between the Allinone and standalone packages is that Allinone includes
4Suite. Do get at least version 1.1.6.
If you're still having trouble with the ActivePython setup, the first
thing I'd ask is how you installed Amara. Did you run a Windows
installer? Next I'd check the library path for ActivePython. What is
the output of
python -c "import sys; print sys.path"
Where you replace "python" above with whatever way you invoke
ActivePython.
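On a current Python 3 interpreter, the same check can go a step further and ask where a given module would be imported from, without actually importing it. This is only a sketch: importlib.util did not exist in the Python 2.3/2.4 versions discussed here, and the helper name locate is made up for illustration.

```python
import importlib.util
import sys


def locate(module_name):
    """Return the file a top-level module would load from, or None if not found."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    return spec.origin


# Which interpreter is running, and which directories it searches.
print(sys.executable)
for entry in sys.path:
    print("  ", entry)

# A stdlib module should resolve; "amara" resolves only if it is installed
# somewhere on this interpreter's sys.path.
for name in ("json", "amara"):
    print(name, "->", locate(name))
```

If a package was installed for one interpreter but shows None under another, the two interpreters are searching different site-packages directories.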
And yes, last time i did type python setup.py install.
Thx anyway.
>>> import amara
>>> amara.parse("http://www.digg.com/rss/index.xml")
Traceback (most recent call last):
File "<interactive input>", line 1, in ?
File "C:\Python23\Lib\site-packages\amara\__init__.py", line 50, in parse
if IsXml(source):
NameError: global name 'IsXml' is not defined
So I'm guessing there's an error in one of the files...
IsXml is imported conditionally, so this is an indicator that something
about your module setup is still not agreeing with ActivePython. What
do you see as the output of:
python -c "import amara; print dir(amara)"
? I get:
['InputSource', 'IsXml', 'Uri', 'Uuid', '__builtins__', '__doc__',
'__file__', '__name__', '__path__', '__version__', 'bindery',
'binderytools', 'binderyxpath', 'create_document', 'dateutil_standins',
'domtools', 'os', 'parse', 'pushbind', 'pushdom', 'pyxml_standins',
'saxtools']
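To see why a conditional import can produce a NameError only later, at call time, here is a generic illustration of the pattern. It is not Amara's actual source: the module name hypothetical_xml_backend is made up and deliberately does not exist, so the import fails quietly and the name IsXml is never bound.

```python
# Optional dependency: if the import fails, the name is simply never bound.
try:
    from hypothetical_xml_backend import IsXml  # made-up module name
except ImportError:
    pass


def parse(source):
    # Fails only when called, and only if the import above did not succeed.
    if IsXml(source):
        return "xml"
    return "not xml"


try:
    result = parse("<doc/>")
except NameError as exc:
    result = "NameError: %s" % exc

print(result)
```

The import of the package itself succeeds, which is why the problem shows up only on the first call to parse, and why a shorter dir() listing is a useful clue.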
When doing it separately, I got:
>>> import amara
>>> print dir(amara)
['__builtins__', '__doc__', '__file__', '__name__', '__path__',
'__version__', 'binderytools', 'os', 'parse']
>>>
So it's not able to load domtools. What do you get trying
from amara import domtools
print domtools.py
Not wanting to hijack this thread, but it got me interested in
installing amara. I downloaded Amara-allinone-1.0.win32-py2.4.exe
and ran it. It professed that the installation directory was to be
D:\Python24\Lib\site-packages\ ... but it placed FT and amara in
D:\Python24\Python24\Lib\site-packages . Possibly the installer is
part of the problem here?
--
rzed
That's really good to know. Someone else builds the Windows installer
package for Amara (I'm a near Windows illiterate), but I definitely
want to help be sure the installer works properly. In fact, your
message rings a bell that this specifically came up before:
http://lists.fourthought.com/pipermail/4suite/2005-November/007610.html
I'll have to ask some of the Windows gurus on the 4Suite list whether
they know why this might be. Do you mind if I cc you on those
messages, so that you can perhaps try out any solutions we come up
with?
Thanks.
I'd be delighted to run them. Bring 'em on!
If this is useful information: the opening screen of the installer
correctly shows D:\Python24\ as my Python directory, and correctly
shows (on my computer):
D:\Python24\Lib\site-packages\ as the Installation Directory. The
file names as it installs are of the form
"Python24\Lib\site-packages\...", which to me hints that it takes
that generated name and appends it to the Python directory to
produce the actual file path it then uses.
--
rzed
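The guess above is easy to reproduce in isolation. The sketch below is not the installer's code; it only shows that appending a member name that already starts with "Python24" to the Python directory yields the doubled path. The amara\__init__.py tail of the member name is an illustrative assumption, and ntpath is used so that Windows path rules apply on any platform.

```python
import ntpath  # Windows path rules, usable on any platform

python_dir = "D:\\Python24"

# Member name of the form the installer displays; the part after
# site-packages is an assumption made for this illustration.
member = "Python24\\Lib\\site-packages\\amara\\__init__.py"

# Hypothesis: the member name is appended to the Python directory as-is.
suspected = ntpath.join(python_dir, member)

# What was presumably intended: drop the leading "Python24" component first.
head, _, rest = member.partition("\\")
intended = ntpath.join(python_dir, rest)

print(suspected)
print(intended)
```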
But anyway, i get this...
>>> import amara
>>> from amara import domtools
>>> print domtools.py
Traceback (most recent call last):
File "<interactive input>", line 1, in ?
NameError: name 'domtools' is not defined
>>>
suggestions?
Sheesh! I wrote that right after waking up, and it shows :-)
Should have been "print domtools.__file__"
>>> from amara import domtools
>>> print domtools.__file__
C:\Python23\lib\site-packages\amara\domtools.pyc
>>>