XPath and XQuery in Python?


Nelson Minar

Jan 11, 2005, 7:09:58 PM
Could someone help me get started using XPath or XQuery in Python? I'm
overwhelmed by all the various options and am lacking guidance on what
the simplest way to go is. What library do I need to enable three-line
Python programs to extract data with XPath expressions?

I have this problem a lot with Python and XML. Even with Uche's
excellent yearly roundups I have a hard time finding how to do fancy
things with XML in Python. I think it's a bit like web server
frameworks in Python - too many choices.
http://www.xml.com/pub/a/2004/10/13/py-xml.html

John Lenton

Jan 13, 2005, 6:39:00 PM
to Nelson Minar, pytho...@python.org

My own favorite is libxml2. Something like the following:

#!/usr/bin/env python
import libxml2
import sys

def grep(what, where):
    # parse the XML text, then write out every node matching the XPath expression
    doc = libxml2.parseDoc(where)
    for found in doc.xpathEval(what):
        found.saveTo(sys.stdout, format=True)

if __name__ == '__main__':
    try:
        what = sys.argv[1]
    except IndexError:
        sys.exit("Usage: %s pattern file ..." % sys.argv[0])
    else:
        for where in sys.argv[2:]:
            grep(what, open(where).read())

although you might want to be smarter with the errors...
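
For readers wondering what "smarter with the errors" might look like, here is a minimal sketch. Note the swap: it uses the standard library's xml.etree.ElementTree instead of libxml2, so it only understands ElementTree's limited XPath subset, and the names grep and main and the exit codes are just assumptions for illustration. Bad XML and bad path expressions are both turned into a ValueError, and main reports them per file instead of dying on the first one.

```python
import sys
import xml.etree.ElementTree as ET


def grep(what, xml_data):
    """Return the elements in xml_data that match the path expression `what`.

    Raises ValueError with a readable message for malformed XML or for a
    path expression that ElementTree rejects.
    """
    try:
        root = ET.fromstring(xml_data)
    except ET.ParseError as exc:
        raise ValueError("not well-formed XML: %s" % exc)
    try:
        return root.findall(what)
    except SyntaxError as exc:
        # ElementTree signals an unusable path (e.g. one starting with "/")
        # with SyntaxError
        raise ValueError("bad path expression %r: %s" % (what, exc))


def main(argv):
    """Hypothetical command-line driver: argv = [prog, pattern, file, ...]."""
    if len(argv) < 3:
        sys.stderr.write("Usage: %s pattern file ...\n" % argv[0])
        return 2
    status = 0
    for where in argv[2:]:
        try:
            # read bytes so the parser can honour the file's own encoding
            with open(where, "rb") as handle:
                matches = grep(argv[1], handle.read())
        except (OSError, ValueError) as exc:
            # report the problem and carry on with the remaining files
            sys.stderr.write("%s: %s\n" % (where, exc))
            status = 1
            continue
        for found in matches:
            sys.stdout.write(ET.tostring(found, encoding="unicode") + "\n")
    return status
```

In a script you would finish with sys.exit(main(sys.argv)). A pattern such as ".//outline" works here; full XPath like '//@xmlUrl' does not, since ElementTree cannot select attribute nodes.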

--
John Lenton (jo...@grulic.org.ar) -- Random fortune:
The whole world is a scab. The point is to pick it constructively.
-- Peter Beard


Nelson Minar

Jan 14, 2005, 10:51:52 AM
Nelson Minar <nel...@monkey.org> writes:
> Could someone help me get started using XPath or XQuery in Python?

I figured this out. Thanks for the help, John! Examples below.

I used this exercise as an opportunity to get something off my chest
about XML and Python - it's kind of a mess! More here:
http://www.nelson.monkey.org/~nelson/weblog/tech/python/xpath.html

Here are my samples, in three libraries:

# PyXML

from xml.dom.ext.reader import Sax2
from xml import xpath
doc = Sax2.FromXmlFile('foo.opml').documentElement
for url in xpath.Evaluate('//@xmlUrl', doc):
    print(url.value)

# libxml2

import libxml2
doc = libxml2.parseFile('foo.opml')
for url in doc.xpathEval('//@xmlUrl'):
    print(url.content)

# ElementTree

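# (part of the standard library as xml.etree.ElementTree since Python 2.5)
from elementtree import ElementTree
tree = ElementTree.parse("foo.opml")
for outline in tree.findall("//outline"):
    # outlines without an xmlUrl attribute print None
    print(outline.get('xmlUrl'))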

Please see my blog entry for more commentary:
http://www.nelson.monkey.org/~nelson/weblog/tech/python/xpath.html
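
One difference between the samples above is worth spelling out: the PyXML and libxml2 versions select the xmlUrl attribute nodes directly, while the ElementTree version selects every outline element and then asks each one for the attribute. The sketch below shows a closer equivalent using the standard library's xml.etree.ElementTree, assuming a version that supports the [@attrib] predicate. The OPML string and its example.com URLs are made up so the snippet is self-contained; they stand in for foo.opml.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for foo.opml: one folder outline holding two feeds.
OPML = """<opml version="1.1">
  <body>
    <outline text="News">
      <outline text="Feed A" xmlUrl="http://example.com/a.xml"/>
      <outline text="Feed B" xmlUrl="http://example.com/b.xml"/>
    </outline>
  </body>
</opml>"""

root = ET.fromstring(OPML)

# Every outline element, including the folder that has no xmlUrl attribute;
# get() returns None for that one.
all_values = [o.get("xmlUrl") for o in root.iter("outline")]

# Only the outlines that actually carry an xmlUrl attribute.
urls = [o.get("xmlUrl") for o in root.findall(".//outline[@xmlUrl]")]

for url in urls:
    print(url)
```

The predicate keeps the folder outline out of the results, so nothing prints as None.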

Uche Ogbuji

Jan 15, 2005, 1:07:06 AM
Interesting discussion. My own thoughts:

http://www.oreillynet.com/pub/wlg/6224
http://www.oreillynet.com/pub/wlg/6225

Meanwhile, please don't make the mistake of bothering with XQuery.
It's despicable crap. And a huge impedance mismatch with Python.
--Uche
