How to search XML? Are there special libs?

27 views
Skip to first unread message

Sullivan WxPyQtKinter

unread,
Mar 16, 2006, 9:40:51 PM3/16/06
to
I am now using XML to record my lab records in quite a complex way, in
which about 80 tags are used to identify different types of data. I
think it is a good idea to search for a popular and mature XML search
engine before I started to program it myself:

I need the following functions:
Search and get all the records that share the same tree structures.
Search records containing a specific node or sub tree structres.

Does anyone know about something suitable? Or other powerful XML search
engine in Python?
Thank you so much for help.

PS: I do not care about the learning curve, as long as it is not a
cliff to climb.

Diez B. Roggisch

unread,
Mar 17, 2006, 5:08:30 AM3/17/06
to
Sullivan WxPyQtKinter wrote:

> I am now using XML to record my lab records in quite a complex way, in
> which about 80 tags are used to identify different types of data. I
> think it is a good idea to search for a popular and mature XML search
> engine before I started to program it myself:
>
> I need the following functions:
> Search and get all the records that share the same tree structures.
> Search records containing a specific node or sub tree structres.
>
> Does anyone know about something suitable? Or other powerful XML search
> engine in Python?
> Thank you so much for help.

XPath is your friend. Available at least in the 4suite-xml-libraries. I'm
not aware of anything that resembles a more database-like approach, but at
least you can issue queries against the xml.

However you might reconsider your choice of a persistence model. Queries is
what SQL is for, and you very well might be better off using e.g. sqlite
and a XML-im/export if that is what you need.

Diez

Ravi Teja

unread,
Mar 17, 2006, 5:38:41 AM3/17/06
to
Yes! XPath is a good bet.
You can also try some Pythonic XML libraries like Amara. You need not
learn any special language even.

There are good database approaches to XML too, especially if you are
going to query a document collection as a whole rather than file by
file. You can try XQuery. I think 4Suite can do this (But I am too
sleepy to confirm :-) ). You also use eXist (Java but you can use
XMLRPC or SOAP to interface with it from Python). Not optimal like
parent said, but if it is XML that have to live with ...

uche....@gmail.com

unread,
Mar 17, 2006, 5:01:42 PM3/17/06
to

4Suite does not support XQuery. It does support full XPath plus EXSLT
and enough other extensions to come close to the power of XQuery.
Amara [1] makes it really easy to get XQuery-like power from right
within Python, as I've blogged before (e.g. [2][3]).

I don't know whether full-text indexing of XML is something the OP
needs as well. If so, see [3].

[1] http://uche.ogbuji.net/tech/4Suite/amara/
[2] http://copia.ogbuji.net/blog/2005-06-12/Amara_equi
[3] http://copia.ogbuji.net/blog/2005/Sep/20
[4] http://www.xml.com/pub/a/2004/12/08/py-xml.html

--
Uche Ogbuji Fourthought, Inc.
http://uche.ogbuji.net http://fourthought.com
http://copia.ogbuji.net http://4Suite.org
Articles: http://uche.ogbuji.net/tech/publications/

Reply all
Reply to author
Forward
0 new messages