http://www.reportlab.com/xml/pyrxp.html
RXP is a very fast and fully compliant validating XML parser
written by Richard Tobin of the University of Edinburgh,
Language Technology Group. pyRXP is a wrapper around this
which constructs a lightweight in-memory "tuple tree" in
a single call. This structure is the lightest one we could
define in Python, and it is constructed entirely in C code,
resulting in unprecedented speed; the memory footprint is
also several times more compact than DOM Node objects in
either Python or Java. The deployment is a single Python
extension module of approximately 100 KB.
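To make the "tuple tree" idea concrete, here is a minimal sketch of
walking such a structure in plain Python. It assumes the four-element
node layout described in the pyRXP documentation, (tagName, attributes,
children, spare), which this note does not spell out. The sample tree is
built by hand rather than produced by the parser, and the helper names
iter_elements and text_of are hypothetical, not part of pyRXP.

```python
# Hand-built stand-in for a parsed document. Assumed node layout:
# (tag_name, attributes_dict_or_None, children_list_or_None, spare)
# Children are either nested element tuples or plain text strings.
tree = (
    "order",
    {"id": "42"},
    [
        "\n  ",
        ("item", {"sku": "A1"}, ["Widget"], None),
        "\n  ",
        ("item", {"sku": "B2"}, ["Gadget"], None),
        "\n",
    ],
    None,
)


def iter_elements(node):
    """Yield every element tuple in document order, skipping text."""
    _tag, _attrs, children, _spare = node
    yield node
    for child in children or []:
        if isinstance(child, tuple):
            yield from iter_elements(child)


def text_of(node):
    """Concatenate all text contained in a node and its descendants."""
    _tag, _attrs, children, _spare = node
    parts = []
    for child in children or []:
        parts.append(child if isinstance(child, str) else text_of(child))
    return "".join(parts)


# Collect an attribute from every <item> element.
skus = [n[1]["sku"] for n in iter_elements(tree) if n[0] == "item"]
```

Because every node is an ordinary tuple, no wrapper classes are needed:
indexing, unpacking and isinstance checks are enough to navigate the
document. Depending on the pyRXP build, tag names and text may arrive as
bytes or as unicode strings, so treat the str checks above as an
assumption.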
pyRXP, like RXP, is licensed under the GNU General Public License.
Commercial licenses are available from ReportLab for
situations where GPL is not appropriate, such as embedding
in closed source products.
This is not a full DOM implementation. But if you need to
get XML data into memory, we think it will do what 90% of
the people want, in 10% of the time. And with validation.
Enjoy!
Andy Robinson
CEO/Chief Architect,
ReportLab Inc.
* We have been informed that Daniel Veillard's Python wrapper
for libxml2 may be a contender, but have not been able to
do a comparable benchmark. No parser using Python SAX events
even comes close.