base tag in XML


prudek

Jun 11, 2009, 12:00:41 PM
to beautifulsoup
This XML file:

<TranslationSet>

<base loc="en">Add photo</base>
<tran loc="cs" origin="OldLoc exact match">Přidat obrázek</tran>

</TranslationSet>

produces very unexpected prettify() output:

<translationset>
<base loc="en" />
Add photo
<tran loc="cs" origin="OldLoc exact match">
Přidat obrázek
</tran>
</translationset>

Notice that the "base" tag became self-closing. This means that the
contents of the "base" tag are inaccessible.

If I rename the "base" tag to something silly, such as "piglet",
this problem does not occur.

Why is this happening for the base tag?

Lino Mastrodomenico

Jun 11, 2009, 6:12:07 PM
to prudek, beauti...@googlegroups.com
2009/6/11 prudek <pru...@bvx.cz>:

> Notice that the "base" tag became self-closing.

The "base" tag in HTML is self-closing (like, e.g., "br" and "img"),
and BeautifulSoup applies the HTML rules by default, so it closes the
tag before reading its text. You're parsing an XML document, so an XML
parser will work much better for you.
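
As one possible illustration of the XML-parser route, here is a minimal
sketch using xml.etree.ElementTree from the Python standard library. The
choice of ElementTree is an assumption made for this example, not a
recommendation given in the thread; the document is the one from the
original post.

```python
import xml.etree.ElementTree as ET

# The document from the original post, inlined as a string.
XML = """<TranslationSet>
<base loc="en">Add photo</base>
<tran loc="cs" origin="OldLoc exact match">Přidat obrázek</tran>
</TranslationSet>"""

# An XML parser has no notion of HTML's self-closing tags, so "base"
# is treated like any other element and keeps its text.
root = ET.fromstring(XML)
base = root.find("base")
tran = root.find("tran")

print(root.tag)                      # tag names keep their original case
print(base.get("loc"), base.text)
print(tran.get("loc"), tran.text)
```

Here the text of "base" stays attached to the element, and the root tag
is not lowercased.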

--
Lino Mastrodomenico

prudek

Jun 12, 2009, 3:20:50 AM
to beautifulsoup
But I love BeautifulSoup :-) so I used a regex to replace "base" with a
different name and then passed the result to BeautifulSoup.
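
The exact regex is not shown here, so the following is only a sketch of
what such a rename could look like. The helper name rename_tag and the
replacement name "piglet" are hypothetical. It rewrites the opening and
closing tags only, and it would also touch matching text inside comments
or CDATA sections, so it is a workaround rather than a robust fix.

```python
import re

def rename_tag(xml_text, old, new):
    """Rename <old ...> and </old> tags to <new ...> and </new> (hypothetical helper)."""
    # Match "<old" or "</old" only when the name ends there, i.e. it is
    # followed by whitespace, "/" or ">", so "<basement>" is left alone.
    pattern = re.compile(r"(</?)" + re.escape(old) + r"(?=[\s/>])")
    return pattern.sub(lambda m: m.group(1) + new, xml_text)

source = '<TranslationSet><base loc="en">Add photo</base><basement/></TranslationSet>'
renamed = rename_tag(source, "base", "piglet")
print(renamed)
```

The renamed string can then be handed to BeautifulSoup, which has no
special rule for "piglet".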

Which Python XML parser do you recommend?