
LXML: can't register namespace


Andrew Z

Mar 6, 2018, 11:04:11 PM
Hello,
With Python 3.6 and the latest lxml:

from lxml import etree

tree = etree.parse('Sample.xml')
etree.register_namespace('','http://www.example.com')

causes:
Traceback (most recent call last):
  File "/home/az/Work/flask/tutorial_1/src/xml_oper.py", line 16, in <module>
    etree.register_namespace('','http://www.example.com')
  File "src/lxml/etree.pyx", line 203, in lxml.etree.register_namespace (src/lxml/etree.c:11705)
  File "src/lxml/apihelpers.pxi", line 1631, in lxml.etree._tagValidOrRaise (src/lxml/etree.c:35382)
ValueError: Invalid tag name ''

partial Sample.xml:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

<Applications xmlns="http://www.example.com">
<Application>
<Customer email="ja...@acme.com" external_id="ABCDZ2002"
md_status_nonpro="true" type="INDIVIDUAL" prefix="jadoe">

<cut here>

It seems to be unhappy with the empty tag,
but I'm not sure why or how to go about it.

Thank you,
AZ

Steven D'Aprano

Mar 7, 2018, 12:38:19 AM
On Tue, 06 Mar 2018 23:03:15 -0500, Andrew Z wrote:

> Hello,
> with 3.6 and latest greatest lxml:
>
> from lxml import etree
>
> tree = etree.parse('Sample.xml')
> etree.register_namespace('','http://www.example.com')

> it seems to not be happy with the empty tag . But i'm not sure why and
> how to go about it.

Have you tried using something other than the empty string?

In the interactive interpreter, what does

help(etree.register_namespace)

say?



--
Steve

Andrew Z

Mar 7, 2018, 9:00:59 AM
Yes, if I give it any non-empty prefix, all goes well.

All I'm trying to do is extract a namespace (I'm trying to keep it simple here:
just the first one for now) and register it so I can save the XML later on.



Andrew Z

Mar 7, 2018, 11:17:21 PM
>>> help(etree.register_namespace)
Help on cython_function_or_method in module lxml.etree:

register_namespace(prefix, uri)
    Registers a namespace prefix that newly created Elements in that
    namespace will use. The registry is global, and any existing
    mapping for either the given prefix or the namespace URI will be
    removed.
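
For illustration, here is a minimal sketch of what that docstring describes when the prefix is not empty. The prefix "ex" is an arbitrary choice and not something from the original post; the URI is the one from the sample document.

```python
from lxml import etree

NS = "http://www.example.com"  # namespace URI from the sample document

# Register a non-empty prefix in lxml's global registry.
# "ex" is an arbitrary prefix picked for this sketch.
etree.register_namespace("ex", NS)

# A newly created element in that namespace, built without an explicit
# nsmap, is serialized with the registered prefix.
elem = etree.Element("{%s}Applications" % NS)
serialized = etree.tostring(elem)
print(serialized)
```

Note that the registry is global, so this affects every element created afterwards in the same process.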



Stefan Behnel

Mar 9, 2018, 4:22:48 AM
Andrew Z wrote on 07.03.2018 at 05:03:
> Hello,
> with 3.6 and latest greatest lxml:
>
> from lxml import etree
>
> tree = etree.parse('Sample.xml')
> etree.register_namespace('','http://www.example.com')

The default namespace prefix is spelled None (because there is no prefix
for it) and not the empty string.
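
A small sketch of that spelling, assuming the goal is to build a new tree whose root declares the default namespace. The element names are taken from the sample document; everything else is illustrative.

```python
from lxml import etree

NS = "http://www.example.com"

# None as the nsmap key declares the default (unprefixed) namespace
# on this element.
root = etree.Element("{%s}Applications" % NS, nsmap={None: NS})

# Children in the same namespace reuse the declaration from the root.
child = etree.SubElement(root, "{%s}Application" % NS)

xml_bytes = etree.tostring(root)
print(xml_bytes)
```

The mapping here is local to the tree being built, so nothing is registered globally.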


> causes:
> Traceback (most recent call last):
>   File "/home/az/Work/flask/tutorial_1/src/xml_oper.py", line 16, in <module>
>     etree.register_namespace('','http://www.example.com')
>   File "src/lxml/etree.pyx", line 203, in lxml.etree.register_namespace (src/lxml/etree.c:11705)
>   File "src/lxml/apihelpers.pxi", line 1631, in lxml.etree._tagValidOrRaise (src/lxml/etree.c:35382)
> ValueError: Invalid tag name ''
>
> partial Sample.xml:
> <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
>
> <Applications xmlns="http://www.example.com">
> <Application>
> <Customer email="ja...@acme.com" external_id="ABCDZ2002"
> md_status_nonpro="true" type="INDIVIDUAL" prefix="jadoe">
>
> <cut here>
>
> it seems to not be happy with the empty tag .
> But i'm not sure why and how to go about it.

Could you explain why you want to do that?

Stefan

Steven D'Aprano

Mar 9, 2018, 6:43:35 AM
On Fri, 09 Mar 2018 10:22:23 +0100, Stefan Behnel wrote:

> Andrew Z wrote on 07.03.2018 at 05:03:
>> Hello,
>> with 3.6 and latest greatest lxml:
>>
>> from lxml import etree
>>
>> tree = etree.parse('Sample.xml')
>> etree.register_namespace('','http://www.example.com')
>
> The default namespace prefix is spelled None (because there is no prefix
> for it) and not the empty string.

Is that documented somewhere?

Is there a good reason not to support "" as the empty prefix?



--
Steve

Stefan Behnel

Mar 9, 2018, 7:08:34 AM
Steven D'Aprano wrote on 09.03.2018 at 12:41:
> On Fri, 09 Mar 2018 10:22:23 +0100, Stefan Behnel wrote:
>
>> Andrew Z wrote on 07.03.2018 at 05:03:
>>> Hello,
>>> with 3.6 and latest greatest lxml:
>>>
>>> from lxml import etree
>>>
>>> tree = etree.parse('Sample.xml')
>>> etree.register_namespace('','http://www.example.com')
>>
>> The default namespace prefix is spelled None (because there is no prefix
>> for it) and not the empty string.
>
> Is that documented somewhere?

http://lxml.de/tutorial.html#namespaces


> Is there a good reason not to support "" as the empty prefix?

Well, the "empty prefix" is not an "empty" prefix, it's *no* prefix. The
result is not ":tag" instead of "prefix:tag", the result is "tag".
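
To make the distinction concrete, here is a short sketch comparing an element in the default namespace with one that uses a prefix. The names "tag" and "p" are placeholders chosen for this example.

```python
from lxml import etree

NS = "http://www.example.com"

# Same namespace, once declared as the default and once with a prefix.
default = etree.Element("{%s}tag" % NS, nsmap={None: NS})
prefixed = etree.Element("{%s}tag" % NS, nsmap={"p": NS})

# Both elements have the same namespace-qualified name.
same_name = default.tag == prefixed.tag

# Only the prefix differs: None means "no prefix", not an empty one.
default_prefix = default.prefix
prefixed_prefix = prefixed.prefix

default_xml = etree.tostring(default)    # serialized as "tag"
prefixed_xml = etree.tostring(prefixed)  # serialized as "p:tag"
print(default_prefix, prefixed_prefix, default_xml, prefixed_xml)
```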

But even ignoring that difference, why should the API support two ways of
spelling the same thing, and thus encourage users to write diverging code?

Stefan

Steven D'Aprano

Mar 9, 2018, 8:02:02 AM
On Fri, 09 Mar 2018 13:08:10 +0100, Stefan Behnel wrote:

>> Is there a good reason not to support "" as the empty prefix?
>
> Well, the "empty prefix" is not an "empty" prefix, it's *no* prefix. The
> result is not ":tag" instead of "prefix:tag", the result is "tag".

That makes sense, thanks.


--
Steve

Peter Otten

Mar 9, 2018, 8:12:07 AM
Stefan Behnel wrote:

> Andrew Z wrote on 07.03.2018 at 05:03:
>> Hello,
>> with 3.6 and latest greatest lxml:
>>
>> from lxml import etree
>>
>> tree = etree.parse('Sample.xml')
>> etree.register_namespace('','http://www.example.com')
>
> The default namespace prefix is spelled None (because there is no prefix
> for it) and not the empty string.

Does that mean the OP shouldn't use register_namespace() at all or that he's
supposed to replace "" with None?

If the latter, it looks like None isn't accepted either:

(lxml_again)$ python
Python 3.4.3 (default, Nov 28 2017, 16:41:13)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from lxml import etree
>>> etree.register_namespace(None, "http://www.example.com")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "src/lxml/etree.pyx", line 200, in lxml.etree.register_namespace (src/lxml/etree.c:11612)
  File "src/lxml/apihelpers.pxi", line 1442, in lxml.etree._utf8 (src/lxml/etree.c:32933)
TypeError: Argument must be bytes or unicode, got 'NoneType'
>>> etree.__version__
'4.1.1'


Stefan Behnel

Mar 9, 2018, 9:46:28 AM
Peter Otten wrote on 09.03.2018 at 14:11:
> Stefan Behnel wrote:
>
>> Andrew Z wrote on 07.03.2018 at 05:03:
>>> Hello,
>>> with 3.6 and latest greatest lxml:
>>>
>>> from lxml import etree
>>>
>>> tree = etree.parse('Sample.xml')
>>> etree.register_namespace('','http://www.example.com')
>>
>> The default namespace prefix is spelled None (because there is no prefix
>> for it) and not the empty string.
>
> Does that mean the OP shouldn't use register_namespace() at all or that he's
> supposed to replace "" with None?

It meant neither of the two, but now that you ask, I would recommend the
first. ;)

An application-global setup for the default namespace is never a good idea,
thus my question regarding the actual intention of the OP. Depending on the
context, the right thing to do might be to either not care at all, or to
not use the default namespace but a normally prefixed one instead, or to
define a (default) namespace mapping for a newly created tree, as shown in
the namespace tutorial.

http://lxml.de/tutorial.html#namespaces

Usually, not caring about namespace prefixes is the best approach. Parsers,
serialisers and compressors can deal with them perfectly and safely, humans
should just ignore the clutter, pitfalls and complexity that they introduce.
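
As a rough sketch of the "not caring" approach for the OP's case: a parsed document keeps its own namespace declarations when written back out, so no registration should be needed. The XML below is an abbreviated, hypothetical stand-in for Sample.xml, and the prefix "a" plus the attribute change are only for illustration.

```python
from lxml import etree

# Abbreviated, hypothetical stand-in for the OP's Sample.xml.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Applications xmlns="http://www.example.com">
  <Application>
    <Customer external_id="ABCDZ2002" type="INDIVIDUAL"/>
  </Application>
</Applications>"""

root = etree.fromstring(SAMPLE)

# The default namespace is available under the key None.
default_ns = root.nsmap[None]
qname = etree.QName(root)  # splits the tag into namespace and local name

# For searching, any locally chosen prefix can be mapped to the namespace;
# "a" is arbitrary and never appears in the output.
customer = root.find(".//a:Customer", namespaces={"a": default_ns})
customer.set("type", "BUSINESS")

# Serializing keeps the xmlns declaration that came from the parsed file.
output = etree.tostring(root, xml_declaration=True, encoding="UTF-8")
print(output.decode("UTF-8"))
```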

Stefan

Andrew Z

Mar 9, 2018, 9:38:17 PM
Stefan,
Thank you for the link. That explains the line of thinking of the package
designer(s).
I also looked at BeautifulSoup and found that it works better with my old brain.