XPath query with no namespace for a node with a default "xmlns"


Iñaki Baz Castillo

Nov 16, 2009, 7:00:09 AM
to nokogi...@googlegroups.com
Hi, let's suppose this complex XML document:

---------------------
<?xml version="1.0"?>
<foo xmlns="urn:test:default-namespace">
<ns1:bar xmlns:ns1="urn:test:namespace1-uri"
xmlns="urn:test:namespace1-uri">
<baz/>
<ns2:baz xmlns:ns2="urn:test:namespace2-uri"/>
</ns1:bar>
<ns3:hi xmlns:ns3="urn:test:namespace3-uri">
<there/>
</ns3:hi>
</foo>
---------------------


If I run an XPath query such as '/foo' with no namespaces, I get no node.
However, if I remove 'xmlns="urn:test:default-namespace"' from the <foo> node,
I do get the node.

Is this the expected behaviour?

If so, how can I get the <foo> node without removing the "xmlns" declaration
and without using namespaces in the XPath query?
Perhaps node.default_namespace is useful for this purpose?

Thanks a lot.

--
Iñaki Baz Castillo <i...@aliax.net>

Iñaki Baz Castillo

Nov 16, 2009, 7:28:05 AM
to nokogi...@googlegroups.com


Hmm, I cannot understand this. Note that the following XPath query also
returns an empty node set:

xml.xpath("/foo", {"xmlns"=>"urn:test:default-namespace"})
=> []

Why??

Iñaki Baz Castillo

Nov 16, 2009, 8:00:35 AM
to nokogi...@googlegroups.com
On Monday, 16 November 2009, Iñaki Baz Castillo wrote:
> On Monday, 16 November 2009, Iñaki Baz Castillo wrote:
> > Hi, let's suppose this complex XML document:
> >
> > ---------------------
> > <?xml version="1.0"?>
> > <foo xmlns="urn:test:default-namespace">
> > <ns1:bar xmlns:ns1="urn:test:namespace1-uri"
> > xmlns="urn:test:namespace1-uri">
> > <baz/>
> > <ns2:baz xmlns:ns2="urn:test:namespace2-uri"/>
> > </ns1:bar>
> > <ns3:hi xmlns:ns3="urn:test:namespace3-uri">
> > <there/>
> > </ns3:hi>
> > </foo>
> > ---------------------

> Humm, but I cannot understand it. Note that if I do the following Xpath
> query it also returns an empty node!:
>
> xml.xpath("/foo", {"xmlns"=>"urn:test:default-namespace"})
> => []
>
> Why??

The only way to make it work is to add the "xmlns" prefix to the XPath; then it
works even if no namespaces are passed to the XPath query:

xml.xpath("/xmlns:foo")
=> result OK!

xml.xpath("/xmlns:foo", {"xmlns"=>"urn:test:default-namespace"})
=> result OK!


But is it really required to use "/xmlns:foo" in the XPath query? Is it not
valid to use just "/foo"?

Thanks.

Iñaki Baz Castillo

Nov 16, 2009, 11:29:35 AM
to nokogi...@googlegroups.com

I had hoped that the following would work:

@xml.xpath("/foo", {nil=>"urn:test:default-namespace"})

but obviously it gives an error:
"NoMethodError: undefined method `gsub' for nil:NilClass"

So I have a big problem: the XCAP protocol allows the client to send an XPath
query without namespaces, but there is a document default namespace (in the
case above, "urn:test:default-namespace") which should be used to inspect the
XML document on the server.
That is, the client is not required to use:

xml.xpath("/xmlns:foo", {"xmlns"=>"urn:test:default-namespace"})

and could instead just use:

xml.xpath("/foo")

and the server must understand that the namespaces used are:

"xmlns"=>"urn:test:default-namespace"

However, I cannot "rewrite" the XPath query sent by the client (since I don't
know how to do it properly).

Is there any solution to get what I need? Thanks a lot.

Aaron Patterson

Nov 16, 2009, 11:34:29 AM
to nokogi...@googlegroups.com
On Mon, Nov 16, 2009 at 4:00 AM, Iñaki Baz Castillo <i...@aliax.net> wrote:
>
> Hi, let's suppose this complex XML document:
>
> ---------------------
> <?xml version="1.0"?>
> <foo xmlns="urn:test:default-namespace">
>  <ns1:bar xmlns:ns1="urn:test:namespace1-uri"
>           xmlns="urn:test:namespace1-uri">
>    <baz/>
>    <ns2:baz xmlns:ns2="urn:test:namespace2-uri"/>
>  </ns1:bar>
>  <ns3:hi xmlns:ns3="urn:test:namespace3-uri">
>    <there/>
>  </ns3:hi>
> </foo>
> ---------------------
>
>
> If I do a Xpath as '/foo' with no namespaces then I get no node.
> However if I remove the 'xmlns="urn:test:default-namespace"' in the <foo> node
> then I get the node.
>
> Is it the expected behaviour?

Yes.

xmlns defines a default namespace. That is, if no *explicit*
namespace is defined for the node, the *implicit* xmlns namespace is
used for the declaring node and all descendants of that node, unless a
descendant redeclares xmlns.

> If so, how could I get the <foo> node without removing the "xmlns" declaration
> and without using namespaces in the Xpath query?
> Perhaps node.default_namespace is useful for this purpose?

No. node.default_namespace gives you the "xmlns" namespace if it
exists. In this case, the word "default" really means explicit vs
implicit. The "default" being the implicit namespace.

--
Aaron Patterson
http://tenderlovemaking.com/

Aaron Patterson

Nov 16, 2009, 11:47:47 AM
to nokogi...@googlegroups.com

This should not be expected to work. A namespace name cannot be
blank. When you make a query like:

/foo

You are requesting nodes named "foo" that *belong to no namespace*.
Your node belongs to a namespace. In XPath, the namespace prefix is
*required* to indicate that you are looking for a node that belongs to
a namespace.

>
> but obviously it gives an error:
>  "NoMethodError: undefined method `gsub' for nil:NilClass"
>
> So I've a big problem as XCAP protocol allows the client sending a Xpath query
> without namespaces, but there is a document default namespace (in the case
> above "urn:test:default-namespace") which should be used to inspect the XML
> document in the server.
> This is, it's not required that the client uses:
>  xml.xpath("/xmlns:foo", {"xmlns"=>"urn:test:default-namespace"})
> and instead it could just use:
>  xml.xpath("/foo")
> and the server must understand than the namespaces used are:
>  "xmlns"=>"urn:test:default-namespace"
>
> However I cannot "rewrite" the Xpath query sent by the client (since I don't
> know how to do it in a proper way).
>
> Is there any solution to get what I need? Thanks a lot.

Possibly. If the client *always* sends non-namespaced queries, you
could remove all namespaces from the document:

http://nokogiri.org/Nokogiri/XML/Document.html#M000385

But if the client mixes queries, then you have a problem. I'm not
sure what to do about that. Tell the client to fix their software?

Iñaki Baz Castillo

Nov 16, 2009, 12:07:24 PM
to nokogi...@googlegroups.com
On Monday, 16 November 2009, Aaron Patterson wrote:
> > I wanted to expect that the following could work:
> >
> > @xml.xpath("/foo", {nil=>"urn:test:default-namespace"})
>
> This should not be expected to work. A namespace name cannot be
> blank. When you make a query like:
>
> /foo
>
> You are requesting nodes named "foo" that *belong to no namespace*.
> Your node belongs to a namespace. In XPath, the namespace prefix is
> *required* to indicate that you are looking for a node that belongs to
> a namespace.

Yes, I understand, but as I said, in XCAP the client is allowed to use an XPath
query with no namespaces, and a default namespace is supposed to be used :(


> > but obviously it gives an error:
> > "NoMethodError: undefined method `gsub' for nil:NilClass"
> >
> > So I've a big problem as XCAP protocol allows the client sending a Xpath
> > query without namespaces, but there is a document default namespace (in
> > the case above "urn:test:default-namespace") which should be used to
> > inspect the XML document in the server.
> > This is, it's not required that the client uses:
> > xml.xpath("/xmlns:foo", {"xmlns"=>"urn:test:default-namespace"})
> > and instead it could just use:
> > xml.xpath("/foo")
> > and the server must understand than the namespaces used are:
> > "xmlns"=>"urn:test:default-namespace"
> >
> > However I cannot "rewrite" the Xpath query sent by the client (since I
> > don't know how to do it in a proper way).
> >
> > Is there any solution to get what I need? Thanks a lot.
>
> Possibly. If the client *always* sends non-namespaced queries, you
> could remove all namespaces from the document:
>
> http://nokogiri.org/Nokogiri/XML/Document.html#M000385
>
> But if the client mixes queries, then you have a problem.

This does not work for me, as the XML documents I manage always have 2-3 namespaces :(


> I'm not
> sure what to do about that. Tell the client to fix their software?

Unfortunately it's not a bug. Perhaps it's a badly designed specification
(XCAP) which "corrupts" the XPath specification.

Iñaki Baz Castillo

Nov 19, 2009, 10:18:16 AM
to nokogi...@googlegroups.com
On Monday, 16 November 2009, Iñaki Baz Castillo wrote:

> Unfortunatelly it's not a bug. Perhaps it's a bad designed specification
> (XCAP) which "corrupts" the Xpath specification.

Hi again. I've analyzed and re-analyzed this subject by asking on other
mailing lists (related to XCAP and so on).

In the end, it seems that the XCAP protocol allows a "feature" which "breaks"
the usual XPath/XML mechanism a bit. To summarize:

------------------------------------------------------------------
> - If the XML document or node contains a default namespace then the XCAP
> Xpath query *MUST* contain a namespace definition matching it.
>
not exactly since if prefix is "empty" default document namespace must
be applied when inspecting the Xpath query.

> So, XML/Xpath specs are not violated by XCAP :)
>
I wouldn't call it a violation, but there's definitely different
interpretations between XPath 1 / XPath 2 / XCAP / Patch-Ops
selections.
------------------------------------------------------------------


So this is definitely my scenario:

- An XML document in the XCAP server belongs to a specific XCAP application.
Each XCAP application has a "default document namespace".

- Assume "default document namespace" = "urn:default-namespace":
----test1.xml------------------------
<?xml version="1.0"?>
<foo xmlns="urn:default-namespace">
</foo>
------------------------------------

- The client generates an XCAP request with this XPath (a query with no namespaces):
"/foo"

- I thought it was not valid, since the XML document declares a default
namespace on the "foo" node but the query doesn't declare it.
However, this MUST be correct per the XCAP specs:

"as default document namespace URI for this AU is indeed
"urn:default-namespace" a match occurs. This _is_ the use-case for
default document namespace, i.e. to avoid the query parameter."




So what I need exactly is:

- Let's imagine I've a Nokogiri::XML::Document:

---- @xml --------------------------
<?xml version="1.0"?>
<foo xmlns="urn:default-namespace">
</foo>
------------------------------------

- In "some" way I define that the application default document namespace is
"urn:default-namespace" (note that this is an XCAP-specific matter, and it
may or may not match the default namespace of the XML document's root node).

- I run an XPath query as follows:

node = @xml.xpath("/foo")

- For now this fails, as the "foo" node has a default namespace in the XML
document.

- However, I must instruct Nokogiri to interpret XPath node names with no ns
prefix as if they belonged to the *application* default document namespace
("urn:default-namespace").
That is, Nokogiri must behave as if the XPath were:

node = @xml.xpath("/a:foo", { "a" => "urn:default-namespace" })

- Of course, if the "foo" node has a different default namespace, the query
would fail.


So my question: assuming this is not natively possible with Nokogiri, what
would be the best workaround to implement it?
Perhaps modifying the Nokogiri code? Would that depend too much on libxml2?
Perhaps inspecting the XPath to manually insert an ns prefix and a namespace,
so that no changes would be required in Nokogiri?
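
A rough sketch of the last option, in plain Ruby using StringScanner from the standard library. The helper name qualify_xpath and its constants are hypothetical, and the tokenizer only aims at the simple location paths an XCAP client is likely to send: it assumes ASCII names, no whitespace around "::", and it is not a full XPath 1.0 parser.

```ruby
require "strscan"

NCNAME          = /[A-Za-z_][\w.-]*/
STRING_LITERAL  = /"[^"]*"|'[^']*'/
AXIS            = /#{NCNAME}::/
PREFIXED_TEST   = /#{NCNAME}:(?:#{NCNAME}|\*)/
OPERATOR_NAMES  = %w[and or div mod].freeze
NO_DEFAULT_AXES = %w[attribute namespace].freeze

# Hypothetical helper: add `prefix` to every unprefixed element name test.
def qualify_xpath(xpath, prefix)
  scanner = StringScanner.new(xpath)
  out = +""
  prev = nil             # last non-whitespace token
  axis = nil             # axis named by the previous token, if any
  after_operand = false  # true when the previous token can end an operand

  until scanner.eos?
    if (space = scanner.scan(/\s+/))
      out << space
      next
    end

    next_axis = nil
    operator = false
    if (tok = scanner.scan(STRING_LITERAL) || scanner.scan(PREFIXED_TEST))
      # String literals and already-prefixed tests are copied verbatim.
    elsif (tok = scanner.scan(AXIS))
      next_axis = tok.delete_suffix("::")
    elsif (tok = scanner.scan(NCNAME))
      if OPERATOR_NAMES.include?(tok) && after_operand
        operator = true  # "and", "or", "div", "mod" used as operators
      elsif !scanner.check(/\s*\(/) && prev != "@" && prev != "$" &&
            !NO_DEFAULT_AXES.include?(axis)
        # Not a function, node type test, attribute or variable name.
        tok = "#{prefix}:#{tok}"
      end
    else
      tok = scanner.getch  # punctuation, digits, "*", "." and so on
    end

    out << tok
    prev = tok
    axis = next_axis
    after_operand = !operator && tok.match?(/[\w\])"'.*]\z/)
  end
  out
end

puts qualify_xpath('/foo/bar[@id="x" and baz]/ns2:qux/text()', "a")
```

The rewritten string could then be passed to Nokogiri together with a binding for the chosen prefix, for example doc.xpath(qualify_xpath(path, "a"), "a" => "urn:default-namespace"). The prefix has to be one the client's own query does not already use.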


Thanks for any suggestion and help. Best regards.






http://www.ietf.org/mail-archive/web/simple/current/msg08574.html

Aaron Patterson

Nov 20, 2009, 11:50:22 AM
to nokogi...@googlegroups.com
You would have to modify the XPath engine in libxml2. Nokogiri does
no processing of the XPath, but just sends it along to libxml2. Any
changes would have to be made there.

Iñaki Baz Castillo

Nov 20, 2009, 12:19:06 PM
to nokogi...@googlegroups.com
On Friday, 20 November 2009, Aaron Patterson wrote:
> > So my question: Assuming this is not natively possible with Nokogiri,
> > which would be the best workaround to implement it?
> > Perhaps modifying Nokogiri code? would it depend too much on libxml2?
> > Perhaps inspecting the Xpath to manualy insert a ns prefix and a
> > namespace so no changes wouuld be required in Nokogiri?
>
> You would have to modify the XPath engine in libxml2. Nokogiri does
> no processing of the XPath, but just sends it along to libxml2. Any
> changes would have to be made there.

OK, I'm already coding an XPath parser in Ragel to meet this requirement.

Regards.