Let me show the following issue:
----------------------------
require 'nokogiri'

doc = '<?xml version="1.0" encoding="UTF-8"?>
<cp:ruleset xmlns:pr="urn:ietf:params:xml:ns:pres-rules"
xmlns:cp="urn:ietf:params:xml:ns:common-policy">
<cp:rule id="pres_blacklist">
<cp:identity>
<cp:one id="sip:al...@domain.net">Alice</cp:one>
</cp:identity>
</cp:rule>
</cp:ruleset>'
xml = Nokogiri::XML.parse(doc)
# Create a new child to insert under <cp:identity> section:
frag = xml.fragment('<cp:one id="b...@domain.net">Bob</cp:one>')
puts "frag = #{frag}"
ns = {
  "xmlns:pr" => "urn:ietf:params:xml:ns:pres-rules",
  "xmlns:cp" => "urn:ietf:params:xml:ns:common-policy"
}
parent_node = xml.xpath('/cp:ruleset/cp:rule[@id="pres_blacklist"]/cp:identity', ns)[0]
parent_node << frag
puts "xml = #{xml}"
------------------------------
The output is:
------------------------------
frag = <cp:cp:one id="b...@domain.net">Bob</cp:cp:one>
xml = <?xml version="1.0" encoding="UTF-8"?>
<cp:ruleset xmlns:pr="urn:ietf:params:xml:ns:pres-rules"
xmlns:cp="urn:ietf:params:xml:ns:common-policy">
<cp:rule id="pres_blacklist">
<cp:identity>
<cp:one id="sip:al...@domain.net">Alice</cp:one>
<cp:cp:one id="b...@domain.net">Bob</cp:cp:one></cp:identity>
</cp:rule>
</cp:ruleset>
------------------------------
Note that "frag" is wrong: it starts with "<cp:cp:one" (the namespace prefix
is duplicated) when it should start with "<cp:one".
As a result, the final XML document is affected too.
If I remove the XML namespace prefix when creating "frag", then it works:
----------------
frag = xml.fragment('<one id="b...@domain.net">Bob</one>')
=> <cp:one id="b...@domain.net">Bob</cp:one>
---------------
However, I don't understand why it works (it shouldn't). What if I have
several "one" children belonging to different namespaces? Something like:
<cp:identity>
<cp:one id="sip:al...@domain.net">Alice</cp:one>
<pr:one class="lalalala">LALALA</pr:one>
</cp:identity>
That is, why does Nokogiri guess the fragment's namespace?
How can I solve this? Is it a bug?
Thanks.
--
Iñaki Baz Castillo <i...@aliax.net>
Hi, this is with Nokogiri 1.3.3 under Ruby 1.9.
> When creating a new fragment, Nokogiri uses the namespace of the document
> root's first child. This seems arbitrary, and there doesn't seem to be any
> test coverage. (Aaron: do you remember why we did this?)
It cannot be reliable, as there could be several XML tags with the same name
but different namespaces (cp:one, pr:one...). This is the reason namespaces
exist :)
> > How to solve it? is it a bug?
>
> Let's talk about what we want it to do.
OK, I'll tell you my needs.
I'm implementing an XCAP server (RFC 4825). Basically, it's a protocol on top
of HTTP to create/manipulate/delete/fetch XML documents or document nodes
stored on the server.
A common operation is when the user appends a child to an XML document stored
on the server.
Let's imagine the server has a document:
----------------------------------
<cp:ruleset xmlns:pr="urn:ietf:params:xml:ns:pres-rules"
xmlns:cp="urn:ietf:params:xml:ns:common-policy">
<cp:rule id="pres_blacklist">
<cp:identity>
<cp:one id="sip:al...@domain.net">Alice</cp:one>
</cp:identity>
</cp:rule>
</cp:ruleset>
----------------------------------
The client wants to add a child "<cp:one id="sip:b...@domain.net"/>" to the
<cp:identity> node. So the client sends an HTTP PUT request like this (I show
the URL hex-unescaped):
-------------------------------------
PUT /my_docs/docs/doc.xml/~~/cp:ruleset/cp:rule[@id="pres_blacklist"]/cp:identity/cp:one[@id="sip:b...@domain.net"]?xmlns(pr=urn:ietf:params:xml:ns:pres-rules)xmlns(cp=urn:ietf:params:xml:ns:common-policy) HTTP/1.1

<cp:one id="sip:b...@domain.net"/>
-------------------------------------
Note that after /~~/ comes an XPath pointing to the new child to create, and
the body is the node to insert there.
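
As a rough illustration of that split, here is a minimal Ruby sketch. The helper name split_xcap_path is hypothetical, and prepending "/" to get an absolute XPath is my assumption (it mirrors the XPath used in the first message); the query string is assumed to be already removed.

```ruby
# Hypothetical helper: split an XCAP request path at the "/~~/" separator.
# Everything before it selects the document; everything after it is the
# node selector, turned into an absolute XPath here (an assumption).
def split_xcap_path(path)
  document_selector, node_selector = path.split("/~~/", 2)
  node_xpath = node_selector ? "/" + node_selector : nil
  { document: document_selector, node_xpath: node_xpath }
end

parts = split_xcap_path(
  '/my_docs/docs/doc.xml/~~/cp:ruleset/cp:rule[@id="pres_blacklist"]/cp:identity'
)
puts parts[:document]
puts parts[:node_xpath]
```

A path without the separator yields a nil node_xpath, meaning the request targets the whole document.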
Then the server must evaluate the XPath using the namespaces present in the
HTTP URL query. Here they match the "pr" and "cp" prefixes used in the server
document, but they could be different (they only need to match the prefixes
used in the XPath).
But the namespaces used in the PUT body (<cp:one id="sip:b...@domain.net"/>)
*MUST* match those used in the document; if not, the server should reply with
an error. It's the client's responsibility to ask first which namespaces are
used in order to use them in the PUT body.
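
To make the query part concrete, here is a small sketch of how the xmlns() bindings could be pulled out of the (already hex-unescaped) query string. The helper name parse_xmlns_bindings is hypothetical and the regular expression is a simplification, not a full implementation of the RFC 4825 grammar.

```ruby
# Hypothetical helper: extract prefix => namespace URI bindings from a query
# made of consecutive xmlns(prefix=uri) parts. Assumes the query has already
# been hex-unescaped and that URIs contain no parentheses.
def parse_xmlns_bindings(query)
  query.scan(/xmlns\(([^=()]+)=([^()]*)\)/).to_h
end

query = "xmlns(pr=urn:ietf:params:xml:ns:pres-rules)" \
        "xmlns(cp=urn:ietf:params:xml:ns:common-policy)"
bindings = parse_xmlns_bindings(query)
bindings.each { |prefix, uri| puts "#{prefix} => #{uri}" }
```

The resulting hash maps each prefix to its URI, the same information as the "ns" hash in my first example (without the "xmlns:" part in the keys).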
> Here's what I think should happen:
>
> - document fragments should not have a namespace, by default
In my case they will have namespaces most of the time :)
> - if a namespace is specified in the node fragment (like your <cp:one>
> fragment above), Nokogiri should check if the prefix matches any of the
> namespace definitions on the document root node. If it finds a match,
> the node should have that namespace. So in your above example, the node
> name would be "one" under the namespace with the prefix "cp".
ok
> - if a namespace is specified in the node fragment but does NOT match
> any of the namespace definitions on the document root, then the prefix
> will be silently ignored (which is libxml2's default behavior when parsing
> documents).
IMHO it should fail, as there is no way to guess which namespace the new node
belongs to. Am I wrong?
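
To make the options discussed above concrete, here is a plain-Ruby sketch of the prefix lookup. It is not Nokogiri code: resolve_qname, UnknownPrefixError and the strict flag are hypothetical names, and "declared" stands in for the namespace definitions found on the document root.

```ruby
class UnknownPrefixError < StandardError; end

# Hypothetical helper: resolve a qualified name such as "cp:one" against the
# prefix => URI declarations of the document root.
# - no prefix: the node gets no namespace
# - known prefix: the node gets the matching namespace URI
# - unknown prefix: raise when strict (what I am asking for), otherwise
#   drop the prefix silently (the other proposal)
def resolve_qname(qname, declared, strict: true)
  prefix, local = qname.include?(":") ? qname.split(":", 2) : [nil, qname]
  return { local: local, namespace: nil } if prefix.nil?

  uri = declared[prefix]
  return { local: local, namespace: uri } if uri

  raise UnknownPrefixError, "undeclared prefix '#{prefix}'" if strict
  { local: local, namespace: nil }
end

declared = {
  "pr" => "urn:ietf:params:xml:ns:pres-rules",
  "cp" => "urn:ietf:params:xml:ns:common-policy"
}
p resolve_qname("cp:one", declared)
p resolve_qname("pr:one", declared)
```

With this lookup, "cp:one" and "pr:one" end up in different namespaces, which is exactly the case that guessing cannot handle.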
Thanks a lot.
Thanks a lot. I want to test it in Ruby 1.9 but get an error due to the
"rexical" gem dependency:
-----------------------------------
/usr/src/ruby/nokogiri# rake1.9
(in /usr/src/ruby/nokogiri)
warning: couldn't activate the debugging plugin, skipping
cp tmp/x86_64-linux/nokogiri/1.9.1/nokogiri.so lib/nokogiri/nokogiri.so
rex --independent -o lib/nokogiri/css/generated_tokenizer.rb lib/nokogiri/css/tokenizer.rex
/usr/local/lib/ruby1.9/gems/1.9.1/gems/rexical-1.0.4/lib/rexical/rexcmd.rb:66:in `initialize': undefined method `collect' for #<String:0x00000000b4f830> (NoMethodError)
        from /usr/local/lib/ruby1.9/gems/1.9.1/gems/rexical-1.0.4/bin/rex:18:in `new'
        from /usr/local/lib/ruby1.9/gems/1.9.1/gems/rexical-1.0.4/bin/rex:18:in `<top (required)>'
        from /usr/local/bin/rex:19:in `load'
        from /usr/local/bin/rex:19:in `<main>'
need rexical, sudo gem install rexical
-----------------------------------
Note the "undefined method `collect' for #<String:0x00000000b4f830>". This
doesn't work in 1.9 since String no longer includes Enumerable.
So, does this mean that Nokogiri cannot be compiled for ruby-1.9?
(However, I can do "gem1.9 install nokogiri" and it works...)
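
For reference, a minimal illustration of the String change behind that error. The sample text is made up, and whether this is the right fix for rexical's rexcmd.rb is only an assumption, since I have not looked at what line 66 does with the result.

```ruby
text = "first rule\nsecond rule\n"

# In Ruby 1.8, String mixed in Enumerable, so text.collect iterated over
# lines. From 1.9 on, String no longer does, and the call must go through
# an explicit enumerator such as each_line.
lengths = text.each_line.collect { |line| line.chomp.length }

puts text.respond_to?(:collect)
puts lengths.inspect
```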
Thanks.
I've opened a new mail thread about this issue, as it is independent from the
original subject.