Need help with the Nokogiri XML parser


betterabhi .

Feb 25, 2015, 4:40:39 AM
to rubyonrails-talk
Hi,

I am facing an issue with the Nokogiri XML parser:

I am using the following code to parse:

doc = Nokogiri::HTML.fragment(xml)
puts doc.to_xml

Output:
<image>
<x>-18</x>
<y>0</y>
<width>960</width>
<height>720</height>
<borderwidth>0px</borderwidth>
<link />%% url-97255 %%
<bordercolor>000000</bordercolor>
<caption>Opening Still</caption>
<linktype>url-97255</linktype>
<linktarget>_self</linktarget>
</image>

Ideally the link tag should be "<link>%% url-97255 %%</link>". 

And when I use:
doc = Nokogiri::XML(xml)
puts doc.to_xml

then the HTML entities are not parsed correctly:
<richtext>
<x>331</x>
<y>183</y>
<width>508</width>
<height>44</height>
<richmailmerge/>
<usedarkbg>false</usedarkbg>
<borderColor>000000</borderColor>
<usebgcolor>false</usebgcolor>
<richtext-textfield>P ALIGN=LEFTFONT FACE=Arial SIZE=24 COLOR=#CC0033 LETTERSPACING=0 KERNING=0Thi - Creativs ise/FONT/P</richtext-textfield>
<borderWidth>0px</borderWidth>
<backgroundcolor>000000</backgroundcolor>
<richtext/>
</richtext>
 
Instead, I was hoping to get output something like:
<richtext-textfield>&lt;P ALIGN="LEFT"&gt;&lt;FONT FACE="Arial" SIZE="24" COLOR="#CC0033" LETTERSPACING="0" KERNING="0"&gt;Membership Rewards&lt;/FONT&gt;&lt;/P&gt;</richtext-textfield>

Any help would be appreciated.

Thank you,
Abhishek Shukla

Frederick Cheung

Feb 25, 2015, 11:17:31 AM
to rubyonra...@googlegroups.com


On Wednesday, February 25, 2015 at 9:40:39 AM UTC, bette...@gmail.com wrote:
Hi,

I am facing issue with the Nokogiri XML Parser:

I am using the following code to parse:

doc = Nokogiri::HTML.fragment(xml)
puts doc.to_xml

What is the input you're feeding it? If the input is malformed, then Nokogiri will have to guess at how to fix it.

Fred 