Yes, when you call the text method, it assumes that you want to insert
plain text and will escape markup characters such as < and > for you.
I would recommend copying the nodes from one document to another like this:
require 'nokogiri'

doc1 = Nokogiri::HTML(<<-eohtml)
<html>
<body><h1>Hello World</h1> how are you?</body>
</html>
eohtml

dest = Nokogiri::HTML::Builder.new do |b|
  b.html do
    b.head do
      b.script
      b.style
    end
    b.body
  end
end.doc
# Get the destination body
body = dest.at('body')
# Add the children of the source body to the destination body
doc1.at('body').children.each { |c| body << c }
puts dest
> 2) For some block element(s) in the source HTML there is:
> <p>blah blhal bhalb blah blah<br><span style="mso-spacerun:
> yes"> </span>foo foo foo foo foo foo foo</p>
>
> I would like to transform the above <p> to essentially:
> <p>blah blhal bhalb blah blah<br><span class="poetry-line-odd">foo foo
> foo foo foo foo foo foo</span></p>
>
> But I'm having trouble accomplishing (2) as this is my first exposure
> to XPath / Nokogiri -- but I believe the tools are in my hands to get
> this done ... and I think this "might work" (TM).
If you're more comfortable with CSS, you should use CSS. Here is a
CSS query that says "find all span tags that have an attribute named
style whose value is 'mso-spacerun: yes'":
doc.css('span[style = "mso-spacerun: yes"]').each do |span|
  # Swap the Word-specific style for the class you asked for
  span['class'] = 'poetry-line-odd'
  span.delete('style')
end
Here is the same query using XPath:
doc.xpath('//span[@style = "mso-spacerun: yes"]').each do |span|
  # Swap the Word-specific style for the class you asked for
  span['class'] = 'poetry-line-odd'
  span.delete('style')
end
Hope that helps!
--
Aaron Patterson
http://tenderlovemaking.com/