Any ideas where hex values may be coming in from?

Wayne Brissette

Jun 9, 2020, 6:18:15 AM
to nokogiri-talk
I'm seeing something that I can't quite put my finger on. I have some
files that are written out as XML just as I would expect when I view
them in oXygen. Then I have a few files where, instead of carriage
returns, I see: &#xD;

I recognize that this is just a hex representation, but I also think it
may be causing me some processing issues. That being said, when I look
at one particular file in the IntelliJ debugger, I don't see this
until the builder is done with it.

Any ideas as to what may cause this in some files, but not others? All
of these files are produced in exactly the same way, so it's not as if
they are coming in from different sources.


Mike Dalessio

Jun 9, 2020, 8:52:44 AM
to nokogiri-talk
Hey Wayne,

Without seeing your code and the source documents, I can only guess. But here's my guess.

&#xD; is the hex character reference for "carriage return", as in hex 0D, which is part of the common line terminator in the Windows world. It can be seen when "canonicalizing" documents with CRLF line terminators (i.e., Nokogiri::HTML::Document#canonicalize). Is that how you're serializing?

#! /usr/bin/env ruby

require "nokogiri"

html = <<-EOXML
  <element foo="asdf\r\n">this has a CR and a LF at the end of it\r\n</element>
EOXML

puts Nokogiri::HTML(html).canonicalize


  <element foo="asdf&#xD;&#xA;">this has a CR and a LF at the end of it&#xD;
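If CRLF line endings do turn out to be the cause, one way to find out why only some files are affected is to check the source text for carriage returns before it reaches the parser, and optionally normalize them to LF. The following is only a minimal sketch in plain Ruby; the helper names `contains_cr?` and `normalize_newlines` are made up for illustration and are not part of Nokogiri.

```ruby
# Hypothetical helpers (not part of Nokogiri) for spotting and
# normalizing carriage returns in source text before parsing.

# True when the text contains at least one carriage return (hex 0D).
def contains_cr?(text)
  text.include?("\r")
end

# Convert CRLF pairs and lone CRs to a single LF.
def normalize_newlines(text)
  text.gsub(/\r\n?/, "\n")
end

# Same kind of input as in the script above, built as a plain string.
sample = "<element foo=\"asdf\r\n\">this has a CR and a LF at the end of it\r\n</element>\n"

puts contains_cr?(sample)                      # true
puts contains_cr?(normalize_newlines(sample))  # false
```

When checking real files, reading them in binary mode (for example with `File.binread`) avoids any newline translation that could hide the CR bytes. Whether normalizing is appropriate depends on whether the carriage returns are meaningful in your data, so treat this as a diagnostic aid rather than a fix.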
