Why is \b replaced with \u{65533} (�)?


Larry Zhao

Jan 4, 2018, 9:48:43 AM
to nokogiri-talk
Hi guys,

I just found that Nokogiri replaces "\b" in the original HTML string with \u{65533} (�).

Why is that? Is there a list of the characters that get replaced with \u{65533} (�)?


Thanks a lot for the help, and best regards.

Mike Dalessio

Jan 4, 2018, 9:53:24 AM
to nokogiri-talk
Hi,

Thanks for asking this question. I'm not able to reproduce what you're seeing. Can you help me understand your question?

Here's what I did to try to reproduce:

require 'nokogiri'

html = "<body><div>this is a single backslash '\b'</div><div>this is a double backslash '\\b'</div></body>"
doc = Nokogiri::HTML html

puts doc.to_html

The output is:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body>
<div>this is a single backslash ''</div>
<div>this is a double backslash '\b'</div>
</body></html>

And my system looks like:

$ nokogiri -v
# Nokogiri (1.8.1)
    ---
    warnings: []
    nokogiri: 1.8.1
    ruby:
      version: 2.4.1
      platform: x86_64-linux
      description: ruby 2.4.1p111 (2017-03-22 revision 58053) [x86_64-linux]
      engine: ruby
    libxml:
      binding: extension
      source: packaged
      libxml2_path: "/home/flavorjones/.rvm/gems/ruby-2.4.1/gems/nokogiri-1.8.1/ports/x86_64-pc-linux-gnu/libxml2/2.9.5"
      libxslt_path: "/home/flavorjones/.rvm/gems/ruby-2.4.1/gems/nokogiri-1.8.1/ports/x86_64-pc-linux-gnu/libxslt/1.1.30"
      libxml2_patches: []
      libxslt_patches: []
      compiled: 2.9.5
      loaded: 2.9.5
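
For anyone trying to narrow this down, here is a minimal plain-Ruby sketch (no Nokogiri involved) of the characters under discussion. It shows that "\b" in a double-quoted Ruby string is a single backspace character (U+0008), that "\\b" is a backslash followed by the letter b, and that 65533 is the decimal value of U+FFFD, the Unicode replacement character. The helper names `control_chars` and `contains_replacement?` are hypothetical, made up for this illustration, and the sketch makes no claim about what Nokogiri or libxml2 does with such input.

```ruby
# "\b" inside double quotes is one character: backspace (U+0008).
single = "\b"
# "\\b" is two characters: a literal backslash followed by the letter b.
double = "\\b"

# 65533 is decimal; the same code point written in hex is U+FFFD.
replacement = "\uFFFD"

# Hypothetical helper: list control characters (other than tab, newline,
# and carriage return) present in the input string.
def control_chars(text)
  text.each_char.select { |c| c.ord < 0x20 && !"\t\n\r".include?(c) }
end

# Hypothetical helper: report whether a string contains U+FFFD.
def contains_replacement?(text)
  text.include?("\uFFFD")
end

html = "<div>single '\b'</div><div>double '\\b'</div>"

p single.codepoints               # code point of the backspace character
p double.codepoints               # code points of backslash and "b"
p replacement.ord                 # decimal value of U+FFFD
p control_chars(html).map(&:ord)  # control characters found in the input
p contains_replacement?(html)     # whether the input already holds U+FFFD
```

Running `control_chars` on the original input and `contains_replacement?` on both the input and the parsed output is one way to tell whether the replacement character was already in the source or appeared during parsing.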


