How to convert HTML entities to UTF-8 characters in Ruby 1.8.7


Nila

Sep 11, 2012, 4:02:22 AM
to rubyonra...@googlegroups.com
Hi,

Is there a way to convert HTML entities into UTF-8 characters in Ruby 1.8.7?
(For example, for "ö", convert the numeric entity "&#246;" into "\303\266", or at least convert the HTML entity to the character "ö".)

Thank you

Matt Jones

Sep 11, 2012, 7:53:49 AM
to rubyonra...@googlegroups.com
CGI.unescapeHTML may do what you're looking for.

--Matt Jones 
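
A minimal sketch of that suggestion, using only the standard library. The helper name decode_with_cgi is made up for illustration and is not from the thread. The result assumes Ruby 1.9 or later with a UTF-8 string; Ruby 1.8.7 treats numeric entities differently depending on $KCODE, so the bytes should be checked there before relying on them.

```ruby
# encoding: utf-8
require 'cgi'

# Hypothetical helper (not from the thread): decode the entities that
# CGI.unescapeHTML understands and return the decoded string together
# with its raw bytes.
def decode_with_cgi(text)
  decoded = CGI.unescapeHTML(text)
  [decoded, decoded.unpack("C*")]
end

decoded, bytes = decode_with_cgi("&#246;")
puts decoded
# Print each byte as an octal escape, the notation used in the question.
puts bytes.map { |b| "\\%o" % b }.join
```

On a UTF-8 setup this should print the character followed by its two bytes in octal, which is the "\303\266" form asked about.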

Walter Lee Davis

Sep 11, 2012, 9:37:28 AM
to rubyonra...@googlegroups.com
Here's what I do:

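require 'rubygems'      # needed on Ruby 1.8.7 to load gems
require 'htmlentities'  # gem install htmlentities
coder = HTMLEntities.new
foo = coder.decode(foo)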

I tried CGI.unescapeHTML and hit some problems, but that might have been my source talking, since there were also custom entities declared in XML.

Walter
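
One possible source of such problems (an assumption, not something confirmed in this thread) is that CGI.unescapeHTML only recognizes a handful of named entities (&amp;amp;, &amp;lt;, &amp;gt;, &amp;quot;) plus numeric ones, and leaves every other named entity untouched. The sketch below lists what is left over after it has run; entities_left_after_cgi is a made-up helper name.

```ruby
require 'cgi'

# Hypothetical helper (not from the thread): return the named-entity
# tokens that are still present after CGI.unescapeHTML has run.
def entities_left_after_cgi(text)
  CGI.unescapeHTML(text).scan(/&[A-Za-z][A-Za-z0-9]*;/)
end

# &lt;, &gt; and &amp; are decoded; &ouml; and &copy; are not.
p entities_left_after_cgi("&lt;p&gt;sch&ouml;n &amp; gut&copy;&lt;/p&gt;")
```

Anything this returns would still need a decoder with a full entity table, which is the gap HTMLEntities#decode fills.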

Nila

Sep 12, 2012, 1:26:51 AM
to rubyonra...@googlegroups.com
Thanks a lot for the responses. I was able to do it using HTMLEntities.