help handling web page encoding


nilanjan

Apr 11, 2012, 12:55:44 AM
to nokogiri-talk
Hi,

I am a new user.

When I read some HTML pages, some of the characters (quotes,
apostrophes) come in as Unicode(?). How should I handle these in
Nokogiri? These are plain English pages.

In the sample below, each quote (") comes in as multiple characters.

Thanks,
- Nilanjan


example text: (from http://www.sqe.com/BetterSoftwareWest/Concurrent/Default.aspx?Date=6/13/2012)
he “centerpiece of our five-year corporate IT strategy.” Ho
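
For reference, the garbled quote above looks like what you get when the UTF-8 bytes of a curly quote are decoded as Windows-1252. That is a guess from the sample text, not something confirmed for this page. A minimal plain-Ruby sketch that reproduces the symptom and reverses it, using a made-up one-word sample and the variable names `original`, `garbled`, and `repaired` purely for illustration:

```ruby
# Hypothetical sample: the opening curly quote (U+201C) plus one word, as correct UTF-8.
original = "\u201Ccenterpiece"

# Simulate the misread: relabel the UTF-8 bytes as Windows-1252,
# then transcode the result to UTF-8 so it can be displayed.
garbled = original.dup.force_encoding("Windows-1252").encode("UTF-8")
puts garbled          # the single quote now shows up as three characters

# Reverse the misread: transcode back to the Windows-1252 bytes,
# then relabel those bytes as the UTF-8 they really were.
repaired = garbled.encode("Windows-1252").force_encoding("UTF-8")
puts repaired == original
```

If that is the cause here, the cleaner fix is to tell the parser the real encoding up front rather than repair strings afterwards. `Nokogiri::HTML` accepts an encoding name as its third argument, e.g. `Nokogiri::HTML(html, nil, 'UTF-8')`, where `html` stands for the page source. Whether this particular page is actually served as UTF-8 is an assumption.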