Dealing with the Â character


George

Mar 9, 2010, 11:58:08 AM
to Watir General
I'm not sure if this is a Watir or a Ruby question. I'm pulling all
the contents from a select list, and when I display each option, I get
something similar to the following (without the quotes):

"3       Jubitz Travel
Center               Portland,OR"

Is there something in Watir that will get rid of the Â character? Is
this related to UTF-8? Ultimately, I'm trying to isolate "Jubitz
Travel Center" from the text string. If someone can help, I would be
most grateful!

Thanks,

George

windy

Mar 9, 2010, 12:51:03 PM
to watir-...@googlegroups.com
I think it's because the page's encoding is different from UTF-8, the default
encoding (Watir 1.6.5 defaults to UTF-8).
You could try changing the line WIN32OLE.codepage = WIN32OLE::CP_UTF8 in
watir-1.6.5\lib\watir\win32ole.rb.
Here are some values you can try:
WIN32OLE::CP_ACP
WIN32OLE::CP_OEMCP
WIN32OLE::CP_MACCP
WIN32OLE::CP_THREAD_ACP
WIN32OLE::CP_SYMBOL
WIN32OLE::CP_UTF7
WIN32OLE::CP_UTF8

orde

Mar 9, 2010, 12:57:23 PM
to Watir General
This will remove each "Â" character from the string:

x = "3       Jubitz Travel Center              Â
Portland,OR"
x.gsub!(/Â/, "")

But you're probably looking to solve your root problem. Looks like a
UTF-8 vs. ISO-8859-1 issue. Do a search for "utf-8 Â as a space" on
your favorite search engine, and that should point you in the right
direction (hopefully).
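
To make the UTF-8 vs. ISO-8859-1 point concrete, here is a minimal sketch of how a single non-breaking space can turn into "Â" plus a space-like character. It assumes a Ruby with String#encode (1.9 or later), and it assumes the padding in the select list is non-breaking spaces, which the thread does not confirm.

```ruby
# A non-breaking space (U+00A0) is stored as the two bytes C2 A0 in UTF-8.
nbsp = "\u00A0"
utf8_bytes = nbsp.encode("UTF-8").bytes.to_a

# Read those same two bytes as ISO-8859-1 instead: every byte becomes its
# own character, so C2 turns into "Â" and A0 stays a non-breaking space.
misread = nbsp.dup.force_encoding("ISO-8859-1").encode("UTF-8")

puts utf8_bytes.inspect              # [194, 160]
puts misread.codepoints.to_a.inspect # [194, 160], i.e. U+00C2 then U+00A0
```

If that is what is happening, stripping only the "Â" leaves the non-breaking spaces behind, and those are not matched by \s in a Ruby regex.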

Hope that helps.

orde

Ethan

Mar 9, 2010, 6:36:02 PM
to watir-...@googlegroups.com
Is this in IE or Firefox?


George

Mar 10, 2010, 11:26:29 AM
to Watir General
I'm using IE. I tried using gsub!(/Â/, ""), but I couldn't remove the
remaining spaces. It took a long time, but I think I figured it out
using this:

x = "3       Jubitz Travel Center              Â
Portland,OR"

# gets rid of the beginning number/funky chars
b = x.gsub(x[/(\d+)(\b[^0-9A-Za-z]{1,}\b)/], '')
# gets rid of the second set of funky chars and city/state
puts b.gsub(b[/(\b[^0-9A-Za-z]{1,}\b)(\w+)(,..)\Z/], '')
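
An alternative sketch for pulling out the middle field, in case the two regexes get hard to maintain: turn every "Â" and non-breaking space into a plain space, then split on runs of two or more spaces. The helper name extract_name is made up for illustration, and the approach assumes the columns are always separated by at least two padding characters while the name itself only contains single spaces.

```ruby
# Hypothetical helper, not part of Watir: returns the second column of an
# option string such as "3   Jubitz Travel Center   Portland,OR".
def extract_name(option_text)
  # Replace "Â" (U+00C2) and non-breaking spaces (U+00A0) with plain spaces.
  cleaned = option_text.gsub(/[\u00C2\u00A0]/, " ")
  # Columns are assumed to be separated by two or more whitespace characters.
  fields = cleaned.strip.split(/\s{2,}/)
  fields[1]
end

sample = "3 \u00C2\u00A0\u00C2\u00A0 Jubitz Travel Center \u00C2\u00A0\u00C2\u00A0 Portland,OR"
puts extract_name(sample)   # Jubitz Travel Center
```

This returns nil when the string has fewer than two columns, so real code would want to check for that.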

orde

Mar 10, 2010, 12:44:06 PM
to Watir General
Nice that you got it worked out.

One question, though.

Open IE, go to the website you're testing, right click, and select
Encoding. Is Unicode (UTF-8) selected?

George

Mar 10, 2010, 1:04:40 PM
to Watir General
It appears that Western European (ISO) is selected by default. Is
there something I can do in the code to convert it to UTF-8?

orde

Mar 10, 2010, 2:03:35 PM
to Watir General
You should be able to select Unicode (UTF-8) from that dropdown. That
might fix the "Â" issue that you're observing.
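
If changing the browser setting is not an option, the conversion can in principle be undone in code. This is only a sketch under assumptions: it needs a Ruby with String#encode (1.9 or later), it assumes the text really is UTF-8 that was decoded as ISO-8859-1, and repair_mojibake is a made-up helper name, not a Watir method.

```ruby
# Hypothetical helper: undo UTF-8 text that was decoded as ISO-8859-1.
# Raises Encoding::UndefinedConversionError if the text contains characters
# that do not exist in ISO-8859-1, which would mean the assumption is wrong.
def repair_mojibake(text)
  text.encode("ISO-8859-1").force_encoding("UTF-8")
end

garbled = "Jubitz\u00C2\u00A0Travel"   # "Â" followed by a non-breaking space
fixed   = repair_mojibake(garbled)     # the pair collapses back to one U+00A0
puts fixed.gsub("\u00A0", " ")         # Jubitz Travel
```

After the repair, the remaining non-breaking spaces can be replaced with ordinary spaces as in the last line.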