SeleniumLibrary - how to click on links that contain Umlauts (and other funny characters)?


AndreasEK

Feb 19, 2010, 11:02:36 AM
to robotframework-users
Hi,

following problem. Selenium should click on a link that looks similar
to this:

<p>
<a href="theUrl" class="class1 class2 class3 class4">
<strong>

Link Text

</strong>
</a>

</p>

I know it's ugly, but that's how it is :) The Selenium API offers a
location strategy that lets you use "link=Link Text" directly. The
SeleniumLibrary is a little smarter and first checks the id, name,
href, etc. before falling back to click("link=Link Text") from
Selenium. This results in two calls, as the Selenium log shows:

# info(1266594482024): Executing: |click | xpath=//a[@id="Link Text" or @name="Link Text" or @href="Link Text" or normalize-space(descendant-or-self::text())="Link Text" or @href="http://vam03.dev5.oev.de/web/html/privat/_mfl/suche/Link Text"] | |
# error(1266594482024): Element xpath=//a[@id="Link Text" or @name="Link Text" or @href="Link Text" or normalize-space(descendant-or-self::text())="Link Text" or @href="http://vam03.dev5.oev.de/web/html/privat/_mfl/suche/Link Text"] not found
# info(1266594482044): Executing: |click | link=Link Text | |
# error(1266594482047): Element link=Link Text not found
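
The combined XPath from the first log entry can be reproduced roughly as follows. This is only a sketch reconstructed from the log output, not SeleniumLibrary's actual code; the function name `build_link_xpath` and its `base_url` parameter are made up for illustration.

```python
def build_link_xpath(locator, base_url):
    """Build an XPath resembling the one in the Selenium log (illustrative only)."""
    # Hypothetical helper: one expression that checks id, name, href
    # (relative and absolute) and the normalized link text.
    conditions = [
        '@id="%s"' % locator,
        '@name="%s"' % locator,
        '@href="%s"' % locator,
        'normalize-space(descendant-or-self::text())="%s"' % locator,
        '@href="%s%s"' % (base_url, locator),
    ]
    return "xpath=//a[%s]" % " or ".join(conditions)


xpath = build_link_xpath(
    "Link Text", "http://vam03.dev5.oev.de/web/html/privat/_mfl/suche/"
)
print(xpath)
```

Note that this naive string formatting would break for a locator that itself contains a double quote; the sketch ignores that case.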

Of course the test case fails:

C:\cc\robot\Neuer Ordner>pybot Umlaute.html
==============================================================================
Umlaute
==============================================================================
Should Open Link With Umlaut
| FAIL |
ERROR: Element link=Link Text not found
------------------------------------------------------------------------------
Umlaute
| FAIL |
1 critical test, 0 passed, 1 failed
1 test total, 0 passed, 1 failed
==============================================================================


Now comes the problem. When the link text contains umlauts, the
error message itself is not handled correctly. Selenium still
reports that the element could not be found:

# info(1266594753116): Executing: |click | xpath=//a[@id="linkitää teksti" or @name="linkitää teksti" or @href="linkitää teksti" or normalize-space(descendant-or-self::text())="linkitää teksti" or @href="http://vam03.dev5.oev.de/web/html/privat/_mfl/suche/linkitää teksti"] | |
# error(1266594753116): Element xpath=//a[@id="linkitää teksti" or @name="linkitää teksti" or @href="linkitää teksti" or normalize-space(descendant-or-self::text())="linkitää teksti" or @href="http://vam03.dev5.oev.de/web/html/privat/_mfl/suche/linkitää teksti"] not found

But something goes wrong while that error is being processed. This
UnicodeEncodeError is really stressing me out by now!

C:\cc\robot\Neuer Ordner>pybot Umlaute.html
==============================================================================
Umlaute
==============================================================================
Should Open Link With Umlaut
| FAIL |
UnicodeEncodeError: ('ascii', u'ERROR: Element xpath=//a[@id="linkit\xe4\xe4 teksti" or @name="linkit\xe4\xe4 teksti" or @href="linkit\xe4\xe4 teksti" or normalize-space(descendant-or-self::text())="linkit\xe4\xe4 teksti" or @href="http://vam03.dev5.oev.de/web/html/privat/_mfl/suche/linkit\xe4\xe4 teksti"] not found', 36, 38, 'ordinal not in range(128)')
------------------------------------------------------------------------------
Umlaute
| FAIL |
1 critical test, 0 passed, 1 failed
1 test total, 0 passed, 1 failed
==============================================================================
Output: c:\cc\robot\neuer ordner\output.xml
Report: c:\cc\robot\neuer ordner\report.html
Log: c:\cc\robot\neuer ordner\log.html


Of course, the whole test run then stops and the link is never
clicked. I have no idea how or where to fix this. Any ideas?

AndreasEK

Feb 19, 2010, 12:02:45 PM
to robotframework-users
I was impatient, sorry :)

I raised a bug and attached a patch:
http://code.google.com/p/robotframework-seleniumlibrary/issues/detail?id=98

Have a nice weekend,

Andreas Ebbert-Karroum

Magnus

Feb 22, 2010, 3:50:45 AM
to robotframework-users
I'm pretty sure I've seen this error in test cases that do not use
SeleniumLibrary as well.
Could it be that the same "bug" is present in BuiltIn library too?

/Magnus

On Feb 19, 6:02 pm, AndreasEK <Andreas.Ebb...@gmx.de> wrote:
> I was impatient, sorry :)
>

> raised a bug and attached a patch: http://code.google.com/p/robotframework-seleniumlibrary/issues/detail...

Pekka Klärck

Feb 22, 2010, 5:11:02 AM
to magnus....@gmail.com, robotframework-users
2010/2/22 Magnus <magnus....@gmail.com>:

> I'm pretty sure I've seen this error in test cases that do not use
> SeleniumLibrary as well.

Unicode doesn't work 100% correctly on Jython. Many bugs we reported
during the Jython 2.2 alpha/beta/rc phase were fixed, but not all. There
are some workarounds for the remaining issues in the code base, but
unfortunately they aren't fully compatible with Jython 2.5 or even
2.2.1, both of which have introduced different problems. I hope this
situation gets better in RF 2.5, when we plan to support only Jython
(and Python) 2.5 or newer 2.x releases.

> Could it be that the same "bug" is present in BuiltIn library too?

I'd be surprised if the exact same problem occurs there, but
anything is obviously possible. If you encounter something, let us
know via this mailing list or the issue tracker.

Cheers,
.peke

AndreasEK

Feb 22, 2010, 6:17:04 AM
to robotframework-users
Hi,

in my example I was using plain Python (2.6), no Jython at all. I
understand that there may be additional problems with Jython, but it
should work in Python as well :)

On Feb 22, 11:11 am, Pekka Klärck <pekka.kla...@gmail.com> wrote:
> 2010/2/22 Magnus <magnus.smedb...@gmail.com>:


>
> > Could it be that the same "bug" is present in BuiltIn library too?
>
> I'd be surprised if the exact same problems occurs there but
> everything is obviously possible. Let us know if you encounter
> something via this mailing list or the issue tracker.

In my opinion it's not only possible but quite likely that the same
behaviour is also present in other places. The problem was that the
Python keyword tried to extract the error message from an exception
with str(err). This leads to the UnicodeEncodeError, since str() cannot
convert Unicode text containing non-ASCII characters into ASCII. The
solution was to use unicode(err) instead. But ... my Python knowledge
is really limited, and maybe that approach brings other problems with
it that I am not aware of.
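
To make the failure mode concrete: on Python 2, str() applied to a unicode message encodes it implicitly with the ASCII default codec. The sketch below performs that encoding step explicitly, so it behaves the same on Python 2.6+ and Python 3. The message is just the beginning of the Selenium error text from the first post; the variable names are illustrative.

```python
# -*- coding: utf-8 -*-
# Start of the Selenium error message, with each 'ä' written as \xe4.
message = u'ERROR: Element xpath=//a[@id="linkit\xe4\xe4 teksti"'

try:
    # Roughly what Python 2's str(unicode_object) does behind the scenes.
    message.encode("ascii")
except UnicodeEncodeError as exc:
    failure = (exc.encoding, exc.start, exc.end, exc.reason)
    print(failure)

# Encoding with a codec that covers the characters succeeds.
encoded = message.encode("utf-8")
```

The offsets reported here, 36 and 38, are the same ones that appear in the UnicodeEncodeError tuple in the failing test output.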

For example, this can be found in the __init__.py of RF:

def _run_or_rebot_from_cli(method, cliargs, usage, **argparser_config):
    LOGGER.register_file_logger()
    ap = utils.ArgumentParser(usage, utils.get_full_version())
    try:
        options, datasources = \
            ap.parse_args(cliargs, argfile='argumentfile', unescape='escape',
                          help='help', version='version', check_args=True,
                          **argparser_config)
    except Information, msg:
        _exit(INFO_PRINTED, str(msg))
    except DataError, err:
        _exit(DATA_ERROR, str(err))


which shows the same problem - it should be unicode(msg/err) instead.

With all the problems around exceptions [1], str() [2] and unicode()
[3], maybe these suggested safe methods are a good option?

http://code.activestate.com/recipes/466341/ Guaranteed conversion to
unicode or byte string (Python)

[1] http://docs.python.org/library/exceptions.html#exceptions.BaseException
[2] http://docs.python.org/library/functions.html#str
[3] http://docs.python.org/library/functions.html#unicode

Andreas

Pekka Klärck

Feb 22, 2010, 6:57:13 AM
to Andreas...@gmx.de, robotframework-users
2010/2/22 AndreasEK <Andreas...@gmx.de>:

>
> in my example I was using plain Python (2.6) - no Jython at all. I
> understand, that there may be additional problems with Jython, but it
> should work in Python as well :)

My bad, should have read the problem description more closely and not
just assume that all our Unicode problems are Jython related.

> On Feb 22, 11:11 am, Pekka Klärck <pekka.kla...@gmail.com> wrote:
>> 2010/2/22 Magnus <magnus.smedb...@gmail.com>:
>>
>> > Could it be that the same "bug" is present in BuiltIn library too?
>>
>> I'd be surprised if the exact same problems occurs there but
>> everything is obviously possible. Let us know if you encounter
>> something via this mailing list or the issue tracker.
>
> In my opinion it's not only possible but quite likely that the same
> behaviour is present also at other places. The problem was that the
> python keyword tried to extract the error message from an Exception
> with str(err). This leads to the UnicodeEncodeError, since it cannot
> convert a unicode text into ascii. The solution was to use
> unicode(err) instead. But ... my python knowledge is really limited
> and maybe that approach brings other problems with it, which I am not
> aware of.

You are correct that blindly calling str() on a received string is a
bug. Using unicode() is better, but it's not foolproof either. The good
news is that SeleniumLibrary using str() with exceptions doesn't mean
it's used everywhere else too. The bad news is that there probably are
many places where this happens, and all of them should be fixed.

> For example, this can be found in the __init__.py of RF:
>
> def _run_or_rebot_from_cli(method, cliargs, usage, **argparser_config):
>    LOGGER.register_file_logger()
>    ap = utils.ArgumentParser(usage, utils.get_full_version())
>    try:
>        options, datasources = \
>            ap.parse_args(cliargs, argfile='argumentfile', unescape='escape',
>                          help='help', version='version', check_args=True,
>                          **argparser_config)
>    except Information, msg:
>        _exit(INFO_PRINTED, str(msg))
>    except DataError, err:
>        _exit(DATA_ERROR, str(err))
>
> which shows the same problem - it should be unicode(msg/err) instead.

It's true that str() is potentially risky here too. Luckily, in most
places in the core framework the helper methods utils.unic() and
utils.get_error_message() are used, and they ought to handle Unicode
just fine. I just submitted a bug report to convert all remaining
str() calls to something safer:
http://code.google.com/p/robotframework/issues/detail?id=471

> With all the Problems around Exceptions [1], str() [2] and unicode()
> [3] maybe these suggested safe methods are a good option?
>
> http://code.activestate.com/recipes/466341/ Guaranteed conversion to
> unicode or byte string (Python)

The recipe looks interesting. We need to consider incorporating it into
the previously mentioned utils.unic(). Nowadays this helper mainly
contains Jython-related workarounds, but it could easily handle
problems with invalid encodings too.
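
A defensive conversion helper along these lines might look like the sketch below. It is neither the ActiveState recipe nor the real utils.unic(); the name `safe_text`, the UTF-8 default and the choice to replace undecodable bytes are all assumptions made for illustration.

```python
text_type = type(u"")  # unicode on Python 2, str on Python 3


def safe_text(obj, encoding="utf-8"):
    """Return a text representation of obj, avoiding the usual Unicode errors."""
    if isinstance(obj, text_type):
        return obj
    if isinstance(obj, bytes):
        # Undecodable bytes become U+FFFD instead of raising UnicodeDecodeError.
        return obj.decode(encoding, "replace")
    try:
        return text_type(obj)
    except Exception:
        # Last resort: fall back to repr() rather than the object's own
        # string conversion, which is what failed.
        return text_type(repr(obj))


print(safe_text(Exception(u"Flie\xdftext")) == u"Flie\xdftext")
```

The order of the checks matters: text passes through untouched, byte strings are decoded leniently, and only then is the generic conversion attempted.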

Thanks for debugging Andreas! Your Python skills are getting better
all the time. =)

Cheers,
.peke
--
Agile Tester/Developer/Consultant :: http://eliga.fi
Lead Developer of Robot Framework :: http://robotframework.org

AndreasEK

Feb 22, 2010, 8:46:17 AM
to robotframework-users
Hi Pekka,

On Feb 22, 12:57 pm, Pekka Klärck <pekka.kla...@gmail.com> wrote:

> It's true that str() is potentially risky here too. Luckily in most
> places in the core framework helper methods utils.unic() and
> utils.get_error_message() are used and they ought to handle Unicode
> just fine. I just submitted a bug report to convert all remaining
> str() calls to something more safe: http://code.google.com/p/robotframework/issues/detail?id=471

Great :) Then the issue I raised for SeleniumLibrary can be closed,
if this is now handled globally instead. Or is the issue above only for
the framework, so that the various libraries should have their own
issues?

I would dispute the severity of the issue, though. In my experience,
problems with umlauts arise quite often, and when you have this kind
of problem it's very difficult to debug, because you don't get a
proper error message (since the error message could not be
encoded...). Anyway, if this is going to be fixed, I don't mind the
severity :)

Kind Regards,

Andreas

Pekka Klärck

Feb 22, 2010, 9:04:58 AM
to Andreas...@gmx.de, robotframework-users
2010/2/22 AndreasEK <Andreas...@gmx.de>:

> On Feb 22, 12:57 pm, Pekka Klärck <pekka.kla...@gmail.com> wrote:
>
>> It's true that str() is potentially risky here too. Luckily in most
>> places in the core framework helper methods utils.unic() and
>> utils.get_error_message() are used and they ought to handle Unicode
>> just fine. I just submitted a bug report to convert all remaining
>> str() calls to something more safe: http://code.google.com/p/robotframework/issues/detail?id=471
>
> Great :) Then the issue I raised for SeleniumLibrary can be closed,
> instead if this is now handled globally. Or is the issue above for the
> framework, and various libraries should have their own issue?

The above issue is just for the core framework and the standard
libraries distributed with it. SeleniumLibrary having its own issue is
thus just a good thing.

> I would argue the severity of the issue though. In my experience,
> problems with Umlauts arise quite often, and if you have this kind of
> problem, it's very difficult to debug, because you are not getting a
> proper error message (since the error message could not be
> decoded...). Anyway, if this is going to be fixed, I don't mind the
> severity :)

I set the priority to medium because a) my quick review of the places
where str() was used within the core framework indicated nothing
critical, and b) nobody has reported any real-life bugs about them. I
expect this to be fixed for RF 2.1.3 anyway, so the priority doesn't
really matter.

AndreasEK

Mar 25, 2010, 1:24:17 PM
to robotframework-users
Hi,

I'm still having that problem. *breathe and calm down* ...

I have no clue what else to do about it. This is the exception that
I'm seeing:

[INFO]
==============================================================================
[INFO] Volltextsuche :: Dieser Test pr?ft die
Volltextsuche | FAIL |
[INFO] UnicodeEncodeError: 'ascii' codec can't encode character
u'\xdf' in position 51: ordinal not in range(128)
[INFO]
------------------------------------------------------------------------------

The test tries to click on a link like this:

<a href="/web/html/privat/funktionen/volltextsuche/volltextsuche_fliesstext/index.html" class="linkSSRubrik linkIconLinksGross linkIconGross xLink_Weiter_Gross_Links Dokument_Link">
<strong>

Volltextsuche im Fließtext
</strong>
</a>

</p>


I'm using the latest version of the SeleniumLibrary, where the bug is
supposed to be fixed (heck, I tested that myself), but apparently,
there are situations, in which the fix still fails.

def click_link(self, locator, dont_wait=''):
    """Clicks a link identified by locator.

    Key attributes for links are `id`, `name`, `href` and link text. See
    `introduction` for details about locating elements and about meaning
    of `dont_wait` argument.
    """
    self._info("Clicking link '%s'." % locator)
    try:
        self._click(self._parse_locator(locator, 'link'), dont_wait)
    except Exception, err:
        if 'not found' not in unicode(err):
            raise
        self._click("link=%s" % locator, dont_wait)

When I remove the last if ... raise lines, the test works fine. And
what is very surprising: the test only fails in Firefox 3.0.18, but
works in IE8!

When I play around in the shell, I can provoke some Unicode errors,
but the UnicodeEncodeError only occurs when the unicode() call is
missing, and that call is present in the click_link method above.

>>> try:
...     raise Exception(u'äöü')
... expect Exception, e:
  File "<stdin>", line 3
    expect Exception, e:
                   ^
SyntaxError: invalid syntax
>>> try:
...     raise Exception(u'äöü')
... except Exception, e:
...     print e
...
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
>>> try:
...     raise Exception(u'äöü')
... except Exception, e:
...     print unicode(e)
...
äöü
>>> try:
...     raise Exception('äöü')
... except Exception, e:
...     print unicode(e)
...
Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0x84 in position 0: ordinal not in range(128)
>>> try:
...     raise Exception('äöü')
... except Exception, e:
...     print e
...
äöü
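
The UnicodeDecodeError in the experiment that raises Exception('äöü') with a byte string and then calls unicode(e) is the mirror image of the encode error: a byte string is decoded implicitly with ASCII. The sketch below does that decoding explicitly. It assumes the console used code page 850, which is only a guess based on the byte 0x84 in the traceback (that byte is 'ä' in cp850).

```python
# -*- coding: utf-8 -*-
# Bytes a cp850 console would produce for 'äöü' (assumed code page).
raw = b"\x84\x94\x81"

try:
    # Roughly what Python 2's unicode(byte_string) does implicitly.
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    failure = (exc.encoding, exc.start, exc.reason)
    print(failure)

# Naming the actual encoding makes the conversion succeed.
text = raw.decode("cp850")
```

So unicode() only helps when the exception already carries unicode text; for byte strings the encoding has to be stated explicitly.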

Let me know if you have an idea, what we can do about it.

Andreas

Pekka Klärck

Mar 25, 2010, 4:22:52 PM
to Andreas...@gmx.de, robotframework-users
2010/3/25 AndreasEK <Andreas...@gmx.de>:

>
> I'm still having that problem. *breath and calm down* ...
>
> I am clueless what still to do about that. This is the exception, that
> I'm seeing:
>
> [INFO] Volltextsuche :: Dieser Test pr?ft die
> Volltextsuche                  | FAIL |
> [INFO] UnicodeEncodeError: 'ascii' codec can't encode character
> u'\xdf' in position 51: ordinal not in range(128)

Could you run the test using '--loglevel debug' and provide the
traceback you get? Otherwise it's pretty hard to know where the
Unicode error originated. Please also upgrade to RF 2.1.3, if you
haven't already, because Unicode handling was improved there too.

Cheers,
.peke

AndreasEK

Mar 28, 2010, 6:13:22 PM
to robotframework-users
Back to the mailing list after a private message exchange with Pekka.
Not only did I learn a lot about Python, but we were also able to
get to the root cause of the problem. And it's (once again) Jython :(
It just cannot handle Unicode in exceptions correctly. Issues were
created for Jython to fix the problem and for the SeleniumLibrary to
work around it:

http://bugs.jython.org/issue1585
http://code.google.com/p/robotframework-seleniumlibrary/issues/detail?id=107&colspec=ID%20Type%20Status%20Priority%20Target%20Owner%20Summary%20Stars

Thanks for your support!

Andreas

