I am wondering if there is a way to use Unicode characters as data
input.
For example, one can escape a carriage return with \ (backslash), as
in "text\rmore text", which prevents Robot from parsing the r as a
plain character.
What if I want to use Unicode, as in "text\uxxxx\more text"?
Is it possible?
Thanks,
-Snejana
Yes, it is. You can actually use real characters in your data if you
follow these format-specific rules:
- With HTML test data you need to specify the correct encoding in the
title section. Alternatively, in HTML you can use entity references
such as "&auml;".
- With TXT and TSV formats you need to encode your files using UTF-8.
For more information, see the format-specific sections in the User Guide:
http://robotframework.googlecode.com/svn/tags/robotframework-2.1.3/doc/userguide/RobotFrameworkUserGuide.html#test-data-syntax
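If you are unsure whether an editor really saved a TXT or TSV file as UTF-8, a small script can check it. The following is only a sketch in Python 3; the helper names `write_utf8` and `is_valid_utf8` are made up for this example and are not part of Robot Framework.

```python
def write_utf8(path, text):
    """Write text to path, encoded as UTF-8, without newline translation."""
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(text)


def is_valid_utf8(path):
    """Return True if the raw bytes of the file decode cleanly as UTF-8."""
    with open(path, "rb") as f:
        data = f.read()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

A file saved as Latin-1 that contains 'ä' fails this check, because the single byte 0xE4 is not a complete UTF-8 sequence.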
Another solution is using variables. Keywords can return values
containing Unicode, and if you need many strings, or want to be able
to use different strings on different runs, you can use variable
files:
http://robotframework.googlecode.com/svn/tags/robotframework-2.1.3/doc/userguide/RobotFrameworkUserGuide.html#variable-files
> For example, one can escape carriage return with \ (backslash) such
> as "text\rmore text" which will prevent robot from parsing r
> character
>
> what if I want to use unicode such as "text\uxxxx\more text"?
>
> Is it possible?
Unicode escape sequences like this aren't supported. We could probably
add support for them relatively easily, but because you can just use
the actual characters, I don't see a big need for that.
Cheers,
.peke
--
Agile Tester/Developer/Consultant :: http://eliga.fi
Lead Developer of Robot Framework :: http://robotframework.org
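I used the following snippet of Java code to parse Robot input data like Google\\u000DScholar (\\u000D is a carriage return):

    // Needs java.util.regex.Matcher and java.util.regex.Pattern.
    // Assumption: cp holds the raw input and codePoint starts as a copy of it.
    String codePoint = cp;
    Pattern pattern = Pattern.compile("\\\\u[0-9a-fA-F]{4}");
    Matcher matcher = pattern.matcher(cp);
    while (matcher.find()) {
        // Parse the four hex digits after "\u" into a char.
        char unicode = (char) Integer.parseInt(matcher.group().substring(2), 16);
        codePoint = codePoint.replace(matcher.group(), unicode + "");
    }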
Yes, your code converts an ASCII string to Unicode so that \u000D is
converted to a carriage return.
> What did you mean by "you can just use those actual characters"? \uxxxx is a
> valid unicode string, where xxxx in this case can be anything in a range
> 0000-FFFF
'\uxxxx' is a Unicode escape sequence that Java and many other
programming languages support. This syntax isn't, however, supported
in Robot Framework test data. Instead of using those escape
sequences you can use the actual characters. For example, instead of
'\u00e4' you can just use 'ä'. For this to work, you just need to take
care of the encoding as I explained in the previous mail.
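If existing data already contains \uxxxx sequences, one option is to convert them to the actual characters before the data is used. Below is a rough Python 3 sketch of that conversion; the function name `expand_unicode_escapes` is invented for this example, and nothing like it ships with Robot Framework.

```python
import re

# A literal backslash, the letter u, and exactly four hexadecimal digits.
_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def expand_unicode_escapes(text):
    """Replace each literal \\uXXXX sequence with the character it names."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)
```

Sequences that do not have four hex digits after `\u` are left untouched instead of raising an error.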
If you really want to use Unicode escape sequences instead of the real
characters, you can create variables using them. You could create and
return a Unicode character e.g. in your Java keyword, but
if you need more of them then variable files are probably a better
approach. You could, for example, have this code in a `myvars.py` file:
    auml = u'\u00e4'
    cr = u'\u000d'    # \u000d is a carriage return

and then use them in your test data like:

    ***Settings***
    Variables    myvars.py

    ***Test Case***
    Example
        Log    ${AUML}
        Log    ${CR}
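If you want different strings on different runs, a variable file can also define a `get_variables` function that returns a dictionary. The sketch below is only an illustration: the file name `langvars.py`, the `language` argument and the strings are all made up, so check the variable file section of the User Guide for your version before relying on it.

```python
# langvars.py: hypothetical variable file; names and strings are examples only.

_STRINGS = {
    "english": {"GREETING": "Good day"},
    "finnish": {"GREETING": "Hyv\u00e4\u00e4 p\u00e4iv\u00e4\u00e4"},
}


def get_variables(language="english"):
    """Return the variables for the requested language as a dictionary."""
    if language not in _STRINGS:
        raise ValueError("Unknown language: %s" % language)
    # Return a copy so callers cannot modify the table above.
    return dict(_STRINGS[language])
```

The argument would be given after the file name in the setting table, for example `Variables    langvars.py    finnish`, after which `${GREETING}` holds the Finnish string.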
Cheers,
.peke
I recommend that you try out variable files.
> It would be nice if support for unicode sequences is added some time in the
> future, though I agree there is no high priority
I agree that direct support for \uxxxx escape sequences would
sometimes be handy. Please submit an enhancement request about it to
the tracker.
Cheers,
.peke