NumberFormat currency pattern issues (wrong positiveSuffix and possible encoding issues)

Tiago Rinck Caveden

Dec 15, 2011, 3:39:47 AM

to google-we...@googlegroups.com

Hello,

I'm having some weird issues with NumberFormat. I'm using it with the currency pattern, which for my locale is the euro.

The first problem I've noticed is that the parser treats the string " €" as a "positiveSuffix". If the string being parsed does not end with it, the error "X does not have either positive or negative affixes" is raised. And it's normal for that suffix to be missing: people won't manually type a space plus the euro symbol. It's fine for the format method to add that string, but the parser should not require it.

I guess that's pretty much a bug for NumberFormat, right?

And well, that's not all. After trying to work around this by manually appending " €" to my numbers, it still raised the same exception. So I decided to debug it, and I saw that, even though the correct string appeared to be there, the endsWith method was returning false. I couldn't understand why until I added a watch expression requesting the getBytes() of each string. To my surprise, the blank space character had different bytes in each of the strings being compared! In the "positiveSuffix" variable of NumberFormat, the space character was represented by two bytes, while in my input string it was just the byte 32. Actually, it seems my input string was using an 8-bit encoding, since only the euro symbol used more than one byte.

These are the strings and their bytes:

The input: "10 €" => [49, 48, 32, -30, -126, -84]

The positiveSuffix: " €" => [-62, -96, -30, -126, -84]

As you see, it seems blank space in the input is 32 while in the positiveSuffix it is [-62, -96].

I'm guessing JavaScript doesn't give the same guarantee that Java does, namely that every in-memory string uses the same encoding. But if that's the case, shouldn't GWT try to ensure that the same encoding is always used? Otherwise, what should we do? I know we can set the encoding for Writers and Readers in Java, but for in-memory strings, how can we make sure we're always using the same encoding?

Thank you,

--
Tiago Rinck Caveden

Thomas Broyer

Dec 15, 2011, 8:40:04 AM

to google-we...@googlegroups.com

-62, -96, or in hexadecimal C2 A0, is the UTF-8 encoding of a non-breaking space (U+00A0, widely known on the web as &nbsp;). It's a different Unicode character from a space (U+0020, encoded in UTF-8 as 32); it's not an encoding issue (btw, all strings in JavaScript are sequences of 16-bit UTF-16 code units, just like in Java).
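
To make the difference concrete, here is a small plain-Java sketch (not GWT code) that reproduces the byte arrays quoted earlier in the thread. The class and method names are made up for illustration; the only assumption is that the suffix consists of U+00A0 followed by the euro sign, as the bytes above indicate.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class NbspDemo {
    // U+00A0 (no-break space) followed by U+20AC (euro sign),
    // matching the bytes reported for the positiveSuffix in this thread.
    static final String SUFFIX = "\u00A0\u20AC";

    // Hypothetical helper: UTF-8 bytes of a string.
    static byte[] utf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String typed = "10 \u20AC"; // ordinary space, U+0020

        System.out.println(Arrays.toString(utf8(typed)));  // [49, 48, 32, -30, -126, -84]
        System.out.println(Arrays.toString(utf8(SUFFIX))); // [-62, -96, -30, -126, -84]

        // The two strings look alike on screen but contain different characters.
        System.out.println(typed.endsWith(SUFFIX));                        // false
        System.out.println(typed.replace(' ', '\u00A0').endsWith(SUFFIX)); // true
    }
}
```

So endsWith fails because the characters differ, not because the two strings use different in-memory encodings.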

Tiago

Dec 15, 2011, 11:14:29 AM

to Google Web Toolkit

Thank you Thomas for the clarification.

Do you agree though that there's a bug in NumberFormat? It should not
require " €" to be at the end of a string it parses, although it
may add it to the end of a string it formats. Right?

Thomas Broyer

Dec 16, 2011, 3:41:14 AM

to google-we...@googlegroups.com

Not sure this is a bug, maybe by-design (i.e. don't use the currency format for parsing).

If you want to parse a number, don't use a currency format.

If you expect the user to (optionally) type in the currency symbol, you'd better pre-process the value (what if the user types "$10" in a € locale? How about "10 USD" or "10 EUR"?).
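
A minimal sketch of such pre-processing, in plain Java. The class and method names (CurrencyInput, stripCurrency) are hypothetical and not part of GWT; the sketch assumes only one known symbol needs handling and leaves anything else in place, so that a later numeric parse rejects it.

```java
class CurrencyInput {
    /**
     * Returns the input with an optional leading or trailing currency
     * symbol removed. Input carrying a different symbol is returned
     * (trimmed) unchanged, so a numeric parser can still reject it.
     */
    static String stripCurrency(String input, String symbol) {
        // Treat a no-break space (U+00A0) like an ordinary space, then trim.
        String s = input.replace('\u00A0', ' ').trim();
        if (s.endsWith(symbol)) {
            s = s.substring(0, s.length() - symbol.length());
        } else if (s.startsWith(symbol)) {
            s = s.substring(symbol.length());
        }
        return s.trim();
    }

    public static void main(String[] args) {
        System.out.println(stripCurrency("10 \u20AC", "\u20AC"));      // 10
        System.out.println(stripCurrency("10\u00A0\u20AC", "\u20AC")); // 10
        System.out.println(stripCurrency("$10", "\u20AC"));            // $10
    }
}
```

In a GWT client, the cleaned string could then presumably be handed to a plain number format, for example NumberFormat.getDecimalFormat().parse(...), rather than to the currency format.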

Tiago

Dec 16, 2011, 9:54:50 AM

to Google Web Toolkit

Well, "$10" or "10 USD" should clearly be invalid strings for a euro
locale.
I wouldn't mind "10 EUR" being invalid for the default format too,
since using currency codes is quite specific. Actually, all of these
could be configurable parameters (requiring, ignoring or refusing
currency symbols and/or currency codes). What I don't find correct is
to always require a " €". The user can't even type a &nbsp;...

Not being able to use NumberFormat to parse currency values would be a
pity. I would expect to be able to use the same format both ways
(formatting and parsing). And well, if this choice is made "by
design", shouldn't it be documented? The exception "X does not have
either positive or negative affixes" doesn't have anything to do with
the actual issue. " €" is not a "positiveSuffix".

Thanks
