String.matches() does not work with \p{ASCII}


Vlad

Aug 11, 2010, 6:39:45 PM
to Google Web Toolkit
Hi,
I have a simple piece of code that works correctly in development mode but
fails in production.
Basically, I need to check whether the text entered by the user contains only
ASCII characters, so I do the following:

String s = getTextArea().getText();
if (s.matches("\\p{ASCII}*")) {
    ...
} else {
    // Some non-ASCII characters found
}

In production it always ends up in the "else" branch. I've tried it
with IE, Firefox and Chrome; the results are the same.
Any suggestions on how to fix this?

cokol

Aug 12, 2010, 9:17:23 AM
to Google Web Toolkit
As you probably know, regex is one of those areas that are not fully
compatible between Java and JavaScript. In dev mode GWT runs your code
on the real JDK, so it works, whereas after compilation your matches()
is executed in the browser by its own regex engine, and there it
fails.

So you have to rewrite the pattern \\p{ASCII}* in a JS-compatible
form.

cokol

Aug 12, 2010, 9:40:24 AM
to Google Web Toolkit
And if you really want to check for ASCII, why don't you just check
the character code? Use java.lang.Character for the check, or write a
small loop in JSNI; testing whether the decimal value of each char is
greater than 127 is faster than a regex evaluation.
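
The character-code check suggested above could look roughly like this in plain Java. This is only a sketch: the class name AsciiLoopCheck and the method name isAscii are made up for illustration, and it relies on nothing beyond String.length() and String.charAt().

```java
public class AsciiLoopCheck {
    // Returns true when every char in s has a code of 127 or lower.
    // An empty string counts as ASCII, matching the "*" in the regex version.
    public static boolean isAscii(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 127) {
                return false; // stop at the first non-ASCII char
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAscii("plain text")); // true
        System.out.println(isAscii("caf\u00e9"));  // false, the last char is U+00E9
    }
}
```

In the original snippet the call would then be isAscii(getTextArea().getText()) in place of s.matches(...).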

Thomas Broyer

Aug 12, 2010, 1:21:14 PM
to Google Web Toolkit


On 12 Aug, 15:40, cokol <eplisc...@googlemail.com> wrote:
> and if you really want to check for ascii why dont u just check for
> the ascii code?

Such as "[\\u0000-\\u007F]*"

> try java.lang.Character to check or write in JSNI
> small for-each testing if decimal value of char is greater than 127 is
> faster than a regex evaluation

Are you sure? Given that trimming blanks is faster with regexes
(str.replace(/^\s+/, "").replace(/\s+$/, "")), I tend to think that
regexes are really fast.

cokol

Aug 13, 2010, 7:16:26 AM
to Google Web Toolkit
Hard to say; it's up to the engine implementing the regex. If you want to
check the character code of each character, you're going to write a loop

like

function isAscii(str) {
  for (var i = 0; i < str.length; i++) {
    // compare the numeric char code, not the one-character string
    if (str.charCodeAt(i) > 127) return false;
  }
  return true;
}

A regex works like a state machine that scans the characters against
the pattern, so how many chars get scanned also depends on the pattern.
In your case you want to know whether every character is in the ASCII
range, so it has to scan each one anyway. There is one case where the
regex could be faster here: if it's implemented natively and not
interpreted.

The loop above has the advantage that it doesn't have to scan all of the
chars when the string is not ASCII; it returns after the first non-ASCII char.

Vlad

Aug 13, 2010, 5:38:37 PM
to Google Web Toolkit
Thanks, with the hexadecimal range 00-7F it works.
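
A minimal sketch of what the check with the explicit range might look like, assuming the pattern "[\\u0000-\\u007F]*" suggested earlier in the thread. The class name AsciiRegexCheck and the method name isAscii are hypothetical; in the original code the string would come from getTextArea().getText().

```java
public class AsciiRegexCheck {
    // Uses an explicit range, U+0000 to U+007F, instead of the POSIX-style
    // ASCII character class, which the browser regex engine does not know.
    public static boolean isAscii(String s) {
        return s.matches("[\\u0000-\\u007F]*");
    }

    public static void main(String[] args) {
        String s = "some user input";
        if (isAscii(s)) {
            System.out.println("only ASCII characters");
        } else {
            System.out.println("some non-ASCII characters found");
        }
    }
}
```

The range form uses only syntax that both java.util.regex and JavaScript regular expressions accept, which is presumably why it behaves the same in development mode and in the compiled code.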

Raz

Nov 11, 2014, 5:15:16 AM
to google-we...@googlegroups.com
Hi, could you please post the full code for solving this kind of problem? The same thing is happening to me.