That form also has a regex feature.
Is there a regex that would match all the odd not-normal-ASCII
characters these turkeys like to use (characters that show up
as phone icons, musical notes, etc.)?
Maybe!
You might try [^[:graph:]], although I honestly can't say whether
this works or not. Quick messing around with other programs suggests
that something to that effect might work; I also got interesting results
from [^ -~].
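
For what it's worth, here is a small sketch of the [^ -~] idea in Python. It
is only an illustration, not a statement about how that form's regex feature
behaves. Python's re module has no POSIX classes such as [:graph:], so only
the range version is shown. Two caveats: [:graph:] excludes the space
character, so [^[:graph:]] would also hit ordinary spaces, and [^ -~] flags
tabs and newlines along with the icon-type characters. The names
NON_PRINTABLE_ASCII and has_odd_chars are made up for this example.

```python
import re

# Any character outside printable ASCII, i.e. outside space (0x20) .. tilde (0x7E).
# Note this also matches control characters such as tab and newline.
NON_PRINTABLE_ASCII = re.compile(r"[^ -~]")


def has_odd_chars(text):
    """Return True if text contains anything other than printable ASCII."""
    return NON_PRINTABLE_ASCII.search(text) is not None


if __name__ == "__main__":
    for subject in ["plain subject line", "call \u260e now", "\u266b free tunes"]:
        print(repr(subject), has_odd_chars(subject))
```

If tabs should be tolerated, the class could be widened to [^\t -~].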
-s
--
Copyright 2009, all wrongs reversed. Peter Seebach / usenet...@seebs.net
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
You could also exclude anything with
User-Agent: G2/1.0
in the header.
Gets most junk.
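
A rough sketch of that header check in Python, assuming the article is
available as raw text with its headers intact. The function name is_g2_post
is hypothetical, and whether your newsreader or filter exposes the raw
headers this way is an assumption.

```python
from email import message_from_string


def is_g2_post(raw_article):
    """Return True if the article's User-Agent header is exactly G2/1.0."""
    msg = message_from_string(raw_article)
    # A missing header is treated as "not G2".
    return msg.get("User-Agent", "").strip() == "G2/1.0"


if __name__ == "__main__":
    sample = (
        "From: someone@example.invalid\n"
        "User-Agent: G2/1.0\n"
        "Subject: test\n"
        "\n"
        "body text\n"
    )
    print(is_g2_post(sample))
```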
--
Any time things appear to be going better, you have overlooked
something.
It's terribly hard to idiotproof things; idiots are clever and cunning
:)
To me it seems easier to go the other way and simply accept alphanumerics,
maybe [[:alnum:]] or something. I know next to nothing about i18n,
locales, or Unicode issues, but it seems alphanumerics would be in the
ballpark for many/most environments.
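
As a sketch of the accept-only-known-good approach in Python: the pattern
below allows ASCII letters and digits plus spaces and a little punctuation.
The punctuation set is my own assumption, since a subject line made of bare
alphanumerics with no spaces would be rare. The range A-Za-z0-9 is spelled
out explicitly because Python's re module has no [:alnum:] class, and the
name is_plain is made up for illustration.

```python
import re

# Allow-list: ASCII letters, digits, space, and a small assumed set of punctuation.
ALLOWED = re.compile(r"[A-Za-z0-9 .,:;!?'()-]*")


def is_plain(text):
    """Return True if every character in text is on the allow-list."""
    return ALLOWED.fullmatch(text) is not None


if __name__ == "__main__":
    for subject in ["Re: regex question?", "caf\u00e9 menu", "\u266b tune"]:
        print(repr(subject), is_plain(subject))
```

The trade-off is that legitimate accented or non-Latin text gets rejected
too, which is where the locale questions come in.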