I tried for instance
WHERE col REGEXP '\u004c'
which should find any occurences of 'L', but it doesn't.
'\x4c' fails as well.
What's the correct syntax for MySQL?
The field is of type text collation utf8_roman_ci (or similar).
(Of course, I do not want to find 'L' this way. It's just a simplified
query to get a working syntax.)
Kai
Where on
http://dev.mysql.com/doc/refman/5.1/en/regexp.html
or
http://dev.mysql.com/doc/refman/5.1/en/pattern-matching.html
or
http://dev.mysql.com/doc/refman/5.1/en/string-syntax.html
is \u or \x discussed? Outside of the case on the last one where it says
explicitly 'For example, "\x" is just "x".'
--
Revenge is an integral part of forgiving and forgetting.
-- The BOFH
> Where on
>
..
>
> is \u or \x discussed?
\u and \x is standard syntax in various regular expression
implementations. I gave it as an example of what obvious things I tried.
That this is not supported is *exactly* the problem! How do I specify a
certain Unicode code point either for matching or for insertion if not
this way?
Kai
--
Conactive Internet Services, Berlin, Germany
It's an inconvenience. But at least the manual spells out exactly which
method of parsing is behind REGEXP, and details what is supported, so
there's not much point in being unhappy that something that wasn't
suggested would work does not, in fact, work.
> How do I specify a
> certain Unicode code point either for matching or for insertion if not
> this way?
There's a couple of ways that would likely work. The most
straightforward way is to simply set the connection character set
correctly, and construct the pattern using the explicit characters that
you're looking for. If you're looking for 'L', send an L. If you're
looking for a 'þ', send a þ.
Another method would be to REGEXP on a HEX(my_col), which might be
entirely reasonable for small sets of data. It's a little ... suboptimal
on large datasets because it'll force a tablescan.
And while you're playing with those, remember that MySQL uses 0x
notation to express hex constants, and you might find the
CHAR(N,... [USING charset_name]) function pretty handy....
> And while you're playing with those, remember that MySQL uses 0x
> notation to express hex constants, and you might find the
> CHAR(N,... [USING charset_name]) function pretty handy....
Thanks for this. The solution was to use something like
UPDATE table SET fr=REPLACE(fr, ' ?',CONCAT(CHAR(0xc2a0),'?')) WHERE fr
LIKE '%?%'
and
SELECT fr FROM table WHERE fr LIKE CONCAT('%',CHAR(0xc2a0),'?','%')
I didn't test REGEXP.