I've got a problem with regular expressions and strings containing
Swedish characters (åäö).
I basically have a PHP script that highlights certain words in a text. I
found the code attached below in the commented manual at php.net. It
works great for all words that do not contain Swedish characters. The
words that do contain åäö will not be highlighted.
Can anyone suggest how I should change my regexp to fix this?
Thanks,
toffe
Code:
=============
function highlightErrors($text, $errors) {
foreach($errors as $e) {
$text = highlight_word($text,$e);
}
return $text;
}
function highlight_word($buff,$query) {
$buff = preg_replace("/(^|[^A-ZåäöÅÄÖ]){1}(".preg_quote($query,"/").
")($|[^A-ZåäöÅÄÖ]){1}/i",
"\\1<span class='highlight'>\\2</span>\\3", $buff);
return $buff;
}
=========
Sorry for being ignorant and not reading the FAQ before posting, won't
happen again...
Thanks a lot for the information!
-toffe
Hi, thanks for the pointer.
It works almost like I want it to now.
My script should highlight certain words in the text, but the text can
mix upper and lower case letters. If $query below is hxllo and $buff is
HXLLO, where x and X are the lower and upper case forms of some Swedish
character, I still don't get a match.
Any suggestions for how I can fix this?
Thanks,
toffe
Code:
====
$buff =
preg_replace("/(^|[^A-Z\xe5\xe4\xf6\xc5\xc4\xd6]){1}(".preg_quote($query,"/").
")($|[^A-Z\xe5\xe4\xf6\xc5\xc4\xd6]){1}/i",
"\\1<SURROUNDING>\\2<TAG>\\3", $buff);
return $buff;
=========
IIRC, the /i modifier folds case only for A-Z and not for the foreign
characters, so you may have to put both the upper and lower case forms
of those characters into the pattern yourself. You may also want to
look at
<http://in.php.net/ucwords#51137>
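Something along these lines might do it. This is only a rough sketch, and it assumes both $buff and $query are ISO-8859-1 byte strings, as the \xe5 style escapes in your pattern suggest. The function names expand_swedish_case() and highlight_word_latin1() are made up for the example. The idea is to replace each Swedish letter in the quoted query with a small character class that holds both of its cases, so the match no longer relies on /i for those bytes.

```php
<?php
// Sketch only: assumes $buff and $query are ISO-8859-1 byte strings.
// expand_swedish_case() and highlight_word_latin1() are made-up names.

function expand_swedish_case($query) {
    // Each Swedish letter becomes a class holding both of its cases.
    $pairs = array(
        "\xe5" => "[\xe5\xc5]", "\xc5" => "[\xe5\xc5]", // a with ring
        "\xe4" => "[\xe4\xc4]", "\xc4" => "[\xe4\xc4]", // a with dots
        "\xf6" => "[\xf6\xd6]", "\xd6" => "[\xf6\xd6]", // o with dots
    );
    // Quote first, so regex metacharacters in the query stay literal.
    return strtr(preg_quote($query, "/"), $pairs);
}

function highlight_word_latin1($buff, $query) {
    // Bytes that count as "part of a word" around the match.
    $letters = "A-Z\xe5\xe4\xf6\xc5\xc4\xd6";
    $pattern = '/(^|[^' . $letters . '])(' . expand_swedish_case($query)
             . ')($|[^' . $letters . '])/i';
    return preg_replace($pattern,
        "\\1<span class='highlight'>\\2</span>\\3", $buff);
}

// Usage: the query is lower case, the text is upper case.
echo highlight_word_latin1("en H\xc4LLO test", "h\xe4llo"), "\n";
```

The /i modifier still takes care of the plain A-Z letters in the query. Only the six Swedish bytes are expanded by hand.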
BTW, we don't have any FAQ yet. We're just compiling one, and this
question has been asked before.
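If the text happens to be UTF-8 rather than a single-byte encoding, a different route may be simpler. The sketch below is not the code from the manual notes. It assumes valid UTF-8 input and a PCRE build with Unicode property support, and highlight_word_utf8() is a made-up name. It uses the /u modifier so that /i can fold the Swedish letters too, and it swaps the captured neighbour characters for lookarounds on \p{L} (any letter).

```php
<?php
// Sketch only: assumes $buff and $query are valid UTF-8 and that PCRE
// was built with Unicode property support (needed for \p{L}).
// highlight_word_utf8() is a made-up name.

function highlight_word_utf8($buff, $query) {
    // (?<!\p{L}) and (?!\p{L}) check the neighbours without consuming
    // them, so the same separator can border two adjacent matches.
    $pattern = '/(?<!\p{L})(' . preg_quote($query, '/') . ')(?!\p{L})/iu';
    // preg_replace() returns NULL if the input is not valid UTF-8.
    return preg_replace($pattern, "<span class='highlight'>\\1</span>", $buff);
}

// Usage: "\xc3\xa4" and "\xc3\x84" are the UTF-8 bytes of the lower
// and upper case a with dots.
echo highlight_word_utf8("en H\xc3\x84LLO test", "h\xc3\xa4llo"), "\n";
```

Because the lookarounds do not consume the characters on either side, a word repeated with a single space in between gets highlighted both times. With the capturing version, the second occurrence is skipped because the space was already used up by the first match.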
--
<?php echo 'Just another PHP saint'; ?>
Email: rrjanbiah-at-Y!com Blog: http://rajeshanbiah.blogspot.com/