If I apply the following very simple command (replacing one or more
consecutive CR chars with one LF char)
(cl-ppcre:regex-replace-all (concatenate 'string (string #\return) "+")
mystring (string #\linefeed))
to my string of 455079 characters (loaded from a utf-8 file), some of
the last #\return characters are not substituted (even though they should be,
since if I apply the command again to the resulting string they ARE
substituted).
It looks like the search has some sort of length limit, or maybe there is a string-length mistake connected to the multi-byte character representation?
Cheers.
Mario
_______________________________________________
cl-ppcre-devel site list
cl-ppcr...@common-lisp.net
http://common-lisp.net/mailman/listinfo/cl-ppcre-devel
Thanks,
Edi.
But I have another question: how do I enter Unicode chars in the regexp?
For example I need to replace "whatever" with “whatever”, I tried to replace
"([^"\r\n]*)"
with
\u201c\1\u201d
but it didn't work.
I know I could generate and concatenate Unicode chars with Lisp, e.g.
(code-char #x201c), but it'd be cleaner to do it directly inside the regexp.
Thanks.
Mario
> But I have another question: how do I enter Unicode chars in the regexp?
> For example I need to replace "whatever" with “whatever”, I tried to replace
>
> "([^"\r\n]*)"
>
> with
>
> \u201c\1\u201d
>
> but it didn't work.
>
> I know I could generate and concatenate Unicode chars with Lisp, e.g.
> (code-char #x201c), but it'd be cleaner to do it directly inside the regexp.
For a portable solution, you could give this a try:
Edi.
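The code that accompanied the reply above is not preserved in this thread. As a minimal sketch of one portable approach (an assumption, not necessarily what was suggested), the replacement string can be built in Lisp with code-char and format, so no Unicode escape syntax is needed inside the regexp itself. The names *left-quote*, *right-quote* and curl-quotes are hypothetical, and the sketch assumes Quicklisp is available and a Lisp whose characters cover Unicode.

```lisp
;; Assumption: Quicklisp is installed; this just makes CL-PPCRE available.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload "cl-ppcre" :silent t))

;; Hypothetical names, introduced only for this illustration.
(defparameter *left-quote* (code-char #x201c))   ; left double quotation mark
(defparameter *right-quote* (code-char #x201d))  ; right double quotation mark

(defun curl-quotes (string)
  "Replace each \"...\" pair on a single line with curly quotes."
  (cl-ppcre:regex-replace-all
   ;; The regexp seen by CL-PPCRE is: "([^"\r\n]*)"
   "\"([^\"\\r\\n]*)\""
   string
   ;; The replacement is: left quote, register 1 (\1), right quote.
   (format nil "~C\\1~C" *left-quote* *right-quote*)))
```

Calling (curl-quotes "say \"whatever\" now") should return the same text with the straight quotes around whatever replaced by the curly pair. Building the replacement with format keeps the regexp a plain ASCII string, which avoids depending on how the source file is encoded.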