[cl-ppcre-devel] Question about a strange behavior of cl-ppcre


張 漢秀

Aug 15, 2008, 3:50:43 AM
to cl-ppcr...@common-lisp.net
Hello,

I am posting this question here because the description of this mailing
list says that it is the place for questions about cl-ppcre.

I use cl-ppcre on SBCL. I have just reinstalled cl-ppcre, so I assume
I am running the latest version.

I found the following strange behavior:

CL-USER> (format t "~A" (cl-ppcre:regex-replace "a" "a" "\\"))
\
CL-USER> (format t "~A" (cl-ppcre:regex-replace "a" "a" "\\\\"))
\
CL-USER> (format t "~A" (cl-ppcre:regex-replace "a" "a" "\\\\\\\\"))
\\

For me, it is very difficult to figure out what's going on. Would
someone kindly help me understand this problem?

Han-Soo


Edi Weitz

Aug 15, 2008, 10:29:49 AM
to ch...@saitama-med.ac.jp, General interest list about cl-ppcre and cl-unicode
On Fri, 15 Aug 2008 16:50:43 +0900, 張 漢秀 <ch...@saitama-med.ac.jp> wrote:

> CL-USER> (format t "~A" (cl-ppcre:regex-replace "a" "a" "\\"))
> \
> CL-USER> (format t "~A" (cl-ppcre:regex-replace "a" "a" "\\\\"))
> \
> CL-USER> (format t "~A" (cl-ppcre:regex-replace "a" "a" "\\\\\\\\"))
> \\

The backslash in the replacement string is special: it can be
followed by characters such as #\& or #\` to denote specific parts of
the target string (see the documentation). So if you just want a
literal backslash, you need two backslashes to avoid ambiguity:

CL-USER 1 > (ppcre:regex-replace "a" "xay" "\\&\\&")
"xaay"
T

CL-USER 2 > (ppcre:regex-replace "a" "xay" "\\&\\\\&")
"xa\\&y"
T
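
As a side note, here is a minimal sketch of one way to sidestep the extra escaping level. It relies on the replacement being given as a list rather than a string; as far as I recall from the cl-ppcre documentation, strings in such a list are inserted verbatim and the keyword :match stands for the whole match. The variable names and the use of Quicklisp for loading are assumptions made for illustration only.

```lisp
;; Assumption: Quicklisp is installed. Any other way of loading
;; cl-ppcre (e.g. plain ASDF) works just as well.
(ql:quickload :cl-ppcre)

;; String form: \& is the match, \\ is one literal backslash.
;; (*string-form* and *list-form* are hypothetical names.)
(defparameter *string-form*
  (cl-ppcre:regex-replace "a" "xay" "\\&\\\\&"))

;; List form: :match is the match, strings are copied verbatim,
;; so the two-character string \& needs no further escaping.
(defparameter *list-form*
  (cl-ppcre:regex-replace "a" "xay" (list :match "\\&")))

;; WRITE-LINE shows the actual characters, without reader escapes.
(write-line *string-form*)
(write-line *list-form*)
```

Both calls should produce the same string; printing it with WRITE-LINE instead of at the REPL avoids the doubled backslashes of the printed representation.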

Your second example is one (escaped) backslash; your third example
consists of two (escaped) backslashes. This matches Perl's behavior:

edi@miles:~$ perl -le '$_ = "a"; s/a/\\/; print'
\
edi@miles:~$ perl -le '$_ = "a"; s/a/\\\\/; print'
\\

In your first example there is only one backslash, but since nothing
follows it, the parser assumes that you meant a literal backslash.
This is a kind of DWIM behaviour, and you can of course argue about
whether it's a good thing or not.

HTH,
Edi.
