
Handling special characters while using regular expressions


Nitesh

Aug 18, 2010, 6:28:43 PM
Hi all

I have a small question about regular expressions. Suppose your
regular expression matching pattern is itself a variable and could
contain special characters like *, +, \, etc. How do you make it
work properly without doing string manipulation on the pattern?

For example:

set pattern $var
regexp $pattern $data match

Is there any quick way to mask special characters that does not
involve string manipulation, such as inserting backslashes before
the special characters in the pattern string?

Nitesh

Bruce

Aug 18, 2010, 8:01:16 PM

regexp "(?q)$pattern" $data match

see <http://www.tcl.tk/man/tcl8.6/TclCmd/re_syntax.htm#M82>

for details
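
A minimal sketch of the effect of (?q), using made-up sample values for pattern and data (they are assumptions for illustration, not taken from the question). It contrasts the same pattern with and without the (?q) prefix:

```tcl
# Hypothetical sample values, chosen only to show the difference.
set pattern {b+c}
set data {x b+c y}

# Without (?q), + is a quantifier: "b+c" means one or more b followed by c,
# which does not occur in the data.
puts [regexp -- $pattern $data]               ;# prints 0

# With (?q) at the very start, the rest of the RE is taken as a literal string.
puts [regexp -- "(?q)$pattern" $data match]   ;# prints 1
puts $match                                   ;# prints b+c
```

The -- switch is there only so that a pattern starting with a dash is not mistaken for an option.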

Nitesh

Aug 19, 2010, 1:35:01 AM

Thanks for your reply. This solves part of the problem. Thanks a bunch.

I have one related question. How do you create a pattern where, say,
the special characters in the first part keep their regular expression
meaning, whereas the special characters in the second part are masked
(matched literally)?

For example:

set filename "b+c_d.cmd"

and you want to match "#a FileName=b+c_d.cmd" in the data. Now
this matching pattern, including the filename, could be written in a
more generic form as follows:

"#a{1 or more spaces}FileName{0 or more spaces}={0 or more spaces}b+c_d.cmd"

which translates into a matching pattern as follows:

"#a\s*FileName\s*=\s*" for the first part. For the second part, where
we want to mask the special characters, the suggested regular
expression is "(?q)$filename".

combining the two parts we get the final regular expression as
follows:

"#a\s*FileName\s*=\s*(?q)$filename"

But when I try to use combined regular expression and run the
following code

set data "#a FileName=b+c_d.cmd"

regexp "#a\s*FileName\s*=\s*(?q)$filename" $data match

I get an error saying it can't compile the regular expression.


It looks like I am missing something. Any pointers?


Thanks
Nitesh

Bruce

Aug 19, 2010, 9:56:11 AM

The special qualifiers (?q), (?i), etc. are only valid at the start of
the RE.

To do what you want, you will need to modify your pattern to escape
special chars where you don't want them treated as special and leave
them alone where you do. There are not that many special chars
in REs, so a fairly simple string map will work fine.
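
As a rough sketch of the string map idea: the helper name re_escape and the exact list of characters it escapes are assumptions made for illustration, not something specified in this thread. The filename and data values are the ones from the question.

```tcl
# Hypothetical helper: put a backslash in front of each RE metacharacter.
proc re_escape {str} {
    set map {}
    foreach ch [list \\ . * + ? ( ) \[ \] \{ \} ^ \$ |] {
        lappend map $ch \\$ch
    }
    # string map scans the input once, so inserted backslashes are not re-escaped.
    return [string map $map $str]
}

set filename {b+c_d.cmd}
set data {#a FileName=b+c_d.cmd}

# Braces keep \s intact; the escaped filename is appended as the literal part.
set re {#a\s*FileName\s*=\s*}
append re [re_escape $filename]

puts $re                            ;# prints #a\s*FileName\s*=\s*b\+c_d\.cmd
puts [regexp -- $re $data match]    ;# prints 1
```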

Bruce

Aric Bills

Aug 19, 2010, 10:31:01 AM

(?q) only works at the beginning of a regular expression.

One solution to your problem would be to convert $filename to proper
regular expression syntax. The following expression will escape more
characters than really need to be escaped, but it should have the
desired effect:

regsub -all {\W} $filename {\\&} re_filename

You can then build your expression using any number of methods. If
you're not putting curly braces around your expression, don't forget
to escape the backslashes:

set expression "#a\\s*FileName\\s*=\\s*$re_filename"

Alternatively, you could build your string piecewise:

set expression {#a\s*FileName\s*=\s*}
append expression $re_filename

Then just run the expression:

regexp $expression $data match
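
Putting the pieces together, here is a sketch of the whole sequence as one script, using the filename and data values from the question. The extra puts lines are added only to show intermediate values, including what happens to a single backslash inside double quotes.

```tcl
set filename {b+c_d.cmd}
set data {#a FileName=b+c_d.cmd}

# Escape every non-word character in the filename.
regsub -all {\W} $filename {\\&} re_filename
puts $re_filename                 ;# prints b\+c_d\.cmd

# Inside double quotes Tcl consumes a single backslash, so "\s" turns into "s".
puts "#a\s*"                      ;# prints #as*

# Braces leave the backslashes for the regexp engine.
set expression {#a\s*FileName\s*=\s*}
append expression $re_filename

if {[regexp -- $expression $data match]} {
    puts $match                   ;# prints #a FileName=b+c_d.cmd
}
```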
