How to do a NOT in TCL?

cabkiz...@hotmail.com

May 31, 2007, 8:12:59 PM

Hi,
I have a program that uses tcl7.5 regexp to find the following
unix:.* (WARN|NOTICE)

I would like to alter this to something like.
unix:.* (WARN|NOTICE) && ![^core]

I have been told that TCL cannot handle logical NOTs, but I thought
I'd just check with the experts in this group for confirmation.

I tested the second expression above using awk and of course it
worked.

I have also tried
unix:.* (WARN|NOTICE).*[^core]{0}

I'm not sure if I have the syntax correct above, but what I'm trying
to say is: look for unix:, then any characters followed by a space,
then EITHER WARN or NOTICE, then any characters, but only when there
are '0' (zero) occurrences of core.

Hmm, hope someone understands what I'm trying to do. ;-)

Cheers
Craig.

bill...@alum.mit.edu

May 31, 2007, 9:59:32 PM

Logical not is not part of any standard regular expression notation.
The regular languages are closed under complementation, which one can
see by taking a deterministic machine and swapping the accepting and
non-accepting states, but there isn't any straightforward
corresponding regular expression notation.
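
The state-swapping argument can be sketched in Tcl. This is only an illustration: the machine, its transition table `delta`, and the proc name `accepts` are all made up for the example. It uses a complete DFA over the alphabet {a, b} that accepts strings containing "ab", and complements it by swapping accepting and non-accepting states.

```tcl
# Hypothetical DFA, invented for illustration.
# States: 0 = nothing useful seen, 1 = just saw "a",
#         2 = saw "ab" (accepting, and never left again).
array set delta {
    0,a 1  0,b 0
    1,a 1  1,b 2
    2,a 2  2,b 2
}

# Runs the machine from state 0 and reports whether the final state
# is in the given list of accepting states.
proc accepts {accepting input} {
    global delta
    set state 0
    foreach ch [split $input ""] {
        set state $delta($state,$ch)
    }
    expr {[lsearch -exact $accepting $state] >= 0}
}

set accepting    {2}     ;# original machine: input contains "ab"
set complemented {0 1}   ;# swapped states: input does not contain "ab"

foreach s {ab bba aab b ""} {
    puts "'$s': [accepts $accepting $s] [accepts $complemented $s]"
}
```

Because the machine is deterministic and complete, every input ends in exactly one state, so the two answers are always opposite. The same trick has no direct counterpart in regular expression syntax.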

However, Tcl does support logical not at the expression level, so you
can do things like:

[regexp {unix:.* (WARN|NOTICE).*} $s] && ![regexp {unix:.* (WARN|NOTICE).*core} $s]

where the first regular expression is more general and the second one
matches a subset of the strings matched by the first.
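
A minimal sketch of the two-regexp approach wrapped in a proc. The sample lines and the proc name `wantLine` are invented for illustration; only the two patterns come from the post.

```tcl
# Hypothetical sample lines, invented for illustration.
set lines {
    {unix: May 31 host WARN disk nearly full}
    {unix: May 31 host NOTICE core dumped by pid 42}
    {kernel: May 31 host WARN not a unix line}
}

# Returns 1 when the line matches the general pattern but does not
# also match the narrower pattern that requires "core" afterwards.
proc wantLine {s} {
    expr {[regexp {unix:.* (WARN|NOTICE).*} $s]
          && ![regexp {unix:.* (WARN|NOTICE).*core} $s]}
}

foreach s $lines {
    puts "[wantLine $s] $s"
}
```

Only the first sample line should be reported with 1. This form uses no lookahead, only plain alternation and the `!` operator of `expr`.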

In the specific case you mention, though, a "regular expression" with
a negative lookahead assertion will work:

^unix:.* (WARN|NOTICE)(?!.*core)

The part (?!.*core) rejects a match when "core" appears anywhere after
WARN or NOTICE.

bill...@alum.mit.edu

May 31, 2007, 10:00:42 PM

On May 31, 5:12 pm, cabkiz_fam...@hotmail.com wrote:

Oh, I should say that Tcl 7.5 is very old. I think that it probably
doesn't support assertions, since the new regular expression package
came in with Tcl 8.1. Why on earth are you using such an ancient Tcl?

Volker Hetzer

Jun 1, 2007, 12:27:43 PM

cabkiz...@hotmail.com wrote:

> Hi,
> I have a program that uses tcl7.5 regexp to find the following
> unix:.* (WARN|NOTICE)

If you wanted to match either WARN or NOTICE the expression
ought to be ((WARN)|(NOTICE)). Otherwise the only thing it
matches is WARNOTICE.

> I would like to alter this to something like.
> unix:.* (WARN|NOTICE) && ![^core]

I guess you do not mean "any character apart from c,e,o and r", right?
Because that's what [^core] means in a regular expression.
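
A short sketch of what the bracket expression actually does, using test strings invented for illustration:

```tcl
# [^core] is a bracket expression: it matches exactly one character
# that is not c, o, r or e. It says nothing about the word "core".
set oneOutside [regexp {^[^core]$} x]          ;# x is outside the set
set oneInside  [regexp {^[^core]$} c]          ;# c is inside the set
set inPhrase   [regexp {[^core]} "core dump"]  ;# the space already qualifies

puts "$oneOutside $oneInside $inPhrase"   ;# prints "1 0 1"
```

The last case shows why the pattern cannot be used to exclude the word: a string that contains "core" still matches as soon as it contains any other character.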

> I have been told that TCL cannot handle logical NOTs, but I thought
> I'd just check with the experts in this group for confirmation.

It can, just not in a single expression. You do
if {[regexp {.* ((WARN)|(NOTICE))} $String] && ![regexp core $String]} {
...
}

>
> I tested the second expression above using awk and of course it
> worked.

Not sure how you managed that. None of the expressions you posted
worked for the grep on /my/ machine.

>
> I have also tried
> unix:.* (WARN|NOTICE).*[^core]{0}
>
> I'm not sure if I have the syntax correct above, but what I'm trying
> to say is: look for unix:, then any characters followed by a space,
> then EITHER WARN or NOTICE, then any characters, but only when there
> are '0' (zero) occurrences of core.
>
> Hmm, hope someone understands what I'm trying to do. ;-)

Lots of Greetings!
Volker

--
For email replies, please substitute the obvious.

Ramon Ribó

Jun 1, 2007, 12:37:20 PM

On Fri, 01 Jun 2007 18:27:43 +0200, Volker Hetzer
<firstname...@ieee.org> wrote:

>> I have a program that uses tcl7.5 regexp to find the following
>> unix:.* (WARN|NOTICE)

> If you wanted to match either WARN or NOTICE the expression
> ought to be ((WARN)|(NOTICE)). Otherwise the only thing it
> matches is WARNOTICE.

Did you try it before answering?

(Documents) 1 % regexp {^(WARN|NOTICE)$} NOTICE
1
(Documents) 2 %

Ramon Ribó

Schelte Bron

Jun 1, 2007, 2:14:48 PM

I assume you are updating an old script that used to run on Tcl7.5,
but you will be using a more recent version of Tcl. In that case
you may be able to use the negative lookahead feature for regular
expressions. I have to admit I am usually surprised by the results
when I try to use this feature, but I think for your case this
should work:
regexp {unix:.* (WARN|NOTICE)(?!.*core)} $str
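
A quick way to check this pattern, assuming Tcl 8.1 or later (lookahead is part of the advanced regular expression syntax). The sample lines and the variable names `re`, `plain` and `withCore` are invented for illustration.

```tcl
# Requires a Tcl with advanced regular expressions (8.1 or later).
set re {unix:.* (WARN|NOTICE)(?!.*core)}

# Hypothetical sample lines, invented for illustration.
set plain    {unix: host WARN disk nearly full}
set withCore {unix: host NOTICE core dumped}

puts "[regexp $re $plain] $plain"
puts "[regexp $re $withCore] $withCore"
```

The first line should report 1 and the second 0. One caveat: if a line contains several WARN or NOTICE keywords, the lookahead only has to succeed after one of them, so a "core" that appears before the last keyword would not block the match.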

Schelte.
--
set Reply-To [string map {nospam schelte} $header(From)]

Donal K. Fellows

Jun 2, 2007, 2:35:36 PM

cabkiz_fam...@hotmail.com wrote:
> I would like to alter this to something like.
> unix:.* (WARN|NOTICE) && ![^core]
>
> I have been told that TCL cannot handle logical NOTs, but I thought
> I'd just check with the experts in this group for confirmation.

It doesn't handle logical negation, since that's a surprisingly
non-trivial concept with REs for fairly deep reasons. However, it does
provide some techniques that can help. Most important among these is
the negative lookahead assertion, which is *almost* negation, but with
some important restrictions: lookahead assertions cannot contain
backreferences (not that that matters this time) and they are always
non-greedy. They're also quite slow (though we can mitigate that by
putting them at the end of the RE).

The RE you are looking for is probably something like this:
unix:.*\m(WARN|NOTICE)\M(?!.*\mcore\M.*$).*

You'll want to make sure you put that in {braces}, of course. (The \m
and \M ensure that you don't get tripped up by words like "scores"...)
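
A small sketch showing the effect of the word-boundary escapes, assuming Tcl 8.1 or later. The sample lines and variable names are invented for illustration; the pattern is the one given above.

```tcl
set re {unix:.*\m(WARN|NOTICE)\M(?!.*\mcore\M.*$).*}

# Hypothetical sample lines, invented for illustration.
set scoresLine {unix: host WARN high scores recorded}
set coreLine   {unix: host WARN core dumped}
set quietLine  {unix: host NOTICE all quiet}

foreach s [list $scoresLine $coreLine $quietLine] {
    puts "[regexp $re $s] $s"
}
```

The line mentioning "scores" should still match, because \mcore\M only matches "core" as a whole word; only the line with the standalone word "core" should be rejected.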

Donal (alas, no time to explain properly what this does today...)

Volker Hetzer

Jun 13, 2007, 7:20:56 AM

Ramon Ribó wrote:

I did try it with grep and got different results.
However, I just tried again, with both tcl and egrep
and got your result. No idea what I did wrong last
time around.
Sorry about it. :-(

Jonathan Bromley

Jun 13, 2007, 8:58:15 AM

On Sat, 02 Jun 2007 11:35:36 -0700, Donal K. Fellows
<donal.k...@man.ac.uk> wrote:

[negative lookahead assertions]

>are always non-greedy. They're also quite slow

Indeed. Horribly slow. Can anyone give me any
idea why this is so, and what I can do to
ease the problem? I have an example (real,
practical) like this: Search for all occurrences
of a pattern which is two words followed by a
literal left parenthesis...

regexp -inline -all {\m\w+\s+\w+\s*\(} $str

This is great - it scans a 1MB $str, with about 2000
matches of the RE, in around half a second on my PC.
But now I add the criterion that the first of the
two words must not be the literal string "module";
this will reject around 150 of the 2000 matches...

{\m(?!module\M)\w+\s+\w+\s*\(}

and the execution time goes up by a factor of FIFTY.

Obviously, the pragmatic solution is to use my first
RE and then post-process all the matches; but why is
the second RE so very slow?
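
A minimal sketch of that pragmatic two-step approach, with a short input string invented for illustration (the real $str and its timings are not reproduced here). The variable names `all` and `kept` are also made up.

```tcl
# Hypothetical input, invented for illustration.
set str "module foo (a, b); function bar (x); task  baz(y);"

# Step 1: the fast pattern without the lookahead.
set all [regexp -inline -all {\m\w+\s+\w+\s*\(} $str]

# Step 2: drop matches whose first word is exactly "module".
# Each match starts at a word boundary, so anchoring with ^ and
# closing with \M mirrors the (?!module\M) test.
set kept {}
foreach m $all {
    if {![regexp {^module\M} $m]} {
        lappend kept $m
    }
}
puts $kept
```

The second pass runs a cheap anchored test on each short match only, so the extra cost should grow with the number of matches rather than with the length of the whole string.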
--
Jonathan Bromley, Consultant

bill...@alum.mit.edu

Jun 13, 2007, 5:59:38 PM

Please note that the OP continued this thread under another topic with
the title "Logical NOTs - continued" at:
http://groups.google.com/group/comp.lang.tcl/browse_thread/thread/d3fec44d7694e297/860f1226c6c6cf1b#860f1226c6c6cf1b.

Bill