Regular Expression newbie needs help


Jeff Whitmire

May 28, 1998, 3:00:00 AM

I have never done much with regular expressions, and I now find myself
struggling with something simple. I am using the new re module (as
opposed to the regex module). I am trying to match a string that does
not contain a period. According to the documentation, the pattern
'[^.]' should work, but it does not. If I use '[.]', it matches all
strings that contain a period, but I cannot seem to get the complement
to work. Am I missing something simple?

---
Jeff Whitmire
Xerox Corporation
DocuShare Development Team
jwhi...@crt.xerox.com
http://www.xerox.com/products/docushare

Lloyd Zusman

May 29, 1998, 3:00:00 AM

whit...@wrc.xerox.com (Jeff Whitmire) writes:

> I have never done much with regular expressions, and I now find myself
> struggling with something simple. I am using the new re module (as
> opposed to the regex module). I am trying to match a string that does
> not contain a period. According to the documentation, the pattern
> '[^.]' should work, but it does not. If I use '[.]', it matches all
> strings that contain a period, but I cannot seem to get the complement
> to work. Am I missing something simple?

Well, just follow this explanation ...

The `[.]' pattern does this:

Match any string which contains at least one period.

The `[^.]' pattern does this:

Match any string which contains at least one non-period.

In other words, `[.]' and `[^.]' are not really complements.

Given some sample strings, the following table shows how they will be
matched given these two regular expressions, assuming that you're
always using re.search() to do the matching:

String    Matches `[.]'    Matches `[^.]'
------    -------------    --------------
Hello.    Yes              Yes
Hello     No               Yes
.         Yes              No
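
The table can be reproduced with a short script. This is only a sketch, written for a current Python 3 interpreter rather than the 1998 one; the helper name table_row is made up for illustration and is not part of the re module.

```python
import re

# Sample strings taken from the table above.
samples = ["Hello.", "Hello", "."]

def table_row(s):
    """Return (matches '[.]', matches '[^.]') for s, using re.search()."""
    has_period = re.search(r"[.]", s) is not None       # at least one period
    has_non_period = re.search(r"[^.]", s) is not None  # at least one non-period
    return has_period, has_non_period

for s in samples:
    print(repr(s), table_row(s))
```

Each row comes out as a pair of booleans in the same column order as the table, which makes it easy to see that the two patterns are not complements of each other.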


Here's one possible regular expression that will match a string that
has no periods:

^[^.]*$

Here's what this regular expression means:

Match any string that consists entirely of non-periods, from the
beginning of the string to the end. The `^' at the left and the `$'
at the right are necessary to show that you are looking at the entire
string, from the leftmost character to the rightmost character; the
`*' is necessary to show that you want to match any number of
non-periods, including none (so the empty string matches, too).
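
As a rough illustration of the anchored pattern, here is a small sketch for a current Python 3 interpreter. The names NO_PERIOD and has_no_period are invented for this example.

```python
import re

# Anchored pattern: the whole string must be made of non-periods.
NO_PERIOD = re.compile(r"^[^.]*$")

def has_no_period(s):
    """True if s contains no period at all (the empty string counts)."""
    return NO_PERIOD.search(s) is not None

for s in ["Hello", "", "Hello.", "."]:
    print(repr(s), has_no_period(s))
```

Dropping either anchor, or the `*', changes the meaning back to "contains at least one non-period somewhere", which is the behavior the original question ran into.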

In other words, `^[^.]*$' is the complement of `[.]' ... at least when
using re.search().

Another way to handle this case is to use the `[.]' regular
expression, but to only look at strings which *don't* match it. In
other words:

    if not re.search('[.]', string):
        print '%s contains no periods' % (string)

Note the `not' in the `if' statement. This is interpreted as follows:

If it is *not* true that `string' contains at least one period,
then print the message.

I hope this helps. Good luck!

--
Lloyd Zusman
l...@asfast.com

Fredrik Lundh

May 29, 1998, 3:00:00 AM

>Another way to handle this case is to use the `[.]' regular
>expression, but to only look at strings which *don't* match it. In
>other words:
>
>    if not re.search('[.]', string):
>        print '%s contains no periods' % (string)

How about:

    if "." not in string:
        print string, "contains no periods"

Cheers /F
fre...@pythonware.com
http://www.pythonware.com

"Some people, when confronted with a problem, think 'I know, I'll use
regular expressions.' Now they have two problems."
-- Jamie Zawinski, on comp.lang.emacs

Lloyd Zusman

May 29, 1998, 3:00:00 AM

"Fredrik Lundh" <fre...@pythonware.com> writes:

> >Another way to handle this case is to use the `[.]' regular
> >expression, but to only look at strings which *don't* match it. In
> >other words:
> >
> >    if not re.search('[.]', string):
> >        print '%s contains no periods' % (string)
>
> How about:
>
>    if "." not in string:
>        print string, "contains no periods"

Yes, that's another approach that will work, too ... although the
original poster was specifically asking for help in using a regular
expression.
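
For what it's worth, the three approaches discussed in this thread can be checked against each other. The following is just a sketch for a current Python 3 interpreter; the function names are made up for the comparison and the sample strings are arbitrary.

```python
import re

def no_period_search(s):
    # Negate a search for a literal period.
    return not re.search(r"[.]", s)

def no_period_anchored(s):
    # Require the whole string to consist of non-periods.
    return re.search(r"^[^.]*$", s) is not None

def no_period_in(s):
    # Plain substring test, no regular expression involved.
    return "." not in s

samples = ["Hello.", "Hello", ".", "", "a.b.c"]
for s in samples:
    answers = (no_period_search(s), no_period_anchored(s), no_period_in(s))
    print(repr(s), answers)
```

On these samples all three give the same answer for every string, so the choice between them is mostly a matter of taste and of whether a regular expression is needed for other reasons.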

--
Lloyd Zusman
l...@asfast.com

Sid

May 31, 1998, 3:00:00 AM

I'm not a regular expression guru, but I believe that the
period is a METACHARACTER that matches any
character. I think you need to quote the character
in order to do the match you propose.

[^\.]

Notice the backslash before the period. This is how
you escape a METACHARACTER so that it loses its special meaning.

Sid
wiz...@gte.net


Lars Marius Garshol

May 31, 1998, 3:00:00 AM

* wiz...@gte.net


|
| I'm not a regular expression guru, but I believe that the period is
| a METACHARACTER that matches any character.

Correct, except in []-groups where it matches itself. This is because,
if . kept its special meaning inside brackets, [.], [.a-z] and . would
all have the same meaning, so it's pointless to make it difficult to
use . in []-groups.
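
A quick way to convince yourself of this, sketched for a current Python 3 interpreter (the variable names are only for illustration):

```python
import re

text = "a.b"

# Outside brackets, . matches any character (except a newline by default).
any_char = re.findall(r".", text)

# Inside brackets, . is a literal period, so no backslash is needed.
literal_period = re.findall(r"[.]", text)

# The escaped and unescaped negated classes find the same characters.
negated_plain = re.findall(r"[^.]", text)
negated_escaped = re.findall(r"[^\.]", text)

print(any_char, literal_period, negated_plain, negated_escaped)
```

So the backslash in [^\.] is harmless but unnecessary, and it does not change what the original '[^.]' pattern matches.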

--
"These are, as I began, cumbersome ways / to kill a man. Simpler, direct,
and much more neat / is to see that he is living somewhere in the middle /
of the twentieth century, and leave him there." -- Edwin Brock

http://www.stud.ifi.uio.no/~larsga/ http://birk105.studby.uio.no/
