Help with Regexp, \b

andrew cooke

May 29, 2010, 11:04:51 AM

This is a bit embarrassing, but I seem to be misunderstanding how \b
works in regexps.

Please can someone explain why the following fails:

from re import compile

p = compile(r'\bword\b')
m = p.match(' word ')
assert m

My understanding is that \b matches a space at the start or end of a
word, and that "word" is a word - http://docs.python.org/library/re.html

What am I missing here? I suspect I am doing something very stupid.

Thanks,
Andrew

Duncan Booth

May 29, 2010, 11:24:48 AM
andrew cooke <and...@acooke.org> wrote:

> Please can someone explain why the following fails:
>
> from re import compile
>
> p = compile(r'\bword\b')
> m = p.match(' word ')
> assert m
>
> My understanding is that \b matches a space at the start or end of a
> word, and that "word" is a word - http://docs.python.org/library/re.html
>
> What am I missing here? I suspect I am doing something very stupid.
>

You misunderstand what \b does: it doesn't match a space, it matches a 0
length string on a boundary between a non-word and a word.

Try:

p.match(' word ', 1).group(0)

and you'll see that you are only matching the word, not the surrounding
spaces.
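
A short sketch of the behaviour described above, using only the standard re module. The variable names (at_start, found, from_pos, boundary) are made up for illustration. It contrasts match(), which only tries at the given start position, with search(), which scans forward, and shows that \b itself consumes no characters.

```python
import re

p = re.compile(r'\bword\b')

# match() only tries at position 0. There is no word boundary there,
# because the start of the string is followed by a space, so it fails.
at_start = p.match(' word ')

# search() scans forward and finds the word starting at index 1.
found = p.search(' word ')

# match() with pos=1 starts at the 'w', where \b does hold.
from_pos = p.match(' word ', 1)

# \b on its own matches an empty (zero-length) string at the boundary.
boundary = re.search(r'\b', ' word ')

print(at_start)            # None
print(found.group(0))      # word
print(found.span())        # (1, 5)
print(from_pos.group(0))   # word
print(boundary.span())     # (1, 1)
```

If the goal is to find the word anywhere in the string, search() is probably the more natural call than match() with an explicit start position.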

andrew cooke

May 29, 2010, 11:30:18 AM
On May 29, 11:24 am, Duncan Booth <duncan.bo...@invalid.invalid>
wrote:

> andrew cooke <and...@acooke.org> wrote:
> > Please can someone explain why the following fails:
>
> >         from re import compile
>
> >         p = compile(r'\bword\b')
> >         m = p.match(' word ')
> >         assert m
[...]

> You misunderstand what \b does: it doesn't match a space, it matches a 0
> length string on a boundary between a non-word and a word.
[...]

That's what I thought it did... Then I read the docs and confused
"empty string" with "space"(!) and convinced myself otherwise. I
think I am going senile.

Thanks very much!
Andrew

John Machin

May 31, 2010, 8:12:09 PM
On May 30, 1:30 am, andrew cooke <and...@acooke.org> wrote:

>
> That's what I thought it did...  Then I read the docs and confused
> "empty string" with "space"(!) and convinced myself otherwise.  I
> think I am going senile.

Not necessarily. Conflating concepts like "string containing
whitespace", "string containing space(s)", "empty aka 0-length
string", None, (ASCII) NUL, and (SQL) NULL appears to be an age-
independent problem :-)
