My closest to successful attempt:
Python 2.5.4 (r254:67916, Dec 23 2008, 15:10:54) [MSC v.1310 32 bit (Intel)]
Type "copyright", "credits" or "license" for more information.
IPython 0.9.1 -- An enhanced Interactive Python.
In [161]: re.findall('\d+','this is test a3 attempt 79')
Out[161]: ['3', '79']
What I really want is just the 79, as a3 is not a decimal number, but
when I add the \b word boundaries I get:
In [162]: re.findall('\b\d+\b','this is test a3 attempt 79')
Out[162]: []
What am I missing?
~Ethan~
The sneaky detail is that the regexp should be in a raw string
(always a good practice), not a cooked string:
r'\b\d+\b'
The "\d" isn't a valid escape sequence, so Python leaves it
alone. However, I believe the "\b" is a control character (backspace), so
your actual string ends up something like:
>>> print repr('\b\d+\b')
'\x08\\d+\x08'
>>> print repr(r'\b\d+\b')
'\\b\\d+\\b'
the first of which doesn't match your target string, as you might
imagine.
-tkc
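A small sketch of the same point, in case it helps to see it as a check rather than a repr. The names cooked, raw, cooked_pattern and raw_pattern are only illustrative, and the cooked pattern is assembled from pieces here so that the only escape being processed by Python is the "\b" itself:

```python
import re

cooked = '\b'    # ordinary string: ONE character, backspace (0x08)
raw = r'\b'      # raw string: TWO characters, a backslash and the letter b

cooked_len = len(cooked)    # 1
raw_len = len(raw)          # 2

text = 'this is test a3 attempt 79'

# With the cooked "\b", the regex engine is asked to find literal
# backspace characters on either side of the digits.
cooked_pattern = cooked + r'\d+' + cooked
# With the raw "\b", the regex engine sees its own word-boundary escape.
raw_pattern = r'\b\d+\b'

cooked_matches = re.findall(cooked_pattern, text)   # no backspaces in text
raw_matches = re.findall(raw_pattern, text)         # only the standalone number
```

The target string contains no backspace characters, so the cooked pattern has nothing to match, while the raw pattern picks out only the number that stands alone as a word.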
Try this:
>>> re.findall(r'\b\d+\b','this is test a3 attempt 79')
['79']
The \b is a backspace; by using raw strings you get an actual backslash
and b.
--
Sjoerd Mullender
You need to use a raw string (r'...') to prevent \b from being interpreted
as a backspace:
re.findall(r'\b\d+\b','this is test a3 attempt 79')
\d isn't a recognised escape sequence, so it doesn't get interpreted:
> print '\b'
^H
> print '\d'
\d
> print r'\b'
\b
Try to get into the habit of using raw strings for regexps.
ARGH!!
Okay, I need two \\ so I'm not trying to match a backspace. I knew
(okay, hoped ;) I would figure it out once I posted the question and
moved on.
*sheepish grin*
~Ethan~
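For completeness, a quick sketch showing that doubling the backslashes in an ordinary string gives the same pattern as the raw string suggested above. The variable names are only illustrative:

```python
import re

doubled = '\\b\\d+\\b'   # each \\ in an ordinary string yields one backslash
raw = r'\b\d+\b'         # raw string: backslashes are kept as written

# Both spellings produce the same six-plus-character pattern text.
same_pattern = (doubled == raw)

matches = re.findall(doubled, 'this is test a3 attempt 79')
```

Either spelling works; the raw string is just easier to read and harder to get wrong once a pattern has more than a couple of backslashes in it.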