>>> import re
>>> r = re.compile(r'(?:[a-zA-Z]:)([\\/]\w+)+')
>>> r.search(r'c:/tmp/spam/eggs').groups()
('/eggs',)
Obviously, I would like to capture all groups:
('/tmp', '/spam', '/eggs')
But it seems that re keeps only the last match of a repeated group. Is
there any way to capture every repetition of a group that is followed by
a quantifier, i.e. (...)+ or (...)*?
Even better would be:
('tmp', 'spam', 'eggs')
Yes, I know about re.split:
>>> re.split( r'(?:\w:)?[/\\]', r'c:/tmp/spam\\eggs/' )
['', 'tmp', 'spam', '', 'eggs', '']
My interest here is more general: how do I capture every repetition of a
repeated group?
Regards,
mk
A repeated group only keeps its last match, so you'll have to do
something else, for example:
>>> s = re.compile(r'(?:[a-zA-Z]:)')
>>> n = re.compile(r'[\\/]\w+')
>>> m = s.match('c:/tmp/spam/eggs')
>>> n.findall(m.string[m.end():])
['/tmp', '/spam', '/eggs']
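The same two-step idea can be wrapped in a small function: first match the
whole repeated run as a single group, then run findall over that run. This is
only a sketch; the names FULL, PART and path_components are made up for
illustration, and the patterns simply mirror the ones used above.

```python
import re

# Hypothetical names; the patterns mirror the ones used earlier in the thread.
# FULL captures the entire repeated run (e.g. '/tmp/spam/eggs') as group 1.
FULL = re.compile(r'(?:[a-zA-Z]:)((?:[\\/]\w+)+)')
# PART captures one component at a time, without its leading separator.
PART = re.compile(r'[\\/](\w+)')

def path_components(path):
    """Return the components after the drive prefix, or None if no match."""
    m = FULL.match(path)
    if m is None:
        return None
    # findall returns the contents of the single group for every match.
    return PART.findall(m.group(1))

print(path_components('c:/tmp/spam/eggs'))  # ['tmp', 'spam', 'eggs']
```

Because PART has exactly one group, findall returns the bare names without
the separators.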
--
Neil Cerutti
re.findall is what you're looking for. Here are all words not followed by
a colon:
>>> import re
>>> re.findall(r'(\w+)(?!:)', r'c:\tmp\spam/eggs')
['tmp', 'spam', 'eggs']
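One thing that may be worth checking: with a prefix longer than one letter,
\w+ can backtrack to a shorter word that is no longer followed by a colon.
A sketch of the difference, assuming a made-up input 'cd:/tmp' and
illustrative names naive and strict:

```python
import re

# Same pattern as above: \w+ is free to give back characters.
naive = re.compile(r'(\w+)(?!:)')
# The added \b forces the word to end at a word boundary before the
# lookahead is tested, so a partial word cannot match.
strict = re.compile(r'(\w+)\b(?!:)')

print(naive.findall('cd:/tmp'))   # ['c', 'tmp']
print(strict.findall('cd:/tmp'))  # ['tmp']
```

For a single-letter drive such as 'c:' both patterns give the same result.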
-Mark