Re: Splitting text at whitespace but keeping the whitespace in the returned list

MRAB

Jan 24, 2010, 12:24:40 PM
to pytho...@python.org
pyt...@bdurham.com wrote:
> I need to parse some ASCII text into 'word' sized chunks of text AND
> collect the whitespace that separates the split items. By 'word' I mean
> any string of characters separated by whitespace (newlines, carriage
> returns, tabs, spaces, soft-spaces, etc). This means that my split text
> can contain punctuation and numbers - just not whitespace.
>
> The split( None ) method works fine for returning the word sized chunks
> of text, but destroys the whitespace separators that I need.
>
> Is there a variation of split() that returns delimiters as well as tokens?
>
I'd use the re module:

>>> import re
>>> re.split(r'(\s+)', "Hello world!")
['Hello', ' ', 'world!']
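
A slightly longer sketch of the same idea, for runs of mixed whitespace. The helper name split_keep_whitespace is only an illustration, not something from the re module. Note that text starting or ending with whitespace yields an empty string at that end of the list, and that joining the pieces gives back the original text.

```python
import re

def split_keep_whitespace(text):
    """Split text into words and whitespace runs, keeping both."""
    # The capturing group in the pattern makes re.split() return
    # the separators as list items alongside the words.
    return re.split(r'(\s+)', text)

tokens = split_keep_whitespace("Hello  world!\tbye\n")
# Words sit at even indexes, whitespace runs at odd indexes.
words = tokens[0::2]
gaps = tokens[1::2]
# Nothing is lost, so the pieces join back into the input.
roundtrip = ''.join(tokens)
```

If the empty strings at the ends are unwanted, they can be filtered out afterwards, at the cost of losing the strict word/whitespace alternation.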

pyt...@bdurham.com

Jan 24, 2010, 12:36:33 PM
to MRAB, pytho...@python.org
MRAB,

"MRAB" <pyt...@mrabarnett.plus.com> wrote:
> >>> import re
> >>> re.split(r'(\s+)', "Hello world!")
> ['Hello', ' ', 'world!']

That was exactly (EXACTLY!) the solution I was looking for.

Thank you!
Malcolm

Tim Arnold

Jan 25, 2010, 12:47:25 PM
"MRAB" <pyt...@mrabarnett.plus.com> wrote in message
news:mailman.1362.1264353...@python.org...

also, partition works though it returns a tuple instead of a list.
>>> s = 'hello world'
>>> s.partition(' ')
('hello', ' ', 'world')
>>>
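
One caveat worth sketching: partition() splits only at the first occurrence of a literal separator, so on its own it does not cover the original request (every word, any kind of whitespace). The sample strings below are made up for illustration.

```python
s = 'hello big  world'

# Only the first space is used; the rest of the string is left alone.
first = s.partition(' ')

# When the separator is absent, the result is still a 3-tuple:
# the whole string followed by two empty strings.
missing = 'hello'.partition(' ')

# A tab is not matched by a separator of ' '.
tabbed = 'hello\tworld'.partition(' ')
```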

--Tim Arnold


Roy Smith

Jan 26, 2010, 7:40:15 AM
In article <hjklfd$llm$1...@foggy.unx.sas.com>,
"Tim Arnold" <tim.a...@sas.com> wrote:

> also, partition works though it returns a tuple instead of a list.
> >>> s = 'hello world'
> >>> s.partition(' ')
> ('hello', ' ', 'world')

I've never used partition() before; my first thought on reading the above
was, "That's weird, it should be returning a list". Then I went and looked
at the docs. Given the description (returns specifically a 3-tuple), I
guess a tuple makes sense, but now I'm wondering what the use case was for
this method when it was invented?

Having a variant of split() which either leaves the delimiter on the end of
each word, or returns a list of alternating [word, delimiter, word,
delimiter, word] seems logical and orthogonal. In fact, partition() is
really just the hypothetical whitespace-preserving variant of split(), with
maxsplit=1, except that it returns a tuple instead of a list.
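
A rough way to see the comparison, using re.split() with a capturing group as a stand-in for the hypothetical whitespace-preserving split() (no such str method exists). The two differ a little more than list versus tuple: partition() takes one literal separator, and it always returns three items even when nothing matches.

```python
import re

s = 'hello \t world again'

# Stand-in for "split keeping the delimiter, maxsplit=1":
# the whole whitespace run is kept as one item.
as_list = re.split(r'(\s+)', s, maxsplit=1)

# partition() stops at the first literal ' ' only.
as_tuple = s.partition(' ')

# With no separator present, the shapes differ as well.
no_match_list = re.split(r'(\s+)', 'hello', maxsplit=1)
no_match_tuple = 'hello'.partition(' ')
```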

So, what was the original problem partition() was trying to solve?

Aahz

Jan 26, 2010, 6:36:26 PM
In article <roy-E19D6B.0...@news.panix.com>,

Roy Smith <r...@panix.com> wrote:
>
>I've never used partition() before; my first thought on reading the above
>was, "That's weird, it should be returning a list". Then I went and looked
>at the docs. Given the description (returns specifically a 3-tuple), I
>guess a tuple makes sense, but now I'm wondering what the use case was for
>this method when it was invented?

http://docs.python.org/whatsnew/2.5.html
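
As I read that page, the motivation was the common find()-then-slice idiom, which partition() condenses into a single call. A minimal sketch of the two side by side, with a made-up 'key=value' string as the example input:

```python
line = 'key=value=more'

# The older idiom: find the index, then slice around it.
idx = line.find('=')
if idx >= 0:
    before, after = line[:idx], line[idx + 1:]
else:
    before, after = line, ''

# The same result in one call; sep is '' when nothing was found.
head, sep, tail = line.partition('=')

# rpartition() does the same from the right-hand end.
rhead, rsep, rtail = line.rpartition('=')
```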
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/

import antigravity