How to emulate itertools.groupby with a Series/DataFrame?


Michael

Apr 24, 2014, 12:31:14 PM
to pyd...@googlegroups.com
Given a list L = [1, 1, 1, 2, 2, 2, 3, 3, 1, 1, 3, 3, 2, 2],
itertools.groupby would give me 6 groups, one for each run of consecutive equal values.
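
For reference, here is a minimal sketch of the itertools behaviour in question. The name `runs` is only illustrative; the list is the one given above.

```python
from itertools import groupby

L = [1, 1, 1, 2, 2, 2, 3, 3, 1, 1, 3, 3, 2, 2]

# groupby starts a new group whenever the value changes, so equal values
# that are not adjacent end up in separate groups.
runs = [(key, list(group)) for key, group in groupby(L)]

for key, members in runs:
    print(key, members)
```

The values 1, 2 and 3 each show up in two separate groups, which is why a plain pandas groupby on the values (3 groups) does not match.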

I'm trying to get the same result with pandas,
but I'm not aware of a method that does this.

Does anyone have a solution for pandas?

If I think of something, I will post it here...

D. S. McNeil

Apr 24, 2014, 12:46:35 PM
to pyd...@googlegroups.com
There are several relevant open tickets;

https://github.com/pydata/pandas/issues/4059

links to many of them. I tend to use s.groupby((s != s.shift()).cumsum()) (from my own ticket on the same topic, https://github.com/pydata/pandas/issues/5494), e.g.:

>>> import pandas as pd
>>> s = pd.Series([1, 1, 1, 2, 2, 2, 3, 3, 1, 1, 3, 3, 2, 2])
>>> s.groupby((s != s.shift()).cumsum())
<pandas.core.groupby.SeriesGroupBy object at 0xb364b4c>
>>> s.groupby((s != s.shift()).cumsum()).groups
{1: [0L, 1L, 2L], 2: [3L, 4L, 5L], 3: [6L, 7L], 4: [8L, 9L], 5: [10L, 11L], 6: [12L, 13L]}
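
For anyone following along, the one-liner can be unpacked into its intermediate steps. This is only a sketch, assuming a reasonably recent pandas; the names `changed`, `run_id` and `summary` are made up for illustration.

```python
import pandas as pd

s = pd.Series([1, 1, 1, 2, 2, 2, 3, 3, 1, 1, 3, 3, 2, 2])

# True wherever a value differs from the one before it. The first row is
# compared against NaN, so it is always True and opens the first run.
changed = s != s.shift()

# The running total of those flags gives every run its own integer label.
run_id = changed.cumsum()

# Group on the labels and summarise each run: its value and its length.
summary = s.groupby(run_id).agg(["first", "count"])
print(summary)
```

One caveat: NaN never compares equal to itself, so with this idiom consecutive missing values would each start their own run.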



Doug

Michael

Apr 24, 2014, 1:33:32 PM
to pyd...@googlegroups.com
Ingenious!

It's kind of a mouthful, but it's a nice solution...

I hope they will settle on a new function, like split() or partition(); both names are intuitive.
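
Until something like that exists, the idiom can be hidden behind a small wrapper. The sketch below is purely hypothetical: `consecutive_runs` and its `by` parameter are not part of pandas, just one way such a helper might look for a Series or a DataFrame column.

```python
import pandas as pd


def consecutive_runs(obj, by=None):
    """Yield (value, chunk) pairs for runs of consecutive equal values.

    Hypothetical helper, not part of pandas. `obj` is a Series, or a
    DataFrame together with the name of the column to compare (`by`).
    """
    keys = obj if by is None else obj[by]
    run_id = (keys != keys.shift()).cumsum()
    # Labels increase with position; sort=False keeps the runs in that order.
    for _, chunk in obj.groupby(run_id, sort=False):
        first = chunk.iloc[0] if by is None else chunk[by].iloc[0]
        yield first, chunk


df = pd.DataFrame({"key": [1, 1, 2, 2, 1], "val": list("abcde")})
runs = [(int(k), chunk["val"].tolist()) for k, chunk in consecutive_runs(df, by="key")]
print(runs)
```

Like itertools.groupby, it yields the key together with the matching rows, so the last `1` in the example comes back as its own run rather than being merged with the first two.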