Pandas Dataframe count entire row duplicates

mat lon

Feb 5, 2017, 2:02:49 PM
to PyData
I have a Dataframe as follows: 
Name      ID1        ID2      Info             Owner
John      1001       100      \\go\to\store    MATT
John      1001       100      \\go\t0\store    MATT
John      1001       10o      \\go\to\store    JOE
John      1001       100      \\go\to\store    MATT

I am looking to create an additional column for frequency of duplicate rows such as below:
Name      ID1        ID2      Info             Owner     Freq
John      1001       100      \\go\to\store    MATT      2
John      1001       100      \\go\t0\store    MATT      1
John      1001       100      \\go\to\store    JOE       1
John      1001       100      \\go\to\store    MATT      2

I have tried the two things below to no avail:
1)  df.groupby('Name').agg(lambda x: list(x).count(['Name', 'Info', 'Owner']))
2)  df.groupby(['Name', 'Info', 'Owner']).size()

The two ID columns are not needed for the duplicate frequency count, but they are needed for additional processing, so they have to stay in the result. I am having trouble figuring this out and could really use some assistance. Thanks for your help.
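
For context, here is a minimal sketch of why a plain size() call does not produce the column directly: it returns one count per distinct (Name, Info, Owner) combination rather than one value per row. The frame is retyped from the table in this post, with ID2 assumed to be stored as strings; the names `keys`, `counts` and `with_freq` are illustrative only.

```python
import pandas as pd

# Sample frame retyped from the post (ID2 kept as strings because of the "10o" entry).
df = pd.DataFrame({
    "Name": ["John", "John", "John", "John"],
    "ID1": [1001, 1001, 1001, 1001],
    "ID2": ["100", "100", "10o", "100"],
    "Info": [r"\\go\to\store", r"\\go\t0\store", r"\\go\to\store", r"\\go\to\store"],
    "Owner": ["MATT", "MATT", "JOE", "MATT"],
})

keys = ["Name", "Info", "Owner"]

# size() yields one count per distinct key combination, not one per input row.
counts = df.groupby(keys).size().reset_index(name="Freq")

# Merging the counts back on the keys gives every original row its group's count
# while keeping the ID columns.
with_freq = df.merge(counts, on=keys, how="left")
print(with_freq)
```

So the per-group counts are a usable intermediate step, but they still have to be joined back to the original rows to get a Freq column.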

Pratap Vardhan

Feb 5, 2017, 2:24:36 PM
to PyData
Something like this might work?

In [434]: df['Freq'] = df.groupby(['Name', 'Info', 'Owner'])['ID1'].transform('count')

In [435]: df
Out[435]:
   Name   ID1  ID2           Info Owner  Freq
0  John  1001  100  \\go\to\store  MATT     2
1  John  1001  100  \\go\t0\store  MATT     1
2  John  1001  10o  \\go\to\store   JOE     1
3  John  1001  100  \\go\to\store  MATT     2
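
A self-contained version of the same idea, as a sketch rather than a definitive recipe. The frame is retyped from the question (ID2 assumed to be strings), and the `Freq_size` column is an illustrative addition. transform returns a result aligned to the original index, which is why it can be assigned straight back as a column.

```python
import pandas as pd

# Sample frame retyped from the question.
df = pd.DataFrame({
    "Name": ["John", "John", "John", "John"],
    "ID1": [1001, 1001, 1001, 1001],
    "ID2": ["100", "100", "10o", "100"],
    "Info": [r"\\go\to\store", r"\\go\t0\store", r"\\go\to\store", r"\\go\to\store"],
    "Owner": ["MATT", "MATT", "JOE", "MATT"],
})

keys = ["Name", "Info", "Owner"]

# 'count' tallies the non-null ID1 values in each group and broadcasts
# that number back to every row of the group.
df["Freq"] = df.groupby(keys)["ID1"].transform("count")

# 'size' counts rows regardless of missing values in ID1, which may be
# the safer choice if ID1 can contain NaN.
df["Freq_size"] = df.groupby(keys)["ID1"].transform("size")

print(df)
```

With no missing values in ID1, both columns agree; they would differ only for groups where ID1 is null in some rows.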

Paul Hobson

Feb 5, 2017, 2:50:47 PM
to pyd...@googlegroups.com
