--Cross-posted in SO: http://stackoverflow.com/questions/30676476/pandas-trouble-transforming-dataframe-into-aggregated-dataframe
I'm a new user of pandas. I have been going through the help docs and trying various experiments (groupby(), multiindex, value_counts()), but I have not been able to get the desired end result.
My dataframe is as follows (it is time indexed):
DATE        GROUP  X  Y  STATUS
2014-01-01  A      0  0  PASS
2014-01-01  A      0  1  FAIL
2014-01-01  A      1  0  PASS
2014-01-02  B      0  0  PASS
2014-01-02  B      0  1  PASS
2014-01-02  B      1  1  FAIL
....
The 'STATUS' column is of dtype=category. I would like to end up with a new dataframe that looks as follows:
DATE        GROUP  STATUS  PCT
2014-01-01  A      PASS    0.667
2014-01-01  A      FAIL    0.333
2014-01-02  B      PASS    0.667
2014-01-02  B      FAIL    0.333
Essentially, for each group, I want to calculate the fraction of rows that has each STATUS value.
I have tried df.groupby('GROUP').value_counts() followed by dividing by sum() to calculate the percentages. That works OK. However, I lose the index information, and I don't know how to add it back to the new dataframe to achieve the desired output above. There must be some easy way in pandas to do it, but I'm not seeing it.
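To make the question concrete, here is a minimal sketch of one direction that might work. It assumes DATE is the index, rebuilds only the six sample rows shown above, and uses illustrative names (df, counts, pct, result). The idea is to turn DATE back into a regular column, count rows per (DATE, GROUP, STATUS), and divide by the total for each (DATE, GROUP). It may well not be the most idiomatic route.

```python
import pandas as pd

# Sample rows copied from the table above; DATE is set as the index.
df = pd.DataFrame(
    {
        "DATE": pd.to_datetime(["2014-01-01"] * 3 + ["2014-01-02"] * 3),
        "GROUP": ["A", "A", "A", "B", "B", "B"],
        "X": [0, 0, 1, 0, 0, 1],
        "Y": [0, 1, 0, 0, 1, 1],
        "STATUS": pd.Categorical(["PASS", "FAIL", "PASS", "PASS", "PASS", "FAIL"]),
    }
).set_index("DATE")

# Move DATE out of the index so it can be used as a grouping key,
# then count rows per (DATE, GROUP, STATUS). observed=True skips
# category values that never occur in a given group.
counts = (
    df.reset_index()
    .groupby(["DATE", "GROUP", "STATUS"], observed=True)
    .size()
)

# Divide each count by the total number of rows in its (DATE, GROUP).
totals = counts.groupby(level=["DATE", "GROUP"]).transform("sum")
pct = counts / totals

# Flatten the MultiIndex back into ordinary columns.
result = pct.rename("PCT").reset_index()
print(result)
```

If only GROUP matters and DATE is just carried along, grouping on GROUP and STATUS alone and merging DATE back afterwards would be an alternative, but that depends on each GROUP mapping to a single DATE, which the sample data suggests but does not guarantee.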
Any suggestions are appreciated. Thanks.