How can I aggregate rows for many columns?


Simon Givoli

Oct 4, 2015, 9:41:29 AM
to Israel R User Group

Hi,

 

My df basically looks like this:

 

Id   A    B    C    total
3    5    0    1    6
3    4    3    4    11
3    2    1    2    5
4    5    4    3    12
4    3    2    4    9
4    1    1    1    3

 

I want to collapse the rows by Id and get:

 

Id   A    B    C    total
3    11   4    7    22
4    9    7    8    24

 

I was able to do so for one column with:

df.grouped<- aggregate(df$A~Id, data=df, FUN="sum")

 

I have many columns (A-Z), so I need some kind of loop. I tried:

 

df.grouped<- aggregate(df[4:51]~Id, data=df, FUN="sum")

names(df.grouped)<-paste(names(df)[4:51])

 

But got:

 

Error in model.frame.default(formula = df[4:51] ~ Id, data = df) :

  invalid type (list) for variable 'df[4:51]'

 

As you can see, I also want the names in df.grouped to be the same as in df.

 

Any ideas?

 

Thanks,
Simon

amit gal

Oct 4, 2015, 9:57:55 AM
to israel-r-...@googlegroups.com

You really should take a close look at dplyr and tidyr. In this case, group_by and summarise should do the job. In base R, you can look at the by function.
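
For what it's worth, here is a minimal sketch of the group_by/summarise route on the toy data from the question. It assumes a reasonably recent dplyr (1.0.0 or later, for across()); the data frame is retyped from the question, and applying sum to every non-grouping column is my reading of what is wanted, not something confirmed in the thread.

```r
library(dplyr)

# Toy data retyped from the original question
df <- data.frame(
  Id    = c(3, 3, 3, 4, 4, 4),
  A     = c(5, 4, 2, 5, 3, 1),
  B     = c(0, 3, 1, 4, 2, 1),
  C     = c(1, 4, 2, 3, 4, 1),
  total = c(6, 11, 5, 12, 9, 3)
)

# Group by Id, then sum every remaining column.
# everything() skips the grouping column, so Id is kept as the key
# and the other columns keep their original names.
df.grouped <- df %>%
  group_by(Id) %>%
  summarise(across(everything(), sum))

print(df.grouped)
```

With many columns, everything() can be swapped for a narrower selection such as A:total if only some of them should be summed.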

--
You received this message because you are subscribed to the Google Groups "Israel R User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to israel-r-user-g...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Simon Givoli

Oct 4, 2015, 10:01:25 AM
to israel-r-...@googlegroups.com

Thanks, Amit.

I tried, but I got really confused and couldn't find a solution. Can you point me to a good tutorial on these packages?

 

Simon


Simon Givoli
Occupational psychologist
052-3626345

Yoni Sidi

Oct 4, 2015, 11:54:24 PM
to Israel R User Group

Simon Givoli

Oct 5, 2015, 1:12:40 AM
to israel-r-...@googlegroups.com

Ariel Telpaz

Oct 8, 2015, 3:54:56 AM
to Israel R User Group

There is a simple solution: you can use cbind inside the formula itself to aggregate over several variables at once.
aggregate(cbind(A,B,C,total)~Id,data=df,FUN=sum)
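
To make the suggestion concrete, here is a small base R sketch run on the toy data from the question. The cbind call is the one from this reply; the dot formula and the by= variant are extra options I am assuming would suit a frame with many columns, since neither needs the column names typed out. The object names by.cbind, by.dot and by.list are just illustrative.

```r
# Toy data retyped from the original question
df <- data.frame(
  Id    = c(3, 3, 3, 4, 4, 4),
  A     = c(5, 4, 2, 5, 3, 1),
  B     = c(0, 3, 1, 4, 2, 1),
  C     = c(1, 4, 2, 3, 4, 1),
  total = c(6, 11, 5, 12, 9, 3)
)

# 1. cbind on the left-hand side: name each column to aggregate
by.cbind <- aggregate(cbind(A, B, C, total) ~ Id, data = df, FUN = sum)

# 2. Dot formula: "." stands for every column of df other than Id
by.dot <- aggregate(. ~ Id, data = df, FUN = sum)

# 3. Data-frame method: pick columns by name or position,
#    and pass the grouping column as a one-column data frame
#    so that the result keeps the name "Id"
by.list <- aggregate(df[c("A", "B", "C", "total")], by = df["Id"], FUN = sum)

print(by.cbind)
```

All three keep the original column names, so no renaming step is needed afterwards. One difference to be aware of: the formula versions drop any row containing an NA by default (na.action = na.omit), while the by= version passes every row to FUN.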


On Monday, October 5, 2015 at 08:12:40 UTC+3, Simon Givoli wrote:
Thanks!

2015-10-05 6:54 GMT+03:00 Yoni Sidi <yon...@gmail.com>:
