[R] Removing rows w/ smaller value from data frame

ramoss

May 23, 2013, 10:23:58 AM
to r-h...@r-project.org
Hello,

I have a column called max_dt in my data frame, and for each activity I only
want to keep the row with the latest date. How can I do that?

data frame:

activity max_dt
A 2013-03-05
B 2013-03-28
A 2013-03-28
C 2013-03-28
B 2013-03-01

Thank you for your help




PIKAL Petr

May 23, 2013, 10:35:03 AM
to ramoss, r-h...@r-project.org
Hi

Change max_dt to a POSIX class (e.g. POSIXct), compare with the standard comparison operators, and use the result to select rows.

> s<-seq(c(ISOdate(2000,3,20)), by = "day", length.out = 10)
> s<s[5]
[1] TRUE TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE FALSE
>
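Applied to the data frame from the question, the idea above might look like the following. This is only a sketch: the names dat, group_max, keep and latest are made up for illustration, and ave() is one possible way to get the per-activity maximum to compare against.

```r
# Rebuild the example data from the question.
dat <- data.frame(
  activity = c("A", "B", "A", "C", "B"),
  max_dt   = c("2013-03-05", "2013-03-28", "2013-03-28",
               "2013-03-28", "2013-03-01"),
  stringsAsFactors = FALSE
)

# Convert the character column to POSIXct so comparisons are chronological.
dat$max_dt <- as.POSIXct(dat$max_dt, tz = "UTC")

# ave() returns, for every row, the maximum within that row's activity.
# Working on the underlying numeric value keeps the comparison simple.
group_max <- ave(as.numeric(dat$max_dt), dat$activity, FUN = max)

# Logical vector: TRUE where the row holds the latest date of its activity.
keep <- as.numeric(dat$max_dt) == group_max

# Use the comparison result to select rows (rows 2, 3 and 4 here).
latest <- dat[keep, ]
print(latest)
```

Note that if an activity has several rows sharing the same latest date, all of them are kept.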

Regards
Petr

arun

May 23, 2013, 10:39:38 AM
to ramoss, R help
Hi,
Try:
datNew<- read.table(text="
activity    max_dt
A            2013-03-05
B            2013-03-28
A            2013-03-28
C            2013-03-28
B            2013-03-01
",sep="",header=TRUE,stringsAsFactors=FALSE)
datNew$max_dt<- as.Date(datNew$max_dt)
aggregate(max_dt~activity,data=datNew,max)
#  activity     max_dt
#1        A 2013-03-28
#2        B 2013-03-28
#3        C 2013-03-28
#or

library(plyr)
ddply(datNew,.(activity),summarize, max_dt=max(max_dt))
#  activity     max_dt
#1        A 2013-03-28
#2        B 2013-03-28
#3        C 2013-03-28
#or
ddply(datNew,.(activity),summarize, max_dt=tail(sort(max_dt),1))
#  activity     max_dt
#1        A 2013-03-28
#2        B 2013-03-28
#3        C 2013-03-28
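
The calls above return only activity and max_dt. If the real data frame has further columns that should be carried along, a base R alternative is to sort and then drop duplicates. This is a sketch under that assumption; the names dat, ord and latest are illustrative and not from the original post.

```r
# Example data from the question, with max_dt already stored as Date.
dat <- data.frame(
  activity = c("A", "B", "A", "C", "B"),
  max_dt   = as.Date(c("2013-03-05", "2013-03-28", "2013-03-28",
                       "2013-03-28", "2013-03-01")),
  stringsAsFactors = FALSE
)

# Sort by activity, then by date with the newest first.
# A Date cannot be negated directly, so sort on its numeric value.
ord <- dat[order(dat$activity, -as.numeric(dat$max_dt)), ]

# duplicated() flags every occurrence of an activity after the first,
# so negating it keeps only the newest row per activity.
latest <- ord[!duplicated(ord$activity), ]
print(latest)
```

Unlike a comparison against the group maximum, this keeps exactly one row per activity even when the latest date occurs more than once.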


A.K.

arun

May 23, 2013, 10:58:10 AM
to Mossadegh, Ramine N., R help
From your email, it seems like aggregate() is working.

Could you please provide the output of sessionInfo()?
My guess is that some other loaded package is masking plyr's summarize().
For example, if I load
library(Hmisc)
#The following object is masked from ‘package:plyr’:
#
 #   is.discrete, summarize


ddply(datNew,.(activity),summarize, max_dt=max(max_dt))
#Error in is.list(by) : 'by' is missing
ddply(datNew,.(activity),plyr::summarize,max_dt=max(max_dt))
#  activity     max_dt
#1        A 2013-03-28
#2        B 2013-03-28
#3        C 2013-03-28
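
To check for this kind of masking in your own session, something along these lines may help. It is a small sketch: where_defined is a hypothetical helper name, and what it reports depends entirely on which packages are attached at the time.

```r
# Hypothetical helper: list every attached package or environment that
# defines a function with the given name. The first entry is the one an
# unqualified call resolves to.
where_defined <- function(name) {
  utils::find(name, mode = "function")
}

# With both plyr and Hmisc attached, this would list both packages,
# with the most recently attached one first.
print(where_defined("summarize"))

# conflicts() lists all names that are defined in more than one place
# on the search path.
print(head(conflicts()))
```

If the first entry is not package:plyr, qualifying the call as plyr::summarize, as shown above, avoids the problem.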
A.K.


----- Original Message -----
From: "Mossadegh, Ramine N." <Ramine.M...@finra.org>
To: arun <smartp...@yahoo.com>
Cc:

Sent: Thursday, May 23, 2013 10:44 AM
Subject: RE: [R] Removing rows w/ smaller value from data frame

Thanks, but I get:
Error in is.list(by) : 'by' is missing
when I tried ddply(datNew,.(activity),summarize, max_dt=max(max_dt))

