Getting the most recent dates in a new column from dates in four columns

26 views
Skip to first unread message

Pradip Muhuri

unread,
Nov 8, 2014, 11:46:56 PM11/8/14
to lubr...@googlegroups.com
 

Hello,

 

My apology for reposting to this Group (posted earlier in the r-help forum).

 

The example data frame in the reproducible code below has 5 columns (1 column for id and 4 columns for dates), and there are 7 observations.  I would like to insert the most recent date from those 4 date columns into a new column (oiddate) using the mutate() function in the dplyr package although I have loaded the lubridate package.   I am getting correct results (NA in the new column) if a given row has all NA's in the four columns.  However, the issue is that the date value inserted into the new column (oidflag) is incorrect for 5 of the remaining 6 rows (with a non-NA value in at least 1 of the four columns).  

 

I would appreciate receiving your help toward resolving the issue.  Please see the R console and the R script (reproducible example)below.

 

Thanks in advance.

 

Pradip

 

 

######  from the console ########

print (data2)

  id    mrjdate    cocdate    inhdate    haldate    oidflag

1  1 2004-11-04 2008-07-18 2005-07-07 2007-11-07 2011-11-04

2  2       <NA>       <NA>       <NA>       <NA>       <NA>

3  3 2009-10-24       <NA> 2011-10-13       <NA> 2011-11-04

4  4 2007-10-10       <NA>       <NA>       <NA> 2011-11-04

5  5 2006-09-01 2005-08-10       <NA>       <NA> 2011-11-04

6  6 2007-09-04 2011-10-05       <NA>       <NA> 2011-11-04

7  7 2005-10-25       <NA>       <NA> 2011-11-04 2011-11-04

 

 

 

##################  Reproducible code and data #####################################

 

library(dplyr)

library(lubridate)

library(zoo)

# data object - description of the

 

temp <- "id  mrjdate cocdate inhdate haldate

1     2004-11-04 2008-07-18 2005-07-07 2007-11-07

2             NA         NA         NA         NA    

3     2009-10-24         NA 2011-10-13         NA

4     2007-10-10         NA         NA         NA

5     2006-09-01 2005-08-10         NA         NA

6     2007-09-04 2011-10-05         NA         NA

7     2005-10-25         NA         NA 2011-11-04"

 

# read the data object

 

data1 <- read.table(textConnection(temp),

                    colClasses=c("character", "Date", "Date", "Date", "Date"), 

                    header=TRUE, as.is=TRUE

                    )

# create a new column

 

data2 <- mutate(data1,

                oidflag= ifelse(is.na(mrjdate) & is.na(cocdate) & is.na(inhdate)  & is.na(haldate), NA,

                                  max(mrjdate, cocdate, inhdate, haldate,na.rm=TRUE )

                                )

                )

 

# convert to date

data2$oidflag = as.Date(data2$oidflag, origin="1970-01-01")

 

# print records

 

print (data2)

Reply all
Reply to author
Forward
0 new messages