[R] subsetting and Dates


Denis Chabot

May 23, 2013, 5:35:49 PM
to R-h...@r-project.org
Hi,

I am trying to understand why creating Date variables does not work if I subset to avoid NAs.

I had problems creating these Date variables in my code and I thought that the presence of NAs was the cause. So I used a condition to avoid NAs.

It turns out that NAs are not a problem and I do not need to subset, but I'd like to understand why subsetting causes the problem.
The strange numbers I start with are what I get when I read an Excel sheet with the function read.xls() from package gdata.

dat1 = c(41327, 41334, 41341, 41348, 41355, 41362, 41369, 41376, 41383, 41390, 41397)
dat2 = dat1
dat2[c(5,9)]=NA
Data = data.frame(dat1,dat2)

keep1 = !is.na(Data$dat1)
keep2 = !is.na(Data$dat2)


Data$Dat1a = as.Date(Data[,"dat1"], origin="1899-12-30")
Data$Dat1b[keep1] = as.Date(Data[keep1,"dat1"], origin="1899-12-30")
Data$Dat2a = as.Date(Data[,"dat2"], origin="1899-12-30")
Data$Dat2b[keep2] = as.Date(Data[keep2,"dat2"], origin="1899-12-30")

Data
    dat1  dat2      Dat1a Dat1b      Dat2a Dat2b
1  41327 41327 2013-02-22 15758 2013-02-22 15758
2  41334 41334 2013-03-01 15765 2013-03-01 15765
3  41341 41341 2013-03-08 15772 2013-03-08 15772
4  41348 41348 2013-03-15 15779 2013-03-15 15779
5  41355    NA 2013-03-22 15786       <NA>    NA
6  41362 41362 2013-03-29 15793 2013-03-29 15793
7  41369 41369 2013-04-05 15800 2013-04-05 15800
8  41376 41376 2013-04-12 15807 2013-04-12 15807
9  41383    NA 2013-04-19 15814       <NA>    NA
10 41390 41390 2013-04-26 15821 2013-04-26 15821
11 41397 41397 2013-05-03 15828 2013-05-03 15828

So variables Dat1b and Dat2b are not converted to Date class.


sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] fr_CA.UTF-8/fr_CA.UTF-8/fr_CA.UTF-8/C/fr_CA.UTF-8/fr_CA.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] gdata_2.12.0

loaded via a namespace (and not attached):
[1] gtools_2.7.0

Thanks in advance,

Denis
______________________________________________
R-h...@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

arun

May 23, 2013, 9:44:19 PM
to Denis Chabot, R help
You could convert those columns to "Date" class by:


Data[,c(4,6)]<-lapply(Data[,c(4,6)],as.Date,origin="1970-01-01")
#or
Data[,c(4,6)]<-lapply(Data[,c(4,6)],function(x) structure(x,class="Date"))


#    dat1  dat2      Dat1a      Dat1b      Dat2a      Dat2b
#1  41327 41327 2013-02-22 2013-02-22 2013-02-22 2013-02-22
#2  41334 41334 2013-03-01 2013-03-01 2013-03-01 2013-03-01
#3  41341 41341 2013-03-08 2013-03-08 2013-03-08 2013-03-08
#4  41348 41348 2013-03-15 2013-03-15 2013-03-15 2013-03-15
#5  41355    NA 2013-03-22 2013-03-22       <NA>       <NA>
#6  41362 41362 2013-03-29 2013-03-29 2013-03-29 2013-03-29
#7  41369 41369 2013-04-05 2013-04-05 2013-04-05 2013-04-05
#8  41376 41376 2013-04-12 2013-04-12 2013-04-12 2013-04-12
#9  41383    NA 2013-04-19 2013-04-19       <NA>       <NA>
#10 41390 41390 2013-04-26 2013-04-26 2013-04-26 2013-04-26
#11 41397 41397 2013-05-03 2013-05-03 2013-05-03 2013-05-03
A.K.
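
A note on why origin="1970-01-01" is the right origin at this stage: Dat1b and Dat2b no longer hold Excel serial numbers but days since 1970-01-01, which is how R stores a Date internally. The following is only a minimal sketch using the first value from the table above; the variable names are made up for illustration.

```r
# Excel serial number as returned by read.xls(), taken from row 1 of Data
excel_serial <- 41327

# Converting with the Excel origin gives a proper Date
d <- as.Date(excel_serial, origin = "1899-12-30")
format(d)            # "2013-02-22"

# Dropping the class leaves days since 1970-01-01, as seen in column Dat1b
days <- unclass(d)   # 15758

# Either of the two suggested conversions restores the Date from that number
from_origin    <- as.Date(days, origin = "1970-01-01")
from_structure <- structure(days, class = "Date")
```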

Denis Chabot

May 23, 2013, 10:06:16 PM
to arun, R help
Thank you for the 2 methods to make the columns class Date, but I would really like to know why these variables were not in Date class with my code. Do you know?

Denis

David Winsemius

May 23, 2013, 10:55:55 PM
to Denis Chabot, R help

On May 23, 2013, at 7:06 PM, Denis Chabot wrote:

> Thank you for the 2 methods to make the columns class Date, but I would really like to know why these variables were not in Date class with my code. Do you know?

I suspect that the problem lies in the dispatch to `[<-.<class>` or `$<-`. When the first argument is 'logical', then the first argument is not of class Date and so is not dispatched to `[<-.Date` but rather to .Primitive("[<-"), there being no `$<-.logical` or `[<-.logical`.
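
A minimal sketch of this dispatch behaviour, first on plain vectors and then on a small data frame. One assumption worth flagging: in the original code the column Dat1b does not exist yet, so the object being assigned into appears to be NULL rather than a logical vector; the reasoning is the same, since neither has a Date-aware `[<-` method. All names below (x_null, x_date, df) are hypothetical.

```r
d <- as.Date(c(41327, 41334), origin = "1899-12-30")

# Target has no Date class: the default `[<-` is used and only the
# underlying numbers of `d` are kept
x_null <- NULL
x_null[c(TRUE, TRUE)] <- d
class(x_null)    # "numeric"

# Target is already a Date: `[<-.Date` is dispatched and the class survives
x_date <- as.Date(rep(NA, 2))
x_date[c(TRUE, TRUE)] <- d
class(x_date)    # "Date"

# Same idea applied to a data frame: create the column as Date first,
# then do the subset assignment
df <- data.frame(dat2 = c(41327, NA, 41341))
keep2 <- !is.na(df$dat2)
df$Dat2b <- as.Date(NA)
df$Dat2b[keep2] <- as.Date(df[keep2, "dat2"], origin = "1899-12-30")
class(df$Dat2b)  # "Date"
```

If that reading is right, pre-creating the column as a Date is enough to make the subset assignment from the original post keep its class.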

"Arguably", as it were, someone should write S4 methods for `[<-` and `$<-` that would dispatch to the expected method on a signature where the second argument is a Date or POSIXct class. We might also then want methods for the other two "[/$" indexing classes, "numeric" and "character".

--

David.
David Winsemius
Alameda, CA, USA

arun

May 23, 2013, 10:56:12 PM
to Denis Chabot, R help
I guess it is due to vectorization.
vec1 <- as.Date(Data[,2], origin="1899-12-30")
class(vec1)
#[1] "Date"
as.vector(vec1)
# [1] 15758 15765 15772 15779    NA 15793 15800 15807    NA 15821 15828

head(as.list(vec1),2)
#[[1]]
#[1] "2013-02-22"
#
#[[2]]
#[1] "2013-03-01"

head(data.frame(vec1),2)
#        vec1
#1 2013-02-22
#2 2013-03-01

unlist(as.list(vec1))
# [1] 15758 15765 15772 15779    NA 15793 15800 15807    NA 15821 15828

Also, please check:

http://r.789695.n4.nabble.com/as-vector-with-mode-quot-list-quot-and-POSIXct-td4667533.html

David Winsemius

May 24, 2013, 1:33:57 AM
to arun, R help

On May 23, 2013, at 7:56 PM, arun wrote:

> I guess it is due to vectorization.

The concept of "vectorization" is much broader than the activities of `as.vector`, but it needs a specific functional mechanism to be considered an explanation.

> vec1<- as.Date(Data[,2],origin="1899-12-30")
> class(vec1)
> #[1] "Date"
> as.vector(vec1)
> # [1] 15758 15765 15772 15779 NA 15793 15800 15807 NA 15821 15828

It is certainly true that `as.vector` could unclass a Date-classed vector, but why do you believe this has anything to do with how `$<-` returns its functional result? Setting `trace` on `as.vector` does not result in any signal suggesting that it was called:

> trace('as.vector')
> Data$Dat1b = as.Date(Data[ ,"dat1"], origin="1899-12-30")

# Nothing

> trace( .Primitive("$<-") )
> Data$Dat1b = as.Date(Data[ ,"dat1"], origin="1899-12-30")
trace: `$<-`(`*tmp*`, Dat1b, value = c(15758, 15765, 15772, 15779, 15786,
15793, 15800, 15807, 15814, 15821, 15828))
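
For the subset form used in the original post, the nested replacement can also be written out by hand to see which call loses the class. This is only a sketch, based on how R expands `x$a[i] <- value` into an inner `[<-` followed by an outer `$<-`; the names tmp, inner and rhs are illustrative, and a two-row data frame stands in for Data.

```r
Data <- data.frame(dat1 = c(41327, 41334))
keep1 <- !is.na(Data$dat1)
rhs <- as.Date(Data[keep1, "dat1"], origin = "1899-12-30")

# Data$Dat1b[keep1] <- rhs is evaluated roughly as the two steps below
tmp <- Data
inner <- `[<-`(tmp$Dat1b, keep1, value = rhs)  # tmp$Dat1b is NULL here
class(inner)                                   # "numeric": class already gone

Data <- `$<-`(tmp, "Dat1b", value = inner)     # stores the plain numbers
class(Data$Dat1b)                              # "numeric"
```

In this sketch the Date class is dropped by the inner `[<-` call, before `$<-` ever sees the value.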


>
> head(as.list(vec1),2)
> #[[1]]
> #[1] "2013-02-22"
> #
> #[[2]]
> #[1] "2013-03-01"
> head(data.frame(vec1),2)
> # vec1
> #1 2013-02-22
> #2 2013-03-01
>
>
> unlist(as.list(vec1))
> # [1] 15758 15765 15772 15779 NA 15793 15800 15807 NA 15821 15828
>
>
> Also, please check:
>
> http://r.789695.n4.nabble.com/as-vector-with-mode-quot-list-quot-and-POSIXct-td4667533.html

Interesting but I fail to see the connection to this instance other than R behaving somewhat differently than we might at one time have expected.

--
David.
David Winsemius
Alameda, CA, USA
