zoo question


Mark Knecht

Aug 27, 2009, 11:40:28 AM
to Bay Area R Helpers
Hi,
Are there zoo users here? I hope so. If so, how would I create
a zoo object starting on 1/1/2004 and running through today's date at a
daily frequency, i.e. one that matches the calendar correctly?

Thanks,
Mark

Jim Porzak

Aug 27, 2009, 1:19:16 PM
to Mark Knecht, Bay Area R Helpers
Hey Mark,

I'm a great fan of zoo & xts - which adds some interesting extensions.

Using seq() on Date objects is pretty natural:

library(zoo)
library(xts)

days <- seq(as.Date("2004-01-01"), Sys.Date(), by = "day")
MyZoo <- zoo(rnorm(length(days)), order.by = days)
str(MyZoo)
summary(MyZoo)

## show off xts a bit:
MyXts <- as.xts(MyZoo)
plot(MyXts["2009-08"])


HTH,
Jim Porzak
Ancestry.com
San Francisco, CA
www.linkedin.com/in/jimporzak
use R! Group SF: www.meetup.com/R-Users/

Mark Knecht

Aug 27, 2009, 1:42:41 PM
to Jim Porzak, Bay Area R Helpers
Jim,
This looks pretty interesting. Thanks!

It's interesting to me that MyZoo doesn't have any dimension. When
I print MyZoo it appears to be something horizontal, but again,
dimensionless. MyXts is vertical which is the way my current data
frame is arranged.

> dim(MyZoo)
NULL
> dim(MyXts)
[1] 2066 1
>

I think MyXts might work well for my needs. I can compare the dates
in MyXts with the dates in my other data.frames and, where I find a
match, copy the data into MyXts.

Thanks,
Mark

NOTE: I sent a similar question to R-help just before your response
came in. That was more impatience on my part! - MWK
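
The dim() difference described in this message can be reproduced in isolation. A zoo series built from a plain vector stays vector-like, so it carries no dim attribute, while a series built from a one-column matrix reports rows and columns, much as the xts object does. This is a minimal sketch, assuming the zoo package is installed; the names dates, z_vec, and z_mat are made up for illustration.

```r
library(zoo)

# Five consecutive calendar days (illustrative values only).
dates <- as.Date("2009-08-01") + 0:4

# Vector-backed series: no dim attribute, prints horizontally.
z_vec <- zoo(c(10, 20, 30, 40, 50), order.by = dates)

# Matrix-backed series: one column, prints vertically.
z_mat <- zoo(matrix(c(10, 20, 30, 40, 50), ncol = 1), order.by = dates)

dim_vec <- dim(z_vec)    # NULL, as in the dim(MyZoo) output
dim_mat <- dim(z_mat)    # 5 rows, 1 column
n_vec   <- length(z_vec) # length() still reports the number of observations
```

Either form holds the same dates and values; the difference matters mainly for code that relies on dim(), nrow(), or column indexing.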

Ted Dunning

Aug 27, 2009, 1:45:50 PM
to Mark Knecht, Jim Porzak, Bay Area R Helpers

So post your answer to your own question to help people who search the archives!


On Thu, Aug 27, 2009 at 10:42 AM, Mark Knecht <markk...@gmail.com> wrote:
NOTE: I sent a similar question to R-help just before your response
came in. That was more impatience on my part! - MWK



--
Ted Dunning, CTO
DeepDyve

Mark Knecht

Aug 27, 2009, 2:17:13 PM
to Ted Dunning, Jim Porzak, Bay Area R Helpers
I don't understand. What answer to my own question? I don't have one yet.

- Mark

Mark Knecht

Aug 27, 2009, 4:48:46 PM
to Jim Porzak, Bay Area R Helpers
Hi again Jim,
OK, so this language once again throws me for loops every time I
try to do something new. In the MyXts data the dates aren't data but
rather seem to be more like names. I'm not sure what they really are?
How do I use them once I have MyXts?

I have a data.frame with sparse data. I'd like to first move a copy
of that data from my data.frame to MyXts when there is a date match,
but I'm having trouble finding a way to relate to the date, such as
something like the following which won't work but hopefully
communicates the query:

MyXts = subset(My.data.frame, My.data.frame$EnDate==MyXts$Date)

Cheers,
Mark


Jim Porzak

Aug 27, 2009, 8:02:43 PM
to Mark Knecht, Bay Area R Helpers
Right! xts inherits a lot of zoo stuff. See the vignettes for both
packages for a lot of hints!

index(MyXts) pulls out the dates & coredata(MyXts) the data.

HTH, Jim
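
As a small illustration of index() and coredata(), the sketch below builds a three-day series and pulls the two parts apart. It assumes the zoo package is installed; the series z and its values are invented for the example. It also shows that a zoo object can be subscripted by a value of its own index class, which is one way to "use the dates" once the object exists.

```r
library(zoo)

# Toy daily series with made-up values.
z <- zoo(c(1.5, -2.0, 3.25), order.by = as.Date("2009-08-25") + 0:2)

idx  <- index(z)     # the Date index (the part that looks like names)
vals <- coredata(z)  # the underlying numeric data, without the index

# Select a single observation by its date rather than by position.
one_day <- z[as.Date("2009-08-26")]

# Select everything from a given date onward.
recent <- window(z, start = as.Date("2009-08-26"))
```

The same index() and coredata() calls apply to an xts object, since xts builds on zoo.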

Mark Knecht

Aug 27, 2009, 9:58:46 PM
to Jim Porzak, Bay Area R Helpers
index and coredata are helping but I cannot quite close the deal yet.
I'll give it some thought overnight and then try again tomorrow.

My basic issue at this point is that MyZoo has an entry for every
possible date, while MyData has entries only for the dates where there
is data. I need to find all data in MyData for each date in MyZoo and
then use coredata to write it in. Problem is MyZoo and MyData aren't
the same lengths which causes all the ideas I've tried so far to fail.

I'll get back on it tomorrow morning when I'm fresher.

Thanks!

cheers,
Mark
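
One way around the length mismatch described above is to look up where each sparse date falls in the full calendar with match(), then assign only into those positions. This is a sketch under stated assumptions: the zoo package is installed, the data frame sparse and its columns are hypothetical stand-ins for MyData, and each date appears at most once (multiple rows per date would need to be summed first).

```r
library(zoo)

# Full calendar, initialised to zero for every day.
days     <- seq(as.Date("2009-08-01"), as.Date("2009-08-07"), by = "day")
calendar <- zoo(rep(0, length(days)), order.by = days)

# Hypothetical sparse data: rows exist only for dates that have a value.
sparse <- data.frame(
  MyDate = as.Date(c("2009-08-03", "2009-08-06")),
  PL     = c(174, -26)
)

# Position of each sparse date within the calendar index (NA if absent).
pos <- match(sparse$MyDate, index(calendar))

# Write values only where a matching calendar date was found.
ok <- !is.na(pos)
coredata(calendar)[pos[ok]] <- sparse$PL[ok]
```

Because the assignment is driven by positions rather than by element-wise comparison, the two objects never have to be the same length.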

Ted Dunning

Aug 28, 2009, 12:14:40 AM
to Mark Knecht, Jim Porzak, Bay Area R Helpers

merge?


On Thu, Aug 27, 2009 at 6:58 PM, Mark Knecht <markk...@gmail.com> wrote:
My basic issue at this point is that MyZoo has an entry for every
possible date, while MyData has entries only for the dates where there
is data.
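
The merge suggestion can also be sketched directly on zoo objects: merging a sparse series with a zero-width series that carries only the full calendar index pads the missing days with NA. This is an illustration rather than the approach eventually taken in the thread, and it assumes the zoo package is installed; the names trades, days, and padded are made up.

```r
library(zoo)

# Sparse series: values exist only on two dates (made-up numbers).
trades <- zoo(c(174, -26),
              order.by = as.Date(c("2009-08-03", "2009-08-06")))

# Every calendar day in the range of interest.
days <- seq(as.Date("2009-08-01"), as.Date("2009-08-07"), by = "day")

# zoo(, days) has an index but no data, so the merge only adds dates.
padded <- merge(trades, zoo(, days))

# Days without a trade come back as NA; set them to zero.
coredata(padded)[is.na(coredata(padded))] <- 0
```

The result has one entry per calendar day, so cumsum() over it produces a time-based rather than an event-based curve.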



Mark Knecht

Aug 28, 2009, 11:43:15 AM
to Ted Dunning, Jim Porzak, Bay Area R Helpers

Good morning Ted,
I think merge could be part of the solution, but take a look at the
code below and tell me how I can possibly change it to get this
working the way I require.

X is the original stock trade data (one month, August only). It is
trade-by-trade, recorded whenever trades happen. Some days there are no
trades; other days there are multiple trades.

Y is a new data.frame created to merge against. It has only the date
range that I'm interested in looking at (two months by design, to ensure
full date coverage even if August has a trade on every date).

Z is my attempt at using merge for the first time.

What works:

- merge works fine, in a sense. All trades in X matching dates in Y
are copied to Z.

What doesn't work yet:

- I need the sum of PL_Pos on a given date. For instance, Z has 3
trades on 2009-08-27. I need these values summed into a single value
and only that value merged into Z, so that I have a single event on
2009-08-27.

- I need the NAs converted to 0.

If I could figure out how to do those two things then I'd be able
to make the calendar-based plot that I need.

For my purposes - if it's easier - Z doesn't need to be a complete
merge. It only needs MyDate from Y and the sum of X$PL_Pos for each
date.

Off looking for answers.

Thanks,
Mark

TStoDate = function (TSDate) {
    X = strptime(TSDate + 19e6L, "%Y%m%d")
    return(as.Date(X))
}

X = structure(list(Trade = 1951:1971, PosType = c(1, 1, -1, -1, -1,
1, 1, 1, -1, -1, 1, 1, 1, 1, 1, -1, 1, -1, -1, 1, 1), EnDate = c(1090803,
1090804, 1090805, 1090806, 1090806, 1090810, 1090811, 1090812,
1090813, 1090817, 1090819, 1090820, 1090820, 1090824, 1090825,
1090825, 1090826, 1090826, 1090827, 1090827, 1090827), EnTime = c(1004,
812, 641, 706, 1103, 1117, 633, 641, 641, 645, 641, 641, 958,
641, 919, 1037, 650, 853, 641, 932, 932), ExDate = c(1090803,
1090804, 1090805, 1090806, 1090806, 1090810, 1090811, 1090812,
1090813, 1090817, 1090819, 1090820, 1090820, 1090824, 1090825,
1090825, 1090826, 1090826, 1090827, 1090827, 1090827), ExTime = c(1259,
1058, 1258, 1258, 1259, 1311, 702, 1258, 1258, 1258, 1258, 1258,
1313, 1258, 1037, 1313, 1311, 1313, 1258, 1313, 1300), PL_Pos = c(174,
-26, 614, 344, -26, 414, -626, 544, -106, -146, 1004, 344, 224,
-716, -176, 44, 354, -346, -296, 564, 354)), .Names = c("Trade",
"PosType", "EnDate", "EnTime", "ExDate", "ExTime", "PL_Pos"), class =
"data.frame", row.names = c("733",
"734", "3631", "3641", "736", "2403", "2413", "3651", "3661",
"3671", "3681", "3691", "1303", "3701", "1304", "1305", "2432",
"1306", "3712", "1307", "4214"))

X$MyDate = TStoDate(X$EnDate)
X

days <- seq(as.Date("2009-07-01"), Sys.Date(), by = "day")
Y = data.frame(MyDate=days)
Y

Z = merge(X,Y, by.x="MyDate", by.y="MyDate", all.y=TRUE)
Z

dim(X)
dim(Y)
dim(Z)

X11(width=8, height=4)
par(mfrow=c(1,2))
plot(cumsum(X$PL_Pos), type="l")
plot(cumsum(Z$PL_Pos), type="l")

Mark Knecht

Aug 28, 2009, 2:34:32 PM
to Ted Dunning, Jim Porzak, Bay Area R Helpers
OK - here's what I ended up with to convert my event-based chart into
a time-based chart. Pretty straightforward.

The TStoDate function is only required to translate this specific
date format into something R understands. Others might not need that
for their data.frames.

Cheers,
Mark

TStoDate = function (TSDate) {
    X = strptime(TSDate + 19e6L, "%Y%m%d")
    return(as.Date(X))
}

X = structure(list(Trade = 1951:1971, PosType = c(1, 1, -1, -1, -1,
1, 1, 1, -1, -1, 1, 1, 1, 1, 1, -1, 1, -1, -1, 1, 1), EnDate = c(1090803,
1090804, 1090805, 1090806, 1090806, 1090810, 1090811, 1090812,
1090813, 1090817, 1090819, 1090820, 1090820, 1090824, 1090825,
1090825, 1090826, 1090826, 1090827, 1090827, 1090827), EnTime = c(1004,
812, 641, 706, 1103, 1117, 633, 641, 641, 645, 641, 641, 958,
641, 919, 1037, 650, 853, 641, 932, 932), ExDate = c(1090803,
1090804, 1090805, 1090806, 1090806, 1090810, 1090811, 1090812,
1090813, 1090817, 1090819, 1090820, 1090820, 1090824, 1090825,
1090825, 1090826, 1090826, 1090827, 1090827, 1090827), ExTime = c(1259,
1058, 1258, 1258, 1259, 1311, 702, 1258, 1258, 1258, 1258, 1258,
1313, 1258, 1037, 1313, 1311, 1313, 1258, 1313, 1300), PL_Pos = c(174,
-26, 614, 344, -26, 414, -626, 544, -106, -146, 1004, 344, 224,
-716, -176, 44, 354, -346, -296, 564, 354)), .Names = c("Trade",
"PosType", "EnDate", "EnTime", "ExDate", "ExTime", "PL_Pos"), class =
"data.frame", row.names = c("733",
"734", "3631", "3641", "736", "2403", "2413", "3651", "3661",
"3671", "3681", "3691", "1303", "3701", "1304", "1305", "2432",
"1306", "3712", "1307", "4214"))

X$MyDate = TStoDate(X$EnDate)

X1 = aggregate(X$PL_Pos, list(X$MyDate), sum)
colnames(X1)<-c("MyDate","PL_SUM")

days <- seq(as.Date("2009-07-01"), Sys.Date(), by = "day")
Y = data.frame(MyDate=days)

Z = merge(X1,Y, by.x="MyDate", by.y="MyDate", all.y=TRUE)
Z

Z$PL_SUM[is.na(Z$PL_SUM)] <- 0

dim(X)
dim(Y)
dim(Z)

X11(width=12, height=6)


par(mfrow=c(1,2))
plot(cumsum(X$PL_Pos), type="l")

plot(cumsum(Z$PL_SUM) ~ Z$MyDate, type="l")
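
The date conversion done by TStoDate can be checked on a few values. Adding 19000000 to a number such as 1090803 gives 20090803, which reads as YYYYMMDD. The helper below is a hypothetical variant written for this check, not the function from the thread; it uses only base R and parses the padded string directly with as.Date. The third input, 990115, is an invented pre-2000 value included to show that the same offset covers those dates.

```r
# Hypothetical helper: same idea as TStoDate, using as.Date on a string.
ts_to_date <- function(ts_date) {
  # e.g. 1090803 + 19000000 = 20090803, read as YYYYMMDD
  ymd <- sprintf("%08d", as.integer(ts_date) + 19000000L)
  as.Date(ymd, format = "%Y%m%d")
}

converted <- ts_to_date(c(1090803, 1090827, 990115))
print(converted)
```

Going through a character string avoids any time-zone handling, since no date-time object is created along the way.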
