Find closest date/time match?


Mark Knecht

Jul 21, 2009, 3:28:36 PM
to Bay Area R Helpers
Hi,
What command might be best for finding a 'close' match?

I have two data.frames:

1) DF1 has events that happen on a fast time increment - say 1 minute.
There are dates and times for each event called EnDate & EnTime.

2) DF2 has events that change more slowly, like 15 minutes, 60
minutes, etc. (Or possibly daily or monthly later) The fields are just
Date & Time.

I need to find the event in DF2 that most closely matches the
date/time from each DF1 row.

It seems like the match logic is basically ( (DF1$EnDate ==
DF2$Date) & (DF1$EnTime >= DF2$Time) ), or something close to
that, but I'm not sure what the right command is to look for a
'best match' for each row in DF1.
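
A minimal sketch of that match logic for a single DF1 row, assuming 'best match' means the latest DF2 row on the same date whose Time is at or before EnTime. Only the column names come from the post; the toy values and the names `ok` and `best` are illustrative.

```r
# Toy data: column names follow the post, values are illustrative.
DF1 <- data.frame(EnDate = 1090721, EnTime = 1208)
DF2 <- data.frame(Date = rep(1090721, 4),
                  Time = c(1145, 1200, 1215, 1230))

# Candidate DF2 rows: same date, time not later than the DF1 event.
ok <- DF2$Date == DF1$EnDate[1] & DF2$Time <= DF1$EnTime[1]

# Take the latest candidate (assumes DF2 is sorted by Date, then Time).
best <- tail(DF2[ok, ], 1)
best
```

This only handles one DF1 row at a time and never looks back to a previous date, so it is a starting point rather than a general answer.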

Thanks in advance,
Mark

Jim Porzak

Jul 21, 2009, 6:19:13 PM
to Mark Knecht, Bay Area R Helpers
Hey Mark,

I'd suggest you start by combining your date & time into a single column. See ?as.POSIXct.
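
One possible way to do that combining step in base R is sketched below. It assumes the dates are encoded as 1YYMMDD (so 1090721 is 2009-07-21) and the times as HHMM without leading zeros (645 is 06:45), which is what the sample data in this thread suggests but is not confirmed. The helper name `to_posix` is made up for illustration.

```r
# Hypothetical helper: combine integer date and time columns into POSIXct.
# Assumed encodings: date = 1YYMMDD (years since 1900), time = HHMM.
to_posix <- function(date, time, tz = "UTC") {
  ISOdatetime(year  = 1900 + date %/% 10000,
              month = (date %/% 100) %% 100,
              day   = date %% 100,
              hour  = time %/% 100,
              min   = time %% 100,
              sec   = 0,
              tz    = tz)
}

stamp <- to_posix(c(1090721, 1090721), c(645, 1208))
format(stamp, "%Y-%m-%d %H:%M")
```

Once both data frames carry a single date/time column like this, comparisons across day boundaries become ordinary numeric comparisons.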

OTOH, if you are really dealing with irregular time series, which it looks like you are, I'd recommend taking a close look at the packages zoo & xts - both have excellent vignettes.

I tend to go straight to xts for my time series stuff.

One thing to be careful of is that POSIX uses time zone info when loading & plotting.
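
A small illustration of the time zone point, using an arbitrary timestamp and two zone names chosen for the example (it assumes the system's time zone database knows "America/Los_Angeles"):

```r
# The same wall-clock string parsed under two different time zones.
utc <- as.POSIXct("2009-07-21 12:08", tz = "UTC")
la  <- as.POSIXct("2009-07-21 12:08", tz = "America/Los_Angeles")

# They are different instants: 12:08 in Los Angeles falls 7 hours after
# 12:08 UTC on this date (daylight saving time is in effect in July).
difftime(la, utc, units = "hours")
```

Passing `tz` explicitly on both data frames avoids this kind of silent offset when the two sources are matched against each other.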

HTH,
Jim Porzak
Ancestry.com
San Francisco, CA
www.linkedin.com/in/jimporzak
use R! Group SF: www.meetup.com/R-Users/

Mark Knecht

Jul 21, 2009, 7:02:00 PM
to Jim Porzak, Bay Area R Helpers
Thanks Jim. I'll check them out.

I'm sort of able to dig into the data just using some simple subset
commands, but this only covers checking the date and it's only for the
last event in the DF1 file. Nonetheless it shows me some of the
market data for today's market. Now, if I can figure out how to:

1) Get the right time record
2) and do this for 1000's of records in DF1

then I'll be in pretty good shape.
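
Both steps can be sketched in one vectorized call with base R's findInterval(), assuming the date and time columns have already been combined into a single POSIXct column. The column names `when`, `bar_time` and `bar_close` are invented for this example, and the data are toy values.

```r
# Toy data with times already combined into POSIXct (UTC assumed).
t0  <- as.POSIXct("2009-07-21 12:00", tz = "UTC")
DF2 <- data.frame(when  = t0 + c(0, 15, 30, 45) * 60,
                  Close = c(522.2, 522.2, 522.4, 523.0))
DF1 <- data.frame(when = t0 + c(8, 15, 44, 50) * 60)

# findInterval() needs the lookup vector sorted in increasing order.
# For each DF1 time it returns the index of the last DF2 time <= it,
# or 0 when the DF1 time precedes every DF2 time.
idx <- findInterval(as.numeric(DF1$when), as.numeric(DF2$when))
idx[idx == 0] <- NA          # no earlier DF2 row available

DF1$bar_time  <- DF2$when[idx]
DF1$bar_close <- DF2$Close[idx]
DF1
```

This matches each DF1 event to the most recent DF2 row at or before it, which is one reading of 'best match'; finding the nearest row in either direction would also need a comparison against row `idx + 1`.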

Cheers,
Mark


> tail(SystemLongWinners, 1)
     Trade PosType  EnDate EnTime  ExDate ExTime PL_Pos
1325  1325       1 1090721   1208 1090721   1313     74
> tail(SystemLongWinners, 1)$EnDate
[1] 1090721
> subset(Market_TF_15Min_MA, Date>=tail(SystemLongWinners, 1)$EnDate)[1:7]
         Date Time  Open  High   Low Close Volume
40745 1090721  645 528.6 529.0 523.1 523.4   6115
40746 1090721  700 523.4 523.9 521.3 523.3   4100
40747 1090721  715 523.3 523.7 520.0 521.5   3121
40748 1090721  730 521.4 523.1 521.0 522.3   2155
40749 1090721  745 522.4 524.9 522.2 523.9   3284
40750 1090721  800 524.0 524.0 522.4 523.6   2008
40751 1090721  815 523.6 524.2 521.6 522.2   1578
40752 1090721  830 522.2 523.5 522.1 522.8   1391
40753 1090721  845 522.9 523.3 521.3 521.3   1577
40754 1090721  900 521.3 521.6 518.3 518.8   1679
40755 1090721  915 518.7 519.9 518.5 519.7   1121
40756 1090721  930 519.6 519.8 518.8 519.4    788
40757 1090721  945 519.3 519.6 518.1 518.8   1591
40758 1090721 1000 518.6 519.2 517.2 518.0   1623
40759 1090721 1015 517.9 517.9 516.0 517.4   2194
40760 1090721 1030 517.3 519.3 517.3 519.2   1998
40761 1090721 1045 519.2 519.7 518.7 519.1   1203
40762 1090721 1100 519.2 520.0 519.1 519.8   1751
40763 1090721 1115 519.8 521.1 519.6 521.0   1622
40764 1090721 1130 521.0 521.7 520.1 520.7   1265
40765 1090721 1145 520.8 521.7 519.6 521.3   1606
40766 1090721 1200 521.4 522.5 520.9 522.2   1327
40767 1090721 1215 522.1 522.6 521.6 522.2   1273
40768 1090721 1230 522.3 522.9 521.7 522.4   1269
40769 1090721 1245 522.5 523.0 521.7 523.0   1884
40770 1090721 1300 523.0 523.8 522.4 522.9   3420
40771 1090721 1315 522.9 525.7 522.9 524.6   2724