Relative cell references?

Mark Knecht

Jul 17, 2009, 11:31:51 AM

to Bay Area R Helpers

Hi,
Here's an example of something I'm finding difficult so far to do in R.
I'm hoping maybe someone can give me a quick example of an equation to
demonstrate good R programming.

DF <- read.table(textConnection("
Trade PosType EnDate EnTime ExDate ExTime PL_Pos DayMargin MarginAvail InPosition TradeProfit Equity PossibleEquity Diff
1 1 1 1040106 1227 1040106 1251 -146 10000 6000 1 -146 9854 9854 0
2 2 1 1040107 641 1040107 1306 294 9854 5854 1 294 10148 10148 0
3 3 1 1040107 915 1040107 1300 164 9854 1854 1 164 10312 10312 0
4 4 1 1040108 909 1040108 1300 184 10312 6312 1 184 10496 10496 0
5 5 1 1040108 930 1040108 1300 124 10312 2312 1 124 10620 10620 0
6 6 1 1040108 953 1040108 1301 24 10312 -1688 0 0 10620 10644 -24
7 7 1 1040109 1241 1040109 1311 -146 10620 6620 1 -146 10474 10498 -24
8 8 1 1040112 641 1040112 1306 344 10474 6474 1 344 10818 10842 -24
9 9 1 1040112 708 1040112 1311 874 10474 2474 1 874 11692 11716 -24
10 10 1 1040113 840 1040113 1311 224 11692 7692 1 224 11916 11940 -24
"),header=TRUE,row.names=1)

DF

Take a look at the DayMargin column. When the date in EnDate
changes the value in DayMargin is the value from the Equity column the
day before, but when the EnDate column doesn't change it's the value
one up in the DayMargin column.

MarginAvail is either DayMargin-4000 on a date change or
MarginAvail(up one cell) -4000 when there's no date change.

In general, if I'm trying to calculate a given row/column in a
data.frame, how do I get a relative reference to another location in
the same data.frame???

Thanks,
Mark

Mark Knecht

Jul 17, 2009, 11:40:06 AM

to Bay Area R Helpers

Possibly this will survive email better?

structure(list(Trade = 1:10, PosType = c(1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L), EnDate = c(1040106L, 1040107L, 1040107L,
1040108L, 1040108L, 1040108L, 1040109L, 1040112L, 1040112L, 1040113L
), EnTime = c(1227L, 641L, 915L, 909L, 930L, 953L, 1241L, 641L,
708L, 840L), ExDate = c(1040106L, 1040107L, 1040107L, 1040108L,
1040108L, 1040108L, 1040109L, 1040112L, 1040112L, 1040113L),
ExTime = c(1251L, 1306L, 1300L, 1300L, 1300L, 1301L, 1311L,
1306L, 1311L, 1311L), PL_Pos = c(-146L, 294L, 164L, 184L,
124L, 24L, -146L, 344L, 874L, 224L), DayMargin = c(10000L,
9854L, 9854L, 10312L, 10312L, 10312L, 10620L, 10474L, 10474L,
11692L), MarginAvail = c(6000L, 5854L, 1854L, 6312L, 2312L,
-1688L, 6620L, 6474L, 2474L, 7692L), InPosition = c(1L, 1L,
1L, 1L, 1L, 0L, 1L, 1L, 1L, 1L), TradeProfit = c(-146L, 294L,
164L, 184L, 124L, 0L, -146L, 344L, 874L, 224L), Equity = c(9854L,
10148L, 10312L, 10496L, 10620L, 10620L, 10474L, 10818L, 11692L,
11916L), PossibleEquity = c(9854L, 10148L, 10312L, 10496L,
10620L, 10644L, 10498L, 10842L, 11716L, 11940L), Diff = c(0L,
0L, 0L, 0L, 0L, -24L, -24L, -24L, -24L, -24L)), .Names = c("Trade",
"PosType", "EnDate", "EnTime", "ExDate", "ExTime", "PL_Pos",
"DayMargin", "MarginAvail", "InPosition", "TradeProfit", "Equity",
"PossibleEquity", "Diff"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10"))

Ted Dunning

Jul 17, 2009, 1:42:59 PM

to Mark Knecht, Bay Area R Helpers

In general, you don't do that. Not as such anyway.

The issue is that spreadsheets conflate execution (program) with data, whereas R separates these concepts. Thus a data frame is data-only and code is program-only.

You run code to get your data. But if you change part of your data, the code does not magically run again.

To get the effect that you are after, you can do a variety of things. The simplest conceptually is to use indexing with a vector. If you have

n <- dim(DF)[1]

as the number of rows in your data frame, you can look at DF[1:(n-1),] versus DF[2:n,] to find the differences you are after.
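A minimal sketch of that row-versus-row-above comparison, using only the EnDate values from the posted data. The names new_day and prev_EnDate are illustrative, and treating the first row as the start of a new day is an assumption:

```r
# EnDate values copied from the data posted at the top of the thread.
EnDate <- c(1040106, 1040107, 1040107, 1040108, 1040108,
            1040108, 1040109, 1040112, 1040112, 1040113)
n <- length(EnDate)

# Compare rows 2..n with the rows directly above them (1..n-1).
# Row 1 has nothing above it, so it is treated as the start of a day.
new_day <- c(TRUE, EnDate[2:n] != EnDate[1:(n - 1)])

# "The cell one row up" as a vector: drop the last value and pad the top with NA.
prev_EnDate <- c(NA, EnDate[-n])

new_day
```

The shifted copy (prev_EnDate) is the closest thing to a spreadsheet-style relative reference: it lines up with the original column, so ordinary vector arithmetic can then use it.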

For many other situations where you actually want running differences, you can use the diff function:

> diff(DF$ExDate)
[1] 1 0 1 0 0 1 3 0 1

Note that differences of this sort have fewer elements than the original so if you are putting the difference back into DF, you have to think a moment:

> DF$DiffAbove = 0
> DF[2:n,]$DiffAbove = diff(DF$ExDate)
> DF$DiffBelow = 0
> DF[1:(n-1),]$DiffBelow = diff(DF$ExDate)

If these diffs are purely temporary, then you don't need to put them in the data frame, but can keep them in local variables.
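Applied to the DayMargin and MarginAvail columns from the original question, the shifted-vector idea might look like the sketch below. It takes the posted Equity column as given, and it assumes an opening balance of 10000 and a margin of 4000 per trade (both read off the posted data). All helper names are illustrative.

```r
# Columns copied from the data posted at the top of the thread.
EnDate <- c(1040106, 1040107, 1040107, 1040108, 1040108,
            1040108, 1040109, 1040112, 1040112, 1040113)
Equity <- c(9854, 10148, 10312, 10496, 10620,
            10620, 10474, 10818, 11692, 11916)
n <- length(EnDate)

start_equity <- 10000  # assumption: opening balance implied by row 1
margin <- 4000         # assumption: cash required per trade

# "Equity one row up": shift down by one, seeding row 1 with the opening balance.
prev_equity <- c(start_equity, Equity[-n])

# TRUE on the first trade of each day; cumsum() turns that into a day counter.
new_day <- c(TRUE, diff(EnDate) != 0)
day_id <- cumsum(new_day)

# DayMargin: previous equity at each day's first trade, repeated for that day.
DayMargin <- prev_equity[new_day][day_id]

# MarginAvail drops by one margin unit for each trade within the day.
trade_in_day <- ave(seq_len(n), day_id, FUN = seq_along)
MarginAvail <- DayMargin - margin * trade_in_day
```

Both results match the DayMargin and MarginAvail columns in the posted data. Note that in the spreadsheet Equity itself appears to depend on whether margin was available, so a from-scratch calculation would need a row-by-row loop rather than this purely vectorized form.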

On Fri, Jul 17, 2009 at 8:31 AM, Mark Knecht <markk...@gmail.com> wrote:

In general, if I'm trying to calculate a given row/column in a
data.frame, how do I get a relative reference to another location in
the same data.frame???

--
Ted Dunning, CTO
DeepDyve

Mark Knecht

Jul 17, 2009, 2:12:35 PM

to Ted Dunning, Bay Area R Helpers

Ted,
Thanks very much. Your response is sort of along the direction I
was going but it was feeling 'difficult'. Not so much to code it but
more that it wasn't very elegant and maybe there were some packaged
commands out there that already addressed this.

I'm sure that this stuff is probably coming up for me a bit more
than some as a lot of what I work on is time series oriented so the
idea of looking 2 minutes back, 3 trades back, previous day, etc., is
pretty natural.

I like the description of the difference between R and a
spreadsheet and really I'm trying to get used to doing the same things
I was able to accomplish in Excel but now within R. The R experience
so far has borne a lot of good results and is impacting how I might use
some of my trading systems. Interesting stuff.

Cheers,
Mark

Ted Dunning

Jul 17, 2009, 2:41:00 PM

to Mark Knecht, Bay Area R Helpers

R has explicit support for time series as well. Check out help("ts"). Your data is more rooted in daily sorts of things so that isn't likely what you want, but it may be good to know about.
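For reference, a small sketch of what "looking one step back" looks like with a ts object. The five equity values are copied from the posted data, and treating them as a regularly spaced series is an assumption made only for illustration:

```r
# First five Equity values from the posted data, treated as a regular series.
equity <- ts(c(9854, 10148, 10312, 10496, 10620))

# Change from the previous observation.
step_change <- diff(equity)

# Same values shifted one step later in time (stats::lag avoids clashes
# with lag() functions from other packages).
prev_equity <- stats::lag(equity, -1)

# cbind() on ts objects aligns by time and pads with NA where needed.
aligned <- cbind(equity, prev = prev_equity)
aligned
```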

On Fri, Jul 17, 2009 at 11:12 AM, Mark Knecht <markk...@gmail.com> wrote:

I'm sure that this stuff is probably coming up for me a bit more
than some as a lot of what I work on is time series oriented so the
idea of looking 2 minutes back, 3 trades back, previous day, etc., is
pretty natural.

Mark Knecht

Jul 17, 2009, 3:19:58 PM

to Ted Dunning, Bay Area R Helpers

I'll check that out.

The real data is not really daily stuff. It's actually in minutes. In
this data I'm only displaying the final results of the trade (entry
time, exit time, money made or lost on the trade) and not the complete
path a trade took to get from start to stop.

I have a different file which has the market data, so if I wanted to
chart the path a trade took I'd go into that file at EnDate/EnTime and
get the market data up to ExTime.

However the issue I'm struggling with goes like this:

1) Assume I have so much money to trade, say $10K, and I want to trade
it across so many systems, say 5.

2) Each trade requires $4K in cash or the broker won't make the trade
so on day 1 I can make at most 2 trades. (I.e. - using $8K)

3) If the account builds value then I get to $12K and can make 3
trades that day, or $16K and can make 4 trades that day, etc.
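The arithmetic in steps 1-3 can be sketched with integer division. The variable names are illustrative, and capping at the number of systems is an assumption about how the rule would be applied:

```r
margin_per_trade <- 4000  # cash the broker requires per trade
n_systems <- 5            # number of systems available to trade

equity <- c(10000, 12000, 16000)

# Whole trades the account can fund, capped at the number of systems.
max_trades <- pmin(equity %/% margin_per_trade, n_systems)
max_trades
```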

I am interested in modeling, in R, different ways of using my capital
over different market types using multiple trading systems. For
instance, if I have $10K and can make 2 trades, am I better off to
apply 100% to the first trade and not be able to take the second trade
that day? Or am I better off to apply 50% to the first system and 50%
to the second remembering that when I use the first 50% I cannot know
whether the second trade will ever come or not so I might end up
trading only 50% that day. Or maybe I apply 100% of the capital to the
first trade but exit 50% of the position when the second trade
arrives. What happens when I have $40K and can choose different
position sizes for different systems, etc.

All of these ideas result in some combined equity curve and then I get
to model the statistics of what the differences are, etc. If I'm
shooting for maximum equity gain then maybe I do one thing. If I'm
shooting for the most linear equity gain then possibly I do something
else. Maybe it's different in bull and bear markets, etc.

This is conceptually straightforward in Excel but falls apart quickly
in practice as the spreadsheet isn't that good at playing all these
games. On the other hand extra coding in R doesn't bother me too much
as my data set is probably less than 10K-20K trades and R can do the
work in a minute or two almost no matter what my code looks like. (Or
so it seems so far.)

My work in R has already borne fruit on individual systems. Some simple
data mining ideas have shown that I can reduce the number of trades I
take and increase my profits. I'm better off to trade certain aspects
of the system and not the whole system. However in those studies I use
all trades offered me and assume I have enough money that none of the
above is an issue. When I start working on combinations of systems I
don't want to make those assumptions so I have to deal with this sort
of problem in the data.

Thanks,
Mark

Ted Dunning

Jul 17, 2009, 5:37:26 PM

to Mark Knecht, Bay Area R Helpers

It is still a very strange beast because of the closing of the markets. That is what I meant by "not really a time-series".

On Fri, Jul 17, 2009 at 12:19 PM, Mark Knecht <markk...@gmail.com> wrote:

The real data is not really daily stuff. It's actually in minutes.

Ted Dunning

Jul 17, 2009, 5:41:38 PM

to Mark Knecht, Bay Area R Helpers

This should be a piece of cake. You should be able to create a simulated market and a program that represents a strategy. Running the strategy multiple times against different kinds of markets should give you the statistics you need. Then you can do modeling on the strategy options relative to the stochastic return and find out all kinds of things.
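As a rough sketch of "a program that represents a strategy", the function below replays a trade list under a per-trade margin rule. The function name run_strategy, its arguments, and the rule that a trade is skipped once the day's margin runs out are all assumptions inferred from the posted data, not an established method:

```r
# Trade list copied from the data posted at the top of the thread.
trades <- data.frame(
  EnDate = c(1040106, 1040107, 1040107, 1040108, 1040108,
             1040108, 1040109, 1040112, 1040112, 1040113),
  PL_Pos = c(-146, 294, 164, 184, 124, 24, -146, 344, 874, 224)
)

# Hypothetical helper: replay trades in order under a per-trade margin rule.
run_strategy <- function(trades, start_equity = 10000, margin = 4000) {
  n <- nrow(trades)
  equity <- numeric(n)
  taken <- logical(n)
  cur_equity <- start_equity
  avail <- 0
  for (i in seq_len(n)) {
    new_day <- i == 1 || trades$EnDate[i] != trades$EnDate[i - 1]
    # On a new day, the margin pool resets to the equity carried in.
    if (new_day) avail <- cur_equity
    avail <- avail - margin
    # Assumption: the trade is taken only if the pool has not gone negative.
    taken[i] <- avail >= 0
    if (taken[i]) cur_equity <- cur_equity + trades$PL_Pos[i]
    equity[i] <- cur_equity
  }
  data.frame(trades, InPosition = taken, Equity = equity)
}

res <- run_strategy(trades)
res$Equity
```

With the defaults this reproduces the posted Equity column, including the skipped sixth trade. Calling it with a different margin, or on reshuffled or simulated trade lists, is one way to compare capital rules across market conditions.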

On Fri, Jul 17, 2009 at 12:19 PM, Mark Knecht <markk...@gmail.com> wrote:

I am interested in modeling, in R, different ways of using my capital
over different market types using multiple trading systems.

Mark Knecht

Jul 17, 2009, 5:47:58 PM

to Ted Dunning, Bay Area R Helpers

Ah, OK. Sure - that's a point. It's a time series when it's running,
but not 24/7/365.

I see it more as a large collection of smaller time series that run
over the time period that a system might trade. Each time the market
is open and my system is running then I get another time series to add
to a larger collection of similar beasts. A data frame where each row
has a time series as one element in the data frame. There are normal
days, short days at holidays, and days the market is closed, but
certainly there isn't a single time series that would represent the
whole continuous market.
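One possible way to hold "a collection of smaller series" in base R is a list keyed by day, sketched here with split(). The per-trade P&L stands in for the minute-level market data, which is not shown in this thread:

```r
# Columns copied from the data posted at the top of the thread.
trades <- data.frame(
  EnDate = c(1040106, 1040107, 1040107, 1040108, 1040108,
             1040108, 1040109, 1040112, 1040112, 1040113),
  PL_Pos = c(-146, 294, 164, 184, 124, 24, -146, 344, 874, 224)
)

# One list element per trading day; days may have different lengths.
by_day <- split(trades$PL_Pos, trades$EnDate)

# Running P&L within each day, computed independently per element.
running_pl <- lapply(by_day, cumsum)
running_pl[["1040108"]]
```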

Thanks,
Mark

Ted Dunning

Jul 17, 2009, 6:43:56 PM

to Mark Knecht, Bay Area R Helpers

There is. It just has lots of times with no data.

:-)

That is what can throw many algorithms for a loop.

And even if you had an instrument that is traded 24x7 on different markets, time of day, day of week and holiday still make a huge difference because of differing overlap and trading characteristics of different markets.

On Fri, Jul 17, 2009 at 2:47 PM, Mark Knecht <markk...@gmail.com> wrote:

but certainly there isn't a single time series that would represent the
whole continuous market.
