Secondly, along the same lines, I have a file with a large number of
stocks. They are all mixed together (the file is in chronological
order). Is there an SQL-style 'group by' command that lets me compute
moving averages for each stock separately?
Third, if I'm able to get a moving average for each stock in the list,
can I plot all of them in one command? (I guess this technique is
called "small multiples" in charting vernacular.) Obviously I will
only have 20 or 30 stocks.
Thanks
If I understood you correctly, try the following code:
data = Sort[({#1, Sqrt[#1] + Sin[#1]} & ) /@ RandomReal[{0, 100},
{100}]];
ListPlot[data, Joined -> True]
IntervalMovingAverage[dat_List, width_, index_: 1] :=
If[index =!= 1, Sort, Identity][
Mean /@
Function[pos, Select[dat, pos[[index]] <= #1[[index]] <=
pos[[index]] + width & ]] /@ dat]
ListPlot[IntervalMovingAverage[data, 5], Joined -> True]
ListPlot[IntervalMovingAverage[data, 20], Joined -> True]
ListPlot[IntervalMovingAverage[data, 3, 2], Joined -> True]
Hope this helps,
Peter
falcon wrote:
In[1]:= data =
Sort[({#1, Sqrt[#1] + Sin[Pi*#1]} & ) /@ RandomReal[{0, 100},
{1000}]];
In[2]:= IntervalMovingAverage[dat_List, width_, index_: 1] :=
If[index =!= 1, Sort, Identity][
Mean /@
Function[pos, Select[dat, pos[[index]] <= #1[[index]] <=
pos[[index]] + width & ]] /@ dat]
In[3]:= ima[dat_, width_] := Mean /@ Block[{lst = dat, k},
Reap[While[Length[lst] > 0,
Sow[Take[lst, k = 1; While[k <= Length[lst] &&
lst[[k, 1]] - lst[[1, 1]] <= width, k++];
k - 1]];
lst = Rest[lst]]][[2, 1]]]
In[4]:= (tst = (Timing[#1[data, 5]] & ) /@ {IntervalMovingAverage,
ima})[[All, 1]]
Out[4]= {5.736359, 0.32802100000000056}
In[5]:= SameQ @@ tst[[All, 2]]
Out[5]= True
Peter
falcon wrote:
Interesting problem. Here's my take:
MovingTimeAverage::usage = "MovingTimeAverage[data, lag] takes a list \
of {time, value} pairs and returns a list of {time, avg} pairs where \
avg is an average of the values computed from the window [time - lag, \
time]. MovingTimeAverage assumes the data are sorted.";
MovingTimeAverage[data : {{_, _} ..}, lag_] :=
With[{lagindex = LagIndexCompiled[data[[All, 1]], lag]},
Table[{data[[i, 1]], Mean[data[[lagindex[[i]] ;; i, 2]]]},
{i, Length[data]}]
]
LagIndexCompiled =
Compile[{{time, _Real, 1}, {lag, _Real, 0}},
Module[{j = 1},
Table[While[time[[i]] - time[[j]] > lag, j++]; j,
{i, Length[time]}]
]];
ndraws = 10^4;
times = Accumulate[RandomReal[ExponentialDistribution[5], ndraws]];
values = Accumulate[RandomReal[NormalDistribution[], ndraws]];
data = Transpose[{times, values}];
ma = MovingTimeAverage[data, 5]; // Timing
ListLinePlot[{data, ma}]
On my laptop, it takes about 0.1 seconds to compute the moving average.
The key is to compute, for each observation, the index at which its
window starts (lagindex).
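To make the lagindex idea concrete, here is a small sketch on six made-up
time stamps. The name lagIndexPlain and the sample times are assumptions
for illustration only; it is an uncompiled stand-in for LagIndexCompiled,
not a replacement for it.

```mathematica
(* Hypothetical, uncompiled stand-in for LagIndexCompiled. *)
(* For each i it returns the first index j with time[[i]] - time[[j]] <= lag. *)
lagIndexPlain[time_List, lag_] :=
  Module[{j = 1},
    Table[While[time[[i]] - time[[j]] > lag, j++]; j,
      {i, Length[time]}]]

(* Made-up, sorted time stamps *)
demoTimes = {0., 1., 2., 6., 7., 13.};

lagIndexPlain[demoTimes, 5]
(* {1, 1, 1, 2, 3, 6} *)
```

Since j only ever moves forward, the whole pass visits each observation
a bounded number of times, rather than rescanning the data for every
window.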
--Mark
Following up on my previous post, this version is faster:
MovingTimeAverage =
Compile[{{data, _Real, 2}, {lag, _Real, 0}},
Module[{j = 1},
Table[
While[data[[i, 1]] - data[[j, 1]] > lag, j++];
{data[[i, 1]], Mean[data[[j ;; i, 2]]]},
{i, Length[data]}]
]]
It didn't occur to me right away to put the two pieces together.
--Mark
In a private communication, Daniel Lichtblau sent me a still faster
version. It uses Accumulate[] to compute the mean more cheaply: each
window sum is the difference of two accumulated values, which avoids
the repeated calls to Mean[].
MovingTimeAverage =
Compile[{{data, _Real, 2}, {lag, _Real, 0}},
Module[{
j = 1,
d2 = Prepend[Accumulate[data[[All, 2]]], 0]
},
Table[
While[data[[i, 1]] - data[[j, 1]] > lag, j++];
{data[[i, 1]], (d2[[i + 1]] - d2[[j]])/(i + 1 - j)},
{i, Length[data]}]
]]
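A tiny worked check of the differencing step, on made-up values (the
names demoVals and demoAcc are assumptions for illustration), shows that
the accumulated-sum difference agrees with Mean on the same window:

```mathematica
(* Made-up values; demoAcc plays the role of d2 in the code above. *)
demoVals = {2., 4., 6., 8.};
demoAcc = Prepend[Accumulate[demoVals], 0.];   (* {0., 2., 6., 12., 20.} *)

(* Window covering positions j = 2 through i = 4 *)
viaDifference = With[{j = 2, i = 4},
   (demoAcc[[i + 1]] - demoAcc[[j]])/(i + 1 - j)];
viaMean = Mean[demoVals[[2 ;; 4]]];

{viaDifference, viaMean}
(* {6., 6.} *)
```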
--Mark
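The grouping and plotting questions at the top of the thread are not
addressed by the replies above. One possible approach, offered only as a
sketch and not taken from any of the posters, is to split the rows with
GatherBy and then map an averaging function over the groups. The
{symbol, time, price} row layout and the sample numbers are assumptions;
the built-in MovingAverage is used here to keep the example
self-contained, and a time-window function such as MovingTimeAverage
could be mapped over the {time, price} pairs instead.

```mathematica
(* Hypothetical rows of {symbol, time, price}, mixed in chronological order *)
rows = {{"AAA", 1., 10.}, {"BBB", 1., 20.}, {"AAA", 2., 12.},
        {"BBB", 2., 22.}, {"AAA", 3., 14.}, {"BBB", 3., 24.}};

(* One sublist per symbol, in the spirit of SQL's GROUP BY *)
groups = GatherBy[rows, First];
symbols = groups[[All, 1, 1]];

(* Fixed-count moving average (2 points) of each symbol's prices *)
averages = MovingAverage[#[[All, 3]], 2] & /@ groups;
(* {{11., 13.}, {21., 23.}} *)

(* One small labeled plot per symbol, laid out in a grid.
   Note: Partition drops leftover plots when the count is not a
   multiple of the row length (2 here). *)
plots = MapThread[ListLinePlot[#1, PlotLabel -> #2] &, {averages, symbols}];
GraphicsGrid[Partition[plots, 2]]
```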