ftdnn_net = newfftd(y,y,[1:8], H); % H = 5 (Ver 2009b)
ftdnn_net = timedelaynet([1:8], H); % H = 10 (Ver 2010b)
Pi = y(1:8);
p = y(9:end); % length(y) = 600
t = y(9:end);
The last two statements are in error.
If t = p there is no step ahead prediction.
For an s-step ahead predictor using a delay buffer
of d previous inputs,
Pi = y(1:d);
p = y(d+1,end-s); % Input
t = y(d+1+s,end); % target
Given this correction, three questions immediately come
to mind:
Q1: WHY USE d = 8?
Q2: WHY USE H = 5 IN 2009b AND H = 10 IN 2010b?
Q3. WHAT IS THE PREDICTION PERFORMANCE ON
NONDESIGN DATA?
To answer the questions I downloaded the original
N0 = 1000 observations from the Santa Fe Competition
website
(http://www-psych.stanford.edu/~andreas/Time-Series/SantaFe/A.dat)
and normalized them to the interval [-1 1]. Then I extracted
the N = 600 observations used in the MATLAB example.
To answer Q1 consider the nonnegative lag values of the
autocorrelation function of y given by
y = y(1:600);
N = length(y)
acf = xcorr(y);
acfss = acf(N:end)/max(acf); % Single-sided (nonegative lags)
movavgacf = cumsum(acfss(2:21))./(1:20); %Moving avg positive lags
%Find maximum correlation for positive lags (d = 8)
[maxacf1 d] = max(acfss(2:end)) % [0.91551 8]
%Tabulate the positive lag correlations and their
%moving average:
acftab = [(1:10)' acfss(2:11)' movavgacf(1:10)']
N = 600
maxacf1 = 0.91551
d = 8
LAG ACFSS MOVAVG
acftab = 1 0.82013 0.82013
2 0.54863 0.68438
3 0.41022 0.59299
4 0.3866 0.5414
5 0.45524 0.52416
6 0.63382 0.54244
7 0.8636 0.58832
8 ** 0.91551 0.62922
9 0.71236 ** 0.63846
10 0.49943 0.62455
Therefore, d = 8 is a good choice, However, d = 9
might be better.
To answer questions Q2 and Q3, the N = 600
observations used in the example were divided into
a training set (Ntrn = 400) and a holdout test set
(Ntst = 200) in order to obtain an unbiased estimate
of the prediction performance.
Sixty nets were designed: 10 random weight initializtion
trials for each of the 6 values of H in [0 5]. The
training goal was to obtain a degree-of-freedom adjusted
R^2 ( Ratrn^2), of at least 0.99.
When H = 0 and 1, the performance results did not vary
with the random weight initialization.
When H = 0 all trials converged in 3 epochs. For H = 1,
the convergence ranged between 44 and 56 epochs.
The resulting R^2 statistic results are
H = 0 H = 1
Ratrn^2 0.822 0.855
Rtst^2 0.674 0.671
The results for H = 2:5 are tabulated below. The
maximum No. of allowed epochs was 1000
No. of Epochs
Trial H = 2 H = 3 H = 4 H = 5
1 102 114 351 1000
2 250 1000 31 31
3 45 1000 150 30
4 631 27 23 57
5 841 52 88 28
6 34 14 108 123
7 1000 1000 555 251
8 37 1000 1000 12
9 1000 1000 713 58
10 1000 16 1000 19
Ratrn^2 ( x indicates nonconvergence )
Trial H = 2 H = 3 H = 4 H = 5
1 x 0.944 0.994 0.992 x 0.976
2 x 0.872 x 0.945 0.992 0.990
3 0.990 x 0.926 0.992 0.992
4 x 0.899 0.991 0.990 0.991
5 x 0.876 0.990 0.990 0.992
6 0.990 0.993 0.998 0.991
7 x 0.922 x 0.949 x 0.965 0.995
8 0.990 x 0.968 x 0.922 0.990
9 x 0.922 x 0.954 x 0.932 0.991
10 x 0.980 0.993 x 0.954 0.991
Rtst^2
x indicates nonconvergence
** indicates best performance
* indicates good performance
Trial H = 2 H = 3 H = 4 H = 5
1 x 0.778 ** 0.900 0.796 x 0.735
2 x 0.463 x 0.395 0.859 0.734
3 ** 0.935 x 0.612 0.894 * 0.915
4 x 0.770 0.861 0.818 0.835
5 x 0.484 0.826 0.771 0.825
6 * 0.934 0.845 ** 0.946 ** 0.949
7 x 0.660 x 0.557 x 0.653 0.873
8 ** 0.935 x 0.431 x 0.748 0.855
9 x 0.660 x 0.298 x 0.780 * 0.922
10 x 0.010 0.899 x 0.746 0.867
It is somewhat surprising that 3 good test
results (Rtst^2 >= 0.915) are obtained when
H = 2 and 5, but only 1 good reult for H = 4
and a maximum of only 0.90 for H =3.
For those who are interested in a more detailed
investigation of this website's time series data
a. Train with 600 / Test with next 400
b. Train with 1000 / Test with next 10,000.
Also check out the other time series data on the
website.
clear all, close all, clc, k = 0
load laserdata.m
x0 = laserdata';
[I0 N0] = size(x0) % [1 1000]
[minx0 imin] = min(x0) % [ 2 610 ]
[maxx0 imax] = max(x0) % [ 255 603 ]
k=k+1,figure(k) % figure 1
hold on
plot(x0,'k')
plot(x0,'.')
ylabel('Intensity')
title('Laser Data')
x0n = 2*(x0-minx0)/(maxx0-minx0)-1;
k=k+1,figure(k) % figure 2
hold on
plot(x0n,'k')
plot(x0n,'.')
ylabel('Normalized Intensity')
title('Normalized Laser Data')
Ntrn = 400
acf = xcorr(x0n(1:Ntrn));
acfss = acf(Ntrn:end)/max(acf); % Single-sided (nonegative lags)
% Find maximum correlation for positive lags (d = 8)
[maxacf d] = max(acfss(2:end)) %^ [0.90368 7]
movavgacf = cumsum(acfss(2:21))./(1:20); %Moving avg positive lags
[maxmovavg d] = max(movavgacf(2:end)) % [0.92094 8]
% Tabulate the positive lag correlations and their
% moving average:
acftab = [(1:10)' acfss(2:11)' movavgacf(1:10)']
% Plot the autocorrelation results
k=k+1,figure(k) % figure 3
subplot(211)
plot(1:Ntrn-1,acfss(2:end),'.')
axis([0 Ntrn 0 1])
title('Laser Data Autocorrelation Function')
subplot(212)
hold on
plot(1:20,acfss(2:21),'.')
plot(1:20,movavgacf(1:20),'r')
axis([0 21 0 1])
legend('Autocorr','MovingAvg',4)
N = 600
Ntrn = 400
d = 8
y = x0n(1:N);
Pitrn = y(1:d); % inputs that yield no direct output
ptrn = y(d+1:Ntrn-1); % Training input
ttrn = y(d+2:Ntrn); % Target for predicting 1 point ahead
[I Nt] = size(ptrn) % [1 391]
[O Nt] = size(ttrn) % [1 391]
Neq = Nt*O % No. of training equations
% Find reference MSE corresponding to a
% constant model
MSEa00 = mean(var(ttrn')) % 0.1385
RMSEa00 = sqrt(MSEa00) % 0.37216
% Intialize random number generator so that
% results are reproducible
rand('state',0)
% Nw = (I+d+1)*H + (H+1)*O % No of weights <= Neq
Hub = floor((Neq-O)/(I+d+1+O)) % 35 = H upper bound
Hmax = 5 % 2009b Demo choice
Nwmax = (I+d+1)*Hmax + (Hmax+1)*O % 56 = max No. of weights
r = Neq/Nwmax % 6.9821 Equation/Unknowm ratio
Ntrials = 10
Pitrns = con2seq(Pitrn);
ptrns = con2seq(ptrn);
ttrns = con2seq(ttrn);
Pitsts = con2seq(y(Ntrn-d+1:Ntrn));
ptsts = con2seq(y(Ntrn:N-1));
ttst = y(Ntrn+1:N);
for j = 0:Hmax
H = j
for i = 1:Ntrials
if H ==0
net = newfftdgh(minmax(ptrn),[1:d],[O]); %Older Version
Nw = (I+d+1)*O;
else
net = newfftdgh(minmax(ptrn),[1:d],[H O]);
Nw = (I+d+1)*H + (H+1)*O;
end
net.trainParam.goal = 0.01*(Neq-Nw)*MSEa00/Neq;
net.trainParam.epochs = 1000;
net.trainParam.show = inf;
[net tr yptrns etrns Pftrns] = train(net,ptrns,ttrns,Pitrns);
Nepochs(i,j+1) = tr.epoch(end);
MSEtrn = tr.perf(end)
MSEatrn = Neq*MSEtrn/(Neq-Nw)
RMSEatrn = sqrt(MSEatrn)
R2atrn(i,j+1) = 1-MSEatrn/MSEa00;
% Uncomment this code to obtain 60 plots
% k=k+1,figure(k)
% subplot(211)
% plot(tr.epoch,1-Neq*(tr.perf/MSEa00)/(Neq-Nw))
% ylim([0 1])
% xlabel('Epoch')
% ylabel('Adjusted R^2 Statistic')
% title(['Laser Data Training Performance: H = ',...
% num2str(j),', Trial No. ',num2str(i)])
% subplot(212)
% plot(1:Nt,cell2mat(etrns))
% ylim([-1 1])
% xlabel('Observation No.')
% ylabel('Training Error')
yptsts = sim(net,ptsts,Pftrns);
yptst = cell2mat(yptsts);
etst = ttst-yptst;
MSEtst = mse(etst)
RMSEtst = sqrt(MSEtst)
R2tst(i,j+1) = 1-MSEtst/MSEa00;
% k=k+1, figure(k)
% hold on
% plot(ttrn,'.')
% plot(yptrn,'r.')
% drawnow
end
end
for kk= k:-1:1
figure(kk)
end
resulttrn1 = [(1:Ntrials)' Nepochs]
% 1 3 48 102 114 351
1000
% 2 3 49 250 1000
31 31
% 3 3 44 45 1000
150 30
% 4 3 46 631 27
23 57
% 5 3 47 841 52
88 28
% 6 3 51 34 14
108 123
% 7 3 51 1000 1000
555 251
% 8 3 48 37 1000
1000 12
% 9 3 51 1000 1000
713 58
% 10 3 56 1000 16
1000 19
resulttrn2 = [(1:Ntrials)' R2atrn]
% 1 0.82155 0.8548 0.94439 0.99418
0.99153 0.9759
% 2 0.82155 0.8548 0.87227 0.94492
0.99219 0.99045
% 3 0.82155 0.8548 0.99007 0.92551
0.99171 0.99169
% 4 0.82155 0.8548 0.89933 0.99132
0.99026 0.99101
% 5 0.82155 0.8548 0.87647 0.99031
0.99016 0.99207
% 6 0.82155 0.8548 0.99005 0.99297
0.99079 0.99133
% 7 0.82155 0.8548 0.92216 0.94885
0.96529 0.99454
% 8 0.82155 0.8548 0.99006 0.96842
0.92218 0.99049
% 9 0.82155 0.8548 0.92216 0.95419
0.93237 0.9906
% 10 0.82155 0.8548 0.98043 0.99299
0.95443 0.99095
resulttst = [(1:Ntrials)' R2tst]
% 1 0.67354 0.67085 0.77774 0.90006* 0.796
0.73522
% 2 0.67354 0.67085 0.4632 0.39454 0.8589*
0.73365
% 3 0.67354 0.67085 0.9346* 0.61228 0.89415*
0.91544*
% 4 0.67354 0.67085 0.77017 0.86093* 0.81804
0.83511
% 5 0.67354 0.67085 0.4843 0.82633 0.77111
0.82466
% 6 0.67354 0.67085 0.93425* 0.84488 0.94588*
0.94863*
% 7 0.67354 0.67085 0.65968 0.55672 0.65269
0.87336
% 8 0.67354 0.67085 0.9346* 0.43103 0.74806
0.85471
% 9 0.67354 0.67085 0.65967 0.29821 0.78016
0.9224*
% 10 0.67354 0.67085 0.0098806 0.89895* 0.74645
0.86665
Hope this helps.
Greg
ftdnn_net = newfftd(y,y,[1:8], H); % H = 5 (Ver 2009b)
ftdnn_net = timedelaynet([1:8], H); % H = 10(Ver 2010b)
Pi = y(1:8);
p = y(9:end); % length(y) = 600
t = y(9:end);
The last two statements are in error.
If t = p there is no step ahead prediction.
For an s-step ahead predictor using a delay buffer
of d previous inputs,
Pi = y(1:d);
p = y(d+1,end-s); % Input
t = y(d+1+s,end); % target
Given this correction, three questions immediately come
to mind:
Q1: WHY USE d = 8?
Q2: WHY USE H = 5 IN 2009b AND H = 10 IN 2010b?
Q3. WHAT IS THE PREDICTION PERFORMANCE ON NONDESIGN DATA?
To answer the questions I downloaded the original
N0 = 1000 observations from the Santa Fe Competition
website
(http://www-psych.stanford.edu/~andreas/Time-Series/SantaFe/A.dat)
and normalized them to the interval [-1 1]. Then I extracted
the N = 600 observations used in the MATLAB example.
To answer question Q1, consider the nonnegative lag values
of the autocorrelation function of y given by
y = y(1:600);
N = length(y)
acf = xcorr(y);
acfss = acf(N:end)/max(acf); % Single-sided (nonegative lags)
movavgacf = cumsum(acfss(2:21))./(1:20); %Moving avg positive lags
% Find maximum correlation for positive lags (d = 8)
[maxacf1 d] = max(acfss(2:end)) % [0.91551 8]
% Tabulate the positive lag correlations and their
% moving average:
acftab = [(1:10)' acfss(2:11)' movavgacf(1:10)']
N = 600
No. of Epochs
Rtst^2
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
x0n = 2*(x0-minx0)/(maxx0-minx0)-1;
rand('state',0)
%Trial H = 0 H = 1 H = 2 H = 3 H = 4 H = 5
%
% 1 3 48 102 114 351 1000
% 2 3 49 250 1000 31 31
% 3 3 44 45 1000 150 30
% 4 3 46 631 27 23 57
% 5 3 4 841 52 88 28
% 6 3 51 34 14 108 123
% 7 3 51 1000 1000 555 251
% 8 3 48 37 1000 1000 12
% 9 3 51 1000 1000 713 58
% 10 3 56 1000 16 1000 19
resulttrn2 = [(1:Ntrials)' R2atrn]
%Trial H = 0 H = 1 H = 2 H = 3 H = 4 H = 5
%
% 1 0.8216 0.8548 0.9444 0.9942 0.9915 0.9759
% 2 0.8216 0.8548 0.8723 0.9449 0.9922 0.9905
% 3 0.8216 0.8548 0.9901 0.9255 0.9917 0.9917
% 4 0.8216 0.8548 0.8993 0.9913 0.9903 0.9910
% 5 0.8216 0.8548 0.8765 0.9903 0.9902 0.9921
% 6 0.8216 0.8548 0.9901 0.9930 0.9908 0.9913
% 7 0.8216 0.8548 0.9222 0.9489 0.9653 0.9945
% 8 0.8216 0.8548 0.9901 0.9684 0.9222 0.9905
% 9 0.8216 0.8548 0.9222 0.9542 0.9324 0.9906
% 10 0.8216 0.8548 0.9804 0.9930 0.9544 0.9910
resulttst = [(1:Ntrials)' R2tst]
%Trial H = 0 H = 1 H = 2 H = 3 H = 4 H = 5
%
% 1 0.6735 0.6709 0.7777 0.9000 0.7960 0.7352
% 2 0.6735 0.6709 0.4632 0.3945 0.8589 0.7337
% 3 0.6735 0.6709 0.9346 0.6123 0.8942 0.9154
% 4 0.6735 0.6709 0.7702 0.8609 0.8180 0.8351
% 5 0.6735 0.6709 0.4843 0.8263 0.7711 0.8247
% 6 0.6735 0.6709 0.9343 0.8449 0.9459 0.9486
% 7 0.6735 0.6709 0.6597 0.5567 0.6527 0.8734
% 8 0.6735 0.6709 0.9346 0.4310 0.7481 0.8547
% 9 0.6735 0.6709 0.6597 0.2982 0.7802 0.9224
% 10 0.6735 0.6709 0.0099 0.8990 0.7465 0.8667
Hope this helps.
Greg
However, to avoid biasing the test set estimate of
prediction error, it is better to only use parameters
estimated from nontest design data.
In this example validation is not used. First, the acf
of the training (Ntrn = 400) data is tabulated. The
maximum value of the acf for positive delays is 0.9
at d = 7.
Next, a limited search for an optimal value of d in
[1:10] is performed for H = 5. Tabulations of the
test R^2 statistic (R2tst) are obtained from ten
trials each for each candidate value of d.
The resulting summary of test set R^2 statistics is
Delay Max Median Mean Min
1 -0.53 -0.58 -0.57 -0.66
2 0.89 0.85 0.78 0.44
3 0.94 0.89 0.89 0.83
4 0.94 0.90 -13.33 -141.37
5 0.96 0.91 0.88 0.68
6 0.94 0.92 0.90 0.84
7 0.96 0.87 0.86 0.77
8 0.93 0.78 0.77 0.55
9 0.91 0.72 0.67 0.19
10 0.86 0.69 -1.58 -17.01
The negative values indicate that some of the NNs
perform worse than a constant model that always
outputs the average of the target values.
When negative Rtst^2 values are replaced by NaNs,
the summary statistics change to
Delay Max Median Mean Min
1 NaN NaN NaN NaN
2 0.89 0.84 0.78 0.44
3 0.94 0.89 0.89 0.83
4 0.93 0.90 0.90 0.82
5 0.96 0.91 0.88 0.68
6 0.94 0.92 0.90 0.84
7 0.96 0.87 0.86 0.75
8 0.93 0.78 0.77 0.55
9 0.91 0.72 0.67 0.19
10 0.86 0.76 0.70 0.40
Although more confident statistics would result
from choosing Ntrials >> 10, the following
observations are interesting:
1. d = 1 is useless.
2. Although the best result occurred for d = 5,
very good results were obtained throughout
the interval 3 <= d <= 8.
3. Median and mean statistics tend to increase the
confidence for the narrower interval 3 <= d <= 6.
4. Training set max and median statistics in the
following tabulation tend to favor the broader
interval 4 <= d <= 10. This result tends to
emphasize the need to use nontraining data to
confidently estimate both parameter bounds and
prediction performance on unseen data.
Delay Max Median Mean Min
1 0.12 0.06 0.08 0.03
2 0.95 0.93 0.92 0.83
3 0.98 0.96 0.96 0.94
4 0.99 0.99 0.98 0.93
5 0.99 0.99 0.98 0.93
6 0.99 0.99 0.99 0.99
7 0.99 0.99 0.99 0.99
8 0.99 0.99 0.98 0.94
9 0.99 0.99 0.99 0.97
10 0.99 0.99 0.98 0.94
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
clear all, close all, clc, k = 0
tic
load laserdata.m
x0 = laserdata';
[I0 N0] = size(x0) % [1 1000]
[minx0 imin] = min(x0) % [ 2 610 ]
[maxx0 imax] = max(x0) % [ 255 603 ]
x0n = 2*(x0-minx0)/(maxx0-minx0)-1;
N = 600
y = x0n(1:N);
Ntrn = 400
ytrn = y(1:Ntrn);
delaymax = 10
acf = xcorr(ytrn); % Autocorrelation Function
acfss = acf(Ntrn:end)/max(acf); % Single-sided (delays >=0)
movavgacf = cumsum(acfss(2:21))./(1:20); % Moving average
% Find maximum correlation for positive delays (d > 0)
[maxacf1 d] = max(acfss(2:end)) % [0.904 7]
% Tabulate the positive delay correlations and their
% moving average:
acftab = [(1:delaymax)' acfss(2:delaymax+1)' ...
movavgacf(1:delaymax)']
% maxacf1 = 0.904
% d = 7
% acf1tab =
% DELAY ACFSS MOVAVG
% 1 0.848 0.848
% 2 0.608 0.728
% 3 0.476 0.644
% 4 0.457 0.597
% 5 0.536 0.585
% 6 0.715 0.607
% 7 ** 0.904 0.649
% 8 * 0.903 0.681
% 9 0.706 * 0.684
% 10 0.529 0.668
% Intialize random number generator so that results
% are reproducible
rand('state',0)
for j = 1:delaymax
d = j
Pitrn = y(1:d); % inputs that yield no direct output
ptrn = y(d+1:Ntrn-1);% Training input
ttrn = y(d+2:Ntrn); % Target for predicting 1 point ahead
[I Nt] = size(ptrn) % [1 Ntrn-d-1]
[O Nt] = size(ttrn) % [1 Ntrn-d-1]
Neq = Nt*O % No. of training equations
Pitrns = con2seq(Pitrn);
ptrns = con2seq(ptrn);
ttrns = con2seq(ttrn);
Pitsts = con2seq(y(Ntrn-d+1:Ntrn));
ptsts = con2seq(y(Ntrn:N-1));
ttst = y(Ntrn+1:N);
% Find reference MSE corresponding to a constant model
MSEa00 = mean(var(ttrn'))
RMSEa00 = sqrt(MSEa00)
% Nw = (I+d+1)*H + (H+1)*O % No of weights <= Neq
Hub = floor((Neq-O)/(I+d+1+O)) % upper bound for H
H = 5 % 2009b Demo choice
Nw = (I+d+1)*H + (H+1)*O % No. of weights
r = Neq/Nw % Equation/Unknowm ratio
MSEgoal = 0.01*(Neq-Nw)*MSEa00/Neq % Training Goal
Ntrials = 10
for i = 1:Ntrials
net = newfftdgh(minmax(ptrn),[1:d],[H O]);
net.trainParam.goal = MSEgoal;
net.trainParam.epochs = 1000;
net.trainParam.show = inf;
warning off MATLAB:nearlySingularMatrix
[net tr yptrns etrns Pftrns] = ...
train(net,ptrns,ttrns,Pitrns);
Nepochs(i,j) = tr.epoch(end);
MSEtrn = tr.perf(end)
yptrn = cell2mat(yptrns);
eptrn = cell2mat(etrns);
MSEatrn = Neq*MSEtrn/(Neq-Nw)
RMSEatrn = sqrt(MSEatrn)
R2atrn(i,j) = 1-MSEatrn/MSEa00;
% Uncomment to obtain 100 plots
%
% k=k+1, figure(k)
%
% subplot(311)
% plot(tr.epoch,1-Neq*(tr.perf/MSEa00)/(Neq-Nw))
% ylim([0 1])
% xlabel('Training Epoch')
% ylabel('Adjusted R^2 Statistic')
% title(['H = ',int2str(H),', Input Delay = ',...
% int2str(d),', Trial No. ',int2str(i)])
%
% subplot(312)
% hold on
% plot(ttrn,'.')
% plot(yptrn,'r.')
% legend('Target','Prediction')
% ylim([-1.5 1])
% ylabel('Laser Intensity')
%
% subplot(313)
% hold on
% plot(1:Nt,zeros(1,Nt),'k')
% plot(1:Nt,eptrn)
% ylim([-1 1])
% xlabel('Observation No.')
% ylabel('Prediction Error')
yptsts = sim(net,ptsts,Pftrns);
yptst = cell2mat(yptsts);
etst = ttst-yptst;
MSEtst = mse(etst)
RMSEtst = sqrt(MSEtst)
R2tst(i,j) = 1-MSEtst/MSEa00;
end
end
% for kk= 1:k
% figure(kk)
% pause(1)
% end
%
% for kk= k:-1:1
% figure(kk)
% end
Nepochs = Nepochs
epochstats = [(1:delaymax)' max(Nepochs)' median(Nepochs)'...
mean(Nepochs)' min(Nepochs)']
R2atrn = R2atrn
R2trnstats = [(1:delaymax)' max(R2atrn)' median(R2atrn)'...
mean(R2atrn)' min(R2atrn)']
R2tst = R2tst
R2tststats = [(1:delaymax)' max(R2tst)' median(R2tst)'...
mean(R2tst)' min(R2tst)']
r2tst = R2tst;
r2tst(r2tst < 0) = NaN;
r2tststats = [(1:delaymax)' nanmax(r2tst)' nanmedian(r2tst)'...
nanmean(r2tst)' nanmin(r2tst)']
k=k+1,figure(k)
subplot(211)
hold on
plot(1:delaymax,max(R2atrn),'k.-')
plot(1:delaymax,median(R2atrn),'b.-')
plot(1:delaymax,mean(R2atrn),'r.-')
plot(1:delaymax,min(R2atrn),'g.-')
legend('max','median','mean','min',4)
axis([1 11 0.8 1.05])
% xlabel('Maximum Input Delay')
ylabel('Training R^2 ')
title('Training and Prediction Performance (Laser Data)')
subplot(212)
hold on
plot(1:delaymax,max(R2tst),'k.-')
plot(1:delaymax,nanmedian(r2tst),'b.-')
plot(1:delaymax,nanmean(r2tst),'r.-')
plot(1:delaymax,nanmin(r2tst),'g.-')
legend('max','median','mean','min',4)
axis([1 11 0 1.2])
xlabel('Maximum Input Delay')
ylabel(' Prediction R^2')
time = toc %558.22 ( ~ 10 min )
Hopw thia hwlpa.
Greg
So much for touch typing.
Greg
Is Nw <> size(getX(net),1) ?
From Matlab Help:
getx
Purpose
All network weight and bias values as single vector
Best regards
Jorge
Since it's been more than 2 weeks since your post,
you have probably found out by using trial and
error with a smaller data set.
What did you find?
Sorry for the late response.
Greg
by running your code I found that Nw <> size(getX(net),1), so I replace in your code:
Nw = (I+d+1)*H + (H+1)*O %% Nw <> size(getX(net),1)
with
Nw = (d+1)*H + (H+1)*O %% Nw = size(getX(net),1)
and so
Hub = floor((Neq-O)/(I+d+1+O))
with
Hub = floor((Neq-O)/(d+1+O))
is that correct? can you check it?
Best regards
Jorge
>I'm a student from University of Cyprus. I don't
>want to take your time but i saw your code in
>mathwork about newffdt. I didn't understand,
>where is the ahead prediction of your code? i mean,
>how your code predicts the value at 601,602..?
With d = 7, Use 594-600 to predict 601, then
use 595-601 to predict 602, etc.
Zinon Taousianis <taousian...@ucy.ac.cy>
Sent: Mar 15, 2011 03:38:04 PM
> I don't understand exactly where is the prediction.
> i.e. There are 600 observations, where is the values
> of s-step ahead predictions?
I only considered the case s = 1.
Pi = y(1:d);
p = y(d+1,end-s); % Input
t = y(d+1+s,end); % target
You can easily generalize to an s >= 1 dimensional output
Pitrn = y(1:d); % inputs that yield no direct output
ptrn = y(d+1:Ntrn-s); % Training input
% Target for predicting the next s points
ttrn = y(d+2:d+1+s,Ntrn+1-s:Ntrn);
There will be Ntrn-d-s s-dimensional predictions in yptrns
There will be Ntst-d-s s-dimensional predictions in yptsts
Obviously, there will be s separate predictions for most
of these points.
>I would like to ask you something more. Assume you
>have a vector with 500 values [x1,x2.......x500].
>I thought a way to make prediction ahead but i
>don't know if the idea is good and if it worths
>to implement it.
>i thought the following procedure for training :
>inputs x1,x2,x3 and the target to be x4. Then
>inputs x2,x3,x4 and target x5 and so on until to
>reach the final three observations and forecast
>the next one. is that procedure a good choice or
>there is no meaning?
Yes it can be done with newff. I have several
time series posts which do just that. You might
be able to find them searching with
heath time-series newff
Hope this helps.
Greg
For an I dimensional input series used to
predict an O-dimensional output series
with H hidden nodes, there are
(I+1)*H (input+bias)-to-hidden weights
(H+1)*O (hidden+bias)-to-output weights
therefore, for newff,
Nw = (I+1)*H+(H+1)*O
For newfftd, there will be d additional
delayed inputs. However, those inputs
are I-dimensional.
So now I get
Nw = ((d+1)*I+1)*H+(H+1)*O
clear all, clc
I = 3
H = 4
O = 5
N = 6
p = rand(I,N);
t = rand(O,N);
net = newff(minmax(p),[H O]);
NwL = (H+1)*O % 25
NwI0=length(getx(net))-NwL % 16
NwI = (I+1)*H % 16
for d = 0:3
net = newfftdgh(minmax(p),[0:d],[H O]);
d = d
NwL = (H+1)*O % 25
NwI0 = length(getx(net))-NwL
NwI = ((d+1)*I+1)*H
end
Hope this helps.
Greg
When input delay (ID) does begin with 0, NwI0 = NwI. But when ID does not begin with 0 then NwI0 <> NwI.
clear all, clc
I = 3
H = 4
O = 5
N = 6
p = rand(I,N);
t = rand(O,N);
d = 3
ID = 0:d
net = newfftdgh(minmax(p),ID,[H O]);
NwL = (H+1)*O
NwI0 = length(getx(net))-NwL
NwI = ((d+1)*I+1)*H % NwI = NwI0
ID = 1:d
net = newfftdgh(minmax(p),ID,[H O]);
NwL = (H+1)*O
NwI0 = length(getx(net))-NwL
NwI = ((d+1)*I+1)*H % NwI <> NwI0
when
>> NwI = ((I*length(ID))+1)*H
then NwI=NwI0
Best regards
Jorge
Correct! ...Thanks.
Replace d+1 by length(ID)
i = 0
for d = 0:5
d = d
% ID = [0:d]
% ID = [1:d]
% ID = [0:2:d]
ID = [1:2:d]
LID = length(ID)
net = newfftdgh(minmax(p),ID,[H O]);
NwL = (H+1)*O % 25
NwI0 = length(getx(net))-NwL
NwI = (LID*I+1)*H
i=i+1
dNw(i) = NwI-NwI0
I try to predict vector of three variables. First of all I spent lot of time with a "laser example" and lines
p = y(9:end);
t = y(9:end);
Fortunatelly I found your explanation.
I have some apriori knowledge about my variables: v1<=v2<=v3 for the same time
Is any way to introduce this constrains in dynamic net ?
In input data this relation is always satisfy but in predicted result not always.
Regards -
Andrzej
I don't know of anyway to introduce that constraint.
Sorry,
Greg