I have commonly used a regression model in which a series
of Ys are regressed against a multiple series of Xs as sin
(X), cos(X), sin(2X), cos(2X) etc. This is giving
incomplete regression results under Excel 2003. In the
Summary Output the Regression Statistics are OK, and the
ANOVA is OK. But the table of coefficients and some othe
results are incomplete. The intercept and one
coefficient are identical to those using the same data on
a second computer running Excel 97 under Windows 98, but
other coefficients under Excel 2003 on the first computer
are shown as zero. All coefficients show correctly under
Windows 98 on the second computer.
Trouble Shooting Steps: I removed and reinstalled
Microsoft Office. I installed SP1. I searched the
Microsoft data base unsuccessfully. Attempts to get in
touch directly with Microsoft are unsuccessful.
Can anyone help in overcoming this problem?
We have had @strange@ math errors on a Anthlon based machine which could not
be reproduced on a Pentium rig
n10
"James Kelly" <anon...@discussions.microsoft.com> wrote in message
news:099401c48571$8b3c9ae0$7d02...@phx.gbl...
Any suggestions or advice?
Jim Kelly
>.
>
How big is the data set. Could you reasonably paste the values (as
text) into a reply to this thread?
Jerry
You might want to search the google archive of the XL newsgroups for
more on this subject.
If you still have problems, share some more specifics. What method /
formulas are you using? How do the issues discussed in the archive
compare with your data set? For example, do you have missing data?
Anecdotally, how XL handles this seems to be a major change in 2003.
--
Regards,
Tushar Mehta
www.tushar-mehta.com
Excel, PowerPoint, and VBA add-ins, tutorials
Custom MS Office productivity solutions
In article <099401c48571$8b3c9ae0$7d02...@phx.gbl>,
anon...@discussions.microsoft.com says...
Another correspondent suggested sending a text file of
the data and output. In order to keep the structure
presentable I prepared a RTF file of the Excel 97 and
Excel 2003 outputs on the saame data. (Note that the X
data I use are kust the sines and cosines, not the first
column. I regret that you nay need to cast the data in a
more usable format to examine it properly.
If you prefer, communicate with me direct at
jakel...@bigpond.com This is in Queensland Australia.
I would be grateful for any comments, including whether
you believe that we are have a processor fault (Celeron
CPU with the wrong results) or a software error in Excel
2003.
Regards
James Kelly
Transcript of the RTF file
==========================
Data and results using computer
A, with Windows XP and Excel 2003
X Sin(X) Cos(X) Sin(2X) Cos(2X) Sin(3X) Cos(3X)
Y
0.3 0.3 1.0 0.5 0.9 0.7 0.7
5.269939394
0.8 0.7 0.7 1.0 0.0 0.7 -0.7
4.737128072
1.3 1.0 0.3 0.5 -0.9 -0.7 -0.7
4.210490308
1.8 1.0 -0.3 -0.5 -0.9 -0.7 0.7
3.442649165
2.4 0.7 -0.7 -1.0 0.0 0.7 0.7
2.596875
2.9 0.3 -1.0 -0.5 0.9 0.7 -0.7
2.360915493
3.4 -0.3 -1.0 0.5 0.9 -0.7 -0.7
2.45915986
3.9 -0.7 -0.7 1.0 0.0 -0.7 0.7
3.113424658
4.5 -1.0 -0.3 0.5 -0.9 0.7 0.7
4.0492891
5.0 -1.0 0.3 -0.5 -0.9 0.7 -0.7
4.662025316
5.5 -0.7 0.7 -1.0 0.0 -0.7 -0.7
5.095438175
6.0 -0.3 1.0 -0.5 0.9 -0.7 0.7
5.361955086
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.998899611
R Square 0.997800433
Adjusted R Square 0.995160953
Standard Error 0.077881527
Observations 12
ANOVA
df SS MS F Significance F
Regression 6 13.75768852 2.292948087
378.0291643 1.78126E-06
Residual 5 0.030327661 0.006065532
Total 11 13.78801618
Coefficients Standard Error t Stat P-value
Lower 95% Upper 95% Lower 95.0%
Upper 95.0%
Intercept 3.946607469 0.02248246
175.5416186 1.13829E-10 3.888814465
4.004400473 3.888814465 4.004400473
X Variable 1 0 0.031795 0 1 -
0.08173165 0.08173165 -0.08173165 0.08173165
X Variable 2 0 0.031795 0 1 -
0.08173165 0.08173165 -0.08173165 0.08173165
X Variable 3 0.039817726 0.031795
1.252326642 0.265836001 -0.041913924
0.121549376 -0.041913924 0.121549376
X Variable 4 0 0.031795 0 1 -
0.08173165 0.08173165 -0.08173165 0.08173165
X Variable 5 0 0.031795 0 1 -
0.08173165 0.08173165 -0.08173165 0.08173165
X Variable 6 0 0.031795 0 1 -
0.08173165 0.08173165 -0.08173165 0.08173165
RESIDUAL OUTPUT
Observation Predicted Y Residuals
1 3.966516332 1.303423062
2 3.986425195 0.750702878
3 3.966516332 0.243973976
4 3.926698606 -0.484049441
5 3.906789743 -1.309914743
6 3.926698606 -1.565783113
7 3.966516332 -1.507356472
8 3.986425195 -0.873000537
9 3.966516332 0.082772768
10 3.926698606 0.735326711
11 3.906789743 1.188648432
12 3.926698606 1.43525648
Data and results using computer B, with
Windows 98 and Excel 97
X Sin(X) Cos(X) Sin(2X) Cos(2X) Sin(3X) Cos(3X)
Y
0.262 -0.259 0.966 0.5 0.866 0.707 0.707
5.269939
0.785 -0.707 0.707 1 0 0.707 -0.707
4.737128
1.309 -0.966 0.259 0.5 -0.866 -0.707 -0.707
4.21049
1.833 -0.966 -0.259 -0.5 -0.866 -0.707 0.707
3.442649
2.356 -0.707 -0.707 -1 0 0.707 0.707
2.596875
2.88 -0.259 -0.966 -0.5 0.866 0.707 -0.707
2.360915
3.403 0.259 -0.966 0.5 0.866 -0.707 -0.707
2.45916
3.927 0.707 -0.707 1 0 -0.707 0.707
3.113425
4.451 0.966 -0.259 0.5 -0.866 0.707 0.707
4.049289
4.974 0.966 0.259 -0.5 -0.866 0.707 -0.707
4.662025
5.498 0.707 0.707 -1 0 -0.707 -0.707
5.095438
6.021 0.259 0.966 -0.5 0.866 -0.707 0.707
5.361955
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.99889752
R Square 0.997796255
Adjusted R Square 0.99515176
Standard Error 0.077955465
Observations 12
ANOVA
df SS MS F Significance F
Regression 6 13.75762921 2.292938202
377.3107802 1.78972E-06
Residual 5 0.030385273 0.006077055
Total 11 13.78801449
Coefficients Standard Error t Stat P-value
Lower 95% Upper 95% Lower 95.0%
Upper 95.0%
Intercept 3.946607333 0.022503804
175.3751168 1.1437E-10 3.888759557
4.00445511 3.888759557 4.00445511
X Variable 1 0.28165063 0.031824273
8.850182649 0.000306124 0.199843865
0.363457395 0.199843865 0.363457395
X Variable 2 1.480953202 0.031824273
46.53533465 8.65453E-08 1.399146437
1.562759967 1.399146437 1.562759967
X Variable 3 0.039817833 0.031825185
1.251142225 0.266232927 -0.041991277
0.121626943 -0.041991277 0.121626943
X Variable 4 -0.131709584 0.031826119 -
4.13841173 0.00901094 -0.213521094 -
0.049898075 -0.213521094 -0.049898075
X Variable 5 -0.000818718 0.031829992 -
0.025721577 0.980474418 -0.082640184
0.081002748 -0.082640184 0.081002748
X Variable 6 0.03641867 0.031829992
1.144162093 0.304357929 -0.045402795
0.118240136 -0.045402795 0.118240136
RESIDUAL OUTPUT
Observation Predicted Y Residuals
1 5.235278196 0.034660804
2 4.808005251 -0.070877251
3 4.166899954 0.043590046
4 3.411444362 0.031204638
5 2.685797757 -0.088922757
6 2.282762777 0.078152223
7 2.469633304 -0.010473304
8 3.164845082 -0.051420082
9 3.994253546 0.055035454
10 4.670073471 -0.008048471
11 5.127781243 -0.032343243
12 5.342513056 0.019441944
>.
>
Thank you for your reply.
Here are the data and the outputs as well as I can
manage. To keep the structure presentable I prepared a
RTF file of the Excel 97 and Excel 2003 outputs on the
same data.
(Note that the X data I use are just the sines and
cosines, not the first column. I regret that you may
need to cast the data in a more usable format to examine
it properly.
If you prefer, communicate with me direct at
jakel...@bigpond.com. This is in Queensland Australia.
Regards
James Kelly
>.
>
.
>.
>
After that, I got the same results using both the LINEST function and
the regression data analysis tool with XL97 SP2, XL2002 SP1, and XL2003
base release.
Coefficients
Intercept 3.944842019
Sin(X) -0.271500439
Cos(X) 1.478896612
Sin(2X) 0.042754559
Cos(2X) -0.104511766
Sin(3X) -0.000111206
Cos(3X) 0.038105963
and the corresponding p-values
P-value
1.63782E-11
5.61769E-05
1.22855E-08
0.104242157
0.00462718
0.995938638
0.149091385
The above do not match the results of either of your computers.
I'm not sure what to tell you, especially as to the discrepancy in the
XL97 results.
How exactly are you computing the various columns? And, why did you
have only 1 decimal digit in the x-data you posted?
Can you try the LINEST function? Though, I suspect you will get the
same zeros since my *guess* is that the Regression data analysis tool
internally uses LINEST.
Is it easy to uninstall Office 2003 SP1? If so, maybe, you can test
with the base release. Or did you install SP1 because the base release
itself gave coefficients of zero?
Maybe, Jerry has some ideas...
--
Regards,
Tushar Mehta
www.tushar-mehta.com
Excel, PowerPoint, and VBA add-ins, tutorials
Custom MS Office productivity solutions
In article <0a7001c4864f$39201a00$7d02...@phx.gbl>,
anon...@discussions.microsoft.com says...
Firstly I must say that I am impressed that there are
correspondents who go out of their way to try and help.
Thank you for your part in that
To comment on your query in detail...
A)My regression is a simultaneous multiple regression of
Y against six Xs, respectively Sin(X), Cos(X)......Sin
(3X), Cos(3X). (I believe that you knew that, because of
your comment about the coefficients.)
B) Your ouput is the correct output for Xs with
sufficient precision. They are rounded down in the
example I sent, simply to fit them on the page. The
exact Xs are 0.5/12*2Pi, 1.5/12*2Pi ... 11.5/12*2Pi. I
would expect that you would find the same results as in
my message if you used those Xs.
New information! Running Exclel 97 on my XP gets the
corrects results! I conclude that the problem must be
within Excel 2003, or between Excel 2003 and my XP
computer. In either case Microsoft appears to be
implicated. But how does one contact them? They seem to
be behind the ramparts in my attempts to raise this issue
with them.
Regards
Jim Kelly
>.
>
It is as though LINEST thinks that the model is singular; even though
the problem is not at all ill-conditioned (condition number =2).
Perturbing the model can get you around this problem.
=LINEST(y,X+c,,TRUE)
would mathematically alter only the intercept term. You could get back
the true intercept by
Intercept = intercept + c*SUM(regressionCoefs)
where Intercept is the true intercept, intercept is the LINEST intercept
using the perturbed model, and regressionCoefs are all of the estimates
except for intercept from the perturbed model.
Numerically an additive perturbation could reduce precision due to
cancellation, therefore I would make c the smallest power of 2 that
produces all nonzero regression coefficients.
Alternately, you might be able to find separate multipliers for each
column that would result in nonzero regression coefficients. In theory
that should have no impact on the precision of calculations, but you
would have to factor the multipliers out of the standard errors of
estimates as well as out of the estimates themselves.
Take home lesson appears to be that though LINEST in 2003 uses an
algorithm that generally produces far more accurate results than
previous versions; beware of results where estimates are exactly zero.
Jerry
I shall try your LINEST suggestions.
Can and should one notify Microsoft of the anomaly, and
if, so, how?
Regards
Jim Kelly
>.
>
http://support.microsoft.com/default.aspx?scid=sz;EN-US;top&product=ofw&prname=Office
for reporting bugs.
Jerry W. Lewis
Excel MVP
Thank you. Microsoft needs to know about this anomaly.
Regards
>.
>
Jerry