Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

COnfidence Intervals in Multiple Regression

355 views
Skip to first unread message

FredZack

unread,
Sep 3, 2009, 11:13:01 AM9/3/09
to
Is there a way to get a confidence interval about your predicted values for
your dependent variable (Y) in Excels multiple regression?

Sean Timmons

unread,
Sep 3, 2009, 11:17:02 AM9/3/09
to

Shane Devenshire

unread,
Sep 3, 2009, 11:50:01 AM9/3/09
to
Take a look at Regression in the Analysis ToolPak. There is also a
CONFIDENCE fucntion. The ATP can be attached by choosing Tools, Add-ins and
checking Analysis ToolPak in 2003 or earlier.

--
If this helps, please click the Yes button.

Cheers,
Shane Devenshire

Jerry W. Lewis

unread,
Sep 3, 2009, 11:10:01 PM9/3/09
to
Not directly. The ATP regression tool merely reformats and elaborates on the
output of LINEST. LINEST gives standard errors of regression coefficients,
which can be used to construct t-tests to test the significance of a given
coefficient, but which are useless for confidence intervals on predicted
values, because no native excel functions directly give the covariances
between these parameter estimates.

The Multiple Regression template by Business Spreadsheets adds a few things
beyond what LINEST gives, but standard errors and/or confidence intervals for
predicted values are not among them.

Excel's CONFIDENCE function has no bearing on your question. Moreover, it
is rarely useful for anything at all, because it presumes that you know the
true underlying standard deviation instead of having to estimate it from data.

If you are comfortable with linear algebra, you can "roll your own"
solution, however. If X is the model matrix (including a column of 1's for
the mean term) for the known y values, and v is the analogous row of
predictors for single point to be predicted, then if there are not numerical
issues, then
=MMULT(v,MMULT(MINVERSE(MMULT(TRANSPOSE(X),X)),MMULT(TRANSPOSE(X),y)))
should agree with
=TREND(y,X,v)
Excel versions since 2003 use a mathematically equivalent but numerically
superior calculation. If the two calculatiosn do not agree, the the
following confidence interval calculation will not be trustworthy. Assuming
that they agree, then
=SQRT(MMULT(MMULT(v,MINVERSE(MMULT(TRANSPOSE(X),X))),TRANSPOSE(v)))*S
(where S is the sey estimate from the LINEST output) is the standard error
of the fitted multiple regression at the predictors v. If you are predicting
a future value instead of just the fitted mean, use
=SQRT(1+MMULT(MMULT(v,MINVERSE(MMULT(TRANSPOSE(X),X))),TRANSPOSE(v)))*S
A 1-alpha level confidence interval is then
=prediction +/- TINV(alpha,df)*SE
so that alpha=0.05 would give a 95% confidence interval.

Jerry

Jerry W. Lewis

unread,
Sep 4, 2009, 5:30:01 AM9/4/09
to
Here are a few clarifications of my previous relply. The details of my reply
presumed some subtle unstated aspecs of the linear algebra formulation of the
regression ... sorry

If X is the model matrix augmented by a column of 1's for the mean, as
presumed in my MMULT formulas, then you would also need Xm, the same matrix
without the column of 1's. The TREND formula for comparison, then would be
either
=TREND(y,Xm,v)
or equivalently
=TREND(y,X,v,FALSE)
The degrees of freedom, df, and the sey estimate, S, would be from
=LINEST(y,Xm,,TRUE)

Also, while I said that Excel's CONFIDENCE function has no bearing on your
question, I did not say why. The reason is that it deals with the confidence
interval for a simple population mean, not a regression of any kind. As I
did state, it is rarely useful even for its intended purpose, as it presumes
that you are using information that you almost never will have (although it
can be a decent approximation in that context if the sample size is large).

Jerry

0 new messages