Regression and p-values

Bendik Mjaaland

May 18, 2009, 2:46:01 PM

Hi!

I want to regress over some observations and fit the model y = a + bx.

I use regstats:
stats = regstats(testy,testx,'linear');

Now, I want to test the significance of the coefficients. From reading about regstats I find the p-values by typing
stats.tstat.pval

And this, I think, is the p-value for the following hypothesis
H0: a = 0, b = 0

Now, if I haven't forgotten all about statistics, the clue is that very low p-values indicate that H0 is rejected - the coefficients are significant. For instance, for a 5% significance level, two-sided test, I want p = 0.025 or lower to reject H0.

Am I right so far?

Anyway, this is what I have assumed. I decided to run a test.
>> testx = 1:1000;
>> testy = ones(1,length(testx));
>> stats = regstats(testy,testx,'linear');
>> stats.beta

ans =

1.000000000000002
0.000000000000000

Ok, so the linear regression is about y = 1, which would be correct.

However:
>> stats.tstat.pval

ans =

0
0.378372334394027

Ok, so the a is definitely significant, but I am only somewhat sure about the b. Why is this? Shouldn't this be very very certain?

The weirdest thing of all: if I repeat this test with testx = 1:100, i.e. with fewer data points to regress over, I get a p-value of 0.853129913004647 for b. As far as I can understand, I should now be less certain about my regression line, yet the p-value went up.

Can someone clear this up for me?

Thanks in advance,
Bendik

Wayne King

May 18, 2009, 3:02:01 PM

Hi Bendik,

Remember that the hypothesis test, as you have set it up, asks whether a coefficient is significantly different from zero. Your data clearly have a slope (beta_1) of zero, so you do not want to reject the null hypothesis

H_0: beta_1 = 0

If you had a small p-value, then you would reject the null hypothesis and conclude that the slope is different from zero, which is not what you want to do in this case.
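
To make the contrast concrete, here is a minimal sketch, assuming the Statistics Toolbox is installed and that rng is available for seeding. The noise level, the seed, and all variable names are made up for illustration and are not from the original post. It fits one data set whose true slope is zero and one whose true slope is 0.1, then prints the slope p-value for each.

```matlab
% Hypothetical data: same noise, two different true slopes
rng(1);                          % fix the seed so the run is repeatable
x       = (1:200)';
noise   = 0.5 * randn(size(x));
y_flat  = 1 + noise;             % true slope is 0
y_slope = 1 + 0.1 * x + noise;   % true slope is 0.1

s_flat  = regstats(y_flat,  x, 'linear');
s_slope = regstats(y_slope, x, 'linear');

% The second entry of beta and tstat.pval belongs to the slope
fprintf('flat:  slope = %.4f, p = %.3g\n', s_flat.beta(2),  s_flat.tstat.pval(2));
fprintf('slope: slope = %.4f, p = %.3g\n', s_slope.beta(2), s_slope.tstat.pval(2));
```

The p-value for the sloped data should come out vanishingly small, so H_0: beta_1 = 0 is rejected there. For the flat data the p-value is not expected to be small, because the null hypothesis is actually true.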

hope that helps,
wayne

Peter Perkins

May 18, 2009, 3:53:37 PM

Bendik Mjaaland wrote:

> Now, I want to test the significance of the coefficients. From reading about regstats I find the p-values by typing
> stats.tstat.pval
>
> And this, I think, is the p-value for the following hypothesis
> H0: a = 0, b = 0
>
> Now, if I haven't forgotten all about statistics, the clue is that very low p-values indicate that H0 is rejected - the coefficients are significant. For instance, for a 5% significance level, two-sided test, I want p = 0.025 or lower to reject H0.
>
> Am I right so far?

Not exactly. The t statistics are for separate hypothesis tests on the parameters individually. In particular, the second p-value is for the hypothesis H0: b == 0. It's actually for a two-sided test already -- you don't need to worry about tails.
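
As a rough check that the reported values are already two-sided, the sketch below recomputes them from the t statistics and the error degrees of freedom. It assumes the Statistics Toolbox (regstats, tcdf) and uses made-up noisy data; the seed and variable names are illustrative only.

```matlab
% Hypothetical data: intercept 1, slope 0, plus noise
rng(0);                          % fix the seed so the run is repeatable
x = (1:100)';
y = 1 + 0.5 * randn(size(x));

stats = regstats(y, x, 'linear');

t   = stats.tstat.t;             % one t statistic per coefficient
dfe = stats.tstat.dfe;           % error degrees of freedom (n - 2 here)

% Two-sided p-value: probability of a |t| at least this large under H0
p_manual = 2 * tcdf(-abs(t), dfe);

% The two columns should agree, one row per coefficient
disp([stats.tstat.pval, p_manual])
```

Since both tails are already included, each entry of stats.tstat.pval is compared directly against the chosen significance level (0.05 for a 5% test), not against half of it.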

>>> testx = 1:1000;
>>> testy = ones(1,length(testx));
>>> stats = regstats(testy,testx,'linear');

[snip]

> However:
>>> stats.tstat.pval
> ans =
> 0
> 0.378372334394027
>
> Ok, so the a is definitely significant, but I am only somewhat sure about the b. Why is this? Shouldn't this be very very certain?

What's the null hypothesis corresponding to that second p-value? H0: b==0. What is the "true" b? It's zero. So you correctly fail to reject. What you've said is correct for std errors or confidence intervals, though. And if you look at stats.tstat.se, you'll find that both are extremely small.
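
A minimal sketch of that last point, using the same test data as in the question and assuming the Statistics Toolbox (regstats, tinv). The 95% level and the variable names ci and tcrit are illustrative choices. Because the fit is essentially exact, the numbers involved are rounding noise and may differ between machines and MATLAB versions.

```matlab
% Same test data as in the original question, as column vectors
testx = (1:1000)';
testy = ones(size(testx));

stats = regstats(testy, testx, 'linear');

se  = stats.tstat.se;            % standard errors of [a; b]
dfe = stats.tstat.dfe;           % error degrees of freedom

% 95% confidence intervals: estimate +/- critical t value * standard error
tcrit = tinv(0.975, dfe);
ci    = [stats.beta - tcrit * se, stats.beta + tcrit * se];

fprintf('a: se = %.3g, CI = [%.15g, %.15g]\n', se(1), ci(1,1), ci(1,2));
fprintf('b: se = %.3g, CI = [%.3g, %.3g]\n',   se(2), ci(2,1), ci(2,2));
```

The very narrow intervals are where the "very very certain" intuition shows up: the estimates are pinned down tightly. The p-value answers a different question, namely whether the data contradict b == 0, and here they do not.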