i have a very stupid question: what is the difference
between
[r,p]=corr(X)
and
[r,p]=corrcoef(X)?
it seems that corr calculates the Bravais-Pearson
correlation coefficient R: why then is the diagonal 0? Should
it not be 1?
R = cov(x,y)/(std(x)*std(y))
cov(x,x) = var(x) = std(x)^2 = std(x)*std(x) --> R = 1
Finally, how do I decide which one to use to test whether
there is a linear correlation between two variables?
P.S.
I would also like to calculate the regression line, if
either function can be used for that.
cheers
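(Regarding the P.S.: neither corr nor corrcoef fits a line, but a
degree-1 polynomial fit gives the least-squares regression line. A
minimal sketch, assuming x and y are column vectors of equal length:)

coeffs = polyfit(x, y, 1);    % least-squares line: coeffs = [slope, intercept]
yfit = polyval(coeffs, x);    % points on the fitted regression line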
> i have a very stupid question: what is the difference
> between
>
> [r,p]=corr(X)
>
> and
>
> [r,p]=corrcoef(X)?
The difference is mostly the 'type' option of corr:
    'type' -- 'Pearson' (the default) to compute Pearson's linear
    correlation coefficient, 'Kendall' to compute Kendall's
    tau, or 'Spearman' to compute Spearman's rho.
i.e., corrcoef _only_ computes the linear correlation, while corr computes rank
correlations as well.
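For example (a minimal sketch; X here is just illustrative data):

X = rand(20,3);                           % example data, 20 obs of 3 variables
[r1,p1] = corr(X);                        % Pearson, the default
[r2,p2] = corr(X, 'type', 'Spearman');    % Spearman's rho (rank correlation)
[r3,p3] = corr(X, 'type', 'Kendall');     % Kendall's tau
[r4,p4] = corrcoef(X);                    % Pearson only, no 'type' option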
> it seems that corr calculates the Bravais-Pearson
> correlation coefficient R: why then is the diagonal 0? Should
> it not be 1?
> R = cov(x,y)/(std(x)*std(y))
> cov(x,x) = var(x) = std(x)^2 = std(x)*std(x) --> R = 1
Under what conditions is the diagonal of r from [r,p]=corr(X) zero? Certainly
the diagonal of p is zero.
Hope this helps.
b and d (the p-value outputs of corrcoef and corr, respectively)
are different: diag(b) = [1 1], diag(d) = [0 0].
Why is that?
Peter Perkins <Peter.Perki...@mathworks.com> wrote
in message <g3ram4$muh$1...@fred.mathworks.com>...
Lorenzo, I'm curious what you think the right answer is, and why it would be useful.
CORRCOEF only accepts a single input, thus the diagonal elements necessarily
correspond to the correlation of a variable with itself, which is necessarily 1;
the corresponding p-value is meaningless, so CORRCOEF reports it as 1. CORR
allows "cross correlation" (although you aren't using that syntax), and so the
diagonal elements could be correlations between two different variables; if the
two vectors of values happen to be identical, then the correlation is 1 and the
p-value is 0.
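A minimal sketch of that two-input syntax (X and Y are arbitrary example
data here):

X = rand(10,3);               % 10 observations of 3 variables
Y = rand(10,2);               % 10 observations of 2 other variables
[r,p] = corr(X, Y);           % r and p are 3-by-2, not square:
                              % r(i,j) correlates X(:,i) with Y(:,j)
[r1,p1] = corr(X, X(:,1));    % first entries: r1(1) is 1 and p1(1) is 0,
                              % since X(:,1) is identical to itself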
Peter Perkins <Peter.Perki...@mathworks.com> wrote
in message <g3tjee$nbm$1...@fred.mathworks.com>...
Lorenzo, we're talking about the _diagonal_ of the outputs, are we not? What
test are you using to see if a variable is significantly correlated with itself,
and what would the result tell you?
Thanks
Peter Perkins <Peter.Perki...@mathworks.com> wrote
in message <g408gl$soq$1...@fred.mathworks.com>...
Lorenzo, again, why do you want to test if a variable is correlated with itself?
For the reason why the two functions return different values _along the
diagonal_ of P, see my previous post.
Let me try:
>> x = rand(10,2);
>> [r,p] = corrcoef(x)
r =
1.0000 0.2682
0.2682 1.0000
p =
1.0000 0.4537
0.4537 1.0000
>> [r,p] = corr(x)
r =
1.0000 0.2682
0.2682 1.0000
p =
0 0.4537
0.4537 0
Both corr and corrcoef produce the same correlation between the x columns
(0.2682) and the same p-value for it (0.4537).
It generally doesn't make sense to test whether a variable is correlated
with itself. So corrcoef puts 1 along the diagonal of the correlation
matrix, but reports 1 (rather than 0) as the corresponding p-value. That
way, if you search for significant correlations, you'll never flag the
diagonal as significant.
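For instance, continuing from the example above, a sketch that scans for
significant off-diagonal correlations and so sidesteps either diagonal
convention (the 0.05 threshold is just illustrative):

mask = ~eye(size(p,1));      % true everywhere except the diagonal
sig = (p < 0.05) & mask;     % significant off-diagonal entries
[i,j] = find(triu(sig,1));   % each variable pair reported once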
The corr function is a little more general and can compute correlations
between pairs of columns from two different inputs. In that case the result
won't necessarily be square, it won't necessarily have 1's along the
diagonal, and a correlation of 1 is highly significant. So it gives the
p-value as 0 when the correlation is 1. But still, if you compute a
correlation matrix for a single input, the entries along the diagonal are
necessarily 1 and the p-value is not meaningful.
So it's perhaps unfortunate that these two functions have adopted different
conventions for what to put along the diagonal for the matrix of p-values
when there's a single input matrix, but you don't want to use those diagonal
values anyway.
-- Tom