How do I get the output in terms of the cumulative % of the total
variance? When I go from the full solution of 8 components (there are 8
variables in the data set) to a reduced number of components, I want to
evaluate the % of total variance explained. Or am I missing something?
> princ = prcomp(df[,-1],rotate="varimax",scale=TRUE)
> summary(princ)
Importance of components:
                         PC1   PC2   PC3   PC4   PC5   PC6    PC7    PC8
Standard deviation     1.381 1.247 1.211 0.994 0.927 0.764 0.6708 0.4366
Proportion of Variance 0.238 0.194 0.183 0.124 0.107 0.073 0.0562 0.0238
Cumulative Proportion  0.238 0.433 0.616 0.740 0.847 0.920 0.9762 *1.0000*
> princ = prcomp(df[,-1],rotate="varimax",scale=TRUE,tol=.75)
> summary(princ)
Importance of components:
                         PC1   PC2   PC3
Standard deviation     1.381 1.247 1.211
Proportion of Variance 0.387 0.316 0.297
Cumulative Proportion  0.387 0.703 *1.000*
______________________________________________
R-h...@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Stephen Sefick
Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods. We are mammals, and have not exhausted the
annoying little problems of being mammals.
-K. Mullis
> princ = prcomp(df[,-1],rotate="varimax",scale=TRUE,tol=.95)
> summary(princ)
Importance of components:
                        PC1
Standard deviation     1.38
Proportion of Variance 1.00
Cumulative Proportion  1.00
______________________________________________
In the second PCA you are asking how much of the variance of the THREE (!)
variables is captured by the first, second, and third principal components.
Of course, you need only as many PCs as there are variables to capture 100%
of the variance. Your "problem" thus comes from the fact that you have eight
variables in the first PCA, which requires eight PCs to capture 100%, and
only three variables in the second PCA, which naturally requires only
three PCs to capture 100% of the variance.
So the answer is yes, you are missing something in this case; nothing is
wrong with the analyses.
HTH,
Daniel
-------------------------
cuncta stricte discussurus
-------------------------
-----Original Message-----
From: r-help-...@r-project.org [mailto:r-help-...@r-project.org] On
Behalf Of zubin
Sent: Monday, November 09, 2009 12:37 PM
To: r-h...@r-project.org
Subject: [R] prcomp - principal components in R
Example: 1 component, 8 variables. There is no way 1 component explains
100% of the variance of the 8-variable data set.
> princ = prcomp(df[,-1],rotate="varimax",scale=TRUE,tol=.95)
> summary(princ)
Importance of components:
                        PC1
Standard deviation     1.38
Proportion of Variance 1.00
Cumulative Proportion  1.00
> summary(princ)
Rotation:
                PC1
VIX0    -0.08217686
UUP0    -0.18881983
USO0     0.26647346
GLD0     0.26983923
HYG0     0.60674758
term0    0.18220237
spread0  0.61614047
TNX0     0.18111684
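The two summaries can be reconciled with plain arithmetic on the standard deviations printed in the first summary(princ) output. The sketch below is only a check, not prcomp's own code: it assumes that the summary for the tol run divided by the variance of the retained components rather than by the total, which is what the printed numbers suggest. The variable names are made up for illustration, and results differ slightly in the third decimal because the inputs are the rounded values shown above.

```r
# Standard deviations copied from the first summary(princ) output in this thread.
sdev_all <- c(1.381, 1.247, 1.211, 0.994, 0.927, 0.764, 0.6708, 0.4366)
var_all  <- sdev_all^2

# Cumulative share of the TOTAL variance (all 8 components in the denominator).
cum_total <- cumsum(var_all) / sum(var_all)
print(round(cum_total[1:3], 2))

# Share relative to the first 3 components only (assumed to be what the
# tol = .75 summary reported): the denominator shrinks, so the shares sum to 1.
var_kept  <- var_all[1:3]
prop_kept <- var_kept / sum(var_kept)
print(round(prop_kept, 2))
```

Under that assumption, the first three entries of cum_total are the figures to use for "% of total variance explained", and they agree with the cumulative proportions in the 8-component summary.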
Here's how to get the output you want (the last line in the transcript below):
> set.seed(1)
> summary(pc1 <- prcomp(x))
Importance of components:
                         PC1   PC2   PC3   PC4   PC5
Standard deviation     1.175 1.058 0.976 0.916 0.850
Proportion of Variance 0.275 0.223 0.190 0.167 0.144
Cumulative Proportion  0.275 0.498 0.688 0.856 1.000
> summary(pc2 <- prcomp(x, tol=0.8))
Importance of components:
                        PC1   PC2   PC3
Standard deviation     1.17 1.058 0.976
Proportion of Variance 0.40 0.324 0.276
Cumulative Proportion  0.40 0.724 1.000
> pc2$sdev
[1] 1.1749061 1.0581362 0.9759016
> pc1$sdev
[1] 1.1749061 1.0581362 0.9759016 0.9164905 0.8503122
> svd(scale(x, center=T, scale=F))$d / sqrt(nrow(x)-1)
[1] 1.1749061 1.0581362 0.9759016 0.9164905 0.8503122
> cumsum(pc1$sdev^2) / sum((svd(scale(x, center=T, scale=F))$d / sqrt(nrow(x)-1))^2)
[1] 0.2752317 0.4984734 0.6883643 0.8558386 1.0000000
>
> # output in terms of the cumulative % of the total variance
> cumsum(pc2$sdev^2) / sum((svd(scale(x, center=T, scale=F))$d / sqrt(nrow(x)-1))^2)
[1] 0.2752317 0.4984734 0.6883643
>
It's probably better to have prcomp compute all the components in the first place, because the SVD is the bulk of the computation anyway, so doing it a second time will be slow for large matrices. Then just look at the most important principal components. If the computation is demanding, there may be a shortcut for getting the values of D in the SVD without the full decomposition: for example, the square roots of the eigenvalues of the covariance matrix of the centered x, sqrt(eigen(var(scale(x, center=T, scale=F)), only.values=T)$values).
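A minimal sketch of that suggestion, for illustration only. The matrix x here is hypothetical simulated data (the x used in the transcript above is not shown), and cum_var, k, and sdev_eigen are names invented for this example. It fits prcomp once with every component kept, computes the cumulative share of the total variance, and compares the standard deviations with the eigenvalue shortcut.

```r
set.seed(1)
# Hypothetical data: 100 observations of 5 independent normal variables.
x <- matrix(rnorm(100 * 5), ncol = 5)

# Fit once with all components retained (no tol).
pc <- prcomp(x)

# Cumulative share of the total variance for every component.
cum_var <- cumsum(pc$sdev^2) / sum(pc$sdev^2)

# Look at only the first k components; the denominator is still the total.
k <- 3
print(cum_var[seq_len(k)])

# The same standard deviations from the eigenvalues of the covariance matrix
# (var() centers the columns itself, so no explicit scale() call is needed).
sdev_eigen <- sqrt(eigen(var(x), only.values = TRUE)$values)
print(all.equal(pc$sdev, sdev_eigen))
```

Keeping the full fit means the number of components to report can be changed afterwards by changing k, without rerunning the decomposition.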
-- Tony Plate