require(pcaMethods)
require(ggplot2)
require(scales)  # provides rescale()

x = c(runif(50,min=0,max=1),runif(50,min=0.5,max=3))
dim(x) = c(10,10)

PCA<- pca(x, method="nipals", scale="uv", nPcs=3, center=T, completeObs=T, cv="q2")
Q2<-Q2(PCA)
Q2
PCA@R2
#Need to find the point at which Q2 no longer increases
#bio data R2~= 0.3 and Q2 <= R2 +/- 0.2 is fine.
PCA@scores
PCA@loadings

#loadings are matrix, need data.frame for ggplot
loadings<-as.data.frame(PCA@loadings)
#carry over names of rows
loadings$Names<-rownames(loadings)
#scores=matrix too
scores<-data.frame(PCA@scores)

#pure number scaling not very good but works... for now
#rescaling Loadings to scores
loadings[[1]]<-rescale(loadings[[1]], to = c(min(scores[[1]]),max(scores[[1]])))
loadings[[2]]<-rescale(loadings[[2]], to = c(min(scores[[2]]),max(scores[[2]])))
loadings[[3]]<-rescale(loadings[[3]], to = c(min(scores[[3]]),max(scores[[3]])))

ggplot()+
  geom_point(data=scores, aes(x=PC1, y=PC2))+
  geom_segment(data=loadings, aes(x=0, y=0, xend=PC1, yend=PC2),
               arrow=arrow(length=unit(0.2,"cm")), alpha=0.25)+
  geom_text(data=loadings, aes(x=PC1, y=PC2, label=Names), alpha=0.5, size=3)+
  scale_colour_discrete("Variety")+
  scale_x_continuous("Principal Component 1")+
  scale_y_continuous("Principal Component 2")+
  theme_bw()
--
You received this message because you are subscribed to the ggplot2 mailing list.
scale_x_continuous(sprintf("PC1 (%s%%)", round(PCA@R2[1],digits=2)*100))+
scale_y_continuous(sprintf("PC2 (%s%%)", round(PCA@R2[2],digits=2)*100))+
theme_bw()
Thanks for sharing.
I think by
scale_x_continuous(sprintf("PC1 (%s)", round(PCA@R2cum[1],digits=2)))+
scale_y_continuous(sprintf("PC2 (%s)", round(PCA@R2cum[2],digits=2)))
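To make the difference between the two labelling variants concrete, here is a small sketch that builds the axis titles from per-component and cumulative R2 values. The numbers in `r2` are made up and only stand in for `PCA@R2`; the sketch also assumes that `PCA@R2cum` is the running sum of `PCA@R2`.

```r
# Hypothetical per-component R2 values standing in for PCA@R2
r2 <- c(0.4231, 0.2518, 0.1102)

# Assumption: the cumulative slot is the running sum of the per-component values
r2cum <- cumsum(r2)

# Per-component labels, shown as percentages
lab_individual <- sprintf("PC%d (%.0f%%)", 1:2, 100 * r2[1:2])

# Cumulative labels, shown as fractions
lab_cumulative <- sprintf("PC%d (cumulative R2 = %.2f)", 1:2, r2cum[1:2])

print(lab_individual)
print(lab_cumulative)
```

Either vector can be passed element-wise to scale_x_continuous() and scale_y_continuous(). Note that a cumulative label on the PC2 axis describes PC1 and PC2 together, not PC2 alone.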
2010/12/7 Brandon Hurr <bhi...@gmail.com>
You should also use coord_equal to ensure that distances are preserved
accurately.
Hadley
--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/
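For illustration, a minimal sketch of the coord_equal() suggestion. The `scores` data frame here is simulated, not the pcaMethods output from the original post.

```r
library(ggplot2)

set.seed(1)
# Hypothetical scores with very different spreads on the two components
scores <- data.frame(PC1 = rnorm(20, sd = 3), PC2 = rnorm(20, sd = 1))

p <- ggplot(scores, aes(x = PC1, y = PC2)) +
  geom_point() +
  coord_equal() +  # one unit on x is drawn the same length as one unit on y
  theme_bw()

print(p)
```

Without coord_equal() the panel is stretched to fill the plotting area, so distances and angles between points and loading arrows can be distorted.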
Dear Brandon,
I was trying to make this biplot, but I ran into problems:
1) I cannot use colour=factor(data$Treatment) and shape=factor(data$Subject) at the same time in geom_point(...)
Only one of them works, either colour or shape.
2) When I put everything in a function body, it somehow complains about a missing label in geom_text
Error: geom_text requires the following missing aesthetics: label
When I run the code line by line it works...
3) Is scaling the loadings the correct way to do it?
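A possible way around points 1) and 2), sketched under assumptions: the grouping variables are stored as columns of the same data frame that holds the scores, and they are referred to by bare column name inside aes() rather than as data$Treatment. The data frame and the function name `biplot_points` are hypothetical, and this is only one plausible cause of the errors described, not a confirmed diagnosis.

```r
library(ggplot2)

set.seed(2)
# Hypothetical scores with the two grouping columns from the question
scores <- data.frame(
  PC1 = rnorm(12),
  PC2 = rnorm(12),
  Treatment = rep(c("A", "B", "C"), times = 4),
  Subject = rep(c("s1", "s2", "s3", "s4"), each = 3)
)

# Hypothetical helper: everything aes() needs is a column of df,
# so the plot does not depend on objects outside the function
biplot_points <- function(df) {
  ggplot(df, aes(x = PC1, y = PC2)) +
    geom_point(aes(colour = factor(Treatment), shape = factor(Subject))) +
    labs(colour = "Treatment", shape = "Subject") +
    theme_bw()
}

p <- biplot_points(scores)
print(p)
```

The same idea applies to the label in geom_text: if the names are a column of the loadings data frame that is passed into the function, aes(label = Names) can find them regardless of where the function is called from.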
I tried 3 methods and in the end invented my own:
(1) Multiply the loadings by the standard deviation of the scores (this is how SIMCA-P does it)
(2) Multiply all the loadings by the square root of the number of variables
(3) Divide the scores by the variance of each score column, or the eigenvalue (this is Richard Brereton’s method)
None of these gives you what I think you want, which is a superimposed plot.
My method, which is aesthetically pleasing at least, is to divide both the scores and the loadings by their row or column standard deviation.
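To compare the scalings listed above side by side, here is a rough sketch using base R's prcomp() on simulated data instead of pcaMethods. The variable names are made up, and the last variant is only one reading of "row or column standard deviation" (column-wise here); none of this is claimed to reproduce SIMCA-P or Brereton exactly.

```r
set.seed(3)
# Simulated data: 12 observations, 5 hypothetical variables V1..V5
x <- matrix(rnorm(60), nrow = 12, ncol = 5,
            dimnames = list(NULL, paste0("V", 1:5)))

pc <- prcomp(x, center = TRUE, scale. = TRUE)
scores <- pc$x[, 1:2]
loadings <- pc$rotation[, 1:2]

# (1) loadings multiplied by the standard deviation of the matching score column
load_sd <- sweep(loadings, 2, apply(scores, 2, sd), `*`)

# (2) loadings multiplied by the square root of the number of variables
load_sqrtp <- loadings * sqrt(nrow(loadings))

# (3) scores divided by the variance of each score column
scores_var <- sweep(scores, 2, apply(scores, 2, var), `/`)

# Last variant (assumed column-wise): scores and loadings each divided
# by their own column standard deviation
scores_std <- sweep(scores, 2, apply(scores, 2, sd), `/`)
loadings_std <- sweep(loadings, 2, apply(loadings, 2, sd), `/`)

# Compare the ranges that would end up on a shared axis
print(range(scores_std))
print(range(loadings_std))
```

With the last variant both sets of points have unit standard deviation per component, which is why they overlay on a common scale. The axes then no longer carry the original score units.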