How to obtain variation contribution information from eigenvalues? glPca and poppr related


Yu NING

Dec 15, 2017, 4:00:17 AM
to poppr
Hello, everyone

I've been practicing with "poppr", and the author's primer is very helpful. However, I'm a little confused about the PCA analysis section in PART3, shown in the picture:

The example obtains a variance contribution of "63.183%" by plotting the eigenvalues. When I apply the script to my own data, the plot shows only the eigenvalues, as in this picture:

I have browsed the documentation for "glPca" but have not found a way to get the specific variance contribution of each axis. Can you help me understand why these two plots differ and how I can get the variance contribution information? I would be very grateful for your advice.


My script is:

> Kob_pca <- glPca(Kob_gl, nf = 3)
> barplot(Kob_pca$eig, col = heat.colors(50), main="PCA Eigenvalues")
> Kob_pca
 === PCA of genlight object ===
Class: list of type glPca
Call ($call):glPca(x = Kob_gl, nf = 3)

Eigenvalues ($eig):
 69.735 61.345 60.613 59.171 58.447 57.455 ...

Principal components ($scores):
 matrix with 41 rows (individuals) and 3 columns (axes) 

Principal axes ($loadings):
 matrix with 21373 rows (SNPs) and 3 columns (axes) 


My session info is:

> sessionInfo()
R version 3.3.3 (2017-03-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
   
attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] igraph_1.1.2   readxl_0.1.1   ape_5.0        poppr_2.5.0    vcfR_1.5.0    
[6] adegenet_2.1.0 ade4_1.7-8    

loaded via a namespace (and not attached):
 [1] phangorn_2.3.1    gtools_3.5.0      memuse_4.0-0      reshape2_1.4.2   
 [5] splines_3.3.3     lattice_0.20-35   colorspace_1.3-2  expm_0.999-2     
 [9] htmltools_0.3.5   viridisLite_0.2.0 pegas_0.10        mgcv_1.8-17      
[13] rlang_0.1.4       glue_1.2.0        sp_1.2-4          bindrcpp_0.2     
[17] bindr_0.1         plyr_1.8.4        stringr_1.2.0     munsell_0.4.3    
[21] gtable_0.2.0      coda_0.19-1       permute_0.9-4     httpuv_1.3.5     
[25] parallel_3.3.3    spdep_0.7-4       Rcpp_0.12.10      pinfsc50_1.1.0   
[29] xtable_1.8-2      scales_0.4.1      gdata_2.18.0      vegan_2.4-2      
[33] mime_0.5          deldir_0.1-14     fastmatch_1.1-0   ggplot2_2.2.1    
[37] digest_0.6.12     stringi_1.1.2     gmodels_2.16.2    dplyr_0.7.4      
[41] shiny_1.0.5       grid_3.3.3        quadprog_1.5-5    tools_3.3.3      
[45] LearnBayes_2.15   magrittr_1.5      lazyeval_0.2.0    tibble_1.3.4     
[49] cluster_2.0.5     seqinr_3.4-5      pkgconfig_2.0.1   MASS_7.3-45      
[53] Matrix_1.2-8      spData_0.2.6.7    assertthat_0.1    R6_2.2.2         
[57] boot_1.3-18       nlme_3.1-131  






Zhian Kamvar

Dec 15, 2017, 10:34:14 AM
to Yu NING, poppr
Hello,

This question is more appropriate for the adegenet forums since glPca() is an adegenet function, but since you're here:

The plots show the eigenvalues (you can think of these as variances) for each principal component. The sum of all the eigenvalues is the total variance of the observed data. Knowing that, the proportion of variance explained by a set of axes is the sum of their eigenvalues divided by that total. For the first three axes, as a percentage:

var_frac <- Kob_pca$eig/sum(Kob_pca$eig)
signif(sum(var_frac[1:3]) * 100, 3)

You can see this in practice when you look at the source Rmarkdown document for that section: https://github.com/grunwaldlab/Population_Genetics_in_R/blob/master/gbs_analysis.Rmd#principal-components-analysis
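If you want the percentage for each axis separately rather than the total for the first three, the same division gives it directly. The sketch below is only an illustration: the eigenvalues are a toy vector, not values from Kob_gl, and the name pct is made up for the example. With real data you would substitute Kob_pca$eig, which, as far as I know, holds all the eigenvalues regardless of the nf setting.

```r
# Toy eigenvalues for illustration only (NOT from the Kob_gl data).
# With a real analysis, use: eig <- Kob_pca$eig
eig <- c(4, 3, 2, 1)

# Percentage of the total variance carried by each axis.
pct <- 100 * eig / sum(eig)

# Same barplot as in the question, with each bar labelled by its percentage.
barplot(eig, col = heat.colors(length(eig)), main = "PCA Eigenvalues",
        names.arg = paste0(round(pct, 1), "%"))
```

The bar heights are still the raw eigenvalues; only the labels under the bars change.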

A word of caution: you specified to retain 3 principal components for the analysis, but your plot shows that you probably want to retain 12 or 19 components. 
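One common heuristic for choosing how many components to keep (not necessarily the one meant above, which is based on the shape of the eigenvalue plot) is to look at the cumulative percentage of variance. A minimal sketch, assuming a toy eigenvalue vector and a hypothetical 75% threshold; neither comes from the Kob_gl data:

```r
# Toy eigenvalues for illustration only (NOT from the Kob_gl data).
# With a real analysis, use: eig <- Kob_pca$eig
eig <- c(4, 3, 2, 1)

# Cumulative percentage of variance explained by the first k axes.
cum_pct <- 100 * cumsum(eig) / sum(eig)

# Hypothetical threshold, chosen only for this example.
threshold <- 75

# Smallest number of axes whose cumulative share reaches the threshold.
n_axes <- which(cum_pct >= threshold)[1]
```

The resulting n_axes could then be passed as nf when rerunning glPca(), although a variance threshold is only a rule of thumb and the eigenvalue plot is still worth inspecting.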

Hope that helps,
Zhian

-----
Zhian N. Kamvar, Ph. D.
Postdoctoral Researcher (Everhart Lab)
Department of Plant Pathology
University of Nebraska-Lincoln
ORCID: 0000-0003-1458-7108





Yu NING

Dec 16, 2017, 12:25:33 AM
to poppr
Great! I get it now. I found that the first 10 axes explain about 72% of the variance. I'll follow the link and learn more. I really appreciate your help!
