how to get PC - percent variation explained %


Ming Liao

Dec 23, 2015, 9:30:33 AM
to Qiime 1 Forum
Hello there,
I am reading papers about 16S studies that use QIIME. In the PCoA plots, the x and y axes are labeled PC1 and PC2, along with text such as "PC1 - percent variation explained 28%" or "PC2 - percent variation explained 14%". How do I get these percentages?

I tried to find them in "emperor_pcoa_plots", but PC1, PC2 and PC3 are all shown in one plot. I cannot find the original files used to draw this plot, and there is no data about "percent variation explained %".

Next, I looked at "pcoa_unweighted_unifrac.txt", which contains the following sections (by the way, I have 134 samples):
Eigvals 134
...
Proportion explained 134
...
Species 0 0

Site 134 134
...

Biplot 0 0
...
Site constraints 0 0
...
I cannot find where PC1 or PC2 is in this file, and there is no "PC1 - percent variation explained %" or "PC2 - percent variation explained %".
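
For reference, the layout listed above can be read with a few lines of Python. This is a minimal sketch, assuming the file is tab-separated, that the values of "Proportion explained" sit on the line right after that header, and that the columns of the "Site" rows are PC1, PC2, ... in the same order. The function names and the three-sample example text are made up for illustration; they are not part of QIIME.

```python
def read_proportion_explained(text):
    """Return the 'Proportion explained' values as a list of floats."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("Proportion explained"):
            # The header ends with the number of values; the values
            # themselves are on the next line, one per axis (PC1, PC2, ...).
            return [float(v) for v in lines[i + 1].split("\t")]
    raise ValueError("no 'Proportion explained' section found")


def read_site_coordinates(text):
    """Return {sample_id: [PC1, PC2, ...]} from the 'Site' section."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        # "Site\t" avoids matching the separate "Site constraints" header.
        if line.startswith("Site\t"):
            n_rows = int(line.split("\t")[1])
            coords = {}
            for row in lines[i + 1 : i + 1 + n_rows]:
                fields = row.split("\t")
                coords[fields[0]] = [float(v) for v in fields[1:]]
            return coords
    raise ValueError("no 'Site' section found")


# Hypothetical three-sample file in the same layout as the listing above.
example = (
    "Eigvals\t3\n"
    "0.6\t0.3\t0.1\n"
    "\n"
    "Proportion explained\t3\n"
    "0.6\t0.3\t0.1\n"
    "\n"
    "Species\t0\t0\n"
    "\n"
    "Site\t3\t3\n"
    "S1\t0.1\t0.2\t0.0\n"
    "S2\t-0.3\t0.1\t0.0\n"
    "S3\t0.2\t-0.3\t0.0\n"
    "\n"
    "Biplot\t0\t0\n"
    "\n"
    "Site constraints\t0\t0\n"
)

proportions = read_proportion_explained(example)
coordinates = read_site_coordinates(example)

# Axis labels in the style seen in published PCoA plots.
for axis, p in enumerate(proportions[:2], start=1):
    print(f"PC{axis} - percent variation explained {100 * p:.2f}%")
```

The percentage on each axis label is simply the corresponding "Proportion explained" value multiplied by 100, and the PC1 and PC2 coordinates for plotting are the first two numbers on each sample's row in the "Site" section.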


Then I tried 'cmdscale' with the 'vegan' package. I can get PC1 and PC2 but no "percent variation explained %". It seems I need to calculate it manually, but I am not sure how to do that.
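
The manual calculation mentioned here is usually each eigenvalue divided by the sum of all eigenvalues, times 100. Below is a minimal sketch of that arithmetic. It assumes you already have the eigenvalues (in R, `cmdscale(d, eig = TRUE)` returns them in `$eig`), and it assumes negative eigenvalues are set to zero before normalizing, which is one common convention but not the only one. The function name and the eigenvalues are made up for illustration.

```python
def percent_explained(eigenvalues):
    """Percent of variation explained by each axis, given PCoA eigenvalues."""
    # Assumption: negative eigenvalues are treated as zero before normalizing.
    clipped = [max(v, 0.0) for v in eigenvalues]
    total = sum(clipped)
    if total == 0:
        raise ValueError("no positive eigenvalues to normalize by")
    return [100.0 * v / total for v in clipped]


# Hypothetical eigenvalues, largest first, with one small negative value.
eigenvalues = [6.0, 3.0, 1.0, -0.5]
print(percent_explained(eigenvalues))  # [60.0, 30.0, 10.0, 0.0]
```

The first two entries are the numbers that would go on the PC1 and PC2 axis labels.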

To sum up: how do I get the PC1 and PC2 data from QIIME, and how do I get the corresponding "percent variation explained %"? Thanks.



Sincerely,

Ming

Colin Brislawn

Dec 23, 2015, 5:11:18 PM
to Qiime 1 Forum
Hello Ming,

Making an Emperor plot has three steps:
  1. calculate distances (often UniFrac distances) between samples.
  2. ordinate these samples using the distances (PCoA)
  3. plot these using emperor.
In QIIME, you can do these steps with these scripts:
  1. beta_diversity.py
  2. principal_coordinates.py
  3. make_emperor.py

Have you used these qiime scripts before? If you have, perhaps you can post the output file from step 2 and I can help you find the relevant parts.

Thanks!
Happy Holidays,
Colin

Ming Liao

Dec 24, 2015, 5:09:04 PM
to Qiime 1 Forum

Thanks, Colin. I have tried that. principal_coordinates.py printed the following warning. Is that OK?

RuntimeWarning: The result contains negative eigenvalues. Please compare their magnitude with the magnitude of some of the largest positive eigenvalues. If the negative ones are smaller, it's probably safe to ignore them, but if they are large in magnitude, the results won't be useful. See the Notes section for more details. The smallest eigenvalue is -0.0822042908367 and the largest is 7.39238517342.

Anyway, I got the results. The first three PCs have these "Proportion explained" values:
0.0525904588458 0.0369141304469 0.024300850773
Are these numbers too small? What is an acceptable proportion?

I really have no idea how to interpret the results.

Ming

Colin Brislawn

Dec 24, 2015, 6:37:19 PM
to Qiime 1 Forum
Hi Ming,

Those Eigenvalues look OK to me. I think you can safely ignore that warning. 
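
The warning quoted earlier suggests comparing the magnitude of the most negative eigenvalue with the largest positive one. A quick sketch of that comparison, using the two numbers from the warning itself (the variable names are only illustrative):

```python
# Values copied from the RuntimeWarning quoted in this thread.
smallest_eigenvalue = -0.0822042908367
largest_eigenvalue = 7.39238517342

# Size of the negative eigenvalue relative to the largest positive one.
ratio = abs(smallest_eigenvalue) / largest_eigenvalue
print(f"{ratio:.1%}")  # 1.1%
```

The negative eigenvalue is only about 1% of the largest positive one, which fits the "probably safe to ignore" case described in the warning.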

The proportions explained by those PCs are kind of small, but that happens sometimes. May I ask what distance metric you are using? Sometimes I find that one metric works better than another. For example, sometimes unweighted UniFrac works better than weighted UniFrac. Maybe you could try another metric and see if that produces better results. Also, how many samples do you have? With more samples, lower proportions explained are expected.

Colin

Ming Liao

Dec 25, 2015, 1:02:46 AM
to qiime...@googlegroups.com
Thanks, Colin.

Actually, I am not sure what distance metric I used. I just followed the tutorial from the Werner Lab:
http://www.wernerlab.org/teaching/qiime/overview/f

jackknifed_beta_diversity.py -i otu_table.biom -o jackknifed_beta_diversity/ -e 20000 -m Fasting_Map.txt -t rep_set_tree.tre

I have 134 samples. The "Proportion explained" values I showed previously were from this file:
"pcoa_unweighted_unifrac_rarefaction_20000_0.txt"

Moreover, I don't know why the default method is "jackknifed". Can other resampling methods be applied in QIIME? There are at least four major types of resampling: randomization exact test, cross-validation, jackknife, and bootstrap. http://pareonline.net/getvn.asp?v=8&n=19


Merry Christmas

Ming

Colin Brislawn

Dec 25, 2015, 5:19:56 PM
to Qiime 1 Forum
Merry Christmas Ming!

Looks like weighted and unweighted UniFrac are used by default.

I have not used this script before, so I can't speculate why those numbers are low or how to make them higher. I usually use the three scripts I listed before, and can choose different distance metrics when running beta_diversity.py.

The default method of 'jackknifed' was chosen a while ago. It looks like bootstrap may be supported too. It may be worth trying.
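
To illustrate the difference between the two resampling styles mentioned here, below is a minimal sketch on one sample's OTU counts. It assumes "jackknifed" means repeatedly subsampling sequences without replacement to a fixed depth (as the -e option in the command above suggests), while a bootstrap draws with replacement. The function names and counts are made up for illustration; this is not QIIME's own code.

```python
import random


def subsample_without_replacement(counts, depth, rng):
    """Jackknife-style subset: draw `depth` sequences without replacement."""
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the number of sequences in the sample")
    drawn = rng.sample(pool, depth)
    return {otu: drawn.count(otu) for otu in counts}


def resample_with_replacement(counts, depth, rng):
    """Bootstrap-style resample: draw `depth` sequences with replacement."""
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    drawn = rng.choices(pool, k=depth)
    return {otu: drawn.count(otu) for otu in counts}


# Hypothetical OTU counts for a single sample (30 sequences in total).
sample_counts = {"OTU_1": 12, "OTU_2": 5, "OTU_3": 13}
rng = random.Random(0)  # fixed seed so the run is repeatable

jackknife_subset = subsample_without_replacement(sample_counts, 20, rng)
bootstrap_resample = resample_with_replacement(sample_counts, 30, rng)
print(jackknife_subset)
print(bootstrap_resample)
```

In the first case no OTU can end up with more sequences than it started with; in the second case it can, because the same sequence may be drawn more than once.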

Colin
