Understanding QIIME beta diversity results: PCoA plot and beta diversity matrix

julie smith

Aug 31, 2015, 7:51:38 PM
to qiime...@googlegroups.com
Hi, 

I am very new to QIIME and to the metagenomics field, and I need help understanding the results and plots that I have generated with QIIME.

I have run the beta diversity workflow script on my input biom file. There are 4 different metagenomic bacterial communities, and each community has around 20 species. I didn't group the 4 communities with each other; I just wanted to see the differences among them. The 4 communities are labelled Sample1, Sample2, Sample3, and Sample4 in the attached Emperor plots/screenshots.

I decided to do a beta diversity analysis on these 4 samples using QIIME. After installing QIIME, I prepared the input files by studying the tutorials and examples on the QIIME website, which were very helpful.

Attached are the input OTU txt file that I prepared and its biom version. The numbers in the OTU file represent the number of reads per species in each sample. I ran the beta diversity through plots workflow:


beta_diversity_through_plots.py -i OTU.biom -m map.txt -p Parameter.txt -o output -f


The commands that were run are pasted at the end of this message, and my questions are listed below. I would appreciate any feedback or comments.

1. What does the PCoA plot (attached) tell me about these 4 samples?
2. What do the PCoA distances tell me about these samples (file attached)?
3. How do I interpret the PC matrix, and what does it tell me (file attached)?
4. What do the numbers in the distance matrix tell me (file attached)?

Basically, I currently need some help to understand how to interpret these results. I would very much appreciate any insights you could provide.   

Thank you,
Julie


# Beta Diversity (euclidean) command 
beta_diversity.py -i OTU_species.biom -o outputDir --metrics euclidean 

Stdout:

Stderr:
/usr/local/lib/python2.7/dist-packages/numpy/core/fromnumeric.py:2507: VisibleDeprecationWarning: `rank` is deprecated; use the `ndim` attribute or function instead. To find the rank of a matrix see `numpy.linalg.matrix_rank`.
  VisibleDeprecationWarning)

# Rename distance matrix (euclidean) command 
mv /euclidean_OTU_species.txt  euclidean_dm.txt

Stdout:

Stderr:

# Principal coordinates (euclidean) command 
principal_coordinates.py -i euclidean_dm.txt -o euclidean_pc.txt 

Stdout:

Stderr:
/usr/local/lib/python2.7/dist-packages/skbio/stats/ordination/_principal_coordinate_analysis.py:107: RuntimeWarning: The result contains negative eigenvalues. Please compare their magnitude with the magnitude of some of the largest positive eigenvalues. If the negative ones are smaller, it's probably safe to ignore them, but if they are large in magnitude, the results won't be useful. See the Notes section for more details. The smallest eigenvalue is -5.97764983814e-08 and the largest is 322496773.33.
  RuntimeWarning

# Make emperor plots, euclidean) command 
make_emperor.py -i euclidean_pc.txt -o /euclidean_emperor_pcoa_plot/ -m map.txt 

Stdout:

Stderr:


Logging stopped at 16:18:16 on 25 Aug 2015
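
The RuntimeWarning from principal_coordinates.py in the log above asks for the magnitude of the negative eigenvalue to be compared with the largest positive one. A minimal sketch of that comparison, using only the two values printed in the warning; the 1e-9 cutoff is an illustrative assumption, not a QIIME or scikit-bio setting:

```python
# Values copied from the RuntimeWarning in the log.
smallest = -5.97764983814e-08
largest = 322496773.33

# Relative size of the negative eigenvalue compared with the largest one.
ratio = abs(smallest) / largest
print("relative magnitude: %.3e" % ratio)

# Hypothetical cutoff for "small enough to ignore" (an assumption).
NEGLIGIBLE = 1e-9
negligible = ratio < NEGLIGIBLE
print("negligible" if negligible else "worth a closer look")
```

A ratio this many orders of magnitude below 1 is consistent with floating-point round-off, which is the case the warning describes as probably safe to ignore.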


OTU_species.txt
OTU_species.biom.txt
euclidean_pc_table.txt
euclidean_dm.txt
map.txt
Parameter.txt
PCOAPlot_1.png
DistancePlot.png

Antonio González Peña

Sep 1, 2015, 8:20:02 AM
to Qiime Forum
At this point the best you can say is that, using Euclidean distance
(I'm not sure why you chose Euclidean, and I suggest taking a look at
http://www.nature.com/nmeth/journal/v7/n10/abs/nmeth.1499.html), green
and red are more similar on PC1 than orange and blue. You could make
similar comparisons along the other axes.
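
To make the link between the distance matrix and the PCoA axes concrete, here is a minimal NumPy sketch of a Euclidean distance matrix followed by classical PCoA. It is not QIIME's implementation, and the read counts are made up for illustration (they are not the attached data); the function names are hypothetical.

```python
import numpy as np

# Hypothetical read counts (rows = samples, columns = species).
sample_ids = ["Sample1", "Sample2", "Sample3", "Sample4"]
counts = np.array([
    [120, 30, 0, 5],
    [110, 25, 2, 8],
    [10, 200, 50, 0],
    [0, 5, 300, 40],
], dtype=float)


def euclidean_dm(x):
    """Pairwise Euclidean distances between the rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))


def pcoa(dm):
    """Classical PCoA: double-center the squared distances, then eigendecompose."""
    n = dm.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gower = -0.5 * centering @ (dm ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gower)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    positive = np.clip(eigvals, 0, None)       # drop round-off negatives
    coords = eigvecs * np.sqrt(positive)       # one column per PC axis
    explained = positive / positive.sum()      # proportion explained per axis
    return coords, eigvals, explained


dm = euclidean_dm(counts)
coords, eigvals, explained = pcoa(dm)

# Samples that sit close together on PC1 are similar along the axis that
# captures the largest share of the variation in the distance matrix.
for sid, pc1 in sorted(zip(sample_ids, coords[:, 0]), key=lambda p: p[1]):
    print("%s PC1 = %.2f" % (sid, pc1))
print("proportion explained by PC1: %.3f" % explained[0])
```

Each entry of the distance matrix is the dissimilarity between one pair of samples (0 on the diagonal), and the PC matrix holds each sample's coordinates on the new axes. The sign of an axis is arbitrary, so only relative positions matter.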

Anyway, this conclusion is very basic and doesn't tell you anything
really interesting. The way to make it interesting is to relate it to
something in your metadata, for example a treatment or another factor
that you are testing. Additionally, try to figure out whether specific
taxa drive these differences, and possibly even test whether you could
build a classifier to capture them. However, to be honest, I'm not sure
how much you can do with only 4 samples; for a lot of tests this is a
really small number of samples.
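
One simple way to see which taxa drive the difference between two samples under Euclidean distance is to split the squared distance into per-species terms. This is only a sketch with made-up counts and hypothetical species names, and it is not a statistical test:

```python
import numpy as np

# Hypothetical species names and read counts for two samples.
species = ["SpeciesA", "SpeciesB", "SpeciesC", "SpeciesD"]
sample1 = np.array([120, 30, 0, 5], dtype=float)
sample3 = np.array([10, 200, 50, 0], dtype=float)


def taxon_contributions(a, b):
    """Share of the squared Euclidean distance contributed by each taxon."""
    squared_diff = (a - b) ** 2
    total = squared_diff.sum()
    if total == 0:
        # Identical samples: no taxon contributes anything.
        return np.zeros_like(squared_diff)
    return squared_diff / total


contrib = taxon_contributions(sample1, sample3)
ranked = sorted(zip(species, contrib), key=lambda kv: kv[1], reverse=True)
for name, share in ranked:
    print("%s: %.1f%% of the squared distance" % (name, 100 * share))
```

Because the differences are squared, the most abundant species tend to dominate a Euclidean distance on raw counts.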

Finally, if you want to learn more about this, I suggest taking at
least 10 high-profile papers that look into things similar to what you
are studying and going over their methods sections so you can better
understand the methodological decisions they made. Note that the
studies do not have to be the same as yours, just similar, for example:
the same questions in another environment, the same environment with
other questions, etc.

--
Antonio

julie smith

Sep 3, 2015, 2:20:38 PM
to qiime...@googlegroups.com
OTU_species.txt
PCOAPlot_1.png
DistancePlot.png
OTU_species.biom.txt
euclidean_pc_table.txt
euclidean_dm.txt
map.txt
Parameter.txt

Antonio González Peña

Sep 3, 2015, 4:22:41 PM
to Qiime Forum
These questions were already answered here:
https://groups.google.com/forum/#!topic/qiime-forum/oboEutau8YQ

julie smith

Sep 14, 2015, 4:48:15 PM
to qiime...@googlegroups.com
OTU_species.txt
PCOAPlot_1.png
DistancePlot.png
OTU_species.biom.txt
euclidean_pc_table.txt
euclidean_dm.txt
map.txt
Parameter.txt