unweighted_unifrac_dm.txt for ANOSIM

karol.pe...@tidal.com

Apr 24, 2017, 8:57:06 AM

to Qiime 1 Forum

Hello,

I have the following issue. To perform ANOSIM I need one or more unweighted_unifrac_dm.txt files, but I do not know exactly where to find them. When I run the beta_diversity analysis, I get a folder "rare_dm" that contains several files:

unweighted_unifrac_rarefaction_1000_0.txt
unweighted_unifrac_rarefaction_1000_1.txt
unweighted_unifrac_rarefaction_1000_2.txt
unweighted_unifrac_rarefaction_1000_3.txt
unweighted_unifrac_rarefaction_1000_4.txt
unweighted_unifrac_rarefaction_1000_5.txt
unweighted_unifrac_rarefaction_1000_6.txt
unweighted_unifrac_rarefaction_1000_7.txt
unweighted_unifrac_rarefaction_1000_8.txt
unweighted_unifrac_rarefaction_1000_9.txt

Can I (or must I) use these files for ANOSIM? What do the numbers 0-9 mean?

Sorry if this question is stupid, but I am new to QIIME.

:)

Greg Caporaso

Apr 24, 2017, 10:50:42 AM

to Qiime 1 Forum

Hello,

These files are from a jackknifed beta diversity analysis, and represent distance matrices from 10 different random subsamples (i.e., rarefaction runs) of the OTU table. While you theoretically should be able to provide any of these and get the same statistical result, the more standard way of doing this would be to provide the unweighted_unifrac_dm.txt file that is generated by running core_diversity_analyses.py (or, the distance matrix resulting from running single_rarefaction.py followed by beta_diversity.py).
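To make the numbering concrete, here is a minimal Python sketch that splits one of these file names into its parts. It assumes the names follow the pattern <metric>_rarefaction_<depth>_<iteration>.txt, so that 1000 is the rarefaction depth and 0-9 is the index of the random subsample. The function name and the pattern are illustrative assumptions, not part of QIIME.

```python
import re

# Assumed naming pattern (inferred from the file list in this thread):
#   <metric>_rarefaction_<depth>_<iteration>.txt
NAME_PATTERN = re.compile(
    r"^(?P<metric>.+)_rarefaction_(?P<depth>\d+)_(?P<iteration>\d+)\.txt$"
)


def parse_rarefied_dm_name(filename):
    """Return (metric, depth, iteration) for a rarefied distance matrix name.

    Hypothetical helper for illustration only; raises ValueError when the
    name does not match the assumed pattern.
    """
    match = NAME_PATTERN.match(filename)
    if match is None:
        raise ValueError("unexpected file name: %s" % filename)
    return (
        match.group("metric"),
        int(match.group("depth")),
        int(match.group("iteration")),
    )


# Example: list the depth and subsample index for each file in rare_dm.
for i in range(10):
    name = "unweighted_unifrac_rarefaction_1000_%d.txt" % i
    metric, depth, iteration = parse_rarefied_dm_name(name)
    print(metric, depth, iteration)
```

Under that assumption, all ten files share the same metric and depth and differ only in which random subsample of the OTU table they were computed from.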

Best,

Greg

karol.pe...@tidal.com

Apr 24, 2017, 1:33:37 PM

to Qiime 1 Forum

Ok. Thank you! I will try core_diversity_analyses.py, but I have one more question regarding the --sampling_depth parameter. In the example command, the -e value is 20 sequences/sample. Is there a universal value, or a way to determine this value?

Greg Caporaso

Apr 25, 2017, 5:46:18 PM

to qiime...@googlegroups.com

Unfortunately there isn't a universal value (or approach for picking a value) that we are aware of. This step is a bit subjective and is dataset-specific. You'll want to review the counts of sequences per sample in the output from biom summarize-table, and then choose a depth that is as high as possible while retaining as many of your samples as possible. If some samples have much lower sequence counts than the others, you will probably need to exclude them by choosing a sampling depth that is higher than the number of sequences they contain. If you're running into trouble picking a value, I recommend searching this forum for "sampling depth" and "rarefaction depth". This topic has come up frequently, so you should be able to learn how people choose this value from prior posts.
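The trade-off described above can be tabulated with a short Python sketch. For each candidate depth, it counts how many samples would be retained and how many sequences would remain in total. The per-sample counts below are made up for illustration (in practice they would be copied from the biom summarize-table output), and the function name is hypothetical. The sketch assumes a sample is kept when its count is at least the chosen depth.

```python
def summarize_depths(counts, candidate_depths):
    """Return (depth, samples_kept, sequences_kept) for each candidate depth.

    counts maps sample ID -> number of sequences in that sample.
    Assumption: a sample is retained when its count is >= depth, and every
    retained sample contributes exactly `depth` sequences after rarefaction.
    """
    rows = []
    for depth in sorted(candidate_depths):
        kept = [sample for sample, n in counts.items() if n >= depth]
        rows.append((depth, len(kept), depth * len(kept)))
    return rows


# Hypothetical per-sample sequence counts, for illustration only.
example_counts = {"S1": 5200, "S2": 4800, "S3": 5100, "S4": 310, "S5": 4950}

for depth, n_samples, n_seqs in summarize_depths(example_counts, [300, 4800, 5000]):
    print("depth=%d  samples kept=%d  sequences kept=%d" % (depth, n_samples, n_seqs))
```

In this made-up data, a depth of 300 keeps all five samples but discards most of the sequences, while a depth of 4800 drops only the one low-count sample. That is the kind of comparison that usually guides the choice.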
