Beta diversity through plots on biom files generated with summarize_taxa.py (default taxonomic levels)

Deanna

Dec 17, 2015, 4:38:49 PM

to Qiime 1 Forum

I’ve generated biom files with summarize_taxa.py (default: all taxonomic levels). I ran the script twice, once to produce biom files with relative abundances and once to produce biom files with absolute abundances by passing -a. When I run beta_diversity_through_plots.py on the relative abundance biom files, I get an error: empty table! When I run it on the absolute abundance biom files, the script starts but then freezes. In both cases I am using the default diversity metrics, unweighted and weighted UniFrac. Any ideas on what I’m doing wrong?

My goal is to generate distance matrices from biom files at different levels of taxonomic resolution.

Thanks for any guidance!

Deanna

Dec 18, 2015, 3:40:11 PM

to qiime...@googlegroups.com

I've also tried core_diversity_analyses.py with the absolute abundance biom files generated by summarize_taxa.py, and the script freezes at the parallel_beta_diversity.py step (for the UniFrac distance matrices).

TonyWalters

Dec 18, 2015, 6:35:32 PM

to Qiime 1 Forum

Hello Deanna,

You can't use UniFrac metrics with summarized data, because you do not have a matching tree for it (unlike the uncollapsed OTU-level data, where the OTU IDs match the tree tip IDs).

I'd recommend changing the metrics to non-phylogenetic metrics, such as Bray-Curtis. You'd also want to run Bray-Curtis, or whichever non-UniFrac metric(s) you decide on, at the OTU level as well.
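
To illustrate why a non-phylogenetic metric works on summarized tables, here is a minimal sketch of Bray-Curtis computed directly from taxon counts, with no tree involved. This is not QIIME's own code; the sample IDs, taxon strings, and counts are made up for illustration, and the helper names (bray_curtis, distance_matrix) are hypothetical.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two equal-length count vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    # Two all-zero samples are treated as identical in this sketch.
    return num / float(den) if den else 0.0


def distance_matrix(counts):
    """Return sorted sample IDs and the pairwise Bray-Curtis matrix."""
    ids = sorted(counts)
    matrix = [[bray_curtis(counts[i], counts[j]) for j in ids] for i in ids]
    return ids, matrix


# Hypothetical phylum-level (collapsed) counts. The row labels are taxonomy
# strings, not OTU IDs, which is why they cannot be matched to tree tips.
taxa = [
    "k__Bacteria;p__Firmicutes",
    "k__Bacteria;p__Bacteroidetes",
    "k__Bacteria;p__Proteobacteria",
]
counts = {
    "SampleA": [60, 30, 10],
    "SampleB": [20, 70, 10],
    "SampleC": [60, 30, 10],
}

ids, dm = distance_matrix(counts)
for sample_id, row in zip(ids, dm):
    print(sample_id, row)
```

The metric only needs the counts per taxon in each sample, so the same calculation applies at any taxonomic level and at the OTU level.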

Deanna

Dec 21, 2015, 12:09:42 PM

to Qiime 1 Forum

Thank you Tony! This was very helpful.

I do have a second question. I have been using rep_set.tre generated from the OTU picking step (pick_open_reference_otus.py) for scripts including core_diversity_analyses.py and beta_diversity_through_plots.py. I ran the OTU picking step (generating otu_table_mc_no_pynast_failures.biom & rep_set.tre) on a group of samples from multiple experiments. I am now using this rep_set.tre (generated from a larger set of samples) with a subset of these samples for core_diversity_analyses.py. Will this be problematic?

Thanks for your advice,
Deanna

TonyWalters

Dec 21, 2015, 1:59:20 PM

to Qiime 1 Forum

Hello Deanna,

Using the tree for the subset of the data should be fine.
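
A rough way to sanity-check this is to confirm that every OTU ID in the subset table still appears as a tip in the tree; extra tips from samples that were dropped should not matter. The sketch below assumes a simple Newick string without quoted labels, and the tree, OTU IDs, and helper names (tip_names, missing_from_tree) are hypothetical, not part of QIIME.

```python
import re


def tip_names(newick):
    """Extract tip labels from a simple Newick string (no quoted labels).

    Tips directly follow '(' or ','; internal node labels follow ')'
    and are therefore skipped.
    """
    return set(re.findall(r"[(,]\s*([^(),:;\s]+)", newick))


def missing_from_tree(observation_ids, tips):
    """Return the observation IDs that have no matching tree tip."""
    return set(observation_ids) - set(tips)


# Hypothetical tree built from the full set of samples.
newick = "((OTU1:0.1,OTU2:0.2):0.05,(OTU3:0.3,OTU4:0.1):0.02);"
tips = tip_names(newick)

# Hypothetical OTU IDs remaining after filtering the table to a subset of samples.
subset_ids = {"OTU1", "OTU3"}
print(missing_from_tree(subset_ids, tips))  # empty set: every OTU has a tip

# A summarized table is labeled by taxonomy strings, so nothing matches.
summarized_ids = {"k__Bacteria;p__Firmicutes", "k__Bacteria;p__Bacteroidetes"}
print(missing_from_tree(summarized_ids, tips))
```

The second check mirrors the earlier problem in this thread: the summarized table's row labels are taxa rather than OTU IDs, so none of them match a tip.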

-Tony

Deanna

Dec 21, 2015, 2:07:25 PM

to Qiime 1 Forum

THANK YOU!!
