How to normalize by copy number after OTU picking with SILVA

Dani Haze

Aug 31, 2017, 12:18:31 AM

to Qiime 1 Forum

Hello everyone,
I use QIIME 1 to pick OTUs against the SILVA database (which has been reported to be more accurate and more up to date than Greengenes). I use the following command:

pick_closed_reference_otus.py -i combined_seqs.fna -o QIIME_RESULTS -r SILVA_128_SSURef_Nr99_tax_DNA.fasta -t SILVA_128_SSURef_Nr99_tax_8levels.txt

where the SILVA fasta and taxonomy txt files are parsed versions of files downloaded from https://www.arb-silva.de/download/archive/

What I want to do now is to normalize abundances by copy number, but it seems the normalize_by_copy_number.py PICRUSt script can only deal with biom files obtained from OTU picking against Greengenes.

So my question is: how can I normalize by copy number after OTU picking with SILVA?

After OTU picking, I calculate alpha and beta diversity using the alpha_diversity.py and beta_diversity_through_plots.py scripts. Should I do the copy number normalization before calculating alpha and beta diversity?

Many thanks!

TonyWalters

Aug 31, 2017, 4:34:50 AM

to Qiime 1 Forum

Hello Dani,

Unfortunately, you cannot use SILVA data with PICRUSt. You would need to redo OTU picking against the Greengenes (13_8) database to make use of the PICRUSt metagenomic predictions.
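For anyone wondering what copy number normalization does arithmetically, here is a minimal Python sketch: each OTU's counts are divided by an estimate of how many 16S copies that organism carries. This is not the PICRUSt code. The function name, the OTU IDs, and the copy numbers are all made up for illustration, and dropping OTUs that have no estimate is just a choice made for this sketch.

```python
# Sketch only: divide each OTU's counts by an assumed 16S copy number.
# OTU IDs and copy numbers below are hypothetical, not taken from any database.

def normalize_by_copy_number(counts, copy_numbers):
    """Return per-sample counts divided by each OTU's 16S copy number.

    counts:       {otu_id: {sample_id: read_count}}
    copy_numbers: {otu_id: estimated 16S gene copies per genome}
    OTUs without a copy-number estimate are dropped, because there is
    nothing to divide by (a choice made for this sketch).
    """
    normalized = {}
    for otu_id, per_sample in counts.items():
        copies = copy_numbers.get(otu_id)
        if not copies:
            continue  # no estimate available for this OTU ID
        normalized[otu_id] = {
            sample: count / copies for sample, count in per_sample.items()
        }
    return normalized


counts = {
    "otu_A": {"sample1": 40, "sample2": 10},
    "otu_B": {"sample1": 30, "sample2": 60},
    "otu_C": {"sample1": 5, "sample2": 0},
}
copy_numbers = {"otu_A": 4, "otu_B": 1}  # hypothetical estimates

normalized = normalize_by_copy_number(counts, copy_numbers)
print(normalized)
```

The division only works when every OTU ID can be matched to a copy number estimate, which is why the reference database used for OTU picking matters here.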

For other analyses, such as alpha and beta diversity on your SILVA OTU table, you would normalize just as you would with Greengenes. The alpha diversity analysis already performs multiple rarefactions at different depths (10 repeats by default). For beta diversity there isn't a perfect answer so far, but you can use rarefaction or other normalization approaches with SILVA-picked data just as you would with Greengenes-picked data.
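To make the rarefaction idea concrete, here is a small Python sketch of subsampling one sample's counts to an even depth without replacement. It is not QIIME's implementation. The function name, sample data, and depth are hypothetical, and returning None for a sample that is too shallow is an assumption of this sketch.

```python
import random


def rarefy(sample_counts, depth, seed=None):
    """Subsample one sample's OTU counts to `depth` reads without replacement.

    sample_counts: {otu_id: read_count} for a single sample
    Returns a new {otu_id: count} dict summing to `depth`, or None when
    the sample has fewer than `depth` reads (a choice made for this sketch).
    """
    total = sum(sample_counts.values())
    if total < depth:
        return None
    rng = random.Random(seed)
    # Expand counts into one entry per read, then draw `depth` of them.
    pool = [otu for otu, n in sample_counts.items() for _ in range(n)]
    picked = rng.sample(pool, depth)
    rarefied = {otu: 0 for otu in sample_counts}
    for otu in picked:
        rarefied[otu] += 1
    return rarefied


sample = {"otu_A": 50, "otu_B": 30, "otu_C": 20}  # hypothetical counts
rarefied = rarefy(sample, depth=40, seed=1)
shallow = rarefy({"otu_A": 3}, depth=40)  # too few reads for this depth
print(rarefied, shallow)
```

Nothing in this step depends on which reference database the OTU IDs come from, so it applies equally to SILVA and Greengenes tables.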

-Tony

Daniel Laubitz

Sep 1, 2017, 2:53:03 PM

to Qiime 1 Forum

Recently I did OTU picking against SILVA 128 and then had to run PICRUSt. For this I had to do closed-reference OTU picking with GG 13_8. I compared both, and the taxonomic compositions were very similar for the two runs.

Daniel
