Looking for an alternative to DESeq2 for taxa higher than OTU level

S Macdonald

Dec 20, 2016, 11:04:31 AM

to Qiime 1 Forum

Hello,

I am using DESeq2 in MacQIIME to analyse differential abundance between my samples at the OTU level.

I would also like to examine changes at higher taxonomic levels, e.g. genus, family, etc.

The following thread suggests not using DESeq2 for anything other than OTU level, which seems reasonable, and recommends the Mann-Whitney U test instead.

https://groups.google.com/forum/#!searchin/qiime-forum/using$20DeSeq2$20for$20class%7Csort:relevance/qiime-forum/13YdbVfEKUY/eNbO2Oh2sroJ

My problem is that I have <10 samples per group and the Mann-Whitney U test requires at least 20.

Also, when looking at levels higher than OTU, is it still recommended to transform the data, e.g. using the negative binomial or a log ratio?

My data are 16S data collected from birds with different lesion scores (0, 1, 2, 3, 4) and from uninfected controls, with <10 biological replicates per group.

Any suggestions as to how to analyse at levels other than OTU would be greatly appreciated.

Thanks

Sarah

Tomasz

Dec 20, 2016, 1:17:21 PM

to Qiime 1 Forum

Hi Sarah,

I forwarded your question to my colleague, who should be able to provide you with an accurate answer.

He should get back to you shortly.

t.

Barvaz Sini

Dec 22, 2016, 6:12:54 AM

to Qiime 1 Forum

Hi Sarah,

I am not sure it is entirely valid to use DESeq2 for higher-level taxonomy (for example, it is not clear how well the negative binomial distribution it uses fits the pooled reads of several bacteria). However, if you have a small sample size and treat the results as preliminary indicators of taxonomic groups that may be differentially abundant and merit further study (as opposed to proof that these taxonomic groups definitely change), I think it's a legitimate possibility.

Another option is to use permutation-based tests on whatever statistic you want to calculate on these groups (e.g. log ratio, difference in means, presence/absence). Calculate the statistic for each taxonomic group (between the conditions) and compare it to the values obtained under random permutations of the sample labels. Note that you will still need FDR correction for the p-values.
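The permutation approach described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the family names and read counts are made up, the statistic is the absolute difference in group means, 10,000 random relabellings are used, and Benjamini-Hochberg is picked as the FDR method (the post only says "FDR correction"). The function names are hypothetical and are not part of QIIME.

```python
import random

def permutation_pvalue(group_a, group_b, n_perm=10000, seed=0):
    """Two-sided permutation p-value for the difference in group means."""
    rng = random.Random(seed)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the samples
        stat = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if stat >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never exactly 0.
    return (hits + 1) / (n_perm + 1)

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical per-sample read counts for two families in two conditions.
families = {
    "Lactobacillaceae": ([300, 280, 350, 310, 290], [100, 120, 80, 110, 90]),
    "Ruminococcaceae": ([200, 220, 190, 210, 200], [210, 190, 220, 200, 200]),
}
raw = {name: permutation_pvalue(a, b) for name, (a, b) in families.items()}
adjusted = dict(zip(raw, benjamini_hochberg(list(raw.values()))))
for name in families:
    print(name, raw[name], adjusted[name])
```

With only a handful of samples per group the number of distinct relabellings is small, so the smallest achievable p-value is limited by the group sizes rather than by `n_perm`.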

Another option, if you do not have enough samples, is to pool groups (e.g. compare samples from lesion groups 0 and 1 together against groups 3 and 4 together).

How many samples do you have in each group? Note that for the Mann-Whitney U test, the requirement for >20 samples only exists so that the distribution of the U scores can be approximated by a normal distribution; the test can also be calculated for <20 samples (e.g. using permutations).
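For small groups, the null distribution of U can be obtained without the normal approximation by enumerating every way of splitting the pooled samples into two groups of the original sizes. The sketch below assumes two groups of made-up counts and a two-sided test; the function names are hypothetical.

```python
from itertools import combinations

def u_statistic(a, b):
    """U for group a: each pair with a > b counts 1, each tie counts 0.5."""
    return sum((x > y) + 0.5 * (x == y) for x in a for y in b)

def exact_mann_whitney(a, b):
    """Two-sided exact p-value from all splits of the pooled data."""
    pooled = list(a) + list(b)
    n_a, n_b = len(a), len(b)
    centre = n_a * n_b / 2  # expected value of U under the null
    observed = abs(u_statistic(a, b) - centre)
    total = extreme = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        ga = [pooled[i] for i in idx]
        gb = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Count splits at least as far from the centre as the observed U.
        if abs(u_statistic(ga, gb) - centre) >= observed:
            extreme += 1
    return extreme / total

# Hypothetical counts for one taxon in two groups of five samples.
p = exact_mann_whitney([300, 280, 350, 310, 290], [100, 120, 80, 110, 90])
print(p)  # 2/252, about 0.0079: only the two fully separated splits are as extreme
```

Full enumeration is feasible at these sizes: two groups of 9 samples give 48,620 splits. For larger groups, random permutations would be the practical substitute.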

Does this make sense?

Cheers

Amnon

S Macdonald

Jan 10, 2017, 12:31:45 PM

to Qiime 1 Forum

Thanks Barvaz Sini for your reply. It was really useful, and I think I will try to use DESeq2 at higher taxonomic levels, taking the results as preliminary indicators. I am also interested in using permutation-based tests.

However, for both of these techniques I am really struggling (hence the long delay in my reply) to find out how to do this at any level other than OTU level, as tests like group_significance are used to compare OTU frequencies. I haven't come across a corresponding test for other levels. Am I missing something simple?

The only thing I have come across is a thread on the phyloseq forum (https://github.com/joey711/phyloseq/issues/683); however, I am having problems replicating the data input in the first place (I have posted about this: https://github.com/joey711/phyloseq/issues/700).

Are you able to point me in the right direction?

Cheers,

Sarah

S Macdonald

Jan 11, 2017, 12:50:25 PM

to Qiime 1 Forum

So I have worked out how to use DESeq2 at higher taxonomic levels in R. I am still not sure how to do it in QIIME, but in case this can help anyone, it's actually pretty simple (although it was not simple for me to find!).

You can agglomerate taxa of the same type, so for me I did this:

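library(phyloseq)
# Merge taxa that share the same Family assignment; the outer parentheses print the result
(x1 <- tax_glom(qiimedata, taxrank="Family"))
# Number of taxa before and after agglomeration
ntaxa(qiimedata); ntaxa(x1)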

qiimedata is my imported data (a phyloseq object), and I can simply choose which rank to agglomerate to; then I run my DESeq2 analysis as usual.
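To illustrate what agglomerating to a rank means, here is a rough Python sketch that sums the per-sample counts of all OTUs sharing the same lineage down to the chosen rank. It is not phyloseq's implementation: the function name, the data layout (plain dictionaries) and the example OTUs are all hypothetical, and OTUs not classified down to the requested rank are simply dropped here.

```python
RANKS = ["Kingdom", "Phylum", "Class", "Order", "Family", "Genus", "Species"]

def agglomerate(otu_counts, otu_taxonomy, rank):
    """Sum counts of OTUs that share the same lineage down to `rank`."""
    depth = RANKS.index(rank) + 1
    merged = {}
    for otu, counts in otu_counts.items():
        lineage = tuple(otu_taxonomy[otu][:depth])
        if len(lineage) < depth:
            continue  # not classified to this rank; dropped in this sketch
        if lineage in merged:
            merged[lineage] = [x + y for x, y in zip(merged[lineage], counts)]
        else:
            merged[lineage] = list(counts)
    return merged

# Hypothetical taxonomy assignments and counts for three samples.
otu_taxonomy = {
    "OTU1": ["Bacteria", "Firmicutes", "Bacilli", "Lactobacillales",
             "Lactobacillaceae", "Lactobacillus"],
    "OTU2": ["Bacteria", "Firmicutes", "Bacilli", "Lactobacillales",
             "Lactobacillaceae", "Pediococcus"],
    "OTU3": ["Bacteria", "Firmicutes", "Clostridia", "Clostridiales",
             "Ruminococcaceae"],
    "OTU4": ["Bacteria", "Firmicutes"],
}
otu_counts = {
    "OTU1": [10, 0, 5],
    "OTU2": [3, 2, 0],
    "OTU3": [7, 7, 1],
    "OTU4": [1, 1, 1],
}

family_table = agglomerate(otu_counts, otu_taxonomy, "Family")
for lineage, counts in family_table.items():
    print(lineage[-1], counts)
```

Keying on the full lineage rather than on the family name alone keeps apart any groups that share a label at one rank but differ higher up.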

If anyone has any tips about doing this in QIIME, I would be interested.

Cheers,

Sarah
