Specific Gene Abundance

adamber1187

Oct 16, 2019, 11:43:59 AM

to HUMAnN Users

Hello all,

This has sort of been hinted at in previous questions, but how do I go about looking at only specific genes in two separate groups? For instance, I'm using metagenomics to look at the relative abundance of a few particular enzymes. Would the BLAST sequences need to be used as a reference for DIAMOND rather than the UniRef files? Is there an easy way to make that work?

Thanks in advance,

Adam

Eric Franzosa

Oct 16, 2019, 11:49:04 AM

to humann...@googlegroups.com

Hi Adam,

You could either look up the corresponding UniRef IDs for those proteins, or use DIAMOND to search your query proteins against HUMAnN2's DIAMOND-formatted UniRef databases and pull out the IDs of the top hits directly.
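As a rough illustration of the second option: if DIAMOND is run with its tabular output (`--outfmt 6`, the 12-column BLAST-style layout), the top hit per query protein can be pulled out with a few lines of Python. This is only a sketch. The query names and UniRef IDs below are made up, and `top_hits` is a hypothetical helper, not part of HUMAnN2 or DIAMOND.

```python
from typing import Dict, Iterable, Tuple


def top_hits(diamond_lines: Iterable[str]) -> Dict[str, Tuple[str, float]]:
    """Map each query ID to (subject ID, bitscore) of its highest-bitscore hit.

    Assumes the default 12 tabular columns:
    qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore
    """
    best: Dict[str, Tuple[str, float]] = {}
    for line in diamond_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        query, subject, bitscore = fields[0], fields[1], float(fields[11])
        # Keep the hit only if it beats the best one seen so far for this query.
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


# Made-up rows standing in for a real DIAMOND output file.
example = [
    "enzymeA\tUniRef90_X00001\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-120\t410.0",
    "enzymeA\tUniRef90_X00002\t80.0\t195\t39\t0\t1\t195\t5\t199\t1e-90\t300.5",
    "enzymeB\tUniRef90_X00003\t91.2\t150\t13\t0\t1\t150\t1\t150\t1e-80\t280.0",
]
hits = top_hits(example)
for query, (subject, score) in hits.items():
    print(query, subject, score)
```

With a real run, the list would be replaced by an open file handle on the DIAMOND output. It is probably worth checking identity and coverage of each top hit before trusting it as the matching UniRef family.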

Thanks,

Eric

--
You received this message because you are subscribed to the Google Groups "HUMAnN Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to humann-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/humann-users/438ea3bf-d4f0-4093-824c-148d470ec19b%40googlegroups.com.

adamber1187

Oct 18, 2019, 12:03:40 PM

to HUMAnN Users

Hi Eric,

Thanks for the reply. That certainly makes sense. I'm a bit confused about the downstream analysis. I have two groups I'm looking at (disease vs. control) with ~25 metagenomes in each. For data analysis, how do you go about comparing the disease vs. control sequences with that many samples in each? Is there a way to merge the sequencing data from all 25 disease samples into one, do the same for the controls, and then compare just one disease data set against one control data set (after normalization)? I'm looking at the examples from the tutorial with the different body sites, and this is kind of the same idea, I suppose. The merge command just combines all 25 samples into one .tsv file, but I'm not sure how to manipulate it after that.

Thanks in advance,

Adam

Eric Franzosa

Oct 21, 2019, 3:49:01 PM

to humann...@googlegroups.com

Hi Adam,

That starts to get beyond the scope of the profiling HUMAnN2 does and into statistical analysis. For case/control testing our group uses a linear modeling framework called MaAsLin:

https://huttenhower.sph.harvard.edu/maaslin2

But there are LOTS of things you can do once you have a merged feature x sample abundance table.
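One simple example of working with such a table, sketched in plain Python: subset the merged, already-normalized table to the UniRef IDs of interest and summarize each gene per group. The sample names, IDs, and values here are invented, `group_means` is a hypothetical helper, and a per-group mean is only a first look, not a substitute for a proper statistical test.

```python
import csv
import io
from statistics import mean
from typing import Dict, List, Set


def group_means(table_text: str, genes: Set[str],
                groups: Dict[str, List[str]]) -> Dict[str, Dict[str, float]]:
    """Return {gene: {group: mean abundance}} for the requested genes.

    Assumes a tab-separated feature x sample table whose first column
    holds the feature ID and whose header row holds the sample names.
    """
    reader = csv.reader(io.StringIO(table_text), delimiter="\t")
    header = next(reader)
    column = {name: i for i, name in enumerate(header)}
    result: Dict[str, Dict[str, float]] = {}
    for row in reader:
        # Exact matching on the ID also skips per-species stratified
        # rows, which look like "ID|g__Genus.s__Species".
        if not row or row[0] not in genes:
            continue
        result[row[0]] = {
            group: mean(float(row[column[s]]) for s in members)
            for group, members in groups.items()
        }
    return result


# Invented table: two disease samples (D1, D2) and two controls (C1, C2).
table = (
    "# Gene Family\tD1\tD2\tC1\tC2\n"
    "UniRef90_X00001\t10.0\t14.0\t2.0\t4.0\n"
    "UniRef90_X00001|g__Bacteroides.s__Bacteroides_vulgatus\t6.0\t8.0\t1.0\t2.0\n"
    "UniRef90_X00003\t1.0\t3.0\t5.0\t7.0\n"
    "UniRef90_X00009\t9.0\t9.0\t9.0\t9.0\n"
)
means = group_means(
    table,
    genes={"UniRef90_X00001", "UniRef90_X00003"},
    groups={"disease": ["D1", "D2"], "control": ["C1", "C2"]},
)
for gene, per_group in means.items():
    print(gene, per_group)
```

Keeping one column per sample, rather than pooling each group into a single profile, preserves the between-sample variation that a case/control test needs.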

Thanks,

Eric

