Specific Gene Abundance


adamber1187

Oct 16, 2019, 11:43:59 AM
to HUMAnN Users
Hello all,

This has been hinted at in previous questions, but how do I go about looking at only specific genes in two separate groups? For instance, I'm using metagenomics to look at the relative abundance of a few particular enzymes. Would the BLAST sequences need to be used as the reference for DIAMOND in place of the UniRef files? Is there an easy way to make that work?

Thanks in advance,
Adam

Eric Franzosa

Oct 16, 2019, 11:49:04 AM
to humann...@googlegroups.com
Hi Adam,

You could either look up the corresponding UniRef IDs for those proteins, or use DIAMOND to search your query proteins against HUMAnN2's DIAMOND-formatted UniRef databases and pull out the IDs of the top hits directly.
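The second option can be sketched roughly as follows. This is only an illustration, assuming the DIAMOND search was written in the standard 12-column tabular format (`--outfmt 6`); the query names and UniRef IDs in the example are made-up placeholders, and subject IDs in the real database may carry extra annotation that needs trimming.

```python
import csv
import io

# Standard 12 columns of BLAST/DIAMOND tabular output (--outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def top_hits(tabular_text):
    """Return {query_id: (subject_id, bitscore)}, keeping the best-scoring hit per query."""
    best = {}
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        query = hit["qseqid"]
        score = float(hit["bitscore"])
        if query not in best or score > best[query][1]:
            best[query] = (hit["sseqid"], score)
    return best


# Hypothetical hits for two query enzymes (placeholder names and IDs).
example = (
    "enzymeA\tUniRef90_P00001\t98.5\t300\t4\t0\t1\t300\t1\t300\t1e-150\t580.0\n"
    "enzymeA\tUniRef90_P00002\t71.0\t290\t80\t2\t5\t295\t3\t292\t1e-90\t310.5\n"
    "enzymeB\tUniRef90_Q00003\t88.2\t410\t48\t1\t1\t410\t1\t409\t1e-200\t702.0\n"
)

hits = top_hits(example)
for query, (subject, score) in hits.items():
    print(query, subject, score)
```

The resulting subject IDs are the ones to look for in the gene families output. It is worth checking the percent identity of each top hit before trusting it as the "same" protein.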

Thanks,
Eric



--
You received this message because you are subscribed to the Google Groups "HUMAnN Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to humann-users...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/humann-users/438ea3bf-d4f0-4093-824c-148d470ec19b%40googlegroups.com.

adamber1187

Oct 18, 2019, 12:03:40 PM
to HUMAnN Users
Hi Eric,

Thanks for the reply. That certainly makes sense. I'm a bit confused about the downstream analysis. I have two groups I'm looking at (disease vs. control) with ~25 metagenomes in each. For data analysis, how do you go about comparing disease vs. control with that many samples in each group? Is there a way to merge the sequencing data from all 25 disease samples into one data set, do the same for the controls, and then compare just one disease data set against one control data set (after normalization)? I'm looking at the tutorial examples with the different body sites, and I suppose this is the same idea. The merge command just combines all 25 samples into one .tsv file, but I'm not sure how to manipulate it after that.

Thanks in advance,
Adam


Eric Franzosa

Oct 21, 2019, 3:49:01 PM
to humann...@googlegroups.com
Hi Adam,

That starts to get beyond the scope of the profiling HUMAnN2 does and into statistical analysis. For case/control testing, our group uses a linear modeling framework called MaAsLin.

But there are LOTS of things you can do once you have a merged feature x sample abundance table.
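As one small example of working with such a table, here is a minimal sketch that pulls out a few features of interest and summarizes them per group. It assumes the merged table is tab-separated with feature IDs in the first column and one column per sample, and that it has already been normalized. The sample names, group labels, and UniRef IDs are made-up placeholders. This is a descriptive summary only, not a substitute for a proper statistical test such as the linear modeling mentioned above.

```python
import csv
import io
from statistics import mean, median


def load_table(tsv_text):
    """Parse a merged feature-by-sample table into (sample_names, {feature: [values]})."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    samples = header[1:]
    table = {row[0]: [float(v) for v in row[1:]] for row in reader if row}
    return samples, table


def summarize(samples, table, groups, features):
    """Report (mean, median) abundance per group for each feature of interest."""
    result = {}
    for feature in features:
        values = table.get(feature)
        if values is None:
            continue  # feature absent from the table
        per_group = {}
        for sample, value in zip(samples, values):
            per_group.setdefault(groups[sample], []).append(value)
        result[feature] = {g: (mean(v), median(v)) for g, v in per_group.items()}
    return result


# Hypothetical merged table: two disease samples (D1, D2), two controls (C1, C2).
merged = (
    "# Gene Family\tD1\tD2\tC1\tC2\n"
    "UniRef90_P00001\t0.5\t0.25\t0.125\t0.125\n"
    "UniRef90_Q00003\t0.0\t0.5\t0.25\t0.25\n"
    "UniRef90_Z99999\t0.5\t0.25\t0.625\t0.625\n"
)
groups = {"D1": "disease", "D2": "disease", "C1": "control", "C2": "control"}
targets = ["UniRef90_P00001", "UniRef90_Q00003"]

samples, table = load_table(merged)
summary = summarize(samples, table, groups, targets)
for feature, stats in summary.items():
    print(feature, stats)
```

Because features are matched by exact ID, any taxon-stratified rows (IDs with a `|` suffix) are left out of the summary. Keeping per-sample values, instead of pooling each group into a single data set, preserves the within-group variation that a case/control test needs.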

Thanks,
Eric


