Adding modules to KEGG database


kathi_...@web.de

Feb 6, 2018, 2:32:05 AM
to HUMAnN Users
Hi,

I am using humann2 to interpret metagenomes predicted by PICRUSt. I have downloaded the KEGG database from and assigned KOs to modules. The script works fine; however, I noticed that many modules are missing from the "modulec" data file. Everything with a module ID greater than M00377 is missing, so many of the KOs end up as "unintegrated".
As some of the modules I am interested in are also missing, I was thinking of adding them manually to "modulec". This works well with "simple modules". However, some of my modules have alternative genes for certain module steps. An example is the denitrification module M00529, which has two alternatives each for the nitrate reduction step ((K00370+K00371+K00374) or (K02567+K02568)), the nitrite reduction step (K00368 or K15864), and the nitric oxide reduction step. How do I add a pathway like that to the file? Is it possible to provide multiple options, or should I add separate lines for the possible combinations? If the latter, should I then add up the abundances of the different combinations?
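To make the "separate lines for the possible combinations" option concrete, here is a minimal sketch that expands the two steps spelled out above into every flat KO list. The step layout is typed in by hand, the names `STEPS` and `expand` are made up for illustration, and nothing here reflects how humann2 itself reads "modulec"; whether abundances of such lines should be summed is still the open question.

```python
from itertools import product

# Each step is a list of alternatives; each alternative is a tuple of KOs
# that together form one complex. Only the two steps listed in the post
# are encoded here (hand-typed, hypothetical layout).
STEPS = [
    [("K00370", "K00371", "K00374"), ("K02567", "K02568")],  # nitrate reduction
    [("K00368",), ("K15864",)],                              # nitrite reduction
]

def expand(steps):
    """Return every flat KO list made by picking one alternative per step."""
    return [[ko for alt in choice for ko in alt] for choice in product(*steps)]

combos = expand(STEPS)
for combo in combos:
    print(" ".join(combo))
```

With two alternatives in each of two steps this gives four lines, and the count multiplies with every further step that has alternatives.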

Hopefully somebody can help me with this issue.

Regards,
Katharina

Eric Franzosa

Feb 16, 2018, 1:35:29 PM
to humann...@googlegroups.com
Hi Katharina,

The files embedded with HUMAnN1 only reflect the last public release of KEGG (v56), which is probably why you're seeing some (presumably newer?) modules missing. Note that modulec represents modules as unstructured lists of genes. The modulep file represents pathways using the more sophisticated logical structure, so you could add modules in that format to modulep. HUMAnN2 is able to recognize structured vs. unstructured pathway/module definition files.

Thanks,
Eric

kathi_...@web.de

Feb 20, 2018, 1:04:07 AM
to HUMAnN Users
Hi Eric,

Thank you for the answer. I took a look at the modulep file and tried to add my pathway there in the same form in which it is displayed in KEGG: (K00370+K00371+K00374,K02567+K02568) (K00368,K15864) (K04561+K02305,K15877) K00376

However, the pathway is not found, even though most of the KOs are present in my database (K15864 and K15877 are not, but they are alternatives). Did I misinterpret the logic for the entry? Is it somehow different in KEGG and in the modulep file? In the KEGG help a module definition is explained like this:
"The definition of the module as a list of K numbers. Comma separated K numbers indicate alternatives. Plus signs are used to represent a complex and a minus sign denotes a non-essential component in the complex." It does not say anything about the use of parentheses, so I was wondering whether those are needed to define the module.

BR,
Katharina

Eric Franzosa

Feb 21, 2018, 2:45:51 PM
to humann...@googlegroups.com
Hi Katharina,

You can think of commas as logical ORs and the parenthetical blocks as separated by ANDs (+ is also treated as AND within the context of a complex definition). So the module definitions behave like (A OR B) AND (C OR D), meaning that the definition is satisfied by AC, AD, BC, or BD. Based on that, it sounds like you could still satisfy the module above even if you had all of the KOs except K15864 and K15877. Are you sure all of the other KOs are non-zero?
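The logic described above can be sketched as a small evaluator. This is only an illustration, not HUMAnN2's parser: the function name `satisfied` is hypothetical, "+" and whitespace are both treated as AND with equal precedence, "," is treated as the loosest-binding OR, and the KEGG minus sign (non-essential component) is not handled.

```python
import re

def satisfied(definition, present):
    """True if the KEGG-style definition is met by the KOs in `present`."""
    # Whitespace is dropped; two adjacent terms are read as an implicit AND.
    tokens = re.findall(r"K\d{5}|[(),+]", definition)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def parse_or():
        nonlocal pos
        result = parse_and()
        while peek() == ",":
            pos += 1
            rhs = parse_and()  # always parse so the position advances
            result = result or rhs
        return result

    def parse_and():
        nonlocal pos
        result = parse_atom()
        while peek() is not None and peek() not in (",", ")"):
            if peek() == "+":
                pos += 1
            rhs = parse_atom()
            result = result and rhs
        return result

    def parse_atom():
        nonlocal pos
        tok = peek()
        if tok is None or tok in (")", ",", "+"):
            raise ValueError("malformed definition")
        pos += 1
        if tok == "(":
            result = parse_or()
            if peek() != ")":
                raise ValueError("unbalanced parentheses")
            pos += 1
            return result
        return tok in present

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("malformed definition")
    return result

M00529 = ("(K00370+K00371+K00374,K02567+K02568) (K00368,K15864) "
          "(K04561+K02305,K15877) K00376")
# Every KO in the definition except K15864 and K15877.
have = {"K00370", "K00371", "K00374", "K02567", "K02568",
        "K00368", "K04561", "K02305", "K00376"}
print(satisfied(M00529, have))
```

Under these assumptions the definition is met without K15864 and K15877, and it stops being met as soon as a KO with no alternative, such as K00376, is missing.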

Thanks,
Eric

katharin...@oulu.fi

Feb 22, 2018, 7:15:57 AM
to HUMAnN Users
Hi,

Yes, all the others are. If I remove the "+", "," and parentheses (and the 2 KOs that are not there), the pathway is detected. As soon as I put the logical signs back in, the pathway is no longer detected.
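One way to double-check that every step has at least one alternative whose KOs are all non-zero is sketched below. The abundance values are invented for illustration, the step layout is typed by hand from the definition quoted earlier in this thread, and `step_report` is a hypothetical helper; none of this reproduces how humann2 reads modulep.

```python
# Hypothetical abundances; replace with values from your own KO table.
abundance = {
    "K00370": 12.0, "K00371": 9.5, "K00374": 0.0,
    "K02567": 3.1, "K02568": 2.7,
    "K00368": 4.0,
    "K04561": 1.2, "K02305": 0.8,
    "K00376": 5.5,
}

# One entry per step: a list of alternatives, each a tuple of KOs in a complex.
STEPS = {
    "nitrate reduction": [("K00370", "K00371", "K00374"), ("K02567", "K02568")],
    "nitrite reduction": [("K00368",), ("K15864",)],
    "nitric oxide reduction": [("K04561", "K02305"), ("K15877",)],
    "nitrous oxide reduction": [("K00376",)],
}

def step_report(steps, abundance):
    """Map each step to the alternatives whose KOs all have abundance > 0."""
    return {
        name: [alt for alt in alts if all(abundance.get(ko, 0) > 0 for ko in alt)]
        for name, alts in steps.items()
    }

report = step_report(STEPS, abundance)
for name, complete in report.items():
    status = "ok" if complete else "NOT satisfied"
    print(f"{name}: {status} {complete}")
```

A step with an empty list is one where no alternative is fully present, which would explain a module not being found under the AND/OR reading.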

BR,
Katharina

Eric Franzosa

Feb 23, 2018, 1:26:31 PM
to humann...@googlegroups.com
Hi Katharina,

Just chatted about this with Lauren and we remain puzzled why this module isn't being detected given what you've said. Are you able to send me your KO abundance file and the modified module definition file to experiment with? Feel free to reply directly to me.

Thanks,
Eric
