humann2_regroup_table usage

francesca....@unina.it

Feb 28, 2017, 11:37:59 AM

to HUMAnN Users

Hi there
I'd like to regroup the HUMAnN2 output table into KEGG pathways.
I downloaded the utility_mapping files.
As far as I could understand from the script help, I should pass it the --groups option.

I tried

humann2_regroup_table --input 01BA/01BA_trim_filt.fastq_genefamilies.tsv --groups uniref90_kegg --output 01BA_genefamilies_grouped2.tsv

but I got this error

invalid choice: 'uniref90_kegg' (choose from 'uniref90_rxn', 'uniref50_rxn')

When I ran it with --groups uniref90_rxn, it gave me a table with IDs like these

1.5.1.20-RXN|g__Bacteroides.s__Bacteroides_ovatus 13.682
1.5.1.20-RXN|g__Bacteroides.s__Bacteroides_stercoris 5.848
1.5.1.20-RXN|g__Bacteroides.s__Bacteroides_uniformis 11.194
1.5.1.20-RXN|g__Bacteroides.s__Bacteroides_vulgatus 146.771
1TRANSKETO-RXN|g__Peptostreptococcaceae_noname.s__Clostridium_bartlettii 30.352
1TRANSKETO-RXN|g__Prevotella.s__Prevotella_buccae 513.076

So how can I link these IDs with kegg?

Eric Franzosa

Feb 28, 2017, 12:12:10 PM

to humann...@googlegroups.com

It sounds like you have not downloaded the utility mapping files (the RXN mappings are bundled with HUMAnN2; the others are downloaded separately). Please see the instructions under:

https://bitbucket.org/biobakery/humann2/wiki/Home#markdown-header-humann2_regroup_table

Notably, regroup_table will sum UniRefs to KO abundance, but doesn't reconstruct KEGG pathways. To reconstruct KEGG pathways from KO abundance, you can follow these instructions:

https://bitbucket.org/biobakery/humann2/wiki/Home#markdown-header-picrust-output

You can skip step 2 (splitting the table) if you have one KO file per sample.

Thanks,

Eric
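
For readers following along, here is a minimal sketch of the download step. The `humann2_databases --download utility_mapping full` form follows the wiki instructions linked above; the `humann2_database` target directory is an assumption (borrowed from the paths used elsewhere in this thread), and the guard only avoids an error when HUMAnN2 is not installed.

```shell
# Assumed target directory; use any location you can write to.
DB_DIR="humann2_database"

if command -v humann2_databases >/dev/null 2>&1; then
    # Download the full utility mapping set. The script also tries to
    # record the download location in the HUMAnN2 config file.
    humann2_databases --download utility_mapping full "$DB_DIR"
else
    echo "humann2_databases not found on PATH; nothing downloaded" >&2
fi
```

If the download succeeds but `--groups` still offers only the RXN choices, the location was probably not recorded in the config file.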

francesca....@unina.it

Feb 28, 2017, 12:15:10 PM

to HUMAnN Users, francesca....@unina.it

Hi Eric
thanks for the reply. Indeed, I already downloaded all the utility_mapping files...

Eric Franzosa

Feb 28, 2017, 12:23:05 PM

to humann...@googlegroups.com

Hmm... and just to confirm, you downloaded them with the humann2_databases script to a given location, and they are still in that location? One of the actions of humann2_databases is to "register" the locations of the databases with scripts that need them, such as regroup_table.

You can always manually point regroup_table at a specific mapping file using the "--custom" option if the automatic registration isn't working for some reason. The KO mapping files have names like "map_ko_uniref90.txt.gz".

Thanks,

Eric
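
A sketch of the manual route described above, pointing regroup_table directly at the KO mapping file with `--custom`. The input file name comes from the original post and the mapping file name from this message; the `humann2_database/utility_mapping` directory and the output file name are assumptions, so adjust them to your setup.

```shell
# Assumed location of the downloaded mapping files (hypothetical path).
MAPPING_DIR="humann2_database/utility_mapping"
KO_MAP="$MAPPING_DIR/map_ko_uniref90.txt.gz"

# Input table from the original post; the output name is hypothetical.
INPUT="01BA/01BA_trim_filt.fastq_genefamilies.tsv"
OUTPUT="01BA_genefamilies_ko.tsv"

if command -v humann2_regroup_table >/dev/null 2>&1 && [ -f "$KO_MAP" ]; then
    # --custom reads the mapping file directly instead of looking it up
    # through the registered database locations.
    humann2_regroup_table --input "$INPUT" --custom "$KO_MAP" --output "$OUTPUT"
else
    echo "humann2_regroup_table or $KO_MAP not available; nothing run" >&2
fi
```

Because `--custom` does not rely on the registered locations, it should work even when the automatic registration failed.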

Francesca De Filippis

Feb 28, 2017, 2:48:20 PM

to humann...@googlegroups.com

Hi Eric

I tried to specify the “custom” database, and it worked! It seems I didn’t have the permissions to write in the config file, so it skipped that.

I have another question:

From the link you sent me (the PICRUSt output section), I didn’t understand how I can link KOs to KEGG pathways or modules.

I already ran humann2 script as follows

humann2 --input PRIN/filtered/01BA_trim_filt.fastq.fastq --nucleotide-database humann2_database/chocophlan --protein-database humann2_database/uniref -o 01BA --gap-fill on

Do I need to run it again for each sample starting from quality filtered reads? And where do I download the kegg database?

Thanks

Francesca

Eric Franzosa

Feb 28, 2017, 3:14:40 PM

to humann...@googlegroups.com

Hi Francesca,

You will not need to start from raw reads to compute KEGG pathways. Rather, once you have computed KO abundance, you can provide the KO abundance as input to HUMAnN2 and request to reconstruct KEGG pathways. The instructions that I gave you assume you are starting with predicted KOs from PICRUSt, but the workflow is otherwise the same.

You can get the KEGG pathway/module definitions from HUMAnN1's data/ directory:

https://bitbucket.org/biobakery/humann

Note that these pathways are only up-to-date as of the last public release of KEGG (v56 I believe).

Thanks,

Eric
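
As a rough sketch of that workflow: pass the KO abundance table to humann2 as the input and point `--pathways-database` at the legacy KEGG pathway definitions. The file names below are assumptions; in particular, `humann/data/keggc` is assumed to be where the pathway definitions sit in a local copy of the HUMAnN1 repository, so check the contents of its data/ directory before relying on it.

```shell
# Hypothetical KO table produced earlier by humann2_regroup_table.
KO_TABLE="01BA_genefamilies_ko.tsv"

# Assumed path to the KEGG pathway definitions in a HUMAnN1 checkout.
KEGG_PATHWAYS="humann/data/keggc"

# Hypothetical output directory for the pathway tables.
OUT_DIR="01BA_kegg"

if command -v humann2 >/dev/null 2>&1 && [ -f "$KO_TABLE" ] && [ -f "$KEGG_PATHWAYS" ]; then
    # Starting from a gene (KO) table, humann2 skips read mapping and
    # goes straight to pathway reconstruction.
    humann2 --input "$KO_TABLE" --output "$OUT_DIR" --pathways-database "$KEGG_PATHWAYS"
else
    echo "humann2, $KO_TABLE, or $KEGG_PATHWAYS not available; nothing run" >&2
fi
```

With one KO table per sample, the same command can be repeated for each sample without going back to the reads.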

Francesca De Filippis

Feb 28, 2017, 3:25:26 PM

to humann...@googlegroups.com

Thanks, but I can’t find the download link for the kegg…

