Assembled or unassembled reads as input into HUMAnN2


orla...@gmail.com

Jun 25, 2015, 6:46:04 AM
to humann...@googlegroups.com
Hi,
I am just about to embark on using HUMAnN2 for a large shotgun metagenomic project.
Previously I would always have used assembled contigs as input to packages, but it is my understanding that for HUMAnN2 the input is quality-filtered, unassembled reads.
Is this correct?
Orla

Eric Franzosa

Jun 25, 2015, 1:16:01 PM
to humann...@googlegroups.com
That is correct. HUMAnN and HUMAnN2 profile microbial community function starting from short reads (no assembly required/expected).

Thanks,
Eric


orla...@gmail.com

Jun 29, 2015, 4:54:53 AM
to humann...@googlegroups.com
Thanks very much for the reply.

I do have another query: the outputs are all codes, PWY-5047 for example.
Is there a fast way to link these to their full pathway names?

Lauren McIver

Jun 29, 2015, 4:40:04 PM
to humann...@googlegroups.com
Hello - 

Yes, there is a fast way to link pathway IDs to names. The following command will add the full pathway names.

$ humann2_rename_table --input SAMPLE_pathabundance.tsv --output SAMPLE_pathabundance_plus_names.tsv --names humann2/data/misc/map_metacyc-pwy_name.txt.gz

In the output file each pathway ID (PWY ID) will be replaced with "PWY ID : pathway name". 
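
If it helps to see what the renaming amounts to, here is a minimal sketch using standard tools. It is not the humann2_rename_table implementation: `add_pathway_names` is a hypothetical helper, and it assumes the mapping file is a gzipped, two-column, tab-separated list (ID, then name), that header lines in the table start with `#`, and that stratified rows carry a `|`-separated suffix after the ID.

```shell
# Hypothetical helper (not part of humann2): append pathway names to the
# first column of a tab-separated table.
#   $1 = table (e.g. SAMPLE_pathabundance.tsv)
#   $2 = gzipped mapping file, assumed to be "ID<TAB>name" per line
add_pathway_names() {
    gzip -dc "$2" | awk -F'\t' -v OFS='\t' '
        # First input (the mapping on stdin): remember ID -> name.
        NR == FNR { name[$1] = $2; next }
        # Header lines in the table are passed through unchanged.
        /^#/ { print; next }
        {
            id = $1; rest = ""
            # Assumed layout of stratified rows: "ID|taxon".
            bar = index(id, "|")
            if (bar > 0) { rest = substr(id, bar); id = substr(id, 1, bar - 1) }
            # Rows without a known name are left as they are.
            if (id in name) $1 = id ": " name[id] rest
            print
        }
    ' - "$1"
}
```

Under those assumptions it would be called as `add_pathway_names SAMPLE_pathabundance.tsv humann2/data/misc/map_metacyc-pwy_name.txt.gz > SAMPLE_pathabundance_plus_names.tsv`.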

Thank you,
Lauren

 


Orla O'Sullivan

Jun 29, 2015, 4:40:52 PM
to humann...@googlegroups.com

Thanks so much for the fast reply.

Orla O'Sullivan

Jun 30, 2015, 6:45:30 AM
to humann...@googlegroups.com, lauren....@gmail.com
Can I ask one more (stupid) question?
The file humann2/data/misc/map_metacyc-pwy_name.txt.gz doesn't exist, but there is map_uniref50_name.txt.gz; is this the correct one to use?
 
Orla

Lauren McIver

Jun 30, 2015, 6:54:06 PM
to Orla O'Sullivan, humann...@googlegroups.com
Hello Orla,

The map_metacyc-pwy_name.txt.gz file maps pathways to names. The map_uniref50_name.txt.gz file maps UniRef50 gene families to names, so it is not the right one for pathways. map_metacyc-pwy_name.txt.gz might not exist in your repository/download because it was only recently added.

If you download the new version (humann2 v0.1.10), this file should be included. The new version also writes the pathway and gene family names to the output files by default. If you want to rewrite your *_genefamilies.tsv, *_pathcoverage.tsv, and *_pathabundance.tsv files so that they include the names, without redoing all of the alignments, run humann2 with the same options you used initially (providing the same output folder, which contains all of the original output files) and add the "--resume" option. This starts the humann2 workflow after the alignment steps, so only the gene family and pathway computations are run. Your three original output files will be replaced with new files that include both the IDs and the names of the pathways and gene families.
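
As a rough sketch of the rerun, assuming the initial run used only an input file and an output folder: `rerun_with_names` is a hypothetical wrapper and the file names in the usage line are placeholders. Any other options from your original command need to be repeated as well.

```shell
# Hypothetical wrapper: repeat the original humann2 call with --resume added.
#   $1 = the same input file as the initial run
#   $2 = the same output folder, still containing the original output files
rerun_with_names() {
    humann2 --input "$1" --output "$2" --resume
}
```

For example, `rerun_with_names SAMPLE.fastq OUTPUT_DIR`, where both arguments match what you passed the first time.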

Thank you,
Lauren

