About using PICRUST predicted metagenome in HUMAnN analysis


Ning

Feb 19, 2014, 12:29:22 PM
to humann...@googlegroups.com
Hi,

My question is about using a PICRUSt-predicted metagenome in HUMAnN analysis. I am not sure whether this is the right place to ask.

My current work involves 16S rRNA MiSeq data from a microbiome. We thought it would be great to use PICRUSt to predict the metagenome, which has been done. I have run HUMAnN before on my real metagenome data, which gave us meaningful results. So it would be interesting to run HUMAnN on our predicted metagenome, and it would be more meaningful to consider pathways instead of single orthologs.

The question, then, is how to turn the PICRUSt-predicted metagenome into input for HUMAnN. The output of PICRUSt is a count for each orthologous group (KEGG ortholog ID) in each sample. HUMAnN input is blastp (or blastp-like) results against the KEGG DB. Someone told me to make a fake blastp results table as HUMAnN input. For example, say sample1 has K00002 with 138 counts. Treat that as 138 reads mapped to K00002. The blastp file then includes 138 fake read IDs, each mapped to a KEGG protein whose ortholog ID is K00002. Since the PICRUSt results contain no information about species, I just randomly assign the 138 reads to proteins from bacteria and archaea. As I understand it, HUMAnN doesn't particularly consider which species a read came from (apart from filtering out reads from eukaryotes); it basically pools the species together and considers pathways.

Is there any problem with building a fake blastp input file for HUMAnN in this way?

I want to make sure I am using them properly.

Thank you very much for any help or suggestions. I would really appreciate it.

Best,
Ning

Shafquat, Afrah

Feb 24, 2014, 12:33:20 PM
to <humann-users@googlegroups.com>, Li, Dr Ning
Hello Ning,

HUMAnN takes PICRUSt output as input, so you won't need to modify your output in any way once it comes from PICRUSt. For details regarding the procedure, please refer to the documentation:

Hope this helps.

Sincerely,

Afrah Shafquat

Program Manager
The Huttenhower Lab
Department of Biostatistics
Harvard School of Public Health

Ning

Feb 24, 2014, 3:51:32 PM
to humann...@googlegroups.com, Li, Dr Ning
Hi Afrah Shafquat,

Thanks for the information. I checked the details in the PICRUSt tutorials and followed them, but it didn't work.

Here is what I did; can you help me find out which part went wrong?
My PICRUSt output has been converted to a text file, with one column per sample and one row per KO ID, filled with copy numbers.
The file is in the ~/humann_0.99/input/ folder.
I then ran $ scons and got:

scons: Reading SConscript files ...
scons: done reading SConscript files.
scons: Building targets ...
scons: `.' is up to date.
scons: done building targets.

But there are no output files, which means it didn't go through.
I went back to check the SConstruct file. It seems HUMAnN only reads blast or mapped BAM data. I couldn't find where it reads tabular PICRUSt output.

Thank you very much for your help. I really hope it will work.

Ning

Shafquat, Afrah

Feb 24, 2014, 3:59:28 PM
to <humann-users@googlegroups.com>, Li, Dr Ning
Hello Ning,

Can you please send me the file that you are using as input for HUMAnN?

Sincerely,

Afrah Shafquat



afrah.s...@gmail.com

Feb 25, 2014, 10:02:07 AM
to humann...@googlegroups.com, Li, Dr Ning
Hello all,

For future reference, the error was due to the following:

(i) The first line was "#Converted from biom etc."
(ii) The last column was 'KEGG Pathways'.

After removing that line and that column, HUMAnN ran smoothly.

The PICRUSt output file used as input for HUMAnN should have the following format:

#OTU ID<tab>Column1<tab>Column2<tab>.....
KOID1<tab>data<tab>data<tab>...
KOID2<tab>data<tab>data<tab>...


The first line should start with #OTU ID .....
If the first line says #Converted from biom etc., please go ahead and remove that line.
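
For anyone who would rather script the two fixes than edit the file by hand, here is a minimal Python sketch. It is not part of HUMAnN or PICRUSt. The function names and the exact column labels it looks for ("KEGG Pathways" and "KEGG_Pathways") are assumptions for illustration, so adjust them to whatever your file actually contains. It skips everything before the "#OTU ID" header line and drops the annotation column.

```python
def clean_picrust_table(lines, drop_columns=("KEGG Pathways", "KEGG_Pathways")):
    """Return rows (lists of fields) starting at the '#OTU ID' header,
    with any column whose header is in drop_columns removed.

    The column labels in drop_columns are assumed; check your own file.
    """
    rows = []
    header_seen = False
    drop_idx = set()
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue  # ignore blank lines
        fields = line.split("\t")
        if not header_seen:
            if not line.startswith("#OTU ID"):
                continue  # e.g. the "#Converted from biom ..." line
            header_seen = True
            # Remember which column positions to drop from every row.
            drop_idx = {i for i, name in enumerate(fields)
                        if name.strip() in drop_columns}
        rows.append([f for i, f in enumerate(fields) if i not in drop_idx])
    if not header_seen:
        raise ValueError("no line starting with '#OTU ID' was found")
    return rows


def clean_picrust_file(src_path, dst_path):
    """Read a tab-separated PICRUSt table and write the cleaned copy."""
    with open(src_path) as src:
        rows = clean_picrust_table(src)
    with open(dst_path, "w") as dst:
        for row in rows:
            dst.write("\t".join(row) + "\n")
```

You would call clean_picrust_file with your own input and output paths (for example, a copy placed in the HUMAnN input folder), then check that the first line of the result starts with #OTU ID before running scons.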

Sincerely,
Afrah Shafquat

Ning

Feb 25, 2014, 10:17:48 AM
to humann...@googlegroups.com, Li, Dr Ning, afrah.s...@gmail.com
Thank you very much for your help, Afrah.

It worked after I modified the file.

Best,
Ning