Paul Naphtali

Oct 9, 2019, 12:04:25 AM
to HUMAnN Users
I am currently working with a CF lung metatranscriptome data set that contains 10,000-50,000 reads per FASTQ file after removing human and rRNA reads. I ran these FASTQ files through HUMAnN2 with the following command:

humann2 --metaphlan ~/.local/bin/metaphlan2 --input ${mysample} --output ./output_directory

When I ran my FASTQ files through this command, I noticed that MetaPhlAn2 did not pick up any of the taxa I know to be present from running these samples through other taxonomic identification programs such as KRAKEN and KAIJU. Instead, I received the following warning:

Total species selected from prescreen: 0

Selected species explain 0.00% of predicted community composition

No species were selected from the prescreen.
Because of this the custom ChocoPhlAn database is empty.
This will result in zero species-specific gene families and pathways

Since I already had an idea of which species were present in CF sputa from KRAKEN and KAIJU (e.g., Pseudomonas aeruginosa, Staphylococcus aureus, Stenotrophomonas maltophilia, Streptococcus spp.), I wanted HUMAnN2 to create a custom ChocoPhlAn database with those taxa before performing the nucleotide-level search. I added the --taxonomic-profile bugs_list.tsv flag so that a custom ChocoPhlAn database could be prepared from the bugs list I made.

My issue is that I'm not sure how to format the bugs_list.tsv file, since running MetaPhlAn2 on my samples yielded an output file that was 100% unclassified for half of my samples.

Eric Franzosa

Oct 9, 2019, 1:16:51 PM
to humann...@googlegroups.com
Hi Paul,

Those sequencing depths are pretty low relative to where our methods usually operate. Assuming 100 nt reads, 50K reads = 5 Mnt, which is ~1x coverage of a single bacterial genome. HUMAnN2's sensitivity falls off below 1x coverage, so I'm not sure that you'd get useful results even if you reformatted your existing taxonomic profile for HUMAnN2.
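To make the arithmetic above concrete, here is a rough sketch of the coverage estimate in Python. The function name and the 5 Mnt genome size are assumptions for illustration (a typical bacterial genome size implied by the "~1x" figure, not a measured value for any species in these samples), and the whole library is assumed to come from a single genome, which is the best case.

```python
def estimated_coverage(num_reads, read_length_nt, genome_size_nt):
    """Average fold coverage if every read came from one genome.

    Hypothetical helper for illustration; not part of HUMAnN2.
    """
    return (num_reads * read_length_nt) / genome_size_nt


# Assumed values: 100 nt reads and a 5 Mnt (5,000,000 nt) genome.
READ_LENGTH_NT = 100
GENOME_SIZE_NT = 5_000_000

# The two ends of the reported depth range (10,000-50,000 reads).
for num_reads in (10_000, 50_000):
    total_nt = num_reads * READ_LENGTH_NT
    coverage = estimated_coverage(num_reads, READ_LENGTH_NT, GENOME_SIZE_NT)
    print(f"{num_reads} reads = {total_nt} nt, about {coverage:.1f}x coverage")
```

In a real sample the reads are split across several species, so the per-species coverage would be lower than this single-genome estimate.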


