bowtie2out file created but the profile result is 100% unclassified for fasta and raw fastq file

jigyasa arora

May 30, 2018, 10:03:03 PM
to MetaPhlAn-users
Hey, I am using MetaPhlAn2 for taxonomic analysis of my insect gut metagenome. It was sequenced on a HiSeq 4000, generating 9 Gb of paired-end data. I ran metaphlan2 on a) the raw fastq file and b) the SPAdes-assembled fasta file (contigs filtered to above 500 bp).

module load python/3.5.2
IN_DIR="my_input_dir"

# For a)
metaphlan2.py ${IN_DIR}/230-03-PHI8_S3_R1_001.join.fq --input_type fastq > 03_raw_fastq_profile.txt

# For b)
metaphlan2.py ${IN_DIR}/03_K21-71_contigs.fasta --input_type fasta > 03_K21_contig_profile.txt

I checked the forum, but couldn't find a solution that I understand. There was a thread about changing the sensitivity, but that's for sam files, right? Could you guide me?

03_K21_contig_profile.txt
03_K21-71_contigs.fasta.bowtie2out.txt

Francesco Asnicar

May 31, 2018, 12:01:16 PM
to jigyasa arora, MetaPhlAn-users, Moreno Zolfo
Hi,

For case "b" the empty result makes sense, since MetaPhlAn is not designed to work with assemblies. It is strange, though, that you got the same result with the raw reads (fastq).
One thing worth checking is the read length. MetaPhlAn has a parameter, --read_min_len, that sets the minimum read length considered in the downstream analysis; by default it is 70. If you know the distribution of your read lengths, we can check whether this is causing the empty results.
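
The read-length check described above can be sketched in a few lines of Python. This is only a rough illustration, not part of MetaPhlAn: it assumes a plain FASTQ file with four lines per record, and the function names `read_lengths` and `summarize` are made up for this example. The threshold of 70 is the --read_min_len default quoted above, and the sketch assumes that reads shorter than the threshold are the ones left out.

```python
import io
from collections import Counter

MIN_LEN = 70  # the --read_min_len default quoted in this thread


def read_lengths(fastq_lines):
    """Yield the sequence length of each record in 4-line FASTQ text."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # the second line of each record holds the bases
            yield len(line.strip())


def summarize(lengths, min_len=MIN_LEN):
    """Return (length histogram, number of short reads, fraction of short reads)."""
    lengths = list(lengths)
    counts = Counter(lengths)
    short = sum(1 for n in lengths if n < min_len)
    frac = short / len(lengths) if lengths else 0.0
    return counts, short, frac


# Tiny in-memory demo with two made-up reads (100 bp and 50 bp).
records = [("r1", "A" * 100), ("r2", "C" * 50)]
demo = io.StringIO(
    "".join(f"@{name}\n{seq}\n+\n{'I' * len(seq)}\n" for name, seq in records)
)
counts, short, frac = summarize(read_lengths(demo))
print(f"{short} of {sum(counts.values())} reads ({frac:.1%}) are shorter than {MIN_LEN} bp")
```

For a real file, the in-memory demo would be replaced by an open file handle passed to `read_lengths`. If most reads fall below the threshold, that alone could explain an empty profile.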

Another thing that would help us understand the problem is the sam and bowtie2 outputs. Just run MetaPhlAn with the --samout and --bowtie2out params to get those files, and share them with us.
If you still get an empty result after that, it would be great if you could share a portion of your input files (~100000 reads should be small enough to share and large enough for some debugging).


Many thanks,
Francesco


jigyasa arora

Jun 2, 2018, 1:31:32 AM
to MetaPhlAn-users
Hey Francesco

Thank you so much for replying! I tried the following and I am still getting 100% unclassified output.

1. --samout: the profile file is empty. bowtie2 reports:

17406276 reads; of these:
  17406276 (100.00%) were unpaired; of these:
    17406199 (100.00%) aligned 0 times
    24 (0.00%) aligned exactly 1 time
    53 (0.00%) aligned >1 times
0.00% overall alignment rate

The command used to generate the sam file:
bowtie2 --sam-no-hd --sam-no-sq --no-unal --very-sensitive -S ${SAM_DIR}/230-03_metaphlyn.sam -x ${software}/db_v20/mpa_v20_m200 -U ${OUT_DIR}/230-03-PHI8_S3_R1_001.join.fq

2. --bowtie2out: the bowtie2out profile is also empty.

3. Please find attached:
a) a subset of the original fastq file (I made a 10000-read subsample, as the 1000000-read subsample was too big to upload)
b) the bowtie2out file
c) the sequence length distribution R plot
d) the sam file
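
To put the bowtie2 summary from item 1 in numbers, here is a minimal sketch that parses the unpaired-read summary and computes how many reads aligned at all. The function name `parse_bowtie2_summary` is made up for this illustration, and the regular expressions assume the unpaired summary layout shown in this post.

```python
import re


def parse_bowtie2_summary(text):
    """Return (total_reads, aligned_reads, percent_aligned) for an unpaired run."""
    total = int(re.search(r"(\d+) reads; of these:", text).group(1))
    once = int(re.search(r"(\d+) \([\d.]+%\) aligned exactly 1 time", text).group(1))
    multi = int(re.search(r"(\d+) \([\d.]+%\) aligned >1 times", text).group(1))
    aligned = once + multi
    return total, aligned, 100.0 * aligned / total


# Summary text copied from the post above.
summary = """17406276 reads; of these:
17406276 (100.00%) were unpaired; of these:
17406199 (100.00%) aligned 0 times
24 (0.00%) aligned exactly 1 time
53 (0.00%) aligned >1 times
0.00% overall alignment rate
"""

total, aligned, pct = parse_bowtie2_summary(summary)
print(f"{aligned} of {total} reads aligned ({pct:.5f}%)")
# prints: 77 of 17406276 reads aligned (0.00044%)
```

So only 77 of roughly 17.4 million reads hit the marker database, which is consistent with the empty profile.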

Thank you for helping out!
Jigyasa

230-03_metaphlyn.sam
Rplot_sequencelength_distribution_fastq.pdf
230-03-PHI8_S3_R1_001.join.fq.bowtie2out.txt
subsample-230-03-PHI8_S3_R1_001.join.fq

Francesco Asnicar

Jun 6, 2018, 1:03:24 PM
to jigyasa arora, MetaPhlAn-users
Hi Jigyasa,

I ran some tests on the 10k reads you shared and couldn't get much out of them. I honestly don't have much experience with insect gut microbiomes, but one idea: have you screened your reads for possible contamination from the host? In other words, how many of those reads come from the insect you are studying?

To do this, you can simply run bowtie2 against a reference genome for the species you are studying.
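
A minimal sketch of that host screen, written as a small Python helper that only assembles the two command lines. Everything named here is an assumption for illustration: `host_genome.fa` stands in for whatever reference genome is available for the insect, and the index prefix and output file name are placeholders. The bowtie2 options used (`-x`, `-U`, `--un`, `-S`, `-p`) are standard ones; the alignment summary that bowtie2 prints gives the fraction of reads matching the host.

```python
import shlex


def host_screen_commands(host_fasta, reads_fq, index_prefix="host_index",
                         unaligned_fq="non_host_reads.fq", threads=4):
    """Build the index and alignment commands for a host-contamination screen."""
    # Step 1: index the (hypothetical) host reference genome.
    build = ["bowtie2-build", host_fasta, index_prefix]
    # Step 2: align the unpaired reads; --un keeps the reads that did NOT
    # align to the host, and the SAM output itself is discarded.
    align = ["bowtie2", "-p", str(threads), "-x", index_prefix,
             "-U", reads_fq, "--un", unaligned_fq, "-S", "/dev/null"]
    return build, align


# Print the commands instead of running them, so they can be inspected first.
for cmd in host_screen_commands("host_genome.fa", "230-03-PHI8_S3_R1_001.join.fq"):
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once bowtie2 is on the PATH. The reads written by `--un` are the ones that did not match the host and could then be given to MetaPhlAn.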

My feeling is that, very likely, a large share of the reads come from the host genome rather than from its gut microbial community.

Many thanks,
Francesco
