Error running abundance estimation with RSEM and Bowtie1


Santiago

Mar 7, 2017, 2:53:15 PM
to trinityrnaseq-users
Hi there,

It's been a while since the last time I properly worked with Trinity. I have a few questions that I will be splitting into separate threads because they are unrelated. I hope you don't mind.

I've set up an installation using the latest versions of Trinity (v2.4.0), Trimmomatic (v0.36), RSEM (v1.3.0) and Bowtie1 (v1.2.0).

I first ran the de novo assembly of the transcripts, including trimming (Trimmomatic) and normalization, without problems. However, when running the abundance estimation step using RSEM and Bowtie1, I got the following errors, each one stopping a different sample (every sample failed with one of these error messages):

Warning: Detected a read pair whose two mates have different names--MG00HS14:643:C8WVHACXX:4:1101:2654:2169 and MG00HS14:643:C8WVHACXX:4:1101:6787:2125!
Paired-end read MG00HS14:643:C8WVHACXX:4:1101:2654:2169 has alignments with inconsistent mate lengths!
"rsem-parse-alignments Trinity.fasta.RSEM RSEM.temp/RSEM RSEM.stat/RSEM bowtie.bam 3 -tag XM" failed! Plase check if you provide correct parameters/options for the pipeline!

rsem-parse-alignments Trinity.fasta.RSEM RSEM.temp/RSEM RSEM.stat/RSEM bowtie.bam 3 -tag XM
Read MG00HS14:643:C8WVHACXX:4:1101:9250:2187: The adjacent two lines do not represent the two mates of a paired-end read! (RSEM assumes the two mates of a paired-end read should be adjacent)
"rsem-parse-alignments Trinity.fasta.RSEM RSEM.temp/RSEM RSEM.stat/RSEM bowtie.bam 3 -tag XM" failed! Plase check if you provide correct parameters/options for the pipeline!

Warning: Detected a read pair whose two mates have different names--MG00HS14:643:C8WVHACXX:4:1101:4043:1985 and MG00HS14:643:C8WVHACXX:4:1101:6670:2218!
Read MG00HS14:643:C8WVHACXX:4:1101:4043:1985: The two mates do not align to a same transcript! RSEM does not support discordant alignments.
"rsem-parse-alignments Trinity.fasta.RSEM RSEM.temp/RSEM RSEM.stat/RSEM bowtie.bam 3 -tag XM" failed! Plase check if you provide correct parameters/options for the pipeline!

All steps were run with the default parameters, both the assembly and the abundance estimation (align_and_estimate_abundance.pl). I remember running this same configuration before with different software versions, so I was wondering if you could tell me where I should look for the problem: is it due to new behaviour in Bowtie1, a change in RSEM, or some option inside the wrapper script?

Thank you very much in advance for everything!

Best regards,
Santiago

Brian Haas

Mar 8, 2017, 9:06:37 AM
to Santiago, trinityrnaseq-users
The paired-end inputs to RSEM should have reads that are identically ordered.  Could it be that you did something (like combine reads) in such a way that the reads might be out of order?

~b

--
You received this message because you are subscribed to the Google Groups "trinityrnaseq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to trinityrnaseq-users+unsub...@googlegroups.com.
To post to this group, send email to trinityrnaseq-users@googlegroups.com.
Visit this group at https://groups.google.com/group/trinityrnaseq-users.
For more options, visit https://groups.google.com/d/optout.



--
--
Brian J. Haas
The Broad Institute
http://broadinstitute.org/~bhaas

 

Santiago

Mar 8, 2017, 9:46:54 AM
to trinityrnaseq-users
Hi Brian,

Not at all. This was the sequence of commands:

"samples.list"
Control  Control_AN  rawdata/AN_C.R1.fastq.gz  rawdata/AN_C.R2.fastq.gz
Glands   Glands_AN   rawdata/AN_G.R1.fastq.gz  rawdata/AN_G.R2.fastq.gz

/software/trinityrnaseq-2.4.0/Trinity --trimmomatic --quality_trimming_params "ILLUMINACLIP:adapters.fa:2:30:10 LEADING:5 TRAILING:5 MINLEN:25" --seqType fq --CPU 16 --max_memory 220G --samples_file samples.list --output trinity

"samples.trimmed.list"
Control  Control_AN  trinity/AN_C.R1.fastq.gz.P.qtrim.gz  trinity/AN_C.R2.fastq.gz.P.qtrim.gz
Glands   Glands_AN   trinity/AN_G.R1.fastq.gz.P.qtrim.gz  trinity/AN_G.R2.fastq.gz.P.qtrim.gz

/software/trinityrnaseq-2.4.0/util/align_and_estimate_abundance.pl --thread_count 12 --transcripts Trinity.fasta --gene_trans_map Trinity.gene_trans_map --est_method RSEM --aln_method bowtie --prep_reference

/software/trinityrnaseq-2.4.0/util/align_and_estimate_abundance.pl --thread_count 12 --transcripts Trinity.fasta --gene_trans_map Trinity.gene_trans_map --est_method RSEM --aln_method bowtie --seqType fq --samples_file samples.trimmed.list

Cheers,
Santiago

Brian Haas

Mar 8, 2017, 9:51:15 AM
to Santiago, trinityrnaseq-users

Let's see if the line counts in the input fastq files all match up.  They should be equal in each of the left and right fq files.

  ex.
       gunzip -c file.fastq.gz | wc -l

best,

~b
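
Beyond matching line counts, the read names themselves can be compared pair by pair. The following is only a minimal sketch of such a check, not something from Trinity or RSEM: `read_names` and `check_pairs` are hypothetical helper names, and the script assumes gzipped FASTQ files with exactly four lines per record and mate names that differ at most by a trailing /1 or /2 or by a comment after the first space.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Trinity or RSEM).

# Print one read name per FASTQ record, without the leading "@",
# without any comment after the first whitespace, and without /1 or /2.
read_names() {
    gunzip -c "$1" | awk 'NR % 4 == 1 {
        sub(/^@/, ""); sub(/[ \t].*$/, ""); sub(/\/[12]$/, ""); print
    }'
}

# Compare the names of two FASTQ files position by position.
# Prints the first mismatch and returns 1, or prints OK and returns 0.
check_pairs() {
    tmp=$(mktemp -d) || return 1
    read_names "$1" > "$tmp/left"
    read_names "$2" > "$tmp/right"
    # paste puts name N of the left file next to name N of the right file;
    # a shorter file yields an empty field, which also counts as a mismatch.
    paste "$tmp/left" "$tmp/right" | awk -F'\t' '
        ($1 "") != ($2 "") {
            printf "MISMATCH at record %d: %s vs %s\n", NR, $1, $2
            bad = 1
            exit
        }
        END {
            if (!bad) printf "OK: %d read pairs, names match in order\n", NR
            exit (bad ? 1 : 0)
        }'
    status=$?
    rm -rf "$tmp"
    return $status
}
```

It could be called once per sample, for example `check_pairs trinity/AN_C.R1.fastq.gz.P.qtrim.gz trinity/AN_C.R2.fastq.gz.P.qtrim.gz`. The temporary files hold only read names, so they stay much smaller than the FASTQ files themselves.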

Santiago

Mar 8, 2017, 2:48:09 PM
to trinityrnaseq-users
Hi Brian,

Yes, all pairs of files have the same number of lines/sequences (of course, I double-checked). Moreover, I was able to run the abundance estimation successfully using eXpress + bowtie2, salmon and kallisto.

Another odd thing I noticed: when aligning with "bowtie2 --local" to test read representation, 98% of the reads align concordantly against the reference, whereas when running eXpress+bowtie2 (which runs "bowtie2 --no-mixed --no-discordant --gbar 1000 --end-to-end -k 200 -q -X 800") only 87% of the reads align concordantly. Is this something I should expect? I'm getting similar percentages (87%) with both salmon and kallisto.

Best,
Santiago

Brian Haas

Mar 8, 2017, 2:52:33 PM
to Santiago, trinityrnaseq-users
OK - I can't explain this one...    At least every other method you tried worked. ;-)

~b


Dario Strbenac

May 24, 2017, 2:00:24 AM
to trinityrnaseq-users
I ran into exactly the same problem, and I also use Bowtie 1.2.0. I used cutadapt rather than Trimmomatic for adapter trimming, and found that the reads are in the correct order after trimming (i.e. the read ID in the warning message is on the same line number in the R1 and R2 FASTQ files). The problem may be in the BAM file generated by Bowtie. The message "The two mates do not align to a same transcript! RSEM does not support discordant alignments." suggests that the read alignments, not the alignment input, are the problem. I don't know how to check the BAM file for properly paired reads, though.

Bowtie seems to be mapping well:

# reads processed: 59106462
# reads with at least one reported alignment: 52710186 (89.18%)
# reads that failed to align: 6396276 (10.82%)
Reported 197639975 paired-end alignments to 1 output stream(s)

I've made an issue about rsem-parse-alignments at https://github.com/deweylab/RSEM/issues/66
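
One rough way to inspect the BAM file for the conditions named in the error messages is to walk the alignment records two at a time and test whether each consecutive pair shares a read name and a reference transcript. This is only a sketch under stated assumptions, not RSEM's own validation: `check_adjacent_mates` is a hypothetical helper, it expects SAM-formatted text on standard input, and it assumes the file contains only paired-end alignments, written as two adjacent records per alignment.

```shell
#!/bin/sh
# Hypothetical helper (not part of RSEM, Bowtie or samtools).
# Reads SAM text on stdin, e.g.:  samtools view bowtie.bam | check_adjacent_mates
check_adjacent_mates() {
    awk -F'\t' '
        /^@/ { next }                      # skip SAM header lines, if any
        {
            n++
            qname = $1 ""
            sub(/\/[12]$/, "", qname)      # tolerate /1 and /2 mate suffixes
            if (n % 2 == 1) {              # first record of a pair: remember it
                first_name = qname
                first_ref = $3 ""
                next
            }
            if (qname != first_name) {
                printf "record %d: mate names differ: %s vs %s\n", n, first_name, qname
                bad = 1
                exit
            }
            if (($3 "") != first_ref) {
                printf "record %d: mates of %s map to different references: %s vs %s\n", n, qname, first_ref, $3
                bad = 1
                exit
            }
        }
        END {
            if (!bad && n % 2 == 1) {
                printf "odd number of records: %d\n", n
                bad = 1
            }
            if (!bad) printf "OK: %d alignment pairs checked\n", n / 2
            exit (bad ? 1 : 0)
        }'
}
```

The function stops at the first offending record, so the reported record number can be used to look at the surrounding lines of the `samtools view` output by hand. A clean result would not prove the BAM file is acceptable to RSEM; it only rules out the two conditions tested here.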