Another strand specific forward probability question

James Hanks

Feb 6, 2017, 12:21:30 PM
to RSEM Users
I am trying to set up an analysis pipeline, and I am stuck on what settings to use for --forward-prob. I've looked at a few topics here, but I still don't really understand this option.

According to the RSEM manual:
"--forward-prob Probability of generating a read from the forward strand of a transcript. 1: strand-specific protocol where all (upstream) reads are derived from the forward strand; 0: strand-specific protocol where all (upstream) read are derived from the reverse strand; 0.5: non-strand-specific protocol."

I am confused because the RNA-seq protocol that was used (SMARTer Stranded Total RNA pico mammalian) with Illumina sequencing generates reads from both the forward and reverse strands (actually, two reads for each). R1 is supposed to be the sense strand, R2 the antisense. I am using the output of STAR (set to Stranded) as input for rsem-calculate-expression. As the output is only one file (Aligned.toTranscriptome.out.bam), I am confused about the rsem manual's instructions, which seem to imply that the file should either be from one strand or the other.

Can anyone advise, or at least suggest whether I'm misunderstanding the RSEM instructions, did the STAR alignment incorrectly, or something else?

Bo Li

Feb 10, 2017, 4:38:18 PM
to rsem-...@googlegroups.com
Hi James,

Sorry for the confusion. I recommend you update RSEM to v1.3.0, in which
we replaced --forward-prob with --strandedness.

Please also see my comments on your questions below.

> According to the RSEM manual:
> "--forward-prob Probability of generating a read from the forward
> strand of a transcript. 1: strand-specific protocol where all
> (upstream) reads are derived from the forward strand; 0:
> strand-specific protocol where all (upstream) read are derived from
> the reverse strand; 0.5: non-strand-specific protocol."
>
> I am confused because the RNA-seq protocol that was used (SMARTer
> Stranded Total RNA pico mammalian) with Illumina sequencing generates
> reads from both the forward and reverse strands (actually, two reads
> for each). R1 is supposed to be the sense strand, R2 the antisense.

In the RSEM manual, strandedness is defined with respect to the
transcript, not the cDNA fragment that the reads are sequenced from. For
Illumina sequencing technology, R1 is supposed to be the sense strand of
the cDNA fragment, but not necessarily the sense strand of the
transcript. The same applies to R2.

In a strand-specific protocol, if the cDNA that is sequenced is the
reverse complement of the transcript (as in the Illumina TruSeq
strand-specific protocol), R1 comes from the reverse strand of the
transcript and you should set '--strandedness reverse'. If instead the
cDNA has the same sequence as the transcript, R1 comes from the forward
strand and you should set '--strandedness forward'. If the protocol is
not strand-specific, you should set '--strandedness none'.
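
As a rough illustration of the mapping described above, here is a minimal Python sketch. The helper name strand_args and the table are hypothetical and not part of RSEM; only the option names and values ('--strandedness forward/reverse/none', '--forward-prob 1/0/0.5') come from the manual text quoted in this thread.

```python
# Hypothetical helper (not part of RSEM): maps the orientation of the
# upstream read (R1) relative to the transcript to the RSEM option.
# Values follow the manual text quoted in this thread:
#   forward -> --forward-prob 1, reverse -> 0, none -> 0.5
FORWARD_PROB = {
    "forward": "1",
    "reverse": "0",
    "none": "0.5",
}


def strand_args(strandedness, legacy=False):
    """Return the command-line arguments for a given strandedness.

    legacy=True returns the older --forward-prob form; otherwise the
    --strandedness form introduced in RSEM v1.3.0 is returned.
    """
    if strandedness not in FORWARD_PROB:
        raise ValueError("strandedness must be forward, reverse, or none")
    if legacy:
        return ["--forward-prob", FORWARD_PROB[strandedness]]
    return ["--strandedness", strandedness]
```

For a protocol where R1 is the reverse complement of the transcript, strand_args("reverse") gives the newer form and strand_args("reverse", legacy=True) gives the equivalent older form.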

> I am using the output of STAR (set to Stranded) as input for
> rsem-calculate-expression. As the output is only one file
> (Aligned.toTranscriptome.out.bam), I am confused about the rsem
> manual's instructions, which seem to imply that the file should either
> be from one strand or the other.
>

STAR does not have a strandedness option. If you set the correct RSEM
strandedness option, RSEM will filter out alignments on the incorrect
strand for you.
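
To make the idea of an alignment being on the "incorrect strand" concrete, here is a small Python sketch of an orientation check based on SAM flags. It is an illustration only and does not show RSEM's internal handling: the function names are hypothetical, and it assumes alignments are in transcript coordinates (as in Aligned.toTranscriptome.out.bam), so that an unset reverse bit means the read matches the transcript sequence.

```python
# SAM flag bits, as defined in the SAM specification.
PAIRED = 0x1    # read is part of a pair
REVERSE = 0x10  # read aligned as reverse complement of the reference
READ2 = 0x80    # read is the second mate (R2)


def upstream_read_strand(flag):
    """Strand of the transcript that the upstream read (R1) aligns to.

    For single-end data the read itself is the upstream read. For R2 of
    a pair the orientation is flipped, assuming the usual layout in
    which the two mates align to opposite strands.
    """
    is_reverse = bool(flag & REVERSE)
    if flag & PAIRED and flag & READ2:
        is_reverse = not is_reverse
    return "reverse" if is_reverse else "forward"


def keep_alignment(flag, strandedness):
    """True if the alignment is consistent with the given strandedness."""
    if strandedness == "none":
        return True
    return upstream_read_strand(flag) == strandedness
```

For example, with '--strandedness reverse' in mind, a pair with flags 83 (R1, reverse) and 163 (R2, forward) is consistent, while a pair with flags 99 and 147 is not.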

Hope it helps,
Bo


> Can anyone advise, or at least suggest whether I'm misunderstanding
> the RSEM instructions, did the STAR alignment incorrectly, or
> something else?

James Hanks

Feb 11, 2017, 12:01:31 AM
to RSEM Users
Yes, that does, thank you!

Devinder Kaur

Jul 6, 2018, 2:00:22 PM
to RSEM Users

> In RSEM manual, the strandedness is with respect to the transcript
> instead of the cDNA fragment where the reads are sequenced from. For
> Illumina sequencing technology, R1 is supposed to be the sense strand of
> the cDNA fragment, but not necessarily the sense strand of the
> transcript. It is similar for R2.
>
> In a strand-specific protocol, if the cDNA reverse complimentary with
> the transcript is sequenced (as used in Illumina TruSeq strand-specific
> protocol), R1 comes from the reverse strand of the transcript and you
> should set '--strandedness reverse'.

Sorry, I am still confused.
Does '--strandedness reverse' give the count of reads from the sense strand?
And does setting '--strandedness forward' give the read count from the antisense strand?