RNA STAR AGAG bug?


Peter Sudmant

Jan 12, 2015, 5:39:01 PM
to rna-...@googlegroups.com
I've noticed that RNA STAR seems to output highly strand-biased results for AGAGs (i.e., 2-nt-long alternative 3' splice sites) on the + and - strands.

For example, using mouse brain data from Merkin et al., I ran RNA-STAR, building an index with all putative AGAGs on the + and - strands. When I analyze the resulting SJ.out.tab, however, I get 4004 AGAGs on the - strand compared to only 226 on the + strand. I've experimented with other short 3' alternative exons, and AGAGs are the only ones that show this effect. Any ideas what might be going on? I've included the SJ.out.tab as an attachment, and I'd be happy to attach further data or explore more if you have suggestions.
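
For reference, here is a rough Python sketch of how such a per-strand tally can be made from SJ.out.tab. It assumes the standard STAR column layout (chromosome, 1-based intron start, 1-based intron end, strand coded as 0/1/2 for undefined/+/-). The function name `strand_counts` and the `candidates` set of putative AGAG introns are hypothetical, made up for illustration, and the example rows are not real data.

```python
from collections import Counter

# STAR encodes strand in column 4 of SJ.out.tab: 0 = undefined, 1 = +, 2 = -
STRAND = {"0": ".", "1": "+", "2": "-"}

def strand_counts(sj_lines, candidates):
    """Count junctions per strand, keeping only introns listed in `candidates`.

    `candidates` is a set of (chrom, intron_start, intron_end) tuples in
    STAR's convention: 1-based first and last base of the intron.
    """
    counts = Counter()
    for line in sj_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        chrom, start, end, strand = fields[0], int(fields[1]), int(fields[2]), fields[3]
        if (chrom, start, end) in candidates:
            counts[STRAND.get(strand, "?")] += 1
    return counts

# Made-up rows in SJ.out.tab layout, for illustration only
example = [
    "chr1\t100\t200\t1\t1\t1\t10\t0\t30",
    "chr1\t300\t400\t2\t1\t1\t5\t0\t25",
    "chr1\t500\t600\t2\t1\t0\t2\t0\t12",
]
wanted = {("chr1", 100, 200), ("chr1", 300, 400)}
print(strand_counts(example, wanted))
```

Note that the candidate set has to use the same coordinate convention as SJ.out.tab, otherwise junctions will silently fail to match.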

Thanks very much! 

Cheers, Peter
SJ.out.tab.gz

Alexander Dobin

Jan 13, 2015, 5:51:37 PM
to rna-...@googlegroups.com
Hi Peter,

I cannot think of any reason for strand bias in mapping to annotated junctions with a specific intron motif. 
How many AGAG junctions are annotated on the + and - strands? In principle, there could be annotation bias for these rare non-canonical junctions.

I think we will need to have a second opinion here, say from GSNAP or TopHat - could you run them on the same dataset and check for the bias?

Thanks!
Alex

Peter Sudmant

Jan 14, 2015, 12:30:28 PM
to rna-...@googlegroups.com
Thanks very much Alex, 

I did actually confirm a number of junctions with TopHat but not STAR, but I think I've found the issue. TopHat junctions are given as the 0-based positions of the last base of the left exon and the first base of the right exon, while STAR junctions are given as the 1-based positions of the first and last bases of the intron.

Thus, for the following simple example (a 7-base intron), the STAR junction should be chrXX\t2\t8\t+, correct?

    0 based - 012345678
    1 based - 123456789
    genome  - XGT---AGX

    tophat chrXX\t0\t8\t+
    STAR chrXX\t2\t8\t+
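
A minimal Python sketch of that conversion, assuming the two conventions exactly as described above (TopHat: 0-based last base of the left exon and first base of the right exon; STAR: 1-based first and last base of the intron). The function name `tophat_to_star` is hypothetical, just for illustration.

```python
def tophat_to_star(left_exon_last_0based, right_exon_first_0based):
    """Convert TopHat-style junction coordinates to STAR-style intron coordinates.

    Assumed input: 0-based last base of the left exon and 0-based first base
    of the right exon. Output: 1-based first and last base of the intron.
    """
    # 0-based exon end -> 1-based (+1), then step one base into the intron (+1)
    intron_start = left_exon_last_0based + 2
    # 0-based exon start -> 1-based (+1), then step one base back into the intron (-1)
    intron_end = right_exon_first_0based
    return intron_start, intron_end

# The example above: tophat chrXX 0 8 corresponds to STAR chrXX 2 8
print(tophat_to_star(0, 8))  # (2, 8)
```

So the start shifts by two and the end stays numerically the same, which matches the example above.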

Alexander Dobin

Jan 14, 2015, 3:55:23 PM
to rna-...@googlegroups.com
Hi Peter,

That's correct. Please let me know if this removes the strand bias.

Cheers
Alex

Peter Sudmant

Jan 14, 2015, 4:16:42 PM
to rna-...@googlegroups.com
Testing as we speak! Thanks!