Non-canonical SJ have no strand information

Stefanos Bamopoulos

Feb 4, 2018, 5:14:58 PM
to rna-star
Hello Alex,

I have (once again) a question regarding the splice junctions reported in the SJ.out.tab file.

After running a STAR 2-pass mapping, I detected some splice junctions that interest me, and they seem to have a non-canonical motif.
What made me curious is that these SJs (there are two of them) have no strand information; the strand is reported as undefined.
Why can the aligner not determine the strand, given that the library is strand-specific?
It seems strange that the aligner can pinpoint the exact location of the junction but not its strand.
And I guess the motif is reported as non-canonical because no typical motif is present on either strand at the reported position?

Thank you!

Best
Stefan

Alexander Dobin

Feb 5, 2018, 12:39:44 PM
to rna-star
Hi Stefanos,

STAR does not consider the strandedness of the library directly, which is why it cannot determine the strand of non-canonical junctions. To determine their strand, you would have to check the alignments over these junctions. You can add the jM and jI tags to the --outSAMattributes option (e.g. --outSAMattributes NH HI nM AS jI jM), which will show the junction motif and intron start/end for each alignment.
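As a rough illustration of reading those tags, here is a minimal Python sketch that works on one text SAM record (for example a line printed by samtools view). The function name junctions_from_sam_line is made up for this example. It assumes the coding described in the STAR manual: jM holds one motif code per junction (0 for non-canonical, 1 to 6 for the canonical motifs, plus 20 if the junction is annotated), jI holds 1-based intron start/end pairs, and both are -1 for unspliced alignments.

```python
# Motif codes used by STAR for the jM attribute (same coding as column 5 of SJ.out.tab).
MOTIFS = {0: "non-canonical", 1: "GT/AG", 2: "CT/AC", 3: "GC/AG",
          4: "CT/GC", 5: "AT/AC", 6: "GT/AT"}


def junctions_from_sam_line(line):
    """Return [(intron_start, intron_end, motif, annotated), ...] for one SAM record.

    Hypothetical helper: expects a tab-separated SAM line carrying the
    jM:B:c,... and jI:B:i,... attributes. Returns [] if the tags are
    missing or the read is unspliced.
    """
    fields = line.rstrip("\n").split("\t")
    # Optional fields look like "jM:B:c,1,0": keep the tag name and the value part.
    tags = {f[:2]: f[5:] for f in fields[11:]}
    if "jM" not in tags or "jI" not in tags:
        return []
    # Drop the leading array type letter ("c" or "i") before converting.
    motif_codes = [int(x) for x in tags["jM"].split(",")[1:]]
    coords = [int(x) for x in tags["jI"].split(",")[1:]]
    if motif_codes == [-1]:
        return []  # no junctions in this alignment
    junctions = []
    for k, code in enumerate(motif_codes):
        annotated = code >= 20
        motif = MOTIFS[code - 20 if annotated else code]
        junctions.append((coords[2 * k], coords[2 * k + 1], motif, annotated))
    return junctions
```

Combining this output with the read's FLAG field is what lets you assign a likely strand to a non-canonical junction.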

The junction is considered canonical if it matches canonical motifs on either strand.

Cheers
Alex

Stefanos Bamopoulos

Feb 21, 2018, 7:11:27 AM
to rna-star
Hi Alex,

thank you for clearing things up. I already have the information on the intron start/end from the SJ.out.tab file. I am not parsing the BAM files directly. 
So, if I understand you correctly, STAR does not consider which strand each read comes from, which makes sense considering that there is no option to provide that information as an input parameter.
What I am unsure about is how STAR is able to determine the strand of the canonical junctions if it is not able to do it for non-canonical junctions. Does STAR align the read and then look at each strand and try to find a canonical motif? 

Best
Stefan

Alexander Dobin

Feb 23, 2018, 1:18:21 PM
to rna-star
Hi Stefanos,

The canonical motifs are strand-specific, i.e. the motif has to be GT/AG on the DNA strand the junction is transcribed from. That is how STAR determines the strand of a junction; it does not check the strand of the reads that map to it. Even in a stranded library, some reads will map to the "wrong" strand (typically <1%).
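To make the idea concrete, here is a small Python sketch of motif-based strand assignment. It is only an illustration of the principle, not STAR's actual code, and the name classify_junction is invented for this example. It looks at the first and last two bases of the intron on the + strand of the reference and uses the motif coding from the STAR manual (1: GT/AG, 2: CT/AC, 3: GC/AG, 4: CT/GC, 5: AT/AC, 6: GT/AT, 0: non-canonical). The even-numbered motifs are the reverse complements of the odd ones, so they imply the - strand.

```python
# (first two intron bases, last two intron bases) as seen on the + strand
# -> (motif code, implied transcript strand)
CANONICAL = {
    ("GT", "AG"): (1, "+"), ("CT", "AC"): (2, "-"),
    ("GC", "AG"): (3, "+"), ("CT", "GC"): (4, "-"),
    ("AT", "AC"): (5, "+"), ("GT", "AT"): (6, "-"),
}


def classify_junction(genome_seq, intron_start, intron_end):
    """Return (motif_code, strand) for an intron given 1-based inclusive coordinates.

    genome_seq is the + strand sequence of the chromosome (a plain string here,
    which is an assumption made to keep the sketch self-contained).
    Non-canonical junctions get code 0 and strand "." (undefined).
    """
    first2 = genome_seq[intron_start - 1:intron_start + 1].upper()
    last2 = genome_seq[intron_end - 2:intron_end].upper()
    return CANONICAL.get((first2, last2), (0, "."))
```

For a non-canonical junction neither orientation matches, so nothing in the sequence alone indicates the strand, which is why it is reported as undefined.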

If you want to see how many reads map to the junctions in each strand orientation, you would need to parse the BAM files, because SJ.out.tab does not contain this information. This would allow you to find the (most likely) strand of non-canonical junctions.
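A possible way to do that counting is sketched below in plain Python on text SAM records (e.g. the output of samtools view for the region of interest). All function names are hypothetical. Junctions are taken from the N operations in the CIGAR string, and the transcript strand is inferred from the FLAG bits 0x10 (read reverse strand) and 0x80 (second mate). The read1_is_sense argument is an assumption you have to supply yourself, since it depends on the library protocol: False for protocols where read 1 is antisense to the transcript (e.g. dUTP-based), True for the opposite.

```python
import re
from collections import Counter

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def introns_from_cigar(pos, cigar):
    """Yield (start, end), 1-based inclusive, for every N operation in the CIGAR."""
    ref = pos
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op == "N":
            yield (ref, ref + length - 1)
        if op in "MDN=X":  # operations that consume reference bases
            ref += length


def transcript_strand(flag, read1_is_sense):
    """Infer the transcript strand implied by one read of a stranded library."""
    on_plus = not (flag & 0x10)   # read aligned to the + strand
    if flag & 0x80:               # mate 2 points the opposite way to mate 1
        on_plus = not on_plus
    if not read1_is_sense:        # antisense protocols flip the orientation
        on_plus = not on_plus
    return "+" if on_plus else "-"


def count_junction_strands(sam_lines, chrom, start, end, read1_is_sense):
    """Count reads spanning the intron chrom:start-end by implied transcript strand."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        f = line.rstrip("\n").split("\t")
        flag, rname, pos, cigar = int(f[1]), f[2], int(f[3]), f[5]
        if flag & 0x4 or rname != chrom:  # skip unmapped reads and other chromosomes
            continue
        if (start, end) in introns_from_cigar(pos, cigar):
            counts[transcript_strand(flag, read1_is_sense)] += 1
    return counts
```

The orientation with the clear majority of reads is the most likely strand of the junction; a small minority on the other strand is expected even in a stranded library.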

Cheers
Alex

Stefanos Bamopoulos

May 2, 2018, 5:24:02 PM
to rna-star
Hi Alex,

I am sorry for the late reply. 
Thank you for illuminating the inner workings of STAR!

Best
Stefan