Conflicting novel vs annotated junctions in 2-pass mode


Tonya Brunetti

Dec 18, 2021, 10:28:58 PM
to rna-star
Dear Alex,

I am using STAR in 2-pass mode to get the novel vs annotated intron junctions relative to an input GTF.  I have 8 samples; I called the junctions for each sample independently, then merged the junction files together and dropped any duplicate rows.

Interestingly, some junctions cover the same intronic region, but one copy has annotated status and another has unannotated status.  Why does this occur?  What is the best way to mitigate this?

Thanks!
Tonya

Alexander Dobin

Dec 21, 2021, 2:00:16 PM
to rna-star
Hi Tonya,

did you run the 2-pass mode on each sample separately?
After the 2-pass run, the junctions that were detected in the 1st pass are considered annotated, which is what happened for these junctions in some samples.
In other samples, the same junctions were only detected in the 2nd pass, and so they are considered unannotated.

Cheers
Alex

Tonya Brunetti

Dec 27, 2021, 3:26:12 PM
to rna-star
Hi Alex,

Yes, I did run the 2-pass mode on each sample separately.  Your explanation makes sense!  I think I misinterpreted what annotated versus unannotated meant.  Would it be best for me to take all the detected junctions, merge them across all samples, ignore the annotation status column, and compare the merged list to the junctions that exist in the GTF?  I am looking to find which junctions are already known in the GTF and which are not listed in it.

Thanks!
Tonya

Alexander Dobin

Dec 27, 2021, 3:51:52 PM
to rna-star
Hi Tonya,

right, for a "true" annotated (i.e. present in the GTF) comparison, you would ignore the annotated column and compare the junction loci to the list of junctions extracted from the GTF, e.g. the sjdbList.out.tab file in the genome directory.

Cheers
Alex
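
The comparison described in the last reply could be sketched roughly as follows. This is a minimal sketch, not part of the original replies. It assumes that the first three tab-separated columns of both the per-sample SJ.out.tab files and the genome directory's sjdbList.out.tab are chromosome, intron start, and intron end in the same coordinate convention, which is worth checking against your own files. The function names read_junction_loci and classify_junctions are hypothetical, and strand is ignored for simplicity.

```python
import csv


def read_junction_loci(path):
    """Collect (chrom, start, end) tuples from a tab-separated junction file.

    Assumption: columns 1-3 are chromosome, intron start, intron end.
    All other columns (strand, motif, annotation status, counts) are ignored.
    """
    loci = set()
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) < 3:
                continue  # skip blank or malformed lines
            loci.add((row[0], int(row[1]), int(row[2])))
    return loci


def classify_junctions(sample_sj_files, gtf_junction_file):
    """Split the junctions detected in any sample by presence in the GTF list.

    The annotation status column of the per-sample files is deliberately
    not used; only the junction loci are compared.
    """
    known = read_junction_loci(gtf_junction_file)
    detected = set()
    for path in sample_sj_files:
        detected |= read_junction_loci(path)  # set union removes duplicates
    in_gtf = detected & known
    not_in_gtf = detected - known
    return in_gtf, not_in_gtf
```

For example, calling classify_junctions with a list of the eight per-sample SJ.out.tab paths and the path to sjdbList.out.tab (both placeholders for your own locations) returns two sets of loci: those present in the GTF-derived list and those that are not.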