Hi, STAR users,
I am using STAR for part of the processing of RNA-seq reads from lariat sequencing samples of human cell lines. In the literature, STAR alignment is the first step of data processing (reads are aligned to the hg38 genome in local alignment mode), and the unaligned reads are passed to the next stages to determine branch points. The branch point informative (BP-informative) reads are chimeric reads (image attached to illustrate them). Each BP-informative read contains two parts: the 5' part of the read (part A in the image) maps to a region near the branch point of an intron, and the 3' part (part B in the image) maps to a region near the 5' splice site of the same intron.

A question I have had for a long time: why does STAR (with chimeric mode turned off) report these BP-informative chimeric reads as unaligned? I would expect STAR to soft-clip part A or part B and align the other part to the hg38 genome, in which case these reads would not appear in the unaligned reads list.

My suspicion is that, in local alignment mode, STAR still performs chimeric alignment internally even with chimeric mode turned off, and recognizes these reads as chimeric because they score much higher as chimeric alignments than as local alignments with soft clipping. Since chimeric output is turned off, STAR does not report them, so they end up in the unaligned reads list.
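To make the read structure concrete, here is a toy Python sketch of what I mean by aligning one part and soft-clipping the other. It is only an illustration and not how STAR works: the random genome, the coordinates `ss5` and `bp`, the 40/60 split between part A and part B, and the two helper functions are all made up for this example.

```python
import random

# Toy illustration only: this is NOT how STAR aligns reads.
# The genome, coordinates, and part lengths below are hypothetical.
random.seed(0)
genome = "".join(random.choice("ACGT") for _ in range(2000))

ss5 = 500    # hypothetical 5' splice site (first base of the intron)
bp = 1200    # hypothetical position just downstream of the branch point

part_a = genome[bp - 40:bp]      # 5' part of the read, ends at the branch point
part_b = genome[ss5:ss5 + 60]    # 3' part of the read, starts at the 5' splice site
read = part_a + part_b           # BP-informative chimeric read


def longest_prefix_match(read, genome):
    """Length of the longest read prefix that occurs verbatim in the genome."""
    n = 0
    for k in range(1, len(read) + 1):
        if read[:k] not in genome:
            break
        n = k
    return n


def longest_suffix_match(read, genome):
    """Length of the longest read suffix that occurs verbatim in the genome."""
    n = 0
    for k in range(1, len(read) + 1):
        if read[-k:] not in genome:
            break
        n = k
    return n


prefix_len = longest_prefix_match(read, genome)
suffix_len = longest_suffix_match(read, genome)

print("read length:", len(read))
print("longest 5' match (rest would be soft-clipped):", prefix_len)
print("longest 3' match (rest would be soft-clipped):", suffix_len)
print("read matches the genome end to end:", read in genome)
```

In this toy case each part matches the genome on its own, but no single contiguous match covers the whole read, which is the situation where I would have expected a soft-clipped local alignment.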
I am not sure if I am on the right track, and I would appreciate any input.
Thanks in advance,
Xiao