Hi, I have run into the case shown in the following figure, which is from a TCGA sample. The top track is the RNA-seq data, the middle track is the WXS data from the same patient's tumor, and the bottom track is the WXS data from the matched normal. According to the tumor WXS data, there is a deletion from chr17:74732936 to chr17:74732959. However, STAR aligned some RNA-seq reads into this deleted region, which produced an apparent mutation call at chr17:74732942. Three of these reads are:
CCCCGTACCTGCGGGGTGGCGGTCCCCGGCGGCCGTAGCGAGCCATTTGC
GTCCCTGCGGGGTGGCGGTCCCCGGCGGCCGTAGCGCGCCATTTGCACCC
CACCGCCCCCGTACCTGCGGGGTGGCGGTCCCCGGCGGCCGTAGCGCGCC
They should be aligned as follows, where '|' marks the junction created by the deletion:
CCCCGTACCTGCGGGGTGGCGGTCCCC|GGCGGCCGTAGCGAGCCATTTGC
GTCCCTGCGGGGTGGCGGTCCCC|GGCGGCCGTAGCGCGCCATTTGCACCC
CACCGCCCCCGTACCTGCGGGGTGGCGGTCCCC|GGCGGCCGTAGCGCGCC
However, they are actually aligned as follows, where 'C' is the apparent mutation caused by the misalignment and '[...]' marks the part of the read that STAR soft-clipped:
CCCCGTACCTGCGGGGTGGCGGTCCCCGGCGGCCGT[AGCGAGCCATTTGC]
GTCCCTGCGGGGTGGCGGTCCCCGGCGGCCGT[AGCGCGCCATTTGCACCC]
CACCGCCCCCGTACCTGCGGGGTGGCGGTCCCCGGCGGCCGT[AGCGCGCC]
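To make the two layouts easier to compare, here is a minimal Python sketch that turns the notation above into CIGAR strings. It rests on a few assumptions: both deletion coordinates are inclusive, every aligned base is written as `M`, and the gap is written as `N` because the STAR manual describes `--alignIntronMin` (default 21) as the size at or above which a genomic gap is treated as an intron rather than a deletion. The function names are mine and purely illustrative.

```python
# Reads copied from the post.
# "|" marks the deletion junction; "[...]" marks the part soft-clipped by STAR.
EXPECTED = [
    "CCCCGTACCTGCGGGGTGGCGGTCCCC|GGCGGCCGTAGCGAGCCATTTGC",
    "GTCCCTGCGGGGTGGCGGTCCCC|GGCGGCCGTAGCGCGCCATTTGCACCC",
    "CACCGCCCCCGTACCTGCGGGGTGGCGGTCCCC|GGCGGCCGTAGCGCGCC",
]
OBSERVED = [
    "CCCCGTACCTGCGGGGTGGCGGTCCCCGGCGGCCGT[AGCGAGCCATTTGC]",
    "GTCCCTGCGGGGTGGCGGTCCCCGGCGGCCGT[AGCGCGCCATTTGCACCC]",
    "CACCGCCCCCGTACCTGCGGGGTGGCGGTCCCCGGCGGCCGT[AGCGCGCC]",
]

# Assumption: both deletion coordinates are inclusive.
DEL_START, DEL_END = 74732936, 74732959
GAP_LEN = DEL_END - DEL_START + 1


def expected_cigar(read, gap_len=GAP_LEN, gap_op="N"):
    """CIGAR for a read split at '|'; gap_op is 'N' (junction) or 'D' (deletion)."""
    left, right = read.split("|")
    return f"{len(left)}M{gap_len}{gap_op}{len(right)}M"


def observed_cigar(read):
    """CIGAR for a read whose tail is soft-clipped, written as 'ALIGNED[CLIPPED]'."""
    aligned, clipped = read.rstrip("]").split("[")
    return f"{len(aligned)}M{len(clipped)}S"


for exp, obs in zip(EXPECTED, OBSERVED):
    print("expected", expected_cigar(exp), "| observed", observed_cigar(obs))
```

Under these assumptions the first read would be `27M24N23M` if split at the deletion, whereas the layout STAR actually produced corresponds to `36M14S`. Pass `gap_op="D"` to write the gap as a deletion instead.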
Basically, STAR chose to align reads like these with one mismatch plus a soft clip instead of splitting them to find a better alignment. In this case, STAR should find a new junction within this SRSF2 exon, which is actually a deletion according to the exome-seq data. Is there a way to avoid this kind of misalignment with STAR? Ideally, STAR would report this as a new junction, which in reality is caused by the deletion within this exon.
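As a rough illustration of why I consider the split alignment "better", here is a toy scoring sketch in Python. It is not STAR's actual scoring code: the +1/-1 match and mismatch scores, the zero cost for soft-clipped bases, and the single -8 gap penalty (the value the STAR manual lists as the default for `--scoreGapNoncan`) are all assumptions, and it also assumes every base of the split read matches the reference. Any other scoring terms or filters STAR applies are ignored.

```python
# Toy local-alignment scoring. All values below are assumptions, not STAR internals.
MATCH, MISMATCH = 1, -1   # assumed per-base scores
GAP_PENALTY = -8          # assumed penalty for one non-canonical junction
SOFT_CLIP = 0             # assumed: clipped bases neither add nor subtract


def split_score(read_with_bar, gap_penalty=GAP_PENALTY):
    """Score a read split at '|', assuming every base matches the reference."""
    left, right = read_with_bar.split("|")
    return MATCH * (len(left) + len(right)) + gap_penalty


def clipped_score(read_with_clip, n_mismatches=1):
    """Score a read written as 'ALIGNED[CLIPPED]' with n_mismatches in ALIGNED."""
    aligned, clipped = read_with_clip.rstrip("]").split("[")
    n_matches = len(aligned) - n_mismatches
    return MATCH * n_matches + MISMATCH * n_mismatches + SOFT_CLIP * len(clipped)


# First read from the post, in both layouts.
split_read = "CCCCGTACCTGCGGGGTGGCGGTCCCC|GGCGGCCGTAGCGAGCCATTTGC"
clipped_read = "CCCCGTACCTGCGGGGTGGCGGTCCCCGGCGGCCGT[AGCGAGCCATTTGC]"

print("split:", split_score(split_read))
print("mismatch + soft clip:", clipped_score(clipped_read))
```

In this toy model the split alignment of the first read scores 42 and the mismatch-plus-soft-clip alignment scores 34, which is why I would expect the junction to be preferred.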