Hi, all,
I'm trying STAR-2.4.0j for fusion transcript detection. However, STAR consistently misses some known chimeric junctions. Here is a fusion transcript for which STAR reports no reads.
EML4 exon 5 | EML4 exon 6 | ALK exon 19
GTCGAAAATACCTTCAACACCCAAATTAATACCAAAAGTTACCAAAACTGCAGACAAGCATAAAGATGTCATCATCAACCAAGTGTCACCCACCCCGGAGCCACACCTGCCACTCTCGCTGATCCTCTCTGTGGTGACCT
And here is a read that bowtie aligns to the complementary strand of this fusion transcript when given the manually curated fusion sequence as the reference, yet STAR reports it as unmapped.
@V946R:00252:00995
TCACCACAGAGAGGATCAGCGAGAGTGGCAGGTGTGGCTCCGGGGTGGGTGACACTTGGTTGATGATGACATCTTTATGCTTGTCTGCAGTTTTGGTAACTTTTGGTATTAATTTGGGTGTTGA
+
<>>?ADCFDDCCCACCCCCC@@@CCCCACCCADCCCACCCACCC8CCC>CCCDCCCBDACACCCC???CCCCCCC>CCCCCACCCCCAA::::.:=::;?EFE:C@C::5:4::/::/:CCA>>
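As a sanity check that does not depend on any aligner, the strand claim can be tested with plain string matching. The Python sketch below is only an illustration: the names `reverse_complement` and `locate` are made up for this post, and it looks for exact matches only (no mismatches or indels), so it is not a substitute for a real alignment.

```python
# Fusion sequence and read copied verbatim from this post.
FUSION = (
    "GTCGAAAATACCTTCAACACCCAAATTAATACCAAAAGTTACCAAAACTGCAGACAAGCATAAAGATGTC"
    "ATCATCAACCAAGTGTCACCCACCCCGGAGCCACACCTGCCACTCTCGCTGATCCTCTCTGTGGTGACCT"
)
READ = (
    "TCACCACAGAGAGGATCAGCGAGAGTGGCAGGTGTGGCTCCGGGGTGGGTGACACTTGGTTGATGATGAC"
    "ATCTTTATGCTTGTCTGCAGTTTTGGTAACTTTTGGTATTAATTTGGGTGTTGA"
)

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def locate(read, reference):
    """Find an exact match of `read` in `reference` on either strand.

    Returns (strand, 0-based offset on the forward reference), or None
    if the read matches neither strand exactly.
    """
    offset = reference.find(read)
    if offset >= 0:
        return "+", offset
    offset = reference.find(reverse_complement(read))
    if offset >= 0:
        return "-", offset
    return None


if __name__ == "__main__":
    # Prints the strand and offset of the read within the fusion sequence.
    print(locate(READ, FUSION))
```

If `locate` reports the "-" strand, the read is an exact reverse-complement substring of the fusion sequence, so sequencing errors cannot explain why STAR leaves it unmapped.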
The parameters I used were as follows:
--outFilterType BySJout
--outFilterMismatchNmax 999
--outFilterMismatchNoverLmax 0.06
--outFilterMultimapNmax 20
--outFilterMatchNminOverLread 0.66
--outFilterIntronMotifs None
--outSJfilterReads All
--outSAMstrandField intronMotif
--alignSJoverhangMin 8
--alignSJDBoverhangMin 1
--alignIntronMin 20
--alignIntronMax 1000000
--chimSegmentMin 15
--chimJunctionOverhangMin 15
--chimScoreMin 0
--chimScoreDropMax 20
--chimScoreSeparation 10
--chimScoreJunctionNonGTAG -1
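To make the run easier to reproduce, the options above can be assembled into a single command line. The Python sketch below only builds and prints the command. The genome index path and FASTQ file name are placeholders, not the actual files from my run, and `build_star_command` is a name made up for illustration.

```python
import shlex

# Options exactly as listed in this post, in the same order.
STAR_OPTIONS = [
    ("--outFilterType", "BySJout"),
    ("--outFilterMismatchNmax", "999"),
    ("--outFilterMismatchNoverLmax", "0.06"),
    ("--outFilterMultimapNmax", "20"),
    ("--outFilterMatchNminOverLread", "0.66"),
    ("--outFilterIntronMotifs", "None"),
    ("--outSJfilterReads", "All"),
    ("--outSAMstrandField", "intronMotif"),
    ("--alignSJoverhangMin", "8"),
    ("--alignSJDBoverhangMin", "1"),
    ("--alignIntronMin", "20"),
    ("--alignIntronMax", "1000000"),
    ("--chimSegmentMin", "15"),
    ("--chimJunctionOverhangMin", "15"),
    ("--chimScoreMin", "0"),
    ("--chimScoreDropMax", "20"),
    ("--chimScoreSeparation", "10"),
    ("--chimScoreJunctionNonGTAG", "-1"),
]


def build_star_command(genome_dir, fastq, star_bin="STAR"):
    """Return the STAR invocation as an argument list (nothing is executed)."""
    cmd = [star_bin, "--genomeDir", genome_dir, "--readFilesIn", fastq]
    for flag, value in STAR_OPTIONS:
        cmd += [flag, value]
    return cmd


if __name__ == "__main__":
    # Placeholder paths (assumptions): substitute the real index and reads.
    print(shlex.join(build_star_command("/path/to/star_index", "reads.fastq")))
```

Keeping the options in one list makes it simple to change a single value, for example `--chimSegmentMin`, and rerun while holding everything else fixed.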
I suspect the false negative is caused by the micro-exon (EML4 exon 6). In fact, STAR consistently lost more than half of the reads for other fusions involving EML4 exon 6. My question is: how should I set the STAR parameters so that the fusion transcript above is detected correctly?
Best,
Ming Su