Hello,
I have been trying to understand how MATS decides which alternative splicing events to test, but the documentation and the output seem to disagree. The documentation says that "possible alternative splicing events are identified from the RNA-Seq data and annotation of transcripts in GTF format". Does that mean that it takes events from the transcripts in the GTF and also looks in the RNA-Seq data for possible splice junctions that are not in the annotation file?
In SE.MATS.JunctionCountOnly.txt (and in SE.MATS.ReadsOnTargetAndJunctionCounts.txt too) I see events that are not in my annotation GTF (see the example below). If novel junctions are indeed used, how does it decide which splice junctions/skipped exons to test? Does a junction have to be observed in a minimum number of reads?
For example:
ID: 304
GeneID: "NM_002972"
geneSymbol: NA
chr: chr22
strand: -
exonStart_0base: 49242328
exonEnd: 49242406
upstreamES: 49241786
upstreamEE: 49241968
downstreamES: 49244549
downstreamEE: 49244687
ID: 304
IC_SAMPLE_1: 53
SC_SAMPLE_1: 43
IC_SAMPLE_2: 45
SC_SAMPLE_2: 0
IncFormLen: 142
SkipFormLen: 85
PValue: 6.24715057462e-10
FDR: 1.79293221492e-07
IncLevel1: 0.425
IncLevel2: 1.0
IncLevelDifference: -0.575
My GTF was downloaded from the UCSC tables (RefSeq Genes, refGene table, hg18), and the alternative transcript is not in it: there is only one transcript for that gene. (I can include parts of the GTF or a screenshot from my browser if that would help.)
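One way to check whether the exon-skipping junction of such an event is annotated is sketched below in Python. This is only an illustration, not how MATS works internally. It assumes that upstreamES and downstreamES are 0-based starts like exonStart_0base (so 1 is added to compare with the 1-based GTF), and that each exon line carries a transcript_id attribute. The function names and the toy GTF lines are hypothetical; the toy lines are built from the coordinates in the example row, not copied from the real refGene file.

```python
import re
from collections import defaultdict


def exons_by_transcript(gtf_lines):
    """Collect (start, end) exon pairs per (chrom, transcript_id) from GTF lines."""
    exons = defaultdict(list)
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue
        match = re.search(r'transcript_id "([^"]+)"', fields[8])
        if match:
            # GTF coordinates are 1-based and inclusive.
            exons[(fields[0], match.group(1))].append((int(fields[3]), int(fields[4])))
    return exons


def skipping_junction_annotated(gtf_lines, chrom, upstream_ee, downstream_es_0base):
    """Return True if some transcript joins upstream_ee directly to the downstream exon."""
    acceptor_start = downstream_es_0base + 1  # assumption: 0-based start in the MATS row
    for (exon_chrom, _transcript), pairs in exons_by_transcript(gtf_lines).items():
        if exon_chrom != chrom:
            continue
        pairs.sort()
        # Consecutive exons of one transcript define its annotated junctions.
        for (_, prev_end), (next_start, _) in zip(pairs, pairs[1:]):
            if prev_end == upstream_ee and next_start == acceptor_start:
                return True
    return False


# Toy single-transcript annotation built from the example row (hypothetical lines).
ATTRS = 'gene_id "NM_002972"; transcript_id "NM_002972";'
toy_gtf = [
    "\t".join(["chr22", "toy", "exon", str(start), str(end), ".", "-", ".", ATTRS])
    for start, end in [(49241787, 49241968), (49242329, 49242406), (49244550, 49244687)]
]

# upstreamEE and downstreamES taken from the example row.
print(skipping_junction_annotated(toy_gtf, "chr22", 49241968, 49244549))
```

With a real annotation, the lines of the GTF file would be passed in place of toy_gtf. If the function reports that the skipping junction is absent from the annotation while MATS still lists the event, that would suggest the junction was taken from the RNA-Seq alignments.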
Thank you in advance for your help.
Best wishes,
Emma