I don't have time to think about this exhaustively, but here are a couple of thoughts
that might give you ideas:
* For the RI example, you have a very large expression difference between
conditions, and the intron expression looks rather low to me. So maybe
you have a high "background noise" that is more or less the same in both
conditions, but as a percentage of the "signal" (exon expression), it is
much higher in the "normal" condition, where expression is low.
I would look to see if there are reads going across the exon-intron
junctions (both ends of the intron) in the BAM files with IGV (or some
other viewer, or even samtools). If not, I would be suspicious, and maybe
instead of actual intron retention you are seeing something else.
* In your SE example, you also have some expression differences. The
skipped exon is also barely expressed in the "disease" condition (i.e.
skipped most of the time), and you have only 5 reads spanning the
exon2-exon3 junction, so even if you were to expect the same number for
exon1-exon2, statistically it is not unlikely that you observe none of
these due to sampling error. The "normal" condition has 70 exon2-exon3
reads, so that explanation seems a lot less likely. But you can see that
exon1 expression is pretty low as well (possibly lower than exon2). Is
exon1 the first exon in this gene? (This in itself would account for
lower expression, since mRNA is being degraded from the ends.) So
what if instead of the annotated SE event, you are really looking at an
alternative starting exon (i.e. transcript1 = exon1-exon3-... and
transcript2 = exon2-exon3-...)? (If exon1 is not at the start but in the
middle of the transcript, maybe this is more like an MXE event, i.e.
exon0-exon1-exon3 vs exon0-exon2-exon3, and you don't see the reads
spanning the exon0-exon2 junction, since exon0 is outside the scope
of the SE testing and is therefore being ignored here.)
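
To make the junction check from the RI point above a bit more concrete, here is a minimal Python sketch. It is not part of rMATS; the function names and the min_overhang default are my own assumptions. It takes a read's alignment start and CIGAR string (as printed by samtools view) and tests whether a single unspliced aligned block covers an exon-intron boundary. Reads that do this at both ends of the intron are what I would expect for real intron retention; reads that jump over the intron (an N in the CIGAR) do not count.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def aligned_blocks(pos, cigar):
    """Return 0-based, half-open reference blocks of a read, split at splice gaps (N).

    pos is the 0-based leftmost aligned reference position
    (SAM POS is 1-based, so pass POS - 1).
    """
    blocks = []
    start = end = pos
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in "M=XD":
            # These operations consume reference bases within the same block.
            end += length
        elif op == "N":
            # Splice gap: close the current block and start a new one after the gap.
            if end > start:
                blocks.append((start, end))
            start = end = end + length
        # I, S, H and P do not consume reference bases, so they are skipped.
    if end > start:
        blocks.append((start, end))
    return blocks


def crosses_boundary(pos, cigar, boundary, min_overhang=5):
    """True if one unspliced block covers at least min_overhang bases on both sides.

    boundary is the 0-based coordinate of the first base after the
    exon-intron (or intron-exon) boundary. min_overhang=5 is an arbitrary
    illustrative default, not a recommended value.
    """
    return any(
        start <= boundary - min_overhang and end >= boundary + min_overhang
        for start, end in aligned_blocks(pos, cigar)
    )
```

For example, crosses_boundary(100, "50M", 120) is True (the read runs straight through the boundary), while crosses_boundary(100, "20M100N30M", 120) is False (the read is spliced at the boundary). Counting such reads at both intron ends, per condition, would give a rough idea of whether the intron signal is backed by boundary-crossing reads at all.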
I am just speculating here, and maybe I am overlooking something, but
maybe it gives you some ideas. You may also want to look at your quality
control parameters (especially mapping QC, things like percentage of reads
in exons, introns & intergenic regions, etc.), to make sure your libraries
are suitable for splicing analysis. (If the nuclear membrane was ruptured
during RNA extraction, you are getting a bunch of transcripts that are not
completely spliced [with high intron content], as opposed to the finished
and exported mRNA from the cytosol, and while you are selecting against
those with poly-A tail selection, that selection is more an enrichment
than a definite exclusion of non-poly-A RNAs, so if you have enough
of them, you will see them in your reads and the sample is not suitable
for splicing analysis, although it may be fine for differential gene
expression tests.)
Good luck!
Thomas