In your example it is strange that you would get a very low FDR with
almost no IncLevelDiff; I suspect that the read numbers are very high, so
that minimal differences appear statistically significant even though
their biological significance is questionable. Maybe Eric has more insight
into that.
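To illustrate how read depth alone can drive significance, here is a minimal sketch using a generic pooled two-proportion z-test. This is a textbook test chosen for illustration, not the statistical model rMATS uses, and the function name and read counts are made up. The same pair of fractions (0.999 vs 0.998) is tested at two depths:

```python
import math

def two_proportion_p_value(x1, n1, x2, n2):
    """Two-sided p-value of a pooled two-proportion z-test.

    x1/n1 and x2/n2 are the inclusion-read fractions in the two groups.
    Generic textbook test for illustration only, NOT the rMATS model.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return math.erfc(abs(z) / math.sqrt(2))

# Same fractions (0.999 vs 0.998), different depth (made-up counts).
p_shallow = two_proportion_p_value(999, 1_000, 998, 1_000)
p_deep = two_proportion_p_value(999_000, 1_000_000, 998_000, 1_000_000)
print(f"1e3 reads per group: p = {p_shallow:.3g}")
print(f"1e6 reads per group: p = {p_deep:.3g}")
```

The difference between the fractions is 0.001 in both calls; only the number of reads changes, and with it the p-value.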
The IncLevelDiff is certainly connected to the biological significance, so
it makes sense to apply a cutoff there as well, but your proposed cutoff of
0.5 seems very high, and with most samples you will get few or no events.
In my experience, splicing events are often more subtle, and remember that
IncLevelDiff is a difference between fractions/percentages, not a
fold change as used in differential expression.
Example 1:
IncLevel1 = 0.1, IncLevel2 = 0.2, IncLevelDiff = 0.1,
fold change = 0.2/0.1 = 2x.
Example 2:
IncLevel1 = 0.5, IncLevel2 = 1.0, IncLevelDiff = 0.5,
fold change = 1.0/0.5 = 2x.
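The two examples can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the helper name is made up, and it reports the absolute difference as in the examples, ignoring whatever sign convention the tool itself uses.

```python
def diff_and_fold_change(inc_level_1, inc_level_2):
    """Return (absolute difference, fold change) for two inclusion levels."""
    diff = abs(inc_level_2 - inc_level_1)
    fold = inc_level_2 / inc_level_1
    return diff, fold

# The two examples from the text: same fold change, different difference.
for inc1, inc2 in [(0.1, 0.2), (0.5, 1.0)]:
    diff, fold = diff_and_fold_change(inc1, inc2)
    print(f"IncLevel1={inc1}, IncLevel2={inc2}: "
          f"diff={diff:.1f}, fold change={fold:.1f}x")
```

Both pairs give the same fold change, while the difference varies by a factor of five, which is why a fold-change cutoff does not translate directly into an IncLevelDiff cutoff.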
With gene expression, you have two values per gene -- expression
(counts/CPM/TPM) in group 1, and expression in group 2. With splicing
events you have four values: the levels of isoform A and of isoform B in
group 1, and the levels of both isoforms in group 2. The fraction of
isoform A in group 1 is IncLevel1; the fraction of isoform B is
(1 - IncLevel1).
So the situation is a bit more complex than for simple gene expression.
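As a rough sketch of how the fraction of isoform A follows from the two isoform levels, the function below computes it from inclusion and skipping read counts. The function name is made up. My understanding is that rMATS also normalises the counts by the effective lengths of the two forms (the IncFormLen and SkipFormLen columns); the optional length arguments stand in for that, and their default of 1.0 (a plain read fraction) is an assumption. The counts are the five per group quoted in the question.

```python
def inclusion_fraction(inc_reads, skip_reads, inc_len=1.0, skip_len=1.0):
    """Fraction of isoform A (inclusion) among both isoforms.

    inc_len / skip_len are optional effective lengths used to normalise
    the counts; the defaults of 1.0 (plain read fraction) are an assumption.
    """
    inc = inc_reads / inc_len
    skip = skip_reads / skip_len
    if inc + skip == 0:
        return float("nan")  # no reads for either isoform
    return inc / (inc + skip)

# (inclusion, skipping) junction counts quoted in the question.
group1 = [(151, 0), (129, 0), (260, 0), (98, 0), (129, 0)]
group2 = [(7100, 2), (23500, 9), (7576, 0), (8069, 10), (11191, 4)]
for name, group in (("group 1", group1), ("group 2", group2)):
    print(name, [round(inclusion_fraction(i, s), 3) for i, s in group])
```

The fraction of isoform B is then simply one minus the returned value.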
Obviously, if you can find events with IncLevelDiff of >=0.5, these would
be very strong and likely worth looking at more closely. What an event
means in the context of the biology depends not only on the effect size
but also on the biological function of the isoforms -- what does a shift
from one to the other do in the cell?
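For the combined filter itself, a minimal sketch with the standard library could look like the following. It assumes the usual tab-separated rMATS table with `FDR` and `IncLevelDifference` columns; the function name and the demo table are made up, and the default of 0.1 for the difference is purely illustrative, not a recommended cutoff.

```python
import csv
import io

def select_events(handle, max_fdr=0.05, min_abs_diff=0.1):
    """Yield rows passing both the FDR and the |IncLevelDifference| cutoff.

    The default min_abs_diff of 0.1 is illustrative only.
    """
    for row in csv.DictReader(handle, delimiter="\t"):
        try:
            fdr = float(row["FDR"])
            diff = float(row["IncLevelDifference"])
        except (TypeError, ValueError):
            continue  # skip rows with missing or non-numeric values (e.g. NA)
        if fdr <= max_fdr and abs(diff) >= min_abs_diff:
            yield row

# Made-up three-event table standing in for a real rMATS output file.
demo = io.StringIO(
    "ID\tFDR\tIncLevelDifference\n"
    "1\t4.8e-12\t0.0\n"
    "2\t0.01\t-0.25\n"
    "3\t0.20\t0.60\n"
)
selected = [row["ID"] for row in select_events(demo)]
print(selected)
```

With a real result file, the `io.StringIO` object would be replaced by an opened file handle, and `min_abs_diff` adjusted to whatever effect size is meaningful for the isoforms in question.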
Hope this helps,
Thomas
On Wed, 16 Jun 2021,
laha.sayan...@gmail.com wrote:
>
> Hi,
> I have just run rmats with paired model. I am looking into the outputs
> for A3SS splicing event. I have sorted the FDR in descending to get the
> most significant events first.
>
> However, my second entry has Inclevel1 and IncLevel2 values as (12
> replicates in each group):
>
> *IncLevel1:
> 1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0*
>
> *IncLevel2:*
> *1.0,1.0,1.0,0.999,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.999,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0*
>
> The computed inclusion level difference is 0 in this case.
>
> This event is showing *FDR = 4.83518780569625E-12*. is this right?
>
> P.S. I have looked into the Junction Counts and they have values like (only
> 5 are shown for brevity)
>
> *IJC_SAMPLE_1: 151, 129, 260, 98, 129*
> *SJC_SAMPLE_1: 0,0,0,0,0*
>
> *IJC_SAMPLE_2: 7100, 23500, 7576, 8069, 11191*
> *SJC_SAMPLE_2: 2, 9, 0, 10, 4*
>
> So from this we can see that the inclusion isoform is present in both
> conditions, but in condition 1 it is more dominant than in condition 2.
> The skipping isoform is not expressed or only meagerly expressed in both
> conditions. However, since this program determines alternative splicing
> between 2 conditions, judging from the IncLevelDifference value, how is it
> significant at all?
>
> I was thinking of selecting candidates from the outputs based on FDR and
> IncLevelDifference. For instance, if I impose a threshold of FDR <= 0.05 and
> |IncLevelDiff| >= 0.5 (say), will that be appropriate? In gene expression
> studies, a fold change cutoff of 2-fold is normally used as standard. I
> believe fold change and IncLevelDiff are analogous, so a suitable cutoff
> should help me select the best results?
>
> Thanks,
> Sayantan
>