MAJIQ parameters for rMATS comparison


Ganesh Subramani

Dec 19, 2023, 7:27:39 PM
to Biociphers

I apologize if this is a repeat post; I cannot find the one I made earlier.

I would like to see whether statistically significant differential splicing events identified by rMATS can also be identified by MAJIQ.
I had run rMATS with the default --cstat value of 0.0001 to compare splicing changes in 3 WT replicates versus 3 KO replicates. Then I filtered events to have FDR<0.1 and deltaPSI>0.1.
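A minimal sketch of the rMATS filter described above, for concreteness. It assumes the standard rMATS output columns `FDR` and `IncLevelDifference`, and it assumes the deltaPSI cutoff is applied to the absolute value; the function name and the example rows are made up for illustration.

```python
import csv
import io


def filter_rmats(rows, fdr_max=0.1, dpsi_min=0.1):
    """Keep events with FDR < fdr_max and |IncLevelDifference| > dpsi_min.

    Assumption: the deltaPSI cutoff is applied to the absolute value,
    so events changing in either direction are kept.
    """
    kept = []
    for row in rows:
        # Skip rows where rMATS could not compute a value.
        if row["FDR"] == "NA" or row["IncLevelDifference"] == "NA":
            continue
        fdr = float(row["FDR"])
        dpsi = float(row["IncLevelDifference"])
        if fdr < fdr_max and abs(dpsi) > dpsi_min:
            kept.append(row)
    return kept


# Made-up rows standing in for a file such as SE.MATS.JC.txt.
example = (
    "ID\tFDR\tIncLevelDifference\n"
    "1\t0.01\t0.25\n"
    "2\t0.50\t0.30\n"
    "3\t0.02\t-0.15\n"
    "4\t0.03\t0.05\n"
    "5\tNA\tNA\n"
)
rows = list(csv.DictReader(io.StringIO(example), delimiter="\t"))
significant = filter_rmats(rows)
print([r["ID"] for r in significant])
```

On a real run, the `io.StringIO(example)` stand-in would be replaced by an open handle to the rMATS output file.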

My questions are regarding what threshold and which probability values I should use in MAJIQ to compare with rMATS.
I intend to run a majiq deltapsi analysis to compare my 3 WT vs 3 KO replicates. Is this OK, or should I use heterogen instead?

I do not understand the statistics behind rMATS or MAJIQ, so I have a few practical questions about what I should be doing.

1. As I understand it, the p-value rMATS gives is the probability of deltaPSI<=threshold, and setting the threshold to a very small value (the default, 0.0001) is meant to detect whether there is any change at all between the groups.
So, if I want to replicate my rMATS analysis with MAJIQ, would setting the voila --threshold to 0.0001 (and filtering for events with probability_not_changing<0.1 and deltaPSI>=0.1) make sense?
Or do the statistics used by MAJIQ not make sense with such a low threshold?

2. On my initial run to familiarise myself with MAJIQ and its outputs, I used majiq deltapsi to compare my 3 WT vs 3 KO replicates, followed by voila tsv with the default threshold of 0.2.
I would like to understand the probability values (probability_changing and probability_not_changing) in the voila .tsv output file.
From reading other threads here and watching the workshop videos, it seems to me that probability_changing is the probability that an LSV changes by 20% between the WT and KO groups, and probability_not_changing is the probability that an LSV does not change by 20% between the WT and KO groups. Is that correct?

3. Are these probability values (probability_changing and probability_not_changing) already corrected for multiple testing?
For clarity, rMATS gives both a p-value and an FDR. Or is multiple testing correction unnecessary in the context of MAJIQ?

4. Assuming I can use the probability_changing and probability_not_changing values without further multiple testing correction, would FDR<0.1 in rMATS be comparable to probability_not_changing<0.1 (since both test the probability of the null hypothesis being true)?
Meaning, when I filter LSVs/events, would I take rows with probability_not_changing<0.1?
Or would it be more advisable to take LSVs with probability_changing>0.9?

5. Is there a method or a script to convert MAJIQ outputs to something similar to what rMATS outputs?
For example, I would like to convert the "cassette" events described in the voila modulize output to the rMATS output format for skipped exons (i.e., one exon skipping event given as the coordinates of the spliced exon, upstream exon and downstream exon, plus the PSI values for splicing between each of these exons in each of my samples, instead of having them split over 4 rows).
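To make the filter asked about in question 4 concrete, here is a minimal sketch of the probability_changing>0.9 variant. Everything about the file layout is an assumption: the column names are passed as parameters because the exact headers may differ between voila versions, the per-junction values are assumed to be semicolon-separated within one LSV row, and the function name and example rows are made up. Whether this cutoff is comparable to an rMATS FDR is exactly the open question, so this only shows the mechanics.

```python
def lsv_passes(row, prob_col="probability_changing",
               dpsi_col="mean_dpsi_per_lsv_junction",
               min_prob=0.9, min_dpsi=0.1, sep=";"):
    """Return True if any junction of the LSV passes both cutoffs.

    Assumptions: `row` is a dict for one LSV, and both columns hold one
    value per junction, joined by `sep`, in the same junction order.
    """
    probs = [float(x) for x in row[prob_col].split(sep)]
    dpsis = [float(x) for x in row[dpsi_col].split(sep)]
    return any(p > min_prob and abs(d) >= min_dpsi
               for p, d in zip(probs, dpsis))


# Made-up LSV rows; real rows would come from reading the voila .tsv.
lsvs = [
    {"lsv_id": "geneA:s:100-200",
     "probability_changing": "0.95;0.10",
     "mean_dpsi_per_lsv_junction": "0.30;-0.30"},
    {"lsv_id": "geneB:t:300-400",
     "probability_changing": "0.40;0.35",
     "mean_dpsi_per_lsv_junction": "0.12;-0.12"},
    {"lsv_id": "geneC:s:500-600",
     "probability_changing": "0.99;0.99",
     "mean_dpsi_per_lsv_junction": "0.05;-0.05"},
]
selected = [r["lsv_id"] for r in lsvs if lsv_passes(r)]
print(selected)
```

The probability_not_changing<0.1 variant would need the comparison direction flipped, not just a different `prob_col`, so it is left out of this sketch.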

Thank you very much and sorry if my questions are too basic.
Would really appreciate your help.