Hello San and Majiq Team,
I would like to extend this question to ask for clarification on the columns of the modulizer output files.
I have already searched previous posts about these statistical values, but I don't think these questions have been answered.
I apologize if you have to repeat things already said that I haven't understood.
I am also interested in filtering my results further after running modulizer.
I first ran modulizer with the flag --changing-between-group-dpsi 0.3,
which is shown in the results file as "dpsi_changing_threshold: 0.3".
Although the number of splicing events dropped drastically compared to the default threshold of 0.2, I still have thousands of events, all splicing types considered.
Q1. If I understood San's response in the previous post correctly, we can filter on the columns "junction_changing" or "event_changing". But is it always the rows with the value TRUE that we want to keep if we are interested in finding the most differentially spliced events between two groups?
In my results I often have event_changing=FALSE & junction_changing=TRUE, but I am mostly interested in finding the events that are differentially spliced between group1 and group2, rather than a specific junction.
Also, there are cases where event_changing=FALSE and event_non_changing=FALSE as well! Aren't these two values complementary (if one is TRUE, the other should be FALSE)? In which cases can both be FALSE?
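One way to see how often each combination of the two flags occurs is a small cross-tabulation. The following is only a sketch in Python (standard library only), not part of modulizer: the function name `crosstab_flags`, the assumption that the output is tab-separated with optional metadata lines starting with `#`, and the sample rows are all made up for illustration.

```python
import csv
from collections import Counter


def crosstab_flags(tsv_text, col_a="event_changing", col_b="event_non_changing"):
    """Count how often each (col_a, col_b) value pair occurs in a TSV table."""
    # Assumption: metadata lines, if present, start with '#'; skip them
    # together with empty lines so the first remaining line is the header.
    lines = [ln for ln in tsv_text.splitlines() if ln and not ln.startswith("#")]
    reader = csv.DictReader(lines, delimiter="\t")
    # Values are compared as plain strings, exactly as written in the file.
    return Counter((row[col_a], row[col_b]) for row in reader)


# Made-up rows for illustration only; this is not real modulizer output.
sample = (
    "# dpsi_changing_threshold: 0.3\n"
    "event_id\tevent_changing\tevent_non_changing\n"
    "e1\tTRUE\tFALSE\n"
    "e2\tFALSE\tTRUE\n"
    "e3\tFALSE\tFALSE\n"
    "e4\tFALSE\tFALSE\n"
)

counts = crosstab_flags(sample)
for combo, n in sorted(counts.items()):
    print(combo, n)
```

Running the same count on a real output file would show how many events fall into the FALSE/FALSE case.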
Q2. Moreover, I still have doubts about "grp2-grp1_probability_changing" and "grp2-grp1_median_dpsi":
i. I have results where the column "grp2-grp1_probability_changing" is > 0.95 and others where it is < 0.95. Why is that, if --probability-changing-threshold was set to 0.95 when running modulizer?
Is this flag associated with this column in the results, or am I mistaken?
Can I filter my results further using this column? If so, I don't know whether I should keep the values above or below 0.95. Does a higher value mean a higher probability of changing?
ii. About "grp2-grp1_median_dpsi": as I understand it, the higher the value, the more differently the event or junction is spliced between the two groups, and so the more likely it is to be biologically significant. Is that right?
As above, I find results with median_dpsi < 0.3 even though I set --changing-between-group-dpsi 0.3 when running modulizer. Could you explain why these results are kept?
Should I filter the results again to remove those with median_dpsi < 0.3?
Here is an overview of my results, obtained by reading the modulizer outputs in R and printing the number of rows after applying a filter on one column at a time (all splice types merged):
```
~~~> 3744 Splicing events detected for "C002I3Z-vs-all_others".
650 splicing events remain after filtering with grp2-grp1_probability_changing > 0.95.
2045 splicing events remain after filtering with grp2-grp1_probability_changing < 0.95.
532 splicing events remain after filtering with grp2-grp1_median_dpsi > 0.3.
3132 splicing events remain after filtering with grp2-grp1_median_dpsi < 0.3.
650 remain after adding filtering junction_changing=TRUE.
~~~> 2752 Splicing events detected for "C002I44-vs-all_others".
488 splicing events remain after filtering with grp2-grp1_probability_changing > 0.95.
1847 splicing events remain after filtering with grp2-grp1_probability_changing < 0.95.
337 splicing events remain after filtering with grp2-grp1_median_dpsi > 0.3.
2371 splicing events remain after filtering with grp2-grp1_median_dpsi < 0.3.
488 remain after adding filtering junction_changing=TRUE.
```
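The per-column counts above can be reproduced with a sketch along the following lines. It is written in Python (standard library only) rather than R, and the function name `count_by_threshold` and the sample rows are hypothetical. Because the "> threshold" and "< threshold" counts above do not add up to the totals, the sketch also reports rows whose value is exactly equal to the threshold and rows whose value is empty or non-numeric; whether either case explains the gap in the real files is an assumption.

```python
import math


def count_by_threshold(rows, column, threshold):
    """Return (above, below, equal, missing) row counts for one numeric column.

    `rows` is an iterable of dicts mapping column names to string values,
    for example as produced by csv.DictReader(handle, delimiter="\t").
    """
    above = below = equal = missing = 0
    for row in rows:
        raw = (row.get(column) or "").strip()
        try:
            value = float(raw)
        except ValueError:
            # Empty cells or placeholders such as "NA" cannot be compared.
            missing += 1
            continue
        if math.isnan(value):
            missing += 1
        elif value > threshold:
            above += 1
        elif value < threshold:
            below += 1
        else:
            equal += 1
    return above, below, equal, missing


# Made-up rows for illustration only; this is not real modulizer output.
col = "grp2-grp1_probability_changing"
rows = [
    {col: "0.99"},
    {col: "0.97"},
    {col: "0.40"},
    {col: ""},
]

above, below, equal, missing = count_by_threshold(rows, col, 0.95)
print(f"above={above} below={below} equal={equal} missing={missing}")
```

If `equal + missing` is non-zero for a column, the "above" and "below" counts for that column will not sum to the total number of rows.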
So, I am confused about which filters to apply across all these variables offered in the results.
Any hints and suggestions would be highly appreciated, as I am exploring unknown data and don't know what to expect from the splicing results.
Thanks,
Best regards,
Maria