About grp2-grp1_median_dpsi and grp2-grp1_probability_changing in TSV file (after voila modulize)


Shengwei Xiong

Apr 21, 2026, 3:25:43 PM
to Biociphers

Hello,

I have a question about voila modulize.

After running:

voila modulize build/sg.zarr build/AQR_C.sgc build/AQR_sh.sgc dpsi/AQR_K562_C-sh.dpsicov -d modulized --show-all --keep-constitutive --overwrite --debug

I noticed that the output TSV file contains headers such as:

"dpsi_changing_threshold": 0.2
"dpsi_probability_changing_threshold": 0.95

Does this mean that all splicing events in the TSV are already considered significant?

However, when I examined the grp2-grp1_probability_changing and grp2-grp1_median_dpsi columns in the TSV, very few values were above 0.95. How should I interpret grp2-grp1_probability_changing and grp2-grp1_median_dpsi in this context?

Best,
Shengwei

San Jewell

Apr 21, 2026, 5:05:20 PM
to Biociphers
Hi Shengwei, 

In classic dpsi mode, dpsi_changing_threshold is the delta psi threshold needed to consider a junction changing, and dpsi_probability_changing_threshold is the confidence threshold needed to consider a junction changing. Both must pass for a junction to be considered changing. These flags influence the "junction_changing" column and also the "event_changing" column: as long as one junction in the event passes them, it will be true. The headers are provided at the top of the output files for ease of organization and recall.
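
A minimal Python sketch of the rule described above, for illustration only; it is not voila's implementation. The function names, the use of the absolute delta psi, and the inclusive `>=` comparisons are assumptions.

```python
def junction_is_changing(median_dpsi, probability_changing,
                         dpsi_threshold=0.2, probability_threshold=0.95):
    """A junction is changing only if BOTH thresholds pass.

    Assumptions: the magnitude of dPSI is compared, and comparisons are inclusive.
    """
    return (abs(median_dpsi) >= dpsi_threshold
            and probability_changing >= probability_threshold)


def event_is_changing(junctions, **thresholds):
    """An event is changing if at least one of its junctions is changing.

    junctions: iterable of (median_dpsi, probability_changing) pairs.
    """
    return any(junction_is_changing(dpsi, prob, **thresholds)
               for dpsi, prob in junctions)
```

Under this reading, an event can be marked changing while some of its junction rows have a probability below 0.95 or a delta psi below the threshold.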

The rules for dropping events when not using --show-all are different. As shown in the console log, after the modules are made, a module is dropped if these changing thresholds are not passed by at least one junction in the module. This rule drops at the module level (single entry, single exit) rather than the event level. However, it's easy to filter on either the junction_changing or event_changing column if you would like to limit the result set further.
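
One way to do that filtering outside a spreadsheet is sketched below in Python, using only the standard library. It assumes the metadata header lines at the top of the TSV start with `#` and that the boolean columns are written as text such as `True`/`False`; check both against your own files. The helper names are made up for this example.

```python
import csv


def read_modulizer_tsv(lines):
    """Parse TSV lines into dicts, skipping blank lines and '#' metadata lines.

    Assumption: the threshold headers at the top of the file are '#'-prefixed.
    """
    data_lines = (ln for ln in lines if ln.strip() and not ln.startswith("#"))
    return list(csv.DictReader(data_lines, delimiter="\t"))


def is_true(value):
    """Treat 'True', 'TRUE', or 'true' as True; anything else as False."""
    return str(value).strip().lower() == "true"


def keep_changing(rows, column="event_changing"):
    """Keep rows whose boolean column (event_changing or junction_changing) is true."""
    return [row for row in rows if is_true(row.get(column))]
```

With an open file handle `fh` for one of the output TSVs, `keep_changing(read_modulizer_tsv(fh))` keeps the event-level hits, and `keep_changing(rows, "junction_changing")` keeps only the junction rows that pass on their own.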

Let me know if this helps!
-San

M K

May 27, 2026, 4:59:24 AM
to Biociphers

Hello San and Majiq Team,

I would like to extend this question for more clarification on the columns of the output files from modulizer.
I've already searched previous posts on these statistical values, but I think these questions haven't been answered.
I apologize if you have to repeat things already said that I haven't understood.

I am also interested in filtering my results further after modulizer.
I first ran modulizer with the flag --changing-between-group-dpsi 0.3,
which is shown in the results file as "dpsi_changing_threshold: 0.3".
Although the number of splicing events dropped drastically compared to the default threshold of 0.2, I still have thousands of events, all splicing types considered.

Q1. If I understood San's response in the previous post correctly, we can filter on the columns "junction_changing" or "event_changing", but is it always the value TRUE that we want to keep if we are interested in finding the most differentially spliced events between two groups?
In my results I often have event_changing=FALSE & junction_changing=TRUE, but I'm mostly interested in finding the events that are differentially spliced between group1 and group2, not in a specific junction.

Also, there are cases where event_changing=FALSE and Event_non_changing=FALSE as well! Aren't these two values complementary (if one is TRUE, the other should be FALSE)? In which cases can both be FALSE?

Q2: Moreover, I still have doubts about "grp2-grp1_probability_changing" and "grp2-grp1_median_dpsi" :  

i. I have results with "grp2-grp1_probability_changing" > 0.95 and also < 0.95. Why is that, if --probability-changing-threshold was set at 0.95 when running modulizer?
Is this parameter associated with this column in the results, or am I mistaken?
Can I filter my results further using this column? If so, I don't know whether I should keep values above or below 0.95. Does the highest value mean the highest probability of changing?

ii. About "grp2-grp1_median_dpsi": the higher the value, the more differently the event or junction is spliced between the two groups, and so possibly biologically significant, as I understand it. Is that right?
Similarly to the above, I find results with median_dpsi < 0.3 even though I set --changing-between-group-dpsi 0.3 when running modulizer. Could you explain why these results are kept?
Should I filter out the results with median_dpsi < 0.3?

Here is an overview of my results, obtained by reading the modulizer outputs in R and printing the number of rows after applying a filter on one column at a time (all splice types merged):

```
~~~> 3744 Splicing events detected for "C002I3Z-vs-all_others".
650 splicing events remain after filtering with grp2-grp1_probability_changing > 0.95.
2045 splicing events remain after filtering with grp2-grp1_probability_changing < 0.95.
532 splicing events remain after filtering with grp2-grp1_median_dpsi > 0.3.
3132 splicing events remain after filtering with grp2-grp1_median_dpsi < 0.3.
650 remain after adding filtering junction_changing=TRUE.

~~~> 2752 Splicing events detected for "C002I44-vs-all_others".
488 splicing events remain after filtering with grp2-grp1_probability_changing > 0.95.
1847 splicing events remain after filtering with grp2-grp1_probability_changing < 0.95.
337 splicing events remain after filtering with grp2-grp1_median_dpsi > 0.3.
2371 splicing events remain after filtering with grp2-grp1_median_dpsi < 0.3.
488 remain after adding filtering junction_changing=TRUE.
``` 

So, I am confused on which filter to keep for all these variables offered in the results. 
Any hints and suggestions would be highly appreciated, as I am exploring unknown data, so I don't know what to expect on the splicing results.

Thanks,
Best regards,
Maria

San Jewell

May 27, 2026, 3:13:33 PM
to Biociphers
Hi Maria, 

Many good clarification questions here! I will say I know a little bit less on some parts of this question, and I'm going to ask some other lab members to chip in, but in general I'll try to go over what I can answer below. I think I will also need some clarification on exactly what is being asked in some cases. 

> Q1. If I understood San's response in the previous post correctly, we can filter on the columns "junction_changing" or "event_changing", but is it always the value TRUE that we want to keep if we are interested in finding the most differentially spliced events between two groups?
> In my results I often have event_changing=FALSE & junction_changing=TRUE, but I'm mostly interested in finding the events that are differentially spliced between group1 and group2, not in a specific junction.

The default behavior of the modulizer, when supplied with HET or DPSI inputs, is to filter out everything that is not considered changing. To get a better idea of all events, you can pass the switch --show-all; in that case you will see some cases of FALSE instead of TRUE. As you say, it is possible for a single junction, possibly with low read coverage, to register a large DPSI across groups while the majority of the event stays non-changing. We have noticed this as well, so there are two switches available:

--changing-between-group-dpsi  <- The threshold that at least one junction in the event must pass for the event to be marked changing
--changing-between-group-dpsi-secondary <- The threshold that ALL junctions in the event must pass for the event to be marked changing

In practice, we have usually set the primary flag to a higher threshold than the secondary, but if you want to be even more stringent you could set them to the same value, which would require every junction to look changing for the event to be changing. The secondary flag has a default of 10% psi, but in some cases we have needed to increase it.
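
The interaction of the two switches can be sketched in Python as follows. This is only an illustration of the description above, not voila's code: the function name is made up, the comparison on absolute delta psi with `>=` is an assumption, and the probability check is left out for brevity.

```python
def event_changing_two_tier(junction_dpsis, primary=0.2, secondary=0.1):
    """Two-tier dPSI check for one event.

    At least one junction must reach `primary`, and every junction must
    reach `secondary`. Assumption: magnitudes are compared, inclusively.
    """
    magnitudes = [abs(dpsi) for dpsi in junction_dpsis]
    if not magnitudes:
        return False  # no junctions, nothing to call changing
    return max(magnitudes) >= primary and min(magnitudes) >= secondary
```

Setting `primary` and `secondary` to the same value gives the most stringent case, where every junction must pass the full threshold.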

> Also, there are cases where event_changing=FALSE and Event_non_changing=FALSE as well! Aren't these two values complementary (if one is TRUE, the other should be FALSE)? In which cases can both be FALSE?

No, these two are not complementary. There are specific criteria for non_changing which are, by default, even more stringent than those for changing. This is because ALL junctions in the event must look non-changing for us to be confident that it is non_changing. So if even one junction is changing a little, the event might be considered neither changing nor non_changing. In our usage this has often left us with a set of indeterminate events which we must consider differently in our experiments.
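
To see how many events fall into each of the three groups, a small Python tally like the one below may help. It assumes rows have already been read into dicts (one row per junction), that the columns are named `event_id`, `event_changing`, and `event_non_changing`, and that booleans are stored as text; all of these should be checked against the actual TSV.

```python
from collections import Counter


def _is_true(value):
    """Treat 'True', 'TRUE', or 'true' as True; anything else as False."""
    return str(value).strip().lower() == "true"


def classify_event(row):
    """Label a row as changing, non_changing, or indeterminate (both flags false)."""
    if _is_true(row.get("event_changing")):
        return "changing"
    if _is_true(row.get("event_non_changing")):
        return "non_changing"
    return "indeterminate"


def count_event_classes(rows):
    """Count each event once, using the first row seen for each event_id."""
    labels = {}
    for row in rows:
        labels.setdefault(row["event_id"], classify_event(row))
    return Counter(labels.values())
```

The "indeterminate" count is the set with both columns FALSE.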

---

> Q2: Moreover, I still have doubts about "grp2-grp1_probability_changing" and "grp2-grp1_median_dpsi":
>
> i. I have results with "grp2-grp1_probability_changing" > 0.95 and also < 0.95. Why is that, if --probability-changing-threshold was set at 0.95 when running modulizer?
> Is this parameter associated with this column in the results, or am I mistaken?

The --probability-changing-threshold will mark a specific junction as changing if passed. However, the "only keep changing" logic in modulizer applies only to event-level changing, not junction-level changing. So there may be some events where one junction passed this threshold but others did not, yet it was still enough (along with the other types of filters we discussed earlier) to mark the event as a whole "changing". So for these events with event_changing=True and junction_changing=False, you may see probability_changing value > 0.95. Let me know if this answers what you meant by the question or if I did not understand properly.

> Can I filter my results further using this column? If so, I don't know whether I should keep values above or below 0.95. Does the highest value mean the highest probability of changing?

You can filter further using this column, but make sure to preserve the event_id of the interesting events you find so that you can go back and view all junctions of each event, as you might otherwise be missing rows from an interesting event that didn't make this threshold. Actually, this sounds very much like the secondary deltapsi filter from above, but applied to the statistical power instead of the dpsi. In our previous research, it wasn't determined that we needed a similar secondary switch here, but if you think this would be interesting to have I can include it in a future lab discussion. For now, you can filter manually using a spreadsheet program, as you imply.
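
A rough Python sketch of that "filter, but keep the whole event" idea is below. It assumes rows are dicts with an `event_id` column and a probability column named as in this thread (`grp2-grp1` stands in for your actual group names); the function name and the inclusive `>=` comparison are made up for illustration.

```python
def rows_for_confident_events(rows,
                              probability_column="grp2-grp1_probability_changing",
                              threshold=0.95):
    """Return ALL rows of every event that has at least one row at or above threshold."""
    keep_ids = set()
    for row in rows:
        try:
            probability = float(row[probability_column])
        except (KeyError, TypeError, ValueError):
            continue  # missing or non-numeric value: this row cannot qualify
        if probability >= threshold:
            keep_ids.add(row["event_id"])
    # Second pass: keep every junction row of the selected events.
    return [row for row in rows if row.get("event_id") in keep_ids]
```

This way the rows below the threshold stay attached to their event instead of silently disappearing.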

> ii. About "grp2-grp1_median_dpsi": the higher the value, the more differently the event or junction is spliced between the two groups, and so possibly biologically significant, as I understand it. Is that right?
> Similarly to the above, I find results with median_dpsi < 0.3 even though I set --changing-between-group-dpsi 0.3 when running modulizer. Could you explain why these results are kept?
> Should I filter out the results with median_dpsi < 0.3?

Calling something biologically significant is a little less in my comfort zone, but indeed, that is what we aim to find! And higher values are of course more interesting. For this question, I think some of my other replies about secondary thresholds and the difference between event and junction level changing should answer it, but if you still have questions please feel free to reply again and I will clarify more. 

Thanks!
-San