a little confused with het differential stats

Alisha Jones

Mar 17, 2026, 5:55:55 PM

to Biociphers

Hi all,

I am a little confused about how to interpret the TNOM, Wilcoxon, and Ttest values produced by the het differential analysis (header below). I downloaded two replicates (along with the two controls) of shRNA knockdown data from ENCODE (all from the same cell line) to look at the involvement of a protein in the splicing of RNA.

The values that are in the Wilcoxon, TNOM, and Ttest columns are not p-values, right?

I've read that all three should be <0.05 (with |dPSI| > 0.1) for an event to be significant, but I am also reading in the documentation that 'A TNOM score = 0 means the set of E[PSI] in each group are completely separable'.

My TNOM values take only one of three values: 0.33, 0.66, or 1.

The problem is that of the nearly 1,500 splicing events in my csv file, virtually none satisfy the 'all three less than 0.05' requirement. Also, I don't quite understand how TNOM can be a fraction.

How do I determine which splicing events are significant? Any help to clear up my confusion would be appreciated. Also, was the HET differential analysis appropriate for this experiment? In the past, I used the earlier version of MAJIQ to do the same analysis.

Thanks,

Jonesy

# {
#     "command": "/home/jonesylab/Documents/Software/htslib-1.13/env/bin/voila modulize output/build/splicegraph.sql output/heterogen/control_sh_U2AF1_K562-sh_U2AF1_K562.het.voila -d output/modulized --overwrite --show-all",
#     "het_IQR_nonchanging_threshold": 0.1,
#     "het_changing_threshold": 0.2,
#     "het_nonchanging_threshold": 0.05,
#     "het_pvalue_changing_threshold": 0.05,
#     "het_pvalue_nonchanging_threshold": 0.05,
#     "voila_version": "2.4.dev105+g9e42f278"
# }

Matthew Gazzara

Mar 18, 2026, 6:53:25 PM

to Alisha Jones, Biociphers

Hi Jonesy,

For HET outputs, the Wilcoxon and t-test columns do represent p-values. You likely aren’t seeing many with p<0.05 because, with only 2 vs. 2 samples for a typical ENCODE RBP knockdown, the tests are very underpowered. With small numbers of samples, we recommend you use the regular deltapsi quantification mode to find differential splicing in replicate experiments like these. HET is more appropriate for larger, heterogeneous sample groups.
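To make "underpowered" concrete, here is a minimal sketch that enumerates an exact two-sided rank-sum test with no ties and reports the smallest p-value it can ever produce for a given pair of group sizes. The function name is hypothetical, and this is not MAJIQ's code; HET's own Wilcoxon implementation may differ (for example in how it handles ties or approximations).

```python
from itertools import combinations

def min_two_sided_ranksum_pvalue(k: int, m: int) -> float:
    """Smallest exact two-sided rank-sum p-value for group sizes k and m (no ties)."""
    n = k + m
    # Rank sum of group 1 under every equally likely assignment of ranks 1..n.
    sums = [sum(c) for c in combinations(range(1, n + 1), k)]
    center = k * (n + 1) / 2  # expected rank sum of group 1 under the null
    most_extreme = max(abs(s - center) for s in sums)
    # Count assignments at least as extreme as the most extreme one (both tails).
    hits = sum(1 for s in sums if abs(s - center) >= most_extreme)
    return hits / len(sums)

for size in (2, 3, 4, 5):
    print(f"{size} vs {size}: min p = {min_two_sided_ranksum_pvalue(size, size):.4f}")
```

Under these assumptions, 2 vs 2 can never go below 2/6, so a 0.05 cutoff is unreachable regardless of how strong the splicing change is.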

The TNOM value output by HET represents a p-value based on permutation probabilities. (This is why you see only the values 0.33, 0.66, and 1 in your output; there aren’t many ways to order 2 vs 2 samples.) You’re right that a TNOM score of 0 means the PSI values from group1 and group2 are perfectly separable. However, I believe HET currently only outputs the p-value version of this. There may be a way to output the score, but you can easily derive it numerically.

A TNOM score of 0 means perfect separation. If you have a group of size K and a group of size M (in your case I believe M=K=2), there are only two ways to get a score of 0 when you order the N=(M+K) samples from low to high PSI: either group 1 (say, of size K) is all on the left, or group 2 (of size M) is all on the left. If you consider all possible ways to order the N samples at random, there are N choose K possible orderings, so the best TNOM score of 0 will always have a p-value of 2/(N choose K). For example, with 2 vs 2 samples you can arrange the 4 PSI values in 4 choose 2 = 6 ways. Of those, zero misclassifications between group1 and group2 correspond to two arrangements (0011 or 1100), so you get 2/6 = 0.33. For one misclassification you can arrange the PSI values 4 ways, so the cumulative probability of a TNOM score <= 1 is 6/6 = 1. I’d have to check for 0.66, but I assume it’s an artifact of how ties are handled, or perhaps of a missing value for a sample.

Extending this to 3v3, the minimum observable p-value for 0 misclassifications is 2 / (6 choose 3) = 2/20 = 0.1, and for 4v4 the minimum is 2 / (8 choose 4) = 2/70 = 0.029. So again, at small sample sizes TNOM p-values aren’t very informative.
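The counting argument above can be checked by brute force. The sketch below assumes the standard definition of TNOM (the minimum number of misclassified samples over all thresholds and both orientations, with no ties) and enumerates every arrangement of group labels. The function names are hypothetical and this is not HET's implementation, so tie or missing-value handling may differ.

```python
from collections import Counter
from itertools import combinations

def tnom_score(labels):
    """Minimum misclassifications over all thresholds for 0/1 labels sorted by PSI."""
    n = len(labels)
    best = n
    for cut in range(n + 1):
        left, right = labels[:cut], labels[cut:]
        # Orientation A: call everything left of the cut group 0, right group 1.
        errors_a = sum(left) + (len(right) - sum(right))
        # Orientation B: call everything left of the cut group 1, right group 0.
        errors_b = (len(left) - sum(left)) + sum(right)
        best = min(best, errors_a, errors_b)
    return best

def tnom_pvalues(k, m):
    """Map each TNOM score to P(score <= observed) over all label arrangements."""
    n = k + m
    counts = Counter()
    for ones in combinations(range(n), m):
        labels = [1 if i in ones else 0 for i in range(n)]
        counts[tnom_score(labels)] += 1
    total = sum(counts.values())
    pvals, running = {}, 0
    for score in sorted(counts):
        running += counts[score]
        pvals[score] = running / total
    return pvals

for size in (2, 3, 4):
    print(f"{size} vs {size}:", tnom_pvalues(size, size))
```

For 2 vs 2 this yields only two attainable p-values, 2/6 for a score of 0 and 6/6 for a score of 1, consistent with the arithmetic above.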

Note that the above p-values are NOT corrected for multiple hypothesis testing, and are not meant to be. They just serve as another type of score to assess group separation. In practice, what users apply is a composite test: a difference between group medians AND a p-value threshold. A proper null would therefore need to be assessed against both criteria, which is more costly and would in turn depend on the underlying assumptions, so we have not implemented that.

Practical recommendation: You can use a TNOM score of 0 misclassifications (a p-value of 0.33 in your case) as a better heuristic and combine it with a minimum dPSI to call changing events, or use the deltapsi quantification mode instead, which uses a Bayesian framework to estimate the posterior distribution of dPSI. Under that quantification mode the question moves from "is the distribution of PSI values in group1 different from the PSI values in group2?" to "given the reads I observed from group1 and group2, what is the probability that the dPSI between the groups is meaningfully large?". For smaller sample sizes and well-behaved replicates, this second approach is typically more useful. You can then adjust the thresholds on dPSI reporting. There are two thresholds here: the minimal dPSI value you want to consider, and the posterior probability under the Bayesian model that the change is at least that large. By default the minimal change is 0.2 and the confidence is very high (0.95), as the defaults are meant to be conservative and validate well. Sometimes users drop the dPSI threshold to 0.15 or 0.1, but I would not go below that, and sometimes they drop the confidence a bit, to say 0.7 or something similar.
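As a rough illustration of how those two thresholds interact, the sketch below takes a toy discretized posterior over dPSI and asks how much probability mass lies at or beyond the minimal change. The bin centers, probabilities, and function names are all made up for the example; this is not MAJIQ's internal code or output format.

```python
def prob_change_at_least(bin_centers, probs, min_dpsi):
    """Posterior mass with |dPSI| >= min_dpsi, for a discretized posterior."""
    total = sum(probs)
    mass = sum(p for c, p in zip(bin_centers, probs) if abs(c) >= min_dpsi)
    return mass / total

def is_changing(bin_centers, probs, min_dpsi=0.2, confidence=0.95):
    """Apply both thresholds: minimal dPSI and required posterior probability."""
    return prob_change_at_least(bin_centers, probs, min_dpsi) >= confidence

# Toy posterior (hypothetical numbers): most of the mass sits around dPSI = 0.35.
centers = [0.15, 0.25, 0.35, 0.45]
probs = [0.04, 0.30, 0.40, 0.26]

print(prob_change_at_least(centers, probs, 0.2))             # mass at |dPSI| >= 0.2
print(is_changing(centers, probs))                           # default thresholds
print(is_changing(centers, probs, min_dpsi=0.3))             # stricter dPSI cutoff
print(is_changing(centers, probs, min_dpsi=0.3, confidence=0.6))
```

Raising the minimal dPSI lowers the posterior mass that counts as a change, so the same event can pass at one pair of thresholds and fail at another; that trade-off is what you are tuning when you relax either default.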

Hopefully this clears things up. Let me know if you have any questions.

-Matt Gazzara

--
You received this message because you are subscribed to the Google Groups "Biociphers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to majiq_voila...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/majiq_voila/603153ff-586e-4b51-822a-b607cf27c383n%40googlegroups.com.
