Sum of PSI values in an LSV does not equal 1


Conall Moore

Apr 23, 2026, 6:53:42 PM
to Biociphers
Hi Biociphers,

Thank you for the amazing tool!

I have previously presented data from MAJIQ PSI as stacked bar charts for an entire LSV, with PSI values on the y-axis, conditions on the x-axis, and each stacked segment representing a splice junction. That way, the sum of PSI values for all junctions in the LSV is approximately 1.0 for each condition. I've attached a diagram for clarity.

Using the same MAJIQ build, I recently ran this data through HET and for some conditions, the sum of all junction PSIs was quite a bit lower than expected (as low as 0.86 in some cases). Do you have any idea why this could be the case? I could normalise the PSI values for each condition, but I'm just curious as to what could be the cause here.

I've outlined my commands below.

Any help would be greatly appreciated!

Conall
_______________

Trimming:
~/TrimGalore-0.6.10/trim_galore --paired --phred33 --fastqc --gzip -j 8 fastq.gz

Alignment:

STAR --runMode alignReads --genomeDir genome_directory/ --sjdbOverhang 99 --readFilesCommand zcat --outSAMtype BAM SortedByCoordinate --readFilesIn $R1 $R2 --outFileNamePrefix path/${SAMPLE}_ --runThreadN 8

Build:

config file:

[info]
readlen=100
bamdirs=sorted_by_coordinate_bams/
genome=hg38
genome_path=genome.fa

[experiments]
group1=g1_Aligned.sortedByCoord.out,g2_Aligned.sortedByCoord.out,g3_Aligned.sortedByCoord.out
group2=h1_Aligned.sortedByCoord.out,h2_Aligned.sortedByCoord.out,h3_Aligned.sortedByCoord.out

/home/user/majiq/bin/majiq --license majiq_license build genome.annotation.gff3 -o output -c config.ini -j 4 --minreads 10


PSI:

majiq --license majiq_license psi -o output -n group1 g1_Aligned.sortedByCoord.out.majiq g2_Aligned.sortedByCoord.out.majiq g3_Aligned.sortedByCoord.out.majiq 


majiq --license majiq_license psi -o output -n group2 h1_Aligned.sortedByCoord.out.majiq h2_Aligned.sortedByCoord.out.majiq h3_Aligned.sortedByCoord.out.majiq 

Heterogen:

majiq --license majiq_license heterogen -j 4 -o output -grp1 g1_Aligned.sortedByCoord.out.majiq g2_Aligned.sortedByCoord.out.majiq g3_Aligned.sortedByCoord.out.majiq  -grp2 h1_Aligned.sortedByCoord.out.majiq h2_Aligned.sortedByCoord.out.majiq h3_Aligned.sortedByCoord.out.majiq  -n group1 group2 --stats TNOM TTEST WILCOXON

Voila:

voila --license majiq_license tsv splicegraph.sql group1-group2.het.voila -f group1_group2_het_voila.tsv




example.pdf

San Jewell

Apr 27, 2026, 12:48:29 PM
to Biociphers
Hi Conall, 

Thank you for reaching out! At first glance, I wouldn't expect this behavior, but I'd like to clarify a few points before I try to reproduce it myself.

1) Which versions of MAJIQ are you using? You mention an earlier analysis and a new one. I would initially have assumed these were MAJIQ v2 and MAJIQ v3, but from your pipeline the new run looks like v2, so are these earlier and later versions of v2?
2) I assume the attached diagram is the old run from the prior version?
3) If I'm unable to reproduce this issue later using my own sample data, would you be willing to share your inputs?

Thanks,
-San

Conall Moore

Apr 28, 2026, 7:49:04 PM
to San Jewell, Biociphers
Hi San,

Thanks for your reply and help with this.
1) All the majiq analysis I have done (previously and now) is on v2.

$ majiq -v

2.5.10.dev1+ge7fb4fcc


2) Yes, the attached diagram is the old analysis, which I generated from majiq PSI outputs. As expected, the PSI values in each LSV of this run sum to approx 1.0, allowing me to generate stacked bar-charts as I showed previously for visualisation. When I recently re-ran the same .majiq files as a new analysis in heterogen, however, the PSI values in some LSVs do not sum to 1.0.


3) Yes, I would be happy to share my inputs with you. They are publicly available on SRA, so this is no problem.


Thanks again! I hope this clears up the problem I'm facing. If not, please let me know.


All the best,


Conall





bsl...@seas.upenn.edu

May 4, 2026, 2:42:19 PM
to Biociphers
Hi Conall,

MAJIQ does not constrain the sum of E[PSI] to be 1 over junctions/introns in an LSV. This behavior is intentional and goes back to PSI quantification in MAJIQ V1 and continues in V2 and V3 with more recently-added quantifiers such as HET.

MAJIQ calculates a beta posterior for PSI per junction/intron. The idea is to estimate the variation in read coverage across each junction/intron depending on the reads starting at each possible junction-adjacent position (see Vaquero-Garcia et al. 2016, 2023 methods for details). The steps in this calculation are not constrained to yield a sum of E[PSI] equal to 1 over the LSV's junctions/introns. However, the sum of E[PSI] is usually very close to 1, because the raw read-count fractions over an LSV's junctions/introns in the input RNA-seq experiments sum to 1 by definition.
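
To make the point about unconstrained sums concrete, here is a minimal toy sketch in Python. It is not MAJIQ's actual model: it simply gives each junction its own Beta posterior from raw counts, with illustrative prior values (`alpha`, `beta`) and a hypothetical function name chosen for this example. It shows that independent per-junction posterior means need not sum to exactly 1, and that the sum approaches 1 as coverage grows. The direction and size of the deviation in real MAJIQ output depend on its actual quantification steps.

```python
from typing import List


def independent_posterior_means(
    counts: List[float], alpha: float = 0.5, beta: float = 0.5
) -> List[float]:
    """Posterior mean of PSI per junction, each from its own Beta posterior.

    Toy assumption: junction j gets Beta(alpha + c_j, beta + n - c_j),
    where n is the total read count over the LSV. The prior values are
    illustrative and are not taken from MAJIQ.
    """
    total = sum(counts)
    return [(c + alpha) / (total + alpha + beta) for c in counts]


# Raw count fractions sum to 1 by definition.
low_coverage = [10, 6, 4]
high_coverage = [1000, 600, 400]

# The independent posterior means are not forced to sum to 1;
# the gap shrinks as coverage increases.
print(sum(independent_posterior_means(low_coverage)))
print(sum(independent_posterior_means(high_coverage)))
```

In this toy model the sum equals (n + J*alpha) / (n + alpha + beta) for J junctions, so it hits exactly 1 only when the prior happens to be chosen that way.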

Happy to discuss further, please let us know.
Best Regards,
Barry

San Jewell

May 13, 2026, 10:07:21 AM
to Biociphers
Hi Conall, 

In addition to Barry's clarification: as your original post suggests, when making your own plots or visualizations of LSV data, it is appropriate to re-normalize the PSI values so that they sum to 1.0. This is what voila itself does as well.
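
A minimal sketch of that re-normalization in Python, assuming the per-junction PSI values for one LSV and one condition have already been read into a list. The function name and the example values are hypothetical and are not taken from voila's code or from the data in this thread.

```python
from typing import List


def renormalize_psi(psi: List[float]) -> List[float]:
    """Scale the PSI values of one LSV so that they sum to 1.0."""
    total = sum(psi)
    if total <= 0:
        raise ValueError("PSI values must have a positive sum")
    return [p / total for p in psi]


# Hypothetical per-junction PSI values for one condition (sum is 0.86).
raw = [0.50, 0.26, 0.10]
scaled = renormalize_psi(raw)
print(scaled, sum(scaled))
```

Dividing by the sum keeps the relative proportions between junctions unchanged, so the stacked bars fill the full 0 to 1 range without reordering or distorting the segments relative to each other.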

-San