Pipeline doubts regarding .psicov files and moccasin correction (partly particular case)

Miriam Martínez

Oct 15, 2025, 1:12:19 AM

to Biociphers

Hi biociphers team,

While running my splicing analysis, I ran into a question at the heterogen step regarding the .psicov files and the moccasin correction. For context, I have about 725 samples for splicing analysis. As they are case-control samples, I ran the psi-coverage step to group them into 2 groups (case and control).

Before performing the quantification step (in my case heterogen), I wanted to apply a correction with moccasin. As prefixes in the model matrix I used the names of the .sj files, but the input files were the 2 .psicov files generated in the previous step. Is this correct? Will moccasin actually be able to correct the data when they are grouped into 2 .psicov files? (For more info, I can't build a matrix with just case.psicov and control.psicov, as the samples forming each group have different confounding factors.)

If the data is actually corrected, I obtain just a single corrected.psicov file, so how should the different groups be indicated? By using --select-grp1-prefixes/--select-grp2-prefixes in the heterogen step?

If the data isn't actually corrected, should I go back, create a .psicov file for each sample, correct each sample with moccasin, and finally merge the .psicov files into 2 .psicov files grouped by status (case/control)? If so, how can I do this final merge?

Please let me know if something is unclear and thank you again for your help.

Best regards,

Miriam Martínez

San Jewell

Oct 16, 2025, 11:07:07 AM

to Biociphers

Hi Miriam,

Barry, who you've communicated with previously, is currently out of office. I can also forward this question to him, but in the meantime I will answer from my knowledge.

The psicov files are containers which can hold one or more prefixes. For a number of steps in the pipeline you can use a file in its entirety, specify multiple files, or specify prefixes inside the files, but the way those prefixes and their data are used should not change based on how many files are specified or how the prefixes are grouped inside them. (i.e. specifying one psicov file with three prefixes to majiq heterogen should give the same result as specifying three files with one prefix each)

Therefore, I believe that you will receive a single corrected psicov file. You can use it downstream with --select-grp1-prefixes and --select-grp2-prefixes. Just note that when only the -psi1 argument is specified, majiq heterogen automatically splits the prefixes of psi1 and ignores --select-grp2-prefixes. Therefore, to run as you expect, you should specify the combined file for both -psi1 and -psi2, i.e.:

$ majiq heterogen -psi1 corrected.psicov -psi2 corrected.psicov --select-grp1-prefixes prefix1 prefix2 --select-grp2-prefixes prefix3 prefix4
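With several hundred samples, the two prefix lists are easier to generate than to type by hand. The following is only a minimal sketch, not part of MAJIQ: it assumes a hypothetical sample sheet with columns named "prefix" and "status" (values "case" or "control"), and it uses only the flags from the command above. The function name, the column names, and the toy prefixes are illustrative assumptions, and any other arguments your run needs are left out.

```python
import csv
import io
import shlex


def build_heterogen_command(sample_sheet, psicov="corrected.psicov"):
    """Build a majiq heterogen argument list from a case/control sample sheet.

    sample_sheet is a file-like object with the hypothetical columns
    "prefix" and "status"; these names are assumptions, not a MAJIQ format.
    """
    groups = {"case": [], "control": []}
    for row in csv.DictReader(sample_sheet):
        groups[row["status"]].append(row["prefix"])
    if not groups["case"] or not groups["control"]:
        raise ValueError("both groups need at least one prefix")
    # The same combined file is passed to both -psi1 and -psi2,
    # so that --select-grp2-prefixes is not ignored.
    return [
        "majiq", "heterogen",
        "-psi1", psicov,
        "-psi2", psicov,
        "--select-grp1-prefixes", *groups["case"],
        "--select-grp2-prefixes", *groups["control"],
    ]


# Toy sample sheet; real prefixes must match those stored in the psicov file.
demo_sheet = io.StringIO(
    "prefix,status\n"
    "sample_a,case\n"
    "sample_b,case\n"
    "sample_c,control\n"
    "sample_d,control\n"
)
command = build_heterogen_command(demo_sheet)
print(shlex.join(command))
```

Printing the command before running it makes it easy to check that every prefix landed in the intended group.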

Let me know if it makes sense, and I'll also ping Barry for further confirmation in the meantime.

-San

Miriam Martínez

Oct 19, 2025, 6:46:18 PM

to Biociphers

Hi San,

thanks for your answer. I will try to continue the pipeline as you suggest. Nonetheless, it would be great if Barry could confirm when he is back.

Best,

Miriam

bsl...@seas.upenn.edu

Oct 20, 2025, 11:54:23 AM

to Biociphers

I agree with San.

Moreover, I would caution against workflows which run moccasin separately on different groups of samples and then run delta-psi or het on those separately-run moccasin outputs. That is likely to introduce a new computational "batch effect" which will affect delta-psi or het results.

Best Regards,

Barry

Miriam Martínez

Oct 20, 2025, 6:49:53 PM

to Biociphers

Hi San and Barry,

Thank you very much for your help! I will continue the analysis as you suggest then. Hope you have a nice day.

Best,

Miriam

San Jewell

Oct 21, 2025, 10:56:26 AM

to Biociphers

Thanks! You as well. Please don't hesitate to reach out if you have any more usage questions.

-San
