Pipeline doubts regarding .psicov files and moccasin correction (partly particular case)


Miriam Martínez

Oct 15, 2025, 1:12:19 AM
to Biociphers
Hi biociphers team,

While running my splicing analysis, I ran into a question at the heterogen step about the .psicov files and the moccasin correction. For context, I have about 725 case-control samples for splicing analysis, so I used the psi-coverage step to group them into 2 groups (case and control).

Before the quantification step (heterogen, in my case), I wanted to apply a correction with moccasin. In the model matrix I used the names of the .sj files as prefixes, but the input files were the 2 .psicov files generated in the previous step. Is this correct? Will moccasin actually be able to correct the data when the samples are grouped into 2 .psicov files? (For more info, I can't build a matrix with just case.psicov and control.psicov, because the samples in each group have different confounding factors.)

If the data is actually corrected, I obtain only a single corrected.psicov file, so how should the different groups be indicated? By using --select-grp1-prefixes/--select-grp2-prefixes at the heterogen step?

If the data isn't actually corrected, should I go back, create a .psicov file for each sample, correct each sample with moccasin, and finally merge the corrected files into 2 .psicov files grouped by status (case/control)? If so, how can I do this final merge?

Please let me know if something is unclear and thank you again for your help. 

Best regards,

Miriam Martínez

San Jewell

Oct 16, 2025, 11:07:07 AM
to Biociphers
Hi Miriam,

Barry, who you've communicated with previously, is currently out of office. I can also forward this question to him, but in the meantime I will answer from my knowledge.

The psicov files are containers that can hold one or more prefixes. For a number of steps in the pipeline you can pass a whole file, pass multiple files, or select specific prefixes inside the files. How those prefixes and their data are used should not change with the number of files specified or with how the prefixes are grouped inside them (i.e. specifying one psicov file with three prefixes to majiq heterogen should give the same result as specifying three files with one prefix each).

Therefore, I believe that you will receive a single corrected psicov file. You can use it downstream with --select-grp1-prefixes and --select-grp2-prefixes. Just note that when only the -psi1 argument is specified, majiq heterogen automatically splits the prefixes of psi1 and ignores --select-grp2-prefixes. Therefore, to run as you expect, you should specify the combined file for both -psi1 and -psi2, i.e.

$ majiq heterogen -psi1 corrected.psicov -psi2 corrected.psicov --select-grp1-prefixes prefix1 prefix2 --select-grp2-prefixes prefix3 prefix4
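With around 725 prefixes, typing the two prefix lists by hand is error-prone, so here is a minimal sketch of how the command could be assembled from a prefix-to-status mapping. It only reuses the flags mentioned in this thread (-psi1, -psi2, --select-grp1-prefixes, --select-grp2-prefixes). The function name, the sample names and the mapping are hypothetical, and the flag spellings should be checked against the help output of your installed majiq version before running anything.

```python
import shlex


def build_heterogen_command(psicov_path, status_by_prefix):
    """Build a `majiq heterogen` argument list from a prefix -> status mapping.

    Hypothetical helper: it only arranges the flags quoted in this thread
    and does not call majiq itself.
    """
    # Sort so the generated command is reproducible between runs.
    cases = sorted(p for p, s in status_by_prefix.items() if s == "case")
    controls = sorted(p for p, s in status_by_prefix.items() if s == "control")
    if not cases or not controls:
        raise ValueError("need at least one case and one control prefix")
    return [
        "majiq", "heterogen",
        # The same combined file is passed to both -psi1 and -psi2, so that
        # the group 2 prefix selection is not ignored.
        "-psi1", psicov_path,
        "-psi2", psicov_path,
        "--select-grp1-prefixes", *cases,
        "--select-grp2-prefixes", *controls,
    ]


# Hypothetical prefixes; in practice these would come from the sample sheet
# (the same names used as prefixes in the moccasin model matrix).
status_by_prefix = {
    "sampleA": "case",
    "sampleB": "control",
    "sampleC": "case",
    "sampleD": "control",
}

command = build_heterogen_command("corrected.psicov", status_by_prefix)
print(shlex.join(command))
```

The printed line can be pasted into a shell, or the list can be handed to subprocess.run directly, which avoids quoting problems if any prefix contains unusual characters.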

Let me know if it makes sense, and I'll also ping Barry for further confirmation in the meantime.
-San