How should I run voila modulize with multiple samples?

jin jiacheng

Mar 11, 2026, 11:37:09 AM
to Biociphers
Hi San,
I hope this message finds you well. I have run into a problem when running voila modulize with multiple input psicov and sgc files. I want to check the PSI of each of my samples in every LSV. My command was:
$ voila modulize -d voila /rna_seq/results/majiq/build/sg.zarr \
    /rna_seq/results/majiq/psi/sample_1.psicov \
    /rna_seq/results/majiq/psi/sample_2.psicov \
    /rna_seq/results/majiq/sgc/sample_1.sgc \
    /rna_seq/results/majiq/sgc/sample_2.sgc \
    --show-per-sample-psi --show-read-counts -j 20
(I show only 2 samples here, but there are actually more, and each sample has only one replicate.) I am using the latest version, 3.0.18.
However, the two columns sample_1.median_psi and sample_2.median_psi in the output TSVs are identical in every row. Below are the last six columns of junctions.tsv. Is there something wrong with my command, and if so, how should I revise it? I tried adding --psicov-grouping-file, but it did not help. Is it because I have only one replicate, or is there something wrong with my input files? Could you please help me with this problem and explain the cause? Please let me know if any additional information is needed.
sample_1.median_reads  sample_1.median_psi  sample_1.var_psi  sample_2.median_reads  sample_2.median_psi  sample_2.var_psi
347                    8.971e-01            5.001e-04         291                    8.971e-01            8.722e-04
21                     6.646e-02            4.465e-04         25                     6.646e-02            7.517e-04
246                    7.836e-01            1.539e-03         126                    7.836e-01            1.982e-03
50                     1.890e-01            1.461e-03         27                     1.890e-01            1.904e-03
Thank you,
Jiacheng

San Jewell

Mar 11, 2026, 2:52:54 PM
to Biociphers
Hi Jiacheng, 

You are correct. After a recent patch to address multiple groups inside single psi-coverage files, the wrong mode was specified, which caused the modulizer to take the mean over the file groups instead of showing each group individually. I have just pushed a patch which should address this. Please let me know if the problem persists after updating. 
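
One way to check whether the problem persists is to count how many rows still carry identical per-sample PSI values. The following is only a minimal sketch, not part of MAJIQ or voila: the helper name `identical_fraction` is hypothetical, the column names are the ones quoted in the report above, and skipping lines that start with `#` is an assumption in case the TSV carries a commented header.

```python
import csv

def identical_fraction(tsv_lines, col_a, col_b):
    """Return the fraction of data rows whose two columns hold the same text."""
    # Assumption: blank lines and '#' comment lines are not data rows.
    rows = [ln for ln in tsv_lines if ln.strip() and not ln.startswith("#")]
    reader = csv.DictReader(rows, delimiter="\t")
    total = same = 0
    for row in reader:
        total += 1
        if row[col_a].strip() == row[col_b].strip():
            same += 1
    return same / total if total else 0.0

# The four rows quoted in the original report (output before the patch).
header = ["sample_1.median_reads", "sample_1.median_psi", "sample_1.var_psi",
          "sample_2.median_reads", "sample_2.median_psi", "sample_2.var_psi"]
reported = [
    ["347", "8.971e-01", "5.001e-04", "291", "8.971e-01", "8.722e-04"],
    ["21", "6.646e-02", "4.465e-04", "25", "6.646e-02", "7.517e-04"],
    ["246", "7.836e-01", "1.539e-03", "126", "7.836e-01", "1.982e-03"],
    ["50", "1.890e-01", "1.461e-03", "27", "1.890e-01", "1.904e-03"],
]
demo = ["\t".join(header)] + ["\t".join(r) for r in reported]

# Prints 1.0: every reported row has the same median_psi for both samples.
print(identical_fraction(demo, "sample_1.median_psi", "sample_2.median_psi"))
```

On a real output file, the same function can be given an open file handle, for example `open("junctions.tsv", newline="")`. A value close to 1.0 across many rows would suggest the averaging is still happening, while distinct samples should normally differ in at least some rows.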

Thank you for the report!
-San

jin jiacheng

Mar 11, 2026, 10:36:16 PM
to Biociphers
Hi San,

Sorry for the late reply. The problem is solved after the update! However, voila modulize has become very slow: it may take more than ten hours to finish with only two samples and 20 processes. Is there any way to speed it up? Thanks again for your answer!

Jiacheng

San Jewell

3:05 PM
to Biociphers
Jiacheng, 

No worries. 

For your new question, I compared averaged run times between the new and old versions on a human transcriptome with three samples. In my attempt to reproduce your issue, I actually saw a minor improvement in run time in the newer version. Is it possible that some other factor has changed for you? Ten hours sounds excessive. What does your computing environment look like? If we cannot determine the cause from discussion, would you be willing to share some sample data demonstrating the issue so that I can try to reproduce it?
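
For describing the computing environment, a short script like the one below could gather a few basics to paste into a reply. This is only a sketch using the Python standard library. The function name `environment_report` and the choice of fields are assumptions, not something voila or MAJIQ requires.

```python
import os
import platform
import shutil
import sys

def environment_report(path="."):
    """Collect basic facts about the machine and the output location."""
    usage = shutil.disk_usage(path)  # disk holding `path`, e.g. the -d directory
    return {
        "platform": platform.platform(),
        "python": sys.version.split()[0],
        # Logical CPUs visible to Python; may be None if undetermined.
        "logical_cpus": os.cpu_count(),
        "free_disk_gib": round(usage.free / 2**30, 1),
    }

for key, value in environment_report().items():
    print(f"{key}: {value}")
```

If `logical_cpus` is well below the 20 passed to `-j`, the processes would be competing for cores, which is one possible factor worth ruling out.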

Thanks, 
-San