Dear Paul,
Thank you so much for such a quick reply! It is a bit difficult for me to formulate my ideas clearly, but I hope I will succeed.
I used the "voila tsv" command so that I could further process my set of differentially spliced genes. I ran the following command:
voila tsv $SPLICEGRAPH $VIOLAPSI -l violaTSV_N72HM.log -j 4 -f violaTSV_N72HM.tsv --probability-threshold 0.95
I assumed that --probability-threshold 0.95 filters on both the "P(|dPSI|>=0.20) per LSV junction" and "P(|dPSI|<=0.05) per LSV junction" values from the "majiq deltapsi" output (*deltapsi.tsv), so that only those observations (here, genes and their corresponding LSV IDs) that meet the given probability threshold are written to the "voila tsv" output. However, when I inspect both the "majiq deltapsi" output and the "voila tsv" output, I can see that each observation contains several LSV types, and only some of these LSV types have probability values that meet the given threshold. To illustrate the problem more clearly, I am posting an example of one gene/LSV ID from the "voila tsv" output below:
gene_name gene_id lsv_id mean_dpsi_per_lsv_junction probability_changing probability_non_changing
HDAC9 gene:ENSG00000048052 gene:ENSG00000048052:t:18585281-18585522 -0.39720475460598303;-3.646654966312975e-08;-9.086655971137845e-05;0.40617946790480436;-3.646654966312975e-08;-3.646654966312975e-08 0.9992084455711601;6.297504308260824e-21;6.319190549793287e-21;0.9994105746946654;6.297504308260824e-21;6.297504308260824e-21 2.893312256446734e-10;0.9999988196695868;0.9969519749264952;1.1430489924834595e-10;0.9999988196695868;0.9999988196695868
Each of the last three fields holds 6 semicolon-separated values, one for each LSV type that occurs in this particular event. However, only 2 of these 6 values meet the given threshold for probability_changing (which corresponds to "P(|dPSI|>=0.20) per LSV junction" in the deltapsi output file); the other 4 instead have a probability_non_changing (which corresponds to "P(|dPSI|<=0.05) per LSV junction" in the deltapsi output file) above the threshold.
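To make the count concrete, here is a small Python sketch of how I checked the row above. It is not part of voila: the function name and the ">=" comparison are my own assumptions, and I do not know whether voila compares the values this way. It only splits the two semicolon-separated probability fields and lists the positions whose value reaches 0.95.

```python
THRESHOLD = 0.95  # same value I passed to --probability-threshold

# Fields copied from the HDAC9 row of the "voila tsv" output above.
probability_changing = (
    "0.9992084455711601;6.297504308260824e-21;6.319190549793287e-21;"
    "0.9994105746946654;6.297504308260824e-21;6.297504308260824e-21"
)
probability_non_changing = (
    "2.893312256446734e-10;0.9999988196695868;0.9969519749264952;"
    "1.1430489924834595e-10;0.9999988196695868;0.9999988196695868"
)


def positions_passing(field, threshold=THRESHOLD):
    """Return the 0-based positions whose value is >= threshold (assumed comparison)."""
    values = [float(v) for v in field.split(";")]
    return [i for i, v in enumerate(values) if v >= threshold]


changing = positions_passing(probability_changing)
non_changing = positions_passing(probability_non_changing)

print(changing)      # [0, 3]
print(non_changing)  # [1, 2, 4, 5]
```

So positions 0 and 3 pass on probability_changing, and the remaining four pass only on probability_non_changing.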
My question is: what criteria does "voila tsv" use to decide whether an observation is written to its output file? How many LSV types within an observation must meet the given probability threshold for that observation to be included?
I hope that I have formulated my question clearly enough, and thank you again for looking into my problem!
Regards,
Pavlina