plot a 3D figure about the threshold and gene number


Maoting Chen

Jun 14, 2021, 4:00:27 AM
to majiq_voila
Hi,

I know that in Voila we can change the dPSI and confidence thresholds and get the genes that are above the set thresholds. However, I am wondering whether we can plot a 3D figure of gene number ~ dPSI + confidence, with, say, x = dPSI threshold, y = confidence threshold, and z = the corresponding number of genes. I think it would help us find suitable thresholds.
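One way to get such a surface is to post-process the "voila tsv" output outside Voila. The sketch below is not part of Voila and does not claim to reproduce its filtering. It assumes the tsv has the columns gene_id, mean_dpsi_per_lsv_junction and probability_changing, each holding semicolon-separated per-junction values, and it assumes that a gene is counted when any junction of any of its LSVs has |dPSI| and probability_changing both at or above the cutoffs. The helper names (parse_list, load_rows, lsv_passes, count_grid, plot_surface) and the toy rows are invented for illustration.

```python
import csv
import io


def parse_list(cell):
    """Turn a semicolon-separated cell such as '0.4;-0.4' into floats."""
    return [float(v) for v in cell.split(";") if v]


def load_rows(handle):
    """Read a tab-separated file and parse the per-junction columns."""
    rows = []
    for rec in csv.DictReader(handle, delimiter="\t"):
        rows.append({
            "gene_id": rec["gene_id"],
            "dpsi": parse_list(rec["mean_dpsi_per_lsv_junction"]),
            "prob": parse_list(rec["probability_changing"]),
        })
    return rows


def lsv_passes(row, dpsi_thr, prob_thr):
    # ASSUMED rule (not confirmed as Voila's): one junction meets both cutoffs.
    return any(abs(d) >= dpsi_thr and p >= prob_thr
               for d, p in zip(row["dpsi"], row["prob"]))


def count_grid(rows, dpsi_thresholds, prob_thresholds):
    """grid[i][j] = number of distinct genes passing dPSI cutoff i and probability cutoff j."""
    return [[len({r["gene_id"] for r in rows if lsv_passes(r, dt, pt)})
             for pt in prob_thresholds]
            for dt in dpsi_thresholds]


def plot_surface(grid, dpsi_thresholds, prob_thresholds):
    """Optional 3D view; needs numpy and matplotlib installed."""
    import numpy as np
    import matplotlib.pyplot as plt
    x, y = np.meshgrid(dpsi_thresholds, prob_thresholds, indexing="ij")
    ax = plt.figure().add_subplot(projection="3d")
    ax.plot_surface(x, y, np.array(grid))
    ax.set_xlabel("dPSI threshold")
    ax.set_ylabel("confidence threshold")
    ax.set_zlabel("number of genes")
    plt.show()


# Toy rows, invented for illustration only (not real MAJIQ output).
TOY_TSV = "\n".join([
    "gene_id\tlsv_id\tmean_dpsi_per_lsv_junction\tprobability_changing",
    "geneA\tgeneA:s:1-2\t0.40;-0.40\t0.99;0.98",
    "geneA\tgeneA:t:5-6\t0.10;-0.10\t0.50;0.40",
    "geneB\tgeneB:s:1-2\t0.25;-0.25\t0.90;0.91",
    "geneC\tgeneC:t:3-4\t0.05;-0.05\t0.10;0.10",
])

rows = load_rows(io.StringIO(TOY_TSV))
dpsi_thresholds = [0.1, 0.2, 0.3]
prob_thresholds = [0.8, 0.95]
grid = count_grid(rows, dpsi_thresholds, prob_thresholds)
print(grid)  # [[2, 1], [2, 1], [1, 1]]
```

Calling plot_surface(grid, dpsi_thresholds, prob_thresholds) would draw the surface. One caveat: a later post in this thread notes that probability_changing corresponds to P(|dPSI|>=0.20), so sweeping the dPSI cutoff on the mean dPSI alone is only an approximation unless the probabilities are recomputed for each cutoff.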

Looking forward to your reply.

Thanks,
Maoting

Maoting Chen

Jun 29, 2021, 6:11:57 AM
to majiq_voila
Hi,

Just to add to my question.
In the Voila view HTML file, we can adjust the dPSI and confidence thresholds and get the corresponding number of entries (LSVs). What I am really wondering is how those entries are checked against the thresholds. The following is my guess:
In the tsv file generated by voila tsv, each entry/LSV has the columns 'mean_dpsi_per_lsv_junction', 'probability_changing', and 'probability_non_changing'. For a given entry/LSV, if 'mean_dpsi_per_lsv_junction' has at least one value larger than the dPSI threshold and 'probability_changing' also has at least one value larger than the confidence threshold, is it counted?
I used R to apply my own thresholds to a tsv file generated from my data, but the resulting count is not consistent with the number reported by Voila.
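The guess above can be read in more than one way, and the readings give different counts, which is one possible (unconfirmed) source of such a mismatch. The sketch below compares three readings on contrived toy LSVs: comparing the signed dPSI, comparing |dPSI| with the two conditions met on any junctions, and comparing |dPSI| with both conditions met on the same junction. None of these is confirmed to be Voila's rule; the function names and the toy values are invented for illustration.

```python
def parse_list(cell):
    """Turn a semicolon-separated cell into a list of floats."""
    return [float(v) for v in cell.split(";") if v]


def signed_any(dpsi, prob, dpsi_thr, prob_thr):
    # Literal reading: signed dPSI, conditions may hold on different junctions.
    return any(d >= dpsi_thr for d in dpsi) and any(p >= prob_thr for p in prob)


def abs_any(dpsi, prob, dpsi_thr, prob_thr):
    # Same as above, but a negative dPSI also counts as a change.
    return any(abs(d) >= dpsi_thr for d in dpsi) and any(p >= prob_thr for p in prob)


def abs_same_junction(dpsi, prob, dpsi_thr, prob_thr):
    # Stricter reading: one junction must satisfy both conditions.
    return any(abs(d) >= dpsi_thr and p >= prob_thr for d, p in zip(dpsi, prob))


# Contrived toy LSVs: (mean_dpsi_per_lsv_junction, probability_changing).
TOY_LSVS = [
    ("-0.35;0.02", "0.99;0.10"),  # only a negative dPSI is large
    ("0.30;0.01", "0.50;0.97"),   # large dPSI and high probability on different junctions
    ("0.40;-0.40", "0.99;0.99"),  # passes under every reading
]

RULES = {
    "signed_any": signed_any,
    "abs_any": abs_any,
    "abs_same_junction": abs_same_junction,
}

counts = {
    name: sum(rule(parse_list(d), parse_list(p), 0.2, 0.95) for d, p in TOY_LSVS)
    for name, rule in RULES.items()
}
print(counts)  # {'signed_any': 2, 'abs_any': 3, 'abs_same_junction': 2}
```

When a recomputed count disagrees with Voila, it may be worth checking which of these readings the R code implements, and whether it counts LSVs or distinct genes.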

I would really appreciate an explanation. Alternatively, is there any public source code that I can access?
Looking forward to your reply.

Thanks,
Maoting

Pavlína Věchtová

Nov 1, 2021, 8:46:13 AM
to majiq_voila
Hi,
I am dealing with the very same issue. Since there is no documentation for the majiq output files, I was not able to figure out for myself how the tsv file is created based on the selected threshold. When I check the values of both 'probability_changing' and 'probability_non_changing', not all the values for the listed LSVs comply with the selected threshold. Could you please clarify what the criteria are for including an LSV in the resulting tsv file?
Thank you,
Pavlina



Paul Jewell

Nov 1, 2021, 10:46:05 AM
to majiq_voila
Hello Pavlina,

I'm not quite sure I understand. Could you please give an example of some of the data points you mean by "not all the values for the listed LSVs comply with the selected threshold", so I can try to explain what might be going on?

Thanks.

Pavlína Věchtová

Nov 3, 2021, 9:12:30 AM
to majiq_voila
Dear Paul,
Thank you so much for such a quick reply! It is a bit difficult for me to formulate my ideas clearly, but I hope I will succeed. 
I used the "voila tsv" command so that I could further process my set of differentially spliced genes. I ran the following command:
 
voila tsv $SPLICEGRAPH $VIOLAPSI -l violaTSV_N72HM.log -j 4 -f violaTSV_N72HM.tsv --probability-threshold 0.95

I assumed that --probability-threshold 0.95 restricts both the "P(|dPSI|>=0.20) per LSV junction" and "P(|dPSI|<=0.05) per LSV junction" values in the "majiq deltapsi" output (*deltapsi.tsv), and hence that only those observations (observations here appear to be genes and their corresponding LSV IDs) that meet the given probability threshold are written to the "voila tsv" output. However, when I inspect both the "majiq deltapsi" output and the "voila tsv" output, I can see that each observation contains several LSV types and that only some of these LSV types are supported by probability values that meet the given threshold. To illustrate the problem more clearly, I am posting an example of one gene/LSV ID from the "voila tsv" output below:

gene_name: HDAC9
gene_id: gene:ENSG00000048052
lsv_id: gene:ENSG00000048052:t:18585281-18585522
mean_dpsi_per_lsv_junction: -0.39720475460598303;-3.646654966312975e-08;-9.086655971137845e-05;0.40617946790480436;-3.646654966312975e-08;-3.646654966312975e-08
probability_changing: 0.9992084455711601;6.297504308260824e-21;6.319190549793287e-21;0.9994105746946654;6.297504308260824e-21;6.297504308260824e-21
probability_non_changing: 2.893312256446734e-10;0.9999988196695868;0.9969519749264952;1.1430489924834595e-10;0.9999988196695868;0.9999988196695868

There are 6 values, one for each LSV type that occurs in this particular event. However, only 2 of these 6 values meet the given threshold for 'probability_changing' (which corresponds to "P(|dPSI|>=0.20) per LSV junction" in the deltapsi output file) and 'probability_non_changing' (which corresponds to "P(|dPSI|<=0.05) per LSV junction" in the deltapsi output file).
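To make the per-value comparison concrete, the short sketch below splits the three semicolon-separated columns of the HDAC9 row quoted above and checks each position against 0.95. It only restates the posted numbers; it does not show how "voila tsv" itself decides which LSVs to write. The variable names are chosen for illustration, and positions are numbered from 0 in the order the values appear.

```python
# Values copied from the HDAC9 row of the "voila tsv" output quoted above.
MEAN_DPSI = ("-0.39720475460598303;-3.646654966312975e-08;-9.086655971137845e-05;"
             "0.40617946790480436;-3.646654966312975e-08;-3.646654966312975e-08")
P_CHANGING = ("0.9992084455711601;6.297504308260824e-21;6.319190549793287e-21;"
              "0.9994105746946654;6.297504308260824e-21;6.297504308260824e-21")
P_NON_CHANGING = ("2.893312256446734e-10;0.9999988196695868;0.9969519749264952;"
                  "1.1430489924834595e-10;0.9999988196695868;0.9999988196695868")

THRESHOLD = 0.95  # the value passed as --probability-threshold in the post


def parse_list(cell):
    """Turn a semicolon-separated cell into a list of floats."""
    return [float(v) for v in cell.split(";") if v]


dpsi = parse_list(MEAN_DPSI)
p_changing = parse_list(P_CHANGING)
p_non_changing = parse_list(P_NON_CHANGING)

# Positions whose probability reaches the threshold, per column.
changing_idx = [i for i, p in enumerate(p_changing) if p >= THRESHOLD]
non_changing_idx = [i for i, p in enumerate(p_non_changing) if p >= THRESHOLD]

for i, (d, pc, pn) in enumerate(zip(dpsi, p_changing, p_non_changing)):
    print(f"{i}: dPSI={d:+.4f}  P(changing)={pc:.4g}  P(non-changing)={pn:.4g}")

print("changing >= 0.95 at positions:", changing_idx)          # [0, 3]
print("non-changing >= 0.95 at positions:", non_changing_idx)  # [1, 2, 4, 5]
```

In this row, positions 0 and 3 reach 0.95 for probability_changing, while positions 1, 2, 4 and 5 reach it for probability_non_changing.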

My question is: what criteria does the "voila tsv" program use to pass an observation to its output file? How many LSV types within each observation must meet the given probability threshold for the observation to be written to the "voila tsv" output file?

I hope that I formulated my question clearly enough, and thank you again for looking into my problem!
Regards,
Pavlina