MAJIQ

Sasha G

Jul 22, 2022, 1:17:06 PM

to majiq_voila

Hello,

I hope this email finds you well. I just had a few questions about how to interpret data from the tsv file.

What do all the column titles mean exactly?

For the mean delta psi per lsv per junction, why is there more than one value for some rows? What does it mean when there is more than one value?

How do you determine which genes are statistically significant?

What is the difference between the s and the t in the lsv type? What does the I mean in the lsv type?

I have attached my tsv file below.

Thank you so much,

Sasha

Sasha G

Jul 22, 2022, 1:19:34 PM

to majiq_voila

Also, why is there no column for the deltaPSI values?

Paul Jewell

Jul 27, 2022, 11:42:31 AM

to majiq_voila

Hello Sasha,

I don't see the tsv file attached here, but I can recommend looking at the basic column definitions of the TSV output in the vignette here: https://biociphers.bitbucket.io/majiq-docs-academic/gallery/heterogen-vignette.html#Generating-TSV-output (for a general overview of the names and categories).

There is more than one value per LSV because each LSV contains more than one junction (choice of splice path). So the number of semicolon-separated values in each column depends on the number of junctions in the LSV. (Their ordering will always be consistent across columns.)
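To make the semicolon-separated layout concrete, here is a minimal Python sketch that expands one TSV row into one record per junction. It is only an illustration, not part of voila: the helper name `split_per_junction`, the example row, and the `junctions_coords` column name are assumptions, so check them against the header of your own file.

```python
def split_per_junction(row, columns, sep=";"):
    """Expand one LSV row into a list with one dict per junction.

    `row` maps column names to raw strings from the TSV; `columns` lists
    the per-junction columns to split. Hypothetical helper, not a voila API.
    """
    split = {col: row[col].split(sep) for col in columns}
    lengths = {len(values) for values in split.values()}
    if len(lengths) != 1:
        # Every per-junction column should carry one value per junction.
        raise ValueError(f"columns disagree on junction count: {lengths}")
    n_junctions = lengths.pop()
    return [{col: split[col][i] for col in columns} for i in range(n_junctions)]


# Made-up row for illustration; real values come from your voila TSV.
row = {
    "lsv_id": "gene1:s:100-200",
    "mean_dpsi_per_lsv_junction": "0.25;-0.25",
    "junctions_coords": "200-300;200-400",
}
junctions = split_per_junction(
    row, ["mean_dpsi_per_lsv_junction", "junctions_coords"]
)
for j in junctions:
    print(j)
```

Because the ordering is consistent across columns, the i-th value in one column and the i-th value in another describe the same junction.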

The test of significance is not per gene but per LSV. I will defer to others to provide a more in-depth explanation of the default significance criterion.

There are t, s, and u/unk types. 's' is for source: a location on the splice graph from which many possible paths diverge. 't' is for target, the opposite: a location to which many paths may converge. (For example, a module will always begin with a source LSV and conclude with a target LSV.) The unknown lsv type arises in cases where the reference exon would be a "half exon" (shown as a small green dashed line in voila view, with one of the coordinates being '-1'). These are cases where the potential for a splice to an exon exists, but we were not able to find the other end of the exon in the data provided.
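If it helps when filtering the TSV, the s/t/u distinction can be read off programmatically. The sketch below assumes the lsv type string starts with the letter code followed by a '|' separator; that layout, the example strings, and the helper name `lsv_kind` are assumptions for illustration, so verify them against your own file.

```python
# Letter codes described above, mapped to readable labels.
LSV_KINDS = {"s": "source", "t": "target", "u": "unknown", "unk": "unknown"}


def lsv_kind(lsv_type, sep="|"):
    """Return 'source', 'target', or 'unknown' for an lsv type string.

    Assumes the code is the text before the first separator; anything
    else is reported as 'unrecognized' rather than guessed at.
    """
    prefix = lsv_type.split(sep, 1)[0].strip().lower()
    return LSV_KINDS.get(prefix, "unrecognized")


# Made-up type strings for illustration only.
for example in ["s|1e1.1o2|1e2.1o2", "t|1e1.1o1|2e1.1o1", "unk"]:
    print(example, "->", lsv_kind(example))
```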

For dpsi values, 'mean_dpsi_per_lsv_junction' is probably the column that you are looking for. In this case the mean is over all experiments in the group. (voila TSV does not currently show data per individual experiment; if this is desired, it's probably best to group experiments individually in the build.)

Sasha G

Jul 29, 2022, 6:51:59 PM

to majiq_voila

Thank you!
