voila view interpretation

Lauren Malsick

Nov 20, 2025, 10:27:53 AM
to Biociphers
Hello!

I have gotten voila view to successfully work and now am looking through my data. I want to make sure I understand the interpretation of this correctly...

I have a screenshot of a gene I am looking at, RBM39:

I am mainly wondering what the green boxes represent. I know the voila view program labels them as RNAseq data only, so does that mean that in my data the exons are longer than in the database (i.e., the GFF files)? I guess I'm just confused and want to make sure I am interpreting this right.

Additionally, at the bottom, for the LSV ID, under LSV type there's the number 3. Does this refer to the third exon, so that the 4th exon is showing variation in inclusion/exclusion? Since exon 4 in the map above shows as database only, does that mean my RNAseq dataset has no sequences corresponding to this specific exon?

If it helps I am working with a weird genome, so this could be the reason for some of the odd data. I essentially just want to make sure I am interpreting these plots correctly!

I appreciate any insight! Thank you!!


rbm39Voilaview.pdf

San Jewell

Nov 20, 2025, 11:33:56 AM
to Biociphers
Hello, 

Thank you for reaching out with questions!

The green boxes in this splicegraph are all denovo intron retention events only found in the experiment data, not the annotation. It is also possible to get denovo exons or denovo exon extensions, which would be bigger green boxes that are the height of the other exons, but there aren't any of those in this splicegraph. 

On the LSV cartoon, the '3' shows that exon 3 is the reference exon for the detected LSV. The LSV defines alternative inclusion of exons 4 and 5 by either junctions or intron retention. As for why exon 4 is showing as clear colored, I need to go back and look at the definition, as by my interpretation it should actually be RNAseq only, not DB only. I will get back to you after conferring with some other lab members. 

Thanks, 
-San

Lauren Malsick

Nov 20, 2025, 12:40:43 PM
to Biociphers
Hi San!

Thank you so much, that is incredibly helpful! So the number over the retained intron refers to how many RNAseq reads I got that correspond to that intron retention? Should I be concerned that so much of the intron retention I see is found only in the RNAseq dataset, or could that just be an artifact of a less annotated genome? 

And hopefully a last question, about the alternative inclusion of exons 4 and 5: each color refers to a different inclusion pattern. Red would be splicing out the intron between 3 and 4 (no change in PSI). Green refers to intron retention between 3 and 4 (PSI increases, so it is retained more in my second condition?). Blue refers to splicing out exon 4 and that retained intron so that only exons 3 and 5 remain (PSI decreases, so it is retained more in my first condition (not skipped) but skipped in my second condition?). 

And thank you for looking into the exon 4 color! I want to make sure I write my legend correctly and don't misinterpret the data!

Thank you!

Lauren

San Jewell

Nov 24, 2025, 11:12:45 AM
to Biociphers
Hi Lauren, 

In general, I don't think high intron retention read counts are a cause for concern in the experiment, but because that part of the experiment design is a little less familiar to me, I'm going to reach out to some lab members to verify. 

For the LSV shown in your screenshot, the conclusions that you draw from your second paragraph are all correct, yes. 
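
To make the direction of those changes concrete, here is a minimal sketch of how a change in PSI between two conditions can be read. The PSI values and the 0.05 cutoff are made up for illustration, and dPSI is assumed here to be condition 2 minus condition 1. None of the numbers come from the screenshot, and this is not how voila computes or thresholds dPSI.

```python
def describe_dpsi(psi_cond1, psi_cond2, threshold=0.05):
    """Return (dpsi, label) for one junction or retained intron.

    dpsi is defined here as condition 2 minus condition 1 (an assumption
    for this sketch). The threshold is a hypothetical cutoff, not a
    MAJIQ/voila default.
    """
    dpsi = psi_cond2 - psi_cond1
    if dpsi > threshold:
        label = "PSI higher in condition 2"
    elif dpsi < -threshold:
        label = "PSI higher in condition 1"
    else:
        label = "no clear change"
    return dpsi, label


# Hypothetical (condition 1, condition 2) PSI values for the three colored
# connections leaving the reference exon.
example_lsv = {
    "red": (0.40, 0.41),
    "green": (0.10, 0.30),
    "blue": (0.50, 0.29),
}

for color, (psi1, psi2) in example_lsv.items():
    dpsi, label = describe_dpsi(psi1, psi2)
    print(f"{color}: dPSI = {dpsi:+.2f} ({label})")
```

Because the PSI values of all connections in one LSV sum to 1 within each condition, an increase for one connection has to be balanced by a decrease in the others.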

For exon 4, I have looked deeper and noticed that the coloring was a bug introduced with the release of the MAJIQ-L long-read software. In this case exon 4 should actually be colored green, as it is a denovo-only detected exon. This morning I pushed out a minor version patch that should correct this coloring in voila, so you may update whenever you have the chance and verify that the color now displays that way. 

Thanks!
-San

Matthew Gazzara

Nov 24, 2025, 11:44:30 AM
to Biociphers
Hi Lauren,

No, high levels of intron retention (IR) are not necessarily a concern. They will be a function of the RNA-seq library prep method (polyA selected vs. rRNA depleted total RNA), the read coverage depth, the gene expression, etc. Is your data polyA selected? We do have a parameter in the build stage where you can turn up the threshold for IR detection if you feel it's too permissive on your dataset. 

For each LSV that has a denovo IR detected, you can look at the PSI of that intron and, I think, for most of the IRs in the RBM39 splicegraph the inclusion level seems fairly low compared to the spliced junction reads. RBM39, and many other RBPs, are regulated at the level of splicing, and I believe the LSVs involving exons 3 and 4 in your splicegraph introduce premature termination codons. The IR events around these exons are probably also regulatory and tune RBM39 mRNA and protein levels. Seeing such patterns of splicing, IR, etc. in RBPs would be expected. 
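
As a rough illustration of comparing an intron's inclusion level with the spliced junctions, the sketch below takes a plain ratio of read counts within one LSV. The counts and connection names are hypothetical, and the `naive_psi` helper is not part of MAJIQ. MAJIQ's own PSI estimates are model-based, so they will not generally equal this simple ratio.

```python
def naive_psi(read_counts):
    """Fraction of reads supporting each connection in one LSV.

    This is a plain ratio for intuition only, not MAJIQ's quantification.
    """
    total = sum(read_counts.values())
    if total == 0:
        raise ValueError("no reads support any connection in this LSV")
    return {name: count / total for name, count in read_counts.items()}


# Hypothetical read counts for an LSV whose reference exon is exon 3.
counts = {
    "junction 3->4": 180,
    "junction 3->5": 60,
    "retained intron 3-4": 10,
}

for name, psi in naive_psi(counts).items():
    print(f"{name}: naive PSI = {psi:.2f}")
```

With counts like these, the retained intron accounts for only a small share of the LSV even though its raw read number is not zero, which is the kind of pattern described above.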

-Matt Gazzara