apparent contradiction between Voila & IGV genome browser Sashimi plot


Justin Malin

Mar 26, 2021, 3:27:38 PM
to majiq_voila
Hi, 

Thanks again, Matt, for your answer to my previous question.
At this point, though, we're stuck due to what seem to be contradictory results from the Voila splice graph and the IGV genome browser Sashimi plot, both based on the same data (please see my message from March 17 2021 for figures and a detailed explanation). Briefly, the Voila plot shows zero 1-4 splice junctions while the Sashimi plot shows a relatively large number, given the sample size. What can explain the difference between the two results? More importantly, how can we interpret the side-by-side results with confidence?

Any insights would be appreciated!

Thanks,
Justin

jai...@biociphers.org

Mar 26, 2021, 8:59:45 PM
to majiq_voila
Dear Justin,

The graphic in VOILA shows read coverage for junctions that met the criteria in majiq build for inclusion in the splicegraph.

A junction is included if either of these holds:
  • the junction is part of an annotated transcript in your input GFF3 (annotated junction)
  • the junction passes the denovo filters for at least one build group in your build configuration.
    • That is, enough experiments pass the per-experiment filters (determined by --min-experiments; by default, at least half per build group)
    • Per-experiment filters: the junction is supported by enough reads (--mindenovo) and by enough unique positions on the aligned reads, i.e. distinct places within the reads where the split falls (--minpos)
You are asking about junctions that are not annotated, so they have to pass --mindenovo and --minpos in enough experiments to be visualized in VOILA.

By default, --mindenovo is 5, and --minpos is 2.
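
As a rough illustration of the filters described above, here is a minimal Python sketch. It is not MAJIQ's actual code: the `JunctionEvidence` class and both function names are hypothetical, and treating --min-experiments as a fraction that is rounded up is an assumption made for the example.

```python
from dataclasses import dataclass
from math import ceil


@dataclass
class JunctionEvidence:
    """Evidence for one unannotated junction in one experiment (hypothetical type)."""
    reads: int      # number of reads spanning the split
    positions: int  # number of distinct read positions at which the split falls


def passes_experiment(ev, mindenovo=5, minpos=2):
    # Per-experiment filter: enough reads AND enough unique positions.
    return ev.reads >= mindenovo and ev.positions >= minpos


def passes_build_group(evidence, min_experiments=0.5, mindenovo=5, minpos=2):
    # Assumption: min_experiments is a fraction of the group, rounded up,
    # and at least one experiment must always pass.
    needed = max(1, ceil(min_experiments * len(evidence)))
    passing = sum(passes_experiment(ev, mindenovo, minpos) for ev in evidence)
    return passing >= needed


# A single experiment with 4 supporting reads fails the defaults...
print(passes_build_group([JunctionEvidence(reads=4, positions=3)]))
# ...but passes once the thresholds are lowered to 1.
print(passes_build_group([JunctionEvidence(reads=4, positions=3)],
                         mindenovo=1, minpos=1))
```

Under these assumptions, a junction with 1-4 reads in a single experiment can never pass the default --mindenovo of 5, no matter how many positions it covers.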

I believe that when you say "1-4 splice junctions", you are referring to splits with support from 1-4 reads.

So, if you are using default parameters, and this is the only input experiment into MAJIQ, these "1-4 splice junctions" do not have enough evidence for MAJIQ to add them to the splicegraph. It would require enough other experiments in the same or a different build group with the same splits, but with at least 5 reads of evidence. Then, you would see your missing "1-4 splice junctions".

A less likely but possible reason why MAJIQ might not see them is that the splits are near the ends of the aligned reads (MAJIQ requires >=8bp overhang).
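
To make the overhang requirement concrete, here is a small Python sketch. It assumes a simplified picture in which a read is described only by its aligned length and the number of aligned bases before the split; the function name and this representation are hypothetical, and how MAJIQ measures overhang internally may differ.

```python
def has_enough_overhang(split_offset, read_length, min_overhang=8):
    """Return True if the split leaves at least min_overhang bases on both sides.

    split_offset: aligned read bases before the split (hypothetical representation).
    read_length:  total aligned read bases.
    """
    left = split_offset
    right = read_length - split_offset
    return min(left, right) >= min_overhang


# A split 5 bases into a 100-base read is too close to the read's start.
print(has_enough_overhang(5, 100))
# A split in the middle of the read has plenty of overhang on both sides.
print(has_enough_overhang(50, 100))
```

A genome browser may still draw reads like the first one as junction-spanning, which is one way the two views can disagree.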

How are you configuring your build? If I am interpreting your case appropriately, you could always set --minreads, --minpos, and --mindenovo all to 1 (but this might add a lot of other noise) so you can visualize the junctions in VOILA.

That said, are you interested in PSI quantifications? Or are you just trying to get a readout of low-coverage splits in this one particular genomic region?

Please let us know if this addresses your questions/concerns and/or if you have any additional questions or information about your specific use-case. Thank you!

Best,
Joseph

Justin Malin

Mar 26, 2021, 11:13:38 PM
to majiq_voila
Hi Joseph, 

As you surmise, I was referring, incorrectly, to splits with support from 1-4 reads. And my parameters may have been set too stringently to visualize the splice junction of interest.

We are just trying to get a sense of whether the limited data that we have justifies inferring the presence of this unannotated junction in this one region, which would be expected in the case of a transgenically-induced inversion in our loxP system. Correct me if I'm wrong, but it sounds like you're saying that the default parameters, chosen to ensure a measure of statistical confidence, are precisely what is preventing the VOILA splicegraph from displaying these splice junctions. Therefore, we can't conclude with confidence that this junction is supported by the 1-4 reads... If I don't hear back, I'll assume this is more or less what you're saying.

Thank you very much for your clear and concise answer.

Best,
Justin

Aicher Joseph

Mar 31, 2021, 11:25:40 PM
to Justin Malin, majiq_voila

If I remember correctly, your data are from some kind of scRNA-seq. Our default cutoffs were picked with typical (Illumina) bulk RNA-seq experiments in mind. I suspect that patterns of signal vs noise with your platform and sequencing depth might be quite different from those of bulk (or of other single-cell platforms).

 

This is to say, without more information, I wouldn’t necessarily use the default parameters to conclude that you can’t use your data to infer its presence; it’s not clear that the defaults are appropriate for your use case. The case for the junction is stronger the more replicate data you have in which it appears relatively consistently, with counts comparable to or larger than those of other junctions you are confident are there (especially after considering/adjusting for any platform-specific biases), and the fewer other denovo junctions the same thresholds pick up that you have no other reason to believe in. In other words, from the splicegraph/majiq build side of things, you need to determine what the optimal thresholds are for your data.

 

Best,

Joseph

 

P.S. I don’t remember how exact Cre-Lox recombination is with the sites of the deletion, but keep in mind that MAJIQ will treat denovo junctions with different coordinates independently. If you care that it happened but not necessarily the exact coordinates, there might be some benefit to pooling evidence.
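
One way to pool evidence across nearby coordinates, sketched in Python. This happens outside MAJIQ and is only an illustration: the function name, the (start, end, reads) tuple format, and the 10 bp tolerance are all assumptions, and the greedy grouping around the first junction seen is a deliberately simple choice.

```python
def pool_junctions(junctions, tol=10):
    """Sum read counts of junctions whose start and end both lie within
    tol bases of a cluster's anchor junction.

    junctions: iterable of (start, end, reads) tuples (hypothetical format).
    Returns a list of (anchor_start, anchor_end, total_reads) tuples.
    """
    pooled = []  # each cluster is [anchor_start, anchor_end, total_reads]
    for start, end, reads in sorted(junctions):
        for cluster in pooled:
            if abs(start - cluster[0]) <= tol and abs(end - cluster[1]) <= tol:
                cluster[2] += reads
                break
        else:
            # No existing cluster is close enough: start a new one.
            pooled.append([start, end, reads])
    return [tuple(c) for c in pooled]


# Three slightly different junctions near (1000, 5000) pool into one
# cluster with 6 reads; the distant junction stays separate.
print(pool_junctions([(1000, 5000, 2), (1003, 4998, 1),
                      (1001, 5002, 3), (2000, 9000, 4)]))
```

Whether the pooled count is meaningful still depends on choosing a tolerance that matches how variable the breakpoints are expected to be.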

 

P.P.S. This is definitely outside of the usual use case of MAJIQ, since you are trying to use RNA-seq to infer the presence of a genomic deletion, not really splicing.

 

P.P.P.S. If your aligner calls both splits (CIGAR op N) and deletions (CIGAR op D), note that MAJIQ only calls the N operations as junctions.
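
To show the N vs D distinction, here is a minimal Python sketch that walks a CIGAR string and reports only the N operations as junctions. It is not MAJIQ's parser: the function name is hypothetical, and the 0-based start and half-open (start, end) intervals for the skipped region are conventions chosen for the example.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # operations that advance along the reference


def junctions_from_cigar(ref_start, cigar):
    """Return (start, end) reference intervals skipped by N operations.

    ref_start: 0-based reference position of the first aligned base.
    D operations advance the reference position but are not reported.
    """
    pos = ref_start
    junctions = []
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            junctions.append((pos, pos + length))
        if op in REF_CONSUMING:
            pos += length
    return junctions


# The same 200-base gap is a junction when called as N...
print(junctions_from_cigar(100, "50M200N50M"))
# ...but yields no junction when the aligner calls it as a deletion (D).
print(junctions_from_cigar(100, "50M200D50M"))
```

If an aligner reports the event of interest as D in some reads, those reads would contribute nothing to the junction count under this logic.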

 

-- 

Joseph Aicher (he/him/his)

MD/PhD Candidate, GS6

Perelman School of Medicine

Genomics and Computational Biology Graduate Group


Justin Malin

Apr 10, 2021, 2:31:21 PM
to majiq_voila
Hi Joseph, 

My apologies for the delayed response. 
Thanks very much for your helpful suggestion that we can and should pool our evidence since we are less interested in the exact coordinates of the de novo junction than its presence.

Best,
Justin
