Cassette exon events lack second LSV


mario.ke...@googlemail.com

Sep 3, 2021, 12:24:14 PM
to majiq_voila

Hi everyone,

I've post-processed the MAJIQ output to extract simple cassette exon events involving only 3 exons. My general assumption is that for every cassette exon event I should have two LSVs (one from the source and one from the target exon perspective). However, for a handful of events I only get a single LSV. I checked the output in the Voila GUI, and according to the splicegraph there should be a second LSV, but it is not reported. I added an example in the attachments:

Exon 23 is my cassette exon of interest. As you can see, there is an LSV with exon 24 as target exon that covers the cassette exon event. But I would also expect a second LSV with exon 22 as source exon. Is there a logical explanation why the second LSV is missing?


Thanks in advance.

Best,
Mario

Screenshot 2021-09-03 at 18.13.50.png
Screenshot 2021-09-03 at 18.13.38.png

Matthew Gazzara

Sep 10, 2021, 11:45:37 AM
to majiq_voila
Hi Mario,

Thanks for your interest! From your screenshots, I agree it seems as if the exon 22 source LSV should exist. It's possible that some filtering during the builder or quantification step removed this LSV.

Would it be possible for you to share your settings file and build, dpsi quantifier, and voila commands so we can look into this issue for you? 

Thanks!
Matt

mario.ke...@googlemail.com

Sep 10, 2021, 12:05:05 PM
to majiq_voila
Hi Matt,
thanks for the reply.

The analysis is from a bachelor student but I think I found the correct build and delta psi shell-script:

He used the following commands:

majiq build /share/project/jonas/2/MAJIQ/gencode.v37.primary_assembly.annotation.gff3 -c /share/project/jonas/2/MAJIQ/MAJIQ.config -j 4 --minreads 20 --min-denovo 30 --min-intronic-cov 0.1 -o /share/project/jonas/2/MAJIQ/out

majiq deltapsi -grp1 "/share/project/jonas/2/MAJIQ/out/CT1-Aligned.sortedByCoord.out.majiq" "/share/project/jonas/2/MAJIQ/out/CT2-Aligned.sortedByCoord.out.majiq" "/share/project/jonas/2/MAJIQ/out/CT3-Aligned.sortedByCoord.out.majiq" -grp2 "/share/project/jonas/2/MAJIQ/out/KO1-Aligned.sortedByCoord.out.majiq" "/share/project/jonas/2/MAJIQ/out/KO2-Aligned.sortedByCoord.out.majiq" "/share/project/jonas/2/MAJIQ/out/KO3-Aligned.sortedByCoord.out.majiq" -j 4 --minreads 25 --prior-minreads 50 --output-type voila -o /share/project/jonas/2/MAJIQ -n A_CT A_KO

I think the first inclusion junction is an annotated one so --minreads is the parameter to look at. I am not really sure if I understand the parameter correctly but shouldn't the first inclusion junction survive this filtering step?

Best,
Mario

jai...@biociphers.org

Sep 10, 2021, 2:31:44 PM
to majiq_voila
Dear Mario,

I suspect the LSV is being filtered out at the quantification stage.

To explain how the parameters (e.g. minreads, etc.) work:
  • The command you shared compares group of experiments A_CT (n=3) to group of experiments A_KO (n=3).
  • There are three filtering criteria used by the quantifier: minreads, minpos, and min-experiments.
    • Per experiment (minreads, minpos): a junction "passes" the experiment if the experiment has at least minreads reads aligned to the junction, with the junction appearing on the aligned reads in at least minpos unique positions.
    • Per group (min-experiments): a junction "passes" a group of experiments if at least min-experiments experiments in the group had the junction pass. If min-experiments < 1, then it's treated as a proportion of the size of the group.
    • (retained introns follow the same criteria after scaling with respect to intron length)
    • Per group: an LSV "passes" a group if at least one of its junctions (or retained introns) passes for the group.
    • Per quantification: MAJIQ deltapsi only quantifies an LSV if it passes in both groups.
  • The criteria used in the command above are minreads=25, minpos=3 (default), min-experiments=0.5 (default).
    • Group sizes are 3, so that means that each group needs at least 2 (ceil(0.5 * 3)) experiments to pass.
  • So MAJIQ will only quantify the LSV (and have it show up in VOILA) if, in A_CT, at least two of the experiments have at least 25 reads on the same junction of that LSV, and likewise for A_KO (though the passing junction may be a different one in the second group).
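The criteria above can be written out as a small Python sketch. This is only an illustration of the rules as described in this message, not MAJIQ's actual code: all function names and the input layout (a list of junction positions per read, and a dict of per-experiment pass flags per junction) are hypothetical, and the scaling for retained introns is left out.

```python
import math


def junction_passes_experiment(read_positions, minreads, minpos):
    """Per-experiment check for one junction.

    read_positions holds one entry per read aligned to the junction: the
    position at which the junction appears on that read (hypothetical layout).
    """
    enough_reads = len(read_positions) >= minreads
    enough_positions = len(set(read_positions)) >= minpos
    return enough_reads and enough_positions


def required_experiments(min_experiments, group_size):
    """Values below 1 are treated as a proportion of the group size."""
    if min_experiments < 1:
        return math.ceil(min_experiments * group_size)
    return int(min_experiments)


def junction_passes_group(passed_flags, min_experiments):
    """passed_flags: one bool per experiment in the group, for one junction."""
    needed = required_experiments(min_experiments, len(passed_flags))
    return sum(passed_flags) >= needed


def lsv_passes_group(junction_flags, min_experiments):
    """junction_flags: dict mapping junction name -> per-experiment bools."""
    return any(
        junction_passes_group(flags, min_experiments)
        for flags in junction_flags.values()
    )


def lsv_is_quantified(group1_flags, group2_flags, min_experiments=0.5):
    """An LSV is only quantified if it passes in both groups."""
    return (lsv_passes_group(group1_flags, min_experiments)
            and lsv_passes_group(group2_flags, min_experiments))
```

With groups of three and min-experiments=0.5, `required_experiments(0.5, 3)` gives 2, which is the "at least 2 experiments per group" figure used above.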
Without getting into MAJIQ internals/etc. we can mostly check this in one of two ways: (1) from VOILA splicegraphs or (2) from rerunning MAJIQ with permissive thresholds.
  1. From VOILA splicegraphs (more informative):
    • The splicegraph with readcounts you screenshotted earlier only shows a single experiment or summary over groups.
    • You can view multiple splicegraphs/junction readcounts from specific experiments in VOILA by clicking on "SpliceGraph" in the top-left corner and adding the specific experiments of interest.
    • You can use this to view the read counts at the missing LSV and see which junctions pass the per-experiment filters in which experiments, and whether enough experiments pass for each group to pass as well.
      • (thank you for bringing this up, we are now discussing adding a feature to make this process easier in the future)
  2. Rerunning MAJIQ with permissive thresholds:
    • Rerun MAJIQ deltapsi, changing minreads to 1 and adding flags --minpos 1, --min-experiments 1.
    • (FYI This will introduce a lot of additional quantified LSVs, some of which may be noisier than you want)
    • We expect that the missing LSV should show up -- otherwise, this would suggest that it's being filtered out in the build step.
I think that the VOILA splicegraphs approach will be most informative, but the MAJIQ with permissive thresholds approach could quickly answer whether the LSV goes missing from MAJIQ build or MAJIQ deltapsi.

Best,
Joseph

Mario Keller

Sep 10, 2021, 3:38:39 PM
to jai...@biociphers.org, majiq_voila
Dear Joseph,

thank you very much for the nice explanation. I will check it directly on Monday.

Just a brief question regarding the "minpos" parameter, since I am still struggling with the explanation. If by default minpos is 3 and I set minreads to 25, does this mean that the junction needs support from 25 reads that are aligned/positioned at at least 3 different positions?
Should this prevent cases where reads are PCR duplicates and pileup at exactly the same position?

Best, 
Mario


mario.ke...@googlemail.com

Sep 13, 2021, 5:07:42 AM
to majiq_voila
Dear Joseph and Matt,

I checked the individual experiments for each of the two groups and found that the 2nd LSV did indeed get lost during filtering.

For the missing LSV, each group has only a single experiment in which any junction reaches >= 25 reads. For the detected LSV, the Inclusion Junction reaches >= 25 reads in at least two experiments of each group, while the Skipping Junction does so in only a single CT experiment. So everything makes perfect sense now :)

Missing LSV:

Inclusion Junction:
  • A_CT replicates: 18, 8, 20
  • A_KO replicates: 17, 36, 15
Skipping Junction:
  • A_CT replicates: 11, 7, 25
  • A_KO replicates: 4, 5, 6

Detected LSV:

Inclusion Junction:
  • A_CT replicates: 31, 6, 36
  • A_KO replicates: 27, 37, 25
Skipping Junction:
  • A_CT replicates: 11, 7, 25
  • A_KO replicates: 4, 5, 6
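
As a quick cross-check, the counts above can be run through the filtering rules Joseph described. This is a rough sketch, not MAJIQ code: it uses minreads=25 and min-experiments=0.5 from the deltapsi command, and it assumes the minpos criterion is met whenever minreads is, since only read counts are listed here. The dictionary layout and function names are made up for illustration.

```python
import math

MINREADS = 25          # from the deltapsi command
MIN_EXPERIMENTS = 0.5  # default, treated as a proportion of the group size

# Read counts per junction and group, copied from the lists above.
counts = {
    "missing LSV": {
        "inclusion": {"A_CT": [18, 8, 20], "A_KO": [17, 36, 15]},
        "skipping": {"A_CT": [11, 7, 25], "A_KO": [4, 5, 6]},
    },
    "detected LSV": {
        "inclusion": {"A_CT": [31, 6, 36], "A_KO": [27, 37, 25]},
        "skipping": {"A_CT": [11, 7, 25], "A_KO": [4, 5, 6]},
    },
}


def group_passes(lsv, group):
    """True if any junction has enough passing experiments in this group."""
    for junction_counts in lsv.values():
        reads = junction_counts[group]
        needed = math.ceil(MIN_EXPERIMENTS * len(reads))
        # Assumption: minpos is ignored because positions are not listed.
        passing = sum(r >= MINREADS for r in reads)
        if passing >= needed:
            return True
    return False


def quantified(lsv):
    """An LSV is quantified only if it passes in both groups."""
    return all(group_passes(lsv, group) for group in ("A_CT", "A_KO"))


for name, lsv in counts.items():
    print(name, "->", "quantified" if quantified(lsv) else "filtered out")
```

Under these assumptions the missing LSV fails in both groups (at most one passing experiment per junction, where two are needed), while the detected LSV passes in both groups through its Inclusion Junction.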


