LSV stringency parameters (min-experiments etc.)


Rupert Hugh-white

Oct 19, 2018, 12:30:04 PM
to majiq_voila

Hello,

I have questions regarding some of the majiq build and majiq psi parameters. I've run majiq build with min-experiments not specified (so the default of >=50% of samples is applied) on 24 samples (2 experiments of 12 replicates defined in the config file). I then run majiq psi and voila psi on each sample independently. This results in 215512 LSVs being quantified in at least one of these 24 samples. Many of these LSVs are not quantified in very many samples however, for example 8029 LSVs are only quantified in 1/24 samples.

My interpretation of this is that during the build step, 8029 LSVs are defined which only pass the min-reads, minpos etc. filters in one sample. I would expect each defined LSV to pass these filters in at least 6 samples (50% of experiments in the defined experimental groups). Am I missing something/misunderstood? 

My goal is to have a matrix of LSVs with one PSI value per junction per sample for use with other downstream tools. Because running majiq psi and voila with multiple samples yields an average PSI per LSV junction (not one per sample), I currently run these commands separately for each sample and then merge the resulting data for the 24 samples by LSV ID. I was surprised that the resulting LSV-intersection matrix contains only 149041 of the 215512 total LSVs, and I am worried that meaningful biology/LSVs are being lost.
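
The per-sample merge described here can be sketched roughly as follows. This is only an illustration, assuming each sample's voila output has already been parsed into a dictionary mapping an LSV ID to its per-junction PSI values. The sample names, LSV IDs, and both helper functions are hypothetical and are not part of MAJIQ or voila.

```python
def build_psi_matrix(per_sample):
    """Merge per-sample PSI tables into one row per LSV.

    per_sample maps sample name -> {lsv_id: [psi per junction]}.
    A sample in which an LSV was not quantified gets None in that row.
    """
    samples = sorted(per_sample)
    all_lsvs = sorted(set().union(*(t.keys() for t in per_sample.values())))
    matrix = {
        lsv: [per_sample[s].get(lsv) for s in samples]
        for lsv in all_lsvs
    }
    return samples, matrix


def complete_rows(matrix):
    """Keep only LSVs quantified in every sample (the intersection)."""
    return {lsv: row for lsv, row in matrix.items() if None not in row}


# Made-up example data: two samples, two LSVs, one LSV missing in sample_B.
per_sample = {
    "sample_A": {"gene1:s:100-200": [0.8, 0.2], "gene2:t:300-400": [0.5, 0.5]},
    "sample_B": {"gene1:s:100-200": [0.7, 0.3]},
}
samples, matrix = build_psi_matrix(per_sample)
shared = complete_rows(matrix)
print(len(matrix), len(shared))  # 2 1
```

Keeping the union with explicit gaps, rather than only the intersection, makes it easy to see how many LSVs drop out at each level of sample coverage.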

I have further questions regarding minpos and minreads:
What is a "start position" with regards to -minpos?
Is minreads a simple threshold where all reads mapping to any location within an LSV are summed and the total must be >= minreads?

Hopefully my questions are clear, and thank you in advance for your time!

Rupert

Jordi Vaquero

Oct 22, 2018, 10:47:16 AM
to majiq_voila, Rupert Hugh-white
Hello Rupert, 
Let me split your question into parts so I can address each one.
I have questions regarding some of the majiq build and majiq psi parameters. I've run majiq build with min-experiments not specified (so the default of >=50% of samples is applied) on 24 samples (2 experiments of 12 replicates defined in the config file). I then run majiq psi and voila psi on each sample independently. This results in 215512 LSVs being quantified in at least one of these 24 samples. Many of these LSVs are not quantified in very many samples however, for example 8029 LSVs are only quantified in 1/24 samples.

My interpretation of this is that during the build step, 8029 LSVs are defined which only pass the min-reads, minpos etc. filters in one sample. I would expect each defined LSV to pass these filters in at least 6 samples (50% of experiments in the defined experimental groups). Am I missing something/misunderstood? 

Your interpretation is right, but keep in mind that the filters for the builder and for psi/dpsi (the quantifier) are different. The latter are stricter.

By default, we treat the build filter as a means of trusting that the LSV is there, but some LSVs may still not be covered well enough for proper quantification.
The default filters are:
Builder: 3 reads in 2 different positions
Quantifier: 10 reads in 3 positions

Remember that these filters apply to at least one junction in the LSV. That means an LSV with three junctions of 9 reads each won’t pass the quantifier filter, even though it is more than enough for the builder.
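
A minimal sketch of the rule described in this reply, assuming each junction is summarised as a pair of (total reads, number of distinct start positions with reads) and that a single junction has to meet both thresholds by itself. The `lsv_passes` helper and this data layout are illustrative assumptions, not MAJIQ's internal implementation; only the default thresholds come from the thread.

```python
BUILDER = {"min_reads": 3, "min_pos": 2}      # builder defaults quoted in this reply
QUANTIFIER = {"min_reads": 10, "min_pos": 3}  # quantifier defaults quoted in this reply


def lsv_passes(junctions, min_reads, min_pos):
    """True if at least one junction meets both thresholds on its own.

    junctions is a list of (reads, positions) tuples, one per junction.
    Reads are never summed across junctions.
    """
    return any(r >= min_reads and p >= min_pos for r, p in junctions)


# Three junctions with 9 reads each; 4 start positions per junction is assumed.
lsv = [(9, 4), (9, 4), (9, 4)]
print(lsv_passes(lsv, **BUILDER))     # True
print(lsv_passes(lsv, **QUANTIFIER))  # False
```

The LSV has 27 reads in total, yet it fails the quantifier filter because no single junction reaches 10 reads.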

I have further questions regarding minpos and minreads:
What is a "start position" with regards to -minpos?
Is minreads a simple threshold where all reads mapping to any location within an LSV are summed and the total must be >= minreads?



Thanks

Jordi

Jordi Vaquero

Oct 22, 2018, 10:50:59 AM
to majiq_voila, Rupert Hugh-white
Sorry, I didn’t answer the second part.

I have further questions regarding minpos and minreads:
What is a "start position" with regards to -minpos?
Is minreads a simple threshold where all reads mapping to any location within an LSV are summed and the total must be >= minreads?

The start position is the coordinate where a read starts. It is a way to detect biases based on chromosomal position: min_pos checks for LSVs where all the reads are stacked at the same chromosomal location, which would hint at a possible artifact.

I think I answered the second part previously, but min_reads is a filter that checks that at least one junction is covered by a total number of reads >= minreads. (Note that it is not the sum of reads in an LSV; it is the sum of reads in one of the junctions.)
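
To illustrate the two quantities, the sketch below derives them from a list of read start coordinates for a single junction. The coordinates and the `junction_support` helper are made up for illustration, and MAJIQ's actual position bookkeeping may differ.

```python
from collections import Counter


def junction_support(read_starts):
    """Return (total reads, number of distinct start positions)."""
    per_position = Counter(read_starts)
    return sum(per_position.values()), len(per_position)


# 12 reads all starting at the same coordinate: a possible stacking artifact.
stacked = [1000] * 12
# 12 reads starting at four different coordinates.
spread = [1000] * 3 + [1004] * 3 + [1011] * 3 + [1020] * 3

for name, starts in (("stacked", stacked), ("spread", spread)):
    reads, positions = junction_support(starts)
    ok = reads >= 10 and positions >= 3  # quantifier defaults from this thread
    print(name, reads, positions, ok)
# stacked 12 1 False
# spread 12 4 True
```

Both junctions have the same read count, so only the position check separates them.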

Thanks

Jordi

Rupert Hugh-white

Oct 23, 2018, 4:23:35 AM
to majiq_voila
Thanks - very helpful!

Elsa Claude

Feb 15, 2021, 11:50:57 AM
to majiq_voila
Hello,

Sorry to re-open this discussion but I think it would be too much to open another one since I just need a little more explanation about what Jordi was answering.

According to the documentation, the --min-experiments filter applies to each GROUP individually, not to the whole set of experiments, if I understand correctly.
So, for example, if I have two conditions with 3 replicates each, does it mean that a junction is kept if it meets the thresholds (such as minreads and minpos) in 2 (or 1?) of the replicates in both groups?
And if I want the junction to be validated in ALL the replicates within one experimental condition (is that the same as a group?), I would put 1 instead of 0.5, no? But according to the documentation the fraction can't be exactly equal to 1, so how can we achieve this?
Note: I can't use a fixed value such as 3, because I have included majiq in a pipeline, so the user can run it on 1, 2, 3, 4...n replicates and should not have to modify the scripts.

Thanks for your time,
Have a nice day !

Elsa

Matthew Gazzara

Feb 24, 2021, 1:22:46 PM
to majiq_voila
Hi Elsa,

Let me try and help clear up your questions:

"So for example, if I have two conditions with 3 replicates in each does it mean that in both group, a junction meeting the thresholds such as minreads and minpos, in 2 (or 1 ?) of the replicates it will be kept ?"
By default the junction / LSV would have to be present in 50% of the samples in a group. This is rounded up so in your case of 3 samples in a group it would need to be in 2 out of 3 of the samples in any one group to make it through the filters at the build stage.

"And if I want the junction to be validated in ALL the replicates within one experimental condition (also means group ?) I would put 1 instead of 0.5 no ? But in the documentation the number in fraction can't be strictly equal to 1 so how can we achieve this ?"
Yes, here an experimental condition would be a group that you define in the configuration file. Currently, a float value less than one is interpreted as a fraction of the samples in any one group. If you put 1 (or even 1.0), MAJIQ will interpret that as only needing to be in one sample in a group.

We are going to update this parameter in a future release of MAJIQ. In the meantime, the workaround to accomplish what you want is to set --min-experiments to the maximum number of samples you might ever have in a group when you run your pipeline. If this integer is larger than the number of samples in a given group, MAJIQ build should just use the group size instead.
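
The interpretation described in this reply can be sketched as a small function. It is a reading of the behaviour as explained here (fractions below 1 rounded up, larger values capped at the group size), not MAJIQ's actual code, and the `required_samples` name is hypothetical.

```python
import math


def required_samples(min_experiments, group_size):
    """Number of samples in a group that must pass the filters.

    Values below 1 are treated as a fraction of the group, rounded up.
    Values of 1 or more are treated as a sample count, capped at the
    group size.
    """
    if min_experiments < 1:
        return math.ceil(min_experiments * group_size)
    return min(int(min_experiments), group_size)


print(required_samples(0.5, 3))   # 2
print(required_samples(0.5, 12))  # 6
print(required_samples(1.0, 3))   # 1
print(required_samples(100, 3))   # 3
```

The last call shows the workaround: an integer larger than any expected group size ends up requiring every sample in the group.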

Sorry for the inconvenience and let us know if you have more questions!

-Matt 

Elsa Claude

Feb 25, 2021, 5:41:40 AM
to majiq_voila
Hi Matthew,

Thanks a lot for the answer! That is much clearer now.
I'll find a way to automatically give the number of replicates to the majiq command.
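
One possible way to do that is sketched below, assuming the build configuration lists each group in an `[experiments]` section as `group=sample1,sample2,...`. The group names, sample names, and the `largest_group_size` helper are hypothetical, and other sections of the configuration are omitted.

```python
import configparser

# Hypothetical configuration fragment; group and sample names are made up.
CONFIG_TEXT = """
[experiments]
control=ctrl_rep1,ctrl_rep2,ctrl_rep3
treated=trt_rep1,trt_rep2
"""


def largest_group_size(config_text):
    """Return the size of the largest group in the [experiments] section."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep group names case-sensitive
    parser.read_string(config_text)
    sizes = [
        len([s for s in samples.split(",") if s.strip()])
        for samples in parser["experiments"].values()
    ]
    return max(sizes)


n = largest_group_size(CONFIG_TEXT)
extra_args = ["--min-experiments", str(n)]  # to append to the majiq build call
print(extra_args)  # ['--min-experiments', '3']
```

Because the value is read from the configuration at run time, users can change the number of replicates without editing the pipeline scripts.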

And I'm happy to know that this feature will be updated in a future release.
Thanks again for such a good program !

Have a nice day,

Elsa

Elsa Claude

Mar 30, 2021, 12:44:18 PM
to majiq_voila
Hello,

I thought I understood the answer, but looking at some of my results now, I may have misunderstood one detail.
Regarding the --min-experiments parameter: if I put 0.5, for example, does it mean that the LSV should be validated in at least ONE of the two conditions (I am working with two conditions)? Or does it HAVE to be validated in BOTH conditions?
I need my LSVs to be validated in at least one of the two conditions, not necessarily in both. Is that possible?

Thanks again,
Have a nice day,

Elsa

Joseph Aicher

Mar 30, 2021, 12:48:43 PM
to Elsa Claude, majiq_voila
Majiq build: at least one build group. Majiq deltapsi: both quantification groups.
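
As a rough sketch of the difference, assuming each replicate has already been reduced to a pass/fail flag and that a group passes when enough of its replicates do: the function names and the example flags are hypothetical. In practice the build and quantification steps use different read thresholds, so the flags would differ between the two steps. The same flags are reused here only to keep the example short.

```python
def group_passes(replicate_flags, required):
    """True if at least `required` replicates in the group pass the filters."""
    return sum(replicate_flags) >= required


def passes_build(groups, required):
    """Build keeps an LSV if any one group has enough passing replicates."""
    return any(group_passes(flags, required) for flags in groups.values())


def passes_deltapsi(groups, required):
    """deltapsi quantifies an LSV only if both groups have enough."""
    return all(group_passes(flags, required) for flags in groups.values())


groups = {
    "condition_A": [True, False, False],  # 1 of 3 replicates passes
    "condition_B": [True, True, True],    # 3 of 3 replicates pass
}
print(passes_build(groups, required=2))     # True
print(passes_deltapsi(groups, required=2))  # False
```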

Best,
Joseph

Elsa Claude

Mar 30, 2021, 1:00:19 PM
to majiq_voila
Thanks a lot.

In my lab, a previous bioinformatician was using Whippet. He found a skipped exon for which the corresponding junction has a high dpsi (-0.89), and when I look at the reads per junction, the 3 replicates of one condition have 0 reads for the junction, while the other condition has a mean of around 30 reads.
The lab considers this event one of the most important in our study because they validated it in the wet lab, so I would really like to find it with MAJIQ. But from what I understand, that is not possible, is it? The junction would not reach the minimum number of reads in any replicate of one condition. :/
And if I set a minimum of 0 reads, I imagine I will get a lot of random junctions...

Thanks again for the quick answer,
Best,

Elsa

Elsa Claude

Mar 31, 2021, 5:33:17 AM
to majiq_voila
Whoops, actually this particular event has 11 reads / 3 reads / 7 reads in the 3 replicates of one condition and more than 10 in the 3 replicates of my other condition.
My threshold for validation is 10 reads, so I "should" set --min-experiments = 1 to be able to find it with MAJIQ (normally). But if I do that, I will also get other, less reliable events, no?

Sorry for the multiple comments :/
Do you want me to open a new topic to talk about that specific parameter?