
Moccasin: Model matrix & confounding factors


Swethaa NG

Nov 19, 2024, 4:02:04 PM
to Biociphers
Hi,
Thanks to the team for the guidance on using the MOCCASIN code. I have a small technical question that I would like to clarify. I have .majiq files for samples from three batches that need to be corrected:
Controls (3): batch1
Disease (2): batch2
Disease (1): batch3

Since batch1 has the most samples, it would be ideal to adjust batch2 and batch3 to batch1, right? (I am hoping this does not also correct away the biological differences.)

So is it okay, during the model matrix generation step, to remove model A so that the matrix is full rank, with the effects of batch1 captured by the intercept term? Then, for the confounding factor selection, could I just use "--confounding_factors batch2 0 batch3 0"? In the end, the two batches would be made 'batch-less', but they would effectively be adjusted to batch1. I hope I understood this correctly; please correct me if not.

I was also wondering whether the intercept could be left out, with only the confounding factors specified as "batch1 1 batch2 0 batch3 0".

Thank you!
Kind regards,
Swethaa
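
For readers following the thread, the two parameterizations asked about above can be compared with a small NumPy sketch. This is only an illustration of the linear algebra and does not call MOCCASIN; the sample layout (3, 2, and 1 samples in batch1, batch2, and batch3) comes from the question, while the variable names are hypothetical.

```python
import numpy as np

# Six samples: 3 in batch1, 2 in batch2, 1 in batch3 (layout from the question).
batch = np.array([1, 1, 1, 2, 2, 3])

b1 = (batch == 1).astype(float)
b2 = (batch == 2).astype(float)
b3 = (batch == 3).astype(float)
intercept = np.ones(batch.size)

# Option A: intercept + batch2 + batch3, with batch1 as the reference level.
X_intercept = np.column_stack([intercept, b2, b3])
# Option B: no intercept, one indicator column per batch.
X_indicators = np.column_stack([b1, b2, b3])
# Over-parameterized: intercept plus all three indicators (4 columns).
X_over = np.column_stack([intercept, b1, b2, b3])

rank_a = np.linalg.matrix_rank(X_intercept)
rank_b = np.linalg.matrix_rank(X_indicators)
rank_over = np.linalg.matrix_rank(X_over)

# If stacking A and B side by side does not raise the rank above 3,
# both matrices span the same column space.
same_span = np.linalg.matrix_rank(np.hstack([X_intercept, X_indicators])) == 3

print("rank A:", rank_a, "of", X_intercept.shape[1], "columns")
print("rank B:", rank_b, "of", X_indicators.shape[1], "columns")
print("rank over-parameterized:", rank_over, "of", X_over.shape[1], "columns")
print("A and B span the same space:", same_span)
```

Options A and B are both full rank and span the same space, so they differ only in how the coefficients are read; the intercept plus all three indicators is the variant that fails the full-rank requirement.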

Barry Slaff

Nov 19, 2024, 5:24:28 PM
to Swethaa NG, Biociphers
Dear Swethaa, 
Thank you for your question. MOCCASIN simply requires a full-rank model matrix; you can use an intercept column as you propose, or leave it out. Unfortunately, I see a potentially larger problem. Based on your description, the controls are all in one batch, while the disease samples are all in the other batches. (Is this the case? That is my understanding from what you wrote.) If so, there is no way for MOCCASIN to distinguish the batch effect from the control vs. disease signal. Using MOCCASIN would adjust PSI values, but it would treat the batch effect and the disease effect as one joint effect to be corrected. I would not recommend this. With MOCCASIN, it is better for cases and controls to be distributed across the batches.
Barry
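
The confounding described above can be checked numerically. The following is a minimal sketch in NumPy, not MOCCASIN code: the first design uses the batch and condition layout from the question, and the second is a hypothetical balanced layout invented purely for contrast.

```python
import numpy as np

def design(batch, disease):
    """Return [intercept, batch2, batch3] and the same matrix plus a disease column."""
    intercept = np.ones(batch.size)
    b2 = (batch == 2).astype(float)
    b3 = (batch == 3).astype(float)
    X_batch = np.column_stack([intercept, b2, b3])
    X_full = np.column_stack([X_batch, disease])
    return X_batch, X_full

# Layout from the thread: all controls in batch1, all disease samples in batch2/batch3.
batch_conf = np.array([1, 1, 1, 2, 2, 3])
disease_conf = np.array([0, 0, 0, 1, 1, 1], dtype=float)
Xb_conf, Xf_conf = design(batch_conf, disease_conf)
rank_batch_conf = np.linalg.matrix_rank(Xb_conf)
rank_full_conf = np.linalg.matrix_rank(Xf_conf)

# Hypothetical balanced layout (an assumption for illustration only):
# each batch contains one control and one disease sample.
batch_bal = np.array([1, 1, 2, 2, 3, 3])
disease_bal = np.array([0, 1, 0, 1, 0, 1], dtype=float)
Xb_bal, Xf_bal = design(batch_bal, disease_bal)
rank_batch_bal = np.linalg.matrix_rank(Xb_bal)
rank_full_bal = np.linalg.matrix_rank(Xf_bal)

# In the confounded layout the disease column equals batch2 + batch3,
# so adding it does not increase the rank.
disease_is_batch_sum = np.allclose(disease_conf, Xb_conf[:, 1] + Xb_conf[:, 2])

print("confounded: rank", rank_batch_conf, "->", rank_full_conf)
print("balanced:   rank", rank_batch_bal, "->", rank_full_bal)
print("disease == batch2 + batch3 (confounded layout):", disease_is_batch_sum)
```

When the rank does not increase after adding the disease column, disease status is a linear combination of the batch columns, so any model that removes the batch terms also removes the disease signal. In the balanced layout the rank goes up by one, which is what allows the two effects to be separated.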


Swethaa NG

Nov 19, 2024, 6:25:19 PM
to Biociphers
Dear Barry,

Thank you for your prompt reply. It is highly appreciated!
Yes, you are right. I had the same concern, and it is now clear that the adjustment would conflate the two aspects: batch effects and biological differences.
I will check the sample availability again and proceed. Thanks again!

Kind regards,
Swethaa