counting number of individuals that possess a mutation in 2 sites

Caterina

Oct 18, 2023, 3:51:06 AM
to plink2-users
Hi, I wanted to do something like this:

../plink --bfile sample --extract mutations.txt --recode A --out filtered_sample

where mutations.txt lists just 2 SNPs.
I would like to obtain the number of individuals with the 2 mutations at the same time, but would also like to know if they're homozygous or heterozygous.
Unfortunately I get this error:
Error: --recode does not support multipass recoding of very large files.

I imagine this is because it outputs the data for each individual. Is there any way I can just get the count or frequency instead, but with the homozygous or heterozygous information?

Thank you!




Christopher Chang

Oct 18, 2023, 8:04:17 AM
to plink2-users
1. Please post a full .log file, rather than just the error message, when asking for troubleshooting help.
2. plink 1.9 --freqx and plink 2.0 --geno-counts provide this information.

Caterina

Oct 19, 2023, 11:45:51 AM
to plink2-users
All right. Thank you!
And how could I measure the number of people that have 2 mutations at the same time?

Caterina

Oct 19, 2023, 12:54:42 PM
to plink2-users
Never mind, I guess it would just be a matter of multiplying the mutation frequencies at SNP1 and SNP2.

Christopher Chang

Oct 19, 2023, 1:00:48 PM
to plink2-users
Actually, it would not, especially if the two genomic positions are very close to each other: nearby variants are often in linkage disequilibrium, so their genotypes are not independent.

plink 1.x's --twolocus command provides one simple way to get these SNP1 x SNP2 counts.
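
To see why the product of the two frequencies can mislead, here is a rough Python sketch that tabulates joint genotypes directly. It assumes a .raw-style table of the kind `--recode A` writes (six leading columns, then one 0/1/2/NA allele-count column per variant). The sample IDs, the column names `rs1_A` and `rs2_G`, and all genotypes are invented toy data, and this is only an illustration of the counting idea, not what --twolocus does internally.

```python
from collections import Counter
import io

# Toy stand-in for a PLINK .raw file (--recode A). All IDs and genotypes
# below are invented for illustration.
RAW_TEXT = """FID IID PAT MAT SEX PHENOTYPE rs1_A rs2_G
F1 I1 0 0 1 -9 2 2
F2 I2 0 0 2 -9 1 1
F3 I3 0 0 1 -9 1 1
F4 I4 0 0 2 -9 0 0
F5 I5 0 0 1 -9 0 0
F6 I6 0 0 2 -9 0 0
F7 I7 0 0 1 -9 1 0
F8 I8 0 0 2 -9 0 NA
"""


def joint_genotype_counts(handle, col1, col2):
    """Count (copies at SNP1, copies at SNP2) pairs, skipping missing calls."""
    header = handle.readline().split()
    i, j = header.index(col1), header.index(col2)
    counts = Counter()
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        g1, g2 = fields[i], fields[j]
        if "NA" in (g1, g2):
            continue  # a missing genotype at either SNP drops the sample
        counts[(int(g1), int(g2))] += 1
    return counts


counts = joint_genotype_counts(io.StringIO(RAW_TEXT), "rs1_A", "rs2_G")
n = sum(counts.values())

# Carriers have at least one copy of the counted allele (het or hom).
carriers1 = sum(c for (g1, _), c in counts.items() if g1 > 0)
carriers2 = sum(c for (_, g2), c in counts.items() if g2 > 0)
both = sum(c for (g1, g2), c in counts.items() if g1 > 0 and g2 > 0)

# What multiplying the two carrier frequencies would predict.
expected_if_independent = carriers1 * carriers2 / n

print(f"observed carriers of both: {both} of {n}")
print(f"expected under independence: {expected_if_independent:.2f}")
for (g1, g2), c in sorted(counts.items()):
    print(f"SNP1 copies={g1}, SNP2 copies={g2}: {c}")
```

In this toy table the two SNPs mostly travel together, so the observed number of double carriers is well above what the product of the marginal frequencies would suggest. The per-cell counts also keep the heterozygous (1 copy) versus homozygous (2 copies) distinction.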

Caterina

Oct 19, 2023, 1:07:59 PM
to plink2-users
I see, thanks again! And how could I count SNP1 x NOT_SNP2, i.e. the frequency of people with SNP1 but NOT SNP2? Or would --twolocus give me that information as well?

Christopher Chang

Oct 19, 2023, 1:12:59 PM
to plink2-users
--twolocus would also give you that information.
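
As a hedged illustration of how "SNP1 but not SNP2" falls out of a joint genotype table, the sketch below sums the relevant cells of a 3x3 count matrix. The numbers are hypothetical and the layout (rows are copies at SNP1, columns are copies at SNP2) is an assumption made for this example, so check it against the actual report before reusing the indices.

```python
# Hypothetical 3x3 joint genotype counts (invented numbers).
# Rows: copies of the counted allele at SNP1 (0, 1, 2).
# Columns: copies of the counted allele at SNP2 (0, 1, 2).
table = [
    [90, 6, 1],
    [8, 12, 2],
    [1, 3, 4],
]


def carrier_summary(table):
    """Collapse the 3x3 table into carrier / non-carrier categories."""
    carrier_rows = (1, 2)  # het or hom at SNP1
    carrier_cols = (1, 2)  # het or hom at SNP2
    return {
        "both": sum(table[i][j] for i in carrier_rows for j in carrier_cols),
        "snp1_not_snp2": sum(table[i][0] for i in carrier_rows),
        "snp2_not_snp1": sum(table[0][j] for j in carrier_cols),
        "neither": table[0][0],
        # The zero-copy column at SNP2 still separates het from hom at SNP1.
        "snp1_het_not_snp2": table[1][0],
        "snp1_hom_not_snp2": table[2][0],
    }


summary = carrier_summary(table)
total = sum(sum(row) for row in table)

for name, count in summary.items():
    print(f"{name}: {count} ({count / total:.3f})")
```

The four carrier categories (both, SNP1 only, SNP2 only, neither) partition the samples, so they should add up to the table total.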

Caterina

Oct 19, 2023, 1:34:24 PM
to plink2-users
What if I don't have the variant ID, only the position? What would the syntax be? Or is it not possible to do something like that?

Christopher Chang

Oct 19, 2023, 1:35:56 PM
to plink2-users
In that case, you can first use plink 2.0's --set-all-var-ids flag to change all the IDs to be position/allele-based.
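
To make the renaming concrete, here is a small Python sketch of what position-based IDs amount to in a .bim file, where column 2 holds the variant ID. PLINK 2.0 does this itself through the template string given to --set-all-var-ids, in which `@` stands for the chromosome code and `#` for the base-pair position. The helper name `position_based_ids` and the two .bim lines are hypothetical, and only a chromosome:position scheme is shown. The real flag also has allele tokens, which matter when several variants share a position.

```python
def position_based_ids(bim_lines, template="{chrom}:{pos}"):
    """Return .bim lines with column 2 replaced by a chromosome:position ID.

    Illustration only: this mimics the effect of a '@:#'-style template,
    it is not PLINK's implementation.
    """
    renamed = []
    for line in bim_lines:
        # .bim columns: chromosome, variant ID, cM position, bp position,
        # allele 1, allele 2
        chrom, _old_id, cm, pos, a1, a2 = line.split()
        new_id = template.format(chrom=chrom, pos=pos)
        renamed.append("\t".join([chrom, new_id, cm, pos, a1, a2]))
    return renamed


# Invented example lines: one variant without an ID, one with an rsID.
bim = [
    "1\t.\t0\t12345\tA\tG",
    "2\trs999\t0\t67890\tT\tC",
]

for line in position_based_ids(bim):
    print(line)
```

Once the IDs look like `1:12345`, a variant known only by its position can be named directly wherever a variant ID is expected.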

Caterina

Oct 19, 2023, 3:50:43 PM
to plink2-users
Thank you! Last question, I promise. What do I do if my loci are on 2 different chromosomes and my PLINK files (.bim, .fam, .bed) are separated by chromosome? Do I have to merge them or is there another way?

Christopher Chang

Oct 19, 2023, 6:16:01 PM
to plink2-users
You have to merge them to use --twolocus (or extract just those SNPs first and merge them, etc.).