I am trying to calculate Weir and Cockerham’s FST across the five 1000 Genomes continental superpopulations using PLINK 1.9, with the following command:
plink --bfile chr22_1KGP --fst --within ALLpops.txt --out chr22_FST
(The chr22_1KGP files were previously converted from a 1000 Genomes Project VCF file that had been filtered down to only contain biallelic SNPs and a subset of unrelated individuals).
However, I am having trouble with the --within flag. There should be five continental groups and 2460 individuals, but I keep getting the following output:
2460 people (0 males, 0 females, 2460 ambiguous) loaded from .fam.
--within: 4 clusters loaded, covering a total of 801 people.
Before main variant filters, 2460 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is exactly 1.
131702 variants and 2460 people pass filters and QC.
Note: No phenotypes present.
Writing --fst report (4 populations) to chr22_FST.fst ... done.
129931 markers with valid Fst estimates (1771 excluded).
If it helps, the file “ALLpops.txt” used by the --within flag was created in R and contains 3 columns: FID, IID, and cluster (continental group).
I have checked that all 2460 individuals are there and separated into five groups. I also tried setting all FID values to 0 instead of matching the IID but got the same results. Additionally, I did not get any further using plink 2.0.
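One way to double-check the ID matching outside of R is a small script like the one below. This is only a sketch: it assumes the cluster file is whitespace-delimited with FID, IID and cluster name in the first three columns, and that a sample only counts when its FID and IID match the .fam file character for character. The function name check_within and the toy sample IDs are made up for illustration.

```python
from collections import Counter


def check_within(fam_lines, within_lines):
    """Compare (FID, IID) pairs in a .fam file against a --within cluster file.

    Returns (matched, unmatched, malformed):
      matched   - Counter of cluster name -> number of samples found in the .fam
      unmatched - list of (FID, IID) pairs absent from the .fam
      malformed - list of (line number, text) for lines without exactly 3 fields
    """
    # Collect every (FID, IID) pair present in the .fam file.
    fam_ids = set()
    for line in fam_lines:
        fields = line.split()
        if len(fields) >= 2:
            fam_ids.add((fields[0], fields[1]))

    matched = Counter()
    unmatched = []
    malformed = []
    for lineno, line in enumerate(within_lines, start=1):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            # e.g. a cluster name containing a space, or an extra row-name column
            malformed.append((lineno, line.rstrip("\n")))
            continue
        fid, iid, cluster = fields
        if (fid, iid) in fam_ids:
            matched[cluster] += 1
        else:
            # e.g. quoted IDs, a header row, or a FID that differs from the .fam
            unmatched.append((fid, iid))
    return matched, unmatched, malformed


# Toy data (hypothetical sample IDs) showing typical mismatches.
fam_demo = ["S1 S1 0 0 0 -9\n", "S2 S2 0 0 0 -9\n"]
within_demo = ["FID IID cluster\n", "S1 S1 EUR\n", '"S2" "S2" "AFR"\n', "S3 S3\n"]
matched, unmatched, malformed = check_within(fam_demo, within_demo)
print("matched per cluster:", dict(matched))
print("unmatched IDs:", unmatched)
print("malformed lines:", malformed)
```

To run it on the real data, the two arguments can be open file handles for chr22_1KGP.fam and ALLpops.txt instead of the toy lists. If the per-cluster counts add up to 801 rather than 2460, the unmatched and malformed lists should show which rows are not lining up with the .fam file.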
What would you advise me to do?
Thank you in advance,
Lynne