I would like to filter out SNPs as follows:
1. with a minor allele frequency (MAF) < 0.01,
2. with a Hardy-Weinberg equilibrium (HWE) test p-value < 10^-7,
3. with a proportion of missingness (Pm) > 0.05,
4. with an imputation information score < 0.8,
5. that are duplicated SNPs.
I am using UK Biobank data and am new to PLINK. Can I achieve steps 1-3 as follows?
plink2 --bgen ukb_imp_chr1_v3.bgen ref-first --sample ukb22828_c1_b0_v3_s487253.sample --maf .01 --hwe 1e-7 --mind .05 --make-bpgen --out plink_outputs/chr1
How can I achieve steps 4 and 5? Can this be done in the same command or is a separate command needed?
Thank you kindly in advance,
Maria
--extract-col-cond ukb_mfi_chr1_v3.txt 7 1 --extract-col-cond-min .8
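For reference, steps 1-5 could probably be combined into a single plink2 call along the lines of the sketch below. Treat it as a starting point under stated assumptions, not a tested pipeline. It swaps `--mind` for `--geno`, because `--mind` drops samples with high missingness while `--geno` drops variants. It also assumes that "duplicated SNP" means a repeated variant ID, which is what `--rm-dup` acts on, and that columns 7 and 1 of the mfi file hold the info score and the variant ID as in the fragment above; check this against your own file. The function name `plink2_cmd` is made up for illustration, and it only prints the command so that it can be reviewed before running.

```shell
#!/bin/sh
# Dry-run sketch: print the plink2 command instead of executing it.
# File names are taken from the question; plink2_cmd is a hypothetical helper.
plink2_cmd() {
  echo plink2 \
    --bgen ukb_imp_chr1_v3.bgen ref-first \
    --sample ukb22828_c1_b0_v3_s487253.sample \
    --maf 0.01 \
    --hwe 1e-7 \
    --geno 0.05 \
    --extract-col-cond ukb_mfi_chr1_v3.txt 7 1 \
    --extract-col-cond-min 0.8 \
    --rm-dup exclude-all \
    --make-bpgen \
    --out plink_outputs/chr1
}

# Show the assembled command for inspection.
plink2_cmd
```

Once the printed command looks right, it can be pasted into the terminal as is. `--rm-dup exclude-all` removes every member of a duplicate-ID group; another mode such as `force-first` may suit you better if you want to keep one copy.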