I have just run some simple QC measures on data that has already been cleaned of heterozygous haploid calls, etc.
Running
--geno 0.1
--maf 0.01
--mind 0.1
on the data filters out 97% of my samples (383 of 396 removed).
I don't think the data is so poor that it should fail basic QC this badly.
So I tried running each filter in its own run rather than all in one, starting with --maf, then --geno and finally --mind. I only lost one individual by doing this.
Another version, i.e. running --maf first and then running --geno and --mind together on the new file, gives
Error: All people removed due to missing genotype data (--mind).
I don't quite understand why this happens. Can someone explain what PLINK is actually doing that gives such dramatically different results when the order of the commands changes?
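To make the question concrete, here is a toy sketch in Python of how the order of two missingness filters can change the outcome. It is not PLINK's code. The matrix and the function names (`filter_samples`, `filter_snps`) are made up for illustration, and the only assumption is that a per-sample filter (like --mind) and a per-SNP filter (like --geno) are each computed on whatever data is left at that point. Whether PLINK really evaluates --mind before --geno within a single run is the open question here.

```python
# Toy illustration only: NOT PLINK's implementation.
# Rows are samples, columns are SNPs, None marks a missing genotype call.
MISSING = None

def missing_rate(calls):
    """Fraction of missing calls in a list of genotypes."""
    return sum(c is MISSING for c in calls) / len(calls)

def filter_samples(matrix, max_missing):
    """Keep samples (rows) with missing rate <= max_missing (like --mind)."""
    return [row for row in matrix if missing_rate(row) <= max_missing]

def filter_snps(matrix, max_missing):
    """Keep SNPs (columns) with missing rate <= max_missing (like --geno)."""
    if not matrix:
        return matrix
    keep = [j for j in range(len(matrix[0]))
            if missing_rate([row[j] for row in matrix]) <= max_missing]
    return [[row[j] for j in keep] for row in matrix]

# Hypothetical data: 5 samples x 10 SNPs. The last two SNPs failed in
# four of the five samples; every other call is present.
matrix = [[0] * 8 + [MISSING] * 2 for _ in range(4)] + [[0] * 10]

# Per-sample filter first: the two bad SNPs push four samples over 10%.
samples_first = filter_snps(filter_samples(matrix, 0.1), 0.1)

# Per-SNP filter first: the two bad SNPs are dropped, so no sample fails.
snps_first = filter_samples(filter_snps(matrix, 0.1), 0.1)

print(len(samples_first), len(snps_first))  # 1 5
```

In this made-up case a handful of badly genotyped SNPs is enough to remove most samples when the per-sample filter runs first, while removing those SNPs first leaves every sample intact.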
Thanks
Maryam