Hi,
I calculated minor allele frequencies (MAFs) with PLINK 1.90 and PLINK 2.0, using the same input file and the same filters. The .log files and the output files are pasted below:
PLINK 1.90
-bash-4.2$ cat KL_gen_imp/snp_QC/called.25.1.9.log
PLINK v1.90p 64-bit (16 Apr 2016)
Options in effect:
--bfile /groupvol/med-bio/******/****/genotypes/***_cal_chr13_v2
--chr 13
--freq
--from-bp 33590571
--out /groupvol/med-bio/******/****/KL_gen_imp/snp_QC/called.25.1.9
--to-bp 33640282
Hostname: login-3-internal
Working directory: /groupvol/med-bio/******/****
Start time: Wed Jul 25 14:10:41 2018
Random number seed: 1532524241
193488 MB RAM detected; reserving 96744 MB for main workspace.
13 variants loaded from .bim file.
488377 people (223506 males, 264857 females, 14 ambiguous) loaded from .fam.
Ambiguous sex IDs written to
/groupvol/med-bio/******/****/KL_gen_imp/snp_QC/called.25.1.9.nosex .
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 488377 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.995749.
--freq: Allele frequencies (founders only) written to
/groupvol/med-bio/******/****/KL_gen_imp/snp_QC/called.25.1.9.frq .
End time: Wed Jul 25 14:10:41 2018
-bash-4.2$ cat KL_gen_imp/snp_QC/called.25.1.9.frq
CHR SNP A1 A2 MAF NCHROBS
13 rs9315201 T G 0.02868 976052
13 rs385564 G C 0.324 947324
13 rs526906 A G 0.1597 975754
13 rs537313 G A 0.3844 973512
13 rs577912 T G 0.1522 974090
13 rs118136643 T G 0.01641 969912
13 rs554634 C T 0.3086 974458
13 rs17643609 T C 0.01326 975728
13 rs9536314 G T 0.1602 975346
13 rs9527025 C G 0.1603 974732
13 rs141741908 C T 0.0005481 976062
13 Affx-89011680 TG T 4.097e-06 976244
13 rs146235320 A G 0.001433 974604
PLINK 2.0
-bash-4.2$ cat KL_gen_imp/snp_QC/called.25.2.0.log
PLINK v2.00a1LM 64-bit Intel (11 Feb 2018)
Options in effect:
--bfile /groupvol/med-bio/******/****/genotypes/***_cal_chr13_v2
--chr 13
--freq
--from-bp 33590571
--out /groupvol/med-bio/******/****/KL_gen_imp/snp_QC/called.25.2.0
--to-bp 33640282
Hostname: login-3-internal
Working directory: /groupvol/med-bio/******/****/scans/***_KL_1/continuous
Start time: Wed Jul 25 14:04:49 2018
Random number seed: 1532523889
193488 MB RAM detected; reserving 96744 MB for main workspace.
Using up to 24 threads (change this with --threads).
488377 samples (264857 females, 223506 males, 14 ambiguous; 488377 founders)
loaded from /groupvol/med-bio/******/****/genotypes/***_cal_chr13_v2.fam.
26806 variants loaded from
/groupvol/med-bio/******/****/genotypes/***_cal_chr13_v2.bim.
1 categorical phenotype loaded (488377 values).
Calculating allele frequencies... done.
--freq: Allele frequencies (founders only) written to
/groupvol/med-bio/******/****/KL_gen_imp/snp_QC/called.25.2.0.afreq .
End time: Wed Jul 25 14:04:49 2018
-bash-4.2$ cat KL_gen_imp/snp_QC/called.25.2.0.afreq
#CHROM ID REF ALT ALT_FREQS OBS_CT
13 rs9315201 T G 0.971316 976052
13 rs385564 G C 0.676037 947324
13 rs526906 G A 0.159667 975754
13 rs537313 G A 0.615646 973512
13 rs577912 G T 0.15217 974090
13 rs118136643 T G 0.983585 969912
13 rs554634 T C 0.308568 974458
13 rs17643609 T C 0.986738 975728
13 rs9536314 G T 0.839793 975346
13 rs9527025 C G 0.8397 974732
13 rs141741908 C T 0.999452 976062
13 Affx-89011680 TG T 0.999996 976244
13 rs146235320 A G 0.998567 974604
The frequencies reported by PLINK 2.0 (the ALT_FREQS column) don't make sense as MAFs because some of them are greater than 0.5; this does not happen with PLINK 1.90, where every value in the MAF column is below 0.5. I re-ran the command with the latest release, but the result was the same.
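As a quick sanity check (a minimal sketch of my own, not anything PLINK itself does), the Python snippet below folds a few of the ALT_FREQS values from the .afreq output above to min(f, 1 - f) and compares them with the MAF column of the PLINK 1.90 .frq output. The helper names `fold_to_maf` and `matches_plink1` are hypothetical, the numbers are copied from the first four rows of each table, and the tolerance is an assumption to allow for PLINK 1.90 printing only four significant digits.

```python
import math

# ALT_FREQS values copied from the first four rows of the PLINK 2.0 .afreq output.
alt_freqs = {
    "rs9315201": 0.971316,
    "rs385564": 0.676037,
    "rs526906": 0.159667,
    "rs537313": 0.615646,
}

# MAF values copied from the same four rows of the PLINK 1.90 .frq output.
plink1_maf = {
    "rs9315201": 0.02868,
    "rs385564": 0.324,
    "rs526906": 0.1597,
    "rs537313": 0.3844,
}


def fold_to_maf(freq):
    """Fold one allele's frequency to the minor allele frequency (biallelic case)."""
    return min(freq, 1.0 - freq)


def matches_plink1(alt, maf, rel_tol=1e-3):
    """Per variant, check whether the folded ALT frequency agrees with the .frq MAF.

    rel_tol is an assumed tolerance covering the 4-significant-digit rounding
    in the .frq file.
    """
    return {
        snp: math.isclose(fold_to_maf(alt[snp]), maf[snp], rel_tol=rel_tol)
        for snp in alt
    }


result = matches_plink1(alt_freqs, plink1_maf)
for snp, ok in result.items():
    print(snp, round(fold_to_maf(alt_freqs[snp]), 6), plink1_maf[snp], ok)
```

For these four variants the folded values agree with the PLINK 1.90 MAFs, which suggests the two outputs differ in which allele's frequency is reported rather than in the underlying counts.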
PLINK 2.0 (19 Jul 2018 build)
-bash-4.2$ cat KL_gen_imp/snp_QC/called.25.2.0.log
PLINK v2.00a2LM 64-bit Intel (19 Jul 2018)
Options in effect:
--bfile /groupvol/med-bio/******/****/genotypes/***_cal_chr13_v2
--chr 13
--freq
--from-bp 33590571
--out /groupvol/med-bio/******/****/KL_gen_imp/snp_QC/called.25.2.0
--to-bp 33640282
Hostname: login-3-internal
Working directory: /groupvol/med-bio/******
Start time: Wed Jul 25 14:39:19 2018
Random number seed: 1532525959
193488 MiB RAM detected; reserving 96744 MiB for main workspace.
Using up to 24 threads (change this with --threads).
488377 samples (264857 females, 223506 males, 14 ambiguous; 488377 founders)
loaded from /groupvol/med-bio/******/****/genotypes/***_cal_chr13_v2.fam.
26806 variants loaded from
/groupvol/med-bio/******/****/genotypes/***_cal_chr13_v2.bim.
1 categorical phenotype loaded (488377 values).
Calculating allele frequencies... done.
--freq: Allele frequencies (founders only) written to
/groupvol/med-bio/******/****/KL_gen_imp/snp_QC/called.25.2.0.afreq .
End time: Wed Jul 25 14:39:19 2018
-bash-4.2$ cat KL_gen_imp/snp_QC/called.25.2.0.afreq
#CHROM ID REF ALT ALT_FREQS OBS_CT
13 rs9315201 T G 0.971316 976052
13 rs385564 G C 0.676037 947324
13 rs526906 G A 0.159667 975754
13 rs537313 G A 0.615646 973512
13 rs577912 G T 0.15217 974090
13 rs118136643 T G 0.983585 969912
13 rs554634 T C 0.308568 974458
13 rs17643609 T C 0.986738 975728
13 rs9536314 G T 0.839793 975346
13 rs9527025 C G 0.8397 974732
13 rs141741908 C T 0.999452 976062
13 Affx-89011680 TG T 0.999996 976244
13 rs146235320 A G 0.998567 974604
Does anybody know why PLINK 2.0 is reporting frequencies greater than 0.5 here?
Many thanks,
Hasnat