hwe error message

morten....@gmail.com

Dec 8, 2015, 2:17:12 PM
to plink2-users
Hi,

I am performing QC on a GWAS dataset with 920 individuals, of which 57 are cases; the number of SNPs is around 1.5M. I keep getting the warning:

"Warning: --hwe observation counts vary by more than 10%.  Consider using
--geno, and/or applying different p-value thresholds to distinct subsets of
your data."

This happens when I run --hwe 5e-6. I have already applied --geno 0.01 prior to this, and I have also tried the include-nonctrl modifier, but I keep getting exactly the same warning, which seems odd. Can anyone explain why the warning remains, and what it means?

Thanks,

Christopher Chang

Dec 8, 2015, 2:19:03 PM
to plink2-users
Hmm, that's odd.  Can you post or send me the full log file from your run?

morten....@gmail.com

Dec 12, 2015, 8:09:57 AM
to plink2-users
Hi, sorry about the delay - here's my log file:

PLINK v1.90b3v 64-bit (15 Jul 2015)
Options in effect:
  --bfile TE9_2
  --hwe 5e-6 include-nonctrl
  --maf 0.01
  --make-bed
  --out TE9

Hostname: potentia
Working directory: /home/potentia/FHV/tulstrup/TE_QC
Start time: Thu Nov 19 10:57:45 2015

Random number seed: 1447927065
129146 MB RAM detected; reserving 64573 MB for main workspace.
2460056 variants loaded from .bim file.
928 people (508 males, 420 females) loaded from .fam.
928 phenotype values loaded from .fam.
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 928 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Warning: 64 het. haploid genotypes present (see TE9.hh ); many commands
treat these as missing.
Total genotyping rate is 0.998034.

Warning: --hwe observation counts vary by more than 10%.  Consider using
--geno, and/or applying different p-value thresholds to distinct subsets of
your data.
--hwe: 928 variants removed due to Hardy-Weinberg exact test.
963346 variants removed due to minor allele threshold(s)
(--maf/--max-maf/--mac/--max-mac).
1495782 variants and 928 people pass filters and QC.
Among remaining phenotypes, 57 are cases and 871 are controls.
--make-bed to TE9.bed + TE9.bim + TE9.fam ... done.

End time: Thu Nov 19 10:57:49 2015

Does this help?

Christopher Chang

Dec 12, 2015, 12:34:07 PM
to plink2-users
My guess is that this warning will disappear if you exclude the X chromosome (with e.g. "--not-chr x").  On the X chromosome, male observations do not count.  (I will clarify the error message to mention this when it's relevant.)

When there are fewer observations, p-values will tend to be less extreme when HWE is violated, so it makes sense to use a higher --hwe threshold.

(If the warning does *not* go away after using "--not-chr x", let me know; in that case, I may need you to send me a small test dataset so I can reproduce the issue on my end and nail down what's going on.)
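
To illustrate the point about fewer observations, here is a minimal Python sketch. It uses the asymptotic 1-df chi-square test rather than the exact test PLINK runs, and the genotype proportions (30% / 40% / 30%) are made up for illustration; only the counts 928 (all samples) and 420 (the females, who are the ones counted on chrX) come from the log above.

```python
import math

def hwe_chisq_pvalue(n_aa, n_ab, n_bb):
    """Asymptotic 1-df chi-square HWE p-value (an approximation, not PLINK's exact test)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chisq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chisq / 2.0))

def scaled_counts(n, frac_aa=0.30, frac_bb=0.30):
    """Genotype counts for n observations at fixed (hypothetical) genotype proportions."""
    n_aa = round(n * frac_aa)
    n_bb = round(n * frac_bb)
    return n_aa, n - n_aa - n_bb, n_bb

# Same relative deviation from HWE, different numbers of observations.
p_autosome = hwe_chisq_pvalue(*scaled_counts(928))  # everyone counted
p_chrx = hwe_chisq_pvalue(*scaled_counts(420))      # females only
print(f"n=928: p={p_autosome:.3g}")
print(f"n=420: p={p_chrx:.3g}")
```

Under these assumptions, the same relative deviation falls below 5e-6 with 928 observations but stays above it with 420, which is why a looser threshold can make sense for chrX.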

morten....@gmail.com

Dec 12, 2015, 3:39:15 PM
to plink2-users
I tried with --not-chr X, but the message did not disappear. Then I tried some other options:

--chr 1-22 -> no message
--chr 1-22,Y -> warning message
--chr 1-22,X -> warning message
--chr 1-22,XY -> no message

So I guess both the X and Y chromosomes are causing trouble?

Christopher Chang

Dec 13, 2015, 2:14:29 PM
to plink2-users
Thanks for reporting the chrY issue.  The Dec 13 builds should report a more appropriate warning for chrX, and no longer raise a warning at all for chrY.

alexi...@gmail.com

Feb 5, 2018, 9:58:28 PM
to plink2-users
Hi Chris, even when sex chromosomes are not being used, I still get the same --hwe warning:
"PLINK v2.00a1LM 64-bit Intel (3 Feb 2018)      www.cog-genomics.org/plink/2.0/
(C) 2005-2018 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to chr21_plink2_qc.log.
Options in effect:
  --hwe 0.000001
  --maf 0.01
  --make-pgen
  --out chr21_plink2_qc
  --pfile chr21_plink2

Start time: Tue Feb  6 13:15:58 2018
32094 MB RAM detected; reserving 16047 MB for main workspace.
Allocated 676 MB successfully, after larger attempt(s) failed.
Using up to 4 compute threads.
746 samples (307 females, 439 males; 746 founders) loaded from
chr21_plink2.psam.
1099164 variants loaded from chr21_plink2.pvar.
1 binary phenotype loaded (466 cases, 280 controls).
Calculating allele frequencies... done.
Warning: --hwe observation counts vary by more than 10%.  Consider using
--geno, and/or applying different p-value thresholds to distinct subsets of
your data.
--hwe: 1632 variants removed due to Hardy-Weinberg exact test (founders only).
961151 variants removed due to minor allele threshold(s)
(--maf/--max-maf/--mac/--max-mac).
136381 variants remaining after main filters.
Writing chr21_plink2_qc.pgen ... done.
Writing chr21_plink2_qc.pvar ... done.
Writing chr21_plink2_qc.psam ... done"

Any idea why this happened?

Regards,
Alexis

On Monday, December 14, 2015 at 6:14:29 AM UTC+11, Christopher Chang wrote:

Christopher Chang

Feb 6, 2018, 1:31:14 PM
to plink2-users
It's probably because you never filtered out high-missingness variants (--geno).  You shouldn't get that warning after "--geno 0.1".

(There is also an edge case where you filter out some samples after running the --geno filter, and as a consequence some variants that were close to 10% missingness are now slightly over 10%.  If that's the cause, I wouldn't worry about the warning.)
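
The edge case can be shown with a little arithmetic. The numbers below (1000 samples, 99 missing calls at one variant, 20 samples removed afterwards) are hypothetical; the sketch only assumes that --geno 0.1 removes variants whose missing call rate exceeds 0.1.

```python
GENO_THRESHOLD = 0.1  # value passed to --geno in the reply above

def missing_rate(n_missing, n_samples):
    """Fraction of samples with a missing call at one variant."""
    return n_missing / n_samples

# Hypothetical variant: 99 of 1000 samples have a missing call, so it passes --geno 0.1.
before = missing_rate(99, 1000)

# Now 20 samples are removed, all of which had a call at this variant.
# The 99 missing calls remain, but the denominator shrinks.
after = missing_rate(99, 1000 - 20)

print(f"before sample removal: {before:.4f}, passes: {before <= GENO_THRESHOLD}")
print(f"after sample removal:  {after:.4f}, passes: {after <= GENO_THRESHOLD}")
```

A variant like this one ends up slightly above 10% missingness, so its observation count can differ from a fully genotyped variant's by a bit more than 10%, even though --geno was applied earlier.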