PLINK 1.9-Heterozygous haploid warning-solution

Prachi Chopra

Apr 27, 2014, 3:46:48 PM
to plink2...@googlegroups.com

I tried PLINK 1.9 to resolve the heterozygous haploid warning, using the following command:

'plink --file unclean_fileset --split-x b37 --make-bed --out clean_fileset'

Earlier I was getting 103317 heterozygous haploid genotypes; that has now decreased to 103205.

But I am still getting some warnings and an error, which I have pasted below:

Warning: 103205 het. haploid genotypes present (see HbF_hh_clean.hh).
Warning: Nonmissing nonmale Y chromosome genotype(s) present.
Total genotyping rate is 0.996871.
894327 variants and 254 people pass filters and QC.
Error: --split-x cannot be used when the dataset already contains an XY region.
(Did you mean --merge-x instead?)

Then I used "--merge-x" to address the warning "Nonmissing nonmale Y chromosome genotype(s) present".

It changed 646 chromosome codes, but the number of heterozygous haploid genotypes is still the same: 103205.

Kindly help me out.

Thanks a lot

Christopher Chang

Apr 27, 2014, 6:56:10 PM
to plink2...@googlegroups.com
If an XY region is already present, you shouldn't use either --split-x or --merge-x here.

Do you have sex information for all samples?  If you don't, you'll need to use --check-sex/--impute-sex to fill in the blanks.

If heterozygous haploid calls still remain, the most likely cause is nonmissing female genotype calls on the Y chromosome; others have reported that this is fairly common.  A quick way to check the number of these is to just load the Y chromosome with e.g. "plink --bfile semi_clean_fileset --chr 24 --freq".  If all the heterozygous haploid errors are on the Y chromosome, you can safely clobber them with --make-bed + --set-hh-missing.  (If some are on the X, --set-hh-missing *might* still be okay, but I'd need to know more about the data source and the --check-sex report to be sure.)
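
To see where the heterozygous haploid calls sit, the .hh file can be joined against the .bim file. The following is only a minimal sketch, assuming the .hh file has three whitespace-separated columns (family ID, within-family ID, variant ID) and the .bim file starts with chromosome code and variant ID; the function name `hh_by_chrom` is made up for illustration and is not part of PLINK.

```shell
# Count het. haploid calls per chromosome code.
# Usage: hh_by_chrom <file.hh> <file.bim>
# Assumption: .hh columns are FID, IID, variant ID; .bim columns 1-2 are chromosome, variant ID.
hh_by_chrom() {
  hh_file=$1
  bim_file=$2
  # First pass (.bim): remember each variant's chromosome code.
  # Second pass (.hh): tally calls by the chromosome of the variant in column 3.
  awk 'NR == FNR { chrom[$2] = $1; next }
       { count[($3 in chrom) ? chrom[$3] : "unknown"]++ }
       END { for (c in count) print c, count[c] }' "$bim_file" "$hh_file" | sort
}
```

For example, `hh_by_chrom semi_clean_fileset.hh semi_clean_fileset.bim` (file names here are placeholders) prints one line per chromosome code with the number of calls on it. If every line is for the Y chromosome (24 or Y, depending on how the .bim is coded), the --set-hh-missing route above applies.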

Vincent Laufer

Aug 14, 2016, 5:45:59 PM
to plink2-users
Chris - I think what would help me most is to understand the underlying issue about this error and why it exists. Is this described somewhere?

Thank you.

Christopher Chang

Aug 15, 2016, 12:16:14 PM
to plink2-users
This is discussed under the "Warnings" section of https://www.cog-genomics.org/plink2/errors , but yes, it's hard to know to look there.  I will try to (finally) put together a good one-stop FAQ within the next month.

kitty

May 19, 2017, 10:27:53 AM
to plink2-users
Hi Chris,

I wonder if you could help with this please.

I got both warnings, "1681151 het. haploid genotypes present" and "Nonmissing nonmale Y chromosome genotype(s) present", when I ran --check-sex. Here are the commands I used:

plink --bfile file --chr 23-24 --make-bed --out myfile_unsplit
plink --bfile myfile_unsplit --split-x b37 --make-bed --out split_myfile
plink --bfile split_myfile --indep-pairphase 20000 2000 0.5 --chr 23-24
plink --bfile split_myfile --extract plink.prune.in --make-bed --out split_myfile_pruned
plink --bfile split_myfile_pruned --check-sex

The output is very strange: I have ~2000 samples, but half of them had a sex mismatch according to the output. I then checked the plink.hh file using the procedure you recommended earlier in this thread ("plink --bfile semi_clean_fileset --chr 24 --freq"). It seems all the errors are on Y, so I clobbered them with:
plink --bfile split_myfile_pruned --set-hh-missing --make-bed --out split_myfile_pruned_set-hh-missing

This improved the --check-sex results: the males wrongly assigned as "female" are now all correct, but ~500 female samples are still assigned as "male" (1) with large F (> 0.97). In your opinion, what is the problem likely to be?

Many thanks,

Kitty
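
One way to break down the remaining mismatches is to tabulate the PROBLEM rows of the --check-sex report by recorded sex and imputed sex. This is only a rough sketch, assuming the .sexcheck file has a header line followed by the columns FID, IID, PEDSEX, SNPSEX, STATUS, F; the function name `sexcheck_summary` is hypothetical.

```shell
# Tabulate PROBLEM rows of a .sexcheck report as "PEDSEX SNPSEX count".
# Usage: sexcheck_summary <file.sexcheck>
# Assumption: columns are FID IID PEDSEX SNPSEX STATUS F, with one header line.
sexcheck_summary() {
  # Skip the header, keep rows flagged PROBLEM, and count each PEDSEX/SNPSEX pair.
  awk 'NR > 1 && $5 == "PROBLEM" { n[$3 " " $4]++ }
       END { for (k in n) print k, n[k] }' "$1" | sort
}
```

Running `sexcheck_summary plink.sexcheck` (the default output name when no --out is given) would show, for instance, how many samples recorded as female (2) were imputed as male (1), which makes it easier to see whether the mismatches all go in one direction.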

