Hi, I have an issue. When I ran the --ld step in preparation for --check-sex, to see whether my sample IDs are correctly identified, I got these warnings:
Ambiguous sex ID written to Bed_LD.nosex .
--extract: 379174 variants remaining.
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 667 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Warning: 48318 het. haploid genotypes present (see Bed_LD.hh ); many commands treat these as missing.
Warning: Nonmissing nonmale Y chromosome genotype(s) present; many commands treat these as missing.
Total genotyping rate is 0.993207.
Then, when I ran --check-sex, I still got the .hh file as output. So I ran --split-x to see whether that would resolve the issue, and the output was:
Ambiguous sex ID written to prueba_split.nosex .
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 667 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Warning: 48318 het. haploid genotypes present (see prueba_split.hh ); many commands treat these as missing.
Warning: Nonmissing nonmale Y chromosome genotype(s) present; many commands treat these as missing.
Total genotyping rate is 0.993207.
379174 variants and 667 people pass filters and QC.
Note: No phenotypes present.
Error: --split-x cannot be used when the dataset already contains an XY region.
I asked about this before, and was told it means that --split-x is not needed for my data. So I set F thresholds for my data by running --check-sex 0.6 0.9:
Ambiguous sex ID written to Bed_sexcheck_X.nosex .
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 667 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Warning: 48318 het. haploid genotypes present (see Bed_sexcheck_X.hh ); many commands treat these as missing.
Warning: Nonmissing nonmale Y chromosome genotype(s) present; many commands treat these as missing.
Total genotyping rate is 0.993207.
379174 variants and 667 people pass filters and QC.
Note: No phenotypes present.
--check-sex: 7124 Xchr and 0 Ychr variant(s) scanned, 26 problems detected.
Then I ran --check-sex y-only, and I don't understand why so many IIDs are flagged as problems:
379174 variants loaded from .bim file.
667 people (429 males, 237 females, 1 ambiguous) loaded from .fam.
Ambiguous sex ID written to Bed_sexcheck_Y.nosex .
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 667 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Warning: 48318 het. haploid genotypes present (see Bed_sexcheck_Y.hh ); many commands treat these as missing.
Warning: Nonmissing nonmale Y chromosome genotype(s) present; many commands treat these as missing.
Total genotyping rate is 0.993207.
379174 variants and 667 people pass filters and QC.
Note: No phenotypes present.
--check-sex: 0 Xchr and 45 Ychr variant(s) scanned, 238 problems detected.
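One way to see what kind of mismatch is driving a count like this is to tabulate the flagged rows of the .sexcheck report by recorded and inferred sex. The following is only a minimal Python sketch. It assumes the report is whitespace-delimited with a header row containing PEDSEX, SNPSEX and STATUS columns (the layout described for PLINK 1.9) and the usual coding of 1 = male, 2 = female, 0 = unknown. The function name and the example rows are made up for illustration and are not taken from my data.

```python
from collections import Counter


def tabulate_sex_problems(lines):
    """Count PROBLEM rows by (PEDSEX, SNPSEX) pair.

    `lines` is an iterable of text lines from a .sexcheck report.
    Columns are located by header name, so extra columns such as
    F or YCOUNT are simply ignored.
    """
    rows = [ln.split() for ln in lines if ln.strip()]
    header, body = rows[0], rows[1:]
    ped = header.index("PEDSEX")
    snp = header.index("SNPSEX")
    status = header.index("STATUS")
    counts = Counter()
    for fields in body:
        if fields[status] == "PROBLEM":
            counts[(fields[ped], fields[snp])] += 1
    return counts


# Synthetic example rows (hypothetical IDs and values, not real output).
example = [
    " FID  IID  PEDSEX  SNPSEX  STATUS  YCOUNT",
    "  F1   S1       1       1      OK      45",
    "  F2   S2       2       1 PROBLEM      12",
    "  F3   S3       2       1 PROBLEM       9",
    "  F4   S4       0       1 PROBLEM      44",
]
summary = tabulate_sex_problems(example)
for (pedsex, snpsex), n in sorted(summary.items()):
    print(f"PEDSEX={pedsex} SNPSEX={snpsex}: {n}")
```

For a real run, the lines could be read from the .sexcheck file written with the same output prefix (presumably Bed_sexcheck_Y.sexcheck here). If nearly all flagged rows share a single PEDSEX/SNPSEX pair, that would point to one systematic cause rather than many independent sample mix-ups.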
I then ran --set-hh-missing after excluding all the variants listed in the .hh file, and the --check-sex y-only analysis no longer reports any problems, except for one IID that had no defined sex before the analysis. With the regular --check-sex analysis, however, I got many more problem detections.
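For reference, a list of variants to exclude can be built from the .hh file itself. This is a rough Python sketch, assuming each .hh line holds three whitespace-separated fields (family ID, within-family ID, variant ID), which is the layout described for PLINK 1.9. The helper name and the example entries are hypothetical.

```python
from collections import Counter


def summarize_hh(lines):
    """Return (sorted unique variant IDs, per-sample entry counts)."""
    variants = set()
    per_sample = Counter()
    for ln in lines:
        fields = ln.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        fid, iid, variant = fields[:3]
        variants.add(variant)
        per_sample[(fid, iid)] += 1
    return sorted(variants), per_sample


# Synthetic entries (hypothetical sample and variant IDs).
example = [
    "F1 S1 rs0001",
    "F1 S1 rs0002",
    "F2 S2 rs0001",
    "",
]
variant_ids, hh_counts = summarize_hh(example)
print("\n".join(variant_ids))   # one variant ID per line
print(hh_counts.most_common())  # samples with the most .hh entries first
```

Writing `variant_ids` one per line to a text file gives a list in the form that --exclude accepts. The per-sample counts show whether the .hh entries are spread across all samples or concentrated in a few individuals.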
I apologize for asking so much, but I want to know whether this approach is right. I know I could just delete the 26 IDs that have problems, but I want to know whether that is correct. My sample is small, close to 600 people, so removing any individual will affect the MAF in the subsequent case-control analysis.
Thanks!