Missing pairs of variants when calculating R2 (LD) if all individuals are heterozygous

paul jay

11:50 AM (4 hours ago)
to plink2-users
Hi All,
I noticed some quite strange behavior when calculating r2, both with Plink1.9 and Plink2.0:
If all individuals are heterozygous at a site (i.e. every genotype is either 0/1 or 1/0), the r2 between this site and all other sites is not reported.

A test dataset is attached (Test1.vcf). In this dataset:
  • SNPs 1 and 2 have the same genotypes, but with the alternate allele on different strands.
  • SNPs 2 and 3 have the same genotypes, except that the first individual is 0/0 instead of 0/1 at SNP 3.
  • SNPs 3 and 4 have exactly the same genotypes.
  • All individuals are heterozygous at SNPs 1 and 2.
  • All but one individual are heterozygous at SNPs 3 and 4.
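
One way to probe this pattern outside of PLINK is sketched below. It assumes that an unphased r2 is computed as the squared Pearson correlation of 0/1/2 alternate-allele dosages; that is an assumption for illustration, not something confirmed from the PLINK source or documentation. Under that assumption, a site where every individual is heterozygous has a constant dosage of 1, so its variance is zero and the correlation is undefined. The genotype vectors and the `dosage_r2` helper are hypothetical; they mimic the structure described above and are not the contents of the attached VCF.

```python
from statistics import fmean


def dosage_r2(x, y):
    """Squared Pearson correlation of two dosage vectors.

    Returns None when either site has zero variance, because the
    correlation is then undefined (division by zero).
    """
    mx, my = fmean(x), fmean(y)
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return None
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov * cov / (var_x * var_y)


# Hypothetical dosages (count of alternate alleles) for five individuals.
# Unphased dosages cannot distinguish 0/1 from 1/0: both count as 1.
snp1 = [1, 1, 1, 1, 1]  # all heterozygous
snp2 = [1, 1, 1, 1, 1]  # all heterozygous
snp3 = [0, 1, 1, 1, 1]  # first individual 0/0, the rest heterozygous
snp4 = [0, 1, 1, 1, 1]  # identical to snp3

print(dosage_r2(snp1, snp2))  # None: both sites have zero variance
print(dosage_r2(snp2, snp3))  # None: snp2 has zero variance
print(dosage_r2(snp3, snp4))  # 1.0 up to floating-point rounding
```

In this sketch only the pair (3, 4) yields a defined value, which matches the pattern in the attached output, but whether this is the actual reason PLINK omits the other pairs remains an open question.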

The commands I used:
plink1.9 --vcf Test.vcf --maf 0.01 --r2 --out TestData
plink2 --vcf Test.vcf --maf 0.01 --r2-unphased --out TestData2

The plink1.9 result is attached. Only the LD between SNPs 3 and 4 is reported, whereas all four sites are in very high LD.

Is there something important that I am missing, or is there really a bug somewhere?

Thank you very much,

Paul
TestData.ld
Test1.vcf
TestData.log
TestData.nosex