Calculating KING relationship with WES data

filippo abbondanza

Jun 22, 2020, 3:26:22 AM
to plink2-users
Hi Chris. I have a question about --make-king-table with WES data. I have WES data (called with DeepVariant) for 29 individuals across 6 families, and I wanted to check the relatedness among them. To do so, I created the KING table (see log below), but the kinship coefficients look strange: every pair has a coefficient of ~0.2-0.3 (M3061 is the mother of M3326, first row). Note that I merged the individual gVCFs with bcftools to get the cohort-level VCF. Is there an error in my PLINK2 command? I also called the data with GATK and computed the KING relationships, and in that case the values are negative (logs/table below). I understand why the values are off for GATK (few variants), but I thought that with DeepVariant I would get the 'expected' results?

Thank you!
Filippo


KING TABLE DEEPVARIANT

#ID1 ID2 NSNP HETHET IBS0 KINSHIP
M3326 M3061 5155143 0.0112402 0.000731503 0.297194
M3061 M3060 5071178 0.00862265 0.000912806 0.241459
M3290 M3060 4772992 0.00810226 0.000886656 0.234145
M3290 M3061 4763566 0.00828182 0.000892609 0.242888
M3293 M3060 4904381 0.00830013 0.000910818 0.234995
M3293 M3061 4893459 0.00851749 0.00095924 0.240258
M3293 M3290 4909967 0.00827052 0.000858865 0.256781
M3294 M3060 3824175 0.00666183 0.000760425 0.213884
M3294 M3061 3747587 0.00689537 0.000790909 0.222196
M3294 M3290 3922782 0.0071202 0.000658461 0.254849
M3294 M3293 3988325 0.00913215 0.000534811 0.314767
M3300 M3060 4837598 0.00825348 0.000852696 0.243545
M3300 M3061 4826064 0.00842322 0.00087359 0.249047
M3300 M3290 4837685 0.00815163 0.000759454 0.264442
M3300 M3293 4949150 0.00848206 0.000755685 0.268713
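
For context, the KINSHIP column can be read against the relationship cutoffs given in the KING documentation, quoted here from memory (> 0.354 duplicate/MZ twin, 0.177-0.354 first degree, 0.0884-0.177 second degree, 0.0442-0.0884 third degree). The following is only a minimal Python sketch: the cutoffs should be checked against the KING manual, the names `classify_kinship` and `classify_rows` are made up for illustration, and only four rows of the table are embedded.

```python
# Cutoffs as recalled from the KING documentation (assumption: verify there).
CUTOFFS = [
    (0.354, "duplicate/MZ twin"),
    (0.177, "1st degree"),
    (0.0884, "2nd degree"),
    (0.0442, "3rd degree"),
]


def classify_kinship(kinship):
    """Map a KING kinship coefficient to a coarse relationship label."""
    for lower_bound, label in CUTOFFS:
        if kinship > lower_bound:
            return label
    return "unrelated"


# A few (ID1, ID2, KINSHIP) rows copied from the DeepVariant table above.
ROWS = """\
M3326 M3061 0.297194
M3061 M3060 0.241459
M3294 M3293 0.314767
M3300 M3293 0.268713
"""


def classify_rows(text):
    """Parse whitespace-separated 'ID1 ID2 KINSHIP' lines and label each pair."""
    labelled = []
    for line in text.splitlines():
        id1, id2, kinship = line.split()
        labelled.append((id1, id2, classify_kinship(float(kinship))))
    return labelled


if __name__ == "__main__":
    for id1, id2, label in classify_rows(ROWS):
        print(id1, id2, label)
```

Under these cutoffs every listed pair lands in the first-degree range, which cannot all be true for 29 individuals spread across 6 families.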



LOG DEEPVARIANT

PLINK v2.00a2LM 64-bit Intel (26 Aug 2019)     www.cog-genomics.org/plink/2.0/
(C) 2005-2019 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to king_1000G_EUR_pruned.log.
Options in effect:
  --allow-extra-chr
  --extract indepSNP.prune.in # created previously with --indep-pairwise 50 5 0.2
  --make-king-table
  --out king_1000G_EUR_pruned
  --read-freq /storage/home/users/fa36/data/1000G/100G_new.afreq #Keeping only EUR individuals
  --threads 16
  --vcf merged_deepvariant_bcftools.vcf.gz

Start time: Sun Jun 21 18:59:58 2020
516882 MiB RAM detected; reserving 258441 MiB for main workspace.
Using up to 16 threads (change this with --threads).
--vcf: 327875949 variants scanned.
--vcf: king_1000G_EUR_pruned-temporary.pgen +
king_1000G_EUR_pruned-temporary.pvar + king_1000G_EUR_pruned-temporary.psam
written.
29 samples (0 females, 0 males, 29 ambiguous; 29 founders) loaded from
king_1000G_EUR_pruned-temporary.psam.
327875949 variants loaded from king_1000G_EUR_pruned-temporary.pvar.
Note: No phenotype data present.
--extract: 327875949 variants remaining.
Warning: Ignoring --read-freq since no command would use the frequencies.
327875949 variants remaining after main filters.
Excluding 13069817 variants on non-autosomes from KING-robust calculation.
--make-king-table: Scanning for singletons and monomorphic variants... done.
249917141 variants handled by initial scan.
--make-king-table pass 1/1: Writing...
--make-king-table: 64888991 variants processed.
Results written to king_1000G_EUR_pruned.kin0 .
End time: Sun Jun 21 19:31:46 2020




KING TABLE GATK

#ID1 ID2 NSNP HETHET IBS0 KINSHIP
M3061 M3060 697381 0.0642217 0.140513 -0.622938
M3290 M3060 629885 0.0637974 0.129613 -0.609387
M3290 M3061 639494 0.0639678 0.131584 -0.630488
M3293 M3060 665038 0.0620867 0.133577 -0.628798
M3293 M3061 675750 0.0625572 0.138066 -0.665105
M3293 M3290 621010 0.0668604 0.121829 -0.53112
M3294 M3060 440548 0.0636571 0.0896611 -0.487747
M3294 M3061 444688 0.0634242 0.092141 -0.516524
M3294 M3290 427686 0.0714122 0.0780315 -0.357911


LOG GATK


PLINK v2.00a2LM 64-bit Intel (26 Aug 2019)
Options in effect:
  --allow-extra-chr
  --make-king-table
  --read-freq /storage/home/users/fa36/data/1000G/100G_new.afreq
  --threads 12
  --vcf recalibrated.filtered.vcf

Hostname: marvin.marvindomain.com
Working directory: /storage/home/users/av45/exome/reanalysis
Start time: Sun Jun 21 16:00:09 2020

Random number seed: 1592751609
516882 MiB RAM detected; reserving 258441 MiB for main workspace.
Using up to 12 threads (change this with --threads).
--vcf: 3847943 variants scanned.
--vcf: plink2-temporary.pgen + plink2-temporary.pvar + plink2-temporary.psam
written.
29 samples (0 females, 0 males, 29 ambiguous; 29 founders) loaded from
plink2-temporary.psam.
3847943 variants loaded from plink2-temporary.pvar.
Note: No phenotype data present.
Warning: Ignoring --read-freq since no command would use the frequencies.
Excluding 123692 variants on non-autosomes from KING-robust calculation.
--make-king-table: Scanning for singletons and monomorphic variants... done.
256480 variants handled by initial scan.
--make-king-table: 3467771 variants processed.
Results written to plink2.kin0 .

End time: Sun Jun 21 16:00:22 2020



Christopher Chang

Jun 22, 2020, 9:04:33 AM
to plink2-users
This indicates that you aren't performing proper QC.  More precisely, it is important to exclude variants that are not close to Hardy-Weinberg equilibrium, especially when that's because the variant-caller is screwing up royally and calling most or all samples heterozygous; you're in a tough position with only 29 samples, but you may still be able to detect and filter enough instances of this.  The Hardy-Weinberg p-value calculator at https://www.cog-genomics.org/software/stats/ may be helpful re: setting a --hwe threshold.
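
To illustrate why 29 samples can still be enough to catch the worst case (a variant called heterozygous in everyone), here is a minimal Python sketch of a plain two-sided Hardy-Weinberg exact test. It is not PLINK's implementation and does not reproduce options such as a mid-p adjustment; the function name `hwe_exact_p` and the example genotype counts are assumptions made for illustration.

```python
from math import exp, lgamma, log


def hwe_exact_p(n_hom1, n_het, n_hom2):
    """Two-sided exact HWE p-value from genotype counts.

    Conditional on the observed allele counts, sums the probabilities of
    every possible heterozygote count that is no more likely than the
    observed one.
    """
    n = n_hom1 + n_het + n_hom2
    n_a = 2 * n_hom1 + n_het  # copies of the first allele
    n_b = 2 * n_hom2 + n_het  # copies of the second allele

    def log_prob(hets):
        hom_a = (n_a - hets) // 2
        hom_b = (n_b - hets) // 2
        return (
            hets * log(2)
            + lgamma(n + 1)
            - lgamma(hom_a + 1) - lgamma(hets + 1) - lgamma(hom_b + 1)
            + lgamma(n_a + 1) + lgamma(n_b + 1)
            - lgamma(2 * n + 1)
        )

    # The het count must have the same parity as the allele counts.
    possible_hets = range(n_a % 2, min(n_a, n_b) + 1, 2)
    probs = {h: exp(log_prob(h)) for h in possible_hets}
    observed = probs[n_het]
    # Small relative tolerance so floating-point ties are counted.
    total = sum(p for p in probs.values() if p <= observed * (1 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    # Hypothetical counts: all 29 samples called heterozygous.
    print(hwe_exact_p(0, 29, 0))
    # Hypothetical counts close to Hardy-Weinberg proportions.
    print(hwe_exact_p(7, 15, 7))
```

With all 29 samples heterozygous the p-value is far below common --hwe thresholds, while milder excess heterozygosity at this sample size will often not be distinguishable from equilibrium.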

Other notes:
- From the documentation at http://people.virginia.edu/~wc9c/KING/manual.html : "Please do not prune or filter any "good" SNPs that pass QC prior to any KING inference ... LD pruning is not recommended in KING."
- As noted in the .log files, --read-freq has no effect on the KING method.
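
Putting these suggestions together, a revised run might add --hwe and drop both the LD-pruned --extract list and --read-freq. The Python sketch below only assembles and prints such a command line. The threshold 1e-6 is a placeholder rather than a recommendation (pick it with the calculator linked above), and the output prefix `king_hwe_filtered` and the helper name `build_king_command` are hypothetical.

```python
import shlex


def build_king_command(vcf_path, out_prefix, hwe_threshold):
    """Assemble a plink2 argument list for an HWE-filtered KING table.

    Only flags that appear in this thread are used; --extract and
    --read-freq are intentionally left out.
    """
    return [
        "plink2",
        "--vcf", vcf_path,
        "--allow-extra-chr",
        "--hwe", str(hwe_threshold),  # placeholder threshold (assumption)
        "--make-king-table",
        "--out", out_prefix,          # hypothetical output prefix
    ]


command = build_king_command(
    "merged_deepvariant_bcftools.vcf.gz",
    "king_hwe_filtered",
    1e-6,
)

if __name__ == "__main__":
    # Print a shell-quoted version for inspection; nothing is executed here.
    print(shlex.join(command))
```

The list form can be passed to `subprocess.run(command, check=True)` once the threshold has been chosen.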