Query regarding Plink1.9 merge


ramsh...@gmail.com

Mar 5, 2022, 12:05:19 PM
to plink2-users
Hi, 

I called variants with GATK HaplotypeCaller and used the hard-filtered SNP VCF files (SNP_Hard_Filtered_VCF) that GATK produced. In my previous queries, I had problems with multiallelic variants during PLINK merging. To solve that, I followed the steps below.

I ran steps 1 and 2 for the 144 samples.

1.
bcftools norm -Ou -m -any HF_PASS_SNPs.vcf.gz | bcftools norm -Ou -f Bos_taurus_Ensembl_UMD3.1/genome.fa | bcftools annotate -Ob -x ID -I +'%CHROM:%POS:%REF:%ALT' | /usr/bin/plink1.9 --bcf /dev/stdin --keep-allele-order -cow --allow-no-sex --nonfounders --make-bed --out HF_PASS_SNPs_plink

The command above was suggested in this link for converting VCF to PLINK format.

2. Then I performed the QC steps:

/usr/bin/plink1.9 --bfile HF_PASS_SNPs_plink --cow --allow-no-sex --nonfounders --keep-allele-order --mind 0.1 --geno 0.1 --maf 0.05 --make-bed --out HF_PASS_SNPs_plink_QC

3. Then I merged the 144 samples:
/usr/bin/plink1.9 --cow --make-bed --merge-list myFile.txt --out mymerged_144

PLINK v1.90b6.22 64-bit (3 Nov 2020)           www.cog-genomics.org/plink/1.9/
(C) 2005-2020 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to mymerged_144.log.
Options in effect:
  --cow
  --make-bed
  --merge-list myFile.txt
  --out mymerged_144

64245 MB RAM detected; reserving 32122 MB for main workspace.
Warning: Variants '1:21444:A:G' and '1:21444:A:*' have the same position.
Warning: Variants '1:21446:C:G' and '1:21446:C:*' have the same position.
Warning: Variants '1:21448:T:C' and '1:21448:T:*' have the same position.

7955 more same-position warnings: see log file.
Performing single-pass merge (138 cattle, 342592 variants).
Merged fileset written to mymerged_144-merge.bed + mymerged_144-merge.bim +
mymerged_144-merge.fam .
342592 variants loaded from .bim file.
138 cattle (0 males, 0 females, 138 ambiguous) loaded from .fam.
Ambiguous sex IDs written to mymerged_144.nosex .
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 138 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.0729077.
342592 variants and 138 cattle pass filters and QC.
Note: No phenotypes present.
--make-bed to mymerged_144.bed + mymerged_144.bim + mymerged_144.fam ... done.


I am able to get the merged files (.bed, .bim, .fam), but I am not sure whether they are correct.

Before merging the PLINK files, the total genotyping rate for each sample was 0.97.

After merging, the total genotyping rate is 0.0729077.
Could you please explain what the reason might be? Should I use this output for further steps?

Thanks a lot in advance.

Christopher Chang

Mar 7, 2022, 12:05:02 PM
to plink2-users
Look at your starting VCF files.  Are there any "0/0" genotypes, or are they all "0/1" and "1/1"?  If there are no "0/0" genotypes, you need to regenerate them so that those genotypes are reported.
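
To see why absent "0/0" calls matter here: when filesets are merged, a sample receives a missing call at every variant that its own fileset does not contain, so per-sample files that list only non-reference sites produce a sparse merged matrix. The following is a minimal sketch of that bookkeeping, not PLINK's actual implementation; the function name `merged_genotyping_rate`, the sample names, and the variant IDs are all made up for illustration.

```shell
# Toy illustration: each "sample" lists only the sites where it carries a
# non-reference allele, as in a single-sample VCF without 0/0 records.
# All sample names and variant IDs below are invented for the example.

merged_genotyping_rate() {
  # Input (stdin): lines of "<sample> <variant_id>", one per observed call.
  # Output: number of calls, size of the sample x variant matrix after a
  # union-style merge, and the resulting genotyping rate.
  awk '
    NF < 2 { next }                      # ignore blank or malformed lines
    {
      if (!(($1, $2) in seen)) { seen[$1, $2] = 1; calls++ }
      if (!($1 in samples))    { samples[$1] = 1;  n_samples++ }
      if (!($2 in variants))   { variants[$2] = 1; n_variants++ }
    }
    END {
      cells = n_samples * n_variants
      printf "calls=%d cells=%d rate=%.4f\n", calls, cells, (cells ? calls / cells : 0)
    }
  '
}

merged_genotyping_rate <<'EOF'
s1 1:100:A:G
s1 1:200:C:T
s2 1:200:C:T
s2 1:300:G:A
s3 1:400:T:C
EOF
```

In this toy input, 5 calls fill a 3 x 4 matrix, so the rate is 5/12 (about 0.42) even though no sample has a missing call within its own file. By the same arithmetic, the reported rate of 0.0729077 over 342592 variants corresponds to roughly 25,000 non-missing calls per animal on average.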

ramsh...@gmail.com

Mar 8, 2022, 4:52:21 AM
to plink2-users
Thanks a lot

ramsh...@gmail.com

Mar 8, 2022, 6:25:14 AM
to plink2-users

Dear Chang,

You mentioned:
Look at your starting VCF files.  Are there any "0/0" genotypes, or are they all "0/1" and "1/1"?  If there are no "0/0" genotypes, you need to regenerate them so that those genotypes are reported.

I checked and found that there are no "0/0" genotypes. I apologize if this is a basic question, but how do I regenerate the "0/0" genotypes, and why are they needed?

zcat 994_FCH7L2KCCX2_L8_BISvveXAAEEAAA-95_HF_PASS_SNPs.vcf.gz | awk '{print $10}' | grep -c "0/1"
10308

zcat 994_FCH7L2KCCX2_L8_BISvveXAAEEAAA-95_HF_PASS_SNPs.vcf.gz | awk '{print $10}' | grep -c "1/1"
159196

zcat 994_FCH7L2KCCX2_L8_BISvveXAAEEAAA-95_HF_PASS_SNPs.vcf.gz | awk '{print $10}' | grep -c "0/0"
0
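
The counts above come from pattern matching on the whole tenth column. A slightly more careful tally is sketched below; it assumes a single-sample VCF whose FORMAT column lists GT, reads the GT subfield explicitly, and treats phased calls such as "1|1" like "1/1". The function name `count_gt` is made up for this example.

```shell
count_gt() {
  # Reads an uncompressed single-sample VCF on stdin and prints
  # "<genotype> <count>" for every GT value seen, sorted by genotype.
  awk -F'\t' '
    /^#/ { next }                        # skip meta and header lines
    {
      n = split($9, fmt, ":")            # FORMAT keys, e.g. GT:AD:DP
      gt_idx = 0
      for (i = 1; i <= n; i++) if (fmt[i] == "GT") gt_idx = i
      if (gt_idx == 0) next              # record without a GT field
      split($10, val, ":")
      gt = val[gt_idx]
      gsub(/\|/, "/", gt)                # count phased and unphased together
      count[gt]++
    }
    END { for (g in count) print g, count[g] }
  ' | sort
}

# Usage with the file from this post:
#   zcat 994_FCH7L2KCCX2_L8_BISvveXAAEEAAA-95_HF_PASS_SNPs.vcf.gz | count_gt
```

A "0/0" row in the output would indicate that homozygous-reference calls are being reported; its absence matches the zero count above.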


I would be very grateful for any help with this issue.
Thank you so much in advance. 
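
One commonly used route with GATK 4 for obtaining "0/0" calls is joint genotyping from per-sample GVCFs, so that every sample gets a genotype at every site that is variant in any sample. It is sketched here as an assumption, not as the answer given in this thread. The helper below only prints the commands and executes nothing; the function name, the `<sample>.bam` and `<sample>.g.vcf.gz` naming, the `cohort` output names, and the two sample names are hypothetical.

```shell
print_joint_calling_cmds() {
  # Usage: print_joint_calling_cmds <reference.fa> <sample> [<sample> ...]
  # Prints (does not run) a GVCF-based joint-genotyping command sequence.
  ref=$1
  shift
  variants=""
  for s in "$@"; do
    # Per-sample calling in GVCF mode keeps reference-confidence blocks.
    echo "gatk HaplotypeCaller -R $ref -I $s.bam -O $s.g.vcf.gz -ERC GVCF"
    variants="$variants -V $s.g.vcf.gz"
  done
  # Combine the per-sample GVCFs, then genotype all samples together.
  echo "gatk CombineGVCFs -R $ref$variants -O cohort.g.vcf.gz"
  echo "gatk GenotypeGVCFs -R $ref -V cohort.g.vcf.gz -O cohort.vcf.gz"
}

print_joint_calling_cmds Bos_taurus_Ensembl_UMD3.1/genome.fa sampleA sampleB
```

With a single multi-sample VCF produced this way, the VCF-to-PLINK conversion would run once on the cohort file, and a per-sample merge step would no longer be required.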

Christopher Chang

Mar 8, 2022, 10:17:38 AM
to plink2-users

ramsh...@gmail.com

Mar 8, 2022, 10:56:11 AM
to plink2-users
Dear Chang,

Thank you so much.
