Convert 23andme data to VCF


Emily Brown

Sep 26, 2021, 7:46:31 PM
to plink2-users
Hi there,

I am trying to convert a 23andme file to a VCF file. Using Plink 1.9 and 2.0 is the closest I have gotten, but I am still not there. Here are my steps.

Downloaded Plink 1.9 and Plink 2.0 to my Lenovo Thinkpad
Converted a 23andme text file to plink_ebgenome

plink --23file genome_Emily_Brown_v5_Full.txt Brown Emily --out plink_ebgenome

This gives me three files: plink_ebgenome.bed, plink_ebgenome.bim, and plink_ebgenome.fam.

Then I tried using plink or plink2 to convert to vcf.

plink2 --bfile plink_ebgenome --recode vcf --out jbgenome.vcf

Both create a vcf, but it has some problems.

  1. there are entries for SNPs that would match a reference (VCF files are supposed to ONLY store gene sequence variations)
  2. there is no reference allele
  3. heterozygotes have an entry in the QUAL column. 

Please see attached screenshot. I would love any advice on how to resolve these issues.

Best

Emily



23andmetoVCF.png

Christopher Chang

Sep 26, 2021, 8:24:19 PM
to plink2-users
The main issue is that 23andMe files do not specify which alleles are reference.  See plink 2.0's --ref-from-fa flag for a fix.

As for "VCF files are supposed to ONLY store gene sequence variations", that has not been true for many years.  But if you are exporting a VCF to a tool that is based on such an assumption, you can use e.g. "--maf 0.1:nref" to filter out the homozygous-reference genotypes in a single-sample fileset.
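
For anyone following along, the two suggestions above might be combined roughly as follows. This is only a sketch under a few assumptions: the reference FASTA path and the output prefixes are hypothetical placeholders, and the FASTA is assumed to be GRCh37/build 37 (the header of the 23andMe raw file states which build it uses, so check it). --ref-from-fa needs --fa to point at the reference sequence. The steps are split into separate runs so the result does not depend on plink2's internal order of operations, and the small run helper only prints each command unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only. REF_FASTA is a hypothetical local path to a GRCh37 FASTA;
# BFILE is the prefix produced by the --23file step earlier in this thread.
BFILE=plink_ebgenome          # .bed/.bim/.fam prefix
REF_FASTA=GRCh37.fa           # hypothetical filename, supply your own
PFILE=ebgenome_ref            # hypothetical output prefix

# Print each command; execute it only when RUN=1 is set in the environment.
run() {
    echo "$*"
    if [ "${RUN:-0}" = "1" ]; then "$@"; fi
}

# 1. Set REF alleles from the FASTA where plink2 can do so, and save the result.
run plink2 --bfile "$BFILE" --fa "$REF_FASTA" --ref-from-fa --make-pgen --out "$PFILE"

# 2. Export every site to VCF (plink2 appends the .vcf extension itself).
run plink2 --pfile "$PFILE" --export vcf --out "$PFILE"

# 3. Optional: drop homozygous-reference sites for tools that expect variants only.
run plink2 --pfile "$PFILE" --maf 0.1:nref --export vcf --out "${PFILE}_variants"
```

Note that --out takes a filename prefix rather than a full filename, so passing a value that already ends in .vcf would yield a file ending in .vcf.vcf. Some sites may still be left unresolved after step 1, so it is worth reading the plink2 log before trusting the REF column.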