converting vcf to bgen

Daniel McGuire

Sep 25, 2018, 10:11:41 AM
to plink2-users
Hello,

I'm interested in converting a vcf file to bgen with plink2.  When I run the command

plink2 \
  --export bgen-1.2 ref-first bits=8 \
  --dosage-erase-threshold 0.006 \
  --out bgen/chr14 \
  --threads 7 \
  --vcf vcf/chr14.vcf.gz dosage=DS

I get the warning message:

Warning: Unphased heterozygous hardcalls in partially-phased variants are
poorly represented with bits=8.
It is necessary to use e.g. --dosage-erase-threshold 0.006 to re-import them
cleanly.

Does this mean that the --dosage-erase-threshold flag I provided was not recognized?

I'm converting to bgen files specifically to make them compatible with the BOLT-LMM program, and it seems that BOLT-LMM is rejecting the bgen files made by plink, with messages such as:

ERROR: 14:20004949 has Phased = 2 (not 0)

I have bgen files that were converted from vcf using the qctool program, and they do not produce this kind of error, so I am wondering whether the cause is plink, and possibly the --dosage-erase-threshold flag.

According to the bgen format website:
If Phased=1 the row stores one probability per allele (other than the last allele) per haplotype (e.g. to represent phased data).
If Phased=0 the row stores one probability per possible genotype (other than the 'last' genotype where all alleles are the last allele), to represent unphased data.
Any other value for Phased is an error.

Converting vcf to bgen is extremely slow with the qctool program, which does not implement multithreading for file conversions. Given that I only need the dosages, and not the GPs, I would much rather do the file conversion in plink if possible.

Thanks in advance for any help or advice you can offer.

Christopher Chang

Sep 25, 2018, 11:26:52 AM
to plink2-users
BOLT-LMM does not support bgen-1.2 phased dosages, so you need to remove phase information before exporting to bgen-1.2:

plink2 --vcf vcf/chr14.vcf.gz dosage=DS --make-pgen erase-phase --out chr14_unphased
plink2 --pfile chr14_unphased --export bgen-1.2 ref-first bits=8 --out bgen/chr14
rm chr14_unphased.*
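
The three commands above could be wrapped in a small shell function, sketched below. This is only an illustration: the names `PLINK2`, `DRY_RUN`, `run`, and `vcf_to_unphased_bgen` are invented here, and the intermediate fileset is assumed to consist of the usual `.pgen`, `.pvar`, `.psam`, and `.log` files. The plink2 flags are exactly the ones in the commands above.

```shell
# Sketch: VCF -> unphased BGEN 1.2 via a temporary pgen fileset.
# PLINK2, DRY_RUN, run, and vcf_to_unphased_bgen are hypothetical names.
PLINK2="${PLINK2:-plink2}"   # assumption: plink2 is on PATH unless overridden
DRY_RUN="${DRY_RUN:-0}"      # set to 1 to print the commands instead of running them

run() {
  # Print the command in dry-run mode, otherwise execute it.
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

vcf_to_unphased_bgen() {
  # $1 = input VCF, $2 = output prefix for the .bgen
  local vcf="$1" out="$2"
  local tmp="${out}_unphased"
  # Step 1: import DS dosages and erase phase information.
  # Step 2: export the unphased fileset as BGEN 1.2.
  # Step 3: delete the intermediate files, only if both steps succeeded.
  run "$PLINK2" --vcf "$vcf" dosage=DS --make-pgen erase-phase --out "$tmp" &&
    run "$PLINK2" --pfile "$tmp" --export bgen-1.2 ref-first bits=8 --out "$out" &&
    run rm -f "$tmp.pgen" "$tmp.pvar" "$tmp.psam" "$tmp.log"
}
```

For example, `DRY_RUN=1 vcf_to_unphased_bgen vcf/chr14.vcf.gz bgen/chr14` prints the three commands without touching any files, which is a cheap way to check the paths before a long run.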

"--dosage-erase-threshold 0.006" is only relevant to *re-import* of a phased-dosage .bgen, it has no effect when exporting; I'll see if I can make this clearer.

Daniel McGuire

Sep 25, 2018, 12:05:53 PM
to plink2-users
Thanks very much! That is a big help.

reagan moseti

Feb 14, 2022, 4:06:59 AM
to plink2-users
Hi everyone, thanks for this platform.

I have imputed some UK Biobank genotypes (about 9,000 individuals of African ancestry) using the Sanger Imputation Service (African Genome Resources). I am trying to convert the output to BGEN files using the following commands:

plink2 --vcf 22.vcf.gz dosage=DS --make-pgen erase-phase --out 22.unphased 

plink2 --pfile 22.unphased --export bgen-1.2 ref-first bits=8 --out chr22 

The first command runs well and generates the necessary files. However, I get this error with the second plink2 command:

(C) 2005-2017 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to /well/hill/zmm016/UKB/qced/imputed/bgenfiles/chr22.log.
Options in effect:
  --export bgen-1.2 ref-first bits=8
  --out /well/hill/zmm016/UKB/qced/imputed/bgenfiles/chr22
  --pfile 22.unphased

Start time: Mon Feb 14 08:56:25 2022
Error: Invalid --export parameter 'bits=8'.
For more info, try 'plink2 --help [flag name]' or 'plink2 --help | more'.

Is there something I am doing wrong? Please assist.

Kind regards,
Reagan

Christopher Chang

Feb 14, 2022, 12:38:26 PM
to plink2-users
1. Please include the version string when copying the .log/console output of a command.
2. In this case, I can see from the copyright range that your plink2 build is too old; please update to the current build.
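
As a rough illustration of the copyright-range check, the helper below extracts the end year from a `(C) YYYY-YYYY` banner line in pasted console output. The function name `copyright_end_year` is invented for this sketch, and the pattern assumes the banner line looks like the one quoted in this thread. It is only a heuristic; the version string itself, which `plink2 --version` prints, is the more reliable thing to report.

```shell
# Sketch: read plink2 console output on stdin and print the end year of the
# first "(C) YYYY-YYYY ..." banner line found.
# copyright_end_year is a hypothetical helper name.
copyright_end_year() {
  sed -n 's/^(C) [0-9]\{4\}-\([0-9]\{4\}\) .*/\1/p' | head -n 1
}
```

Piping the console output quoted earlier in the thread through `copyright_end_year` prints `2017`, which is the range that identifies the build as outdated.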
