bug in plink2 --bcf import - file ok in plink1.9 and bcftools


Gabriel Doctor

7:12 PM (3 hours ago)
to plink2-users
Hi Christopher, 
I have been using minimac4.1.6 to impute and chose BCF as the output format (this is my own setup, not the TOPMED/MIS server).
The output BCF file can be read by bcftools and plink1.9, but not by plink2.

If I use bcftools just to read the file and write it straight back to BCF, the resulting file still fails in plink2 with the same error.
If I create a VCF from it with bcftools, plink2 reads that.
If I then convert this VCF back to BCF with bcftools, plink2 can read the resulting BCF-to-VCF-to-BCF file. Note that it has the same HDS:DS fields as far as I can see.

Selecting dosage=HDS or dosage=DS made no difference.

Perhaps this is all a quirk of the minimac4.1 output, but just FYI.

That is:
ORIG=chr22_1_32000000.identicals.dose.bcf
./plink2 --bcf $ORIG --make-pgen --out ORIG
#Error: Variant record #1 of --bcf file is malformed.

$bcftools view $ORIG -Ov -o  ORIG.bcftovcf.vcf
plink2 --vcf ORIG.bcftovcf.vcf --make-pgen --out ORIG.bcftovcf
# vcf loads

$bcftools view $ORIG -Ob -o bcfdirectobcf.bcf
plink2 --bcf bcfdirectobcf.bcf --make-pgen --out bcfdirectobcf  
#Error: Variant record #1 of --bcf file is malformed.

$bcftools view ORIG.bcftovcf.vcf -Ob -o vcfbacktobcf.bcf
plink2 --bcf vcfbacktobcf.bcf --make-pgen --out vcfbacktobcf
# this bcf --> vcf --> bcf round trip loads without error!

$bcftools view  $ORIG | head -n19 > originalheader.txt
$bcftools view  vcfbacktobcf.bcf | head -n21  > reconvertedheader.txt
diff originalheader.txt reconvertedheader.txt
# reports only the expected header differences (2 additional bcftools lines), even though this includes the first variant line. Explicitly:
tail -n1 originalheader.txt
tail -n1 reconvertedheader.txt
# these look the same
htsfile "$ORIG"
# file.bcf:   BCF version 2.2 compressed variant calling data
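
Since the text renderings of the first variant look the same, any difference that plink2 trips on is presumably in the binary encoding of the record rather than in its content. A rough sketch for dumping the raw first record so the failing and working files can be compared byte by byte is below. It assumes the standard BCF2 layout (5-byte magic, little-endian uint32 l_text, header text, then each record prefixed by uint32 l_shared and uint32 l_indiv), relies on BGZF being readable by plain gzip, and only looks at the first 64 MiB of decompressed data, which is an arbitrary cap. The function names are made up for illustration.

```shell
# Print the little-endian uint32 stored at byte OFFSET of FILE.
# usage: read_u32 FILE OFFSET
read_u32() {
  bytes=$(od -An -tu1 -j "$2" -N4 "$1")
  set -- $bytes   # split the four byte values into $1..$4
  echo $(( $1 + ($2 << 8) + ($3 << 16) + ($4 << 24) ))
}

# Print the length fields and a hex dump of the first variant record.
# usage: bcf_first_record_hex FILE.bcf
bcf_first_record_hex() {
  tmp=$(mktemp) || return 1
  # BGZF is a series of gzip members, so gzip -dc can decompress it.
  # Keep only the first 64 MiB (assumed to cover header + first record).
  gzip -dc "$1" 2>/dev/null | head -c 67108864 > "$tmp"
  l_text=$(read_u32 "$tmp" 5)            # header text length, after 5-byte magic
  rec=$((9 + l_text))                    # offset of the first record
  l_shared=$(read_u32 "$tmp" "$rec")     # bytes of site-level data
  l_indiv=$(read_u32 "$tmp" $((rec + 4)))  # bytes of per-sample data
  echo "l_text=$l_text l_shared=$l_shared l_indiv=$l_indiv"
  # Dump the two length fields plus the record body as hex bytes.
  od -An -tx1 -v -j "$rec" -N $((8 + l_shared + l_indiv)) "$tmp"
  rm -f "$tmp"
}
```

Running `bcf_first_record_hex "$ORIG" > orig.hex` and `bcf_first_record_hex vcfbacktobcf.bcf > roundtrip.hex`, then `diff orig.hex roundtrip.hex`, should show whether the two files encode the same variant differently.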



PLINK v2.0.0-a.7LM AVX2 Intel (28 Nov 2025)
Options in effect:
  --bcf file.bcf
  --make-pgen
  --out ORIG

Hostname: job-J4jzzj8Jfk46Y6g0b358X4Jv
Working directory: /home/dnanexus
Start time: Thu Dec  4 23:44:19 2025

Random number seed: 1764891859
7816 MiB RAM detected, ~6539 available; reserving 3908 MiB for main workspace.
Using up to 2 compute threads.
Error: Variant record #1 of --bcf file is malformed.

End time: Thu Dec  4 23:44:19 2025  


singlevariant.bcf