Making bfiles fails with "Error: Line 385937 of .vcf file has fewer tokens than expected"


7X

Aug 16, 2021, 10:43:22 PM
to plink2-users
I am starting to use PLINK 1.9 on Windows (PowerShell, Ubuntu 20.04) to convert 1000 Genomes files to bed/bim/fam. I keep running into errors. Please help.

./plink.exe --vcf ALL.chr15.phase3_shapeit2_mvncall_integrated_v5b.20130502.genotypes.vcf.gz --make-bed --out ALL.chr15.phase3_shapeit2_mvncall_integrated_v5b.20130502.genotypes
PLINK v1.90b6.24 64-bit (6 Jun 2021)           www.cog-genomics.org/plink/1.9/
(C) 2005-2021 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to ALL.chr15.phase3_shapeit2_mvncall_integrated_v5b.20130502.genotypes.log.
Options in effect:
  --make-bed
  --out ALL.chr15.phase3_shapeit2_mvncall_integrated_v5b.20130502.genotypes
  --vcf ALL.chr15.phase3_shapeit2_mvncall_integrated_v5b.20130502.genotypes.vcf.gz

32544 MB RAM detected; reserving 16272 MB for main workspace.
--vcf: 385k variants complete.
Error: Line 385937 of .vcf file has fewer tokens than expected.

Christopher Chang

Aug 16, 2021, 10:46:10 PM
to plink2-users
What happens if you try to use plink 2.0 instead of 1.9 for this operation?

7X

Aug 16, 2021, 10:49:57 PM
to plink2-users
Here it is. 

./plink2.exe --vcf ALL.chr1.phase3_shapeit2_mvncall_integrated_v5b.20130502.genotypes.vcf.gz --make-bed --out ALL.chr1.phase3_shapeit2_mvncall_integrated_v5b.20130502.genotypes
PLINK v2.00a3 64-bit (4 Aug 2021)              www.cog-genomics.org/plink/2.0/
(C) 2005-2021 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to ALL.chr1.phase3_shapeit2_mvncall_integrated_v5b.20130502.genotypes.log.
Options in effect:
  --make-bed
  --out ALL.chr1.phase3_shapeit2_mvncall_integrated_v5b.20130502.genotypes
  --vcf ALL.chr1.phase3_shapeit2_mvncall_integrated_v5b.20130502.genotypes.vcf.gz

Start time: Tue Aug 17 11:50:48 2021
32544 MiB RAM detected; reserving 16272 MiB for main workspace.
Using up to 16 threads (change this with --threads).
--vcf: 16k variants scanned.
Error: --vcf file decompression failure: Malformed BGZF block.

7X

Aug 16, 2021, 11:11:40 PM
to plink2-users
I downloaded the 1KGP3 files from this source;

Christopher Chang

Aug 16, 2021, 11:17:40 PM
to plink2-users
Are other VCF-processing programs able to fully parse the downloaded files?  The error messages imply corrupted downloads.
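
One way to test this without PLINK is to let gzip verify each archive. The following is only a minimal sketch, assuming a Unix-like shell with gzip on the PATH; `check_gz` is an illustrative helper name, not part of any tool.

```shell
# check_gz FILE: succeed only if gzip can read FILE through to the end.
# "gzip -t" decompresses and verifies the file without writing any output.
check_gz() {
    if gzip -t "$1" 2>/dev/null; then
        echo "OK: $1"
    else
        echo "CORRUPT OR TRUNCATED: $1"
        return 1
    fi
}

# Example (run in the directory that holds the downloads):
# for f in ALL.chr*.vcf.gz; do check_gz "$f"; done
```

A file that fails this check was most likely cut off during download. Passing it is not a full guarantee: these files are block-compressed, so a download that happens to stop exactly on a block boundary could still pass.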

7X

Aug 17, 2021, 2:02:06 AM
to plink2-users
The gunzip command did not work:

/c/PLINK/plink1.9_win64_20210606
$ gunzip ALL.chr1.phase3_shapeit2_mvncall_integrated_v5b.20130502.genotypes.vcf.gz

gzip: ALL.chr1.phase3_shapeit2_mvncall_integrated_v5b.20130502.genotypes.vcf.gz: unexpected end of file

Tabix and vcftools do not work properly on Windows. Are there any extraction tools for Windows?

Christopher Chang

Aug 17, 2021, 11:48:51 AM
to plink2-users
The tools are fine; the problem is that you did not fully download the files. You need to choose a more reliable download method.
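
Whichever download method is used, a quick sanity check is to compare each local file's size with the size the server lists for it. This is a rough sketch under that assumption: `size_matches` is a hypothetical helper, and the expected byte count must be supplied by hand from the server listing.

```shell
# size_matches FILE EXPECTED_BYTES: succeed only if FILE is exactly that long.
size_matches() {
    # wc -c may pad its output with spaces; $(( )) normalizes it to a plain number.
    actual=$(( $(wc -c < "$1") ))
    if [ "$actual" -eq "$2" ]; then
        echo "OK: $1 is $actual bytes"
    else
        echo "INCOMPLETE: $1 is $actual bytes, expected $2"
        return 1
    fi
}

# If a file comes up short, a resumable downloader can continue from where it
# stopped instead of starting over, for example:
#   wget -c "$URL"    # URL is a placeholder for the file's address
```

A size match only shows that the transfer finished; it does not prove that the contents are intact.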

7X

Aug 18, 2021, 9:59:11 PM
to plink2-users
Thanks for the suggestions. I am now able to download the full-size files from the FTP server on Windows. Using right-click to download and deleting old cookies solved the problem.