Error using single-sample VCF file


Summer Rae

Oct 8, 2015, 7:24:32 PM
to plink2-users
Hello-

I'm trying to convert a VCF file to PLINK --recode 12 format for use with ADMIXTURE. I'm having some trouble, so I would like to check whether I'm using PLINK correctly for the conversion.

My VCF file is a single-sample VCF formatted as follows:

chr1 11675 . A . 0.00 LowGQX;HighDPFRatio . GT:GQX:DP:DPF .:.:0:1
chr1 11676 . A . 0.00 LowGQX END=11685;BLOCKAVG_min30p3a GT:GQX:DP:DPF 0/0:3:1:0
chr1 11686 . A . 0.00 LowGQX END=12007;BLOCKAVG_min30p3a GT:GQX:DP:DPF .:.:0:0
chr1 12008 . A . 0.00 LowGQX;HighDPFRatio END=12009;BLOCKAVG_min30p3a GT:GQX:DP:DPF .:.:0:1
chr1 12010 . G . 0.00 LowGQX END=12012;BLOCKAVG_min30p3a GT:GQX:DP:DPF 0/0:3:1:0


When I run the following command: 


plink --vcf test.vcf.gz -recode 12 --out test_plink


I get the following output: 


16384 MB RAM detected; reserving 8192 MB for main workspace.
--vcf: test_plink-temporary.bed + test_plink-temporary.bim +
test_plink-temporary.fam written.
999893 variants loaded from .bim file.
1 person (0 males, 0 females, 1 ambiguous) loaded from .fam.
Ambiguous sex ID written to test_plink.nosex .
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 1 founder and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.996537.
999893 variants and 1 person pass filters and QC.
Note: No phenotypes present.
--recode to test_plink.ped + test_plink.map ... done.


My .ped file is a long list of 1s, 2s, and blanks. I am not sure what correct output should look like, so I ran it through ADMIXTURE and got this error: 'Error: detected that all genotypes are missing for a SNP locus. Please apply quality-control filters to remove such loci.'

Christopher Chang

Oct 8, 2015, 7:40:37 PM
to plink2-users
Hi,

It looks like you need to add e.g. "--geno 0.1" to your command line to remove the variants with missing genotypes.
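
To see why this helps: with a single sample, a variant whose only genotype is missing has a missing rate of 1.0, which exceeds the 0.1 threshold, so --geno 0.1 drops it. Below is a minimal Python sketch of that per-variant check. It assumes the usual .ped layout (six leading columns, then two allele columns per variant, with 0 as the missing code). The function names and the sample row are hypothetical and only illustrate the idea; this is not PLINK's own implementation.

```python
def missing_rates(ped_lines):
    """Return the per-variant fraction of samples with a missing genotype.

    Assumes PLINK .ped layout: 6 leading columns (FID, IID, father, mother,
    sex, phenotype), then two allele columns per variant, with '0' = missing.
    """
    rows = [line.split() for line in ped_lines if line.strip()]
    n_variants = (len(rows[0]) - 6) // 2
    missing = [0] * n_variants
    for fields in rows:
        alleles = fields[6:]
        for i in range(n_variants):
            # Count a genotype as missing if either allele is '0'.
            if alleles[2 * i] == "0" or alleles[2 * i + 1] == "0":
                missing[i] += 1
    return [m / len(rows) for m in missing]


def passes_geno(rates, threshold=0.1):
    """Indices of variants whose missing rate does not exceed the threshold."""
    return [i for i, rate in enumerate(rates) if rate <= threshold]


# Hypothetical single-sample .ped row with three variants; the second is missing.
example = ["S1 S1 0 0 0 -9 1 1 0 0 1 2"]
rates = missing_rates(example)
kept = passes_geno(rates)
print(rates, kept)
```

Any variant left with a missing rate of 1.0 is exactly the kind of locus ADMIXTURE complains about, so checking the recoded .ped this way can confirm whether the filter was applied to the file actually passed to ADMIXTURE.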

Summer Rae

Oct 9, 2015, 2:26:41 PM
to Christopher Chang, plink2-users
Thank you so much, that did the trick! 

--
Thanks,
Summer Rae

zillur rahman

Jul 13, 2018, 8:38:25 PM
to plink2-users
Hello,
I have the same problem, and using --geno 0.1 doesn't fix it. Any suggestions?

Thanks
Best Regards
Zillur

Christopher Chang

Jul 13, 2018, 8:51:09 PM
to plink2-users
Can you post the full .log file from your run?

zillur rahman

Jul 16, 2018, 10:23:19 PM
to plink2-users
Hi,
I am facing a related problem. Here is the run:
"plink --vcf sample_64.vcf --make-bed --out sample64 --chr-set 31 --geno 0.1 -recode 12
PLINK v1.90b5.4 64-bit (10 Apr 2018)           www.cog-genomics.org/plink/1.9/
(C) 2005-2018 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to sample64.log.
Options in effect:
  --chr-set 31
  --geno 0.1
  --make-bed
  --out sample64
  --recode 12
  --vcf sample_64.vcf

177045 MB RAM detected; reserving 88522 MB for main workspace.
--vcf: sample64-temporary.bed + sample64-temporary.bim + sample64-temporary.fam
written.
Error: .bim file has a split chromosome.  Use --make-bed by itself to
remedy this."

If I run:
"plink --vcf sample_64.vcf --make-bed --out sample64 --chr-set 31"
I get a .bed file, but I can't use it for ADMIXTURE analysis:
"Error: detected that all genotypes are missing for a SNP locus.
Please apply quality-control filters to remove such loci."

What should I do now?
Any help will be greatly appreciated.
Thanks in advance.

Best Regards
Zillur

Christopher Chang

Jul 17, 2018, 2:02:28 PM
to plink2-users
1. Create the .bed file.
2. Run "plink --bfile sample64 --make-bed --out sample64_sorted --chr-set 31" to sort it.
3. You can perform the remaining operations off of the sorted .bed (--geno 0.1, --recode 12).
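
For context on step 2: the "split chromosome" error indicates that variants from one chromosome are not all adjacent in the .bim file, which is what the sorting pass repairs. Below is a minimal Python sketch of that check, assuming the standard .bim layout where the chromosome code is the first whitespace-separated column. The function name and sample rows are hypothetical, and this only illustrates the condition rather than reproducing PLINK's own logic.

```python
def split_chromosomes(bim_lines):
    """Return chromosome codes that appear in more than one separate block."""
    seen = set()
    split = []
    previous = None
    for line in bim_lines:
        if not line.strip():
            continue
        chrom = line.split()[0]
        if chrom != previous:
            # A chromosome that starts a new block after it was already seen
            # earlier means its variants are not contiguous in the file.
            if chrom in seen and chrom not in split:
                split.append(chrom)
            seen.add(chrom)
            previous = chrom
    return split


# Hypothetical .bim rows: chromosome 1 is interrupted by a chromosome 2 variant.
example_bim = [
    "1 . 0 100 A G",
    "2 . 0 50 C T",
    "1 . 0 200 G A",
]
print(split_chromosomes(example_bim))
```

An empty result on the sorted file's .bim would suggest the remaining operations (--geno 0.1, --recode 12) can run without hitting this error.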

rgo

Apr 28, 2020, 7:29:08 AM
to plink2-users
Hi,

I have the same problem running ADMIXTURE. I tried --geno 0.1, but the problem persists.
My VCF file contains scaffolds and contigs.

Thanks!!!

My log file is:
PLINK v1.90b6.12 64-bit (28 Oct 2019)
Options in effect:
  --allow-extra-chr 0
  --bfile guanacos_ID_DP13_MD50_2_SNPonly
  --geno 0.1
  --out guanacos_ID_DP13_MD50_2_SNPonly
  --recode


Random number seed: 1588031512
257764 MB RAM detected; reserving 128882 MB for main workspace.
1166730 variants loaded from .bim file.
39 people (0 males, 0 females, 39 ambiguous) loaded from .fam.
Ambiguous sex IDs written to guanacos_ID_DP13_MD50_2_SNPonly.nosex .
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 39 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.700429.
990529 variants removed due to missing genotype data (--geno).
176201 variants and 39 people pass filters and QC.
Note: No phenotypes present.
--recode ped to lg_ID_DP13_MD50_2_SNPonly.ped +
lg_ID_DP13_MD50_2_SNPonly.map ... done.