vcf-merge?


test1

Sep 3, 2014, 5:47:51 PM
to plink2...@googlegroups.com
Can we merge multiple VCF files using plink2?
Thanks
test

Christopher Chang

Sep 3, 2014, 6:11:58 PM
to plink2...@googlegroups.com
General-purpose handling of VCF files is outside PLINK's scope; use e.g. VCFtools or PLINK/SEQ for that.

What PLINK can do is import VCF files into its own format (which can currently only represent biallelic variants), and export to VCF from it.  So if the *only* thing you need to keep in your VCF files is the biallelic variant calls, you can do the following:

1. import to plink format using --vcf + --out
2. merge the plink-format filesets with --merge-list
3. convert the merged data back to VCF using --recode vcf
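The three steps above can be sketched as a small Python helper that only builds the command lines, so nothing runs unless you choose to. This is a rough sketch, not a tested pipeline: the helper name `build_plink_merge_commands`, the `plink_merge` working directory, and the `merged` output prefix are made up for illustration, and it assumes a PLINK 1.9 binary called `plink` plus the usual one-prefix-per-line merge list.

```python
from pathlib import Path


def build_plink_merge_commands(vcf_paths, work_dir="plink_merge", plink="plink"):
    """Return (merge_list_text, commands) for the import / merge / export steps.

    Hypothetical helper: all paths and prefixes here are illustrative.
    """
    work = Path(work_dir)
    # One output prefix per input VCF, named after the file (a.vcf.gz -> a).
    prefixes = [str(work / Path(p).name.split(".")[0]) for p in vcf_paths]

    # Step 1: import each VCF into a PLINK binary fileset (.bed/.bim/.fam).
    commands = [
        [plink, "--vcf", str(vcf), "--make-bed", "--out", prefix]
        for vcf, prefix in zip(vcf_paths, prefixes)
    ]

    # Step 2: merge every fileset named in the merge list.
    merge_list = str(work / "merge_list.txt")
    merged = str(work / "merged")
    commands.append([plink, "--merge-list", merge_list, "--make-bed", "--out", merged])

    # Step 3: export the merged fileset back to VCF.
    commands.append([plink, "--bfile", merged, "--recode", "vcf", "--out", merged])

    # Contents of merge_list.txt: one fileset prefix per line.
    merge_list_text = "\n".join(prefixes) + "\n"
    return merge_list_text, commands
```

To actually run it, create the working directory, write `merge_list_text` to `merge_list.txt` inside it, and pass each command to `subprocess.run(cmd, check=True)` in order.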

But if you also still need to keep variant qualities or the like, PLINK is not the right tool for the job.

test1

Sep 4, 2014, 11:34:29 AM
to plink2...@googlegroups.com
Thank you. The problem was step 2, since merging sequencing data in PLINK format is problematic (most variants have no rs numbers, and using positions as IDs gives duplicates, etc.). VCFtools also has its own story!

freeseek

Sep 17, 2014, 4:58:38 PM
to plink2...@googlegroups.com
On Wednesday, September 3, 2014 6:11:58 PM UTC-4, Christopher Chang wrote:
General-purpose handling of VCF files is outside PLINK's scope; use e.g. VCFtools or PLINK/SEQ for that.

I would suggest using bcftools merge to merge VCF files, as it is much faster than anything else out there. However, when merging VCF files, you need to remember that a site where the samples in one VCF file are homozygous reference (and therefore not reported at all) but a sample in another VCF file is heterozygous will merge as a combination of missing and called genotypes. This is quite problematic when merging VCF files, and it is currently a huge limitation intrinsic to VCF files.
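To make the missing-versus-reference problem concrete, here is a toy Python model of merging two variant-only call sets. It is not how bcftools works internally. The function `merge_calls`, the `{site: {sample: genotype}}` layout, and the `missing_to_ref` switch are all invented for this illustration, and it assumes the two files have no sample names in common.

```python
def merge_calls(file_a, file_b, missing_to_ref=False):
    """Merge two {site: {sample: genotype}} maps the way a naive VCF merge would.

    Hypothetical toy model; assumes sample names do not overlap between files.
    """
    samples_a = sorted({s for gts in file_a.values() for s in gts})
    samples_b = sorted({s for gts in file_b.values() for s in gts})
    # Without extra information, an absent site can only be reported as missing.
    fill = "0/0" if missing_to_ref else "./."

    merged = {}
    for site in sorted(set(file_a) | set(file_b)):
        row = {}
        for samples, calls in ((samples_a, file_a), (samples_b, file_b)):
            for sample in samples:
                # A site absent from a file has no record for that file's samples.
                row[sample] = calls.get(site, {}).get(sample, fill)
        merged[site] = row
    return merged


# Variant-only call sets: hom-ref sites are simply not listed.
vcf_a = {("1", 1000): {"S1": "0/1"}}
vcf_b = {("1", 2000): {"S2": "1/1"}}

merged = merge_calls(vcf_a, vcf_b)
print(merged[("1", 1000)])  # {'S1': '0/1', 'S2': './.'}
```

S2 was most likely homozygous reference at position 1000, but the merged record can only say "missing". Setting `missing_to_ref=True` fills in `0/0` instead, which is only safe if you know every absent site was actually covered and called as reference. As far as I know, newer bcftools versions offer a similar `--missing-to-ref` option on `bcftools merge`, with the same caveat.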