extract one SNP from bgen file

Xinzhu Wei

Jun 27, 2019, 9:49:30 PM
to plink2...@googlegroups.com
Hi Christopher,

Is it possible to use plink2 to extract one SNP from a ".bgen" file? I am trying to convert it to vcf.

Here are the flags I used. 
./plink2 --bgen [.bgen Filename] --sample [.sample filename] --recode vcf --snps 3:46457412_T_C --out [Filename] 

It has been running for hours, and I wonder whether it is doing what I want it to do.
Start time: Thu Jun 27 15:50:19 2019
257923 MiB RAM detected; reserving 128961 MiB for main workspace.
Using up to 64 threads (change this with --threads).
--bgen: 6696680 variants detected, format v1.2.
487409 samples imported from .sample file to
rs113010081Imputation-temporary.psam .
--bgen: 6696k variants scanned.
--bgen: 58k variants converted.    

Am I missing something? Would the .bgen file work with rs numbers?

What would you recommend?


Many thanks

Christopher Chang

Jun 27, 2019, 10:22:33 PM
to plink2-users
If you only need to do this once, plink2 is not the best tool for the job. Use something like bgenix instead.

However, if you need to perform many queries of this sort, and you only need dosages and not the raw genotype probability triplets, plink2 is extremely efficient if you convert to pgen format just once, and then work with that.
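
A minimal sketch of the convert-once-then-query workflow described above, written as Python helpers that only build the plink2 argument lists. The file names (chr3.bgen, chr3.sample, chr3_pgen, snp_out) are hypothetical, and the "ref-first" modifier is an assumption about how REF/ALT are ordered in the .bgen file; check both against your own data and plink2 build before running anything.

```python
import shlex


def convert_once_cmd(bgen, sample, out_prefix, ref_mode="ref-first"):
    """Build the argv for a one-time .bgen -> .pgen/.pvar/.psam conversion.

    ref_mode is an assumption: it tells plink2 which allele in the .bgen
    file is the reference allele. Adjust it to match your data.
    """
    return ["plink2", "--bgen", bgen, ref_mode, "--sample", sample,
            "--make-pgen", "--out", out_prefix]


def extract_snp_cmd(pfile_prefix, variant_id, out_prefix):
    """Build the argv for exporting one variant from the converted fileset.

    variant_id must match the ID stored for the variant, e.g. the
    3:46457412_T_C style ID used in the original command.
    """
    return ["plink2", "--pfile", pfile_prefix, "--snps", variant_id,
            "--export", "vcf", "vcf-dosage=DS", "--out", out_prefix]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    steps = [
        convert_once_cmd("chr3.bgen", "chr3.sample", "chr3_pgen"),
        extract_snp_cmd("chr3_pgen", "3:46457412_T_C", "snp_out"),
    ]
    for argv in steps:
        # Print the command line; pass argv to subprocess.run to execute it.
        print(shlex.join(argv))
```

The first command is the slow step and is paid once; each later single-variant query runs against the .pgen fileset rather than rescanning the whole .bgen file.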

Xinzhu Wei

Jun 29, 2019, 4:51:34 PM
to Christopher Chang, plink2-users
Dear Christopher,

Is there a place where I could read up on the cutoffs plink uses to convert the non-integer genotype data from .bgen into the integer genotypes in .pgen, and on how it decides which genotypes are missing?
I realize that a portion of the data was treated as missing by plink.

Thanks a lot.
April

Xinzhu (April) Wei
Postdoc, UC Berkeley

https://xinzhuaprilwei.weebly.com

Christopher Chang

Jun 29, 2019, 5:04:11 PM
to plink2-users
.pgen does keep track of dosages.  Use "--export vcf vcf-dosage=DS" instead of plain "--export vcf" when you want an exported VCF to include those dosages.

Meanwhile, to control when associated hardcalls are set to missing vs. rounded to the nearest integer, see http://www.cog-genomics.org/plink/2.0/input#dosage_import .
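
As a rough illustration of the rounding-versus-missing rule for a biallelic variant: the sketch below assumes a hard-call threshold of 0.1 (which is, as far as I know, the documented default for --hard-call-threshold; the linked page is the authority). The function name and the small floating-point tolerance are hypothetical, and plink2's exact behavior at the boundary is not modeled here.

```python
def hardcall_from_dosage(dosage, threshold=0.1):
    """Return the hardcall (0, 1, or 2) for an ALT dosage, or None if missing.

    The dosage is rounded to the nearest integer when it lies within
    `threshold` of that integer; otherwise the hardcall is set to missing.
    threshold=0.1 is an assumed default, not taken from this thread.
    """
    if not 0.0 <= dosage <= 2.0:
        raise ValueError("biallelic dosage must be in [0, 2]")
    nearest = round(dosage)
    # Small tolerance so values such as 1.9 are not lost to float error.
    if abs(dosage - nearest) <= threshold + 1e-9:
        return nearest
    return None


if __name__ == "__main__":
    for d in (0.03, 1.0, 1.4, 1.95):
        print(d, hardcall_from_dosage(d))
```

Under these assumptions a dosage of 1.4 becomes a missing hardcall, which would explain why some genotypes appear as missing after import even though the dosage itself is still stored in the .pgen file.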

