Plink2 .bgen Allele Swapped


Letitia Sng

Jan 11, 2021, 8:41:10 PM
to plink2-users
Hi, I've noticed that the reference and alternate alleles have been swapped after a few QC filtering steps in PLINK 2. For example, in the untouched .bgen file, rs558604819 (chr1:10642) has 'G' as the first/ref allele and 'A' as the alternate allele.

Welcome to bgenix

(version: 1.1.7, revision )

(C) 2009-2017 University of Oxford

Building query : (11/?,0.1s,182.0/s)

# bgenix: started 2021-01-12 10:38:15

alternate_ids rsid chromosome position number_of_alleles first_allele alternative_alleles

1:10177_A_AC rs367896724 01 10177 2 A AC

1:10235_T_TA rs540431307 01 10235 2 T TA

1:10352_T_TA rs201106462 01 10352 2 T TA

1:10505_A_T rs548419688 01 10505 2 A T

1:10506_C_G rs568405545 01 10506 2 C G

1:10511_G_A rs534229142 01 10511 2 G A

1:10539_C_A rs537182016 01 10539 2 C A

1:10542_C_T rs572818783 01 10542 2 C T

1:10579_C_A rs538322974 01 10579 2 C A

1:10616_CCGCCGTTGCAAAGGCGCGCCG_C 1:10616_CCGCCGTTGCAAAGGCGCGCCG_C 01 10616 2 CCGCCGTTGCAAAGGCGCGCCG C

1:10642_G_A rs558604819 01 10642 2 G A

# bgenix: success, total 11 variants.

Thank you for using bgenix.


After the following job, the same SNP has 'A' as the reference allele and 'G' as the alternate. 

PLINK v2.00a2.3LM 64-bit Intel (24 Jan 2020)   www.cog-genomics.org/plink/2.0/

(C) 2005-2020 Shaun Purcell, Christopher Chang   GNU General Public License v3

Logging to ukb_imp_chr1_v3_MAF0001_INFO05_HWE1-6_-HapMap3.log.

Options in effect:

  --bgen ukb_imp_chr1_v3.bgen ref-first

  --debug

  --exclude /home/sng001/hapmap3_r3_b36_fwd.consensus.qc.poly.map.rsids.txt

  --export bgen-1.2

  --extract ukb_mfi_chr1_v3_INFO05.txt

  --hwe 1E-6 midp

  --keep-allele-order

  --maf 0.0001

  --max-alleles 2

  --memory 128000

  --out ukb_imp_chr1_v3_MAF0001_INFO05_HWE1-6_-HapMap3

  --sample ukb27483_imp_chr1_v3_s487282.sample

  --threads 8

 

Start time: Thu Dec  3 10:59:48 2020

Note: --keep-allele-order no longer has any effect.

128704 MiB RAM detected; reserving 128000 MiB for main workspace.

Using up to 8 compute threads.

--bgen: 7402791 variants detected, format v1.2.

487409 samples imported from .sample file to

ukb_imp_chr1_v3_MAF0001_INFO05_HWE1-6_-HapMap3-temporary.psam .

--bgen: ukb_imp_chr1_v3_MAF0001_INFO05_HWE1-6_-HapMap3-temporary.pgen +

ukb_imp_chr1_v3_MAF0001_INFO05_HWE1-6_-HapMap3-temporary.pvar written.

487409 samples (264295 females, 222987 males, 127 ambiguous; 487409 founders)

loaded from ukb_imp_chr1_v3_MAF0001_INFO05_HWE1-6_-HapMap3-temporary.psam.

7402791 variants loaded from

ukb_imp_chr1_v3_MAF0001_INFO05_HWE1-6_-HapMap3-temporary.pvar.

Note: No phenotype data present.

--extract: 4676180 variants remaining.

--exclude: 4557846 variants remaining.

Calculating allele frequencies... done.

Warning: --hwe observation counts vary by more than 10%.  Consider using

--geno, and/or applying different p-value thresholds to distinct subsets of

your data.

--hwe midp: 643888 variants removed due to Hardy-Weinberg exact test (founders

only).

1460278 variants removed due to allele frequency threshold(s)

(--maf/--max-maf/--mac/--max-mac).

2453680 variants remaining after main filters.

Writing ukb_imp_chr1_v3_MAF0001_INFO05_HWE1-6_-HapMap3.bgen ... done.

Writing ukb_imp_chr1_v3_MAF0001_INFO05_HWE1-6_-HapMap3.sample ... done.

End time: Fri Dec  4 12:55:41 2020


Welcome to bgenix

(version: 1.1.7, revision )

(C) 2009-2017 University of Oxford

Building query                                              :  (1/?,0.0s,250.2/s)

# bgenix: started 2021-01-12 10:43:49

alternate_ids   rsid    chromosome      position        number_of_alleles       first_allele    alternative_alleles

.       rs558604819     1       10642   2       A       G

# bgenix: success, total 1 variants.

Thank you for using bgenix.

Am I missing a flag needed to keep the allele order? I did not notice this until a few steps into the pipeline, so how does it affect other filtering, particularly calculating heterozygosity (--het) and missingness (--geno & --mind)?

Looking forward to your advice. 



Christopher Chang

Jan 11, 2021, 8:43:24 PM
to plink2-users
"--export bgen-1.2 ref-first" instead of just "--export bgen-1.2".
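
For reference, here is a rough sketch of what the job from the log above might look like with that change applied. The file names and thresholds are copied from that log; the wrapper function run_qc_export and the preview variable are hypothetical helpers added only so the command can be printed and checked before committing to a long run. --debug and --keep-allele-order are left out, the latter because the log notes it no longer has any effect.

```shell
# Sketch only: the job from the log above with "ref-first" added to --export.
# Paths and thresholds are copied from that log; adjust them to your setup.
run_qc_export() {
  # "$@" is the launcher: "echo plink2" previews the command, "plink2" runs it
  "$@" \
    --bgen ukb_imp_chr1_v3.bgen ref-first \
    --sample ukb27483_imp_chr1_v3_s487282.sample \
    --extract ukb_mfi_chr1_v3_INFO05.txt \
    --exclude /home/sng001/hapmap3_r3_b36_fwd.consensus.qc.poly.map.rsids.txt \
    --max-alleles 2 \
    --maf 0.0001 \
    --hwe 1E-6 midp \
    --memory 128000 \
    --threads 8 \
    --export bgen-1.2 ref-first \
    --out ukb_imp_chr1_v3_MAF0001_INFO05_HWE1-6_-HapMap3
}

# Preview the full command line without running anything
preview=$(run_qc_export echo plink2)
printf '%s\n' "$preview"
```

Calling run_qc_export plink2 instead of run_qc_export echo plink2 would launch the actual job.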

Letitia Sng

Jan 11, 2021, 10:03:18 PM
to plink2-users
Thanks for the quick response!

In addition, is there a way to change the reference allele back to the correct allele without running the pipeline again? Would something like --ref-allele work with .bgen-v1.2 files? 


Christopher Chang

Jan 11, 2021, 11:48:04 PM
to plink2-users
You could swap the alleles in the final .bgen with e.g.
  plink2 --bgen wrong.bgen ref-last --export bgen-1.2 ref-first
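
To confirm that a swap like this took effect, one option is to compare the bgenix listings of the original and the re-exported file. The sketch below assumes both listings were saved as text (e.g. from bgenix -g <file> -list) and have the seven columns shown earlier in this thread. The function name compare_alleles and the temporary file names are hypothetical. Variants are matched on rsid plus position, since the chromosome code ("01" vs "1") and alternate_ids differ between the two listings above.

```shell
# Sketch: report whether each variant's alleles are the same or swapped
# between two saved bgenix listings (original first, re-export second).
compare_alleles() {
  awk '
    # Skip comment lines, the column header, and bgenix banner lines
    /^#/ || $1 == "alternate_ids" || NF < 7 { next }
    { key = $2 ":" $4 }                      # match on rsid + position
    FNR == NR { first[key] = $6; alt[key] = $7; next }   # first listing
    key in first {
      if ($6 == first[key] && $7 == alt[key]) status = "same"
      else if ($6 == alt[key] && $7 == first[key]) status = "swapped"
      else status = "mismatch"
      print $2, $4, first[key] "/" alt[key], $6 "/" $7, status
    }
  ' "$1" "$2"
}

# Demo using the rs558604819 rows quoted earlier in this thread
tmp=$(mktemp -d)
printf '%s\n' \
  'alternate_ids rsid chromosome position number_of_alleles first_allele alternative_alleles' \
  '1:10642_G_A rs558604819 01 10642 2 G A' > "$tmp/before.txt"
printf '%s\n' \
  'alternate_ids rsid chromosome position number_of_alleles first_allele alternative_alleles' \
  '. rs558604819 1 10642 2 A G' > "$tmp/after.txt"

result=$(compare_alleles "$tmp/before.txt" "$tmp/after.txt")
printf '%s\n' "$result"
rm -r "$tmp"
```

Each output line gives the rsid, position, allele pair in the first listing, allele pair in the second listing, and a status. After a successful fix, every variant that survived filtering should report "same" against the original file.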