warning: impossible a1 allele assignment

Mark Christiansen

Sep 12, 2016, 5:36:10 PM
to plink2...@googlegroups.com

Hi Christopher,

I am working with a dataset that has 13k+ variants and I get a warning message for exactly 1000 variants that says "Impossible A1 allele assignment for variant rs***". 

Is it just by chance that there are exactly 1000 error messages, or does plink stop printing these messages after 1000 warnings? 

Our genetic data comes from an Omni chip, so I use the --a1-allele flag with a file from this site:
http://www.well.ox.ac.uk/~wrayner/strand/ 

I am also using the --real-ref-alleles flag. I think I am running into a known issue where A1/A2 can only denote the minor/major allele, and not necessarily the reference/alternate allele. It is unclear to me how to get past this issue.

Thank you in advance for providing any suggestions you may have!

Sincerely,

Mark


Christopher Chang

Sep 12, 2016, 5:47:40 PM
to plink2-users
Hi Mark,

While there are a few cases where plink stops printing warnings to the console, all detected anomalies should be mentioned in the log file.  (For --a1-allele/--a2-allele, plink doesn't stop printing to the console at 1000.)
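A quick way to confirm how many of these warnings were actually emitted is to count them in the .log file rather than on the console. The following is a minimal sketch: the matched phrase is the one quoted earlier in this thread, while the surrounding log layout (the "Warning:" prefix, the trailing period) and the sample text are assumptions made for illustration.

```python
import re

# Phrase taken from the warning quoted in this thread; the exact log layout
# (prefix, trailing period, possible line wrapping) is an assumption.
WARNING_RE = re.compile(r"Impossible A([12]) allele assignment for variant (\S+)")


def impossible_assignments(log_text):
    """Return the variant IDs named in 'Impossible A1/A2 allele assignment' warnings."""
    ids = []
    for match in WARNING_RE.finditer(log_text):
        # Drop a sentence-ending period that may follow the variant ID.
        ids.append(match.group(2).rstrip("."))
    return ids


if __name__ == "__main__":
    # Hypothetical log excerpt, for illustration only.
    sample = (
        "Warning: Impossible A1 allele assignment for variant rs111.\n"
        "Warning: Impossible A1 allele assignment for variant rs222.\n"
        "Some unrelated log line.\n"
    )
    found = impossible_assignments(sample)
    print(len(found), found)
```

In practice the text would come from reading the .log file written next to the other output files.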

Have you checked what the .bim file entries for these variants look like?  (My first guess is that there are some variants where only one of the two alleles appears in your dataset, and the second allele code was lost and replaced with a non-'0' value.)
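The check suggested above can be sketched as a comparison between the .bim allele columns and the alleles requested in the --a1-allele file. This only lists the variants where the requested allele matches neither A1 nor A2; it does not reproduce plink's own logic. The function name and the `wanted` dictionary (variant ID to requested allele) are hypothetical, and building that dictionary from the allele file is left out because the file's column layout depends on how it was prepared.

```python
def mismatched_variants(bim_lines, wanted):
    """List (variant_id, a1, a2, requested) where the requested allele is neither A1 nor A2.

    bim_lines: iterable of .bim rows (chromosome, variant ID, cM position,
               bp coordinate, A1, A2), whitespace-delimited.
    wanted:    dict mapping variant ID -> requested allele (hypothetical input).
    """
    problems = []
    for line in bim_lines:
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed rows
        variant_id, a1, a2 = fields[1], fields[4], fields[5]
        allele = wanted.get(variant_id)
        if allele is not None and allele not in (a1, a2):
            problems.append((variant_id, a1, a2, allele))
    return problems


if __name__ == "__main__":
    # Made-up rows, for illustration only.
    bim = ["1 rs1 0 100 A G", "1 rs2 0 200 C T"]
    print(mismatched_variants(bim, {"rs1": "G", "rs2": "A"}))
```

Looking at the A1/A2 codes in the returned rows should show whether one of them is '0' or an unexpected value.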

mchr...@uw.edu

Sep 20, 2016, 4:11:34 PM
to plink2-users
Christopher, 

Thank you for your response.  I don't have .bim files, but looking at the .ped files, the problematic variants don't seem any different than non-problematic ones.  The values look valid for both alleles, and there is no more missing genotype info than in other parts of the dataset. 

Mark

Todd Johnson

Sep 22, 2016, 12:08:27 AM
to plink2-users
Hi Mark,
Here are some ideas on your problem.

When forcing the REF allele in plink, the A2 allele is supposed to be assigned as REF, so you should be using --a2-allele with Rayner's file rather than --a1-allele. I assume that you're using one of the files under the RefAlt section of that page? Those are the only ones that look directly usable by --a2-allele.

Also, I think --real-ref-alleles is only needed if you are exporting the .ped/.bed data to VCF. Is that what you're doing? Without --real-ref-alleles, the VCF file is still written, but with a flag noting that the REF allele may not actually be the reference allele. With --real-ref-alleles, the VCF is written without that flag, on the assumption that you correctly set REF=A2 and ALT=A1 in your plink commands. In your case, though, the alleles would be flipped.

As an aside, I think the combination of --a1-allele and --real-ref-alleles that you used may have produced some contradictory behavior. On loading, I believe plink first checks the allele frequencies and automatically makes A1 the minor allele, and if the SNP is monomorphic, A1 gets set to 0. If the REF allele was not the major allele for some monomorphic SNPs, then after that automatic step A2 would be the ALT allele, and plink would not find the REF allele among the available A1/A2 alleles when it tried to set A1 to the allele in your Rayner file.

If you have monomorphic SNPs in your input file, you might try --keep-allele-order (keep the input order of the A1/A2 alleles and do not set A1 to 0), and then --a2-allele MRfile. If you're just running a plink analysis and not exporting to VCF, then I believe you should leave out the --real-ref-alleles flag.
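To make the suggested flag combination concrete, here is a minimal sketch that only assembles the command line and does not run plink. The function name and the file prefixes are hypothetical, the input is assumed to be a .ped/.map pair (hence --file), and adding --make-bed to write the result as a binary fileset is my assumption. --real-ref-alleles is deliberately left out, following the advice above for analyses that do not export to VCF.

```python
import shlex


def suggested_plink_command(ped_prefix, ref_allele_file, out_prefix):
    """Build the argument list for the flag combination suggested in this post."""
    return [
        "plink",
        "--file", ped_prefix,             # .ped/.map input (assumed)
        "--keep-allele-order",            # do not reorder alleles by frequency
        "--a2-allele", ref_allele_file,   # force A2 to the REF allele listed in the file
        "--make-bed",                     # assumption: write a binary fileset
        "--out", out_prefix,
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = suggested_plink_command("mydata", "MRfile", "mydata_ref")
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once plink is on the PATH. Checking the resulting .log for remaining warnings is still worthwhile.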

Todd

Prakash Thakor

Apr 25, 2018, 7:44:10 AM
to plink2-users
Hi Todd,
As discussed above, I am running into the same problem: the A1 allele shows a frequency of 0, and the allele name is also shown as 0. I understand that this is because the marker is monomorphic, but it should at least show the name of the allele (e.g. A/G) instead of 0.
Thanks in advance!

Christopher Chang

Apr 25, 2018, 11:18:53 AM
to plink2-users
plink does not actually set the minor allele of monomorphic variants to '0'. It's only '0' if you never tell plink what the second allele is (e.g. you only give plink a .ped file that mentions one allele code); if you provide a .bim file which has the second allele code, plink will keep track of it even if that second allele does not appear in your samples. Just make sure you're always using --bfile/--make-bed instead of --file/--recode.

Incidentally, you can use --a1-allele + --make-bed to fill in minor alleles lost by inappropriate use of --recode.
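To see which variants are affected by this, one can scan the .ped file for variants where only one allele code ever appears, since those are the ones that end up with a '0' second allele. The sketch below assumes a standard whitespace-delimited .ped layout (six leading columns, then two allele columns per variant, with '0' as the missing code). The function name is hypothetical, and `variant_ids` is assumed to come from the matching .map file, in order.

```python
def single_allele_variants(ped_lines, variant_ids):
    """Return IDs of variants for which the .ped rows mention at most one allele code."""
    seen = [set() for _ in variant_ids]
    for line in ped_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # First six columns: family ID, individual ID, father, mother, sex, phenotype.
        alleles = fields[6:]
        for i in range(len(variant_ids)):
            for code in alleles[2 * i: 2 * i + 2]:
                if code != "0":  # '0' is assumed to be the missing-genotype code
                    seen[i].add(code)
    return [vid for vid, codes in zip(variant_ids, seen) if len(codes) <= 1]


if __name__ == "__main__":
    # Made-up rows, for illustration only.
    ped = [
        "F1 I1 0 0 1 1 A A C T",
        "F2 I2 0 0 2 1 A A C C",
    ]
    print(single_allele_variants(ped, ["rs1", "rs2"]))
```

Variants returned by this check are candidates for having their second allele restored with --a1-allele + --make-bed, as described above.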

Victor Hsu

Sep 30, 2018, 5:07:59 AM
to plink2-users
Hello Christopher,

I have the same problem with the A1 or A2 allele assignments. I downloaded the files from

I want to add a reference allele file (downloaded from another published plink dataset).

I use 
  --a1-allele Wolf_Ref.txt
  --alleleACGT
  --allow-no-sex
  --bfile GoldenRetriever_cancer
  --dog
  --keep-allele-order
  --make-bed
  --not-chr 0 39 40 X Y MT
  --noweb
  --out GoldenRetriever_Cancer_Ref_Wolf

I checked the .bim file, and it does have 0 in the REF/ALT columns. How can I solve this problem?