Hello,
I want to recode the IIDs in imputed .bgen files to produce two different filesets, and then merge them (I'm working on eye-level analyses with Regenie). As I'm only interested in dosages, I've converted the files to .pgen using PLINK2 (ref-first, as with UK Biobank data):
plink2 --bgen data.bgen ref-first --sample data.sample --update-ids recoded_ids_a.txt --make-pgen --out recoded_file_a
plink2 --bgen data.bgen ref-first --sample data.sample --update-ids recoded_ids_b.txt --make-pgen --out recoded_file_b
However, at the merging step, I run into the following error:
plink2 \
--pfile recoded_file_a \
--pmerge \
recoded_file_b.pgen \
recoded_file_b.pvar \
recoded_file_b.psam \
--out merged_files
The biallelic variants with ID 'x' at position x:x in recoded_file_a.pvar appear to be the components of a 'split' multiallelic variant; if so, it must be 'joined' (with e.g. "bcftools norm -m").
As discussed previously on this forum (https://groups.google.com/g/plink2-users/c/fVF9LGK1A0w), "if I override the error and pass --multiallelics-already-joined (which of course is not true, these multiallelics are not joined and that is the point), the merge will work but at least some multiallelics get re-normalized by plink, showing up with several variants on the same line."
I was hoping to use the undocumented option Chris mentioned for splitting multiallelic variants, --make-pgen multiallelics=-. However, I run into an error:
plink2 --bgen data.bgen ref-first --sample data.sample --update-ids recoded_ids_a.txt --make-pgen multiallelics=- --out recoded_file_a
Error: --bgen accepts at most 3 arguments.
I also tried dropping the multiallelics from my existing recoded pgen files:
plink2 --pfile recoded_file_a --make-pgen multiallelics=- --out modified_recoded_file_a
Error: Multiallelic dosages aren't supported yet.
And yet the same log says "Note: All variants are biallelic; nothing to split." Please see the full log below.
Would appreciate advice on how to get past this step. Is there a workaround in PLINK?
If not, as I'm only interested in biallelic variants, how can I set multiallelic dosages to missing, or remove these SNPs altogether, with PLINK or another tool?
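For the removal option, one possible route (a sketch only, not something confirmed to resolve the merge error) would be to list every variant ID whose chromosome:position occurs more than once in the .pvar, then pass that list to --exclude. This assumes the .pvar has a header line and that its first three columns are #CHROM, POS and ID. The function name dup_position_ids and the file names multiallelic_ids.txt and biallelic_only_a are made up for illustration.

```shell
# Hypothetical helper: print the ID of every variant whose CHROM:POS occurs
# more than once in a .pvar file.
# Assumption: columns 1-3 are #CHROM, POS, ID (VCF-like column order).
dup_position_ids() {
  awk '
    /^#/ { next }                        # skip the header and any meta lines
    NR == FNR { n[$1 ":" $2]++; next }   # pass 1: count records per position
    n[$1 ":" $2] > 1 { print $3 }        # pass 2: print IDs at shared positions
  ' "$1" "$1"                            # same file is read twice
}

# Possible usage (untested on real data, so left commented out):
# dup_position_ids recoded_file_a.pvar | sort -u > multiallelic_ids.txt
# plink2 --pfile recoded_file_a --exclude multiallelic_ids.txt --make-pgen --out biallelic_only_a
```

Note that this drops every record at a shared position, so distinct variants that merely happen to sit at the same coordinate would be removed as well.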
Many thanks, Nik
PLINK v2.00a3.1LM 64-bit Intel (19 May 2022) www.cog-genomics.org/plink/2.0/
(C) 2005-2022 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to imp_mod_a_21.log.
Options in effect:
--make-pgen multiallelics=-
--out imp_mod_a_21
--pfile recoded_file_a
Start time: x
31629 MiB RAM detected; reserving 15814 MiB for main workspace.
Using up to 4 compute threads.
487409 samples (264222 females, 222938 males, 249 ambiguous; 487409 founders)
loaded from recoded_file_a.psam.
1261158 variants loaded from
recoded_file_a.pvar.
Note: No phenotype data present.
Writing modified_recoded_file_a.psam ... done.
Note: All variants are biallelic; nothing to split.
Error: Multiallelic dosages aren't supported yet.
Writing modified_recoded_file_a.pvar ... done.
End time: x