Hi Chris,
I'm working in UKB with bulk VCF data.
Running the command:
plink2 --bcf test1.bcf \
--make-pgen multiallelics=- \
--set-all-var-ids @_#_\$r_\$a \
--import-max-alleles 4 \
--new-id-max-allele-len 2 missing \
--out test1
Results in misnamed loci at some indels, and a resulting fileset that is unusable:
(When I try to run simple commands on the resulting binaries, I get an error:
line xxx has fewer tokens than expected.)
22 10562860 22_10562860_C_T C T 99.3889 PASS
22 10562862 .^@ CTT C 99.7007 PASS
22 10562862 .^@ CTT GTT 99.7007 PASS
22 10562862 .^@ CTT TTT 99.7007 PASS
22 10562889 22_10562889_G_T G T 99.2429 PASS
22 10562892 22_10562892_T_C T C 91.7276 PASS
22 10562892 22_10562892_T_TA T TA 91.7276 PASS
22 10562894 22_10562894_A_C A C 99.2429 PASS
22 10562895 22_10562895_A_C A C 99.8792 PASS
22 10562895 22_10562895_A_T A T 99.8792 PASS
22 10562910 22_10562910_TC_GC TC GC 99.1101 PASS
22 10562910 22_10562910_TC_T TC T 99.1101 PASS
22 10562911 22_10562911_CT_AT CT AT 98.5931 PASS
22 10562911 22_10562911_CT_C CT C 98.5931 PASS
Note that not all multi-allelic indels are affected: some towards the bottom of the excerpt are fine.
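One pattern that may help narrow it down: in the excerpt, the rows with the ".^@" ID are the ones where an allele (CTT, GTT, TTT) is longer than the 2-base limit given to --new-id-max-allele-len, while the indels that look fine have alleles of at most 2 bases. Below is a minimal sketch for checking whether that holds across the whole file. It assumes the output is a whitespace-delimited test1.pvar with columns CHROM POS ID REF ALT and one ALT per row; long_allele_rows is a hypothetical helper name, not a plink2 feature.

```shell
# Hypothetical helper: print CHROM, POS, ID, REF, ALT for rows where REF or ALT
# is longer than max_len bases.
# Assumes whitespace-delimited columns CHROM POS ID REF ALT ... with one ALT per row.
long_allele_rows() {
    pvar="$1"
    max_len="$2"
    # Remap NUL bytes to "?" first, since not every awk copes with NUL in its input.
    tr '\000' '?' < "$pvar" | awk -v max="$max_len" '
        /^#/ { next }   # skip header lines
        length($4) > max || length($5) > max { print $1, $2, $3, $4, $5 }
    '
}

# Example: list rows exceeding the 2-base limit used in the command above.
if [ -f test1.pvar ]; then
    long_allele_rows test1.pvar 2
fi
```

If every row printed this way has a malformed ID and no other row does, the problem would seem to be tied to the "missing" fallback rather than to indels as such.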
If I split multi-allelics and rename without including the allele names in the variant IDs (e.g. --set-all-var-ids @_#), the file is well-formed.
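For comparing the two outputs, here is a rough sketch that lists the affected lines. It assumes the "^@" in the excerpt is a NUL byte (shown in caret notation) written into the ID column of test1.pvar, and that the file contains no literal 0x01 bytes; nul_lines is a hypothetical helper name.

```shell
# Hypothetical helper: print the line number and content of every line in a file
# that contains a NUL byte (displayed as ^@ by many viewers).
# Assumes the file has no literal 0x01 bytes, which would show up as false positives.
nul_lines() {
    # NUL is remapped to 0x01 so awk can search for it, then shown as "?" in the output.
    tr '\000' '\001' < "$1" |
        awk 'index($0, "\001") { print NR ": " $0 }' |
        tr '\001' '?'
}

# Example: check the .pvar written by the command above.
if [ -f test1.pvar ]; then
    nul_lines test1.pvar
fi
```

On the well-formed fileset this should print nothing; on the broken one it should print the lines that trigger the token-count error, if the NUL assumption is right.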
Just thought I would flag this to you.
Thanks as ever for an amazing piece of software.