Hi, I'm trying to remove duplicate SNPs from my dataset using plink2's --rm-dup:
plink2 --bfile all_phase3 --memory 10000 --rm-dup force-first --make-bed --out all_phase3_removed_dup
PLINK v2.00a3 64-bit (17 Feb 2020) www.cog-genomics.org/plink/2.0/
(C) 2005-2020 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to all_phase3_removed_dup.log.
Options in effect:
--bfile all_phase3
--make-bed
--memory 10000
--out all_phase3_removed_dup
--rm-dup force-first
Start time: Fri Apr 9 00:21:41 2021
12194 MiB RAM detected; reserving 10000 MiB for main workspace.
Using up to 4 compute threads.
2504 samples (1271 females, 1233 males; 2497 founders) loaded from
all_phase3.fam.
84358431 variants loaded from all_phase3.bim.
Note: No phenotype data present.
--rm-dup: 4114 duplicated IDs, 5610 variants removed.
Writing all_phase3_removed_dup.fam ... done.
Writing all_phase3_removed_dup.bim ... done.
Writing all_phase3_removed_dup.bed ... done.
End time: Fri Apr 9 00:36:26 2021
plink1.9 --noweb --bfile all_phase3_removed_dup --extract dataplinkQCed.update.bim --allow-extra-chr --memory 10000 --flip dataplink.1000G.datasetmerged-merge.missnp --make-bed --out 1000G.dataplink.snps.flipped
PLINK v1.90b6.16 64-bit (17 Feb 2020) www.cog-genomics.org/plink/1.9/
(C) 2005-2020 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to 1000G.dataplink.snps.flipped.log.
Options in effect:
--allow-extra-chr
--bfile all_phase3_removed_dup
--extract dataplinkQCed.update.bim
--flip dataplink.1000G.datasetmerged-merge.missnp
--make-bed
--memory 10000
--noweb
--out 1000G.dataplink.snps.flipped
Note: --noweb has no effect since no web check is implemented yet.
12194 MB RAM detected; reserving 10000 MB for main workspace.
84352821 variants loaded from .bim file.
2504 people (1233 males, 1271 females) loaded from .fam.
Error: Duplicate ID '.'.
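
One way to see which variant IDs are still shared by more than one variant after --rm-dup is to count the values in column 2 of the .bim file. This is only a minimal sketch, assuming the standard whitespace-delimited .bim layout with the variant ID in the second column; `dup_ids` is a hypothetical helper name, not part of PLINK.

```shell
# Hypothetical helper (not a PLINK command): print every variant ID that
# occurs more than once in a .bim file, followed by its count.
# Assumes the variant ID is in column 2 of a whitespace-delimited .bim.
dup_ids() {
  awk '{ print $2 }' "$1" |   # pull out the variant ID column
    sort |                    # group identical IDs together
    uniq -c |                 # count each distinct ID
    awk '$1 > 1 { print $2 "\t" $1 }'   # keep only IDs seen more than once
}

# Example (file name taken from the log above):
# dup_ids all_phase3_removed_dup.bim | head
```

If '.' shows up in that output, then several variants in the file still share the missing-ID placeholder, which would be consistent with the error from plink1.9.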