--clump in 2.0


Cameron Both

Aug 22, 2019, 3:04:38 PM
to plink2-users

Hi,
I have .bgen files that I wish to clump. I know plink 1.9 has the --clump functionality, but I was wondering whether 2.0 has it, or will have it in the future?
I realize I could convert the .bgen to .bed/.bim/.fam and then run 1.9, but the size of these files makes that process prohibitive.
Thank you!

Christopher Chang

Aug 22, 2019, 3:51:07 PM
to plink2-users
This will eventually be added, but it has not been prioritized since the plink 2.0 implementation is not expected to be much faster.

If .bed disk space is really that much of a problem even after chromosome-splitting, you can prune the set of samples a bit before running --clump.  If there are close relations present, you'd want to prune those anyway; and after that, --clump doesn't lose much accuracy if you e.g. randomly remove half of the samples first, as long as you started with a sensible minor-allele-count lower bound (and if you didn't, you should backtrack and fix that).
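
A minimal sketch of the sample-pruning export described above, not a tested recipe. It assumes plink2 is on the PATH; the file names (chr22.bgen, cohort.sample, unrelated_samples.txt), the `ref-first` allele order, the 0.5 retention probability, and the helper name `prune_and_export` are all illustrative assumptions. It also assumes the minor-allele-count filter was already applied upstream.

```shell
# Hypothetical helper: export one chromosome of a .bgen to .bed/.bim/.fam,
# keeping only unrelated samples and then randomly retaining about half of them.
PLINK2="${PLINK2:-plink2}"   # override to point at a different binary

prune_and_export() {
  chrom=$1
  # unrelated_samples.txt is an assumed sample list with close relations
  # already removed (for example, the .king.cutoff.in.id file written by
  # an earlier plink2 --king-cutoff run).
  "$PLINK2" \
    --bgen "chr${chrom}.bgen" ref-first \
    --sample cohort.sample \
    --keep unrelated_samples.txt \
    --thin-indiv 0.5 \
    --seed 1 \
    --make-bed \
    --out "chr${chrom}_thinned"
}
```

Calling `prune_and_export 22` would then write chr22_thinned.bed/.bim/.fam, which plink 1.9 can read with --bfile.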

Cameron Both

Aug 23, 2019, 12:11:21 PM
to plink2-users
Thank you for your help!

Uwe Menzel

Mar 13, 2020, 8:10:58 AM
to plink2-users
Hi,

The problem is that plink1.9 complains about duplicate variant names, something that plink2 never did. Going back to 1.9 means renaming the markers or throwing away the duplicates ...
Not very tempting.

Christopher Chang

Mar 13, 2020, 10:10:08 AM
to plink2-users
plink2 does complain about duplicate variant names when they create real ambiguity.  So, yes, it will work where plink1.9 fails, provided the duplicate variant IDs in your main dataset never appear in the --clump input files.  But if the --clump input files contain some of those duplicated IDs, plink2 will have no choice but to complain as well, since --clump is defined to only look at variant-ID and p-value columns for compatibility reasons (so CHROM/POS/REF/ALT can't be used to disambiguate).

(Relatedly, plink2 has some *new* duplicate-variant-ID errors; for example, --indep-pairwise doesn't allow them anymore, and --write-snplist forces you to explicitly opt in.)

plink2's --set-all-var-ids and --rm-dup flags are very useful for deduplicating variants and their IDs so that everything else works as you'd expect.
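
As a rough illustration of the two flags just mentioned, the sketch below renames every variant to a chromosome:position:ref:alt ID and then drops variants whose IDs still collide. The helper name `dedup_variants`, the dataset prefixes, and the choice of `exclude-all` over the other --rm-dup modes are assumptions for illustration; pick the mode that matches how you want duplicates resolved.

```shell
# Hypothetical helper: give every variant a chr:pos:ref:alt ID, then remove
# variants that still share an ID afterwards.
PLINK2="${PLINK2:-plink2}"   # override to point at a different binary

dedup_variants() {
  in_prefix=$1    # existing .pgen/.pvar/.psam prefix (assumed)
  out_prefix=$2   # prefix for the deduplicated output
  # In the ID template, @ = chromosome, # = position, $r = REF, $a = ALT.
  # Single quotes keep the shell from expanding $r and $a.
  "$PLINK2" \
    --pfile "$in_prefix" \
    --set-all-var-ids '@:#:$r:$a' \
    --rm-dup exclude-all \
    --make-pgen \
    --out "$out_prefix"
}
```

For example, `dedup_variants cohort cohort_dedup` would write cohort_dedup.pgen/.pvar/.psam. Any association file later passed to --clump would need to use the same renamed IDs.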

Katherine

Mar 31, 2021, 11:52:27 AM
to plink2-users
I was wondering if there has been any update to plink2 that includes the clump feature? I am trying to run this command: 
--clump foo.txt \
--clump-field foo \
--clump-p1 0.000005 \
--clump-p2 0.000005 \
--clump-r2 0.2
but I receive an error that the flag is not recognized. Is there any alternative LD clumping/SNP filtering command in plink2 that I can use?

Christopher Chang

Mar 31, 2021, 12:15:48 PM
to plink2-users
No, this hasn't been added yet; continue exporting to .bed and using plink 1.9 for this.
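
A sketch of what the plink 1.9 step might look like once the data has been exported to .bed/.bim/.fam, reusing the thresholds from the question above. The helper name `clump_with_plink19`, the file prefixes, and the column names `ID` and `P` (chosen to match a plink2 association report) are assumptions; adjust --clump-snp-field and --clump-field to the headers in your own results file.

```shell
# Hypothetical helper: run plink 1.9 --clump on an exported .bed fileset.
PLINK19="${PLINK19:-plink}"   # plink 1.9 binary; name is an assumption

clump_with_plink19() {
  bfile_prefix=$1   # prefix of the exported .bed/.bim/.fam
  assoc_file=$2     # association results with variant-ID and p-value columns
  "$PLINK19" \
    --bfile "$bfile_prefix" \
    --clump "$assoc_file" \
    --clump-snp-field ID \
    --clump-field P \
    --clump-p1 0.000005 \
    --clump-p2 0.000005 \
    --clump-r2 0.2 \
    --out "${bfile_prefix}_clumped"
}
```

Running `clump_with_plink19 chr22_export results.txt` would be expected to write its report to chr22_export_clumped.clumped.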