Confusion about clump p1 and clump p2

xujiay...@gmail.com

Aug 28, 2016, 7:33:23 AM
to PRSice
Hi,
I am confused about the clump snp T setting. The PRSice manual says that clump p1 is the significance threshold for index SNPs and clump p2 is the significance threshold for clumped SNPs. Why are they the same (clump p1 = 1, clump p2 = 1) by default?




Joni Coleman

Aug 31, 2016, 5:18:26 AM
to PRSice
Hi,

Clumping works by assigning all SNPs in LD with a SNP with a lower p-value (the index SNP) to a single clump (represented by the index SNP). Clumping thresholds (clump.p1 and clump.p2) can be used to remove SNPs with high p-values from consideration (for example, if you were only interested in index SNPs with genome-wide significance, or if you only wanted to include SNPs with nominal significance in your clumps).

PRSice by default wants to include as many SNPs as possible, so the thresholds are set to 1 (i.e. include all SNPs in clumping, and allow index SNPs to have any p-value).
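The clumping procedure described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not PRSice's or PLINK's actual implementation: the function name `clump`, the `r2_threshold` parameter and its value, and the dictionary of pairwise LD values are all assumptions made up for the example.

```python
def clump(pvals, r2, p1=1.0, p2=1.0, r2_threshold=0.1):
    """Greedy clumping sketch. Returns {index_snp: [clumped SNPs]}.

    pvals: dict mapping SNP id -> p-value
    r2:    dict mapping frozenset({snp_a, snp_b}) -> LD r^2 (hypothetical input format)
    """
    # SNPs with p-values above p2 are removed from consideration altogether.
    candidates = sorted((s for s in pvals if pvals[s] <= p2), key=lambda s: pvals[s])
    assigned = set()
    clumps = {}
    for snp in candidates:
        if snp in assigned:
            continue  # already part of a clump led by a more significant SNP
        if pvals[snp] > p1:
            break  # sorted by p-value, so no later SNP can be an index SNP either
        members = [
            other for other in candidates
            if other != snp
            and other not in assigned
            and r2.get(frozenset((snp, other)), 0.0) >= r2_threshold
        ]
        assigned.add(snp)
        assigned.update(members)
        clumps[snp] = members
    return clumps


# Toy data (invented for illustration).
pvals = {"rs1": 1e-8, "rs2": 0.004, "rs3": 0.03, "rs4": 0.2}
r2 = {frozenset(("rs1", "rs2")): 0.8, frozenset(("rs3", "rs4")): 0.5}

print(clump(pvals, r2))                      # thresholds of 1: every SNP is considered
print(clump(pvals, r2, p1=0.001, p2=0.01))   # stricter thresholds drop rs3 and rs4
```

With both thresholds at 1, every SNP ends up either as an index SNP or inside a clump, which matches the default behaviour described above.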

Hope that helps.

xujiay...@gmail.com

Sep 1, 2016, 8:35:43 AM
to PRSice
Hi,
Thank you so much! I have two more questions:
1. How do I set clump p1 and p2 in PRSice? Is it clump SNP T, clump p1 0.001 (or another value), clump p2 0.01 (or another value)?
2. Do you have a recommended value for clump p2? What is the relationship between clump p1 and p2?

Joni Coleman

Sep 1, 2016, 10:34:40 AM
to PRSice
1. The example below assumes you want to set p1 to 0.001 and p2 to 0.01. Note the periods in these option names: a space starts a new option.

clump.snps T clump.p1 0.001 clump.p2 0.01

2. For the purposes of PRSice, I would leave p1 and p2 as 1, so that all possible SNPs are included in the scores. For PRSice, we only keep the index SNPs, so p2 doesn't really matter (beyond p2 being >= p1, see below).

To explain that point: p1 is the threshold above which SNPs are not included as index SNPs. p2 is the threshold above which ALL SNPs (index and non-index) are excluded. In general p2 >= p1 (because if you are excluding all SNPs at a certain threshold, you're excluding the index SNPs by default, so it's pointless to set p1 > p2).
In some applications, p2 matters: if you were examining the results of a GWAS on hundreds of thousands of people, you might care about the significance of the variants that are in LD with your top index SNPs, rather than just the number of such variants.
However, in the case of PRSice, clumping is a means of negating the effects of LD while keeping the most significant SNPs possible (otherwise we would just prune for LD, rather than clumping). As such, we're generally not interested in applying any of these p-value thresholds.
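
The point that setting p1 > p2 is pointless can be checked with a small Python sketch. It assumes, as described above, that a SNP must pass the p2 filter before it can be considered as an index SNP; the helper `is_index_eligible` and the p-values are made up for illustration and are not part of PRSice.

```python
def is_index_eligible(p, p1, p2):
    # A SNP has to survive the overall p2 filter before p1 is even considered.
    return p <= p2 and p <= p1


p_values = [1e-8, 0.004, 0.03, 0.2]  # invented example p-values

# p1 looser than p2: the extra room in p1 is never used.
loose = [p for p in p_values if is_index_eligible(p, p1=0.05, p2=0.01)]
# p1 equal to p2: same set of eligible index SNPs.
tight = [p for p in p_values if is_index_eligible(p, p1=0.01, p2=0.01)]

print(loose == tight)  # True
```

In other words, the effective index threshold is the smaller of p1 and p2, so only p1 <= p2 gives p1 any effect of its own.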

xujiay...@gmail.com

Sep 1, 2016, 9:58:42 PM
to PRSice
Thank you so much! That helps me a lot!