I would like to use PLINK --score for the PRS, it is weird that I cannot find the SNPs that is selected for the PRS in the following code. Then I tried to extract those SNPs first (as the code in the 2nd way) and then use --score in PLINK. It has different results as the following code. I have no idea the reason for the difference. I would appreciate it so much if you could help me out.
the first way is using the similar code as in the tutorial
```
yzh10@login2:/N/slate/yzh10/numom2b/GDM> gunzip -c base.QC.gz | awk '{print $3,$8}' > SNP.pvalue
yzh10@login2:/N/slate/yzh10/numom2b/GDM> echo "0.000001 0 0.000001" > range_list
yzh10@login2:/N/slate/yzh10/numom2b/GDM> gunzip -c base.QC.gz > base.QC.tsv
yzh10@login2:/N/slate/yzh10/numom2b/GDM> plink --bfile EUR.QC \
> --score base.QC.tsv 3 4 8 header \
> --write-snplist \
> --q-score-range range_list SNP.pvalue \
> --extract EUR.valid.snp \
> --out EUR.PRS
PLINK v1.90b7.2 64-bit (11 Dec 2023)
www.cog-genomics.org/plink/1.9/(C) 2005-2023 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to EUR.PRS.log.
Options in effect:
--bfile EUR.QC
--extract EUR.valid.snp
--out EUR.PRS
--q-score-range range_list SNP.pvalue
--score base.QC.tsv 3 4 8 header
--write-snplist
257366 MB RAM detected; reserving 128683 MB for main workspace.
5419600 variants loaded from .bim file.
5830 people (0 males, 5830 females) loaded from .fam.
--extract: 191265 variants remaining.
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 5830 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is in [0.9999995, 1).
191265 variants and 5830 people pass filters and QC.
Note: No phenotypes present.
List of variant IDs written to EUR.PRS.snplist .
Warning: 7241950 lines skipped in --score file (7241950 due to variant ID
mismatch, 0 due to allele code mismatch); see EUR.PRS.nopred for details.
--score: 191265 valid predictors loaded.
Warning: 7241951 lines skipped in --q-score-range data file.
--score: 1 range processed.
Results written to EUR.PRS.*.profile.
```
the results is like this
```
yzh10@login2:/N/slate/yzh10/numom2b/GDM> head EUR.PRS.0.000001.profile
FID IID PHENO CNT CNT2 SCORE
202946750069_R03C01 202946750069_R03C01 -9 2 1 1.485e-07
202947050020_R05C01 202947050020_R05C01 -9 2 1 1.485e-07
202947050133_R04C01 202947050133_R04C01 -9 2 1 1.485e-07
203010920076_R01C01 203010920076_R01C01 -9 2 0 0
203010920076_R02C01 203010920076_R02C01 -9 2 0 0
203010920076_R04C01 203010920076_R04C01 -9 2 1 1.485e-07
203010920076_R06C01 203010920076_R06C01 -9 2 1 1.485e-07
203010920076_R07C01 203010920076_R07C01 -9 2 2 2.97e-07
203010920090_R01C01 203010920090_R01C01 -9 2 1 1.485e-07
```
the second way is to select SNPs first and then use PLINK --score
```
yzh10@login2:/N/slate/yzh10/numom2b/GDM> gunzip -c base.QC.gz | awk '$7 < 1e-6 {print $3}' > significant_snps.list
yzh10@login2:/N/slate/yzh10/numom2b/GDM> comm -12 <(sort significant_snps.list) <(sort EUR.valid.snp) > final_snps_for_score.list
yzh10@login2:/N/slate/yzh10/numom2b/GDM> plink --bfile EUR.QC \
> --score base.QC.tsv 3 4 8 header \
> --extract final_snps_for_score.list \
> --out EUR.PRS
PLINK v1.90b7.2 64-bit (11 Dec 2023)
www.cog-genomics.org/plink/1.9/(C) 2005-2023 Shaun Purcell, Christopher Chang GNU General Public License v3
Logging to EUR.PRS.log.
Options in effect:
--bfile EUR.QC
--extract final_snps_for_score.list
--out EUR.PRS
--score base.QC.tsv 3 4 8 header
257366 MB RAM detected; reserving 128683 MB for main workspace.
5419600 variants loaded from .bim file.
5830 people (0 males, 5830 females) loaded from .fam.
--extract: 47 variants remaining.
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 5830 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is in [0.9999995, 1).
47 variants and 5830 people pass filters and QC.
Note: No phenotypes present.
Warning: 7433168 lines skipped in --score file (7433168 due to variant ID
mismatch, 0 due to allele code mismatch); see EUR.PRS.nopred for details.
--score: 47 valid predictors loaded.
--score: Results written to EUR.PRS.profile .
yzh10@login2:/N/slate/yzh10/numom2b/GDM> head EUR.PRS.profile
FID IID PHENO CNT CNT2 SCORE
202946750069_R03C01 202946750069_R03C01 -9 94 28 0.0111638
202947050020_R05C01 202947050020_R05C01 -9 94 19 -0.000518085
202947050133_R04C01 202947050133_R04C01 -9 94 22 -0.00160957
203010920076_R01C01 203010920076_R01C01 -9 94 23 -1.80851e-05
203010920076_R02C01 203010920076_R02C01 -9 94 17 -0.0035
203010920076_R04C01 203010920076_R04C01 -9 94 26 0.0123585
203010920076_R06C01 203010920076_R06C01 -9 94 26 0.00277128
203010920076_R07C01 203010920076_R07C01 -9 94 23 0.01115
203010920090_R01C01 203010920090_R01C01 -9 94 24 0.00867234
```
We can see that the results are different in two ways, but they are supposed to be the same.