Rscript PRSice.R --dir . --prsice PRSice_mac --base PGC_training_hg38.txt --snp snpid --pvalue p --stat or --A1 a1 --A2 a2 --beta F --target targetfile --keep-ambig --binary-target T --cov-file Covariates_435_Sex_PC_Pheno.txt --cov-col Sex,PC1,PC2 --lower 0.000001 --upper 1 --bar-levels 0.000001,0.00001,0.0001,0.001,0.01,0.05,0.1,0.2,0.3,0.4,0.5,1 --fastscore --out PRSice_Results/Compare_with_Plink --no-clump
PRSice 2.0.14.beta (27 October 2017)
https://github.com/choishingwan/PRSice
(C) 2016-2017 Shing Wan (Sam) Choi, Jack Euesden, Cathryn M. Lewis, Paul F. O'Reilly
GNU General Public License v3
If you use PRSice in any publised work, please cite:
Jack Euesden Cathryn M. Lewis Paul F. O'Reilly (2015)
PRSice: Polygenic Risk Score software.
Bioinformatics 31 (9): 1466-1468
2017-11-29 09:19:06
./PRSice_mac
--fastscore
--stat or
--model add
--snp snpid
--binary-target T
--A2 a2
--bar-levels 0.000001,0.00001,0.0001,0.001,0.01,0.05,0.1,0.2,0.3,0.4,0.5,1
--upper 1
--thread 1
--target targetfile
--pvalue p
--lower 1e-06
--A1 a1
--out PRSice_Results/Compare_with_Plink
--cov-file Covariates_435_Sex_PC_Pheno.txt
--seed 708523575
--cov-col Sex,PC1,PC2
--no-clump
--base PGC_training_hg38.txt
Loading Genotype file: targetfile (bed)
435 people (196 male(s), 239 female(s)) observed
435 founder(s) included
6490 ambiguous variant(s) excluded
42526 variant(s) included
Rscript PRSice.R --dir . --prsice PRSice_mac --base PGC_training_hg38.txt --snp snpid --pvalue p --stat or --A1 a1 --A2 a2 --beta F --target targetfile --binary-target T --cov-file Covariates_435_Sex_PC_Pheno.txt --cov-col Sex,PC1,PC2 --lower 0.000001 --upper 1 --bar-levels 0.000001,0.00001,0.0001,0.001,0.01,0.05,0.1,0.2,0.3,0.4,0.5,1 --fastscore --out PRSice_Results/Compare_with_Plink_KeepAmbig --no-clump --keep-ambig
However, when I run the same analysis with a single bar level of 0.1 (command below), I get plots for thresholds of 0.1 and 1, and the output reports 20,999 SNPs used for scoring at p=1, which can't be right. That figure is exactly the number of SNPs that would be scored at a threshold of 0.1 with ambiguous SNPs excluded (14,509 SNPs) plus the total number of ambiguous SNPs retained by the --keep-ambig flag (6,490 ambiguous SNPs). The problem doesn't occur when I leave out --keep-ambig. I don't know whether this odd result is related to the discrepancy noted above between the PRSice and Plink scores.
Rscript PRSice.R --dir . --prsice PRSice_mac --base PGC_training_hg38.txt --snp snpid --pvalue p --stat or --A1 a1 --A2 a2 --beta F --target targetfile --binary-target T --cov-file Covariates_435_Sex_PC_Pheno.txt --cov-col Sex,PC1,PC2 --lower 0.000001 --upper 1 --bar-levels 0.1 --fastscore --out PRSice_Results/Compare_with_Plink_KeepAmbig --no-clump --keep-ambig
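For anyone who wants to reproduce the 14,509 + 6,490 = 20,999 arithmetic independently of PRSice, here is a rough Python sketch of how I would count the two groups from the base file. It makes several assumptions: the base file is whitespace-delimited with a header using the column names from my command (snpid, p, a1, a2), "ambiguous" means A/T or C/G allele pairs, and a SNP passes a threshold when p <= threshold. The function names are my own, and this is not how PRSice does it internally. It also ignores overlap with the target file, so the base file would first need to be restricted to SNPs present in the target to match PRSice's counts.

```python
import io

# Complementary base pairs whose strand cannot be resolved from alleles alone.
AMBIGUOUS_PAIRS = {frozenset("AT"), frozenset("CG")}


def is_ambiguous(a1, a2):
    """True for A/T or C/G SNPs (either allele order, any case)."""
    return frozenset((a1.upper(), a2.upper())) in AMBIGUOUS_PAIRS


def count_snps(handle, threshold, p_col="p", a1_col="a1", a2_col="a2"):
    """Count SNPs in a whitespace-delimited summary-statistics file.

    Returns (unambiguous SNPs with p <= threshold, ambiguous SNPs at any p).
    Column names are assumptions taken from the command line above.
    """
    header = handle.readline().split()
    p_i, a1_i, a2_i = (header.index(c) for c in (p_col, a1_col, a2_col))
    passing_unambiguous = 0
    ambiguous_total = 0
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if is_ambiguous(fields[a1_i], fields[a2_i]):
            ambiguous_total += 1
        elif float(fields[p_i]) <= threshold:
            passing_unambiguous += 1
    return passing_unambiguous, ambiguous_total


# Tiny made-up example; real use would pass open("PGC_training_hg38.txt").
example = io.StringIO(
    "snpid a1 a2 or p\n"
    "rs1 A G 1.10 0.05\n"
    "rs2 A T 0.95 0.50\n"
    "rs3 C T 1.02 0.30\n"
    "rs4 G C 1.20 0.01\n"
)
passing, ambiguous = count_snps(example, threshold=0.1)
print(passing, ambiguous, passing + ambiguous)
```

If the sum of the two counts at threshold 0.1 matches the number PRSice reports at p=1, that would suggest the ambiguous SNPs are being added regardless of their p-value while the other SNPs are still filtered at 0.1.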
Thanks for your help on these and the previous issues. And thanks for providing PRSice!
Best,
Aaron